The low quality of air in Poland is responsible for the anticipated death of 50000 people every year. Not only fatalities, life comfort of inhabitants is impacted with smog triggering chronicle respiratory difficulties, nauseas or headaches. Fight against smog encompasses a large range of action supported by public funds, from promotion of more recent furnace to the local ban of coal combustions.
A cornerstone of public measure lays in the ability to describe an initial state and monitor its evolution; it makes possible the investigation of the expected improvement associated with a strategy. In this respect, a network of station measuring air quality across Poland provides hourly evaluation of various pollutant. The station data are either directly delivered through synthetic maps or less refined as raw data (a measure at a give date and time for a specific pollutant monitored by a peculiar station).
Stations provide air quality estimation in a specific location and provide therefore a localised description of the air quality. The number of station changes with time, with a general trend to increase the number of stations available. In this condition, the stated of air quality is not available for the whole territory.
To account for spacial discrete distribution of station, and variation with time of data acquisition availability the complete network of air quality monitoring can be interpolated so as to obtain a local estimation in any position within the map. This will help representing the general situation of air quality in Poland and also understanding the limit of the current air quality monitoring schemes. The approach presented hereafter is a possible general framework for the generation of interpolated maps along with prediction variance estimations.
The general idea is to retrieve and select air quality data from station scattered across the country and represent them on a map. The stations represent singular points, the scattered values are used for interpolation so as to estimated the air quality over all territory in a raster. The raster, the country boundaries (shape), the station measure (points) are then stacked over an open street map background.
Updated PM10 data are available via an API interface: https://powietrze.gios.gov.pl/pjp/content/api.
The map shape was downloaded from the Geographic Information System of the COmmission (GISCO), a service within Eurostats. The GISCO proposes a nomenclature of territorial units for Statistics, named NUTS. The NUTS of the year 2016 was downloaded before running the application1.
Three Coordinate Reference System are used through this example: world scale data are based on the WGS84 (EPSG 4326) and station localisation is also given in this coordinate system, the European territorial unite are initially served in the ETRS89-extended / LAEA Europe (EPSG 3035), the Polish wide data are projected in ETRS89 / Poland CS92 (EPSG 2180)2. For CRS conversions with R, find out more at https://www.nceas.ucsb.edu/sites/default/files/2020-04/OverviewCoordinateReferenceSystems.pdf.
httr
(Wickham 2019) and jsonlite
(Ooms 2014) and geojsonio
(Chamberlain and Teucher 2020).sp
(Pebesma and Bivand 2005; Bivand, Pebesma, and Gomez-Rubio 2013), rgdal
(Bivand, Keitt, and Rowlingson 2019), raster
(Hijmans 2020) and gstat
(Gräler, Pebesma, and Heuvelink 2016).leaflet
and htmlwidgets
.library(httr);
library(jsonlite);
library(rgdal);
library(raster);
library(gstat);
library(leaflet);
library(htmlwidgets);
dir_wd <- file.path(".");
setwd(dir_wd);
paths <- list(
f01 = file.path(
'extdep', 'gisco', 'ref-nuts-2016-60m', 'NUTS_RG_60M_2016_3035'
),
f02 = file.path(getwd(), "docs", "html", "html_widget", "PM10_map.html"),
f03 = file.path("data", "station.rda"),
f04 = file.path("data", "sensor.rda"),
f05 = file.path("data", "measure.rda"),
f06 = file.path("data", "shape.rda")
);
write_ref <- c(TRUE, FALSE)[2];
use_ref <- c(TRUE, FALSE)[1];
The API interface was used to query the list of available stations.
path <- "http://api.gios.gov.pl/pjp-api/rest/station/findAll";
request <- httr::GET(url = path);
response <- httr::content(request, as = "text", encoding = "UTF-8");
station <- jsonlite::fromJSON(response, flatten = TRUE);
head(station)
if(write_ref) {
save(station, file = paths$f03)
} else if(use_ref){
load(paths$f03, verbose = TRUE);
} else {}
Then, for every station, the list of sensors were stacked in a table.
sensor <- lapply(
X = station$id,
FUN = function(x){
y <- paste0("http://api.gios.gov.pl/pjp-api/rest/station/sensors/", x);
y <- httr::GET(url = y);
y <- httr::content(y, as = "text", encoding = "UTF-8");
y <- jsonlite::fromJSON(y, flatten = TRUE);
return(y);
}
);
sensor <- do.call(sensor, what = rbind);
head(sensor);
if(write_ref) {
save(sensor, file = paths$f04)
} else if(use_ref) {
load(paths$f04, verbose = TRUE);
} else {}
Finally, measures are extracted for a selected air quality parameter, (e.g. PM10).
param_name <- unique(sensor$param.paramName);
measure <- subset(sensor, param.paramName == "pył zawieszony PM10");
measure <- split( measure, f = measure$id);
measure <- lapply(
X = measure,
FUN = function(x){
# print(x$id);
y <- paste0("http://api.gios.gov.pl/pjp-api/rest/data/getData/", x$id);
y <- httr::GET(url = y);
y <- httr::content(y, as = "text", encoding = "UTF-8");
y <- jsonlite::fromJSON(y, flatten = TRUE);
y <- merge(x, y$value);
return(y);
}
);
measure <- do.call(rbind, measure);
head(measure)
date_range <- range(
strptime(unique(measure$date), format = "%Y-%m-%d %H:%M:%S")
);
if(write_ref) {
save(measure, date_range, file = paths$f05)
} else if(use_ref) {
load(paths$f05, verbose = TRUE);
} else {}
The shape data were downloaded querying the GISCO API. Eventually, the shape file could be downloaded3 and then loaded into the session. The Shape is then restricted to the voivodeships (second administrative division) in Poland. The shape Coordinate Reference System was transformed in the World Geodetic System 1984 (WGS84).
gis_sh <- geojsonio::geojson_read(
file.path(
"https://ec.europa.eu/eurostat/cache",
"GISCO/distribution/v2/nuts/geojson",
"NUTS_RG_60M_2016_3035_LEVL_2.geojson"
),
what = "sp"
);
gis_sh <- subset(gis_sh, CNTR_CODE == "PL");
# Conversion to ETRS89 Poland CS92 (EPSG 2180)
gis_sh <- sp::spTransform(gis_sh, sp::CRS("+init=epsg:2180"));
if(write_ref) {
save(gis_sh, file = paths$f06)
} else if (use_ref){
load(paths$f06, verbose = TRUE);
} else {}
sp::plot(gis_sh);
gis_sh <- rgdal::readOGR(dsn = paths$f01, layer = "NUTS_RG_60M_2016_3035");
gis_sh <- subset( gis_sh, CNTR_CODE == "PL" & LEVL_CODE == 2);
# Conversion to ETRS89 Poland CS92
gis_sh <- sp::spTransform(gis_sh, sp::CRS("+init=epsg:2180"));
if(write_ref) {
save(gis_sh, file = paths$f06)
} else if (use_ref){
load(paths$f06, verbose = TRUE);
} else {}
sp::plot(gis_sh);
The PM10 data prepared earlier were restricted to the latest date and time with recorded by a sufficient number of station. The points were then added to the available shape previously obtained.
library(ggplot2);
dtaplot <- aggregate(
value ~ date + param.paramCode,
data = subset(measure, !is.na(value)),
FUN = length
)
ggplot(
data = dtaplot,
mapping = aes(y = value, x = date)
) + geom_point(
) + coord_cartesian(
ylim = range(c(dtaplot$value, 0))
) + geom_hline(
yintercept = quantile(dtaplot$value, probs = 0.85)
) + theme(
axis.text.x=element_text(angle = 45, hjust = 1),
);
sel_date <- with(measure, table(date[!is.na(value)]));
sel_date <- sel_date[order(names(sel_date))];
sel_date <- sel_date[sel_date >= quantile(sel_date, probs = 0.85)];
sel_date <- names(rev(sel_date))[1];
if(use_ref) { sel_date <- "2020-04-16 14:00:00" }
fetch_pts <- function(
datetime,
measure = measure, station = station
){
gis_pts <- subset(measure, date == datetime);
gis_pts <- subset(gis_pts, !(is.na(value)));
gis_pts <- merge(
x = station, by.x = "id",
y = gis_pts, by.y = "stationId"
);
gis_pts <- within(
gis_pts,
{
lon <- as.numeric(gegrLon);
lat <- as.numeric(gegrLat);
}
);
gis_pts <- sp::SpatialPointsDataFrame(
coords = gis_pts[c('lon', 'lat')], data = gis_pts,
proj4string = sp::CRS("+init=epsg:4326")
);
gis_pts <- sp::spTransform(gis_pts, sp::CRS("+init=epsg:2180"));
return(gis_pts);
}
gis_pts <- fetch_pts(datetime = sel_date, measure = measure, station = station);
sp::plot(gis_sh);
sp::plot(gis_pts, add = TRUE);
Density of points was heterogeneous, while some area were poorly monitored, some other area cumulated locally many stations. The map was then divided into a grid composed of 300 rows and 300 columns, defining therefore 90000 squares; the grid and squares, are later referred as raster and pixels. The value of each pixel was estimated as the average of the local value falling in the area covered by that pixel.
# I have sample points in Poland, I want an estimation in every point in Poland.
gis_grid <- raster::raster(
ncols=300, nrows=300, ext = extent(gis_sh),
crs = sp::proj4string(gis_sh)
);
gis_raster <- raster::rasterize(
x = gis_pts, field = "value", fun = mean,
y = gis_grid
);
sp::plot(gis_sh);
sp::plot(gis_raster, add= TRUE);
sp::plot(gis_pts, add = TRUE, pch = 1);
A linear interpolation was computed to fill the raster gap.
gis_mod <- gstat::gstat(id = "PM10", formula = value~1, data = gis_pts);
gis_raster_intpl <- raster::interpolate(gis_raster , gis_mod);
gis_raster_intpl <- raster::crop(gis_raster_intpl, raster::extent(gis_sh));
gis_raster_intpl <- raster::mask(gis_raster_intpl, gis_sh);
sp::plot(gis_raster_intpl);
sp::plot(gis_sh, add= TRUE);
sp::plot(gis_pts, add = TRUE, pch = 1);
Eventually, zone contour were delimited and a more appropriate colour scale was proposed.
sp::plot(
gis_raster_intpl, col = colorRampPalette(c("green", "orange", "red4"))(15),
useRaster = TRUE, interpolate = TRUE,
);
sp::plot(gis_sh, add = TRUE, border = 'gray75');
sp::plot(gis_pts, add = TRUE, col = 'blue');
sp::plot(raster::rasterToContour(gis_raster_intpl), add = TRUE)
interpolate_raster <- function(gis_sh = gis_sh, gis_pts = gis_pts){
gis_grid <- raster::raster(
ncols=300, nrows=300, ext = extent(gis_sh),
crs = sp::proj4string(gis_sh)
);
gis_raster <- raster::rasterize(
x = gis_pts, field = "value", fun = mean,
y = gis_grid
);
gis_mod <- gstat::gstat(id = "PM10", formula = value~1, data = gis_pts);
gis_raster_intpl <- raster::interpolate(gis_raster , gis_mod);
gis_raster_intpl <- raster::crop(gis_raster_intpl, raster::extent(gis_sh));
gis_raster_intpl <- raster::mask(gis_raster_intpl, gis_sh);
return(gis_raster_intpl);
}
gis_raster_intpl <- interpolate_raster(gis_sh = gis_sh, gis_pts = gis_pts)
Finally, the raster, the shape and the point were overlayed on a Open Street Map background. The leaflet
package proposes a convenient high-level graphical programming interface were layers are added to each other. Note that a colour scale was refined so as to correspond to Polish standards for which a green-to-red gradient is limited to PM10 concentrations between 0 and 150.
mymax <- 150;
gis_raster_intpl[gis_raster_intpl >= mymax] <- mymax-1;
# Conversion of the objects into CRS WGS84 (epsg 4326)
gis_sh <- sp::spTransform(gis_sh, sp::CRS("+init=epsg:4326"));
gis_pts <- sp::spTransform(gis_pts, sp::CRS("+init=epsg:4326"));
gis_raster_intpl <- projectRaster(
gis_raster_intpl, crs = sp::CRS("+init=epsg:4326")
);
pal <- leaflet::colorNumeric(
colorRampPalette(c("green", "orange", "red4"))(15),
domain = c(0, mymax),
reverse = FALSE,
na.color = NA
);
m <- leaflet::leaflet(gis_sh);
m <- leaflet::addTiles(map = m);
m <- leaflet::addProviderTiles(
map = m, leaflet::providers[[37]]#3, 36, 42, 53,61, 62, 63
);
m <- leaflet::addRasterImage(
map = m, gis_raster_intpl, colors = pal, opacity = .8
);
m <- leaflet::addLegend(
map = m, pal = pal, values = raster::values(gis_raster_intpl)
);
m <- leaflet::addPolylines(
map = m, weight = 1, opacity = .8, color = "#EEEEEE"
);
m <- leaflet::addPolylines(
map = m, weight = 1, opacity = .8, color = "#949494",
data = raster::rasterToContour(gis_raster_intpl)
);
m <- leaflet::addCircleMarkers(
map = m,
lat = gis_pts$lat,
lng = gis_pts$lon,
popup = paste(gis_pts$value, gis_pts$stationName),
clusterOptions = leaflet::markerClusterOptions(),
opacity = 1,
weight = 2,
fillOpacity = 1,
fillColor = pal(ifelse(gis_pts$value >= mymax, mymax -1, gis_pts$value))
);
m <- leaflet::addControl(
map = m,
paste0(
'<div style="text-align:center"><h3>',sel_date, '</h3>',
'<a href="https://fcacollin.github.io/Latarnia">
<img
src="http://fcacollin.github.io/Latarnia/signature/logo_signature.png"
width="100px">
</a>
<p>Administrative boundaries:<br>
© EuroGeographics for the<br>
administrative boundaries</p>
</div>'
),
position = 'bottomright'
);
htmlwidgets::saveWidget(m, file = paths$f02);
# m;
gis_sh <- sp::spTransform(gis_sh, sp::CRS("+init=epsg:2180"));
datetimes <- unique(measure$date);
names(datetimes) <- datetimes;
raster_gif <- lapply(
X = datetimes,
FUN = function(x){
print(x)
gis_pts <- fetch_pts(datetime = x, measure = measure, station = station);
gis_raster_intpl <- interpolate_raster(gis_sh = gis_sh, gis_pts = gis_pts);
mymax <- 150;
gis_raster_intpl[gis_raster_intpl > mymax] <- mymax;
return(gis_raster_intpl);
}
);
raster_gif <- raster_gif[order(names(raster_gif))];
pngdir <- tempdir()
paths_png <- file.path(pngdir, paste0("raster", names(raster_gif), '.png'));
export <- function(layer, path, main){
png(path);
plot(
layer, zlim = c(0, 150), main = main, useRaster = TRUE, interpolate = TRUE,
col = colorRampPalette(c("green", "orange", "red4"))(15)
);
sp::plot(gis_sh, add = TRUE, border = 'gray75');
sp::plot(raster::rasterToContour(layer), add = TRUE, col = 'gray35')
dev.off();
}
# <http://adv-r.had.co.nz/Functionals.html>
Map(export, raster_gif, paths_png, names(raster_gif));
system(paste0("convert -delay 40 ", pngdir, "/ra*.png docs/img/movie.gif"));
# animate(
# stack(rev(raster_gif)),
# col = colorRampPalette(c("green", "orange", "red4"))(20),
# pause=.05, n = 1
# )
## R version 3.6.3 (2020-02-29)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Debian GNU/Linux 10 (buster)
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.3.5.so
##
## locale:
## [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
## [5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
## [7] LC_PAPER=en_GB.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] ggplot2_3.3.0 htmlwidgets_1.5.1 leaflet_2.0.3 gstat_2.0-5
## [5] raster_3.0-12 rgdal_1.4-8 sp_1.4-1 jsonlite_1.6.1
## [9] httr_1.4.1
##
## loaded via a namespace (and not attached):
## [1] Rcpp_1.0.4.6 lattice_0.20-41 FNN_1.1.3
## [4] png_0.1-7 leaflet.providers_1.9.0 class_7.3-16
## [7] zoo_1.8-7 assertthat_0.2.1 digest_0.6.25
## [10] V8_3.0.2 R6_2.4.1 evaluate_0.14
## [13] e1071_1.7-3 geojson_0.3.2 highr_0.8
## [16] pillar_1.4.3 rlang_0.4.5 lazyeval_0.2.2
## [19] curl_4.3 geojsonio_0.9.2 DT_0.13
## [22] rmarkdown_2.1 labeling_0.3 urltools_1.7.3
## [25] jqr_1.1.0 stringr_1.4.0 foreign_0.8-76
## [28] triebeard_0.3.0 munsell_0.5.0 compiler_3.6.3
## [31] xfun_0.13 base64enc_0.1-3 pkgconfig_2.0.3
## [34] rgeos_0.5-2 htmltools_0.4.0 tidyselect_1.0.0
## [37] tibble_3.0.0 httpcode_0.3.0 intervals_0.15.2
## [40] codetools_0.2-16 fansi_0.4.1 spacetime_1.2-3
## [43] withr_2.1.2 dplyr_0.8.5 crayon_1.3.4
## [46] sf_0.9-1 crul_0.9.0 grid_3.6.3
## [49] gtable_0.3.0 lifecycle_0.2.0 DBI_1.1.0
## [52] magrittr_1.5 units_0.6-6 scales_1.1.0
## [55] KernSmooth_2.23-16 cli_2.0.2 stringi_1.4.6
## [58] farver_2.0.3 ellipsis_0.3.0 xts_0.12-0
## [61] vctrs_0.2.4 tools_3.6.3 glue_1.4.0
## [64] purrr_0.3.3 crosstalk_1.1.0.1 yaml_2.2.1
## [67] colorspace_1.4-1 maptools_0.9-9 classInt_0.4-3
## [70] knitr_1.28
The Polish GOŚ API is a easy-to-use interface to query the recent measure accounting for air quality at a national scale. The projection of this this information to a map layer and linear interpolation is rather straightforward. The high-level function leaflet is of particular interest to combine the different map layers into a fashionable interactive map easily integrated into a web page as a html widget.
If the map can be updated, the information projected is nonetheless static. It could be of interest to add a reactive component such as proposed by the Shiny
package so as to be able to choose a possible date. What is more, an animation cycling through available measure date and time would help perceiving the temporal trend.
Bivand, Roger, Tim Keitt, and Barry Rowlingson. 2019. Rgdal: Bindings for the ’Geospatial’ Data Abstraction Library. https://CRAN.R-project.org/package=rgdal.
Bivand, Roger S., Edzer Pebesma, and Virgilio Gomez-Rubio. 2013. Applied Spatial Data Analysis with R, Second Edition. Springer, NY. http://www.asdar-book.org/.
Chamberlain, Scott, and Andy Teucher. 2020. Geojsonio: Convert Data from and to ’Geojson’ or ’Topojson’. https://CRAN.R-project.org/package=geojsonio.
Gräler, Benedikt, Edzer Pebesma, and Gerard Heuvelink. 2016. “Spatio-Temporal Interpolation Using Gstat.” The R Journal 8 (1): 204–18. https://doi.org/10.32614/rj-2016-014.
Hijmans, Robert J. 2020. Raster: Geographic Data Analysis and Modeling. https://CRAN.R-project.org/package=raster.
Ooms, Jeroen. 2014. “The Jsonlite Package: A Practical and Consistent Mapping Between Json Data and R Objects.” arXiv:1403.2805 [stat.CO]. https://arxiv.org/abs/1403.2805.
Pebesma, Edzer J., and Roger S. Bivand. 2005. “Classes and Methods for Spatial Data in R.” R News 5 (2): 9–13. https://CRAN.R-project.org/doc/Rnews/.
Wickham, Hadley. 2019. Httr: Tools for Working with Urls and Http. https://CRAN.R-project.org/package=httr.