1 Introduction

The low quality of air in Poland is responsible for the anticipated death of 50000 people every year. Not only fatalities, life comfort of inhabitants is impacted with smog triggering chronicle respiratory difficulties, nauseas or headaches. Fight against smog encompasses a large range of action supported by public funds, from promotion of more recent furnace to the local ban of coal combustions.

A cornerstone of public measure lays in the ability to describe an initial state and monitor its evolution; it makes possible the investigation of the expected improvement associated with a strategy. In this respect, a network of station measuring air quality across Poland provides hourly evaluation of various pollutant. The station data are either directly delivered through synthetic maps or less refined as raw data (a measure at a give date and time for a specific pollutant monitored by a peculiar station).

Stations provide air quality estimation in a specific location and provide therefore a localised description of the air quality. The number of station changes with time, with a general trend to increase the number of stations available. In this condition, the stated of air quality is not available for the whole territory.

To account for spacial discrete distribution of station, and variation with time of data acquisition availability the complete network of air quality monitoring can be interpolated so as to obtain a local estimation in any position within the map. This will help representing the general situation of air quality in Poland and also understanding the limit of the current air quality monitoring schemes. The approach presented hereafter is a possible general framework for the generation of interpolated maps along with prediction variance estimations.

2 Material and methods

2.1 Principle

The general idea is to retrieve and select air quality data from station scattered across the country and represent them on a map. The stations represent singular points, the scattered values are used for interpolation so as to estimated the air quality over all territory in a raster. The raster, the country boundaries (shape), the station measure (points) are then stacked over an open street map background.

2.2 Data

Updated PM10 data are available via an API interface: https://powietrze.gios.gov.pl/pjp/content/api.

The map shape was downloaded from the Geographic Information System of the COmmission (GISCO), a service within Eurostats. The GISCO proposes a nomenclature of territorial units for Statistics, named NUTS. The NUTS of the year 2016 was downloaded before running the application1.

Three Coordinate Reference System are used through this example: world scale data are based on the WGS84 (EPSG 4326) and station localisation is also given in this coordinate system, the European territorial unite are initially served in the ETRS89-extended / LAEA Europe (EPSG 3035), the Polish wide data are projected in ETRS89 / Poland CS92 (EPSG 2180)2. For CRS conversions with R, find out more at https://www.nceas.ucsb.edu/sites/default/files/2020-04/OverviewCoordinateReferenceSystems.pdf.

2.3 Packages

  • The API queries were managed with httr (Wickham 2019) and jsonlite (Ooms 2014) and geojsonio (Chamberlain and Teucher 2020).
  • The geographic data were handled with sp (Pebesma and Bivand 2005; Bivand, Pebesma, and Gomez-Rubio 2013), rgdal (Bivand, Keitt, and Rowlingson 2019), raster (Hijmans 2020) and gstat (Gräler, Pebesma, and Heuvelink 2016).
  • The interactive rendering was managed with leaflet and htmlwidgets.
library(httr);
library(jsonlite);
library(rgdal);
library(raster);
library(gstat);
library(leaflet);
library(htmlwidgets);

dir_wd <- file.path(".");
setwd(dir_wd);
paths <- list(
  f01 = file.path(
    'extdep', 'gisco', 'ref-nuts-2016-60m', 'NUTS_RG_60M_2016_3035'
    ),
  f02 = file.path(getwd(), "docs", "html", "html_widget", "PM10_map.html"),
  f03 = file.path("data", "station.rda"),
  f04 = file.path("data", "sensor.rda"),
  f05 = file.path("data", "measure.rda"),
  f06 = file.path("data", "shape.rda")
);

write_ref <- c(TRUE, FALSE)[2];
use_ref   <- c(TRUE, FALSE)[1];

3 Results

3.1 PM10 data

The API interface was used to query the list of available stations.

path     <- "http://api.gios.gov.pl/pjp-api/rest/station/findAll";
request  <- httr::GET(url = path);
response <- httr::content(request, as = "text", encoding = "UTF-8");
station  <- jsonlite::fromJSON(response, flatten = TRUE);

head(station)
if(write_ref) {
  save(station, file = paths$f03)
} else if(use_ref){
  load(paths$f03, verbose = TRUE);
} else {}

Then, for every station, the list of sensors were stacked in a table.

sensor <- lapply(
  X   = station$id, 
  FUN = function(x){

    y <- paste0("http://api.gios.gov.pl/pjp-api/rest/station/sensors/", x);
    y <- httr::GET(url = y);
    y <- httr::content(y, as = "text", encoding = "UTF-8");
    y <- jsonlite::fromJSON(y, flatten = TRUE);

    return(y);
  }
  );

sensor <- do.call(sensor, what = rbind);

head(sensor);
if(write_ref) {
  save(sensor, file = paths$f04)
} else if(use_ref) {
  load(paths$f04, verbose = TRUE);
} else {}

Finally, measures are extracted for a selected air quality parameter, (e.g. PM10).

param_name <- unique(sensor$param.paramName);

measure    <- subset(sensor, param.paramName == "pył zawieszony PM10");
measure    <- split( measure, f = measure$id);

measure <- lapply(
  X   = measure,
  FUN = function(x){

    #     print(x$id);
    y <- paste0("http://api.gios.gov.pl/pjp-api/rest/data/getData/", x$id);
    y <- httr::GET(url = y);
    y <- httr::content(y, as = "text", encoding = "UTF-8");
    y <- jsonlite::fromJSON(y, flatten = TRUE);
    y <- merge(x, y$value);

    return(y);
  }
  );

measure <- do.call(rbind, measure);

head(measure)

date_range <- range(
  strptime(unique(measure$date), format = "%Y-%m-%d %H:%M:%S")
  );
if(write_ref) {
  save(measure, date_range, file = paths$f05)
} else if(use_ref) {
  load(paths$f05, verbose = TRUE);
} else {}

3.2 GIS objects

3.2.1 Shape

The shape data were downloaded querying the GISCO API. Eventually, the shape file could be downloaded3 and then loaded into the session. The Shape is then restricted to the voivodeships (second administrative division) in Poland. The shape Coordinate Reference System was transformed in the World Geodetic System 1984 (WGS84).

3.2.1.1 Via API Query

gis_sh <- geojsonio::geojson_read(
  file.path(
    "https://ec.europa.eu/eurostat/cache",
    "GISCO/distribution/v2/nuts/geojson",
    "NUTS_RG_60M_2016_3035_LEVL_2.geojson"
    ),
  what = "sp"
  );
gis_sh <- subset(gis_sh, CNTR_CODE == "PL");
# Conversion to ETRS89 Poland CS92 (EPSG 2180)
gis_sh <- sp::spTransform(gis_sh, sp::CRS("+init=epsg:2180"));
if(write_ref) {
  save(gis_sh, file = paths$f06)
} else if (use_ref){
  load(paths$f06, verbose = TRUE);
} else {}
sp::plot(gis_sh);
Poland shape file.

Poland shape file.

3.2.1.2 Via downloaded files

gis_sh <- rgdal::readOGR(dsn = paths$f01, layer = "NUTS_RG_60M_2016_3035");
gis_sh <- subset(     gis_sh, CNTR_CODE == "PL" & LEVL_CODE == 2);
# Conversion to ETRS89 Poland CS92
gis_sh <- sp::spTransform(gis_sh, sp::CRS("+init=epsg:2180"));
if(write_ref) {
  save(gis_sh, file = paths$f06)
} else if (use_ref){
  load(paths$f06, verbose = TRUE);
} else {}
sp::plot(gis_sh);
Poland shape file.

Poland shape file.

3.2.2 Points

The PM10 data prepared earlier were restricted to the latest date and time with recorded by a sufficient number of station. The points were then added to the available shape previously obtained.

library(ggplot2);
dtaplot <- aggregate(
    value ~ date + param.paramCode,
    data = subset(measure, !is.na(value)),
    FUN = length
    )
ggplot(
  data = dtaplot,
  mapping = aes(y = value, x = date)
  ) + geom_point(
  ) + coord_cartesian(
  ylim = range(c(dtaplot$value, 0))
  ) + geom_hline(
  yintercept = quantile(dtaplot$value, probs = 0.85)
  ) + theme(
  axis.text.x=element_text(angle = 45, hjust = 1),
  );

sel_date <- with(measure, table(date[!is.na(value)]));
sel_date <- sel_date[order(names(sel_date))];
sel_date <- sel_date[sel_date >= quantile(sel_date, probs = 0.85)];
sel_date <- names(rev(sel_date))[1];
if(use_ref) { sel_date <- "2020-04-16 14:00:00" }

fetch_pts <- function(
  datetime,
  measure = measure, station = station
  ){

  gis_pts <- subset(measure, date == datetime);
  gis_pts <- subset(gis_pts, !(is.na(value)));
  gis_pts <- merge(
    x = station, by.x = "id",
    y = gis_pts, by.y = "stationId"
    );

  gis_pts <- within(
    gis_pts,
    {
      lon <- as.numeric(gegrLon);
      lat <- as.numeric(gegrLat);
    }
    );

  gis_pts <- sp::SpatialPointsDataFrame(
    coords = gis_pts[c('lon', 'lat')], data = gis_pts,
    proj4string = sp::CRS("+init=epsg:4326")
    );
  gis_pts <- sp::spTransform(gis_pts, sp::CRS("+init=epsg:2180"));

  return(gis_pts);

}

gis_pts <- fetch_pts(datetime = sel_date, measure = measure, station = station);
sp::plot(gis_sh);
sp::plot(gis_pts, add = TRUE);
Addition of station location to the shape file.

Addition of station location to the shape file.

Density of points was heterogeneous, while some area were poorly monitored, some other area cumulated locally many stations. The map was then divided into a grid composed of 300 rows and 300 columns, defining therefore 90000 squares; the grid and squares, are later referred as raster and pixels. The value of each pixel was estimated as the average of the local value falling in the area covered by that pixel.

3.2.3 Raster

# I have sample points in Poland, I want an estimation in every point in Poland.
gis_grid <- raster::raster(
    ncols=300, nrows=300, ext = extent(gis_sh),
    crs = sp::proj4string(gis_sh)
    ); 

gis_raster <- raster::rasterize(
  x = gis_pts, field = "value", fun = mean,
  y = gis_grid
  );
sp::plot(gis_sh);
sp::plot(gis_raster, add= TRUE); 
sp::plot(gis_pts, add = TRUE, pch = 1);
From station to pixel value estimation.

From station to pixel value estimation.

A linear interpolation was computed to fill the raster gap.

gis_mod <- gstat::gstat(id = "PM10", formula = value~1, data = gis_pts);

gis_raster_intpl <- raster::interpolate(gis_raster , gis_mod);
gis_raster_intpl <- raster::crop(gis_raster_intpl, raster::extent(gis_sh));
gis_raster_intpl <- raster::mask(gis_raster_intpl, gis_sh);
sp::plot(gis_raster_intpl); 
sp::plot(gis_sh, add= TRUE);
sp::plot(gis_pts, add = TRUE, pch = 1);
Linear interpolation of local PM10 measures.

Linear interpolation of local PM10 measures.

Eventually, zone contour were delimited and a more appropriate colour scale was proposed.

sp::plot(
  gis_raster_intpl, col = colorRampPalette(c("green", "orange", "red4"))(15),
  useRaster = TRUE, interpolate = TRUE,
  ); 
sp::plot(gis_sh, add = TRUE, border = 'gray75');
sp::plot(gis_pts, add = TRUE, col = 'blue');
sp::plot(raster::rasterToContour(gis_raster_intpl), add = TRUE)
Automatic detection of contours.

Automatic detection of contours.

interpolate_raster <- function(gis_sh = gis_sh, gis_pts = gis_pts){

  gis_grid <- raster::raster(
    ncols=300, nrows=300, ext = extent(gis_sh),
    crs = sp::proj4string(gis_sh)
    ); 

  gis_raster <- raster::rasterize(
    x = gis_pts, field = "value", fun = mean,
    y = gis_grid
    );

  gis_mod <- gstat::gstat(id = "PM10", formula = value~1, data = gis_pts);

  gis_raster_intpl <- raster::interpolate(gis_raster , gis_mod);
  gis_raster_intpl <- raster::crop(gis_raster_intpl, raster::extent(gis_sh));
  gis_raster_intpl <- raster::mask(gis_raster_intpl, gis_sh);

  return(gis_raster_intpl);

}

gis_raster_intpl <- interpolate_raster(gis_sh = gis_sh, gis_pts = gis_pts)

3.3 Interactive map

Finally, the raster, the shape and the point were overlayed on a Open Street Map background. The leaflet package proposes a convenient high-level graphical programming interface were layers are added to each other. Note that a colour scale was refined so as to correspond to Polish standards for which a green-to-red gradient is limited to PM10 concentrations between 0 and 150.

mymax <- 150;
gis_raster_intpl[gis_raster_intpl >= mymax] <- mymax-1;

# Conversion of the objects into CRS WGS84 (epsg 4326)
gis_sh  <- sp::spTransform(gis_sh, sp::CRS("+init=epsg:4326"));
gis_pts <- sp::spTransform(gis_pts, sp::CRS("+init=epsg:4326"));
gis_raster_intpl <- projectRaster(
  gis_raster_intpl, crs = sp::CRS("+init=epsg:4326")
);

pal  <- leaflet::colorNumeric(
  colorRampPalette(c("green", "orange", "red4"))(15),
  domain   = c(0, mymax),
  reverse  = FALSE,
  na.color = NA
  );

m <- leaflet::leaflet(gis_sh);
m <- leaflet::addTiles(map = m);
m <- leaflet::addProviderTiles(
  map = m, leaflet::providers[[37]]#3, 36, 42, 53,61, 62, 63
  );
m <- leaflet::addRasterImage(
  map = m, gis_raster_intpl, colors = pal, opacity = .8
  );
m <- leaflet::addLegend(
  map = m, pal = pal, values = raster::values(gis_raster_intpl)
  );
m <- leaflet::addPolylines(
  map = m, weight = 1, opacity = .8, color = "#EEEEEE"
  );
m <- leaflet::addPolylines(
  map = m, weight = 1, opacity = .8, color = "#949494",
  data = raster::rasterToContour(gis_raster_intpl)
  );
m <- leaflet::addCircleMarkers(
  map   = m,
  lat   = gis_pts$lat,
  lng   = gis_pts$lon,
  popup = paste(gis_pts$value, gis_pts$stationName),
  clusterOptions = leaflet::markerClusterOptions(),
  opacity     = 1,
  weight      = 2,
  fillOpacity = 1,
  fillColor   = pal(ifelse(gis_pts$value >= mymax, mymax -1, gis_pts$value))
  );
m <- leaflet::addControl(
  map = m,
  paste0(
    '<div style="text-align:center"><h3>',sel_date, '</h3>',
    '<a href="https://fcacollin.github.io/Latarnia">
    <img
    src="http://fcacollin.github.io/Latarnia/signature/logo_signature.png"
    width="100px">
    </a>
    <p>Administrative boundaries:<br>
    © EuroGeographics for the<br>
    administrative boundaries</p>
    </div>'
    ),
  position = 'bottomright'
  );

htmlwidgets::saveWidget(m, file = paths$f02);
# m;

Open in a new tab.

gis_sh  <- sp::spTransform(gis_sh, sp::CRS("+init=epsg:2180"));

datetimes        <- unique(measure$date);
names(datetimes) <- datetimes;

raster_gif <- lapply(
  X   = datetimes,
  FUN = function(x){

    print(x)
    gis_pts <- fetch_pts(datetime = x, measure = measure, station = station);
    gis_raster_intpl <- interpolate_raster(gis_sh = gis_sh, gis_pts = gis_pts);

    mymax <- 150;
    gis_raster_intpl[gis_raster_intpl > mymax] <- mymax;

    return(gis_raster_intpl);
  }
);

raster_gif <- raster_gif[order(names(raster_gif))];
pngdir     <- tempdir()
paths_png  <- file.path(pngdir, paste0("raster", names(raster_gif), '.png'));

export <- function(layer, path, main){

  png(path);
  plot(
    layer, zlim = c(0, 150), main = main, useRaster = TRUE, interpolate = TRUE,
    col = colorRampPalette(c("green", "orange", "red4"))(15)
    );
  sp::plot(gis_sh, add = TRUE, border = 'gray75');
  sp::plot(raster::rasterToContour(layer), add = TRUE, col = 'gray35')
  dev.off();

}

# <http://adv-r.had.co.nz/Functionals.html>
Map(export, raster_gif, paths_png, names(raster_gif));
system(paste0("convert -delay 40 ", pngdir, "/ra*.png docs/img/movie.gif"));

# animate(
#   stack(rev(raster_gif)),
#   col = colorRampPalette(c("green", "orange", "red4"))(20),
#   pause=.05, n = 1
# )
The rasters were stacked in an animated gif.

The rasters were stacked in an animated gif.

3.4 Session information

## R version 3.6.3 (2020-02-29)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Debian GNU/Linux 10 (buster)
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.3.5.so
## 
## locale:
##  [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
##  [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
##  [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] ggplot2_3.3.0     htmlwidgets_1.5.1 leaflet_2.0.3     gstat_2.0-5      
## [5] raster_3.0-12     rgdal_1.4-8       sp_1.4-1          jsonlite_1.6.1   
## [9] httr_1.4.1       
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_1.0.4.6            lattice_0.20-41         FNN_1.1.3              
##  [4] png_0.1-7               leaflet.providers_1.9.0 class_7.3-16           
##  [7] zoo_1.8-7               assertthat_0.2.1        digest_0.6.25          
## [10] V8_3.0.2                R6_2.4.1                evaluate_0.14          
## [13] e1071_1.7-3             geojson_0.3.2           highr_0.8              
## [16] pillar_1.4.3            rlang_0.4.5             lazyeval_0.2.2         
## [19] curl_4.3                geojsonio_0.9.2         DT_0.13                
## [22] rmarkdown_2.1           labeling_0.3            urltools_1.7.3         
## [25] jqr_1.1.0               stringr_1.4.0           foreign_0.8-76         
## [28] triebeard_0.3.0         munsell_0.5.0           compiler_3.6.3         
## [31] xfun_0.13               base64enc_0.1-3         pkgconfig_2.0.3        
## [34] rgeos_0.5-2             htmltools_0.4.0         tidyselect_1.0.0       
## [37] tibble_3.0.0            httpcode_0.3.0          intervals_0.15.2       
## [40] codetools_0.2-16        fansi_0.4.1             spacetime_1.2-3        
## [43] withr_2.1.2             dplyr_0.8.5             crayon_1.3.4           
## [46] sf_0.9-1                crul_0.9.0              grid_3.6.3             
## [49] gtable_0.3.0            lifecycle_0.2.0         DBI_1.1.0              
## [52] magrittr_1.5            units_0.6-6             scales_1.1.0           
## [55] KernSmooth_2.23-16      cli_2.0.2               stringi_1.4.6          
## [58] farver_2.0.3            ellipsis_0.3.0          xts_0.12-0             
## [61] vctrs_0.2.4             tools_3.6.3             glue_1.4.0             
## [64] purrr_0.3.3             crosstalk_1.1.0.1       yaml_2.2.1             
## [67] colorspace_1.4-1        maptools_0.9-9          classInt_0.4-3         
## [70] knitr_1.28

4 Discussion

The Polish GOŚ API is a easy-to-use interface to query the recent measure accounting for air quality at a national scale. The projection of this this information to a map layer and linear interpolation is rather straightforward. The high-level function leaflet is of particular interest to combine the different map layers into a fashionable interactive map easily integrated into a web page as a html widget.

If the map can be updated, the information projected is nonetheless static. It could be of interest to add a reactive component such as proposed by the Shiny package so as to be able to choose a possible date. What is more, an animation cycling through available measure date and time would help perceiving the temporal trend.

References

Bivand, Roger, Tim Keitt, and Barry Rowlingson. 2019. Rgdal: Bindings for the ’Geospatial’ Data Abstraction Library. https://CRAN.R-project.org/package=rgdal.

Bivand, Roger S., Edzer Pebesma, and Virgilio Gomez-Rubio. 2013. Applied Spatial Data Analysis with R, Second Edition. Springer, NY. http://www.asdar-book.org/.

Chamberlain, Scott, and Andy Teucher. 2020. Geojsonio: Convert Data from and to ’Geojson’ or ’Topojson’. https://CRAN.R-project.org/package=geojsonio.

Gräler, Benedikt, Edzer Pebesma, and Gerard Heuvelink. 2016. “Spatio-Temporal Interpolation Using Gstat.” The R Journal 8 (1): 204–18. https://doi.org/10.32614/rj-2016-014.

Hijmans, Robert J. 2020. Raster: Geographic Data Analysis and Modeling. https://CRAN.R-project.org/package=raster.

Ooms, Jeroen. 2014. “The Jsonlite Package: A Practical and Consistent Mapping Between Json Data and R Objects.” arXiv:1403.2805 [stat.CO]. https://arxiv.org/abs/1403.2805.

Pebesma, Edzer J., and Roger S. Bivand. 2005. “Classes and Methods for Spatial Data in R.” R News 5 (2): 9–13. https://CRAN.R-project.org/doc/Rnews/.

Wickham, Hadley. 2019. Httr: Tools for Working with Urls and Http. https://CRAN.R-project.org/package=httr.


  1. https://ec.europa.eu/eurostat/cache/GISCO/distribution/v1/nuts-2016.html

  2. https://epsg.io/2180, https://pl.wikipedia.org/wiki/Uk%C5%82ad_wsp%C3%B3%C5%82rz%C4%99dnych_1992

  3. https://ec.europa.eu/eurostat/cache/GISCO/distribution/v1/nuts-2016.html