Selecting comma separated data as multiple rows with SQLite

A while back I needed to split data stored in one column as a comma separated string into multiple rows in a SQL query from a SQLite database.

My table looked like this:

CREATE TABLE "predictor_sets" 
  (`id` INTEGER, `nvar` INTEGER, `predictors` TEXT, 
    `experiment` TEXT, PRIMARY KEY(`id`));

Insert some sample data:

INSERT INTO predictor_sets VALUES
  (1659, 5, 'BO_bathymax,BO_calcite,BO_parmax,BO_damean,BO_sstrange', 'bathymetry'),
  (1660, 5, 'BO_bathymin,BO_calcite,BO_parmax,BO_damean,BO_sstrange', 'bathymetry'),
  (1661, 5, 'BO_bathymean,BO_calcite,BO_parmax,BO_damean,BO_sstrange', 'bathymetry');

Splitting up the different predictors in seperate rows can be done with the following recursive common table expression:

WITH RECURSIVE split(predictorset_id, predictor_name, rest) AS (
  SELECT id, '', predictors || ',' FROM predictor_sets WHERE id
  SELECT predictorset_id, 
         substr(rest, 0, instr(rest, ',')),
         substr(rest, instr(rest, ',')+1)
    FROM split
   WHERE rest <> '')
SELECT predictorset_id, predictor_name 
  FROM split 
 WHERE predictor_name <> ''
 ORDER BY predictorset_id, predictor_name;

Check out the documentation for more info on writing your own common table expressions in SQLite, PostgreSQL or your favorite database.

Creating a single pixel wide raster along the coastline or another edge

At one point during my PhD I needed a single pixel of raster values along the coastline. In order to get this I used the following code in R:


coast_mask <- function(layer) {
  edges <- raster::boundaries(raster(layer, layer=1), type="inner")
  values <- getValues(edges) | values == 0

l <- load_layers("BO_sstmean", equalarea = FALSE)

mask <- coast_mask(l)
l[mask] <- NA

plot(l, col=rev(heat.colors(255)))

Source code available on GitHub at

Main R Packages used:

Creating a high resolution png from a raster in R

In this post I'll show you two different ways to create a png from an equal-area bathymetry grid in R.

Version 1: using the raster::plot function

The first option for plotting a raster is to use the raster::plot function and write this to a sufficiently large png file. This works reasonably well but the only disadvantage is that both the title and legend are really small. The title is optional but I didn't find a way to disable the legend. The main trick to get the raster::plot function to output a high resolution png consists in setting the maxpixels parameter. Colors where selected on ColorBrewer but with a small addition that the color range was reverted and that the darkest color is repeated to ensure that rare depths (-6000 to -10000) get the same color.

if(!requireNamespace("sdmpredictors")) {
x <- sdmpredictors::load_layers("BO_bathymean", equalarea = TRUE)
png("bathymetry_plot1.png", width=ncol(x), height=nrow(x))
col <- rev(c("#f7fbff", "#deebf7", "#c6dbef", "#9ecae1", "#6baed6", "#4292c6","#2171b5","#08519c", rep("#08306b",7)))
plot(x, maxpixels = ncell(x), col = col, colNA = "#818181", 
     main = "Bathymetry", axes = FALSE, ylim=extent(x)[3:4])

Version 2: write to png using leaflet colors

While this version looks and is a bit more complicated, it produces really good looking results. The main gist of this code is that it transforms the values to a range from 0 to 1000 while making sure that extreme values at both extremes will get the same colors as more common values. This is similar to what you general would do when e.g. creating a color scale with QGIS. Once values are mapped to colors they are converted to raw bytes with the right dimensions and written to a png file. Remark that this code was inspired by some of the internal functions in the leaflet package.

if(!requireNamespace("leaflet")) {
if(!requireNamespace("sdmpredictors")) {

# create colors
colors <- leaflet::colorNumeric(rev(c("#f7fbff", "#deebf7", "#c6dbef", "#9ecae1", "#6baed6", "#4292c6","#2171b5","#08519c", "#08306b")),
                                -1:1001, na.color =  "#818181")
cols <- c(colors(-1:1001), colors(NA))
x <- sdmpredictors::load_layers("BO_bathymean", equalarea = TRUE)
# scale values and remove extreme values from the color range 
vals <- values(x)
vals <- scale(vals)
minmax <- quantile(vals, probs=c(0.01, 0.99), na.rm = TRUE)
vals <- round((((vals - minmax[1]) / (minmax[2] - minmax[1])) * 1000))
vals[vals < 0] <- 0
vals[vals > 1000] <- 1000
vals[] <- 1002

# lookup colors for scaled values, convert to raw and write to file
valcolors <- cols[vals+2] # +2 because -1 and 0 are in cols (value 0 is at index 2 in cols)

rgb_data <- col2rgb(valcolors, alpha = TRUE)
raw_data <- as.raw(rgb_data)
dim(raw_data) <- c(4, ncol(x), nrow(x))

png::writePNG(raw_data, "bathymetry_plot2.png")

All source code is available on GitHub at

Main R Packages used:

Intersecting a data frame with longitude/latitudes with a polygon shapefile in R

I finally cleaned up a little script that I use to get the Spalding marine ecoregions for a set of points in R. The function needs as input a data frame with coordinates in longitude/latitude, the names of the coordinate columns (first the longitude column then the latitude column) and the path to the ecoregions shapefile. You can off course substitude this ecoregions shapefile with any other polygon shapefile (eg.countries, districts, lakes, protected areas, ...).


get_ecoregions <- function(data, xy_cols, meow_path) {
  df <- data
  row.names(df) <- 1:nrow(df)
  coordinates(data) <- xy_cols
  data@proj4string <-  CRS("+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0")
  layer <- tools::file_path_sans_ext(basename(meow_path))
  meow <- readOGR(dsn = meow_path, layer = layer,verbose = FALSE, stringsAsFactors = FALSE)
  meow@proj4string <- data@proj4string # set the same, because they are
  intersect_meow_df <- gIntersection(meow, data, byid = TRUE)
  id_pairs <- strsplit(row.names(intersect_meow_df), " ")
  id_pairs_mat <- as.matrix(, id_pairs))
  ecoids <- id_pairs_mat[,1]
  names(ecoids) <- id_pairs_mat[,2] ## store rowname from df.sp in names(ecoids)
  meow_row_names <- ecoids[row.names(data)]
  meow_row_names[] <- "invalid row name"
  cbind(df, meow@data[meow_row_names,])

# df <- data.frame(species=rep("a", 4), longitude=c(4.087523,3.66044,2.926591, 0), latitude=c(51.671932, 51.55974, 51.234695,0))
# m <- get_ecoregions(df, c("longitude", "latitude"), "D:/a/data/ecoregions/meow_ecos.shp")

Compressing TIFF files in R and Windows

Compressing TIFF files is an easy way to save disk space when working with raster files or image outputs for research papers. But this is rarely the default option causing used diskspace to quickly inflate heavily. For instance this figure 7000 by 5000 pixel image as a TIFF uses 136MB.

While after lossless compression with the libtiff command tiffcp it is only 8MB. To install libtiff on windows download and run the latest setup from To compress a tiff file you need to run the following command (assuming that tiffcp is in your path). Make sure to specify a different output file name, otherwise the original will be overwritten with a broken and incomplete copy of the original.
tiffcp -c lzw uncompressed_input.tif compressed_output.tif

In R you can also write compressed raster (geo)tiff files by adding extra options to the writeRaster function from the raster package.


x <- raster("uncompressed_input.tif")
tifoptions <- c("COMPRESS=DEFLATE", "PREDICTOR=2", "ZLEVEL=6")
writeRaster(x, "compressed_output.tif",
            options = tifoptions, overwrite = FALSE)