Pages

Creating a kernel density estimate map in R

In this post I'm going to create a kernel density estimate map in R from a file with latitude/longitude coordinates. WARNING: depending on your application the following gives incorrect results because a non-spherical kernel density estimator is used with spherical data (big thanks too Brian Rowlingson for pointing that out). I didn't find any spherical kernel density estimation implementation in R but if you know any please let me know. While exploring this issue I also wrote a similar implementation as the below code for the generation of a kernel density estimate with the spatial statistics package splancs instead of KernSmooth (splancs_kernel_density.R).

To create this map we'll use the bkde2D function in the KernSmooth package to generate the kernel density estimates and the raster package for outputting a raster file. If you don't have the KernSmooth and raster packages then they can as usually be installed with:

```install.packages("KernSmooth")
install.packages("raster")
```

In order to create the 2D binned kernel density map I first loaded the KernSmooth and raster package.

```library("KernSmooth")
library("raster")
```

Then we read the input file with the coordinates of the points that we want to generate the kernel density estimate off. This input file is an export from the Ocean Biographic Information System (OBIS) and represents distribution records from the seaweed Alaria esculenta.

```input <- "Alaria_esculenta.csv"
output <- "Alaria_esculenta_density.asc"

# get the coordinates
coordinates <- records[,2:3]
```

Next we compute the 2D binned kernel density estimate. In order to limit the size of the generated output files I set all values smaller then 0.00001 to 0.

```# compute the 2D binned kernel density estimate
est <- bkde2D(coordinates,
bandwidth=c(3,3),
gridsize=c(4320,2160),
range.x=list(c(-180,180),c(-90,90)))
est\$fhat[est\$fhat<0.00001] <- 0 ## ignore very small values
```

Finally we create a raster file from the output of bkde2D, inspect it visually and export it as an ascii file.

```# create raster
est.raster = raster(list(x=est\$x1,y=est\$x2,z=est\$fhat))
projection(est.raster) <- CRS("+init=epsg:4326")
xmin(est.raster) <- -180
xmax(est.raster) <- 180
ymin(est.raster) <- -90
ymax(est.raster) <- 90
# visually inspect the raster output
plot(est.raster)
# write an ascii file
writeRaster(est.raster,output,"ascii")
```

And the output looks like this. As you can see most records are located near 50°N and 0° which corresponds to the distribution records map generated by OBIS.
The output ascii raster file can be used as an input file for OccurrenceThinner which uses the kernel density estimate grid for filtering out records from densely sampled regions in order to avoid overfitting while building species distribution models (SDM).

Code and data for this post are in my blog github repository.

Unknown said...

You're using lat-long in a 2d kernel smoother. Your kernels are, especially at 50N, more like ellipses. I'm not sure if there is a spherical kernel smoothing function in R, but if you want to smooth global data, you need it!

Samuel Bosch said...

Thanks for your comment! I didn't find a spherical kernel smoothing function in R.

Nate Wessel said...

Any idea how to weight values in the bkde2D function or another similar function?

Samuel Bosch said...

I don't know how to do it in R but CrimeStat (http://nij.gov/topics/technology/maps/Pages/crimestat.aspx) has the necessary functionality. Just make sure to check the "Use weighting variable" checkbox. But take your time to explore the documentation because the CrimeStat UI its not very intuitive.

Unknown said...

Hello ! Thanks for this tutorial. I am a beginner with R but I want to do exactly the same thing with my personal data. But R display error messages when I enter this:

coordinates <- records[,2:3]"

What is the signification of the number 2 and 3 ? The columns ?

And is there a way to change the colors of the density map ?

Samuel Bosch said...

This examples expects that columns that the second and third columns have the x and y coordinates. So yes, 2 and 3 refer too the columns.

Colors can be changed while plotting with the "col" parameter but if you want to create nicer maps I would advice to use something like QGIS.

Anonymous said...

Hey,

thanks so much for this blog. I am trying to do the same for my data. Could you maybe elaborate on the choice of bandwidth and gridsize? I've been googling around trying to find a good explanation but haven't come across one unfortunately.

Thanks so much!

Samuel Bosch said...

Bandwidth determines how smooth your kernel density map will be, see this Wikipedia article for more information on bandwith selection https://en.wikipedia.org/wiki/Kernel_density_estimation#Bandwidth_selection.
The gridsize determines the resolution of the raster grid (number of columns and rows).

Unknown said...

hi, thank you for this blog, i found it useful for my analysis, but i dont understand about this coding, can u eloberate what is x=est\$1, y=est\$x2,z=est\$fhat? from est.raster = raster(list(x=est\$x1,y=est\$x2,z=est\$fhat)).

Samuel Bosch said...

x = est\$x1 is the x-coordinate or longitude, y = est\$x2 is the y-coordinate or latitude and z = est\$fhat is the density estimate at that location. So est.raster = raster(list(x=est\$x1,y=est\$x2,z=est\$fhat)) will create a raster with as value the estimated density.