July 2009

Building and Installing Spatial Index on Ubuntu

This is a short post about installing Spatial Index on Ubuntu. First download the latest release; at the moment that is version 1.3.2. Then cd to the location of your download and issue the following commands:

tar xzvf spatialindex-1.3.2.tar.gz
cd spatialindex-1.3.2
./configure
make
sudo make install

If you want to customize your install, take a look at the installation notes. You can find more information about Spatial Index on the project's Trac.

Tokyo Cabinet 2 : Loading and querying point data

After setting up Tokyo Cabinet and Ruby it's time to use it. As with my post about MongoDB, I'm going to load 500,000 POIs into a database and query them with a bounding box query. I will use the table database from Tokyo Cabinet because it supports the most query options. With a table database you can query numbers with exact-match and range queries, and for strings you can do full matching, forward (prefix) matching, regular expression matching and more.
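To make those condition types concrete, here is a minimal plain-Ruby sketch of what each kind of match means on records stored as maps of column name to string value. It does not use the Tokyo Cabinet API at all; the helper `keys_where` and the sample POIs are made up for illustration.

```ruby
# Hypothetical helper: return the keys of all records whose column value
# satisfies the given block. Records mimic table database rows:
# primary key => { column name => string value }.
def keys_where(records, column)
  records.select { |_key, cols| yield cols[column] }.keys
end

# Made-up sample data, not taken from the real POI file.
pois = {
  '1' => { 'name' => 'Station Noord', 'x' => '4.71' },
  '2' => { 'name' => 'Stadhuis',      'x' => '5.30' },
  '3' => { 'name' => 'Museum',        'x' => '4.95' }
}

# full match on a string
p keys_where(pois, 'name') { |v| v == 'Museum' }
# forward (prefix) match on a string
p keys_where(pois, 'name') { |v| v.start_with?('Sta') }
# regular expression match on a string
p keys_where(pois, 'name') { |v| v =~ /huis\z/ }
# numeric range match: the stored string is compared as a number
p keys_where(pois, 'x') { |v| (4.5..5.0).include?(v.to_f) }
```

The point of the numeric case is that values are stored as strings but compared as numbers, which is what makes a bounding box query possible.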

To load the data I need to read my POI shapefile with Ruby and write the attributes to a new database. First we create the database with the following code.

require 'tokyocabinet'
include TokyoCabinet

# create the object
tdb = TDB::new

# open or create the database
if !tdb.open("poi_db.tct", TDB::OWRITER | TDB::OCREAT)
  STDERR.printf("open error: %s\n", tdb.errmsg(tdb.ecode))
end

To read the features in my shapefile I am going to use the Ruby bindings for GDAL/OGR. Because I installed Tokyo Cabinet on GISVM I already had FWTools installed but I still needed to install the Ruby bindings for it. I did this with the following command.

sudo apt-get install libgdal-ruby

Now we are going to read a shapefile with 500,000 point features and write the records to the database. First we open the shapefile and get the layer. Then we loop over the features, create a new record and fill it with the x,y information and the other fields when they aren't empty. The values need to be converted to strings, otherwise the record can't be saved. Then we put the record in the database.

require 'gdal/ogr'

# open my shapefile
dataset = Gdal::Ogr.open("poi_500000.shp")
layer = dataset.get_layer(0) 

feature_defn = layer.get_layer_defn

layer.get_feature_count.times do |i|
  record = Hash.new # create new record
  feature = layer.get_feature(i)
  geom = feature.get_geometry_ref()
  record['x'] = geom.get_x(0).to_s()
  record['y'] = geom.get_y(0).to_s()
  pkey = nil
  # use j for the field index so the feature index i is not reused
  feature_defn.get_field_count.times do |j|
    field_defn = feature_defn.get_field_defn(j)
    fieldname = field_defn.get_name_ref
    value = feature.get_field_as_string(j)
    if not value.nil? and value != ""
      if fieldname == "ID"
        pkey = value # the ID field becomes the primary key
      else
        record[fieldname] = value.to_s()
      end
    end
  end
  # fall back to a generated primary key when the feature has no ID
  pkey = tdb.genuid if pkey.nil?
  # store the record in Tokyo Cabinet
  tdb.put(pkey, record)
end
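The record-building rules used in the loop (everything is stored as a string, empty values are skipped, the ID field becomes the primary key) can be tried out without GDAL or Tokyo Cabinet. The sketch below is only an illustration: `build_record` is a hypothetical helper, and the field names and values are invented.

```ruby
# Hypothetical helper: build a table database record from coordinates and
# a list of [field_name, value] pairs. Returns [primary_key, record], where
# primary_key is nil when the key field is missing or empty.
def build_record(x, y, fields, key_field = "ID")
  record = { 'x' => x.to_s, 'y' => y.to_s }
  pkey = nil
  fields.each do |name, value|
    # skip empty attributes, as in the loading loop
    next if value.nil? || value.to_s.empty?
    if name == key_field
      pkey = value.to_s
    else
      record[name] = value.to_s
    end
  end
  [pkey, record]
end

# Invented sample feature: the empty TYPE field is dropped.
pkey, record = build_record(4.7, 50.9, [["ID", 42], ["NAME", "Station"], ["TYPE", ""]])
puts pkey
puts record.inspect
```

Keeping the ID out of the record and using it as the key means it is read back through the key returned by a query, not through a column.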

To add indexes on the x and y fields we call the following code. This creates two supplementary files called poi_db.tct.idx.x.dec and poi_db.tct.idx.y.dec.

# add index on x and y
tdb.setindex('x', TDB::ITDECIMAL)
tdb.setindex('y', TDB::ITDECIMAL)

To query the POIs in the database I created a function to query the POIs for a given bounding box and then I benchmarked it. I used the same bounding box as in my previous posts about MongoDB, Rtree, Pythonnet and PostGIS.

# query POIs by bounding box
def query(tdb, minx, maxx, miny, maxy)
 qry = TDBQRY::new(tdb)
 qry.addcond("x", TDBQRY::QCNUMGE, minx.to_s())
 qry.addcond("x", TDBQRY::QCNUMLE, maxx.to_s())
 qry.addcond("y", TDBQRY::QCNUMGE, miny.to_s())
 qry.addcond("y", TDBQRY::QCNUMLE, maxy.to_s())
 qry.setorder("x", TDBQRY::QONUMASC)

 res = qry.search
 puts res.length # number of results found
 return res
end

require 'benchmark'
puts Benchmark.measure { query(tdb, 4.5, 5.0, 50.5, 51.0) }
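To sanity-check the bounding box conditions on a small sample, it can help to have a plain-Ruby reference filter with the same semantics: inclusive bounds on x and y, with the results ordered by ascending x. This is only a sketch for cross-checking, not how Tokyo Cabinet evaluates the query; `bbox_filter` and the sample points are assumptions of mine.

```ruby
# Hypothetical reference filter. records maps primary key => column hash,
# with coordinates stored as strings, as in the table database.
# Returns the matching keys ordered by ascending numeric x.
def bbox_filter(records, minx, maxx, miny, maxy)
  hits = records.select do |_key, cols|
    x = cols['x'].to_f
    y = cols['y'].to_f
    x >= minx && x <= maxx && y >= miny && y <= maxy
  end
  hits.sort_by { |_key, cols| cols['x'].to_f }.map { |key, _cols| key }
end

# Invented sample points: 'b' is too far east, 'd' is too far north.
sample = {
  'a' => { 'x' => '4.9', 'y' => '50.7' },
  'b' => { 'x' => '5.2', 'y' => '50.7' },
  'c' => { 'x' => '4.6', 'y' => '50.9' },
  'd' => { 'x' => '4.6', 'y' => '51.3' }
}
p bbox_filter(sample, 4.5, 5.0, 50.5, 51.0)
```

Comparing the keys from this filter with the keys returned by the query function on the same handful of records is a cheap way to confirm that the minimum and maximum arguments are passed in the right order.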

The query returned 98,000 POIs. I ran the benchmark 12 times and these were the results:

  1.620000   0.190000   1.810000 (  1.866339)
  1.570000   0.030000   1.600000 (  1.625303)
  1.640000   0.030000   1.670000 (  1.668573)
  1.650000   0.000000   1.650000 (  1.664806)
  1.650000   0.020000   1.670000 (  1.708228)
  1.730000   0.010000   1.740000 (  1.744645)
  1.410000   0.310000   1.720000 (  1.749268)
  1.620000   0.050000   1.670000 (  1.724199)
  1.610000   0.010000   1.620000 (  1.657794)
  1.660000   0.020000   1.680000 (  1.680383)
  1.710000   0.020000   1.730000 (  1.767141)
  1.720000   0.010000   1.730000 (  1.809114)

According to the Ruby documentation, the benchmark outputs the user CPU time, the system CPU time, the sum of the user and system CPU times, and the elapsed real time (in parentheses). So the query took between 1.63 and 1.87 seconds of elapsed time to return a list of 98,000 POIs within the given bounding box. This is a nice indication of the speed of Tokyo Cabinet.
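The range of elapsed times can be read off the table by eye, but a few lines of Ruby make the arithmetic explicit. The sketch below takes the twelve elapsed real times (the values in parentheses above) and computes their minimum, maximum and mean; `summarize` is a hypothetical helper name.

```ruby
# Hypothetical helper: minimum, maximum and mean of a list of timings.
def summarize(times)
  total = times.inject(0.0) { |sum, t| sum + t }
  { :min => times.min, :max => times.max, :mean => total / times.size }
end

# Elapsed real times in seconds, copied from the benchmark output above.
real_times = [
  1.866339, 1.625303, 1.668573, 1.664806, 1.708228, 1.744645,
  1.749268, 1.724199, 1.657794, 1.680383, 1.767141, 1.809114
]

stats = summarize(real_times)
printf("min %.2f s, max %.2f s, mean %.2f s\n",
       stats[:min], stats[:max], stats[:mean])
```

The elapsed real time is the figure to compare against other databases, since it includes time spent waiting on disk I/O that the CPU columns do not show.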

To demonstrate how you can access the attributes I created the following code. It loops over the first 100 found POIs and prints the ID and the x- and y-coordinate.

res = query(tdb, 4.5, 5.0, 50.5, 51.0)
# print the first hundred found POIs
res.first(100).each do |rkey|
  rcols = tdb.get(rkey)
  # the ID was stored as the primary key, so print rkey instead of a column
  puts rkey.to_s() + " " + rcols['x'].to_s() + " " + rcols['y'].to_s()
end

Now we are ready to close the database. I hope you enjoyed this post and as always I welcome any comments.

# close the database
if !tdb.close
 ecode = tdb.ecode
 STDERR.printf("close error: %s\n", tdb.errmsg(ecode))
end
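Opening and closing are spread over the whole post, so a failure in between would leave the database open. One possible way to tie the two together is a small wrapper that always closes the database, even when the block raises. This is a sketch under assumptions: `with_database` is a hypothetical helper, and it relies only on the open, close, ecode and errmsg calls already used in this post.

```ruby
# Hypothetical helper: open db at path with the given mode, yield it to the
# block and always close it afterwards. Works with any object that offers
# open/close/ecode/errmsg like the TDB object used in this post.
def with_database(db, path, mode)
  unless db.open(path, mode)
    raise "open error: #{db.errmsg(db.ecode)}"
  end
  begin
    yield db
  ensure
    # report a failed close but do not hide an error raised by the block
    STDERR.puts "close error: #{db.errmsg(db.ecode)}" unless db.close
  end
end

# Usage with the table database (requires the Tokyo Cabinet bindings):
#   require 'tokyocabinet'
#   include TokyoCabinet
#   with_database(TDB::new, "poi_db.tct", TDB::OWRITER | TDB::OCREAT) do |tdb|
#     query(tdb, 4.5, 5.0, 50.5, 51.0)
#   end
```

The helper returns whatever the block returns, so the result of a query can be used after the database has been closed.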

Installing Tokyo Cabinet and Ruby on Ubuntu

After MongoDB it's time for another alternative to relational databases: Tokyo Cabinet. Tokyo Cabinet is a library of routines for managing a file-based key-value store. It's a high-performance database and it can be accessed over a network with Tokyo Tyrant. In this post I install Tokyo Cabinet, Ruby and the Ruby bindings for Tokyo Cabinet. A follow-up post will load and query POIs like I did with MongoDB and PostgreSQL/PostGIS.

Tokyo Cabinet is built for Unix-like systems such as Linux, so I installed it on an Ubuntu virtual machine. It took me some time to figure everything out, but if I can do it you can too. First you need to download the latest version of Tokyo Cabinet from the project site. Once it is downloaded, open a terminal window, navigate to the download location and issue the following commands.

tar xzvf tokyocabinet-1.4.20.tar.gz
cd tokyocabinet-1.4.20
# install dependencies
sudo apt-get install checkinstall build-essential libbz2-dev
# now compile
./configure --prefix=/usr
make clean
make
# creates and installs a Debian package
sudo checkinstall -D

I decided to use the Ruby bindings, so if you don't have Ruby yet you can install it with the command below. This installs Ruby, an interactive shell, an interactive reference, the Ruby documentation and the dev part of the Ruby Standard Library. We will need the ruby-dev package for building the Ruby bindings of Tokyo Cabinet.

sudo apt-get install ruby irb ri rdoc ruby1.8-dev

After downloading the Ruby bindings for Tokyo Cabinet from the project site, I installed them with the following commands.

tar xzvf tokyocabinet-ruby-1.26.tar.gz
cd tokyocabinet-ruby-1.26
ruby extconf.rb
make
sudo make install
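A quick way to confirm that the bindings ended up somewhere Ruby can find them is to try loading them. The snippet below is a small sketch: `bindings_available?` is a hypothetical helper, and the only assumption about the bindings is that they are loaded with require 'tokyocabinet'.

```ruby
# Hypothetical helper: true when the named library can be loaded,
# false when Ruby cannot find it.
def bindings_available?(name = 'tokyocabinet')
  require name
  true
rescue LoadError
  false
end

if bindings_available?
  puts "Tokyo Cabinet bindings loaded"
else
  puts "Tokyo Cabinet bindings not found, check the extconf.rb and make output"
end
```

If the bindings are not found, the usual suspects are a missing ruby-dev package or a Tokyo Cabinet install prefix that the build could not see.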

In the specifications document you can find a lot of information about Tokyo Cabinet and the underlying concepts. As you can read there, there are four types of databases: a hash database, a B+ tree database, a fixed-length database and a table database. In the examples directory of the Ruby bindings you will find some samples that create and use these four databases. There is also a sample that uses the abstract database API, with which you can talk to all four database types. For more info on the Ruby bindings you can read the docs.

Problems with the installation? Here are the sources that helped me with the installation process. You can also post a comment and maybe I can help out.
http://openwferu.rubyforge.org/tokyo.html
http://oui.com.br/blog/nando-en/post/installing-tokyo-cabinet
http://www.ubuntugeek.com/how-to-install-ruby-on-rails-ror-in-ubuntu.html
http://blogs.law.harvard.edu/hoanga/2006/10/27/fixing-mkmf-load-error-ruby-in-ubuntu/

Samuel Bosch
