Samuel Bosch


What I learned: Part 2 PostgreSQL and PostGIS

My previous post in the "What I learned" series was about Linux commands. Today it's time to share some things that I learned while working with PostgreSQL and PostGIS.

Problem: Where are the PostgreSQL logs in Ubuntu?

On Ubuntu you can find them here:

cd /var/log/postgresql

With the tail command you can keep an eye on a log while it is being written. On my system the exact command for doing this is:

tail -f /var/log/postgresql/postgresql-9.1-main.log
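The version number in the file name differs between installations, so a small helper can pick the newest log for you. This is a minimal sketch, assuming the logs live in /var/log/postgresql and end in .log; latest_pg_log is a hypothetical name, not a PostgreSQL tool.

```shell
#!/bin/sh
# latest_pg_log is a hypothetical helper, not part of PostgreSQL.
# It prints the most recently modified *.log file in a directory
# (default: /var/log/postgresql, as on the Ubuntu system above).
latest_pg_log() {
    log_dir="${1:-/var/log/postgresql}"
    # ls -t sorts by modification time, newest first.
    ls -t "$log_dir"/*.log 2>/dev/null | head -n 1
}

# Usage (not run here):
#   tail -f "$(latest_pg_log)"
```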

Problem: Copying a subset of a table

I needed to develop some SQL scripts for some very big tables. To be able to quickly test my work I decided to copy the tables to a new schema but with a limited set of records.

CREATE TABLE new_table AS (SELECT * FROM original_table LIMIT number_of_rows);
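If you have to do this for several tables, you could generate the statement from the shell. The sketch below only prints the SQL; subset_sql and the table names in the usage line are made up for illustration.

```shell
#!/bin/sh
# subset_sql is a hypothetical helper: it only prints SQL, it does not run it.
# Arguments: source table, target table, maximum number of rows.
subset_sql() {
    src_table="$1"
    dst_table="$2"
    row_limit="$3"
    # Refuse anything that is not a non-negative integer.
    case "$row_limit" in
        ''|*[!0-9]*) echo "row limit must be a number" >&2; return 1 ;;
    esac
    printf 'CREATE TABLE %s AS (SELECT * FROM %s LIMIT %s);\n' \
        "$dst_table" "$src_table" "$row_limit"
}

# Usage (not run here):
#   subset_sql public.big_table test.big_table 1000 | psql your_database
```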

The PostgreSQL documentation contains more information on this topic and some extra samples.

If you also want to copy the original column constraints then it's better to create your table with the following two commands:

CREATE TABLE tbl_new (LIKE tbl_org INCLUDING CONSTRAINTS);

INSERT INTO tbl_new 
SELECT col_a, col_b, array[col] AS col_c
FROM tbl_org;

Note that primary keys, foreign keys and indexes aren't created, so if you want them you'll have to create them on the new table yourself.

Problem: Typing the same query over and over again in pgAdmin

Fortunately you can add macros to pgAdmin with the following steps:

Open a query window in pgAdmin
Click Macros -> Manage Macros
Select the key combination that you'd like, type a name and the actual query, and hit save.
Repeat for all other macro queries that you'd like

Some sample macros:

-- Inspect the first 100 records of a table.
-- Select the table and hit the keyboard shortcut you've picked.
SELECT * FROM $SELECTION$ LIMIT 100

-- how many records does a statement return. Also useful for timing queries and subqueries without having to load all the results in your Data Output window.
SELECT COUNT(*) FROM ($SELECTION$) counttable

Some extra information can be found in the official docs.

Problem: Getting the centroid of multiple points in PostGIS

SELECT ST_CENTROID(ST_COLLECT(geom)) FROM points

Note that ST_Collect is many times faster than ST_Union. The difference is that ST_Collect puts the geometries into one multi geometry and ST_Union combines geometries whenever they overlap.
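For a collection of points the centroid is simply the arithmetic mean of the coordinates. The sketch below does the same calculation on plain "x y" pairs, just to show what the query returns; mean_point is a hypothetical helper and no database is involved.

```shell
#!/bin/sh
# mean_point is a hypothetical helper for illustration only.
# It reads "x y" pairs from standard input, one per line, and prints
# the arithmetic mean of the coordinates.
mean_point() {
    awk '{ sum_x += $1; sum_y += $2; n++ }
         END { if (n > 0) printf "%.2f %.2f\n", sum_x / n, sum_y / n }'
}

# The four corners of a 2 x 2 square.
printf '0 0\n2 0\n2 2\n0 2\n' | mean_point
```

For the four corners of the square this prints 1.00 1.00, the middle of the square.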

Problem: Creating an array from a query in PostgreSQL

SELECT array(SELECT id FROM your_table);

SELECT array_agg(id) FROM your_table;

Problem: Repair polygons that are not closed in PostGIS 1.5.4

Solution: append the first point to the end of the exterior ring of the polygon when the first point is not the same as the last point. Note that polygons lose their inner rings if there were any. See also: http://lists.osgeo.org/pipermail/postgis-users/2011-December/031615.html

WITH fixed_geom AS (
  SELECT gid, ST_Collect(ST_BuildArea(ST_AddPoint(ST_EXTERIORRING(geom), ST_StartPoint(ST_EXTERIORRING(geom)), -1 /* append the point */))) geom
    FROM (SELECT gid,(ST_DUMP(geom)).geom geom FROM qc.global_shorelines) l
   WHERE NOT ST_StartPoint(ST_EXTERIORRING(geom)) = ST_EndPoint(ST_EXTERIORRING(geom))
   GROUP BY gid
)
UPDATE qc.global_shorelines gs SET geom = f.geom
  FROM fixed_geom f
 WHERE gs.gid  = f.gid

Problem: Executing a SQL script within psql

If you want to execute a SQL script within psql then start psql and type:

\i "path .. to .. file"

Problem: Cancelling a command with psql

To cancel a command hit Ctrl+c and to quit Ctrl+d. An alternative for quitting is typing \q.

Problem: Writing output of psql to a file

If you want to redirect psql output to a file instead of to the prompt then type:

\o filename.here

Type \o again without a filename to send the output back to standard output.

Problem: Get execution time from query in psql

\timing

Problem: Listing the open connections and active sql queries

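SELECT datname,usename,procpid,client_addr,waiting,query_start,current_query FROM pg_stat_activity;

This is the PostgreSQL 9.1 form of the query. From PostgreSQL 9.2 on, the procpid and current_query columns are named pid and query.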

Problem: Killing all open connections to a database

SELECT pg_terminate_backend(pg_stat_activity.procpid)
FROM pg_stat_activity
WHERE pg_stat_activity.datname = 'YOUR_DATABASE_NAME'
  AND procpid <> pg_backend_pid()
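The name of the PID column depends on the server version: it is procpid up to PostgreSQL 9.1 and pid from 9.2 on. A rough sketch of a wrapper that prints the right statement for a given server_version_num follows; pid_column and terminate_sql are hypothetical helpers, and the database name is assumed to need no quoting.

```shell
#!/bin/sh
# Both helpers are hypothetical and only print text; nothing is sent to a server.

# pid_column prints the name of the backend-PID column in pg_stat_activity
# for a given server_version_num (e.g. 90100 for 9.1.0).
pid_column() {
    if [ "$1" -ge 90200 ]; then
        echo pid        # PostgreSQL 9.2 and later
    else
        echo procpid    # PostgreSQL 9.1 and earlier
    fi
}

# terminate_sql prints the statement that disconnects every other session
# from the database named in $2. The name is assumed to need no quoting.
terminate_sql() {
    col=$(pid_column "$1")
    printf "SELECT pg_terminate_backend(%s) FROM pg_stat_activity WHERE datname = '%s' AND %s <> pg_backend_pid();\n" \
        "$col" "$2" "$col"
}

# Usage (not run here):
#   version=$(psql -At -c "SHOW server_version_num" your_database)
#   terminate_sql "$version" your_database | psql your_database
```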

Problem: Copying data from one server to another with psql

\c dbname username source_db_server
COPY (SELECT * FROM your_table WHERE some_filter) TO '/data/table_copy';
\c dbname username destination_db_server
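COPY your_table FROM '/data/table_copy';

Note that COPY reads and writes files on the database server itself, so this only works when both servers can access /data/table_copy. psql's \copy variant uses files on the client machine instead.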
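If you'd rather skip the intermediate file, one option is to pipe one psql session into another from the shell. This is a sketch under the assumption that both servers are reachable from your machine and the table already exists on the destination; copy_between_servers is a hypothetical helper and the names in the usage line are placeholders.

```shell
#!/bin/sh
# copy_between_servers is a hypothetical helper; host and table names are
# placeholders. It streams rows from one server straight into another, so no
# intermediate file is needed on either machine.
# Arguments: source host, destination host, database, table.
copy_between_servers() {
    src_host="$1"; dst_host="$2"; db="$3"; table="$4"
    psql -h "$src_host" -d "$db" -c "COPY $table TO STDOUT" |
        psql -h "$dst_host" -d "$db" -c "COPY $table FROM STDIN"
}

# Usage (not run here):
#   copy_between_servers source_db_server destination_db_server dbname your_table
```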

Problem: On the fly compression of the copy output

PostgreSQL 9.3 added a PROGRAM option to the COPY command that pipes the output directly through an external program, which runs on the database server.
This can be used to compress the copy output with gzip in the following way:

COPY test TO PROGRAM 'gzip > /data/test.gz';

To achieve the same behavior on older systems you can copy to STDOUT with psql and then pipe this to gzip.

psql -c "COPY test TO STDOUT WITH CSV" | gzip > /data/test.gz
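Loading such a file again works the other way around: decompress it and feed it to COPY ... FROM STDIN. A small sketch wrapping both directions follows; export_table_gz and import_table_gz are hypothetical helper names and connection options are left out for brevity.

```shell
#!/bin/sh
# Hypothetical helpers wrapping the psql pipeline shown above.
# Arguments: table name, path of the gzip file.

# Export a table as gzip-compressed CSV.
export_table_gz() {
    psql -c "COPY $1 TO STDOUT WITH CSV" | gzip > "$2"
}

# Load a gzip-compressed CSV file back into a table.
import_table_gz() {
    gunzip -c "$2" | psql -c "COPY $1 FROM STDIN WITH CSV"
}

# Usage (not run here):
#   export_table_gz test /data/test.gz
#   import_table_gz test /data/test.gz
```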

Problem: Tired of typing your password

After filling in my password for the n-th time while using tools like pg_dump and pg_restore I searched for a solution:

export PGPASSWORD=your_password
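An environment variable is visible to other processes of the same user, so for anything longer-lived a password file is the usual alternative. The sketch below appends one entry to ~/.pgpass; add_pgpass_entry is a hypothetical helper, all values in the usage line are placeholders, and values containing a colon or backslash would need escaping, which is not handled here.

```shell
#!/bin/sh
# add_pgpass_entry is a hypothetical helper. It appends one line in the
# hostname:port:database:username:password format to a password file
# (default: ~/.pgpass) and restricts the file to its owner, which libpq
# requires before it will use the file.
# Arguments: host, port, database, user, password, optional file path.
add_pgpass_entry() {
    pgpass_file="${6:-$HOME/.pgpass}"
    printf '%s:%s:%s:%s:%s\n' "$1" "$2" "$3" "$4" "$5" >> "$pgpass_file"
    chmod 600 "$pgpass_file"
}

# Usage (not run here; all values are placeholders):
#   add_pgpass_entry localhost 5432 your_database your_user your_password
```

With a matching entry in place, tools like pg_dump and pg_restore pick up the password without prompting.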

Conclusions

I hope some of these tips on PostgreSQL and PostGIS come in handy some day. If you want to learn more about PostgreSQL I suggest checking the excellent documentation. For all things related to performance there is the excellent book PostgreSQL 9.0 High Performance.
The best book on PostGIS is definitely PostGIS in Action. The second edition is coming out on May 28, 2014.


Building and Installing Spatial Index on Ubuntu

This is a short post about installing Spatial Index on Ubuntu. First download the latest release (at the moment that's version 1.3.2) and cd to the location of your download. Then issue the following commands:

tar xzvf spatialindex-1.3.2.tar.gz
cd spatialindex-1.3.2
./configure
sudo make install

If you want to configure your install you should take a look at the installation notes. You can find more information about Spatial Index at the project's Trac.

Installing Tokyo Cabinet and Ruby on Ubuntu

After MongoDB it's time for another alternative to relational databases: Tokyo Cabinet. Tokyo Cabinet is a library of routines for managing a file-based key-value store. It's a high-performance database and it can be accessed over a network with Tokyo Tyrant. In this post I install Tokyo Cabinet, Ruby and the Ruby bindings for Tokyo Cabinet. There will be a follow-up post where I load and query POIs like I did with MongoDB and PostgreSQL/PostGIS.

Tokyo Cabinet only works on Linux so I installed it on an Ubuntu virtual machine. It took me some time to figure everything out, but if I can do it you can too. First you need to download the latest version of Tokyo Cabinet from the project site. Once it's downloaded, open a terminal window, navigate to the download location and issue the following commands.

tar xzvf tokyocabinet-1.4.20.tar.gz
cd tokyocabinet-1.4.20/
# install dependencies
sudo apt-get install checkinstall build-essential libbz2-dev
# now compile
./configure --prefix=/usr
make clean
make
# creates and installs a Debian package
sudo checkinstall -D

I decided to use the Ruby bindings, so if you don't have Ruby yet you can install it with the command below. This installs Ruby, an interactive shell, an interactive reference, the Ruby documentation and the dev part of the Ruby Standard Library. We will need the ruby-dev package for building the Ruby bindings of Tokyo Cabinet.

sudo apt-get install ruby irb ri rdoc ruby1.8-dev

After downloading the Ruby bindings for Tokyo Cabinet from the project site, I installed them with the following commands.

tar xzvf tokyocabinet-ruby-1.26.tar.gz
cd tokyocabinet-ruby-1.26
ruby extconf.rb
make
sudo make install

In the specifications document you can find a lot of information about Tokyo Cabinet and the underlying concepts. As you can read there, there are four types of databases: a hash database, a B+ tree database, a fixed-length database and a table database. In the examples directory of the Ruby bindings you will find some samples that create and use these four databases. There is also a sample that uses the abstract database API, with which you can communicate with all four database types. For more info on the Ruby bindings you can read the docs.

Problems with the installation? Below are the sources that helped me with the installation process. You can also post a comment and maybe I can help out.
http://openwferu.rubyforge.org/tokyo.html
http://oui.com.br/blog/nando-en/post/installing-tokyo-cabinet
http://www.ubuntugeek.com/how-to-install-ruby-on-rails-ror-in-ubuntu.html
http://blogs.law.harvard.edu/hoanga/2006/10/27/fixing-mkmf-load-error-ruby-in-ubuntu/