What I learned: Part 2 PostgreSQL and PostGIS

In my previous post in the "What I learned" series on Linux commands. Today its time to share some things that I learned while working with PostgreSQL and PostGIS.

Problem: Where are the PostgreSQL logs in ubuntu?

On Ubuntu you can find them here:
cd /var/log/postgresql
With the tail command you can keep an eye on the logs while it is being written to. On my system the exact command for doing this is:
tail -f /var/log/postgresql/postgresql-9.1-main.log

tail postgresql log

Problem: Copying a subset of a table

I needed to develop some SQL scripts for some very big tables. To be able to quickly test my work I decided to copy the tables to a new schema but with a limited set of records.
CREATE TABLE  AS (SELECT * FROM  LIMIT );
The PostgreSQL documentation contains more information on this topic and some extra samples.

If you also want to copy the original column constraints then its better to create your table with the following two commands:
CREATE TABLE tbl_new (LIKE tbl_org INCLUDING CONSTRAINTS);

INSERT INTO tbl_new 
SELECT col_a, col_b, array[col] aS col_c
FROM tbl_org;

Note that primary and foreign keys and indexes aren't created so if you want them you'll have to create them for your new table.

Problem: Typing the same query over and over again in pgAdmin

Fortunatly you can add macros to pgAdmin with following steps:
  1. Open a query window in pgAdmin
  2. Click Macros -> Manage Macros
  3. Click the key combination that you'd like and type a name and the actual query and hit save.
  4. Repeat for all other macro queries that you'd like
pgAdmin macros screenshot

Some sample macros:
-- inspect the first 100 records of a table. 
-- Select the table and hit the keyboard shortcut you've picked.
SELECT * FROM $SELECTION$ LIMIT 100
-- how many records does a statement return. Also useful for timing queries and subqueries without having to load all the results in your Data Output window.
SELECT COUNT(*) FROM ($SELECTION$) counttable
Some extra information can be found in the official docs.

Problem: Getting the centroid of multiple points in PostGIS

SELECT ST_CENTROID(ST_COLLECT(geom)) FROM points
Note that ST_Collect is many times faster than ST_Union. The difference is that ST_Collect puts the geometries into one multi geometry and ST_Union combines geometries whenever they overlap.

Problem: Creating an array from a query in PostgreSQL

SELECT array(SELECT id FROM table)
Or
SELECT array_agg(id) FROM table

Problem: Repair polygons that are not closed in PostGIS 1.5.4

Solution add the first point at the end of the exterior ring of the polygon when the first point is not the same as the last point. Note that polygons lose their inner rings if there were any. See also: http://lists.osgeo.org/pipermail/postgis-users/2011-December/031615.html
WITH fixed_geom As (
  SELECT gid, ST_Collect(ST_BuildArea(ST_AddPoint(ST_EXTERIORRING(geom), ST_StartPoint(ST_EXTERIORRING(geom)), -1 /* append the point */))) geom
    FROM (SELECT gid,(ST_DUMP(geom)).geom geom FROM qc.global_shorelines) l
   WHERE NOT ST_StartPoint(ST_EXTERIORRING(geom)) = ST_EndPoint(ST_EXTERIORRING(geom))
   GROUP BY gid
)
UPDATE qc.global_shorelines gs SET geom = f.geom
  FROM fixed_geom f
 WHERE gs.gid  = f.gid

Problem : Executing a SQL script within psql

If you want to execute a SQL script within psql then start psql and type:
\i "path .. to .. file"

Problem: Cancelling a command with psql 

To cancel a command hit Ctrl+c and to quit Ctrl+d. An alternative for quitting is typing \q.

Problem: Writing output of psql to a file

If you want to redirect psql output to a file instead of to the prompt then type:
\o filename.here
Type \o again to restore to the standard output.

Problem: Get execution time from query in psql

\timing

Problem: Listing the open connections and active sql queries

SELECT datname,usename,procpid,client_addr,waiting,query_start,current_query FROM pg_stat_activity;

Problem: killing all open connections to a database

SELECT pg_terminate_backend(pg_stat_activity.procpid)
FROM pg_stat_activity
WHERE pg_stat_activity.datname = 'YOUR_DATABASE_NAME'
  AND procpid <> pg_backend_pid()

Problem: Copying data from one server to another with psql

\c dbname username source_db_server
COPY (SELECT * FROM your_table WHERE some_filter) TO '/data/table_copy';
\c dbname username destination_db_server
COPY '/data/table_copy' TO your_table

Problem: On the fly compression of the copy output

To solve this problem in PostgreSQL 9.3 an option was added to the COPY command to directly pipe the output through an external program.
This can be used to pipe the copy output to gzip in the following way:
COPY test TO PROGRAM 'gzip > /data/test.gz';
To achieve the same behavior on older systems you can copy to STDOUT with psql and then pipe this to gzip.
psql -c "COPY test TO STDOUT WITH CSV" | gzip > /data/test.gz

Problem: Tired of typing your password

After filling in my password for the n-th time while using tools like pg_dump and pg_restore I searched for a solution:
export PGPASSWORD=

Conclusions

I hope some of the tips on PostgreSQL and PostGIS might come in handy some day. If you want to learn more on PostgreSQL I would suggest you to check the excellent documentation. For all things related to performance there is the excellent book: PostgreSQL 9.0 High Performance.
The best book on PostGIS is definitely PostGIS in Action. The second edition is coming out in may 28, 2014.

Related Posts

Loading and querying spatial data with PostGIS
What I learned: tmux, top, ls and pscp

No comments: