Samuel Bosch

Showing posts with label geoprocessing. Show all posts

Looping over a Workspace

Ever wanted to execute a function on all the tables and/or featureclasses in a workspace then the following code is something for you. What it basically does is first loop over all the featureclasses that are in featuredatasets. Then loop over all the other featureclasses and finally loop over all the tables in the provided workspace.

To achieve this and still be able to retrieve the result of each function execution on the tables and featureclasses I created a generator function. I did this by using the yield statement. In order to avoid duplicate code I also created a nested function for looping over the featureclasses.

def executeiter(gp, workspace, featclassfunction=None, tablefunction=None):
    import os
    
    def loopfcs():
        fcs = gp.listfeatureclasses()
        for fc in iter(fcs.next, None):
            yield featclassfunction(fc)
    
    gp.workspace = workspace
    if featclassfunction is not None:
        for dataset in iter(gp.listdatasets().next, None):
            datasetworkspace = os.path.join(workspace, dataset)
            gp.workspace = datasetworkspace
            for result in loopfcs():
                yield result

        gp.workspace = workspace
        for result in loopfcs():
            yield result
            
    if tablefunction is not None:
        tables = gp.listtables()
        for table in iter(tables.next, None):
            yield tablefunction(table)

If you don't need the result of the function you can call the following function. It takes the same arguments as the executeiter function but its a regular function instead of a generator.

def execute(gp, workspace, featclassfunction=str, tablefunction=str):
    for x in executeiter(workspace, featclassfunction, tablefunction):
        pass

To demonstrate the usage of my code I first initialized a geoprocessing object and a workspace variable. Then I used the executeiter method to make it return the uppercase version of the name of the tables and featureclasses in my workspace. I could also have passed for example the describe or the listfields method of the geoprocessing object or a custom function. When you want to pass a function that doesn't return a result like the deleterows or deletefeatures function its more convenient to call the execute function.

import arcgisscripting, string

gp = arcgisscripting.create()
workspace = r'D:\temp\temp.gdb' # path to your workspace

# print the uppercase names of all the tables and featureclasses
for uppername in executeiter(gp, workspace, string.upper, string.upper):
    print uppername

# delete all rows of all the tables and featureclasses
execute(gp, workspace, gp.deletefeatures, gp.deleterows)

I've come to the end of this post. Did I miss something ? Know a Python idiom I really should start using ? Feel free to comment.

Geoprocessing 1 : Intersect

The intersect operation is one of the many overlay operations. When intersecting layer A with a layer B the result will include all those parts that occur in both A and B. In this post I'm going to compare the results of the ArcGIS Intersect tool with the corresponding operation in 3 other GIS packages. The reason why did this was that I noticed that the output from ArcGIS contains more features then I expected when dealing with overlapping geometries. This is usually not a problem but it becomes one when you have lots of overlapping geometries.

The GIS tools I used to compare the result of ArcGIS with were : uDig, Quantum GIS and gvSIG. All these and some more can be found on Portable GIS. Portable GIS brings open source GIS to your usb.

To test the code I prepared 2 shapefiles. One with a long small polygon and the other one with 3 overlapping rectangles (see image below).

When intersecting with ArcGIS (version 9.2 and 9.3 tested) the output looks like below. As you can see in the attribute table the result contains 9 polygons.

To be able to do the intersect operation with uDig I had to install the Axios Spatial Operations Extension. You can install this extension by clicking Help -> Find and install from the menubar. With Quantum GIS I needed to enable the ftools plugin to enable the geoprocessing functionality. The results of the 3 used open source GIS packages where the same. Only 3 features where created in the output shapefile. For completeness I added the screenshots of the results.

uDig :

Quantum GIS :

gvSIG :

Have any comments or questions ? Let me know !

Python Toolbox 3 : Pythonnet

After logging and Pygments I want to uncover another useful Python tool called pythonnet. Pythonnet provides a way to call .NET assemblies from within Python. Its usage is very similar to that of IronPython and also comparable to that of ctypes which I used in my post about using the ArcGIS projection engine dll (pe.dll) to project coordinates.

To get working binaries for this project you need to make your hands a little bit dirty but with some small effort you're good to go. And although the last release is rather old I've read about people using it on Python 2.6 with .NET 3.5. Before anything I suggest you to read the readme because this will explain and clarify a lot of things. What I did is download the project from sourceforge and open the visual studio solution. Then I applied the patch described here. For more information about this patch and sample code to test whether the patch was applied successfully I suggest you to read this. Finally you have to build the solution and copiy the clr.pyd and Python.Runtime.dll to the root of your Python installation. In my case this is C:\Python24.
To test if everything went fine you can take a look at the demo directory in the pythonnet folder. I had to add import clr before the other import statements to make the samples work. This is not a bug but merely due to the fact that I used the standard Python executable.

One of the possible uses of pythonnet in a GIS context is when you're creating some arcgisscripting code but want to do things that can't be done with the Python bindings of ArcGIS. One of those things is a spatial filter although you can create a workaround with an extra query layer and the SelectLayerByLocation tool. I first created a static method called Hello to test if pythonnet worked with my dll. Then I created the CreateLayerFromSpatialIntersect method. The goal of this method is to do a bounding box query and export the queried features. This method takes input the path to a shapefile, a destination folder and name and the minimum and maximum values of x and y. The found features are exported to a new shapefile. I also added a helper method to get an IFeatureClass object from the shapefile. I also used the LicenseInitializer class. This class is part of the ArcGIS Console application template and has the sole purpose of getting and releasing an ArcGIS license.

using System;
using ESRI.ArcGIS.Geodatabase;
using ESRI.ArcGIS.Geometry;
using ESRI.ArcGIS.GeoDatabaseUI;
using ESRI.ArcGIS.esriSystem;
using ESRI.ArcGIS.DataSourcesFile;
namespace GisSolved.SpatialQueryTool
{
    public class SpatialQueryTool
    {

        public static string Hello()
        {
            return "Hello !!!";
        }
        public static void CreateLayerFromSpatialIntersect(string shapeSourcePath, string destinationFolder, string destinationName, double xMin, double xMax, double yMin,double yMax)
        {
            LicenseInitializer licenseInitializer = new LicenseInitializer();
            licenseInitializer.InitializeApplication(
    new esriLicenseProductCode[] { esriLicenseProductCode.esriLicenseProductCodeArcView },
    new esriLicenseExtensionCode[] { });

            IFeatureClass sourceFeatureClass = GetShapeFile(shapeSourcePath);

            // create  the spatial filter
            ISpatialFilter spatialFilter = new SpatialFilterClass();
            spatialFilter.SpatialRel = esriSpatialRelEnum.esriSpatialRelIntersects;

            IEnvelope boundingBox = new EnvelopeClass();
            boundingBox.XMax = xMax;
            boundingBox.XMin = xMin;
            boundingBox.YMax = yMax;
            boundingBox.YMin = yMin;

            spatialFilter.Geometry = (IGeometry)boundingBox;

            // create other parameters for the export operation
            IDataset sourceDataset = (IDataset)sourceFeatureClass;
            IDatasetName sourceDatasetName = (IDatasetName)sourceDataset.FullName;

            IFeatureClassName destinationClassName = new FeatureClassNameClass();
            IDatasetName destinationDatasetName = (IDatasetName)destinationClassName;

            destinationDatasetName.Name = destinationName;

            IWorkspaceName destinationWorkspaceName = new WorkspaceNameClass();
            destinationWorkspaceName.PathName = destinationFolder;
            destinationWorkspaceName.WorkspaceFactoryProgID = "esriDataSourcesFile.ShapefileWorkspaceFactory";
            destinationDatasetName.WorkspaceName = destinationWorkspaceName;

            // do the actual export
            IExportOperation exportOp = new ExportOperationClass();
            exportOp.ExportFeatureClass(sourceDatasetName, (IQueryFilter)spatialFilter, null, null, destinationClassName, 0);

            licenseInitializer.ShutdownApplication();
        }

        private static IFeatureClass GetShapeFile(string path)
        {
            string workspacePath = System.IO.Path.GetDirectoryName(path);
            string name = System.IO.Path.GetFileName(path);
            IWorkspaceFactory shapefileWorkspaceFactory = new ShapefileWorkspaceFactoryClass();
            IFeatureWorkspace workspace = (IFeatureWorkspace)shapefileWorkspaceFactory.OpenFromFile(workspacePath, 0);
            return workspace.OpenFeatureClass(name);
        }
    }
}

It's easy to use this code from pythhonnet. The first step is to build the .NET project and copy the resulting dll to the same directory as your Python code. Then you need to import clr and add a reference to the dll. Next thing to do is import the desired class from your assembly. Now you can use your class like you would do in IronPython. To test my code I exported 500.000 POIs to a shapefile which I then queried with the same bounding box as in my two previous posts about MongoDb and Rtree.

import clr
clr.AddReference('SpatialQueryTool')
from GisSolved.SpatialQueryTool import SpatialQueryTool

print SpatialQueryTool.Hello()

source= r"D:\source\poi_500000.shp";
destinationFolder = r"D:\destination";
destinationName = "poi_45_50_505_510_2";
SpatialQueryTool.CreateLayerFromSpatialIntersect(source, destinationFolder, destinationName, 4.5, 5.0, 50.5, 51.0)

I hope that this post was useful to you and as always I welcome any comments.

Table to csv

On the ESRI support forum someone asked how to convert an attribute table to a csv file. This can easily be done with a Python script. First we import arcgisscripting and the CursorIterator and initialize the geoprocessing object.

import arcgisscripting, string
from cursoriterator import CursorIterator

gp = arcgisscripting.create()

The next step is to initialize the different variables for this script. The workspace is the path to the directory, file geodatabase, personal geodatabase, sde,... where the script can find the table or feature class that you want to convert. The table is the name of the table or feature class you want to export. The outputpath is the path to file where you want your export to be written to. Be aware that the output directory should exist. If the file doesn't exist it will be created otherwise it will be overwritten. The csvseparator is the column separator for the output file. It could be anything you want. If you want to use tabs you should set this variable to '\t'. The last variable is ignorefields. If a field name in the table or featureclass is equal to a string in this list then this field won't be exported. This can be used for excluding the shape field, the objectid field or other fields you don't want to appear in the output file.

gp.workspace = r'' ## workspace of the table or feature class
table = '' ## table or feature class from wich the attributes should be exported
outputpath = r'' ## path to the file where the output should be written to
csvseparator = ',' ## column separator field
ignorefields = [] ##list with fields to ignore

Then we define a function wich purpose is to create a list of fieldnames that are to be exported to the output file. This is done by iterating over the fields cursor that can be accessed by the gp function listfields. The fields that are in the ignorefields list won't be returned.

def get_fieldnames(fields, ignorefields=[]):
   ignorefieldsupper = map(string.upper, ignorefields)
   fields = CursorIterator(fields)
   fields_output = []
   for field in fields:
       if not field.name.upper() in ignorefieldsupper:
           fields_output.append(field.name)
   return fields_output


fields = gp.listfields(table)
fieldnames = get_fieldnames(fields)

Now we are going to iterate over the rows in the table or featureclass and create a list with a string representation of each row. For each field we add the value of a field to a list. Then we concactenate this values with the chosen separator and add this to the output list.

rows = gp.searchcursor(table)
rows = CursorIterator(rows)

output = []
output.append(csvseparator.join(fieldnames))
for row in rows:
   outputrow = []
   for fieldname in fieldnames:
       outputrow.append(str(row.getvalue(fieldname)))

   outputrow = csvseparator.join(outputrow)
   output.append(outputrow)

Finally we write the outputlist to the defined outputpath.

f = open(outputpath, 'w')
f.write('\n'.join(output))
f.close()

The full code can be found on the arcscripts site.

Projections and transformation

On his blog Richie Carmichael (First edition Second edition) showed how to use the ESRI Projection Engine from .NET. In this article I am going to do the same with Python.

My original problem was that I needed to convert some coordinates from ED50 Projected and ED50 geographic to ETRS89 projected for the UTM zone 31.

First of all we will need the pe.dll provided by ESRI which can be found in the BIN folder of the ArcGIS installation. Copy this dll in the same folder as your Python script.

To use this dll from Python you can use the ctypes package. This extension started has a separate project in Python 2.3 but it has been fully incorporated in Python 2.5. So if you're still using Python 2.4 that shipped with ArcGIS 9.2 You will have to download and install ctypes. This extensions creates the possibility to easily use C dll's from within Python.

Now lets start writing some code. First we import the ctypes package.

    import ctypes

The pe.dll is accessed in the following way.

    pe = ctypes.windll.pe

To transform from one projection to another we need to first create a projection object for the input spatial reference and the output spatial reference. Then we also need a geographic transformation object. We do this by calling the appropriate factory functions which can be found in the SDE C API documentation on EDN.

    input_proj = pe.pe_factory_projcs(input_proj_code)
    transformation = pe.pe_factory_geogtran(transformation_code)
    output_proj = pe.pe_factory_projcs(output_proj_code)

To lookup the code of your projection you will need to take a look at the following sites page 1 page 2 page 3 and page 4. The code of your geographic transformation can be found on the this sites page 1 page 2 and page 3.

To transform points with the projection engine we need to pass the points as a 2 dimensional array of points. In order to do this I created a small Python point class and methods to convert from a list of points to a 2 dimensional C array and back to a list of points. If you want more information about this I suggest you take a look at the ctypes documentation (tutorial reference) and also have a look at the needed datatypes for the pe functions.

    class Point:
        def __init__(self, x, y):
            self.x = x
            self.y = y
        def __str__(self):
            return 'x : ' + str(self.x) + ' y : ' + str(self.y)
    
    def _ctypes_to_points(point_array):
        points = []
        for point in point_array:
            point = Point(point[0], point[1])
            points.append(point)
        return points

    def _points_to_ctypes(points):
        array_type = ctypes.ARRAY(self.point_type, len(points))
        ctype_points = []
        for point in points:
            ctype_point = self.point_type(point.x, point.y)
            ctype_points.append(ctype_point)
        return array_type(*ctype_points)

Now we can create the function for doing the actual transformation from input projected reference system to the output projected reference system. As you can see we first need to convert the projected coordinates to geographic coordinates. Then transform these geographic coordinates with the give transformation. Finally we convert the geographic coordinates back to projected coordinates.

def transform(self, points):
    points_count = len(points)
    point_array = self._points_to_ctypes(points)
    pe.pe_proj_to_geog(self.input_proj, points_count, 
                       point_array)
    pe.pe_geog1_to_geog2(self.transformation, points_count, 
                         point_array, None)
    pe.pe_geog_to_proj(self.output_proj, points_count, 
                       point_array)
    return self._ctypes_to_points(point_array)

The complete code looks like this. I've put the projection related code in a class and at the bottom I added some test code.

import ctypes

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def __str__(self):
        return 'x : ' + str(self.x) + ' y : ' + str(self.y)

class Proj:
    def __init__(self, input_proj_code, transformation_code, 
        output_proj_code):
        self.pe = ctypes.windll.pe
        self.input_proj = self.pe.pe_factory_projcs(input_proj_code)
        self.output_proj = self.pe.pe_factory_projcs(output_proj_code)
        self.transformation = self.pe.pe_factory_geogtran(transformation_code)
        self.point_type = ctypes.c_double * 2

    def _points_to_ctypes(self, points):
        array_type = ctypes.ARRAY(self.point_type, len(points))
        ctype_points = []
        for point in points:
            ctype_point = self.point_type(point.x, point.y)
            ctype_points.append(ctype_point)
        return array_type(*ctype_points)

    def _ctypes_to_points(self, point_array):
        points = []
        for point in point_array:
            point = Point(point[0], point[1])
            points.append(point)
        return points

    def transform(self, points):
        points_count = len(points)
        point_array = self._points_to_ctypes(points)
        self.pe.pe_proj_to_geog(self.input_proj, points_count, 
                                point_array)
        self.pe.pe_geog1_to_geog2(self.transformation, 
                                  points_count, point_array, None)
        self.pe.pe_geog_to_proj(self.output_proj, points_count, 
                                point_array)
        return self._ctypes_to_points(point_array)

if __name__ == '__main__':
    ed50_proj_code = 23031
    ed50_etrs89_transformation = 1650
    etrs89_proj_code = 25831
    proj = Proj(ed50_proj_code, ed50_etrs89_transformation, etrs89_proj_code)
    points = [Point(585635, 5690770), Point(585654, 5690110)]
    points = proj.transform(points)
    for point in points:
        print point

I hope you liked this post. Have fun.

Using for-loops for cursors

When writing Python scripts to automate ArcGIS you often encounter all kind of cursors. Cursors are returned mostly returned by the gp object when calling functions like listtables, listfeatureclasses, listrasters, listfields, searchcursor, updatecursor, ... Usually cursors are used in the following way :

    import arcgisscripting
    gp = arcgisscripting.create()
    inputFC = r'path to a feature class'

    inRows = gp.searchcursor(inputFC)
    inRow = inRows.next()
    while inRow:
        ...do something
    inRow = inRows.next()

This is quiet a verbose way to loop through the rows. Fortunately there is a way to turn the cursor into an iterator that can be used in a for loop. This can be done by adding the following short class to your code.

    class CursorIterator(object):
        def __init__(self, cursor):
            self.cursor = cursor
            self.cursor.reset()
        def next(self):
            n = self.cursor.next()
            if n is None:
                self.cursor.reset()
                raise StopIteration
            return n
        def __iter__(self):
            return self

As it is nicely explained in Expert Python Programming by Tarek Ziadé we just created an iterator class. This is a container class with two methods namely next and __iter__. Usage of the CursorIterator class looks like this.

    rows = gp.searchcursor(inputFC)
    rowsIterator = CursorIterator(rows)
    for row in rowsIterator:
        ... do something

or shorter

    for row in CursorIterator(gp.searchcursor(inputFC)):
        ... do something

I hope you enjoyed my first post. Have fun.

Update : As Jason Schreider points out in his comment the CursorIterator class can be easily replaced by the following :

    rows = gp.searchcursor(inputFC)
    for row in iter(rows.next, None):
        ... do something

Pages