Using yield to create an iterator

Not so long ago I blogged about using the foreach statement with an ICursor. I achieved this by implementing the IEnumerator and IEnumerable interfaces.

But by using the yield statement we can achieve a similar effect with much less code. The generator function looks like this:

public static IEnumerable<IFeature> Iter(IFeatureCursor cursor)
{
    IFeature feat;
    while((feat = cursor.NextFeature()) != null)
    {
        yield return feat;
    }
    yield break;
}

You can use this method like below.

IFeatureClass featureClass; // initialize feature class
IQueryFilter queryFilter = new QueryFilterClass();
queryFilter.WhereClause = ... your where clause here ...
bool recycling = true;
IEnumerable<IFeature> features = Iter(featureClass.Search(queryFilter, recycling));
foreach (IFeature feature in features)
{
    ... your code here ...
}

If you're using C# 3.0 or later you can create the following extension method in a static class. It adds a SearchIter method to the IFeatureClass interface, which you can call instead of the Search method.

public static class Extensions
{
    public static IEnumerable<IFeature> SearchIter(this IFeatureClass featureClass, IQueryFilter queryFilter, bool recycling)
    {
        IFeatureCursor cursor = featureClass.Search(queryFilter, recycling);
        IFeature feat;
        while ((feat = cursor.NextFeature()) != null)
        {
            yield return feat;
        }
        yield break;
    }
}

The usage of the extension method is similar to the usage of the Iter method but you can replace

IEnumerable<IFeature> features = Iter(featureClass.Search(queryFilter, recycling));

with this

IEnumerable<IFeature> features = featureClass.SearchIter(queryFilter, recycling);

So this was it. I hope you learned something and feel free to leave a comment.

Related posts
Disposable Editing
Drag Drop from ArcCatalog
Inserting Features and Rows

Looping over a Workspace

Ever wanted to execute a function on all the tables and/or featureclasses in a workspace? Then the following code is something for you. It first loops over all the featureclasses that are in featuredatasets, then over all the other featureclasses, and finally over all the tables in the provided workspace.

To achieve this and still be able to retrieve the result of each function execution on the tables and featureclasses I created a generator function. I did this by using the yield statement. In order to avoid duplicate code I also created a nested function for looping over the featureclasses.

def executeiter(gp, workspace, featclassfunction=None, tablefunction=None):
    import os
    
    def loopfcs():
        fcs = gp.listfeatureclasses()
        for fc in iter(fcs.next, None):
            yield featclassfunction(fc)
    
    gp.workspace = workspace
    if featclassfunction is not None:
        for dataset in iter(gp.listdatasets().next, None):
            datasetworkspace = os.path.join(workspace, dataset)
            gp.workspace = datasetworkspace
            for result in loopfcs():
                yield result

        gp.workspace = workspace
        for result in loopfcs():
            yield result
            
    if tablefunction is not None:
        tables = gp.listtables()
        for table in iter(tables.next, None):
            yield tablefunction(table)
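
The iter(fcs.next, None) calls above rely on the two-argument form of the built-in iter, which keeps calling a function until it returns the sentinel value. Below is a minimal sketch of that idiom. FakeEnum is a made-up stand-in for the enumeration objects returned by the geoprocessor, so the sketch runs without arcgisscripting.

```python
class FakeEnum(object):
    """Hypothetical stand-in for a geoprocessor enumeration: next() returns
    the next name, or None once the list is exhausted."""

    def __init__(self, names):
        self._names = list(names)
        self._pos = 0

    def next(self):
        if self._pos >= len(self._names):
            return None
        name = self._names[self._pos]
        self._pos += 1
        return name


def names_upper(enum):
    # iter(callable, sentinel) calls enum.next() until it returns None
    for name in iter(enum.next, None):
        yield name.upper()


result = list(names_upper(FakeEnum(["roads", "parcels"])))
print(result)
```

Because next is defined as a plain method here, the sketch behaves the same on Python 2 and Python 3.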

If you don't need the result of the function you can call the following function. It takes the same arguments as the executeiter function but it's a regular function instead of a generator.

def execute(gp, workspace, featclassfunction=str, tablefunction=str):
    for x in executeiter(gp, workspace, featclassfunction, tablefunction):
        pass
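
The loop in execute only exists to drain the generator. One alternative idiom, sketched here with a made-up generator instead of the geoprocessor, is to hand the generator to collections.deque with maxlen=0, which consumes every item and keeps none of them.

```python
from collections import deque

calls = []


def visit_all(names):
    # hypothetical generator that has a side effect for every item
    for name in names:
        calls.append(name)
        yield name


def drain(iterable):
    # a deque with maxlen=0 discards each item as soon as it is appended
    deque(iterable, maxlen=0)


drain(visit_all(["a", "b", "c"]))
print(calls)
```

Whether this reads better than the explicit for loop is a matter of taste; the behavior is the same.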

To demonstrate the usage of my code I first initialized a geoprocessing object and a workspace variable. Then I used the executeiter function to return the uppercase version of the names of the tables and featureclasses in my workspace. I could also have passed, for example, the describe or the listfields method of the geoprocessing object, or a custom function. When you want to pass a function that doesn't return a result, like the deleterows or deletefeatures function, it's more convenient to call the execute function.

import arcgisscripting, string

gp = arcgisscripting.create()
workspace = r'D:\temp\temp.gdb' # path to your workspace

# print the uppercase names of all the tables and featureclasses
for uppername in executeiter(gp, workspace, string.upper, string.upper):
    print uppername

# delete all rows of all the tables and featureclasses
execute(gp, workspace, gp.deletefeatures, gp.deleterows)

I've come to the end of this post. Did I miss something ? Know a Python idiom I really should start using ? Feel free to comment.

Related posts
Inserting features and rows
Export a table to a csv

Editing with ArcObjects

Recently I found myself rewriting a lot of code which started and stopped edit sessions to decorate it with try ... finally blocks to make sure that every started edit session closes even when an error occurs. The code I wrote looked like below.

bool saveEdits = false;
try
{
    StartEditing();
    ... edit a featureclass or table ...
    saveEdits = true;
}
finally
{
    StopEditing(saveEdits);
}

I thought that there had to be a cleaner way to achieve the same functionality. Here comes the using statement to the rescue. To be able to use the using statement with a class it has to implement the IDisposable interface. So I wrote the following wrapper class for starting and stopping edit sessions.

public class EditSession : IDisposable
{
    IWorkspaceEdit _workspaceEdit = null;

    public IWorkspaceEdit WorkspaceEdit
    {
        get { return _workspaceEdit; }
        set { _workspaceEdit = value; }
    }

    public EditSession(IWorkspace workspace, bool withUndoRedo)
    {
        if (workspace != null)
        {
            _workspaceEdit = (IWorkspaceEdit)workspace;
        }
    }

    public static EditSession Start(IWorkspace workspace, bool withUndoRedo)
    {
        EditSession editSession = new EditSession(workspace, withUndoRedo);
        editSession.Start(withUndoRedo);
        return editSession;
    }

    public void Start(bool withUndoRedo)
    {
        if (_workspaceEdit.IsBeingEdited() == false)
        {
            _workspaceEdit.StartEditing(withUndoRedo);
            _workspaceEdit.StartEditOperation();
        }
    }

    public void SaveAndStop()
    {
        Stop(true);
    }

    public void Stop(bool save)
    {
        if (_workspaceEdit.IsBeingEdited() == true)
        {
            _workspaceEdit.StopEditOperation();
            _workspaceEdit.StopEditing(save);
        }
    }

    #region IDisposable Members

    public void Dispose()
    {
        if (_workspaceEdit != null && _workspaceEdit.IsBeingEdited())
        {
            Stop(false);
        }
    }

    #endregion
}

You can use the EditSession class like this.

using (EditSession editSession = EditSession.Start(workspace, false))
{
    ... edit a featureclass or table ...
    editSession.SaveAndStop();
}
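
For readers who script in Python, the other language used on this blog, the same idea maps to a context manager. The sketch below is only an analogy: FakeWorkspace and its methods are invented for illustration and are not part of ArcObjects or arcgisscripting. Unlike EditSession, it saves automatically when the block finishes without an error and discards the edits otherwise.

```python
from contextlib import contextmanager


class FakeWorkspace(object):
    """Hypothetical stand-in for an editable workspace."""

    def __init__(self):
        self.log = []

    def start_editing(self):
        self.log.append("start")

    def stop_editing(self, save):
        self.log.append("stop save=%s" % save)


@contextmanager
def edit_session(workspace):
    workspace.start_editing()
    try:
        yield workspace
    except Exception:
        # an error occurred inside the block: stop without saving
        workspace.stop_editing(False)
        raise
    else:
        # the block finished normally: stop and save
        workspace.stop_editing(True)


ws = FakeWorkspace()
with edit_session(ws):
    pass  # edit a featureclass or table here

try:
    with edit_session(ws):
        raise ValueError("edit failed")
except ValueError:
    pass

print(ws.log)
```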

If you're using C# 3.0 or a later version you can use the following class, which creates an extension method for IWorkspace.

public static class Extensions
{
    public static EditSession StartEditing(this IWorkspace workspace, bool withUndoRedo)
    {
        return EditSession.Start(workspace, withUndoRedo);
    }
}

With the extension method you can directly call StartEditing on a workspace.

using (EditSession editSession = workspace.StartEditing(false))
{
    ... edit a featureclass or table ...
    editSession.SaveAndStop();
}

Related posts
Drag Drop from ArcCatalog
Using foreach with ICursor
Inserting Features and Rows

Inserting Features and Rows

Recently I was inserting features in a feature class but my code was rather slow so I looked around for another method and the following is what I found.

First you need to start an edit session and then you can use the code below. This is just a short body of code to give you an idea on how to use the different objects.

IFeatureCursor insertFeatCursor = outputFeatClass.Insert(true);

foreach (object featureToInsert in featuresToInsert)
{
    IFeatureBuffer outputFeatBuffer = outputFeatClass.CreateFeatureBuffer();
    // set the shape
    outputFeatBuffer.Shape = ...
    // set the different values
    outputFeatBuffer.set_Value(fieldIndex, value);
    // insert the feature buffer
    insertFeatCursor.InsertFeature(outputFeatBuffer);
}
insertFeatCursor.Flush();

If you now save and close your edit session the features are inserted. The code for inserting rows in a table is very similar.

ITable table;
ICursor insertCursor = table.Insert(true);
IRowBuffer rowBuffer = table.CreateRowBuffer();
rowBuffer.set_Value(fieldIndex, value);
insertCursor.InsertRow(rowBuffer);
insertCursor.Flush();

If you have any comments or questions, let me know !!!

Related posts
Using foreach with the ICursor
Drag Drop from ArcCatalog
Pythonnet (call .NET from Python)

Geoprocessing 1 : Intersect

The intersect operation is one of the many overlay operations. When intersecting layer A with a layer B the result will include all those parts that occur in both A and B. In this post I'm going to compare the results of the ArcGIS Intersect tool with the corresponding operation in 3 other GIS packages. The reason I did this is that I noticed that the output from ArcGIS contains more features than I expected when dealing with overlapping geometries. This is usually not a problem but it becomes one when you have lots of overlapping geometries.

The GIS tools I used to compare the result of ArcGIS with were : uDig, Quantum GIS and gvSIG. All these and some more can be found on Portable GIS. Portable GIS brings open source GIS to your usb.

To test this I prepared 2 shapefiles. One with a long, narrow polygon and the other one with 3 overlapping rectangles (see image below).

[image: the two test shapefiles]

When intersecting with ArcGIS (version 9.2 and 9.3 tested) the output looks like below. As you can see in the attribute table the result contains 9 polygons.

[image: ArcGIS intersect output and its attribute table]

To be able to do the intersect operation with uDig I had to install the Axios Spatial Operations Extension. You can install this extension by clicking Help -> Find and install from the menubar. With Quantum GIS I needed to enable the ftools plugin to get the geoprocessing functionality. The results of the 3 open source GIS packages were the same. Only 3 features were created in the output shapefile. For completeness I added the screenshots of the results.

uDig :

intersect_1_2_uDig.jpg (28 KB)

Quantum GIS :

intersect_1_2_QGIS.jpg (41 KB)

gvSIG :

intersect_1_2_gvSIG.jpg (19 KB)
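
The count of 3 is what a plain pairwise intersect produces: every pair of one feature from A and one feature from B that overlap yields one output feature, whether or not the features in B overlap each other. Below is a rough sketch with axis-aligned rectangles. The coordinates are invented for illustration and are not my test data, and the sketch says nothing about how any of the tested packages is implemented.

```python
def intersect_rect(a, b):
    """Return the overlap of two (minx, miny, maxx, maxy) rectangles,
    or None when they do not overlap."""
    minx, miny = max(a[0], b[0]), max(a[1], b[1])
    maxx, maxy = min(a[2], b[2]), min(a[3], b[3])
    if minx < maxx and miny < maxy:
        return (minx, miny, maxx, maxy)
    return None


def pairwise_intersect(layer_a, layer_b):
    # one output feature per overlapping (a, b) pair
    out = []
    for a in layer_a:
        for b in layer_b:
            part = intersect_rect(a, b)
            if part is not None:
                out.append(part)
    return out


# one long, narrow strip and three rectangles that overlap each other
strip = [(0, 4, 20, 6)]
rects = [(1, 0, 9, 10), (6, 0, 14, 10), (11, 0, 19, 10)]

parts = pairwise_intersect(strip, rects)
print(len(parts))
```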

Have any comments or questions ? Let me know !

Related posts
Exporting an ArcGIS table to a text file
Projections and Transformations with pe.dll
Accessing a .NET dll from within Python

Using foreach with the ICursor

My first post on this blog was about how to loop over an ESRI cursor in Python with the for statement instead of the while statement. The same problem exists in .NET when using the various cursor objects like ICursor and IFeatureCursor. Normally you would code something like this :

ICursor cursor = table.Search(null, false);
IRow row;
while ((row = cursor.NextRow()) != null)
{
    // some code
}

Or this:

ICursor cursor = table.Search(null, false);
IRow row = cursor.NextRow();
while (row != null)
{
    // some code
    row = cursor.NextRow();
}

And I want to replace it with the following.

ITable table; // get table from somewhere
IQueryFilter queryFilter; // create QueryFilter or leave it null
CursorGS cursorGS = new CursorGS(table, queryFilter);
foreach (IRow row in cursorGS)
{
    // ... insert your code here
}

To make a class usable in a foreach statement you need to implement the IEnumerable and IEnumerator interfaces. I found a good introduction to making a class usable in a foreach statement in this Microsoft article. So I created a class that implements IEnumerator<IRow> and IEnumerable<IRow>, including all the required properties and methods. For IEnumerator these were Current, MoveNext, Reset and Dispose, and for IEnumerable only the GetEnumerator method was needed. The biggest problem was the Reset method because an ICursor doesn't have a reset method. I decided to set the cursor to null and recreate it the next time MoveNext is called. This also makes sure the ICursor is only created when we really need it. Below you can find my full class.

using System;
using System.Collections;
using System.Collections.Generic;
using System.Text;
using ESRI.ArcGIS.Geodatabase;

namespace GisSolved.GeoDiff.Esri
{
    public class CursorGS : IEnumerator<IRow>, IEnumerable<IRow>
    {
        ITable _table;
        IQueryFilter _queryFilter;

        ESRI.ArcGIS.ADF.ComReleaser _comReleaser;
        IRow _currentRow = null;
        ICursor _cursor = null;

        public CursorGS(ITable table, IQueryFilter queryFilter)
        {
            _table = table;
            _queryFilter = queryFilter;
            _comReleaser = new ESRI.ArcGIS.ADF.ComReleaser();
        }

        #region IEnumerator<IRow> Members

        public IRow Current
        {
            get { return _currentRow; }
        }

        #endregion

        #region IEnumerator Members

        object System.Collections.IEnumerator.Current
        {
            get { return _currentRow; }
        }

        public bool MoveNext()
        {
            if(_cursor == null) // initialize the cursor
            {
                _cursor = _table.Search(_queryFilter, false);
                _comReleaser.ManageLifetime(_cursor);
            }
            _currentRow = _cursor.NextRow();
            return _currentRow != null;
        }

        public void Reset()
        {
            _cursor = null;
        }

        #endregion

        #region IEnumerable<IRow> Members

        public IEnumerator<IRow> GetEnumerator()
        {
            return (IEnumerator<IRow>)this;
        }

        #endregion

        #region IEnumerable Members

        System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
        {
            return (System.Collections.IEnumerator)this;
        }

        #endregion

        #region IDisposable Members

        public void Dispose()
        {
            _comReleaser.Dispose();
        }

        #endregion
    }
}

As you can see I used the ESRI ComReleaser object to make sure that the ICursor references get released properly.
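
Since my first post covered the same problem in Python, here is a rough Python sketch of the two ideas in this class: the cursor is only opened on the first step, and a reset simply drops it so the next step opens a fresh one. LazyRows and the open_cursor callable are invented for illustration; a real version would call the Search method of a table instead.

```python
class LazyRows(object):
    """Iterator that opens its cursor lazily and can be reset."""

    def __init__(self, open_cursor):
        self._open_cursor = open_cursor  # callable returning a fresh iterator
        self._cursor = None

    def __iter__(self):
        return self

    def __next__(self):
        if self._cursor is None:  # open the cursor on first use
            self._cursor = self._open_cursor()
        return next(self._cursor)

    next = __next__  # Python 2 spelling of the same method

    def reset(self):
        # drop the cursor; the next step opens a new one
        self._cursor = None


rows = LazyRows(lambda: iter(["row 1", "row 2"]))
first_pass = list(rows)
rows.reset()
second_pass = list(rows)
print(first_pass, second_pass)
```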

So now you know how to create a wrapper around the ICursor you can start creating other wrappers for often used cursors like the IFeatureCursor, IFields, ... and I can go to my bed. Success and feel free to post any comments or your implementation !!!

Related posts
Inserting Features and Rows
Using for-loops for cursors (Python)
Drag Drop from ArcCatalog

DatagridView Tricks

This post will entirely be about the .NET DatagridView control. I will show you how to synchronize the scrolling of two Datagridviews, disable column sorting, disable cell selection and focus cues.

To synchronize the scrolling of two DatagridViews you should subscribe to the Scroll event of the two Datagridviews and add the following code. It copies the first displayed row index and the horizontal scrolling offset from one grid to the other. I set the FirstDisplayedScrollingRowIndex because the VerticalScrollingOffset is a read-only property.

private void dataGridViewA_Scroll(object sender, ScrollEventArgs e)
{
    dataGridViewB.FirstDisplayedScrollingRowIndex = dataGridViewA.FirstDisplayedScrollingRowIndex;
    dataGridViewB.HorizontalScrollingOffset = dataGridViewA.HorizontalScrollingOffset;
}

private void dataGridViewB_Scroll(object sender, ScrollEventArgs e)
{
    dataGridViewA.FirstDisplayedScrollingRowIndex = dataGridViewB.FirstDisplayedScrollingRowIndex;
    dataGridViewA.HorizontalScrollingOffset = dataGridViewB.HorizontalScrollingOffset;
}

For the following tricks I created a custom control called DatagridViewGS that derives from the DatagridView control.

Disabling column sorting for all columns is done by setting the SortMode to NotSortable for every column of the DatagridView. In order to do that I subscribed to the DataBindingComplete event and added the following code to it.

void DatagridViewGS_DataBindingComplete(object sender, DataGridViewBindingCompleteEventArgs e)
{
    foreach (DataGridViewColumn c in Columns)
    {
        c.SortMode = DataGridViewColumnSortMode.NotSortable;
    }
}

Cell selection is disabled by subscribing to the CellStateChanged event and setting the Selected property of the cell whose state changed to false. I've also set the SelectionMode of the DataGridView to DataGridViewSelectionMode.CellSelect and the MultiSelect property to false.

void DatagridViewGS_CellStateChanged(object sender, DataGridViewCellStateChangedEventArgs e)
{
    if(e.StateChanged == DataGridViewElementStates.Selected)
        e.Cell.Selected = false;
}

After disabling cell selection there were still focus cues. These are the small dotted lines around the selected elements. To hide these I needed to override the ShowFocusCues property of the DataGridView and make it always return false.

protected override bool ShowFocusCues
{
    get
    {
        return false;
    }
}

The full code of my custom DatagridView looked like this :

using System;
using System.Collections.Generic;
using System.Text;
using System.Windows.Forms;
using System.Drawing;

namespace GisSolved.GUI
{
    internal class DatagridViewGS:DataGridView
    {
        public DatagridViewGS()
        {
            MultiSelect = false;
            SelectionMode = DataGridViewSelectionMode.CellSelect;
            CellStateChanged += new DataGridViewCellStateChangedEventHandler(DatagridViewGS_CellStateChanged);
            
            DataBindingComplete += new DataGridViewBindingCompleteEventHandler(DatagridViewGS_DataBindingComplete);
        }
        
        // Disable column sorting
        void DatagridViewGS_DataBindingComplete(object sender, DataGridViewBindingCompleteEventArgs e)
        {
            foreach (DataGridViewColumn c in Columns)
            {
                c.SortMode = DataGridViewColumnSortMode.NotSortable;
            }
        }
        // Disable cell selection
        void DatagridViewGS_CellStateChanged(object sender, DataGridViewCellStateChangedEventArgs e)
        {
            if(e.StateChanged == DataGridViewElementStates.Selected)
                e.Cell.Selected = false;
        }
        // Disable focus cues
        protected override bool ShowFocusCues
        {
            get
            {
                return false;
            }
        }
    }
}

Do you know other useful tricks with DatagridViews or other .NET controls ? Feel free to write them down in the comments section !

Related posts
Drag and drop from ArcCatalog to a .NET form.
Calling .NET code from Python

Building and Installing Spatial Index on Ubuntu

This is a short post about installing Spatial Index on Ubuntu. First download the latest release, which at the moment is version 1.3.2, and cd to the location of your download. Then issue the following commands :

tar xzvf spatialindex-1.3.2.tar.gz
cd spatialindex-1.3.2
./configure
sudo make install

If you want to configure your install you should take a look at the installation notes. You can find more information about Spatial Index at the project's Trac.

Related Posts
Installing Tokyo Cabinet and Ruby on Ubuntu
Rtree and MongoDB

Tokyo Cabinet 2 : Loading and querying point data

After setting up Tokyo Cabinet and Ruby it's time to use it. As with my post about MongoDB I'm going to load 500.000 POIs in a database and query them with a bounding box query. I will use the table database from Tokyo Cabinet because it supports the most querying facilities. With a table database you can query numbers with full-match and range queries, and for strings you can do full matching, forward matching, regular expression matching,...

To load the data in my database I will need to read my shapefile with POIs with Ruby and write the attributes to a new database. First we create the database with the following code.

require 'tokyocabinet'
include TokyoCabinet

# create the object
tdb = TDB::new

# open or create  the database
if !tdb.open("poi_db.tct", TDB::OWRITER | TDB::OCREAT)
  STDERR.printf("open error: %s\n", tdb.errmsg(tdb.ecode))
end

To read the features in my shapefile I am going to use the Ruby bindings for GDAL/OGR. Because I installed Tokyo Cabinet on GISVM I already had FWTools installed but I still needed to install the Ruby bindings for it. I did this with the following command.

sudo apt-get install libgdal-ruby

Now we are going to read a shapefile with 500.000 point features and write the records to the database. First we open the shapefile and get the layer. Then we loop over the features, create a new record and fill the record with the x,y information and the other fields when they aren't empty. The values need to be converted to strings otherwise the record can't be saved. Then we put the record in the database.

require 'gdal/ogr'

# open my shapefile
dataset = Gdal::Ogr.open("poi_500000.shp")
layer = dataset.get_layer(0) 

feature_defn = layer.get_layer_defn

layer.get_feature_count.times do |i|
 record = Hash.new # create new record
 feature = layer.get_feature(i)
 geom = feature.get_geometry_ref()
 record['x'] = geom.get_x(0).to_s()
 record['y'] = geom.get_y(0).to_s()
 pkey = tdb.genuid # init primary key
 # use j so the field counter doesn't clash with the feature index i
 feature_defn.get_field_count.times do |j|
  field_defn = feature_defn.get_field_defn(j)
  fieldname = field_defn.get_name_ref
  value = feature.get_field_as_string(j)
  if not value.nil? and value != ""
   if fieldname == "ID"
    pkey = value # the ID field becomes the primary key
   else
    record[fieldname] = value.to_s()
   end
  end
 end
 # store the record in Tokyo Cabinet
 tdb.put(pkey, record)
end

To add indexes on the x and y field we call the following code. This creates two supplementary files called poi_db.tct.idx.x.dec and poi_db.tct.idx.y.dec.

# add index on x and y
tdb.setindex('x', TDB::ITDECIMAL)
tdb.setindex('y', TDB::ITDECIMAL)

To query the POIs in the database I created a function to query the POIs for a given bounding box and then I benchmarked it. I used the same bounding box as in my previous posts about MongoDB, Rtree, Pythonnet and PostGIS.

# query POIs by bounding box
def query(tdb, minx, maxx, miny, maxy)
 qry = TDBQRY::new(tdb)
 qry.addcond("x", TDBQRY::QCNUMGE, minx.to_s())
 qry.addcond("x", TDBQRY::QCNUMLE, maxx.to_s())
 qry.addcond("y", TDBQRY::QCNUMGE, miny.to_s())
 qry.addcond("y", TDBQRY::QCNUMLE, maxy.to_s())
 qry.setorder("x", TDBQRY::QONUMASC)

 res = qry.search
 puts res.length # number of results found
 return res
end

require 'benchmark'
puts Benchmark.measure { query(tdb, 4.5, 5.0, 50.5, 51.0) }

The query returned 98000 POIs. I ran the benchmark 12 times and these were the results :

  1.620000   0.190000   1.810000 (  1.866339)
  1.570000   0.030000   1.600000 (  1.625303)
  1.640000   0.030000   1.670000 (  1.668573)
  1.650000   0.000000   1.650000 (  1.664806)
  1.650000   0.020000   1.670000 (  1.708228)
  1.730000   0.010000   1.740000 (  1.744645)
  1.410000   0.310000   1.720000 (  1.749268)
  1.620000   0.050000   1.670000 (  1.724199)
  1.610000   0.010000   1.620000 (  1.657794)
  1.660000   0.020000   1.680000 (  1.680383)
  1.710000   0.020000   1.730000 (  1.767141)
  1.720000   0.010000   1.730000 (  1.809114)

According to the Ruby documentation the benchmark outputs the user CPU time, the system CPU time, the sum of the user and system CPU times, and the elapsed real time. So this means that the query took between 1.63 and 1.87 seconds to get a list of 98000 POIs within the given bounding box. This is a nice indication of the speed of Tokyo Cabinet.
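
To summarize the runs without reading the table by eye, the elapsed real times (the last column) can be fed to a few lines of Python. This is only a sketch: the numbers are copied by hand from the listing above, and the mean is an extra figure that I did not report originally.

```python
# elapsed real times in seconds, copied from the twelve benchmark runs above
real_times = [
    1.866339, 1.625303, 1.668573, 1.664806, 1.708228, 1.744645,
    1.749268, 1.724199, 1.657794, 1.680383, 1.767141, 1.809114,
]

fastest = min(real_times)
slowest = max(real_times)
mean = sum(real_times) / len(real_times)

print("fastest %.2f s, slowest %.2f s, mean %.2f s" % (fastest, slowest, mean))
```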

To demonstrate how you can access the attributes I created the following code. It loops over the first 100 found POIs and prints the ID and the x- and y-coordinate.

res = query(tdb, 4.5, 5.0, 50.5, 51.0)
# print the first hundred found POIs
i = 0
res.each do |rkey|
 rcols = tdb.get(rkey)
 # the ID field was stored as the primary key, so print the key itself
 puts rkey.to_s() + " " + rcols['x'].to_s() + " " + rcols['y'].to_s()
 i += 1
 if i >= 100
  break
 end
end

Now we are ready to close the database. I hope you enjoyed this post and as always I welcome any comments.

# close the database
if !tdb.close
 ecode = tdb.ecode
 STDERR.printf("close error: %s\n", tdb.errmsg(ecode))
end

Related Posts
Installing Tokyo Cabinet and Ruby on Ubuntu
Populating a MongoDb with POIs
Spatial indexing a MongoDb with Rtree
PostGIS : Loading and querying data

Installing Tokyo Cabinet and Ruby on Ubuntu

After MongoDB it's time for another alternative to relational databases called Tokyo Cabinet. Tokyo Cabinet is a library of routines for managing a file based key-value store. It's a high performing database and it can be accessed over a network with Tokyo Tyrant. In this post I install Tokyo Cabinet, Ruby and the Ruby bindings for Tokyo Cabinet. There will be a follow up post where I load and query POIs like I did with MongoDB and PostgreSQL/PostGIS.

Tokyo Cabinet is built for Unix-like systems such as Linux, so I installed it on an Ubuntu virtual machine. It took me some time to figure everything out but if I can do it you can too. First you need to download the latest version of Tokyo Cabinet from the project site. Once downloaded you open a terminal window, navigate to the download location and issue the following commands.

tar xzvf tokyocabinet-1.4.20.tar.gz
cd tokyocabinet-1.4.20/
# install dependencies
sudo apt-get install checkinstall build-essential libbz2-dev
# now compile
./configure --prefix=/usr
make clean
make
# creates and installs a Debian package
sudo checkinstall -D 

I decided to use the Ruby bindings so if you don't have it you can install it with the below command. This installs Ruby, an interactive shell, an interactive reference, the Ruby documentation and the dev part of the Ruby Standard Library. We will need the ruby-dev package for building the Ruby bindings of Tokyo Cabinet.

sudo apt-get install ruby irb ri rdoc ruby1.8-dev

After downloading the Ruby bindings for Tokyo Cabinet from the project site, I installed them with the following commands.

tar xzvf tokyocabinet-ruby-1.26.tar.gz
cd tokyocabinet-ruby-1.26
ruby extconf.rb
make
sudo make install

In the specifications document you can find a lot of information about Tokyo Cabinet and the underlying concepts. As you can read there are four types of databases : a hash database, a B+ tree database, a fixed-length database and a table database. In the examples directory of the Ruby bindings you will find some samples that create and use these four databases. There is also a sample that uses the abstract database API with which you can communicate with the four database types. For more info on the Ruby bindings you can read the docs.

Problems with the installation ? Below are the sources that helped me with the installation process. You can also post a comment and maybe I can help out.
http://openwferu.rubyforge.org/tokyo.html
http://oui.com.br/blog/nando-en/post/installing-tokyo-cabinet
http://www.ubuntugeek.com/how-to-install-ruby-on-rails-ror-in-ubuntu.html
http://blogs.law.harvard.edu/hoanga/2006/10/27/fixing-mkmf-load-error-ruby-in-ubuntu/

Related posts
Tokyo Cabinet 2 : Loading and querying points
Populating a MongoDb with POIs
Spatial indexing a MongoDb with Rtree
PostGIS : Loading and querying data

Drag Drop from ArcCatalog

As you probably already noticed you can drag drop items within ArcGIS. For example you can drag a feature class from ArcCatalog to ArcMap or from ArcMap or ArcCatalog to a geoprocessing form. In this post I'm going to show you what has to be done to enable drag drop behavior from ArcCatalog and Windows Explorer to a textbox.

To enable drag drop on a textbox I added event handlers for DragEnter, DragOver and DragDrop for my textbox called textBoxPath. In the DragEnter and DragOver handlers I check whether the dragged object is valid. If this is the case the drag drop effect is set to All. This is a combination of the Copy, Move, and Scroll effect. But it's in the DragDrop event handler, and more precisely in the helper function GetPaths, that the most important stuff happens. As you can see the Data property of the DragEventArgs is processed by the GetPaths method and if any paths to feature classes or tables are found the first path is shown. After that I added a small hack to put the cursor at the end of the text in the textbox.

private void TextBoxPath_DragEnter(object sender, DragEventArgs e)
{
 e.Effect = EsriDragDrop.IsValid(e.Data) ? DragDropEffects.All : DragDropEffects.None;
}

private void TextBoxPath_DragOver(object sender, DragEventArgs e)
{
 e.Effect = EsriDragDrop.IsValid(e.Data) ? DragDropEffects.All : DragDropEffects.None;
}

private void TextBoxPath_DragDrop(object sender, DragEventArgs e)
{
 List<string> paths = EsriDragDrop.GetPaths(e.Data);
 TextBox txtBoxPath = (TextBox)sender;
 if (paths.Count > 0)
 {
  // set value of textbox to the first found path
  txtBoxPath.Text = paths[0];
  // place cursor at the end of the textbox
  txtBoxPath.SelectionStart = txtBoxPath.TextLength;
 }
}

In the below EsriDragDrop class I placed the IsValid and GetPaths methods. The IsValid method checks whether the IDataObject coming from the drag event contains any valid objects. The GetPaths method retrieves those valid objects and returns the paths to the found feature classes and tables. It uses the IDataObjectHelper interface and its GetNames and GetFiles methods to access the objects in the IDataObject. Note that only feature classes and tables will be returned by my code but this constraint can easily be removed by not checking the Type of the datasetName. I didn't add any functionality to check whether the file path dragged from a Windows Explorer to the textbox was valid but you can implement this by using the Geoprocessor object and its Exists and Describe methods or by trying to open the table or feature class.

using System.Collections.Generic;
using System.Windows.Forms;
using ESRI.ArcGIS.esriSystem;
using ESRI.ArcGIS.Geodatabase;
using ESRI.ArcGIS.SystemUI;

namespace GisSolved.DragDrop
{
 public class EsriDragDrop
 {
  const string DATAOBJECT_ESRINAMES = "ESRI Names";
  public static bool IsValid(IDataObject dataObject)
  {
   return dataObject.GetDataPresent(DATAOBJECT_ESRINAMES) ||
    dataObject.GetDataPresent(System.Windows.Forms.DataFormats.FileDrop);
  }
  public static List<string> GetPaths(IDataObject dataObject)
  {
   List<string> foundPaths = new List<string>();
   IDataObjectHelper dataObjectHelper = new DataObjectHelperClass();
   dataObjectHelper.InternalObject = (object)dataObject;
   
   if (dataObjectHelper.CanGetNames())
   {
    IEnumName enumNames = dataObjectHelper.GetNames();
    IName name;
    while ((name = enumNames.Next()) != null)
    {
     if (name is IDatasetName)
     {
      IDatasetName datasetName = (IDatasetName)name;
      // only accept feature classes and tables
      if (datasetName.Type == esriDatasetType.esriDTFeatureClass ||
       datasetName.Type == esriDatasetType.esriDTTable)
      {
       string path = System.IO.Path.Combine(datasetName.WorkspaceName.PathName, datasetName.Name);
       foundPaths.Add(path);
      }
     }
    }
   }
   else if (dataObjectHelper.CanGetFiles())
   {
    string[] paths = (string[])dataObjectHelper.GetFiles();
    foreach (string path in paths)
    {
     // TODO : Add code here to check if the file path is a valid path
     foundPaths.Add(path);
    }
   }
   return foundPaths;
  }
 }
}

This is all you need to implement drag drop behavior from ArcCatalog or Windows Explorer to your textbox. If you want to implement drag drop from ArcMap to your form I suggest you read this and this. Any comments or suggestions ? Let me know.

Related posts
DatagridView Tricks
Calling .NET from Python to execute spatial queries
Projecting coordinates with Python and the ArcGIS Projection Engine

Python Toolbox 4 : Clone Digger

Looking for duplicate code or opportunities to refactor? Let me introduce you to a great Python tool called Clone Digger. As the project's page says:

Clone Digger aimed to detect similar code in Python and Java programs. The synonyms for the term "similar code" are "clone" and "duplicate code".

Once it is installed, you call the clonedigger.py file with two arguments: the path for the output HTML file and the path to a folder or code file to analyze. If you call it with the -h parameter, it prints the available command-line options. To show the power of Clone Digger I used the following extract from actual code.

import arcgisscripting

gp = arcgisscripting.create()

def createPoint(x, y):
    p = gp.createobject('Point')
    p.x = x
    p.y = y
    return p

def createPolygon(xMin, xMax, yMin, yMax):
    polygon = gp.createobject('array')
    ##Add the first point
    newPoint = createPoint(xMin, yMin)
    polygon.Add(newPoint)
    ##Add the second point
    newPoint = createPoint(xMin, yMax)
    polygon.Add(newPoint)
    ##Add the third point
    newPoint = createPoint(xMax, yMax)
    polygon.Add(newPoint)
    ##Add the fourth point
    newPoint = createPoint(xMax, yMin)
    polygon.Add(newPoint)
    ##Close the polygon
    newPoint = createPoint(xMin, yMin)
    #polygon.Add(newPoint)
    return polygon

To run Clone Digger on this file, all you have to do is issue the command below. Make sure that your shell can find clonedigger.py, either by adding its folder to your PATH variable or by navigating to that folder.

python clonedigger.py -o D:\output.html D:\CodeToTest.py
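
If you would rather start the analysis from a script, for example to check several files one after the other, the same command can be assembled in Python. This is only a minimal sketch: the function names and the python_exe parameter are my own illustration, and it uses nothing beyond the -o option shown above.

```python
import subprocess


def build_clonedigger_command(script_path, output_html, source_path,
                              python_exe="python"):
    """Return the argument list for the command line shown above.

    python_exe is a hypothetical parameter: it names the interpreter
    that should run clonedigger.py (by default whatever "python" is
    on the PATH).
    """
    return [python_exe, script_path, "-o", output_html, source_path]


def run_clonedigger(script_path, output_html, source_path,
                    python_exe="python"):
    """Run Clone Digger and return the exit code of the process."""
    command = build_clonedigger_command(script_path, output_html,
                                        source_path, python_exe)
    return subprocess.call(command)
```

Passing the arguments as a list instead of one long string avoids quoting problems when a path contains spaces.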

The output first shows some summary values from the code analysis. Then it shows the code blocks where duplicate or similar code was found. This is what the output looks like for my short Python code.

Source files: 1

Clones detected: 2

9 of 17 lines are duplicates (52.94%)

Parameters
clustering_threshold = 10
distance_threshold = 5
size_threshold = 5
hashing_depth = 1
clusterize_using_hash = False
clusterize_using_dcup = False

Time elapsed
Construction of AST : 0.00 seconds
Building statement hash : 0.00 seconds
Building patterns : 0.00 seconds
Marking similar statements : 0.02 seconds
Finding similar sequences of statements : 0.00 seconds
Refining candidates : 0.02 seconds
Total time: 0.03
Started at: Wed Jun 24 21:05:15 2009
Finished at: Wed Jun 24 21:05:15 2009

Clone # 1
Distance between two fragments = 4
Clone size = 7

Source file "D:\CodeToTest.py"      | Source file "D:\CodeToTest.py"
The first line is 16                | The first line is 13
newPoint = createPoint(xMin, yMax)  | newPoint = createPoint(xMin, yMin)
polygon.Add(newPoint)               | polygon.Add(newPoint)
newPoint = createPoint(xMax, yMax)  | newPoint = createPoint(xMin, yMax)
polygon.Add(newPoint)               | polygon.Add(newPoint)
newPoint = createPoint(xMax, yMin)  | newPoint = createPoint(xMax, yMax)
polygon.Add(newPoint)               | polygon.Add(newPoint)
newPoint = createPoint(xMin, yMin)  | newPoint = createPoint(xMax, yMin)



Clone # 2
Distance between two fragments = 4
Clone size = 5

Source file "D:\CodeToTest.py"      | Source file "D:\CodeToTest.py"
The first line is 19                | The first line is 13
newPoint = createPoint(xMax, yMax)  | newPoint = createPoint(xMin, yMin)
polygon.Add(newPoint)               | polygon.Add(newPoint)
newPoint = createPoint(xMax, yMin)  | newPoint = createPoint(xMin, yMax)
polygon.Add(newPoint)               | polygon.Add(newPoint)
newPoint = createPoint(xMin, yMin)  | newPoint = createPoint(xMax, yMax)



Clone Digger aims to find software clones in Python and Java programs. It is provided under the GPL license and can be downloaded from http://clonedigger.sourceforge.net

I would not try to remove all the duplicates or similarities found by Clone Digger. But I think it's a great tool for finding code that can be improved. How far you refactor depends greatly on the goal of your code and your time restrictions. Clone Digger can also be useful when working with multiple people on a project or when you have to improve some legacy code.
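
As a rough idea of what acting on such a report could look like, the copied blocks in the extract can be folded into a single loop over the corner coordinates. This is just one possible refactoring and not the only way to do it. Passing gp in as a parameter is my own assumption, made so the functions can be tried without ArcGIS, and the snake_case names are illustrative. As in the extract, where the last Add call is commented out, the closing point is not added.

```python
def create_point(gp, x, y):
    """Create a Point object; gp is the geoprocessor returned by
    arcgisscripting.create() in the extract above."""
    p = gp.createobject('Point')
    p.x = x
    p.y = y
    return p


def create_polygon(gp, x_min, x_max, y_min, y_max):
    """Build the array of corner points with one loop instead of
    four copied blocks."""
    # Same corner order as the extract: first, second, third, fourth point.
    corners = [
        (x_min, y_min),
        (x_min, y_max),
        (x_max, y_max),
        (x_max, y_min),
    ]
    polygon = gp.createobject('array')
    for x, y in corners:
        polygon.Add(create_point(gp, x, y))
    return polygon
```

With the corners kept in a list, changing the point order or adding the closing point touches one line instead of a copied block.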
Have any comments? Any tools you can't live without? Any suggestions for me? Feel free to let me know.

Related posts
Pythonnet (call .NET from Python)
Pygments (syntax highlighter)
Logging