Culture & Empire: Digital Revolution

It's been a while since I've read such an influential book, so I wanted to share it here:

Culture & Empire: Digital Revolution
written by Pieter Hintjens, a campaigner, writer, and programmer (creator of ØMQ)

A book about the digital revolution and the battle between our new communities and the old power and old money. It is full of gems, such as how to build a community, why Germany went crazy in 1939, what makes a good property system, the societal cost of patents, and many more.

Start reading now on a single page, as PDF or EPUB, or buy it on Amazon (affiliate link).

Here is a quote from the beginning of Chapter 4:
Once upon a time, there was a great Empire that ruled the known world. It owned all the lands, the wealth beneath, and the wealth above. The Empire was run by an old, faceless society of criminals. It ran on cheap oil and cheap blood. It smashed its opponents in the name of Peace. It burned their lands in the name of Reconstruction. It enslaved them in the name of Freedom. It built massive castles of edict and punishment to govern its populations, and it fed them a river of pap to keep them docile. It was powerful, invincible, and paranoid.

Far away, in a different place, a civilization called Culture had taken seed and was growing. It owned little except a magic spell called Knowledge. The Culture ran on light, and built little bubbles of fire and hope. It seduced its critics by giving them what they wanted, no matter how unusual. And as it pulled in more people, it grew and built more of its bubbles.

When the Empire first encountered the Culture, it was puzzled. There were no armies to crush, no statesmen to corrupt and recruit, no castles to loot and burn. So it ignored the Culture and its pretty bubbles, hoping it would go away.

The Culture grew, and grew faster than you could follow. In less than a generation, it had started to build cities, impossibly beautiful spheres of fire and hope, massive, and yet gentler than the breeze. More people quietly left the castles to move to the cities of the Culture, where they too learned to build their own bubbles of flames and joy.

The Culture seemed harmless. However, the Empire depended on its vassal masses. If the masses left to go to the Culture’s cities, the Empire would starve and die. Total War was inevitable. Both the Empire and the Culture knew it, and prepared for it in very different ways.

The Empire attacked. It tore down the cities closest to it and told the Culture, stop building or we will come back. And for each city it burnt, a hundred others sprang up. Culture shrugged and said, “We enjoy building new cities.” So the Empire sent its infiltrators and spies into the cities to try to corrupt them. And the Culture laughed, clapped its hands, and exclaimed, “We do much worse to ourselves every day. Look, we enjoy this game!” And it opened its hands. And there lay some of the Empire’s darkest and deepest secrets, for all to see.

So the Empire, the cold finger of fear touching its heart, smiled its most sincere smile and welcomed the Culture into its lands. And then it began to erect a far wall so wide and so high that it could cover all the cities of the Culture in darkness. If the Culture ran on light, thought the Empire, then it would destroy light.

For other recommended books take a look at the following page.

Array manipulation and booleans in numpy, hunting for what went wrong

Recently I was doing some raster processing with gdal_calc.py, but I was getting some surprising results.
The goal was to convert continuous values into binary values based on a threshold and then detect places where only A was True (value: 1), where only B was True (value: -1) and where both A and B were True (value: 0). With a threshold set to 2 I came up with the following formula:

(A > 2) - (B > 2)

But after calling gdal_calc.py with my formula and inspecting the results, I only got values of 0 and 1.
After inspecting gdal_calc.py I noticed that it uses numpy and more specifically numpy arrays for the raster manipulation.

This is how my Python shell session went (numpy_array_math.py):

>>> import numpy as np
>>> a = np.array([1,2,3,4])
>>> b = np.array([1,5,3,2])
>>> print(a > 2)
[False False  True  True]
>>> print(b > 2)
[False  True  True False]
>>> print(True-False)
1
>>> print(False-True)
-1
>>> print((a>2)-(b>2))
[False  True False  True]
>>> print((a>2)*1-(b>2))  # we got a winner
[ 0 -1  0  1]

The problem was that boolean subtraction in plain Python does generate the expected numeric results, but with numpy arrays the results were converted back into a boolean array after the subtraction. And indeed, converting -1, 1 or any other non-zero number to a boolean generates a True, which, when converted back to a number for writing the raster to disk, gives you the value 1.
The solution was to force at least one of the arrays to be numeric so that we subtract numeric values.

>>> bool(1)
True
>>> bool(2)
True
>>> bool(-1)
True
>>> bool(0)
False
>>> print((a>2)*1)
[0 0 1 1]

If you want to try this yourself on Windows, the easiest way to install GDAL is with the OSGeo4W installer. A Windows installer for numpy can be found on the website of Christoph Gohlke, but also consider installing the full SciPy stack with one of the Scientific Python distributions.

What surprising results have you encountered with numpy.array or gdal_calc?

Minimal introduction to Python

A few years ago I gave a short introduction to Python to some of my former colleagues and this was the result (also on GitHub):

# 1) data basics

# numbers
i = 1234    # integer
f = 1.234   # float

# operators
a = 12 + 5  # = 17
b = 3 * 4   # = 12
c = 6 / 3   # = 2.0 (in Python 3, / always returns a float; use // for integer division)
d = 46 % 5  # = 1 (=> modulo operator)

a += 1      # = 18
a -= 2      # = 16
a *= 2      # = 32
a /= 2      # = 16.0
a %= 5      # = 1.0

# text
s = 'string'
s = s.upper()               # 'STRING'
s = s.lower()               # 'string'
s.startswith('s')           # True
s.endswith('ng')            # True
s = s.replace('str', 'bl')  # 'bling'
l = s.split('i')            # list ['bl', 'ng']
strings = ['s', "t", "r'i'n", 'g"s"tring', "3"]

## add the prefix r for e.g. paths to files to escape the backslash character

testdoc = r'test\test.txt' ## same as : 'test\\test.txt'

# list
l = ['a1']
l.append('b2')
l.append('c3')
l[0] # 'a1'


mixed = ['3', 'abc', 3, 4.56]

d = {} # dictionary (key-value pairs)

d = {'a' : 'blabla',
     'b' : 1,
     3 : 'ablabl',
     4 : 12.3,
     16.9 : 'dif3'}

d['a'] # 'blabla'

# 2) conditions, loops and functions

# indentation is required !!!

if 3 > 2:
    print('3 > 2')
elif 1 == 0:
    print('1 = 0')
else:
    print('else clause')

if 'a' in d:
    print(d['a'])
else:
    print('not found')

for x in range(0, 5):  # from 0 up to, but not including, 5 (0, 1, 2, 3, 4)
    print(x)

letters = ['a', 'b', 'c']
for letter in letters:
    print(letter)

# list comprehension
upper_letters = [letter.upper() for letter in letters]
print(upper_letters)


d = { 1:'a', 2:'b', 3:'c'}
for key, value in d.items():  # Python 3; in Python 2 this was d.iteritems()
    print(str(key), value)
for key in d.keys():
    print('key :', str(key))
for value in d.values():
    print('value :', value)

def special_sum(numberlist):
    total = 0
    for element in numberlist:
        if element < 5:
            continue # go to the next element
        elif total > 100:
            break # stop the loop
        else:
            total += element
    return total

print(special_sum([1,2,2,4,8,50, 60])) # = 118

# 3) using os and shutil

# import modules

import os
import shutil

def setup_test(directory):
    if not os.path.isdir(directory):
        os.mkdir(directory)
    file1 = os.path.join(directory, 'file1_test.dll')
    open(file1, 'a').close() # create empty file and close it
    file2 = os.path.join(directory, 'file2_test.txt')
    open(file2, 'a').close() # 'a' creates a file for appending (doesn't overwrite)
    
setup_test('test')

# looping over files in dir and subdirs and renaming some of them
def list_files(startdir):
    for element in os.listdir(startdir):
        path = os.path.join(startdir, element)
        if os.path.isfile(path):
            print(path)
            root, extension = os.path.splitext(path)
            if extension.lower() == '.dll': # add an extra extension to dll's
                shutil.copyfile(path, path + '.REMOVETHIS')
                # with os.rename you can replace the file
                ## os.rename(path, path + '.REMOVETHIS')
        else:
            list_files(path)

startdir = r'test'
list_files(startdir)

# or you can loop with os.walk

for root, directories, files in os.walk(startdir, topdown=True):
    print('dir : %s' % root)
    if files:
        print('files :')
        for f in files:
            print('\t', os.path.join(root, f))

# 4) reading, parsing and writing files

def create_tab_test(inputpath):
    with open(inputpath, 'w') as inputfile: # 'w' creates a file for (over)writing
        # create some dummy content separated by tabs and with newlines at the end
        lines = ['\t'.join([str(1*i),str(2*i)])+'\n' for i in range(5)]
        # write to the file
        inputfile.writelines(lines)
        
def tab_to_csv(inputpath, outputpath): # only for small files
    lines = []
    with open(inputpath) as inputfile:
        for line in inputfile:
            line = line.replace('\n', '')
            columns = line.split('\t')
            line = ';'.join(columns)
            lines.append(line)
    with open(outputpath, 'w') as outputfile: # overwrites the outputfile if it exists
        outputfile.write('\n'.join(lines))
    
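# os.path.join keeps the paths portable; on Windows a raw string like r'D:\temp\test.txt' also works
inputpath = os.path.join('test', 'test.txt')
outputpath = os.path.join('test', 'test.csv')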

create_tab_test(inputpath)
tab_to_csv(inputpath, outputpath)
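
The tab_to_csv function above collects every line in a list before writing, which is why it is marked as only suitable for small files. A minimal sketch of a streaming variant, assuming Python 3 and the standard csv module, could look like this. The name tab_to_csv_stream and the demo file names are made up for illustration.

```python
import csv
import os
import tempfile

def tab_to_csv_stream(inputpath, outputpath):
    """Convert a tab-separated file to a semicolon-separated one, row by row."""
    # newline='' lets the csv module handle line endings itself
    with open(inputpath, newline='') as inputfile, \
         open(outputpath, 'w', newline='') as outputfile:
        reader = csv.reader(inputfile, delimiter='\t')
        writer = csv.writer(outputfile, delimiter=';', lineterminator='\n')
        for row in reader:
            writer.writerow(row)  # only one row is held in memory at a time

# small self-contained demo in a temporary directory
workdir = tempfile.mkdtemp()
demo_in = os.path.join(workdir, 'demo.txt')
demo_out = os.path.join(workdir, 'demo.csv')
with open(demo_in, 'w') as f:
    f.write('0\t0\n1\t2\n2\t4\n')

tab_to_csv_stream(demo_in, demo_out)
with open(demo_out) as f:
    result = f.read()
print(result)
```

Unlike tab_to_csv, this version ends the output with a newline, and the csv writer quotes any field that itself contains a semicolon.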

Want to learn more or need some help?
Contact me at mail@samuelbosch.com or take a look at one of the following resources:
If you want a bigger list of Python and other CS-related books to look at, then the following free programming books list might be what you're looking for.

-- update --
Some suggestions by Meriam in the comments: