Notes and reflections on From here to human-level AI

On the 20th of March the Ghent strong AI meetup group will discuss John McCarthy's 2007 paper From here to human-level AI, so I decided to highlight some parts and write up my notes and thoughts.

1. What is human-level AI

There are two approaches to human-level AI, but each presents difficulties. It isn’t a question of deciding between them, because each should eventually succeed; it is more a race.

  1. If we understood enough about how the human intellect works, we could simulate it.
  2. To the extent that we understand the problems achieving goals in the world presents to intelligence we can write intelligent programs. That's what this article is about.
    Much of the public recognition of AI has been for programs with a little bit of AI and a lot of computing.

There are some big projects and a lot of researchers attacking the human-level AI problem with one of the two approaches. The first approach, or at least part of it, is used by the American BRAIN Initiative (Brain Research through Advancing Innovative Neurotechnologies) and the European Human Brain Project. The second approach, and especially everything related to deep learning, has recently had a fair amount of publicity due to human-level and super-human results on some pattern recognition tasks (image classification, face verification) and games (human-level performance on 29 Atari games). For an overview and history of deep learning you can check out the 88-page overview article by Jürgen Schmidhuber. As a side note, Jürgen Schmidhuber recently did an interesting AMA (ask me anything) on Reddit (summary from FastML).

2. The common sense informatic situation

The key to reaching human-level AI is making systems that operate successfully in the common sense informatic situation.
In general a thinking human is in what we call the common sense informatic situation. It is more general than any bounded informatic situation. The known facts are incomplete, and there is no a priori limitation on what facts are relevant. It may not even be decided in advance what phenomena are to be taken into account. The consequences of actions cannot be fully determined. The common sense informatic situation necessitates the use of approximate concepts that cannot be fully defined and the use of approximate theories involving them. It also requires nonmonotonic reasoning in reaching conclusions.

Nonmonotonic reasoning: a logic is non-monotonic if some conclusions can be invalidated by adding more knowledge (source).
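
A classic illustration (my example, not from the paper): from "birds normally fly" and "Tweety is a bird" we conclude that Tweety flies, but adding the fact "Tweety is a penguin" forces us to retract that conclusion. In a monotonic logic, adding facts can never invalidate earlier conclusions.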

Common sense facts and common sense reasoning are necessarily imprecise. The imprecision necessitated by the common sense informatic situation applies to computer programs as well as to people.

3. The use of mathematical logic

Mathematical logic was devised to formalize precise facts and correct reasoning. Its founders, Leibniz, Boole and Frege, hoped to use it for common sense facts and reasoning, not realizing that the imprecision of concepts used in common sense language was often a necessary feature and not always a bug. The biggest success of mathematical logic was in formalizing purely mathematical theories for which imprecise concepts are unneeded. Since the common sense informatic situation requires using imprecise facts and imprecise reasoning, the use of mathematical logic for common sense has had limited success. This has caused many people to give up. Others devise extended logical languages and even extended forms of mathematical logic.

Further on he notes that using different concepts and different predicate and function symbols in a new mathematical logic language might still make mathematical logic adequate for expressing common sense. But he is not very optimistic.

Success so far has been moderate, and it isn’t clear whether greater success can be obtained by changing the concepts and their representation by predicate and function symbols or by varying the nonmonotonic formalism.

4. Approximate concepts and approximate theories

Other kinds of imprecision are more fundamental for intelligence than numerical imprecision. Many phenomena in the world are appropriately described in terms of approximate concepts. Although the concepts are imprecise, many statements using them have precise truth values.

He follows up with two clarifying examples, one about the concept Mount Everest, the other about the concept of the welfare of a chicken. Mount Everest is an approximate concept because the exact pieces of rock and ice that constitute it are unclear. But it is still possible to infer solid conclusions from a foundation built on a quicksand of approximate concepts without definite extensions, e.g. if you haven't been to Asia then you've never climbed Mount Everest. The core of the welfare-of-a-chicken problem is: is it better to raise a chicken with care and nice food and then slaughter it, or would it have a better life in the wild, risking starvation and foxes? McCarthy concludes from this:

There is no truth of the matter to be determined by careful investigation of chickens. When a concept is inherently approximate, it is a waste of time to try to give it a precise definition.

In order to reach human-level AI we'll have to be able to represent approximate concepts in a way that lets the computer reason about them.

5. Nonmonotonic reasoning

6. Elaboration tolerance

7. Formalization of context

8. Reasoning about events - especially action

Human level intelligence requires reasoning about strategies of action, i.e. action programs. It also requires considering multiple actors and also concurrent events and continuous events. Clearly we have a long way to go.

9. Introspection and self-awareness

People have a limited ability to observe their own mental processes. For many intellectual tasks introspection is irrelevant. However, it is at least relevant for evaluating how one is using one's own thinking time. Human-level AI will require introspective ability. In fact programs can have more introspective ability than humans do, because they can examine themselves, both in source and compiled form, and also reason about the current values of the variables in the program.

10. Heuristics

The largest qualitative gap between human performance and computer performance is in the area of heuristics, even though the gap is disguised in many applications by the millions-fold speed advantage of computers. The general purpose theorem proving programs run very slowly, and the special purpose programs are very specialized in their heuristics.
I think the problem lies in our present inability to give programs domain and problem dependent heuristic advice.

McCarthy advocates the usage of declarative heuristics and explains the concept of postponable variables in constraint satisfaction problems.

11. Psychological, social and political obstacles

In this article, McCarthy states that although the main problems in reaching human-level AI lie in the inherent difficulty of the scientific problems, research is also hampered by the computer science world's focus on connecting basic research to applied problems. Artificial intelligence has encountered philosophical and ideological (religious) objections, but the attacks on AI have been fairly limited.

As the general public gets more and more acquainted with the potential dangers of human-level AI, and especially super-human-level AI, I believe that the pressure against AI research will increase.

An interesting book by Nick Bostrom covering the dangers of AI is Superintelligence: Paths, Dangers, Strategies.

12. Conclusion

Between us and human-level intelligence lie many problems. They can be summarized as that of succeeding in the common sense informatic situation.

If you want to read more about the road to human-level AI and superintelligence and its possible consequences, then I can recommend this two-part article by Tim Urban on his blog Wait But Why.

Benchmarking reading binary values from a file with F#, Python, Julia, R, Go and OCaml

In one of my recent posts I showed some F# code for storing and reading integers as binary data. Since then I've created different versions of the reading part in Python, Julia, Go, OCaml and R to compare the performance of the different languages on this simple task, and to get a feeling for what it's like to program in Julia, Go and OCaml, as I hadn't created any programs in these languages yet.

The table below shows the benchmark results for reading 10 times 10000 values and 10000 times 10 values, with links to the source code files in the different programming languages:

Language   10 times 10000 values   10000 times 10 values
F#         6 seconds               20 seconds
Python     26 seconds              40 seconds
Julia      45 seconds              72 seconds
Go         8 seconds               25 seconds
OCaml      2.5 seconds             48 seconds
R          110 seconds             NA

The overall fastest version is the one written in F#, but note that it's also the version I have tweaked the most. As I'm not very experienced in most of these languages, any performance tips are welcome. Note that I tried using memory-mapped files in .NET and Python; this improved performance when querying lots of values from the same file but made it worse in other cases.

The implementation of the functionality is mostly similar across the different languages. Some notable differences were:

  • Julia apparently doesn't have a null value so I refrained from checking whether the read integer value was equal to the int32 minimum value (-2147483648).
  • In Go converting the bytes to integers was faster with a custom function.
  • I didn't find a function in the OCaml Core library to convert bytes to a 32-bit integer, but luckily I found one on Stack Overflow.

F#:
open System
open System.IO

let readValue (reader:BinaryReader) cellIndex = 
    // jump to the value's location: 4 bytes per Int32
    reader.BaseStream.Seek(int64 (cellIndex*4), SeekOrigin.Begin) |> ignore
    match reader.ReadInt32() with
    | Int32.MinValue -> None // Int32.MinValue marks a missing value
    | v -> Some(v)
        
let readValues indices fileName = 
    use reader = new BinaryReader(File.Open(fileName, FileMode.Open, FileAccess.Read, FileShare.Read))
    let values = Array.map (readValue reader) indices
    values
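
For reference, this is roughly how I call and time the readers; the file name and the indices below are made up for illustration, and this is not the exact benchmark harness:

// illustrative only: query one file 10 times for 10000 values
let indices = [| 0 .. 9999 |]
let sw = System.Diagnostics.Stopwatch.StartNew()
for _ in 1 .. 10 do
    readValues indices "raster.bin" |> ignore
sw.Stop()
printfn "10 times 10000 values: %d ms" sw.ElapsedMilliseconds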

Python:
import os
import struct

def read_values(filename, indices):
    # indices are sorted and unique
    values = []
    with open(filename, 'rb') as f:
        for index in indices:
            f.seek(index * 4, os.SEEK_SET)  # 4 bytes per int32
            b = f.read(4)
            v = struct.unpack("@i", b)[0]
            if v == -2147483648:  # int32 min marks a missing value
                v = None
            values.append(v)
    return values

Julia:
function readvalue(stream, position)
    seek(stream, position)  # position is a byte offset
    return read(stream, Int32)
end

function readvalues(filename::String, indices)
    stream = open(filename, "r")
    try
        # 4 bytes per Int32; the no-data sentinel is returned as-is (no null in Julia)
        return Int32[readvalue(stream, index*4) for index in indices]
    finally
        close(stream)
    end
end

Go:
import ("os")
func bytes2int(b []byte) int32 {
    v := int32(0)
    for  i := 0; i < 4; i++ {
        v = v | (int32(b[i]) << (uint(8*i)))
    }
    return v
}

func readValues(indices []int, filename string) []int32 {
    results := make([]int32, len(indices))
    b := make([]byte, 4)
    f,_ := os.Open(filename)
    for i, cellIndex := range indices {
        f.Seek(int64(cellIndex*4), os.SEEK_SET)
        f.Read(b)
        value := bytes2int(b) // around 10-20% faster then binary.Read
        if value != -2147483648 {
            results[i] = value
        } else {
            results[i] = 99999
        }
    }
    return results
}

OCaml:
let input_le_int32 inchannel = (* http://stackoverflow.com/a/6031286/477367 *)
  let res = ref 0l in
    for i = 0 to 3 do
      let byte = input_byte inchannel in
        res := Int32.logor !res (Int32.shift_left (Int32.of_int byte) (8*i))
    done;

    (* the Int32 minimum value marks a missing cell *)
    match !res with
      | -2147483648l -> None
      | v -> Some(v)

let readvalue inchannel index =
  seek_in inchannel (index*4);
  input_le_int32 inchannel

let readvalues (indices:int array) filename =
  let inchannel = open_in_bin filename in
    try
      let result = Array.map (readvalue inchannel) indices in
        close_in inchannel;
        result
    with e ->
      close_in_noerr inchannel;
      raise e

R:

read.values <- function(filename, indices) {
  conn <- file(filename, "rb")
  read.value <- function(index) {
    seek(conn, where = index * 4)  # 4 bytes per int32
    readBin(conn, integer(), size = 4, n = 1, endian = "little")
  }
  r <- sapply(indices, read.value)
  close(conn)
  r[r == -2147483648] <- NA  # int32 min marks a missing value
  r
}

Any suggestions for improving the implementation in one of the above programming languages? Which language would you like to compare my results with? Which other language(s) do you expect to be faster than my benchmark results? Can you help me out with a version in your favorite language, or in C, Fortran, Common Lisp, Scheme, Clojure, Java or J?

Becoming Functional

Here is my first book review from the O'Reilly Reader Review Program. I picked Becoming Functional by Joshua Backfield because I thought it would be nice to compare it to the latest book I've read on functional programming, Functional JavaScript by Michael Fogus (coincidentally also published by O'Reilly).

Becoming Functional is a step-by-step introduction to some basic concepts of functional programming: higher-order functions (passing a function to a method), pure functions (for the same input always returning the same output, without side effects), immutable variables, recursion and lazy evaluation.

The first chapters are in Java, but starting from chapter 4 the book switches to Groovy, and the last chapters are in Scala. The Java code is not very complicated and is probably understandable for anyone with knowledge of statically typed object-oriented programming languages. Note that all Java samples use Java 7, which is rather verbose for functional programming; this is one of the reasons the book switches to Groovy and Scala. The other reason is of course that these languages support functional concepts out of the box.

Content-wise the book starts with an introduction to functional programming, then makes the transition from first-class functions to higher-order functions and pure functions. Next are immutable variables and recursion, with a nice introduction to tail recursion, followed by small side steps to laziness and statements. It concludes with pattern matching and functional object-oriented programming. The last chapter is surprising, as it first gives advice on transitioning into functional programming, then talks about new design patterns and ends with a full implementation of a simplistic database in Scala.

It's not a bad book in the sense that it teaches incorrect material, but it's not a very good book either. It's a book written for Java developers who have never heard of the basic concepts of functional programming, but personally I prefer a different approach to achieving this goal. Instead of refactoring code from the imperative style into the functional style and introducing concepts along the way, I think it's faster and easier to learn functional programming when the concepts are clearly explained and shown directly. Only then should a book elaborate on how to migrate the original code to the functional style and explain why this is a good thing to do. Although it's a rather short book (an estimated 140 pages) it is not very dense; it gives semi-real-world examples based on requirements from a fictional company, XXY. Only the most common functional programming concepts are introduced.

My advice is to buy this book if you're a Java developer who has no experience in functional programming.

Otherwise I would skip this book and instead pick one of the books from my page of recommendations.

Storing and fetching raster values in F#

In my quest for a fast system to fetch values from rasters at random cell locations, I've been experimenting with a few different implementations in F#. More specifically, I now have 4 methods for reading raster files. As usual all source code is available online (AsciiToBin.fsx) and all comments on it are very welcome.

The first and simplest method I came up with converts the ASCII raster files to binary files, where every integer is converted to its binary equivalent and written to disk. This saves about 50% of disk space compared to the very large ASCII files and also provides a fast and easy-to-program way of accessing the values. Note that some cells contain no data; these are stored as Int32.MinValue. As you can see, the code is rather short.

module SimpleReadWrite = 

    let writeValue (writer:BinaryWriter) (value:int option) =
        match value with
        | Some(v) -> writer.Write(v)
        | None -> writer.Write(Int32.MinValue)

    let writeValues fileName (values:seq<int option>) =
        use writer = new BinaryWriter(File.Open(fileName, FileMode.OpenOrCreate))
        values
        |> Seq.iter (writeValue writer)
            
    let readValue (reader:BinaryReader) cellIndex = 
        // set stream to correct location
        reader.BaseStream.Position <- cellIndex*4L
        match reader.ReadInt32() with
        | Int32.MinValue -> None
        | v -> Some(v)
        
    let readValues fileName indices = 
        use reader = new BinaryReader(File.Open(fileName, FileMode.Open, FileAccess.Read, FileShare.Read))
        // Use list or array to force creation of values (otherwise reader gets disposed before the values are read)
        let values = List.map (readValue reader) (List.ofSeq indices)
        values

The second version I created uses a memory-mapped file for reading the values from the same format as before. This is faster (about 2 times) when we want to query lots of values from the same raster, but also 2 times slower when you query, for example, 10000 times 10 values from different rasters.

module MemoryMappedSimpleRead =

    open System.IO.MemoryMappedFiles

    let readValue (reader:MemoryMappedViewAccessor) offset cellIndex =
        // positions are relative to the start of the mapped view
        let position = (cellIndex*4L) - offset
        match reader.ReadInt32(position) with
        | Int32.MinValue -> None
        | v -> Some(v)
        
    let readValues fileName indices =
        use mmf = MemoryMappedFile.CreateFromFile(fileName, FileMode.Open)
        let offset = (Seq.min indices ) * 4L
        let last = (Seq.max indices) * 4L
        let length = 4L+last-offset
        use reader = mmf.CreateViewAccessor(offset, length, MemoryMappedFileAccess.Read)
        let values = (List.ofSeq indices) |> List.map (readValue reader offset)
        values

The third version is similar to the simple reader, but it fetches multiple bytes at once when two or more indexes are within a certain range. The performance is a bit worse than the simple reader, so I'm not going into any further details. But if you want you can check the solution on GitHub; any suggestions for easier ways of grouping the indexes by inter-distance are welcome (a sketch of one possible approach follows below).
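
To make the grouping idea concrete, here is a minimal sketch of one possible approach (my own simplification, not the version on GitHub): it splits a sorted list of cell indexes into runs whose neighbours are at most maxGap cells apart, so that each run could be fetched with a single read.

// split sorted indices into runs where consecutive indices
// are at most maxGap apart
let groupByDistance (maxGap: int64) indices =
    match indices with
    | [] -> []
    | first :: rest ->
        let folder (current, groups) index =
            match current with
            | head :: _ when index - head <= maxGap -> (index :: current, groups)
            | _ -> ([index], List.rev current :: groups)
        let (last, groups) = List.fold folder ([first], []) rest
        List.rev (List.rev last :: groups)

For example, groupByDistance 10L [1L; 2L; 5L; 100L; 101L] yields [[1L; 2L; 5L]; [100L; 101L]].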

The last version I created is more space efficient in my case. As I work with world oceanic data, about two thirds of my grids don't have any data (land). To avoid storing this data I separately store a file indicating which cells don't have data, and I skip the cells without data when writing the binary file. The disadvantage is that this makes everything a lot more complex, because you have to map your cell indexes to the locations of your binary values in a space-efficient way. To be able to store the bitmap I created some BitConverter extension methods to convert a boolean array to a byte array and back, which I have also posted on fssnip. The end result has performance comparable to that of the simple reader, so if disk space is no problem then this solution isn't worth the additional complexity.

module BitConverter = 
    let pow2 y = 1 <<< y
    // convert booleans to bytes in a space efficient way
    let FromBooleans (bools:bool []) =
        seq {
            let b = ref 0uy
            for i=0 to bools.Length-1 do
                let rem = (i  % 8)
                if rem = 0 && i<> 0 then 
                    yield !b
                    b := 0uy
                if bools.[i] then
                    b := !b + (byte (pow2 rem))
            yield !b
        } |> Array.ofSeq
    // to booleans only works for bytes created with FromBooleans
    let ToBooleans (bytes:byte []) = 
        bytes
        |> Array.map (fun b -> Array.init 8 (fun i -> ((pow2 i) &&& int b) > 0))
        |> Array.concat
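
The bitmap only records which cells lack data; to read a value you still need to translate a cell index into the position of its value in the packed file. Below is a naive sketch of that translation (the name valueOffset and the approach are mine, and the real code needs to be smarter than a linear scan):

// translate a cell index into the byte offset of its value in the
// packed file: skip no-data cells and count 4 bytes per stored cell
let valueOffset (noData: bool []) cellIndex =
    if noData.[cellIndex] then None // this cell was not stored
    else
        let mutable dataCells = 0
        for i in 0 .. cellIndex - 1 do
            if not noData.[i] then dataCells <- dataCells + 1
        Some (int64 dataCells * 4L)

Precomputing cumulative counts per block of cells, instead of scanning from the start on every lookup, is the kind of tweaking the next paragraph refers to.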

After lots of tweaking I managed to get performance similar to that of SimpleReadWrite, but with 30% less disk space needed and a more complex codebase.

Some performance-related things I learned on the way are:
  • The Get method from System.Collections.BitArray is slow
  • You might want to convert lots of Seq chaining to one for loop (see the sketch after this list)
  • Some use of mutable values (within a function) might be necessary
  • Precompute as much as you can
And as I've tweeted before, I really like to evaluate my code in the REPL (same for my Python and R work).
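
To illustrate the Seq chaining point, here is a made-up example (not code from AsciiToBin.fsx):

let f x = x * 2
let p y = y % 3 = 0
let g y = y + 1
let data = [| 1 .. 1000000 |]

// several chained Seq operations allocate intermediate sequences ...
let chained = data |> Seq.map f |> Seq.filter p |> Seq.map g |> Seq.toArray

// ... while a single pass does the same work in one loop
let singlePass =
    [| for x in data do
         let y = f x
         if p y then yield g y |]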

Any thoughts on how to make this faster? Do you know a fast library/system that achieves the same results? Should I compress my rasters more to decrease disk access? Any other suggestions?


Think Python

Several years ago, when I was learning Python for the first time, I read the online book "How to Think Like a Computer Scientist". Today I finished Think Python: How To Think Like a Computer Scientist by Allen B. Downey (free e-version), which is an evolved version of that book.

You should read this book if you want to learn programming or if you want to learn Python. But if you already have advanced programming skills, then I would suggest just skimming the free online version and spending your money on a more advanced Python book.

Think Python starts with the most basic things like operators, variables and assignment; then functions, conditionals, recursion and iteration are introduced, followed by strings, lists, dictionaries and tuples. The next chapter is a practical chapter on files, and the book ends with 4 chapters introducing object-oriented programming. Between the different chapters there are 4 case studies where the learned concepts and techniques are applied.

What you won't learn in this book: 

  • what the different standard libraries are
  • web development in python
  • popular (scientific) libraries like numpy
  • writing (unit) tests for your programs
  • Python 3, except some small remarks
What I particularly liked:
  • the information about debugging your programs at the end of every chapter and the appendix 
  • the step by step introduction to object-oriented programming
  • the case studies
  • the exercises
  • the glossary at the end of every chapter
As you might have inferred from the above, this book is not a reference book (we have the web for that) but a book that teaches you to think like a programmer.

An alternative way to learn Python is Learn Python the Hard Way (html) (pdf + epub + video) by Zed Shaw.

If you want to improve your Python skills then take a look at this list of advanced books. I especially liked Expert Python Programming by Tarek Ziadé.