I'm Brett Slatkin and this is where I write about programming and related topics. Check out my favorite posts if you're new to this site. You can also contact me here or view my projects.

29 December 2014

Generic programming in Go using "go generate"

Go 1.4 was released on December 10th and brings us the new go generate command. The motivation is to provide an explicit preprocessing step in the golang toolchain for things like yacc and protocol buffers. But interestingly, the design doc also hints at other use-cases like "macros: generating customized implementations given generalized packages, such as sort.Ints from ints".

That sounds a lot like generics to me! As you've probably heard, generics are a glaring omission from Go and the subject of a lot of debate. When I saw the word "macro" in the design doc I was reminded of "C with Classes", Bjarne Stroustrup's first attempt at shoehorning object-oriented programming into C. Here's what he recalls from the time:

In October of 1979 I had a pre-processor, called Cpre, that added Simula-like classes to C running and in March of 1980 this pre-processor had been refined to the point where it supported one real project and several experiments. My records show the pre-processor in use on 16 systems by then. The first key C++ library, the task system supporting a co-routine style of programming, was crucial to the usefulness of "C with Classes," as the language accepted by the pre-processor was called, in these projects.

Perhaps go generate is the first version of "Go with Generics" that we'll look back at nostalgically while we're all writing Go++. But for now I'd like to take generate for a spin and see how it works.


Why generics?

To make sure we're on the same page, I'd like to answer the question: why do I want to do this? For me the first painful moment using Go was when I tried to join together some strings with a newline character. The problem was that I'm far too reliant on writing Python code like this:

#!/usr/bin/env python

from collections import namedtuple

Person = namedtuple('Person', ['first_name', 'last_name', 'hair_color'])

people = [
    Person('Sideshow', 'Bob', 'red'),
    Person('Homer', 'Simpson', 'n/a'),
    Person('Lisa', 'Simpson', 'blonde'),
    Person('Marge', 'Simpson', 'blue'),
    Person('Mr', 'Burns', 'gray'),
]

joined = '\n'.join(repr(x) for x in people)

print 'My favorite Simpsons Characters:\n%s' % joined

Note how the '\n'.join(repr(x) for x in people) does all the heavy lifting here. It converts the object to a string representation using the repr function. The join method consumes all of those converted inputs and returns the combined string. The same approach works for any type you throw at it. The output is unsurprising:

My favorite Simpsons Characters:
Person(first_name='Sideshow', last_name='Bob', hair_color='red')
Person(first_name='Homer', last_name='Simpson', hair_color='n/a')
Person(first_name='Lisa', last_name='Simpson', hair_color='blonde')
Person(first_name='Marge', last_name='Simpson', hair_color='blue')
Person(first_name='Mr', last_name='Burns', hair_color='gray')


Here's an attempt at accomplishing the same thing generically in Go. The idea here is that I'll implement the method required to make my struct satisfy the fmt.Stringer interface. Then I'll use a type conversion to invoke the generic method With on a []fmt.Stringer array. This should work because my struct Person satisfies the interface, right?

package main

import (
    "fmt"
    "strings"
)

type Join []fmt.Stringer

func (j Join) With(sep string) string {
    stred := make([]string, 0, len(j))
    for _, s := range j {
        stred = append(stred, s.String())
    }
    return strings.Join(stred, sep)
}

type Person struct {
    FirstName string
    LastName  string
    HairColor string
}

func (s *Person) String() string {
    return fmt.Sprintf("%#v", s)
}

func main() {
    people := []Person{
        Person{"Sideshow", "Bob", "red"},
        Person{"Homer", "Simpson", "n/a"},
        Person{"Lisa", "Simpson", "blonde"},
        Person{"Marge", "Simpson", "blue"},
        Person{"Mr", "Burns", "gray"},
    }
    fmt.Printf("My favorite Simpsons Characters:%s\n", Join(people).With("\n"))
}

Unfortunately, this fails with a cryptic message:

./bad_example.go:40: cannot convert people (type []Person) to type Join

Perhaps the type conversion Join(people) is no good. What if instead I just accept an array of fmt.Stringer interfaces? My Person struct implements String so it's assignable to a fmt.Stringer. It should work. Here's the revised section of the program:

type Joinable []fmt.Stringer

func Join(in []fmt.Stringer) Joinable {
    out := make(Joinable, 0, len(in))
    for _, x := range in {
        out = append(out, x)
    }
    return out
}

func (j Joinable) With(sep string) string {
    stred := make([]string, 0, len(j))
    for _, s := range j {
        stred = append(stred, s.String())
    }
    return strings.Join(stred, sep)
}

This also fails, this time a bit more clearly:

./bad_example2.go:51: cannot use people (type []Person) as type []fmt.Stringer in argument to Join

The problem here is the difference between an array of structs and an array of interfaces. Russ Cox explains all the details here. The gist is that interface references in memory are a pair of pointers. The first pointer is to the type of the interface (like fmt.Stringer). The second pointer is to the underlying data (like Person). A []Person array is contiguous bytes in memory of Person structs. A []fmt.Stringer array is contiguous bytes in memory of interface reference pairs. The representations aren't the same, so you can't convert in a typesafe way.

So we're stuck. The only way out is to use reflection, which will slow everything down. Luckily, in Go 1.4 we now have another built-in option: go generate.


Writing a generate tool

The Go team helpfully provided an example tool for generating Stringer implementations using go generate. The code for the tool is here and it's pretty gnarly. It walks the abstract syntax tree (AST) of the source code and determines the right code to output. It's quite an odd form of generic programming.

Based on this example I tried to implement my own generate tool. My goal was to provide the join functionality I sorely missed from Python. I'd consider it success if the following program would execute simply by running go generate followed by go run *.go.

package main

//go:generate joiner $GOFILE

import (
    "fmt"
)

// @joiner
type Person struct {
    FirstName string
    LastName  string
    HairColor string
}

func main() {
    people := []Person{
        Person{"Sideshow", "Bob", "red"},
        Person{"Homer", "Simpson", "n/a"},
        Person{"Lisa", "Simpson", "blonde"},
        Person{"Marge", "Simpson", "blue"},
        Person{"Mr", "Burns", "gray"},
    }
    fmt.Printf("My favorite Simpsons Characters:\n%s\n", JoinPerson(people).With("\n"))
}

I also had to do some AST walking (that's the core of the tool). I rely on the comment // @joiner to indicate which types I want to make joinable. Yes, this is a gross overloading of comments. Perhaps something like tags for type declarations would be better if the language supported it (similar to "use asm"). Go's built-in templating libraries made it easy to render the generated functions.

The full code for my tool is available on GitHub. You can install it on your system with go install https://github.com/bslatkin/joiner. Once you do that, you can run go generate to cause Go to run the tool and output a corresponding main_joiner.go file that looks like this:

// generated by joiner -- DO NOT EDIT
package main

import (
    "fmt"
    "strings"
)

func (t Person) String() string {
    return fmt.Sprintf("%#v", t)
}

type JoinPerson []Person

func (j JoinPerson) With(sep string) string {
    all := make([]string, 0, len(j))
    for _, s := range j {
        all = append(all, s.String())
    }
    return strings.Join(all, sep)
}

Remarkably, this works for my original example above with no modifications. Here's the output from running go run *.go:

My favorite Simpsons Characters:
main.Person{FirstName:"Sideshow", LastName:"Bob", HairColor:"red"}
main.Person{FirstName:"Homer", LastName:"Simpson", HairColor:"n/a"}
main.Person{FirstName:"Lisa", LastName:"Simpson", HairColor:"blonde"}
main.Person{FirstName:"Marge", LastName:"Simpson", HairColor:"blue"}
main.Person{FirstName:"Mr", LastName:"Burns", HairColor:"gray"}


Conclusion

Does go generate make generics for Go easier? The answer is yes. It's now possible to write generic behavior in a way that easily integrates with the standard golang toolchain. I expect existing Go generics tools like gen and genny to move over to this standard approach.

However, that helps most in consuming generic code libraries and using them in your programs. Writing new generic code is still an exceptionally laborious process. Having to walk the AST just to write a generic function is insane. But you can imagine a standard generate tool that helps you write other generate tools. That's the piece we're missing to make generic programming in Go a reality. Now with go generate in the wild, I look forward to renewed interest in projects like gotgo, gonerics, and gotemplate to make this easy!
© 2009-2016 Brett Slatkin