I'm Brett Slatkin and this is my personal site. I write code. These are my projects:

31 July 2014

How I'm writing a programming book

I've been working on Effective Python for just over two months now. The plan is to have 8 chapters in total. I've written a first draft of 5 so far. Chapter 3, the first one I wrote, was the hardest for many reasons. I had to establish a consistent voice for talking about Python. I took on the most difficult subjects first (objects and metaclasses) to get that work out of the way. I also had to build tools to automate my workflow for writing.

Each chapter consists of a number of short items that are about 2-4 pages in length. The title of an item is the "what", the shortest possible description of its advice (e.g., "Prefer Generators to Returning Lists"). The text of the item is the "why", the explanation that justifies following the advice. It's important to make the core argument for each item using Python code. But it's also important to surround the code with detailed reasoning.

Before I started I read a nice retrospective on how one author wrote their programming book. They had separate source files for the example code and used special comments to stitch it into the book's text at build time. That's a great idea because it ensures the code that's in the book definitely compiles and runs. There's nothing worse in programming tutorials than typing in code from the text and having it barf.

I wanted to go a step further. I wanted my examples to be very short. I wanted to intermix code and prose more frequently, so the reader could focus on smaller pieces one step at a time. I wanted to avoid huge blocks of code followed by huge blocks of prose. I needed a different approach.


Writing workflow

After some experimenting, I landed on a script that processes GitHub Flavored Markdown. It reads the input Markdown incrementally, finds the blocks that are Python code, runs them dynamically, and inserts their output into the block that follows.

Here's an example of what the input Markdown looks like:

The basic form of the slicing syntax is `list[start:end]` where `start` is inclusive and `end` is exclusive.

```python
a = [1, 2, 3, 4, 5, 6, 7, 8]
print('First four:', a[:4])
print('Last four: ', a[-4:])
print('Middle two:', a[3:-3])
```

```
First four: [1, 2, 3, 4]
Last four:  [5, 6, 7, 8]
Middle two: [4, 5]
```

When slicing from the start of a list you should leave out the zero index to reduce visual noise.

```python
assert a[:5] == a[0:5]
```

I write the files in Sublime Text. When I press ⌘B it builds the Markdown by running my script, which executes all the Python, inserts the output back into the text, and then overwrites the original file in-place. This makes it easy to develop the code examples at the same time I'm writing the explanatory prose. It feels like the read/eval/print loop of an interactive Python shell.
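
A minimal sketch of what such a build step could look like is below. It is not the actual script: `run_markdown` is a made-up name, and the rule that a bare fenced block directly after a Python block counts as stale output is an assumption for illustration. All blocks share one namespace so later examples can reuse earlier variables, as the slicing example does with `a`.

```python
import contextlib
import io

FENCE = '`' * 3  # built indirectly so this example contains no literal fence


def run_markdown(text):
    """Run each Python block in text and place its printed output after it."""
    namespace = {}  # shared, so later blocks can use earlier variables
    lines = text.split('\n')
    result = []
    i = 0
    while i < len(lines):
        result.append(lines[i])
        is_python = lines[i] == FENCE + 'python'
        i += 1
        if not is_python:
            continue

        # Copy the code through to the closing fence, remembering it.
        code = []
        while i < len(lines) and lines[i] != FENCE:
            code.append(lines[i])
            result.append(lines[i])
            i += 1
        if i < len(lines):
            result.append(lines[i])  # the closing fence
            i += 1

        # Execute the block and capture whatever it prints.
        captured = io.StringIO()
        with contextlib.redirect_stdout(captured):
            exec('\n'.join(code), namespace)

        # Assumption: a bare fenced block directly after a Python block is
        # output from a previous build, so skip it before writing fresh output.
        if lines[i:i + 2] == ['', FENCE]:
            i += 2
            while i < len(lines) and lines[i] != FENCE:
                i += 1
            i += 1

        printed = captured.getvalue().rstrip('\n')
        if printed:
            result.extend(['', FENCE] + printed.split('\n') + [FENCE])
    return '\n'.join(result)
```

Because old output is dropped before new output is written, running the function on its own result gives the same text back, which is what makes overwriting the file in place safe to repeat.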

My favorite part is how I made Python treat the Markdown files as input source code. That means when there's an error in my examples and an exception is raised, I'll get a traceback into the Markdown file at exactly the line where the issue occurred.

Here's an example of what that looks like in the Sublime build output:

Traceback (most recent call last):
  File ".../Slicing.md", line 29, in <module>
    a[99]
IndexError: list index out of range
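
One way to get a traceback like this, sketched under assumptions: compile each block with the Markdown file's name and pad it with blank lines so its line numbers match the file. The post does not say this is how the real script does it, and `exec_block` and `failing_location` are hypothetical names.

```python
import traceback


def exec_block(code, filename, first_line, namespace):
    """Execute code so errors report filename and the block's own line numbers.

    first_line is the 1-based line of the Markdown file where code starts.
    """
    # Padding with blank lines shifts every line number reported by Python.
    padded = '\n' * (first_line - 1) + code
    exec(compile(padded, filename, 'exec'), namespace)


def failing_location(code, filename, first_line):
    """Return (filename, line) of the frame that raised, or None on success."""
    try:
        exec_block(code, filename, first_line, {})
    except Exception as error:
        frame = traceback.extract_tb(error.__traceback__)[-1]
        return frame.filename, frame.lineno
    return None
```

Python looks up the source text for a traceback by filename, so when the name passed to `compile` is the path of the real Markdown file, the offending line is printed as well.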

It's essentially IPython Notebook, but tuned for my specific needs and checked into a git repo as Markdown flat files. Update: A couple of people mentioned that this is a variation of Knuth's Literate Programming. Indeed it is!


Publishing workflow

Unfortunately, my deliverable for each chapter must be a Microsoft Word document. As a supporter of open source software and open standards this requirement made me wince when I first heard it. But the justification is understandable. The publisher has a technical book making system that uses Word-based templates and formatting. They have their own workflow for editing and preparing the book for print. This is the reality of desktop publishing. More modern tools like O'Reilly Atlas exist, but they are new and still in beta.

There is no way I'm going to manually convert my Markdown files into Word files. The set of required paragraph and character styles is vast and complicated. These styles are part of why the published book will look good, but it's tedious work that's easy to get wrong. Sounds like the perfect job for automation!

I have a second script that reads the input Markdown (using mistune) and spits out a Word .docx file (using python-docx). The script has a bunch of rules to map Markdown syntax to Word formatting. The script also passes all of the Python code examples through the Python lexer to generate syntax highlighting in the resulting document.
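
The post does not say which lexer is used. As a rough stand-in that needs only the standard library, the sketch below uses `tokenize` to label each token with a kind that a conversion script could then map to a Word character style. The kind names and `classify_tokens` are made up for illustration.

```python
import io
import keyword
import tokenize


def classify_tokens(source):
    """Yield (kind, text) for each token that would get its own styling."""
    readline = io.StringIO(source).readline
    for token in tokenize.generate_tokens(readline):
        if token.type == tokenize.COMMENT:
            yield 'comment', token.string
        elif token.type == tokenize.STRING:
            yield 'string', token.string
        elif token.type == tokenize.NUMBER:
            yield 'number', token.string
        elif token.type == tokenize.NAME:
            # tokenize reports keywords as plain names, so check separately.
            kind = 'keyword' if keyword.iskeyword(token.string) else 'name'
            yield kind, token.string
        elif token.type == tokenize.OP:
            yield 'operator', token.string
        # Newlines, indents, and the end marker carry no styling; skip them.
```

A real converter would also need the whitespace between tokens, which `tokenize` exposes through each token's `start` and `end` positions.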

The other important thing the publishing script does is post-process the source code. Often in an example there are only two lines out of 20 that I need to show the reader to demonstrate my point. The other 18 lines are for setup and for ensuring the example actually demonstrates the right thing (testing). So I have special directives in the code as comments that can hide lines or collapse them with ellipses.

Here's an example of what this looks like in the Markdown file:

```python
class MissingPropertyDB(object):
    def __getattr__(self, name):
        if name == 'missing':
            raise AttributeError('That property is missing!')
        # COMPRESS
        value = 'Value for %s' % name
        setattr(self, name, value)
        return value
        # END

data = MissingPropertyDB()
# HIDE
data.foo  # Test the success case
# END
try:
    data.missing
except AttributeError as e:
    pretty(e)
```

The actual output in the published book would look like this:

class MissingPropertyDB(object):
    def __getattr__(self, name):
        if name == 'missing':
            raise AttributeError('That property is missing!')
        # ...

data = MissingPropertyDB()
try:
    data.missing
except AttributeError as e:
    pretty(e)

>>>
AttributeError('That property is missing!',)
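
The directive handling can be sketched in a few lines. This is an illustration based only on the example above, not the real publishing script: `apply_directives` is a hypothetical name, and it assumes directives sit on their own lines and are never nested.

```python
def apply_directives(source):
    """Return source with HIDE regions dropped and COMPRESS regions elided."""
    output = []
    mode = None  # None, 'HIDE', or 'COMPRESS'
    for line in source.split('\n'):
        marker = line.strip()
        if mode is None and marker in ('# HIDE', '# COMPRESS'):
            mode = marker[2:]
            if mode == 'COMPRESS':
                # Keep the directive's indentation for the ellipsis comment.
                indent = line[:len(line) - len(line.lstrip())]
                output.append(indent + '# ...')
        elif mode is not None and marker == '# END':
            mode = None
        elif mode is None:
            output.append(line)
        # Any other line is inside a HIDE or COMPRESS region and is dropped.
    return '\n'.join(output)
```

Since the directives are ordinary comments, the unprocessed example still runs as plain Python during the writing build.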


Conclusion

If you have any interest in using these tools let me know! Writing a book is already hard enough. Having a good workflow helps a lot. I'd like to save you the trouble.

17 July 2014

Inside the Gopher's Studio

A highlight from GopherCon earlier this year was "Inside the Gopher's Studio":



It's all good to watch, but especially this gem from Rob Pike at 3:49:

"Most people don't know that when I got to Bell Labs Bjarne Stroustrup was my officemate. Uh-- we didn't get along."

Set YouTube to 0.5 speed to make it sound like a drunken story at the bar (in vino veritas).
Somehow came across the Padovan sequence when looking up variations on coprimes.



Looks like a nautilus, which also happens to be my favorite magazine right now.

More examples of perceptual diffs

I've found a few more examples of people using perceptual diff tools to do visual regression testing.


Saw this week that Twitter's trying out dpxdt as well.

10 July 2014

Met self-proclaimed "senior software architect" who'd never heard of Go. Red flag. Curiosity is a defining trait of great programmers.

Generics in Go via "generate"

I'm less interested in generics in Go after reading this intro to Boost, the C++ template library. The resulting code is impenetrable to anyone but an expert. I don't think generalization is as important as computer scientists say it is. Approachability should be the biggest concern given that developing software is almost always a social problem, not a technical one.

Rob Pike's solution for generic programming in Go could be his proposal for the "generate" command in the Go toolchain. This addresses a whole range of requirements, including lexers, binary embedding, and protocol buffers. Templating for Go generics is just a natural consequence of the design. It'll be interesting to see if existing attempts at generics in Go move to use "generate" instead.

Python vs. Go at Dropbox

Dropbox rewrote some infrastructure and produced 200KLOC of Go. They open sourced their common Go libraries. That's great. Some folks are saying this means they've abandoned Python. I don't read it that way. Dropbox needs to reduce costs and scale up if they want to have an IPO. Moving to Go seems like a natural choice for a company heavily invested in Python. Brad says Go gives you "90% of the ease of scripting languages with 90% of the performance of systems languages." That sounds right to me, but it makes the choice between Python and Go even more murky. I'm still searching for the dividing line.

01 July 2014

This website is the (addictive) Amazon Prime equivalent for machine shops.
Current status: Requesting an estimate for 12 foot long, 43 pound bars of 6061 aluminum.

27 June 2014

Universal constructor

Related to visualizing algorithms, I rediscovered the Von Neumann universal constructor in cellular automata (like Conway's Game of Life). I had never appreciated the similarity of this machine's organization to DNA. From Wikipedia:

Von Neumann's crucial insight is that part of the replicator has a double use; being both an active component of the construction mechanism, and being the target of a passive copying process. This part is played by the tape of instructions in Von Neumann's combination of universal constructor plus instruction tape.

The combination of a universal constructor and a tape of instructions would i) allow self-replication, and also ii) guarantee that the open-ended complexity growth observed in biological organisms was possible. The image below illustrates this possibility.

This insight is all the more remarkable because it preceded the discovery of the structure of the DNA molecule by Watson and Crick, though it followed the Avery-MacLeod-McCarty experiment which identified DNA as the molecular carrier of genetic information in living organisms. The DNA molecule is processed by separate mechanisms that carry out its instructions and copy the DNA for insertion for the newly constructed cell. The ability to achieve open-ended evolution lies in the fact that, just as in nature, errors (mutations) in the copying of the genetic tape can lead to viable variants of the automaton, which can then evolve via natural selection.

Mike Bostock's huge page of visualizing algorithms is not to be missed.

25 June 2014

Effective Python

I'm overwhelmingly excited to be writing the Effective Python book, a follow-on to Scott Meyers' classic, Effective C++. I'm honored to have the opportunity to author this book. I first read Effective C++ when I was 15 years old. Scott's Effective books led to my 7 year obsession with C++ and my first job. At Google I learned Python. I've been building infrastructure and applications with it for the past 9 years. Hopefully I have valuable advice to share from my own experience and what I've learned from the Python community.

Like the original, my book will be ~50 specific pieces of advice, 2-4 pages each, on how to write better Python programs. It should be the second thing you read after wonderful introductory books like Alex Martelli's Python in a Nutshell (which is how I learned Python) and Zed Shaw's Learn Python the Hard Way. It will be a stepping stone towards Python mastery via more thorough references like Wesley Chun's Core Python books and the Python Cookbook.

The goal of Effective Python is to give readers a sense of the "right way" of writing Python code in general. It's not a language introduction, a cookbook, an encyclopedic reference, or a guide for a specific area like Django, NumPy, or Kivy. This book is for programmers who want to know the dirt, the real stuff, the guidance of hard-won experience. It should transcend specific problem domains. This is what made Effective C++ so awesome.

I'll do my best to follow Scott's advice in writing it. Of course I'm terrified of leaving anything out. If you have tips or ideas for things I shouldn't miss, please send me a note using this link. Thanks in advance!

23 June 2014

Programming is an obsession with esoteric knowledge.

22 June 2014

Impressive survey attempting to understand why Erlang has an adoption problem.
A team at Google released FlatBuffers, yet another data encoding and IDL. It makes the same important tradeoff as Cap'n Proto: There is no encoding or decoding; how it's serialized is the same as what's in memory. Having a distinct encode/decode step is the fatal flaw in Protocol Buffers.
Finally an official post with screenshots and detail about NY Times' CMS.
Happy to hear PyPy now supports Python 3. I hope the next big milestone they reach is overcoming the GIL.
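
To make the FlatBuffers note above concrete, here is a toy sketch of reading a field straight out of a byte buffer with no decode step. It is not the actual FlatBuffers format: the fixed record layout, `RECORD`, and `read_score` are assumptions made up for illustration.

```python
import struct

# Hypothetical fixed layout: uint32 id, uint16 flags, float64 score.
# The '<' prefix means little-endian with no padding between fields.
RECORD = struct.Struct('<IHd')
SCORE_OFFSET = 6  # bytes taken by id (4) plus flags (2)


def read_score(buffer, index):
    """Read one field of one record directly from the serialized bytes."""
    position = index * RECORD.size + SCORE_OFFSET
    return struct.unpack_from('<d', buffer, position)[0]


records = [(1, 0, 0.5), (2, 1, 0.75), (3, 0, 0.25)]
wire = b''.join(RECORD.pack(*record) for record in records)
```

`read_score(wire, 1)` touches only the eight bytes it needs. A Protocol Buffers style reader would first parse the whole message into objects.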

20 June 2014

Cool to see a proposal to provide the Android NDK for Go.

Runtime errors

Slogged through some Java today. It's fine. But I've never achieved Java nirvana. When I see a runtime error caused by tools like Guice and Dagger I want to delete my homedir and go on a silent retreat. People say they'll never use Python in production because of syntax errors at runtime. It seems like injection warnings are another dimension of the same problem.

18 June 2014

Facebook finally talks publicly about their custom networking gear called "Wedge". When Open Compute launched in April 2011 I thought it was conspicuous that networking gear wasn't mentioned at all. In infrastructure the network is what matters most.

14 June 2014

The only defensible opinion is that your opinions may be indefensible.

11 June 2014

Cats and dogs living together

My current project is the first time I've managed other people directly. But I'm still a software engineer, so my responsibilities are:

  1. Write code
  2. Design systems
  3. Lead our engineering team to get things done

#3 is the "engineering management" part of my job. The gist of engineering management:

  • Ask the right questions
  • Identify risks
  • Prioritize the team's efforts to address these risks
  • Escalate issues beyond your control

It's meetings with your team, meetings with other managers, and planning.

There is another type of engineering manager at Google that only does #3. These folks are pure managers, not software engineers. They are not responsible for direct technical contributions (even though many still write code and participate in design). Their only goal is leading their team. Most Directors and VPs have this role. Many large teams are led by pure engineering managers because building software at scale is primarily a social problem, not a technical one.

What's interesting is what happens in an emergency.

When something goes wrong I want to write code to solve the problem because that's what I do. Engineering managers want to schedule meetings because communication is what they do. There's nothing wrong with this, but it causes a conflict like cats and dogs. Dogs bow when they want to play. Cats think dogs are posturing to pounce and are being aggressive. So cats and dogs often have trouble getting along. Similarly, engineering managers schedule meetings with programmers when there's something wrong. Programmers want to be writing code, etc., so they see these meetings as a waste of precious time, interrupting progress towards a solution.

Neither group is "right", it's just the difference between the roles. You need both roles to create the healthy, constructive tension that makes an engineering organization function properly at scale. But I hadn't really appreciated how much this difference in priorities matters until I experienced the friction first-hand in a recent episode.

Here are some suggestions I've come up with to make it easier going forward.

Engineers:

  • Be clear about why you want to avoid meetings (i.e., you're debugging)
  • Have more meetings with pure managers to enable them to do their jobs

Managers:

  • Maximize engineers' time by having scalable meetings instead of one-on-ones
  • Minimize interruptions by preferring asynchronous channels like bug trackers and mailing lists

10 June 2014

Cool to see Micro Python, a specialized version of Python for embedded devices.

Micro Python is a new implementation of the Python 3 language, which aims to be properly compatible with CPython, while sporting a very minimal RAM footprint, a compact compiler, and a fast and efficient runtime. These goals have been met by employing many tricks with pointers and bit stuffing, and placing as much as possible in read-only memory.

Unfortunately, as long as the language/runtime relies on garbage collection, important use cases like "real-time" aren't possible. Go has the same problem, even though it's a lot closer to the metal. That's not preventing people from trying, though.
Amazingly useful post that explains the "kernel trick" used in machine learning algorithms like support vector machines.

08 June 2014

Evan Miller has some useful formulas for computing the probability of an A/B test winner. The other way I've seen this done is bootstrapping.
Dave Cheney explains 5 things that make the Go programming language fast.
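
For the bootstrapping approach mentioned in the A/B testing note above, a minimal sketch: resample each variant's observed outcomes with replacement many times and count how often B's conversion rate comes out ahead. The function name, the default number of rounds, and the fixed seed are illustrative choices, not taken from the linked post.

```python
import random


def prob_b_beats_a(conv_a, n_a, conv_b, n_b, rounds=500, seed=0):
    """Estimate how often variant B converts better than A under resampling."""
    rng = random.Random(seed)  # seeded so repeated runs give the same answer
    outcomes_a = [1] * conv_a + [0] * (n_a - conv_a)
    outcomes_b = [1] * conv_b + [0] * (n_b - conv_b)
    wins = 0
    for _ in range(rounds):
        # Draw a same-sized sample, with replacement, from each variant.
        rate_a = sum(rng.choices(outcomes_a, k=n_a)) / n_a
        rate_b = sum(rng.choices(outcomes_b, k=n_b)) / n_b
        if rate_b > rate_a:
            wins += 1
    return wins / rounds
```

For example, `prob_b_beats_a(100, 1000, 150, 1000)` compares a 10% conversion rate against 15% with a thousand visitors each. More rounds give a smoother estimate at the cost of run time.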

03 June 2014

Purple Rain comes on the radio. Cabbie instinctively cranks the volume. Pretty good day so far.

Lots of Apple stuff today!


Perhaps the web isn't dead yet.

01 June 2014

Also: The laser sintering patent expired earlier this year, so expect this kind of stuff to get cheaper and easier.
Rocket engine printed with direct metal laser sintering. So awesome.

29 May 2014

Word of the day is P-Hacking: "Working the data until you reach the goal of a P-value of 0.05"
Coffee: Works every time.

28 May 2014

Here's the original source of the Chicken and Pig story.

27 May 2014

Trying my hardest not to give in to this one: table-top milling machine.

Masterful explanation of why those "x is correlated with y" graphs going around the Internet are worthless.



The gist: It's autocorrelation with time.

Statistics can be dangerous, folks! Be careful out there.
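
A small sketch of the effect, using made-up data: two series that have nothing to do with each other, except that both drift upward over time, correlate strongly. Comparing their period-to-period changes instead removes the shared trend. The series, noise levels, and seed below are arbitrary assumptions.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def diffs(xs):
    """Change from each point to the next."""
    return [b - a for a, b in zip(xs, xs[1:])]


rng = random.Random(42)
# Independent noise around two different upward trends.
x = [t + rng.gauss(0, 5) for t in range(200)]
y = [2 * t + rng.gauss(0, 5) for t in range(200)]

levels = pearson(x, y)                 # dominated by the shared time trend
changes = pearson(diffs(x), diffs(y))  # trend removed
```

Here `levels` comes out close to 1 even though the noise in `x` and `y` is independent, while `changes` stays near 0.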
Mission Bicycle launched a new website! Pretty!

24 May 2014

23 May 2014

Nope.
What is the biggest tree of object-oriented classes you've ever worked on that wasn't for UI widgets? How wide? How deep?

22 May 2014

The terrifying conclusion of how rocket nozzles work

This diagram shows how a rocket nozzle works. A rocket is just a big source of hot burning gas coming in from the left. The nozzle is smaller in the middle. The small part is a junction that causes hot, high-pressure gas to convert all of its heat and pressure into volume (remember, PV = nRT). The increased volume of gas has to move through the nozzle opening on the right because there's a constant stream of new gas coming in from the left. Volume and velocity turn into thrust. That's all there is to it.



Compare that rocket nozzle diagram to this one of a jet engine. The same physical laws are in effect. The only difference between a rocket engine and a jet engine is a jet produces the constant stream of hot gas by using a gas turbine engine.



If that's the case, why isn't a plain old piston engine good enough to produce a bunch of hot gas to pass through a nozzle? That's what comes out of the exhaust pipe of a car, right? Indeed, this is possible. It's called a motorjet. You just take a plain old piston engine, pass its exhaust through a gas compressor like the one in your refrigerator, and then send that gas through a nozzle to convert heat and pressure into volume and velocity.



The magic of jet engines, international commercial travel, space flight, etc. is all in the design of the nozzle. The turbine engines on your average 747 are only a means to an end – gas for the nozzle to turn into thrust. They're not crucial like the shape of the wings that keep the plane aloft.

What logically comes next is terrifying. If all we need to fly fast is a source of hot gas, why not use the best source of hot gas available to mankind? I'm talking about the nuclear reactor, of course! Instead of using a chemical fire (rocket) or gas turbine (jet) why not use a nuclear reaction to create hot gas and pass it through a nozzle to produce thrust? And indeed, nuclear rockets were built and tested in the 1960s – because, why not?



My favorite part of this story is how the goal of these nuclear rocket engines was to keep bombers in the air for longer periods of time as a deterrent to prevent nuclear war. Having a nuclear rocket engine spew radioactive gas into the atmosphere was totally worth it for the safety of America. Right. Luckily for us they invented the ICBM and were able to retire the bombers and their dreams of nuclear aircraft.

It's amazing that any of us are still alive today.

17 May 2014

Eric Florenzano explains the dream of using React-like frameworks to build native Android apps. Maybe DOM is the platform convergence we're looking for?

16 May 2014

Great rant by Armin Ronacher about Unicode types in Python 3.
Notes from 1989 that Rob Pike wrote about programming in C make it very clear why Go has such a pithy style.

15 May 2014

I saw a Kuka KR150 industrial robot in person recently and it blew my mind. 9 foot reach, 300+ pound capacity, 6 axes, fast as hell. It was like staring the future right in the face. The experience in one word: Fear.

12 May 2014

I love when people back up their opinions with data. From: Google Has Most of My Email Because It Has All of Yours

Despite the fact that I spend hundreds of dollars a year and hours of work to host my own email server, Google has about half of my personal email! Last year, Google delivered 57% of the emails in my inbox that I replied to. They have delivered more than a third of all the email I’ve replied to every year since 2006 and more than half since 2010. On the upside, there is some indication that the proportion is going down. So far this year, only 51% of the emails I’ve replied to arrived from Google.

The numbers are higher than I imagined and reflect somewhat depressing news. They show how it’s complicated to think about privacy and autonomy for communication between parties. I’m not sure what to do except encourage others to consider, in the wake of the Snowden revelations and everything else, whether you really want Google to have all your email. And half of mine.
The future: When blaming your problem on technology won't be a valid excuse.

10 May 2014

© 2009-2014 Brett Slatkin