I'm Brett Slatkin and this is where I write about programming and related topics.

31 July 2014

How I'm writing a programming book

Update: Visit the official book website to get the book!

Update2: Get my pyliterate tool here.

I've been working on Effective Python for just over two months now. The plan is to have 8 chapters in total. I've written a first draft of 5 so far. Chapter 3, the first one I wrote, was the hardest for many reasons. I had to establish a consistent voice for talking about Python. I took on the most difficult subjects first (objects and metaclasses) to get that work out of the way. I also had to build tools to automate my workflow for writing.

Each chapter consists of a number of short items that are about 2-4 pages in length. The title of an item is the "what", the shortest possible description of its advice (e.g., "Prefer Generators to Returning Lists"). The text of the item is the "why", the explanation that justifies following the advice. It's important to make the core argument for each item using Python code. But it's also important to surround the code with detailed reasoning.

Before I started I read a nice retrospective on how one author wrote their programming book. They had separate source files for the example code and used special comments to stitch it into the book's text at build time. That's a great idea because it ensures the code that's in the book definitely compiles and runs. There's nothing worse in programming tutorials than typing in code from the text and having it barf.

I wanted to go a step further. I wanted my examples to be very short. I wanted to intermix code and prose more frequently, so the reader could focus on smaller pieces one step at a time. I wanted to avoid huge blocks of code followed by huge blocks of prose. I needed a different approach.


Writing workflow

After some experimenting, I landed on a script that processes GitHub Flavored Markdown. It incrementally reads the input Markdown, finds blocks that are Python code, runs them dynamically, and inserts the output into the block that follows.

Here's an example of what the input Markdown looks like:

The basic form of the slicing syntax is `list[start:end]`
where `start` is inclusive and `end` is exclusive.

```python
a = [1, 2, 3, 4, 5, 6, 7, 8]
print('First four:', a[:4])
print('Last four: ', a[-4:])
print('Middle two:', a[3:-3])
```

```
First four: [1, 2, 3, 4]
Last four:  [5, 6, 7, 8]
Middle two: [4, 5]
```

When slicing from the start of a list you should leave
out the zero index to reduce visual noise.

```python
assert a[:5] == a[0:5]
```

I write the files in Sublime Text. When I press Command-B it builds the Markdown by running my script, which executes all the Python, inserts the output back into the text, and then overwrites the original file in place. This makes it easy to develop the code examples at the same time as I'm writing the explanatory prose. It feels like the read/eval/print loop of an interactive Python shell.
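A build step like this can be sketched in a few lines of Python. This is not the actual script, only a minimal illustration under some assumptions: the function name `run_markdown` is hypothetical, all blocks share one namespace, and an output block is taken to be a bare fenced block that follows a Python block after one blank line.

```python
import contextlib
import io

FENCE = '`' * 3  # Built this way to avoid a literal fence inside this example


def run_markdown(text):
    """Run each Python block and refresh the plain output block after it."""
    lines = text.split('\n')
    out = []
    namespace = {}  # Shared, so later blocks can use names from earlier ones
    i = 0
    while i < len(lines):
        if lines[i] != FENCE + 'python':
            out.append(lines[i])
            i += 1
            continue
        end = lines.index(FENCE, i + 1)  # Closing fence of the code block
        out.extend(lines[i:end + 1])
        code = '\n'.join(lines[i + 1:end])
        i = end + 1
        # Run the block and capture whatever it prints.
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(code, namespace)
        printed = buffer.getvalue().rstrip('\n')
        # Skip a stale output block: a blank line, then a bare fenced block.
        if lines[i:i + 2] == ['', FENCE]:
            i = lines.index(FENCE, i + 2) + 1
        if printed:
            out.extend(['', FENCE, printed, FENCE])
    return '\n'.join(out)
```

Running the function on its own result should leave the text unchanged, which is what makes overwriting the source file on every build safe.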

My favorite part is how I made Python treat the Markdown files as input source code. That means when there's an error in my examples and an exception is raised, I'll get a traceback into the Markdown file at exactly the line where the issue occurred.

Here's an example of what that looks like in the Sublime build output:

Traceback (most recent call last):
  File ".../Slicing.md", line 29, in <module>
    a[99]
IndexError: list index out of range
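One way to get a traceback like this is to compile each block with the Markdown file's name and pad it with blank lines so the line numbers match. The sketch below assumes that approach; the helper `run_block` and its `first_line` parameter are hypothetical names, and the real tool may work differently.

```python
import traceback


def run_block(code, filename, first_line, namespace):
    """Run `code` as if it began at line `first_line` of `filename`."""
    # Pad with newlines so compiled line numbers match the Markdown file.
    padded = '\n' * (first_line - 1) + code
    exec(compile(padded, filename, 'exec'), namespace)


try:
    # The block starts on line 28, so the failing statement is on line 29.
    run_block('a = [1, 2, 3]\na[99]', 'Slicing.md', 28, {})
except IndexError:
    report = traceback.format_exc()
```

Python looks up the source text by filename when it formats a traceback, so if the named file exists on disk the offending line is printed under the `File` entry as well.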

It's essentially IPython Notebook, but tuned for my specific needs and checked into a git repo as Markdown flat files. Update: A couple of people mentioned that this is a variation of Knuth's Literate Programming. Indeed it is!


Publishing workflow

Unfortunately, my deliverable for each chapter must be a Microsoft Word document. As a supporter of open source software and open standards, I winced when I first heard this requirement. But the justification is understandable. The publisher has a technical book-making system that uses Word-based templates and formatting. They have their own workflow for editing and preparing the book for print. This is the reality of desktop publishing. More modern tools like O'Reilly Atlas exist, but they are new and still in beta.

There is no way I'm going to manually convert my Markdown files into Word files. The set of required paragraph and character styles is vast and complicated. These styles are part of why the published book will look good, but applying them by hand is tedious work that's easy to get wrong. Sounds like the perfect job for automation!

I have a second script that reads the input Markdown (using mistune) and spits out a Word .docx file (using python-docx). The script has a bunch of rules to map Markdown syntax to Word formatting. The script also passes all of the Python code examples through the Python lexer to generate syntax highlighting in the resulting document.
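The highlighting step can be illustrated with the standard library's `tokenize` module, used here as a stand-in because the post doesn't say which lexer the script calls. The style names and the `styled_runs` function are hypothetical; a publisher's template would define its own character styles.

```python
import io
import keyword
import tokenize

# Hypothetical character style names; a real Word template defines its own.
STYLES = {
    tokenize.STRING: 'CodeString',
    tokenize.COMMENT: 'CodeComment',
    tokenize.NUMBER: 'CodeNumber',
}


def styled_runs(source):
    """Split source into (text, style) runs; style is None for plain text."""
    # Offset of each line's start, so a (row, col) pair maps to an index.
    starts = [0]
    for line in source.splitlines(keepends=True):
        starts.append(starts[-1] + len(line))
    runs = []
    pos = 0
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and keyword.iskeyword(tok.string):
            style = 'CodeKeyword'
        else:
            style = STYLES.get(tok.type)
        if style is None:
            continue  # Unstyled tokens are emitted as part of the next gap
        begin = starts[tok.start[0] - 1] + tok.start[1]
        end = starts[tok.end[0] - 1] + tok.end[1]
        if begin > pos:
            runs.append((source[pos:begin], None))
        runs.append((source[begin:end], style))
        pos = end
    if pos < len(source):
        runs.append((source[pos:], None))
    return runs
```

Because the runs cover the whole source, whitespace included, each pair could be written out in order as one run of a Word paragraph with its character style applied.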

The other important thing the publishing script does is post-process the source code. Often in an example there are only two lines out of 20 I need to show to the reader to demonstrate my point. The other 18 lines are for setup and ensuring the example actually demonstrates the right thing (testing). So I have special directives in the code as comments that can hide lines or collapse them with ellipses.

Here's an example of what this looks like in the Markdown file:

```python
class MissingPropertyDB(object):
    def __getattr__(self, name):
        if name == 'missing':
            raise AttributeError('That property is missing!')
        # COMPRESS
        value = 'Value for %s' % name
        setattr(self, name, value)
        return value
        # END

data = MissingPropertyDB()
# HIDE
data.foo  # Test the success case
# END
try:
    data.missing
except AttributeError as e:
    pretty(e)
```

The actual output in the published book would look like this:

class MissingPropertyDB(object):
    def __getattr__(self, name):
        if name == 'missing':
            raise AttributeError('That property is missing!')
        # ...

data = MissingPropertyDB()
try:
    data.missing
except AttributeError as e:
    pretty(e)

>>>
AttributeError('That property is missing!',)
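The directive handling shown above can be sketched as a small line filter. This covers only the `# HIDE`, `# COMPRESS`, and `# END` comments from the example, not the printed output after `>>>`; the function name `apply_directives` is hypothetical and the real script may handle more cases.

```python
def apply_directives(code):
    """Apply `# HIDE`, `# COMPRESS`, and `# END` comment directives."""
    shown = []
    mode = None  # 'HIDE' or 'COMPRESS' while inside a directive, else None
    for line in code.split('\n'):
        marker = line.strip()
        if mode is None and marker in ('# HIDE', '# COMPRESS'):
            mode = marker[2:]
            if mode == 'COMPRESS':
                # Keep the directive's indentation for the ellipsis comment.
                indent = line[:len(line) - len(line.lstrip())]
                shown.append(indent + '# ...')
        elif mode is not None and marker == '# END':
            mode = None
        elif mode is None:
            shown.append(line)
    return '\n'.join(shown)
```

Hidden lines still run at build time, so the success case stays tested even though the reader never sees it.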


Conclusion

If you have any interest in using these tools, let me know! Writing a book is already hard enough. Having a good workflow helps a lot. I'd like to save you the trouble. Otherwise, if you have any suggestions on what I should put in the book, please email me here.
© 2009-2016 Brett Slatkin