25 November 2015

Fun sailing game you can play in your browser. You need a keyboard. Arrow keys to steer, Z and X to trim the sail.

15 November 2015

Huge list of projects that build on Python's asyncio library.

03 November 2015

David Beazley's curio library looks interesting. This is the most compelling example of async/await from PEP 492 that I've seen. I anticipate that this is the direction the asyncio standard library module will go.
The shifter I designed is now available on Mission Bicycles! More details on the manufacturing process to come.

13 October 2015

Learning new skills after a decade

This week is my 10 year anniversary of working as a professional software engineer. The advice I've followed is "always be the worst player in the band". But over time the meaning of that statement has changed for me. I've realized that there's a lot more to making music than playing the notes.

It sounds odd, but what I'm most excited about right now is learning to play the piano. Specifically: I'd like to play Jazz standards using a Fake Book. I can't remember the last time I learned a hard skill that required physical repetition and muscle memory. Maybe it was learning to drive a stick shift? Or do a kick flip? Or maybe Emacs key bindings?

Here are some of my favorite piano moments over the past 3 months of practicing:

  • Being able to read treble clef notes without making mistakes in this app
  • Learning enough music theory from this book to understand why chords work
  • Being able to play dominant, major, minor, diminished, and many other chord variations quickly
  • Learning about the cycle of fifths and using it to practice
  • Practicing scales enough that I can do it (poorly) with my eyes closed
  • Writing down my own ideal fingering for a sequence of notes
  • Learning the importance of proper sustain pedal usage
  • Playing a bunch of songs that I know

Here are things that I can't do yet:

  • Play the rhythm correctly
  • Know all chord inversions and be able to use them quickly
  • Play songs without looking at the keys, especially chords
  • Sing and play at the same time (I actually don't want to do this, but it seems like a good exercise)
  • Play classical music with a real bass clef
  • Make meaningful progress on Mark Levine's Jazz Theory book

Speaking of that Jazz book: I asked a musician I know for some advice on how to learn. He said, "I know just the book for you". He showed me "Jazz Theory" and I asked, "This looks great, but are there any other books I should read?" The musician's answer was, "No. If you wanted to learn about Jesus, you'd read the Bible. If you want to learn about Jazz, this is the book."

Why isn't there such an obvious go-to book in the realm of programming? I wonder who would disagree with my musician friend, why, and what book they'd recommend instead. Perhaps it's just the nature of learning: While you're inexperienced, people have amazing, definitive advice; once you know, everything is murky.

02 October 2015

Deleting code (exclusively) is my favorite thing to do on Fridays.

16 September 2015

Effective Python on the Talk Python Podcast

I had the privilege of appearing on the "Talk Python to Me" podcast this past week to talk about my book, Effective Python. You can listen to the episode here and the transcript is here. Many thanks to Michael Kennedy for hosting!

13 September 2015

Kerbal Space Program

If you haven't heard of Kerbal Space Program, the gist of the game is this: You play the role of NASA or SpaceX to design, finance, build, launch, and control rockets, spaceships, robots, and astronauts that explore the solar system. It sounds like a great idea that ends up being boring in practice, but they somehow figured it out — Kerbal is extremely fun.

I've been watching the game evolve since it debuted. I'm really cautious about playing games because I get obsessed and they take up all my free time (to the detriment of other things, like open source projects and my social life). But this summer I was looking for a new way to chill out. I decided it was finally the right moment to start playing.

Kerbal's graphics aren't the best, but the physics engine is great. You learn a lot about orbital mechanics by playing. In career mode it's a very difficult game. To get better at it, I spent a lot of time reading the Kerbal Wiki, watching walkthrough videos by Scott Manley and Billy Winn Jr, and enjoying the Kerbal subreddit. Did you know that Elon Musk plays Kerbal, too?

Some moments in the game that made me feel great:

  • Getting into orbit
  • Landing on the moon
  • Docking two ships together for the first time
  • Driving a dune buggy on another planet
  • Perfectly matching a satellite's orbital parameters

It scratches a similar itch to Sim City and Civilization. But it also feels like Minecraft in how creative and arbitrary the game is. I haven't even begun to explore the tremendous Kerbal mod community that exists.

Here are some screenshots from my most recent accomplishment in the game: going to Jupiter ("Jool"), landing on two of its moons ("Bop" and "Pol"), and making it back to Earth ("Kerbin").

I had to fly a big spaceship that's powered by a nuclear rocket. Look how stoked those Kerbals are, in the lower right of the screen, to be zooming off for a 10 year mission to the edge of the solar system:

Here I'm using the guidance computer to execute my burn. If you haven't played the game this will look insane. For someone who plays the game this is actually a trivial example:

And finally, here's what it looks like when the astronauts ("Kerbonauts") make it back home safely. Succeeding in getting them home is one of the best parts. (This was a night landing because I ran out of fuel and had to rely on aerobraking to slow down).

I've barely scratched the surface of this game. I can't recommend it enough. But don't blame me if it ruins your life, too.

09 September 2015

This article about SpaceX and their goal of colonizing Mars is the most inspiring thing I've read all year. It's long but totally worth getting through. It includes tons of great links, such as these travel posters:

Toyota is selling a hydrogen car now. Crazy futuristic time.

04 September 2015

Helpful perspective in the presentation called "Ten Things That Will Try To Make You Leave Technical Jobs", from the NYU/Columbia Women in Computing symposium earlier this year. Be sure to read the speaker notes.

15 August 2015

Growing giant sequoias

I saw some giant sequoias a few weeks ago. There aren't many of these 2000+ year old trees left. Growing more seems extremely difficult given all of the war, fire, and suffering that occurs on Earth during such a time period. With all our technology and sophistication we can send a probe to Pluto, but it seems impossible to grow a tree into old age. It's so simple, yet so difficult.

Time is not something we can control. Or is it? You can imagine launching baby trees into space and shooting them off towards the speed of light. Time would elapse faster for the tree than on Earth due to special relativity. After a large orbit, the trees would return and be thousands of years older than when they left. Land them back on Earth, put them in the ground: problem solved.

It'd be a great way to age whisky as well. I wonder what kind of energy production breakthrough would make it possible.

Damn, my idea is totally not possible. I got the direction of time dilation mixed up. I'm feeling pretty dumb. Thanks to Tony for pointing out my mistake.

Well, since we're on the topic of humans trying to beat time anyways, I saw this paper from Sandia National Labs and it's extremely spooky: "Expert Judgment on Markers to Deter Inadvertent Human Intrusion into the Waste Isolation Pilot Plant". The fact that we're alive today as a species at all is remarkable.

14 August 2015

Effective Python video released this week

I did a video version of Effective Python! Here's the link to check it out. I also did a video interview with Addison-Wesley to talk about the book and related questions about the Python language:

04 August 2015

I enjoyed this talk about programming languages by Ramsey Nasser from Eyeo 2015:

11 July 2015

I'm a big fan of Amaro (the bitter aperitivo/digestivo). Here's an amazing guide (parts 2, 3, 4) to all the different types!

18 June 2015

Wonderful post by Ade Oshineye explaining the evolution of what the web is. Must read.
WebAssembly is happening! From that post:

I’m happy to report that we at Mozilla have started working with Chromium, Edge and WebKit engineers on creating a new standard, WebAssembly, that defines a portable, size- and load-time-efficient format and execution model specifically designed to serve as a compilation target for the Web.

Watch The Birth and Death of JavaScript to understand the long-term vision (no joke).

12 June 2015

Mesosphere's "Data Center Operating System" is now generally available. The case studies from that post about Twitter, Yelp, Apple, and HubSpot are especially interesting.

09 June 2015

I wish these stats were built into my email client: "Three Years of Logging My Inbox Count"

08 June 2015

Wonky Swift 2.0 code sample

Swift 2.0 is out. What's up with this bad indentation in the code example? Oh, it's just Chrome not following web standards when dealing with the \ufffc (object replacement) characters in the byte stream from Apple's web server. Probably a bad copy/paste into the marketing materials. Interesting that Safari and Firefox don't care ("be liberal in what you accept").



"Organizational Debt is like Technical debt — but worse" explains a lot of what I've seen as engineering teams grow and change.

06 June 2015

Python 2.7.11 will be 15-20% faster by using computed gotos instead of a switch statement, thanks to a patch from Intel. Edit: To be clear, it's the bytecode interpreter's code that's faster; overall benchmarks are here.

04 June 2015

I'll be speaking at the Bay Area Python Interest Group on Thursday, June 25th in Mountain View at LinkedIn. See the details here. Hope to see you there!
I'm attempting to translate the items from Effective Python into best practices for Go. My first attempt is "Know When to Use Channels for Generator-Like Functions".

02 June 2015

This book (in progress) looks great: "Model-Based Machine Learning". Written by two researchers at Microsoft.

27 May 2015

Two amazing posts by Martin Kleppmann:

The illustrations are great, too. Looking forward to reading his book (in-progress).
Helpful presentation by Adrian Cockcroft (formerly of Netflix): "Monitoring Microservices and Containers: A Challenge"

26 May 2015

Go 1.5 is coming

I went to the Gopherfest event tonight. There's a lot to be excited about in Go 1.5. The freeze is now. Release is due in August. Be sure to check out Rob Pike's talk about "Go in Go" and Andrew Gerrand's talk about the "State of Go" (May 2015 edition).

But I'm especially happy to see things like this:

Here's a cool post that demonstrates the differences between interpreters, compilers, and just-in-time compilers. My first exposure to this was the dynamic recompilation that some Nintendo emulators would do for speed. At the time I didn’t realize that it’s pretty much the same thing as JITing.

16 May 2015

Evan Miller — always a treat — writes about Rust.
REST : RPCs :: Functional programming : Imperative programming

13 May 2015

Updated my Chrome extension "Clip It Good" to v0.4.1: It lets you save images and GIFs from webpages to Google+ Photo Albums.
Two Python Neural Net / Deep Learning projects worth checking out: Keras & PyNeural.

One uses Cython for speed, the other uses Theano. Fun times!

10 May 2015

Wonderful answer to "What is the appeal of dynamically-typed languages?"

Erik Osheim's post, "What is the appeal of dynamically-typed languages?", is one of the best explanations of the value of dynamic languages that I've ever read. This is the kicker for me:

One advantage Python has is that this same faculty that you are using to create a program is also used to test it and debug it. When you hit a confusing error, you are learning how the runtime is executing your code based on its state, which feels broadly useful (after all, you were trying to imagine what it would do when you wrote the code).

By contrast, writing Scala, you have to have a grasp on how two different systems work. You still have a runtime (the JVM) which is allocating memory, calling methods, doing I/O, and possibly throwing exceptions, just like Python. But you also have the compiler, which is creating (and inferring) types, checking your invariants, and doing a whole host of other things. There's no good way to peek inside that process and see what it is doing. Most people probably never develop great intuitions around how typing works, how complex types are encoded and used by the compiler, etc.

This logic also explains why Go is so easy to write and probably accounts for its astounding rate of adoption. It's so much easier to master both the running Go language and its type system compared to other statically typed languages.

Erik's answer also puts last year's popular posts, "Why Go Is Not Good" and "Go's Type System Is An Embarrassment", in perspective. The authors of those posts don't appreciate how many people have trouble understanding complex type systems. A lack of understanding prevents programmers from using such languages effectively, which reduces overall adoption.
Interesting survey on the adoption of Python 3 vs. version 2 by scientists.

09 May 2015

Exceptions are a crucial part of a function's interface. Isn't it strange how Python's type hinting leaves them out?
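
Here's a minimal sketch of what I mean, using a made-up `parse_port` function (the name and the range check are only for illustration). The annotations cover the argument and return types, but the `ValueError` it can raise lives only in the docstring:

```python
def parse_port(text: str) -> int:
    """Convert text to a TCP port number.

    Raises:
        ValueError: if text is not an integer in the range 1-65535.
    """
    port = int(text)  # int() itself raises ValueError for non-numeric text
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


# The annotations record only the argument and return types;
# nothing here mentions ValueError.
print(parse_port.__annotations__)
```

A caller reading only the signature has no way to know they should handle `ValueError`.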

04 May 2015

Please vote for my EuroPython talk and others!

I submitted two talks for EuroPython this year (July 20-26 in Bilbao, Spain). They've just opened voting to existing ticket holders. If you've already bought a ticket, I'd really appreciate your vote on one or both of my talks! (You can also buy a ticket here). You should also take the time to vote for the other submissions you like. There are a lot to choose from!

To vote, first you need to log in by following this link. Then you can view the talk index, visit the talk pages directly, read the abstracts, and click on the voting stars on the right side of the page:

My two talk proposals are based on the content from my book, Effective Python. You must be logged in or else these proposals will look like dead links:

These will be in the same style as the talk I gave at PyCon 2015 this year, but the content will be completely different. Hopefully they'll end up accepting whichever of my talks has the highest rating. Thanks in advance for your support!

01 May 2015

The importance of future-proofing

Our product generates histograms from individual data points. We're working on improving our analysis system so it can answer more questions using the same raw input data. We do this by constructing an intermediate data format that makes computations fast. This lets us do aggregations on the fly instead of in batch, which also enables advanced use-cases like interactive parameter tuning and real-time updates.

Each time we add a new analysis feature, we also have to regenerate the intermediate data from the raw data. It's analogous to rebuilding a database index to support new types of queries. The problem is that this data transformation is non-deterministic and may produce slightly different results each time it runs. This varying behavior was never a problem before because the old system only did aggregations once. Now we're stuck with no stable way to migrate to the new system. It's a crucial mistake in the original design of our raw data format. It's my fault.

The functional programmers out there, who naturally leverage the power of immutability, may be chuckling to themselves now. Isn't reproducibility an obvious goal of any well-designed system? Yes. But, looking back to 4 years ago when we started this project, we thought it was impossible to do this type of analysis on the fly. The non-determinism made sense[1]. We were blind to this potential before. Since then we've learned a lot about the problem domain. We've gained intuition about statistics. We're better programmers.

In hindsight, we should have questioned our assumptions more. For example, we should have asked: What if it becomes possible someday to do X? What design decisions would we change? That would have been enough to inform the original design and prevent this problem with little cost the first time around. It's true that too much future-proofing leads to over-design and over-abstraction. But a little bit of future-proofing goes a long way. It's another variation of the adage, "an ounce of prevention is worth a pound of cure".

1. The non-determinism is caused by machine learning classifiers, which are retrained over time. This results in slightly different judgements between different builds of the classifier.
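
To make the failure mode concrete, here's a toy sketch. It is not our actual system: the two `classify_*` functions, the record layout, and the thresholds are all hypothetical stand-ins. It shows how re-running a retrained classifier during a rebuild changes the histogram, and how storing the judgement next to the raw value at ingest time would have kept rebuilds reproducible:

```python
from collections import Counter


def classify_v1(value):
    """Stand-in for one build of a classifier (hypothetical)."""
    return "high" if value >= 10 else "low"


def classify_v2(value):
    """A retrained build that judges borderline values differently."""
    return "high" if value >= 9 else "low"


def ingest(values, classify):
    """Store each raw value together with the label assigned at ingest time."""
    return [{"value": v, "label": classify(v)} for v in values]


def build_histogram(records):
    """Derive intermediate data from stored labels only; never re-classify."""
    return Counter(record["label"] for record in records)


raw_values = [3, 9, 10, 12]

# Re-classifying on every rebuild: the result depends on the current build.
rebuilt_v1 = Counter(classify_v1(v) for v in raw_values)
rebuilt_v2 = Counter(classify_v2(v) for v in raw_values)
print(rebuilt_v1 == rebuilt_v2)  # False

# Persisting the label with the raw data: every rebuild agrees.
records = ingest(raw_values, classify_v1)
print(build_histogram(records) == build_histogram(records))  # True
```

Whether this would have been practical for us is a separate question; the point is only that the raw format decides what can be regenerated later.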

(PS: There's some additional discussion about this on Lobsters)
Karl Pearson, of the eponymous chi-squared test, invented the histogram in 1895.

24 April 2015

Response to "The Long-Term Problem With Dynamically Typed Languages"

I enjoyed this post quite a bit: "The Long-Term Problem With Dynamically Typed Languages". I think he's got some great points. I especially like the analogy with the "broken windows effect". It's interesting to hear about someone's experience using a software system or practice for a long time.

The best data point I have on this personally is my current project/team. The codebase is over 500KLOC now. The majority of it is in Python, followed by JS. I’ve been working on it since the beginning—over 4 years. We’ve built components and then extended them way beyond their original design goals. There’s a lot of technical debt. Some of it we’ve paid down through refactoring. Other parts we’ve rewritten. Mostly we live with it.

As time has gone on, we’ve gained a better understanding of the problem domain. The architecture of the software system we want is very different than what we have or what we started with. Now we’re spending our time figuring out how to get from where we are to where we want to be without having to rewrite everything from scratch.

I agree we have the lava layers problem the author describes, with multiple APIs to do the same thing. But I’m not sure if we would spend our time unifying them if we had some kind of miraculous tooling afforded by static types.

Our time is better spent reevaluating our architecture and enabling new use-cases. For example, one change we’ve been working towards reduces the turn-around time for a particular data analysis pipeline from 30 minutes to 1 millisecond (6 orders of magnitude). Now our product will be able to do a whole bunch of cool stuff that was impossible before. It took a lot of prototyping to get here. I don’t think static types would have helped.

My team’s biggest problem has always been answering the question: “How do we stay in business?” We've optimized for existence. We’ve had to adapt our system to enable product changes that make our users happy. Maybe once your product definition is stable enough, like Google Search or Facebook Timeline, you can focus on a codebase that scales to 10,000 engineers and 10+ years of longevity. I haven't worked on such a project in my career. For me the requirements are always changing.

(Originally from my comment here)
And now an official Google blog post about Borg is up. To get all the info read the paper here.

22 April 2015

Another epic post by Aphyr about MongoDB's consistency problems.

19 April 2015

Some updates to dpxdt

I landed a few updates to Depicted today:

  • We now use virtualenv to manage dependencies. No more git submodules! That was a huge mistake. At the time (two years ago) I thought pip was just as bad. But now I'm fine with it.
  • I rewrote the deployment instructions to use App Engine Managed VMs. It's now 10x easier to deploy. Still non-trivial because Google's Cloud Console is so complicated.
  • Instructions for local dpxdt are now at the top of the README, thanks to Dan. I moved the whole set of dpxdt local instructions over. Hopefully this will make the project less scary for newbies.

What's left: Make it so you can install the whole server with pip install dpxdt_server and be done with it.

18 April 2015

This post explains how to implement the core API of React using jQuery. Good to understand.

The future is not Borg

The New Stack has an interesting write-up about Google's long-secret Borg system. I can't say anything specific about this and I haven't read the paper.

What I will say is when I first arrived at Google in 2005 I felt like I was stepping into the future. Tools like Docker, CoreOS, and Mesos are 10 years behind what Borg provided long ago, according to The New Stack's write-up. Following that delayed timeline, I wonder how long it will be before people realize that all of this server orchestration business is a waste of time?

Ultimately, what you really want is to never think about systems like Borg that schedule processes to run on machines. That's the wrong level of abstraction. You want something like App Engine, vintage 2008 platform as a service, where you run a single command to deploy your system to production with zero configuration.

Kubernetes is interesting to watch, but I worry that it suffers from requiring too much configuration (see this way-too-long "guestbook example" for what I mean). Amazon's Container Service or Google's Container Engine may make such tools more approachable, but it's still very early days.

I believe systems like Borg are necessary infrastructure, but they should be yet another component you take for granted (like your kernel, a disk driver, x86 instructions, etc).

15 April 2015

Out of stock

Effective Python sold out thanks to PyCon 2015! My publisher had to do an emergency second printing a few weeks ago because it looked like this was a possibility. Luckily, the book will be restocked everywhere on April 17. Amazon still has some left in their warehouse in the meantime. I'm happy to know people are enjoying it!

12 April 2015

Links from PyCon 2015

Instead of writing a conference overview, here's an assorted list of links for things I saw, heard of, or wondered about while attending talks and meeting new people at PyCon this year. These are in no particular order and I'm missing a bunch I forgot.

  • "SageMath is a free open-source mathematics software system licensed under the GPL. It builds on top of many existing open-source packages: NumPy, SciPy, matplotlib, Sympy, Maxima, GAP, FLINT, R and many more. Access their combined power through a common, Python-based language or directly via interfaces or wrappers."
  • "ROS [Robot Operating System] is an open-source, meta-operating system for your robot. It provides the services you would expect from an operating system, including hardware abstraction, low-level device control, implementation of commonly-used functionality, message-passing between processes, and package management."
  • "GeoJSON is a format for encoding a variety of geographic data structures."
  • "GeoPandas is an open source project to make working with geospatial data in python easier. GeoPandas extends the datatypes used by pandas to allow spatial operations on geometric types."
  • "N-grams to the rescue! A collection of unigrams (what bag of words is) cannot capture phrases and multi-word expressions, effectively disregarding any word order dependence. Additionally, the bag of words model doesn’t account for potential misspellings or word derivations."
  • "A Few Useful Things to Know about Machine Learning: This article summarizes twelve key lessons that machine learning researchers and practitioners have learned. These include pitfalls to avoid, important issues to focus on, and answers to common questions."
  • "Compare randomized search and grid search for optimizing hyperparameters of a random forest. All parameters that influence the learning are searched simultaneously (except for the number of estimators, which poses a time / quality tradeoff)."
  • SainSmart 4-Axis Control Palletizing Robot Arm Model For Arduino UNO MEGA2560
  • Nanopore sequencing of DNA
  • Advanced C++ Programming Styles and Idioms
  • "diff-cover: Automatically find diff lines that need test coverage. Also finds diff lines that have violations (according to tools such as pep8, pyflakes, flake8, or pylint). This is used as a code quality metric during code reviews."
  • "FuzzyWuzzy: Fuzzy string matching like a boss."
  • "Glue is a Python library to explore relationships within and among related datasets."
  • "Tabula: If you’ve ever tried to do anything with data provided to you in PDFs, you know how painful it is — there's no easy way to copy-and-paste rows of data out of PDF files. Tabula allows you to extract that data into a CSV or Microsoft Excel spreadsheet using a simple, easy-to-use interface."
  • "Blaze expressions: Blaze abstracts tabular computation, providing uniform access to a variety of database technologies"
  • "AppVeyor: Continuous Delivery service for Windows" (aka: Travis CI for Windows)
  • "Microsoft Visual C++ Compiler for Python 2.7: This package contains the compiler and set of system headers necessary for producing binary wheels for Python 2.7 packages."
  • "Ghost Inspector lets you create and manage UI tests that check specific functionality in your website or application. We execute these tests continuously from the cloud and alert you if anything breaks."
  • "Think Stats: Probability and Statistics for Programmers"
  • "The bytearray class is a mutable sequence of integers in the range 0 <= x < 256. It has most of the usual methods of mutable sequences, described in Mutable Sequence Types, as well as most methods that the bytes type has."
  • "git webdiff: Two-column web-based git difftool"
  • "Bug: use HTTPS by default for uploading packages to pypi"
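
To make one of the items above concrete, here's a small sketch of the `bytearray` behavior quoted in the list. It uses only built-ins; the sample bytes are arbitrary:

```python
data = bytearray(b"PyCon")

# Mutable sequence methods work in place.
data[0] = ord("p")
data.append(0x21)  # the byte for "!"
data.extend(b" 2015")

# Most bytes methods are available too.
print(data.upper())
print(data.decode("ascii"))

# Values outside 0 <= x < 256 are rejected.
try:
    data.append(256)
except ValueError as error:
    print("rejected:", error)
```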

If you ever wonder why people go to conferences, this is why. You get exposure to a wide range of topics in a very short period of time, plus you get to meet people who are excited about everything and inspire you to learn more.
© 2009-2015 Brett Slatkin