I'm Brett Slatkin and this is where I write about programming and related topics.

20 August 2013

BlinkDB looks cool:
"BlinkDB is a massively parallel, approximate query engine for running interactive SQL queries on large volumes of data. It allows users to trade-off query accuracy for response time, enabling interactive queries over massive data by running queries on data samples and presenting results annotated with meaningful error bars."
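
To make "queries on data samples with error bars" concrete, here is a minimal Python sketch of answering an AVG query from a uniform random sample and attaching a rough 95% error bar. This is not BlinkDB's code or API: the `approx_avg` function, the made-up `latencies` data, and the normal-approximation error bar are all assumptions chosen for illustration.

```python
import math
import random
import statistics


def approx_avg(values, sample_size, seed=0):
    """Estimate the mean of `values` from a uniform random sample.

    Returns (estimate, half_width), where half_width is a ~95% error bar
    based on the normal approximation. Hypothetical helper, not a BlinkDB API.
    """
    rng = random.Random(seed)
    sample = rng.sample(values, sample_size)
    mean = statistics.fmean(sample)
    # Standard error of the mean; 1.96 is the z-value for a 95% interval.
    stderr = statistics.stdev(sample) / math.sqrt(sample_size)
    return mean, 1.96 * stderr


# Made-up data standing in for a large table column.
data_rng = random.Random(42)
latencies = [data_rng.expovariate(1 / 100.0) for _ in range(200_000)]

estimate, error = approx_avg(latencies, 10_000)
exact = statistics.fmean(latencies)
print(f"approx: {estimate:.1f} +/- {error:.1f}   exact: {exact:.1f}")
```

The sample touches only 5% of the rows, and the error bar shrinks roughly with the square root of the sample size, which is the accuracy-for-response-time trade the quote describes.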

Why use Dremel when you can use almost-Dremel 10x faster?

The coolest part is this chart:
"Results from this experiment show that error bars from running queries over multi-dimensional stratified samples converge orders-of-magnitude faster than random sampling, and are significantly faster to converge than single-dimensional stratified samples."
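
A rough way to see why stratification helps: with a GROUP BY query, a uniform sample leaves rare groups with only a handful of rows, so their error bars stay wide. The Python sketch below compares a uniform sample with a sample stratified on a single column, using the same total number of rows. It only covers the simple one-column case, not the multi-dimensional samples from the quote, and the `city` groups, the data, and all function names are hypothetical.

```python
import math
import random
import statistics
from collections import defaultdict


def error_bar(values):
    """~95% half-width for the mean of `values` (normal approximation)."""
    return 1.96 * statistics.stdev(values) / math.sqrt(len(values))


def stratified_sample(rows, per_group, rng):
    """Take up to `per_group` rows from each group instead of sampling uniformly."""
    by_group = defaultdict(list)
    for row in rows:
        by_group[row[0]].append(row)
    sample = []
    for group_rows in by_group.values():
        sample.extend(rng.sample(group_rows, min(per_group, len(group_rows))))
    return sample


def per_group_error(sample):
    """Error bar of AVG(value) GROUP BY city, computed from a sample."""
    by_group = defaultdict(list)
    for city, value in sample:
        by_group[city].append(value)
    # A group needs at least two rows before a standard deviation exists.
    return {c: error_bar(v) for c, v in by_group.items() if len(v) > 1}


rng = random.Random(1)
# Made-up table: one common group and one rare group.
rows = [("big_city", rng.gauss(100, 20)) for _ in range(99_000)]
rows += [("small_town", rng.gauss(100, 20)) for _ in range(1_000)]

uniform_err = per_group_error(rng.sample(rows, 1_000))       # 1,000 rows total
strat_err = per_group_error(stratified_sample(rows, 500, rng))  # 500 per group

for city in ("big_city", "small_town"):
    print(city, "uniform:", uniform_err.get(city), "stratified:", strat_err.get(city))
```

With the uniform sample, `small_town` gets only about 1% of the sampled rows, while the stratified sample guarantees it 500, so its error bar comes out much tighter for the same total sample size.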

© 2009-2024 Brett Slatkin