Big data modeling with Cassandra

Desired talk duration: 45 minutes

Abstract

Big data has become big business: whether it’s modeling a social network, predicting user behavior, or analyzing site traffic, many projects need to store more data than can fit in a single-master database. This talk explores modeling data in Cassandra, one of the most developer-friendly distributed databases. We’ll build a Cassandra data model for a Rails app using the Cequel ORM, keeping an eye out for schema pitfalls that the speaker painfully encountered in his first Cassandra deployment.

Notes

I’ve been a Rails developer since 2008 and have worked with Cassandra for two years. During that time, I built the first version of the Cequel library and fell in love with Cassandra. In particular, CQL version 3 exposes an incredibly rich data modeling grammar for a distributed data store. After I left the company where I’d been using Cassandra, I continued work on a new version of Cequel (now released as version 1.0) that takes full advantage of CQL3 and nudges developers toward good schema design patterns.

I’m particularly animated by this topic because our initial Cassandra deployment was a disaster, caused by a poor understanding of how Cassandra stores data and by the incorrect assumption that relational data modeling practices applied to the problem. That’s why I like to give this talk. I’ve previously spoken about Cassandra data modeling at Windy City Rails, and I spoke at GoRuCo about the joys of using more of the features built into your (relational) database.

outoftime/speaker.md

Mat Brown

Contact details

Speaker bio
