A Brief History of Data
with apologies to Dr. Hawking for appropriating this title!
Some time ago I made the move from Oracle to DataStax. DataStax is a company built around supporting Apache Cassandra and helps to improve Cassandra by contributing its work via the Cassandra Enhancement Proposals (CEP) process.
As a long-time RDBMS guy, learning Cassandra was an eye-opening experience. Cassandra is a wholly different way of storing and accessing data. I’ve spent quite a while unlearning the things I learned using RDBMSs in order to get my head around Cassandra. Instead of just telling you what Cassandra is (if you don’t already know), I’m going to go into some data storage history, which will explain why Cassandra is different from every other storage technique to date.
Databases — How We Got Here
I started writing code in 1981 (a story in and of itself). At the time, a common approach to data storage and retrieval in the PC world was the serial access file. In case you’re not familiar with those, they are files whose records are written one after another, in order. If you had a 100K file (an ocean of storage in the days of the 5.25" floppy drive) and the record you wanted was the last one in the file, you had to read all 100K of the file into memory (often in 1K chunks) until you found the record you were looking for.
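To make that cost concrete, here is a minimal Python sketch of a serial scan. It is only an illustration, not code from that era: the fixed 64-byte record size, the 8-character key at the start of each record, and the names `find_record` and `make_record` are all assumptions made up for this example.

```python
import io

RECORD_SIZE = 64      # hypothetical fixed record length in bytes
CHUNK_SIZE = 1024     # read 1K at a time, as described above


def find_record(stream, key: bytes):
    """Scan a serial file from the start.

    Returns (record, bytes_read), or (None, bytes_read) if the key is absent.
    """
    bytes_read = 0
    while True:
        chunk = stream.read(CHUNK_SIZE)
        if not chunk:
            return None, bytes_read
        bytes_read += len(chunk)
        # CHUNK_SIZE is a multiple of RECORD_SIZE, so no record straddles two chunks.
        for offset in range(0, len(chunk), RECORD_SIZE):
            record = chunk[offset:offset + RECORD_SIZE]
            if record.startswith(key):
                return record, bytes_read


def make_record(n: int) -> bytes:
    # Hypothetical layout: an 8-byte zero-padded key followed by filler bytes.
    return f"{n:08d}".encode().ljust(RECORD_SIZE, b".")


# Build a 100K in-memory "file" of 1,600 records and look for the last one.
data = io.BytesIO(b"".join(make_record(n) for n in range(1600)))
record, bytes_read = find_record(data, b"00001599")
print(bytes_read)  # 102400: every byte of the file was read to find the last record
```

Looking up the first record in the same file stops after a single 1K chunk, so the cost of a lookup depends entirely on where the record happens to sit in the file.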