Choosing a database

Choosing your database

Two reasons to consider a NoSQL database: programmer productivity and data access performance.

  • To improve programmer productivity by using a database that better matches an application’s needs.
  • To improve data access performance via some combination of handling larger data volumes, reducing latency, and improving throughput.

It's essential to test your expectations about programmer productivity and performance before committing to a NoSQL technology.
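One way to test the performance expectation is to time the same operation against each candidate and compare latency percentiles. The sketch below is a minimal harness under stated assumptions: `measure_latency` and the dictionary workload are hypothetical placeholders, and a real test would call an actual read or write on each database under realistic data volumes.

```python
import statistics
import time
from typing import Callable, Dict


def measure_latency(operation: Callable[[], object], runs: int = 1000) -> Dict[str, float]:
    """Call `operation` `runs` times and summarise per-call latency in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    return {
        "median_ms": statistics.median(samples),
        "p95_ms": samples[int(0.95 * (len(samples) - 1))],
        "max_ms": samples[-1],
    }


if __name__ == "__main__":
    # Placeholder workload (assumption): swap in a real query against each candidate database.
    store = {i: str(i) for i in range(100_000)}
    print(measure_latency(lambda: store.get(4242)))
```

Comparing the median with the 95th percentile matters because an average can hide the slow requests that users actually notice.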

Sticking with the default

In many cases you’re better off sticking with the default option of a relational database:

  • You can easily find people with experience using them.
  • They are mature, so you are less likely to run into the rough edges of new technology.
  • Picking a new technology always adds risk if the project runs into difficulties.

Popular databases

RDBMS

  • Uses a B-Tree structure as the storage engine for data.
  • Data and indexes are organized as B-Trees, so reads and writes take logarithmic time. For 1 million records, locating the required data or pointer in the index takes about 20 comparisons (log2 of 1,000,000 is roughly 20).
  • Each disk access takes around 5 ms. Assuming about 100 keys per page, a lookup among 1 million records reads 3 pages, roughly 15 ms when nothing is cached. Because the cost grows logarithmically, 1 billion (1,000 million) records needs about 5 page reads, roughly 25 ms, not seconds per row.
  • As the data size grows, write and query performance still suffer (for example, when indexes no longer fit in memory), and the database can become the bottleneck of the application.
  • To relieve this, the "shard" concept is applied: data is split into several shards based on a specific hash key.
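The arithmetic and the shard routing above can be sketched in a few lines. This is an illustration under stated assumptions, not how any particular RDBMS is implemented: the fan-out of 100 keys per page, the function names, and the choice of SHA-256 as the hash are all hypothetical; only the 5 ms per disk access comes from the text.

```python
import hashlib
import math


def binary_comparisons(num_records: int) -> int:
    """Key comparisons for a binary search over num_records sorted keys."""
    return math.ceil(math.log2(num_records))


def page_reads(num_records: int, fanout: int = 100) -> int:
    """Tree levels (uncached page reads) when each page holds `fanout` keys."""
    levels = 1
    capacity = fanout
    while capacity < num_records:
        capacity *= fanout
        levels += 1
    return levels


def shard_for(key: str, num_shards: int) -> int:
    """Route a key to a shard with a stable hash (Python's built-in hash() varies per process)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


if __name__ == "__main__":
    DISK_ACCESS_MS = 5  # per-access cost quoted in the text
    for n in (1_000_000, 1_000_000_000):
        reads = page_reads(n)
        print(n, "records:", binary_comparisons(n), "comparisons,",
              reads, "page reads,", reads * DISK_ACCESS_MS, "ms uncached")
    print("customer:42 goes to shard", shard_for("customer:42", 4))
```

Plain modulo routing is the simplest choice, but changing the number of shards remaps most keys, which is why production systems often prefer consistent hashing or range-based schemes.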

MongoDB

MongoDB is often seen as the next logical move from an RDBMS. MongoDB also uses B-Tree-like indexes, but its file mappings are kept in memory, which makes access to data fast. Writes are also made against memory first, and the data is then flushed from memory to disk periodically. NoSQL databases in general use memory heavily compared to an RDBMS.
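The "write to memory first, flush to disk periodically" pattern can be illustrated with a toy key-value store. This is a sketch of the general idea only, not MongoDB's implementation or API: the `WriteBackStore` class, its JSON file format, and the method names are hypothetical.

```python
import json
import os
import tempfile
from typing import Dict, Optional


class WriteBackStore:
    """Toy key-value store: writes land in memory, flush() persists them to disk."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.data: Dict[str, str] = {}
        self.dirty = False
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as fh:
                self.data = json.load(fh)

    def put(self, key: str, value: str) -> None:
        self.data[key] = value  # fast path: memory only
        self.dirty = True

    def get(self, key: str) -> Optional[str]:
        return self.data.get(key)

    def flush(self) -> bool:
        """Persist the in-memory state; returns False when there was nothing to write."""
        if not self.dirty:
            return False
        tmp_path = self.path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as fh:
            json.dump(self.data, fh)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp_path, self.path)  # atomic swap of the old file for the new one
        self.dirty = False
        return True


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "store.json")
    store = WriteBackStore(path)
    store.put("user:1", "Ada")
    print(store.get("user:1"), os.path.exists(path))  # readable before anything is on disk
    store.flush()
    print(os.path.exists(path))
```

The trade-off is durability: anything written since the last flush is lost if the process crashes, which is why real databases usually pair this pattern with a journal or write-ahead log.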

Cassandra

Cassandra is meant for fast writes and for queries that are known upfront (its writes are said to be about 3 times faster than MongoDB's). Ad hoc queries perform less well, so it suits write-heavy workloads with predictable reads, such as time-series data.
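Designing for a query that is known upfront usually means grouping rows by a partition key and keeping them ordered by time within each partition. The following is a toy in-memory model of that layout, not Cassandra's API or storage engine; the `TimeSeriesTable` class and the sensor naming are hypothetical.

```python
import bisect
from collections import defaultdict
from typing import Dict, List, Tuple


class TimeSeriesTable:
    """Toy model: rows grouped by partition key, kept ordered by timestamp."""

    def __init__(self) -> None:
        self._partitions: Dict[str, List[Tuple[int, float]]] = defaultdict(list)

    def write(self, sensor_id: str, ts: int, value: float) -> None:
        rows = self._partitions[sensor_id]
        if not rows or ts >= rows[-1][0]:
            rows.append((ts, value))  # common case: readings arrive in time order
        else:
            bisect.insort(rows, (ts, value))  # late arrival: keep the partition sorted

    def read_range(self, sensor_id: str, start: int, end: int) -> List[Tuple[int, float]]:
        """The one query this layout is built for: one sensor, start <= ts < end."""
        rows = self._partitions.get(sensor_id, [])
        lo = bisect.bisect_left(rows, (start,))
        hi = bisect.bisect_left(rows, (end,))
        return rows[lo:hi]


if __name__ == "__main__":
    table = TimeSeriesTable()
    for ts, value in [(10, 1.0), (30, 3.0), (20, 2.0)]:
        table.write("sensor-1", ts, value)
    print(table.read_range("sensor-1", 10, 30))
```

A query that does not start from the partition key, such as "all sensors above a threshold", would have to scan every partition in this model, which mirrors why ad hoc queries are the weak spot of this kind of design.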
