Google published a series of papers on SCALABLE software that changed the way of thinking across the entire IT world!
- The distributed file system:
In 2003 they published an article about the Google File System (GFS), Google's distributed file system.
With the information in this article it was possible to implement an open source version of a distributed file system (HDFS), which is included in the Apache Software Foundation's (ASF) Hadoop.
In 2004 the MapReduce paper marked the next important step for scalable software systems. With MapReduce it became possible to run calculations on huge amounts of data using clusters of commodity hardware. The open source implementation is also included in Hadoop.
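To make the MapReduce idea a bit more concrete, here is a minimal single-process sketch of the classic word count. It is only an illustration of the programming model, not Hadoop's API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are my own assumptions, and a real cluster would run the map and reduce calls on many machines in parallel.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map step: emit one (word, 1) pair per word in the document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values of one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence (hypothetical driver)."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["Hadoop scales out", "Google scales out too"]))
```

Because each map call only looks at one document and each reduce call only looks at one key, the work can be spread over as many commodity machines as needed.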
As the next logical step Google added a database system that scales well and stores schema-less data in tables with millions of columns and even more rows. They called this system Bigtable. Apache's answer / implementation is called HBase, which started as a subproject of Hadoop.
In 2010 Google published a paper about Dremel, its system for interactive analysis of web-scale datasets.
Apache is currently incubating an open source version of Dremel, which is called Drill.
Driven by the growing importance of social networks, Google published the internals of its scalable graph processing system, Pregel. Apache implemented this system under the name Giraph.
The missing piece in the scalable software puzzle is a scalable message queue system that runs on a shared-nothing hardware cluster. Apache offers a scalable system called Kafka, which delivers topics and queues combined in one technological implementation. The basics seem to be well designed, even if some features for enterprise use are not yet available (e.g. authorization / access control). Up to now I haven't found an analogue in a Google publication. If someone knows of one, I would be pleased if you left me a comment.
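How topics and queues can live in one implementation is easier to see in a small simulation. The sketch below is not the Kafka client API; the `Topic` class, its methods and the CRC32 partitioning are assumptions made for illustration only. It models the idea of consumer groups: every group sees every message (topic behaviour), while inside a group each message goes to exactly one consumer (queue behaviour).

```python
import zlib
from collections import defaultdict


class Topic:
    """Toy model of a partitioned topic (hypothetical, not Kafka's API)."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]

    def publish(self, key, message):
        # Assumption: CRC32 stands in for whatever partitioner a real broker uses.
        index = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[index].append(message)

    def deliver(self, groups):
        """Return {(group, consumer): [messages]} for the given consumer groups."""
        received = defaultdict(list)
        for group, consumers in groups.items():
            for index, partition in enumerate(self.partitions):
                # Each partition has exactly one owner per group.
                owner = consumers[index % len(consumers)]
                received[(group, owner)].extend(partition)
        return dict(received)


if __name__ == "__main__":
    topic = Topic(num_partitions=4)
    for i in range(6):
        topic.publish(f"user-{i}", f"event-{i}")
    result = topic.deliver({"billing": ["b1", "b2"], "audit": ["a1"]})
    for (group, consumer), messages in sorted(result.items()):
        print(group, consumer, messages)
```

With a single group of several consumers this behaves like a queue, with one consumer per group it behaves like a classic publish/subscribe topic.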
As we can see, the IT world is changing, and the time when knowledge of relational databases and GUIs or web front ends was enough for IT professionals seems to be over.
Some people may be afraid of losing their role as knowledge leaders; others (hopefully most of us) are just curious about all the possibilities these new techniques will bring us.
Let’s scale out …..
Author: Martin Menzel