Quick Glance at the Hadoop Ecosystem

As mentioned in the Hadoop post, the community around Hadoop has built a tremendous number of tools and technologies to support developers. Together they form the Hadoop ecosystem. Some of the most popular ones are:

  • Hive
    Hadoop is written in Java, but not everyone knows Java. Hive is software built on top of Hadoop that exposes a SQL interface, allowing SQL developers to use the powerful Hadoop system in a familiar language. If you know SQL, you don’t need Java experience in order to leverage Hadoop. Hive uses HiveQL, a very SQL-like language.
  • HBase
    Basically a non-relational database on top of Hadoop. Even though it’s non-relational, you can integrate it with other systems just like a traditional database.
  • Pig
    A tool in the Hadoop ecosystem used to manipulate data, for example transforming unstructured data into structured data. It also has its own language for querying the data, Pig Latin, which plays a role similar to HiveQL in Hive.
  • Storm
    An event stream processor in the Hadoop ecosystem, used to process streams of data (as opposed to batch data). An example would be processing a stream of IoT data, where data from IoT devices keeps flowing through the system.
  • Oozie
    A workflow management system that schedules and coordinates jobs across different Hadoop technologies.
  • Flume / Sqoop
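    Integration tools that transfer data to and from the Hadoop system. Flume collects streaming data such as logs and events, while Sqoop does bulk transfers between relational databases and Hadoop. If you have data that lives outside Hadoop and needs to be processed in Hadoop, Flume / Sqoop will do the job.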
  • Spark
    A distributed compute engine within the Hadoop ecosystem. It’s used to process large amounts of data, prepping them for analytics, machine learning, etc. It has a lot of built-in libraries for machine learning, artificial intelligence, analytics, stream processing and graph processing. Spark also supports various languages: Scala, Python, R, etc.
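
To make the batch vs. stream distinction from the Storm entry a bit more concrete, here is a minimal sketch in plain Python. It does not use Storm, Spark, or any Hadoop API; the function names (`batch_average`, `stream_average`) and the sample IoT readings are made up purely for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical IoT readings as (device_id, temperature). Made-up sample data.
READINGS: List[Tuple[str, float]] = [
    ("sensor-a", 21.0),
    ("sensor-b", 19.5),
    ("sensor-a", 23.0),
    ("sensor-b", 20.5),
]


def batch_average(readings: List[Tuple[str, float]]) -> Dict[str, float]:
    """Batch style: all data is available up front, one result at the end."""
    totals: Dict[str, float] = {}
    counts: Counter = Counter()
    for device, temp in readings:
        totals[device] = totals.get(device, 0.0) + temp
        counts[device] += 1
    return {device: totals[device] / counts[device] for device in totals}


def stream_average(
    readings: Iterable[Tuple[str, float]],
) -> Iterator[Tuple[str, float]]:
    """Stream style: emit an updated running average after every event."""
    totals: Dict[str, float] = {}
    counts: Counter = Counter()
    for device, temp in readings:
        totals[device] = totals.get(device, 0.0) + temp
        counts[device] += 1
        # A result is available immediately, without waiting for the
        # stream to end (a real stream may never end).
        yield device, totals[device] / counts[device]


if __name__ == "__main__":
    print(batch_average(READINGS))
    for update in stream_average(iter(READINGS)):
        print(update)
```

The batch function only answers once it has seen everything, while the stream function produces a fresh answer per event. Real stream processors add distribution, fault tolerance and windowing on top of this idea, none of which is modelled here.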

This is definitely an oversimplified explanation of the Hadoop ecosystem, and there are lots of other technologies not covered here. But this should give you a quick idea of what each of them does.
