Why Enterprises Have to Hack Big Data before spending huge on Internet of Things

A couple of years ago, when I first heard the term Internet of Things (IoT), it was explained to me with an example straight out of science fiction: a super-intelligent refrigerator checking my supplies and telling me their expiration dates. All this analogy did was remind me of the scene from Darren Aronofsky’s Requiem for a Dream in which the old lady (played by the lovely Ellen Burstyn) hallucinates on her diet pills and runs scared of her buzzing refrigerator. Now that I have got that visual out of the way, let’s discuss the Internet of Things, which has been one of the hot topics of conversation at every tech conference recently.

As an industry, there does seem to be a consensus that IoT has not yet reached its full potential, but that the time is just about right for it to flourish. Gartner estimates 26 billion IoT units installed by 2020 [1], while Cisco forecasts a more ambitious 50 billion connected devices by the same year [2]. We are already seeing some early implementations from the industry, especially in the home automation sector. At eInfochips, we had the pleasure of developing a Home Security and Automation Solution that allowed customers to migrate to Android without a consultant at their doorstep or instructions over the phone. You can read more about the solution in the case study.

But for IoT to become as effective as, say, mobile devices are today, we need to ask the same questions that global enterprises like Google and Yahoo had to ask at the advent of the Internet search engine and web crawling era. Those questions concern the data generated by the devices. So, before we get into the scheme of things, let’s take a step back and consider the nature of that data. I will once again use the traditional refrigerator example to explain the problem.

Consider yourself in charge of designing a refrigerator integrated with an intelligent system, i.e. an Internet of Things device. The use case in hand includes a sensor that detects perishable items in your refrigerator, e.g. milk, vegetables, meat, and chocolates. The sensor not only reads the bar code of every item in the fridge but also sends this data to a backend system where it is processed. The findings are sent back as action commands to the machine, which in turn informs the user (via a cognitive mobile agent or console display) about the state of the perishable items in their refrigerator. But it doesn’t end there: the cycle is repeated at frequent intervals and is automatically triggered with every new product you put into the refrigerator.
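To make the cycle concrete, here is a minimal sketch of one pass through it: sensor readings go to a backend step, which returns action commands that the device turns into user messages. Everything in it is hypothetical: the `ItemReading` schema, the command names, the two-day warning threshold, and the assumption that an expiry date is already known for each scanned bar code.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class ItemReading:
    """One bar-code read from the fridge sensor (hypothetical schema)."""
    barcode: str
    name: str
    expires_on: date  # assumed to be known for each scanned item


def backend_process(readings, today, warn_within_days=2):
    """Backend step: turn raw readings into action commands for the device."""
    commands = []
    for reading in readings:
        days_left = (reading.expires_on - today).days
        if days_left < 0:
            commands.append(("EXPIRED", reading.name))
        elif days_left <= warn_within_days:
            commands.append(("EXPIRING_SOON", reading.name))
    return commands


def notify_user(commands):
    """Device step: render the commands for the console or mobile display."""
    return [f"{name}: {status.replace('_', ' ').lower()}" for status, name in commands]


today = date(2014, 6, 1)
readings = [
    ItemReading("0001", "milk", today + timedelta(days=1)),
    ItemReading("0002", "meat", today - timedelta(days=1)),
    ItemReading("0003", "chocolate", today + timedelta(days=90)),
]
messages = notify_user(backend_process(readings, today))
print(messages)  # ['milk: expiring soon', 'meat: expired']
```

In a real device this pass would run on a timer and again whenever a new item is scanned, which is where the data volume starts to add up.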

This is a classic case of online transaction processing (OLTP) where streaming data is being generated and pushed at frequent intervals to the device. It is real time and is dynamically generated based on machine configurations.

Now, consider the volume of data that such a machine would generate in a day. Moreover, when streaming data is analysed dynamically, in theory nothing can be left unprocessed, so velocity plays an even bigger role. While things may look remarkably simple in a prototype, once you replicate the model for real life you suddenly feel the intensity of the data you are handling.
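A quick back-of-the-envelope calculation shows how fast this grows. The figures below are purely illustrative assumptions, not measurements: 30 items per fridge, one scan per minute, 200 bytes per reading, and a fleet of one million fridges.

```python
def daily_volume_bytes(items_per_fridge, scans_per_hour, bytes_per_reading, fridges):
    """Estimate the raw sensor data produced per day, in bytes."""
    readings_per_day = items_per_fridge * scans_per_hour * 24 * fridges
    return readings_per_day * bytes_per_reading


# Hypothetical figures: one prototype fridge versus a fleet of a million.
prototype = daily_volume_bytes(30, 60, 200, fridges=1)
fleet = daily_volume_bytes(30, 60, 200, fridges=1_000_000)

print(f"prototype: {prototype / 1e6:.2f} MB per day")  # prototype: 8.64 MB per day
print(f"fleet:     {fleet / 1e12:.2f} TB per day")     # fleet:     8.64 TB per day
```

Under these assumptions a single prototype produces a few megabytes a day, while the same design rolled out at scale produces terabytes.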

But that doesn’t mean that enterprises have to scratch their heads and wait another decade for talking refrigerators. A parallel technology initiative with equal, if not more, buzz in the market, namely Big Data, can in fact become the ideal answer to the problem.

How does Big Data solve the streaming data deluge?
The machine-to-machine (M2M) communication data, which is generated in streams, needs to be aggregated, ingested, processed, analysed and managed in real time. There have already been some developments on this front with technologies like:

  • Apache Storm, supported by Hortonworks, which delivers online machine learning, real-time analytics, distributed RPC and ETL (extract-transform-load) [3]
  • Apache Flume, supported by Cloudera, which is currently used to collect and aggregate real-time log data and, though this is probably not fully realized yet, can be extended to sensor data
  • Amazon Kinesis, a more mature platform that also benefits from the managed services of the Amazon stack
  • Document-oriented NoSQL platforms like MongoDB, which highlight machine-to-machine (M2M) support for ad-hoc analysis using a built-in aggregation framework [4]

These technologies will only evolve in the coming months; ergo, they will keep scaling their capabilities to manage higher volume, velocity and variety of real-time streaming data.
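The common idea behind such platforms is to aggregate events as they arrive rather than after the fact. The sketch below illustrates that idea in plain Python with a fixed ("tumbling") time window. It does not use the API of Storm, Flume, Kinesis or MongoDB, and the event format and device names are made up for illustration.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_seconds):
    """Count events per device within fixed, non-overlapping time windows.

    `events` is an iterable of (timestamp_in_seconds, device_id) pairs,
    consumed one at a time as they would be from a stream.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for timestamp, device_id in events:
        # Map each event to the start of the window it falls into.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[window_start][device_id] += 1
    return {start: dict(per_device) for start, per_device in counts.items()}


events = [
    (0, "fridge-a"), (12, "fridge-b"), (45, "fridge-a"),
    (61, "fridge-a"), (75, "fridge-b"), (119, "fridge-b"),
]
result = tumbling_window_counts(events, window_seconds=60)
print(result)
# {0: {'fridge-a': 2, 'fridge-b': 1}, 60: {'fridge-a': 1, 'fridge-b': 2}}
```

This toy version keeps every window in memory; a production stream processor would emit each window once it closes and discard it, distributing the work across many machines.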

In a keynote from Gartner Symposium ITxpo 2013, Peter Sondergaard says, “Big Data is changing the way you do business. When the nexus of forces meet Internet of everything, Big Data explodes!” It remains an interesting open-ended question, as we still haven’t realized all the possibilities that the combination of IoT and Big Data can offer.

From our brief experience working on Big Data use cases, which often involve transactional processing of data like log files and machine streams, we have learned that there is tremendous room for Hadoop and NoSQL technologies in the evolution of the Internet of Things. The machine streams coming from our refrigerator example can, for instance, be stored in MongoDB using GridFS and queried from there. Or we can use HBase, or another technology in the Hadoop ecosystem. The selection depends entirely on the type of data you want to store and the nature of the analysis you intend to derive from the system. I have already started planning the things I could do with my IoT + Big Data powered refrigerator. How about you?

If you liked the home security and automation case study, you can learn more about the work we do here. Also check out the IoT universe infographic below.


eInfochips IoT Infograph from eInfochips


[1] Gartner Says the Internet of Things Will Transform the Data Center, March 18, 2014
[2] Cisco Internet of Things Visualization
[3] Apache Storm
[4] Realizing the Promise of Machine to Machine (M2M) with MongoDB

