In his presentation, Prof. Dr. Wortmann described how Big Data emerged three years ago. From the start, the phenomenon challenged existing analytical technologies, platforms, applications and processes. Big Data poses challenges in terms of complexity and cost, speed and performance, the integration of unstructured data, and real-time analysis.
On the other hand, dealing with Big Data can push developers to evolve current Business Intelligence systems faster. Hardware is going through a revolutionary process of optimization, in which Big Data can leverage the full potential of modern hardware.
The huge amount of unstructured, semi-structured and structured data seems to be forcing us to rethink the efficacy of our current technologies.
Prof. Dr. Wortmann explained that Big Data is a category of technologies, applications, and processes for gathering, storing, accessing, and analysing data that goes beyond the capabilities of today's BI systems. Volume has grown from gigabytes and terabytes to petabytes and exabytes; velocity has moved from hours and minutes to seconds and milliseconds; and variety has expanded from structured data to semi-structured and unstructured data.
In Big Data, three specific domains logically have to converge: business intelligence, content intelligence and real-time intelligence, according to Prof. Dr. Wortmann. He added that Big Data is not mainstream yet. Innovators like Google have pioneered the fundamental technologies, and vendors are now pushing hard to bring these technologies to the Fortune 1000. Big Data is becoming a strategic initiative.
In terms of variety, Big Data is more than capturing social media or web data. Variety means systematically broadening and exploiting the information base with a clear business case in mind, according to the speaker. Enterprises have to move beyond conventional store and product management.
Often, Big Data is not about generating more raw data but about moving from analysis of the average customer segment to analysis at the individual level.
In terms of velocity, sub-second response time and continuous data load constitute the sweet spot. Velocity is indeed more than machine response time, as the speaker showed. It empowers the human factor and allows us to move towards a new generation of human computation systems. In this way, we can evolve from static to dynamic decision making and put our business into context.
Big Data technology is still a young and fragmented market, according to Prof. Dr. Wortmann. Still, three major Big Data technology domains can be observed. Business Intelligence deals with structured data in relational databases. Innovation here is occurring in main-memory approaches and Hadoop capabilities.
NoSQL and Content Intelligence deal with structured and unstructured data by means of the Hadoop software stack: HDFS and HBase for data storage; MapReduce, Pig and Hive for data processing; and Lucene and Mahout for language processing.
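To make the data processing layer more concrete, the MapReduce model mentioned above can be sketched in a few lines of plain Python. This is only an illustration of the map, shuffle and reduce phases on a single machine; it does not use Hadoop's API, and the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical names chosen for this sketch.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group emitted values by key, as a framework would between the phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values collected for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run the three phases in sequence over a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    # Toy input, invented for illustration only.
    print(word_count(["big data is big", "data streams"]))
```

In a real cluster, the map and reduce calls would run in parallel on many nodes and the shuffle step would move data across the network; the sketch only shows how the phases fit together.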
Real Time Intelligence is used for data streams. Two proven technologies are being deployed here, according to Prof. Dr. Wortmann: Kafka as a distributed and persistent message broker, and Storm as a real-time computation system. Both are early Apache projects.
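The division of labour between a persistent message broker and a real-time computation system can be illustrated with a small in-memory sketch. It uses neither Kafka's nor Storm's actual API; the classes MessageLog and RunningCounter are hypothetical stand-ins, assuming only that the broker keeps messages in an append-only log and that the processor updates its result as new messages arrive.

```python
from collections import Counter


class MessageLog:
    """Toy append-only log: messages are retained and readers track their own offset."""

    def __init__(self):
        self._messages = []

    def append(self, message):
        """Store a message and return its offset in the log."""
        self._messages.append(message)
        return len(self._messages) - 1

    def read_from(self, offset):
        """Return all messages at or after the given offset."""
        return self._messages[offset:]


class RunningCounter:
    """Toy stream processor: keeps a running count per event type."""

    def __init__(self, log):
        self.log = log
        self.offset = 0
        self.counts = Counter()

    def poll(self):
        """Process messages not seen yet, advance the offset, return how many were new."""
        new_messages = self.log.read_from(self.offset)
        for event in new_messages:
            self.counts[event] += 1
        self.offset += len(new_messages)
        return len(new_messages)


if __name__ == "__main__":
    # Toy events, invented for illustration only.
    log = MessageLog()
    counter = RunningCounter(log)
    log.append("click")
    log.append("view")
    counter.poll()
    log.append("click")
    counter.poll()
    print(dict(counter.counts))
```

Because the log retains messages and each reader keeps its own offset, a second consumer could be attached later and replay the stream from the beginning, which is the main reason for pairing a persistent broker with a streaming engine.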
The next generation of business intelligence tools will build upon three pillars, the speaker predicted: multi-core and main-memory processing, multiple nodes for scale-out capability, and intelligent databases that run calculations.
With Hadoop 2.0, Hadoop is no longer limited to batch processing for content intelligence; it can also be used for streaming and interactive applications, for example, the speaker explained.
In any case, for Big Data analytics to succeed, we need the convergence of the different technologies, which means that we can have different platforms for different user groups and use cases.
Prof. Dr. Wortmann concluded that Big Data may indeed appear to be disruptive, but it is not disrupting each and every use case.