How to Apply Big Data Analytics and Machine Learning to Real Time Processing
"Big Data" is currently a big hype. Large amounts of historical data are stored in Hadoop or other platforms. Business Intelligence tools and statistical computing are used to draw new knowledge and to find patterns from this data, for example for promotions, cross-selling or fraud detection. The key challenge is how these findings can be integrated from historical data into new transactions in real time to make customers happy, increase revenue or prevent fraud.
"Fast Data" via stream processing is the solution to embed patterns - which were obtained from analyzing historical data - into future transactions in real-time. This session uses several real world success stories to explain the concepts behind stream processing and its relation to Hadoop and other big data platforms. The session discusses how patterns and statistical models of R, Spark MLlib and other technologies can be integrated into real-time processing.
A brief overview of available open source frameworks and commercial products shows possible options for the implementation of stream processing, such as Apache Storm, Spark Streaming, IBM InfoSphere Streams, or TIBCO StreamBase.
A live demo shows how to implement stream processing, how to integrate machine learning, and how a Web UI and push events enable human intervention in addition to the automatic processing.
Kai Wähner works as Technical Lead at TIBCO. Kai’s main areas of expertise are Integration, Big Data, Analytics, SOA, Microservices, BPM, Cloud Computing, Java EE and Enterprise Architecture Management. He is a speaker at international IT conferences such as JavaOne, ApacheCon and OOP, writes articles for professional journals, and shares his experiences with new technologies on his blog. Find more details and references (presentations, articles, blog posts) on his website.