This presentation gives an exhaustive overview of the architectural and technological underpinnings of SQL on today's Big Data platforms. It covers the architectures of low-latency SQL engines for structured, unstructured and streaming analytics, as well as SQL for Operational Systems and Operational Analytics. The talk also covers the innovations happening in the space, from probabilistic engines like BlinkDB to GPU-based engines like MapD.
With the adoption of Big Data platforms in the enterprise, it has become all the more important to build SQL engines for a variety of workloads and use cases. From low-latency analytics, to ACID semantics on Big Data for Operational Systems, to SQL for handling unstructured and streaming data, SQL is fast becoming the lingua franca of the Big Data world too. The talk focuses on the tools, technologies and innovations in this space, their underlying architectures, and the road ahead. This is a fiercely competitive landscape, with vendors and innovators trying to capture mindshare and a piece of the pie through a whole suite of innovations, ranging from index-based SQL solutions on Hadoop, to OLAP with Apache Kylin and Tajo, to BlinkDB and MapD.
– Why SQL on Hadoop
– Challenges of SQL on Hadoop
– SQL on Hadoop Architectures for Low Latency Analytics (Drill, Impala, Presto, SparkSQL, JethroData)
– SQL on Hadoop Architecture for Semi-Structured Data
– SQL on Hadoop Architecture for Streaming Data
– Innovations (OLAP on Hadoop, Probabilistic SQL Engines, GPU Based SQL Solutions)