What is Apache Druid?

 

Apache Druid is a high-performance, column-oriented, distributed data store designed for OLAP workloads. It is open source, scales horizontally, and offers a low-latency SQL interface.


Apache Druid was designed to address the needs of OLAP workloads, which typically involve a very large number of facts (events) described by a comparatively small set of slowly changing dimensions. Druid can ingest data from a wide variety of sources, including relational databases, NoSQL data stores, and flat files, and its ingestion engine handles both batch and streaming input. It is built to scale to billions of rows and millions of events per second, and data becomes queryable in real time as it is ingested.
Apache Druid has a rich SQL interface that supports most common SQL operations, including filters, aggregations, and joins. Druid can also be extended with user-defined functions and aggregators, which are packaged as extensions and let users add custom operations to the SQL interface.
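
To make the SQL interface a little more concrete, here is a minimal Python sketch of sending a query to Druid's SQL endpoint over HTTP. Druid accepts SQL as a JSON body posted to `/druid/v2/sql`. The host and port (`localhost:8888`, the Router in a local quickstart setup), the `events` datasource, and its `country` column are assumptions made up for illustration, so adjust them to your own deployment.

```python
import json
import urllib.request

# Assumption: a Druid Router is reachable on localhost:8888 (local quickstart setup).
DRUID_SQL_URL = "http://localhost:8888/druid/v2/sql"


def build_sql_request(sql, url=DRUID_SQL_URL):
    """Build an HTTP POST request that carries a Druid SQL query as JSON."""
    payload = json.dumps({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sql(sql, url=DRUID_SQL_URL):
    """Send the query and return the parsed JSON result (needs a running Druid)."""
    with urllib.request.urlopen(build_sql_request(sql, url)) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical datasource "events" with a "country" dimension.
# __time is Druid's built-in timestamp column.
EXAMPLE_QUERY = """
SELECT country, COUNT(*) AS event_count
FROM events
WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '1' HOUR
GROUP BY country
ORDER BY event_count DESC
LIMIT 10
"""
```

Calling `run_sql(EXAMPLE_QUERY)` against a running cluster would return the matching rows as a list of JSON objects, one per result row. Building the request is kept separate from sending it so the payload can be inspected without a live server.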

Apache Druid can be deployed on a single server or on a cluster of servers. In a clustered deployment it is highly available and can tolerate the failure of individual nodes. Apache Druid is open source and is released under the Apache License, Version 2.0.
