Posts

What is Apache Airflow?

Image
   Ap ache  Air flow  is  a  platform  to  program matically  author ,  schedule  and  monitor  work flows . Air flow was created as an internal project at Airbnb in October 2014 . The project was open sourced in June 2015 . Air flow has a modular architecture and uses a message queue to orche strate an arbitrary number of workers . Air flow is written in Python and uses the Jin ja tem pl ating engine to generate work flows . Air flow work flows are composed of D AG s ( Direct ed A cycl ic Graph s ). A D AG is a collection of tasks that have dependencies on each other . T asks in Air flow are operators , which perform some sort of action . Oper ators can be written in any language . Air flow sched uler executes your D AG s on an array of workers while following the specified dependencies . When a D AG is executed , the sched uler creates a D AG Run entry in the

What is Apache Kafka?

Image
   Apache Kafka is a distributed streaming platform that offers high - through put , low - lat ency , and durability .. It is a message broker that enables communication between different applications . Apache K af ka is used in a variety of applications , including : - Mess aging - Website activity tracking - Audit trails - Log aggregation - Stream processing The fundamental construct in Kafka is a topic . Topics are like channels in a messaging system . You can publish messages to a topic , and subscribers can consume those messages . Apache K af ka is a distributed system , so topics are partition ed and replicated across multiple brokers . This allows for high availability and fault tolerance . It provides different delivery guarantees : at least once , exactly once , and in order . The at least once guarantee means that each message will be delivered at least onc