Optimizing Driver Allocation for Peak Hours in Ride-Hailing: A Data-Driven Approach
In the fast-paced world of ride-hailing, ensuring a seamless experience for riders during peak hours is a significant challenge. Long wait… (Aug 28, 2023)
Navigating the Storm: Taming Rapidly Changing Dimensions in Data Management
In the dynamic realm of data management, change is the only constant. But what happens when that change accelerates into a whirlwind of… (Aug 23, 2023)
Managing Dimensional Data Changes: A Refresher to Slowly Changing Dimensions (SCDs) and their Types
Imagine you are working with a retail company that sells products online. One of the dimensions you are tracking is “Product”, and you want… (Aug 22, 2023)
Supercharging Data Analysis: Unleashing the Power of Cumulative Table Design
Unlocking the full potential of a Data Analysis solution is a quest that intrigues both Data Engineers as well as Data Analysts. Imagine a… (Aug 21, 2023)
ShellStat Data Model: A Brief Overview & Significance
ShellStat, designed to offer deep insights into shell history, uses a robust data model to capture, store, and analyse terminal… (Aug 18, 2023)
ShellStat
Over the weekend, while reviewing some systems logs of my Mac, I came across one exciting use case. I was trying to run some commands, and… (Aug 17, 2023)
Hacking Apache Airflow to trigger DAGs based on FileSystem events
In this blog, I would like to cover the concepts of triggering Airflow DAGs based on events from a filesystem or simply a… (Mar 9, 2020)
Writing UDFs (User Defined Functions) in Apache Spark
In this small tutorial, we will see how to create a User Defined Function, aka UDF, in Apache Spark. (Sep 28, 2019)
Parsing Apache Kafka __consumer_offsets using Kafka command and Java API
__consumer_offsets is the topic where Apache Kafka stores the offsets. Since the time Kafka migrated the offset storage from Zookeeper to… (May 16, 2019)