February 17, 2025
Data Engineering
Building a Formula 1 Data Pipeline (Part 1): Designing a Real-time Architecture
In this article, I share how I built a data pipeline for Formula 1 racing data.
Learn how I used PostgreSQL for storage, Kafka and Debezium for streaming, and DuckDB with dbt for analytics,
implementing change data capture (CDC) and slowly changing dimensions (SCD Type 2) along the way.
May 29, 2020
Machine Learning
Tutorial
Building a Simple Recommender System: Content-based
This tutorial explains how to build a content-based filtering system that predicts a user's ratings for movies
based on that user's preferences. Learn the fundamentals of recommendation systems, from data preparation to
similarity calculation and prediction, with a practical Python implementation.
April 23, 2025
Data Engineering
Building a Formula 1 Data Pipeline (Part 2): The Data Collection Layer
In this article, I detail how I built the data collection layer for my F1 pipeline using Python-based collectors with
async programming, robust error handling, and efficient connection pooling. I explain the challenges of collecting
Formula 1 data and share lessons learned from implementing API integration and web scraping solutions.