My Articles

Insights, tutorials, and thoughts on data engineering, analytics, and technology

ChessLens - Chess Analytics

I Played 1,379 Chess Games in 3 Months — The Data Explains Why I Keep Losing

A deep dive into building ChessLens, an end-to-end chess analytics platform built with dbt, DuckDB, Dagster, and Streamlit. The article covers tilt detection, session analysis, opening trends, and on-demand Stockfish evaluation, turning 1,379 Chess.com games into actionable insights about when, how, and why I lose.

Modern Data Engineering: Designing a Real-time Architecture

Building a Formula 1 Data Pipeline (Part 1): Designing a Real-time Architecture

In this article, I share how I built a data pipeline for Formula 1 racing data. Learn how I used PostgreSQL for storage, Kafka and Debezium for streaming, and DuckDB with dbt for analytics, implementing patterns such as change data capture (CDC) and slowly changing dimensions (SCD Type 2) along the way.

Building a Simple Recommender System

Building a Simple Recommender System: Content-based

This tutorial explains how to build a content-based filtering system that predicts a user's movie ratings from that user's preferences. Learn the fundamentals of recommendation systems, from data preparation to similarity calculation and prediction, with a practical Python implementation.

Building a Formula 1 Data Pipeline: The Data Collection Layer

Building a Formula 1 Data Pipeline (Part 2): The Data Collection Layer

In this article, I detail how I built the data collection layer for my F1 pipeline using Python-based collectors with async programming, robust error handling, and efficient connection pooling. I explain the challenges of collecting Formula 1 data and share lessons learned from implementing API integration and web scraping.