
DataDecode
Your weekly source for insights in data engineering, AI engineering, and machine learning.
Latest Articles
Published on Medium
I’m Writing This From 2031. You’re About to Make a Mistake.
You don’t know me yet. But I’m you, five years from now. I’m a data engineer. Same as you. Same Airflow DAGs. Same dbt models. Same 2 AM PagerDuty alerts for a null column that shouldn’t be null. Except I don’t do any of that anymore. Not because I got fired. Because none of it...
Read on Medium
Why Your AI Agents Suck at Teamwork (And How to Fix It)
A breakdown of natural delegation in multi-agent systems. So you’ve probably heard the buzz: the future of AI isn’t one giant model doing everything. It’s a team of specialized agents working together. One agent fetches data, another analyzes it, another writes the report. Sounds...
Read on Medium
The Problem: S3 Listing Is Costing You More Than You Think
If you’re running AWS Glue jobs at scale, there’s a subtle but significant cost driver that often flies under the radar: S3 LIST operations. Every time a Glue job starts up and scans a source path, it has to list all the files under that path before it can do anything useful....
Read on Medium
Athena + Glue + Hudi: A Surprisingly Powerful Big Data ETL Combo
A Smarter Way to Split the Work. At first glance, using both Amazon Athena and AWS Glue in the same ETL pipeline might feel unnecessary. After all, both tools can process data sitting in S3. However, this architecture becomes powerful when each service is used for what it does...
Read on Medium
Building an AI Agent That Actually Remembers: The LangGraph Sentinel Agent Story
How we built a production-ready AI agent with long-term memory, intelligent planning, and real-time thinking, and why it matters. The Problem: AI Agents That Forget Everything. You’ve probably had this frustrating experience: you’re chatting with an AI assistant, and after a few...
Read on Medium
How the Agentic Age Is Transforming Data Engineering and Analytics
The data world is going through one of its biggest shifts since the rise of cloud platforms, and this time the catalyst is the agentic age, powered by large language models (LLMs). Before we look at where we’re heading, let’s quickly revisit what data engineering and data...
Read on Medium
Top Repositories
Essential open-source tools for data and ML engineering
tensorflow/tensorflow
An end-to-end open source machine learning platform for research and production. TensorFlow provides tools and libraries for building and deploying ML-powered applications.
huggingface/transformers
State-of-the-art machine learning for JAX, PyTorch, and TensorFlow. Provides thousands of pretrained models for NLP, vision, and audio tasks.
kubernetes/kubernetes
Production-grade container orchestration system for automating deployment, scaling, and management of containerized applications.
pytorch/pytorch
Tensors and dynamic neural networks in Python with strong GPU acceleration. PyTorch provides a flexible deep learning framework for research and production.
langchain-ai/langchain
Framework for developing applications powered by language models. Simplifies building LLM applications with chains, agents, and retrieval systems.
scikit-learn/scikit-learn
Machine learning library for Python built on NumPy, SciPy, and matplotlib. Provides simple and efficient tools for predictive data analysis.
Courses of the Month
Hand-picked video courses to level up your skills this month

Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training
Comprehensive introduction to Apache Spark covering RDDs, DataFrames, and Spark SQL. Learn how to process large-scale data with hands-on examples.

Apache Airflow Tutorial for Beginners | What is Airflow? | Airflow Tutorial
Learn how to build, schedule, and monitor data pipelines using Apache Airflow. Covers DAGs, operators, and workflow orchestration.

Machine Learning Full Course - Learn Machine Learning 10 Hours | ML Tutorial
Comprehensive machine learning course covering supervised and unsupervised learning, algorithms, and real-world applications.

Neural Networks and Deep Learning - Full Course
Deep dive into neural networks and deep learning fundamentals. Covers backpropagation, CNNs, RNNs, and practical implementations.
Never miss an update
Weekly insights on data engineering, AI, and machine learning — delivered straight to your LinkedIn feed.