Chapter 3

Curly-Hair-ai.com: Building a Domain-Specific AI for Haircare

The journey of creating an AI platform that helps users understand and care for their curly hair through personalized recommendations and routines.

Chapter 7

Collaborating with BashNota: Bringing Revolutionary Computational Research Tools to the Masses

How I'm working with Taha Bouhsine, CEO of MLNomads and inventor of YAT, to bring BashNota, a computational research tool, to researchers and developers worldwide.

Chapter 9

The Gravity of Learning

Machine learning borrows its vocabulary from biology. The math underneath it is linear algebra. But the thing it might actually be describing is physics — specifically, gravity.

Chapter 10

Small Models, Sharp Instincts

Before you can train big, you need to know how to read what your model is telling you. Optimizers, the batch size equation, and learning to diagnose training from the charts.

Chapter 11

The Tokens You Don't See

Sequence packing, intra-document masking, and why the invisible data engineering of your training pipeline shapes what the model learns as surely as any architectural choice.

Chapter 12

Infrastructure: The Unsung Hero

GPUs, memory hierarchies, NVLink, and PCIe — why the hardware layer underneath your training run shapes everything, and how to stop treating it as a black box.

Chapter 13

Splitting the Work

From replicating models across GPUs to sharding every byte of memory — Data Parallelism, ZeRO, Tensor Parallelism, and Sequence Parallelism explained from first principles.

Chapter 14

The Full Orchestra

Pipeline Parallelism, Context Parallelism, Expert Parallelism — the remaining three dimensions of distributed training, how all five compose, and the art of finding the right configuration.