An examination of how data pipeline design impacts PyTorch training performance, supported by simple experiments and benchmarks.
A deep-dive journey inside the ESP-WROOM-32 chip, exploring the Xtensa CPU, memory architecture, and how to reverse-engineer firmware to uncover its secrets.
A real-world story of connecting a camera and a PC via Ethernet, exploring concepts from PoE and static IP addressing to turning a laptop into a DHCP server.
Dive deeper into CUDA to uncover the principles and practices behind high-performance GPU computing.
Learn about GPU architecture and the CUDA model to understand how parallel hardware operates and optimize performance.
A clear, organized space to consolidate knowledge and document the learning process.