Explore all notes

Filter by topic and browse all posts in reverse chronological order.

Filter by topic

All topics: gpu (2), optimization (2), systems (2), architecture (1), computer-vision (1), iot (1), mdx (1), networking (1), performance (1), pytorch (1)

Real-time Object Detection: Lessons from Building a Low-Latency Pipeline

Practical lessons from building a low-latency, real-time object detection pipeline, from resolving initial blocking issues to reaching a stable streaming architecture.

Optimizing PyTorch Data Pipelines: From Bottlenecks to 39× Speedups

An examination of how data pipeline design impacts PyTorch training performance, supported by simple experiments and benchmarks.

Inside the ESP32: Architecture & Firmware Analysis

A deep dive into the ESP-WROOM-32, exploring the Xtensa CPU, the memory architecture, and how to reverse-engineer its firmware.

From Camera to Ethernet: A Journey into Networking

A real-world story of connecting a camera and a PC via Ethernet, exploring concepts from PoE and static IP to turning a laptop into a DHCP server.

networking

Inside CUDA: Performance Engineering

Dive deeper into CUDA to uncover the principles and practices behind high-performance GPU computing.

Hello CUDA!

Learn about GPU architecture and the CUDA model to understand how parallel hardware operates and how to optimize performance.

Page 1 / 2
