PyTorch-style tensors from first principles
I am building a PyTorch-style tensor and autograd library from scratch, with NumPy- and CuPy-backed storage, CPU/GPU device movement, operator overloading, reverse-mode autodiff, gradient accumulation, and common tensor operations.
Role
Systems Builder
System Surface
Tensors, autograd, devices
Stack
Python, NumPy, CuPy
Runtime Surface
CPU/GPU-backed tensor ops
The useful part is rebuilding the machinery normally hidden behind a framework call: storage, broadcasting, overloaded operators, backward passes, gradient buffers, and device movement. It is a small library for understanding ML infrastructure by making it concrete.
Primary Focus
Autograd mechanics
Output Format
Tensor library
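The CPU/GPU device movement mentioned above can be sketched as a thin storage wrapper that picks NumPy or CuPy as its array module. This is a hedged illustration rather than the library's actual design: `Storage`, `xp`, and the `"cpu"`/`"cuda"` device strings are assumptions, and CuPy is treated as optional so the CPU path runs without it.

```python
import numpy as np

try:
    import cupy as cp  # optional; only needed for device="cuda"
except ImportError:
    cp = None


class Storage:
    """Illustrative array holder that lives on one device (hypothetical name)."""

    def __init__(self, data, device="cpu"):
        self.device = device
        self.data = self._to_backend(data, device)

    @staticmethod
    def _to_backend(data, device):
        if device == "cpu":
            # Copy GPU arrays back to host memory; wrap everything else in NumPy.
            if cp is not None and isinstance(data, cp.ndarray):
                return cp.asnumpy(data)
            return np.asarray(data)
        if device == "cuda":
            if cp is None:
                raise RuntimeError("CuPy is not installed; cannot move to 'cuda'")
            return cp.asarray(data)  # copies host data to the current GPU
        raise ValueError(f"unknown device: {device!r}")

    @property
    def xp(self):
        """Array module matching the device, so ops can be written once."""
        return np if self.device == "cpu" else cp

    def to(self, device):
        # Moving to the current device is a no-op; otherwise build a new Storage.
        if device == self.device:
            return self
        return Storage(self.data, device)


s = Storage([1.0, 2.0, 3.0])
total = s.xp.sum(s.data)  # dispatches to numpy.sum on the CPU
```

Writing operations against `s.xp` instead of importing `numpy` directly is what lets the same forward and backward code run on either device, since CuPy mirrors much of the NumPy API.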