Posts tagged with Apple-silicon

Published on August 26, 2025
Speeding up PyTorch inference on Apple devices with AI-generated Metal kernels
Kernel Optimization Performance Apple Silicon
Our lab investigated whether frontier models can write optimized GPU kernels for Apple devices to speed up inference. We found that they can: our AI-generated Metal kernels were 1.24x faster across KernelBench v0.1 problems and 1.87x faster across KernelBench v0 problems.
