
About

I’m a Senior Engineer at NVIDIA, chasing the Speed of Light (SOL), the theoretical peak performance that every GPU workload aspires to reach. At NVIDIA, I learned that true optimization isn’t about clever tricks; it’s about relentlessly measuring, understanding, and eliminating every wasted cycle until you’re as close to SOL as physics allows.

My research interests span high-performance computing, artificial intelligence, and computer architecture. I work on pushing state-of-the-art deep learning models to industry-leading performance across domains including speech recognition, machine translation, image classification & detection, and generative AI.

This blog is where I document what I learn: the insights, techniques, and hard-won lessons from the pursuit of peak efficiency. In my spare time, I build developer tools in Python, CUDA, and PyTorch to make both everyday workflows and deep learning research faster and more productive.

🚀 Deep Learning Models

Selected training optimizations I’ve contributed to:

🔧 Open Source Contributions

Key deep learning building blocks I’ve developed:

📬 Contact

Feel free to reach out via Zhihu, LinkedIn, or leave a comment on any post.