Deep Learning Architect at NVIDIA
yifany AT csail.mit.edu
Google Scholar
GitHub
LinkedIn
Curriculum Vitae
I am a Deep Learning Architect at NVIDIA, working on deep learning inference architecture.
I received my Ph.D. in computer science from MIT, where I was advised by Professor Daniel Sanchez and Professor Joel Emer. My thesis work focuses on computer architecture, specifically on accelerating irregular and sparse applications such as sparse transformers (GPT, BERT, etc.), sparse CNNs, sparse tensor algebra, and graph analytics.
Before joining MIT, I received a bachelor’s degree in Mathematics and Physics from Tsinghua University in 2019, where I worked with Professor Leibo Liu. I did a summer internship at UC Berkeley, working with Professor Kurt Keutzer. I also completed an industry internship at Apple.
You can access my curriculum vitae here.
I plan to write blog posts as I learn CUDA/CUTLASS/CuTe/Triton programming. Stay tuned for more!
Axel Feldmann, Courtney Golden, Yifan Yang, Joel S. Emer, Daniel Sanchez
in Proceedings of the 57th annual International Symposium on Microarchitecture (MICRO-57), 2024.
[paper]
Yifan Yang, Joel S. Emer, Daniel Sanchez
in Proceedings of the 51st annual International Symposium on Computer Architecture (ISCA-51), 2024.
[paper]
Yifan Yang, Joel S. Emer, Daniel Sanchez
in Proceedings of the 29th International Symposium on High Performance Computer Architecture (HPCA-29), 2023.
[paper] [slides] [poster]
Yifan Yang, Joel S. Emer, Daniel Sanchez
in Proceedings of the 48th annual International Symposium on Computer Architecture (ISCA-48), 2021.
[paper] [slides] [lightning] [poster]
Yifan Yang, Zhaoshi Li, Yangdong Deng, Zhiwei Liu, Shouyi Yin, Shaojun Wei, Leibo Liu
in Proceedings of the 47th annual International Symposium on Computer Architecture (ISCA-47), 2020.
[paper] [slides] [lightning]
Yifan Yang, Qijing Huang, Bichen Wu, Tianjun Zhang, Liang Ma, Giulio Gambardella, Michaela Blott, Luciano Lavagno, Kees Vissers, John Wawrzynek, Kurt Keutzer
in Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA), 2019.
[paper] [slides] [code]
Deep learning inference architecture
CPU cache subsystem performance research
6.812/6.825 Hardware Architecture for Deep Learning
Algorithm-hardware co-design for ConvNet accelerators on embedded FPGAs