Summary
The job is for a Deep Learning and Large Language Model Performance Architect at a cutting-edge hardware startup in Silicon Valley. The role includes analyzing workloads, developing performance models, and collaborating with cross-functional teams. The ideal candidate has an MS or PhD in a relevant discipline, deep learning knowledge, a computer architecture background, strong C/C++ programming skills, and good communication skills.
Requirements
- MS or PhD in a relevant discipline (CS, EE, Math) or equivalent experience
- In-depth knowledge of deep learning models or large language models
- Strong background in computer architecture or AI software stack/compilers
- Strong C/C++ programming and hardware modeling skills
- Strong problem solving and analytical thinking skills
- Good communication and organizational skills
Responsibilities
- Workload Analysis - Analyzing the performance of important workloads, tuning our current software, and proposing improvements for future software
- Performance Modeling and Analysis - Developing analytical models for target systems, analyzing performance bottlenecks, and making recommendations to the implementation teams
- Collaborating with cross-functional teams of deep learning software engineers and hardware architects to develop innovative solutions
Preferred Qualifications
- Performance modeling and analysis background a plus
- GPU programming experience (CUDA) a plus
- LLVM/MLIR development experience a plus