Praxis Job Board

Software Engineer, ML Performance, Dojo

Tesla

Software Engineering, Data Science
Palo Alto, CA, USA
Posted 6+ months ago
What to Expect

As a member of the Dojo Machine Learning team, you will be responsible for enabling Tesla's neural networks to train efficiently on our upcoming in-house custom silicon supercomputer systems. Join a small team of experienced developers in optimizing and scaling the deployment of our PyTorch-derived neural networks on Tesla's custom massively parallel Dojo accelerators. Work with many of the same great engineers who delivered Tesla's custom FSD Computer. The ideal candidate has experience writing software for large distributed systems.

What You’ll Do
  • Understand and model the end-to-end training performance of the Autopilot SW team's PyTorch-derived neural networks on the Dojo system
  • Develop software that scales and improves training performance based on your analysis of the bottlenecks
  • Collaborate with the Dojo HW team to understand current HW architecture and propose future improvements
What You’ll Bring
  • Degree in Engineering, Computer Science, or equivalent experience with evidence of exceptional ability
  • Experience scaling neural network training systems or other large distributed systems
  • Familiarity with the internals of PyTorch and/or JAX
  • Performance analysis experience
  • Experience coding parallel programs
  • Ability to work from the Palo Alto office