Research Engineer, Scaling


Job Details

About the role:

As a Research Engineer on the Scaling team, you will directly train the models we launch to the public via Claude.AI and our API. In this role, you will design, implement, and optimize large-scale distributed systems that interact with state-of-the-art hardware accelerators. You will drive operational overhead towards zero through automation, and system uptime towards 99.9%. You will be at the nexus of systems, infrastructure, and deep learning. Your work on the Scaling team will have a direct and massive impact on the company's success.

Responsibilities:

  • Implementing and optimizing distributed training algorithms on accelerators, and surrounding distributed systems to train frontier models.
  • Working with the latest-generation hardware accelerators and data center networks.
  • Collaborating with ML researchers, accelerator performance optimization teams, network teams, cluster management teams, and beyond.
  • Working at the high-leverage nexus of systems and machine learning.

You may be a good fit if you:

  • Have experience with Python and one additional high-performance language (Rust, C, C++, D, Java, C#, Fortran, Go, Swift, etc.)
  • Are results-oriented, with a bias towards flexibility and impact
  • Enjoy being empowered to fix problems wherever they show up
  • Enjoy pair programming (we love to pair!)
  • Want to learn more about machine learning research
  • Care about the societal impacts of your work
  • Have clear written and verbal communication

Strong candidates may also:

  • Have experience working with hardware accelerators, HPC, machine learning, networking, or distributed systems
  • Have experience or a strong interest in machine learning
  • Have experience with complex shared codebases
  • Have experience optimizing the performance of programs
  • Have experience running highly available systems

Deadline to apply: None. Applications will be reviewed on a rolling basis.


 Lionheart Ventures

 06/26/2024

 all cities,CA