Member of Technical Staff, Compute Infrastructure

last updated April 19, 2026 6:11 UTC

xAI

HQ: San Francisco, US

About xAI

xAI’s mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. Our team is small, highly motivated, and focused on engineering excellence. This organization is for individuals who appreciate challenging themselves and thrive on curiosity. We operate with a flat organizational structure. All employees are expected to be hands-on and to contribute directly to the company’s mission. Leadership is given to those who show initiative and consistently deliver excellence. Work ethic and strong prioritization skills are important. All employees are expected to have strong communication skills. They should be able to concisely and accurately share knowledge with their teammates.

ABOUT THE ROLE:

The Compute Infrastructure team at xAI is responsible for designing, building, and operating the massive-scale clusters and orchestration platforms that power frontier AI training, inference, and agent workloads at unprecedented scale. In this role, you will push the boundaries of container orchestration far beyond existing systems like Kubernetes, manage exascale compute resources, optimize for high-performance training runs and production serving, and collaborate closely with research and systems teams to deliver reliable, ultra-scalable infrastructure that enables xAI’s next-generation models and applications.

RESPONSIBILITIES:

  • Build and manage massive-scale clusters to host, persist, train, and serve AI workloads with extreme reliability and performance.

  • Design, develop, and extend an in-house container orchestration platform that achieves superior scalability, isolation, resource efficiency, and fault-tolerance compared to off-the-shelf solutions.

  • Collaborate with resear

Apply info ->

To apply for this job, please visit the application page
