Seeking experienced Machine Learning Engineers to develop and optimize inference systems for multi-modal generative AI models using cutting-edge high-performance computing.

Company: Stability AI

Role: Generative AI Inference Engineer

Location: United States | Remote

Experience:

7+ years productionizing machine learning systems, including inference pipeline development
5+ years working with the Python scientific stack, PyTorch, and at least one high-performance inference framework
Experience profiling and optimizing deep neural networks on NVIDIA GPUs
Experience deploying to cloud orchestration systems such as Kubernetes and cloud providers such as AWS, GCP, and Azure

Key Skills:

Python
PyTorch
Triton
TensorRT
Diffusion Architecture
NVIDIA Nsight
OpenCV
Kubernetes
AWS
GCP
Azure
Docker
Hugging Face
W&B

Role Focus:

Design and develop customer-facing multi-modal ML inference systems
Build inference systems for next-generation models, focusing on optimization, model tuning, and deployment
Prototype and productionize inference platform improvements and new features
Bring new Stability models and pipelines into existence

Additional responsibilities:

Partner with leading cloud providers to deliver hosted Stability AI inference solutions
Be a strategic thought partner for leaders across the organization on driving business impact through machine learning
Work with Platform and Inference teams

Nice to have:

Familiarity with workflow tools like ComfyUI
Experience with the open-source ML ecosystem

Other:

Strong communication, collaboration, and documentation skills
Ability to rapidly prototype solutions and iterate with tight product deadlines
Opportunity to work alongside top researchers and engineers
Access to cutting-edge high-performance computing resources
