Seeking experienced Machine Learning Engineers to join the Inference team focusing on generative AI models and inference optimization for multi-modal models.

Company: Stability AI

Role: Generative AI Inference Engineer

Location: United States | Remote

Experience:

7+ years productionizing machine learning systems, including inference pipeline development
5+ years working with the Python scientific stack, PyTorch, and at least one high-performance inference framework
Experience profiling and optimizing deep neural networks on NVIDIA GPUs
Experience deploying to cloud orchestration systems such as Kubernetes and cloud providers such as AWS, GCP, and Azure

Key Skills:

Python
PyTorch
Triton
TensorRT
Diffusion Architecture
NVIDIA Nsight
OpenCV
Kubernetes
AWS
GCP
Azure
Docker
HuggingFace
W&B
ComfyUI

Role Focus:

Design and develop customer-facing multi-modal ML inference systems
Build inference systems for next-generation models, focusing on optimization, model tuning, and deployment
Partner with cloud providers to deliver hosted Stability AI inference solutions
Drive business impact through machine learning as a strategic thought partner
Bring new Stability models and pipelines into existence
Prototype and productionize inference platform improvements and new features

Additional responsibilities:

Work alongside top researchers and engineers using cutting-edge high-performance computing resources
Rapidly prototype solutions and iterate under tight product deadlines
Maintain strong communication, collaboration, and documentation

Nice to have:

Familiarity with workflow tools like ComfyUI
Experience with the open-source ML ecosystem

Other:

Equal opportunity employer with a non-discrimination policy covering race, religion, national origin, gender, sexual orientation, age, veteran status, disability, or other legally protected statuses
