TopGenAIJobs

A Gen AI & Agentic AI Jobs Platform to discover high-quality Gen AI and Agentic AI opportunities from top companies worldwide.

topgenaijobs.com

© 2026 TopGenAIJobs (Gen AI & Agentic AI Jobs Platform). All rights reserved.


    Generative AI Inference Engineer

    Stability AI

    United States
    7+ years
    Today
    ₹23–44 LPA
    Full-time
    Remote

    Skills Required

    LLM
    Gen AI
    Diffusion Architecture
    Hugging Face
    Multi-modal models
    Inference pipeline development
    High-performance inference frameworks
    Python
    PyTorch
    Triton
    TensorRT
    NVIDIA Nsight
    OpenCV
    Kubernetes
    AWS

    Description

    Seeking experienced Machine Learning Engineers to join the Inference team focusing on generative AI models and inference optimization for multi-modal models.

    Company: Stability AI

    Role: Generative AI Inference Engineer

    Location: United States | Remote

    Experience:

    • 7+ years productionizing machine learning systems, including inference pipeline development
    • 5+ years working with the Python scientific stack, PyTorch, and at least one high-performance inference framework
    • Experience profiling and optimizing deep neural networks on NVIDIA GPUs
    • Experience deploying to cloud orchestration systems such as Kubernetes and cloud providers such as AWS, GCP, and Azure

    Key Skills:

    • Python
    • PyTorch
    • Triton
    • TensorRT
    • Diffusion Architecture
    • NVIDIA Nsight
    • OpenCV
    • Kubernetes
    • AWS
    • GCP
    • Azure
    • Docker
    • Hugging Face
    • Weights & Biases (W&B)
    • ComfyUI

    Role Focus:

    • Design and develop customer-facing multi-modal ML inference systems
    • Build inference systems for next-generation models, focusing on optimization, model tuning, and deployment
    • Partner with cloud providers to deliver hosted Stability AI inference solutions
    • Drive business impact through machine learning as a strategic thought partner
    • Bring new Stability models and pipelines into existence
    • Prototype and productionize inference platform improvements and new features

    Additional responsibilities:

    • Work alongside top researchers and engineers using cutting-edge high-performance computing resources
    • Rapidly prototype solutions and iterate under tight product deadlines
    • Communicate, collaborate, and document effectively

    Nice to have:

    • Familiarity with workflow tools like ComfyUI
    • Experience with the open-source ML ecosystem

    Other:

    • Equal opportunity employer; does not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability, or any other legally protected status
