TopGenAIJobs

A Gen AI & Agentic AI Jobs Platform to discover high-quality Gen AI and Agentic AI opportunities from top companies worldwide.

topgenaijobs.com

© 2026 TopGenAIJobs (Gen AI & Agentic AI Jobs Platform). All rights reserved.

    Generative AI Inference Engineer

    Stability AI

    United States
    7+ years
    Today
    ₹23–44 LPA
    Full-time
    Remote

    Skills Required

    LLM
    Gen AI
    Diffusion Architecture
    Hugging Face
    Multi-modal models
    Python
    PyTorch
    Triton
    TensorRT
    NVIDIA Nsight
    OpenCV
    Kubernetes
    AWS
    GCP
    Azure

    Description

    Stability AI is seeking experienced machine learning engineers to develop and optimize inference systems for multi-modal generative AI models using cutting-edge high-performance computing.

    Company: Stability AI

    Role: Generative AI Inference Engineer

    Location: United States | Remote

    Experience:

    • 7+ years productionizing machine learning systems, including inference pipeline development
    • 5+ years working with the Python scientific stack, PyTorch, and at least one high-performance inference framework
    • Experience profiling and optimizing deep neural networks on NVIDIA GPUs
    • Experience deploying to container orchestration systems such as Kubernetes and to cloud providers such as AWS, GCP, and Azure

    Key Skills:

    • Python
    • PyTorch
    • Triton
    • TensorRT
    • Diffusion Architecture
    • NVIDIA Nsight
    • OpenCV
    • Kubernetes
    • AWS
    • GCP
    • Azure
    • Docker
    • Hugging Face
    • W&B

    Role Focus:

    • Design and develop customer-facing multi-modal ML inference systems
    • Build inference systems for next-generation models, focusing on optimization, model tuning, and deployment
    • Prototype and productionize inference platform improvements and new features
    • Bring new Stability models and pipelines into existence

    Additional responsibilities:

    • Partner with leading cloud providers to deliver hosted Stability AI inference solutions
    • Be a strategic thought partner for leaders across the organization on driving business impact through machine learning
    • Work with Platform and Inference teams

    Nice to have:

    • Familiarity with workflow tools like ComfyUI
    • Experience with the open-source ML ecosystem

    Other:

    • Strong communication, collaboration, and documentation skills
    • Ability to rapidly prototype solutions and iterate under tight product deadlines
    • Opportunity to work alongside top researchers and engineers
    • Utilize cutting-edge high-performance computing resources
