TopGenAIJobs

A Gen AI & Agentic AI Jobs Platform to discover high-quality Gen AI and Agentic AI opportunities from top companies worldwide.

topgenaijobs.com

    Generative AI Inference Engineer

    Stability AI

    United States
    7+ years
    Today
    ₹23–44 LPA
    Full-time
    Remote

    Skills Required

    LLM
    Gen AI
    Diffusion Architecture
    Hugging Face
    Multi-modal models
    Inference pipeline development
    High-performance inference frameworks
    Python
    PyTorch
    Triton
    TensorRT
    NVIDIA Nsight
    OpenCV
    Kubernetes
    AWS

    Description

    Stability AI is seeking passionate machine learning engineers to join its Inference team, which focuses on inference optimization for multi-modal generative AI models.

    Company: Stability AI

    Role: Generative AI Inference Engineer

    Location: United States | Remote

    Experience:

    • 7+ years productionizing machine learning systems, including inference pipeline development
    • 5+ years working with the Python scientific stack, PyTorch, and at least one high-performance inference framework
    • Experience profiling and optimizing deep neural networks on NVIDIA GPUs
    • Experience deploying to cloud orchestration systems such as Kubernetes and cloud providers such as AWS, GCP, and Azure

    Key Skills:

    • Python services at scale
    • PyTorch
    • high-performance inference frameworks (e.g. Triton, TensorRT)
    • Diffusion Architecture
    • NVIDIA Nsight profiling tools
    • OpenCV
    • Docker
    • cloud orchestration systems
    • AWS
    • GCP
    • Azure
    • Hugging Face
    • W&B

    Role Focus:

    • Design and develop customer-facing multi-modal ML inference systems
    • Build inference systems for next generation models including optimization, model tuning, and deployment
    • Partner with cloud providers to deliver hosted Stability AI inference solutions
    • Drive business impact through machine learning as a strategic thought partner
    • Bring new Stability models and pipelines into existence
    • Prototype and productionize inference platform improvements and new features

    Additional responsibilities:

    • Work alongside top researchers and engineers using high-performance computing resources
    • Rapidly prototype solutions and iterate with tight product deadlines
    • Strong communication, collaboration, and documentation skills

    Nice to have:

    • Familiarity with workflow tools like ComfyUI
    • Experience with the open-source ML ecosystem

    Other:

    • Equal opportunity employer with non-discrimination policy
