Seeking a Staff Machine Learning Engineer to lead design and development of scalable AI/ML training infrastructure at General Motors. This role involves technical leadership, architecture definition, and collaboration with ML engineers and research scientists.

Company: General Motors

Role: Staff Machine Learning Engineer - ML Training Infrastructure

Location: Frankfort | Remote

Experience:

7+ years of professional software engineering experience
5+ years of specialized experience in AI/ML infrastructure
Experience leading technically ambiguous, cross-team infrastructure initiatives

Key Skills:

Python
PyTorch
TensorFlow
Distributed systems
Distributed training
GPU computing
Cloud environments (AWS, GCP, Azure)
ML frameworks
Model training optimization
System observability
Debuggability
Operational excellence

Qualification:

Bachelor's degree or higher in Computer Science or a related field, or equivalent practical experience

Role Focus:

Define and drive architecture, design, and development of scalable ML frameworks and platform capabilities
Lead model training performance analysis and optimization across distributed training workflows
Improve scalability, efficiency, and cost across heterogeneous hardware environments
Enhance system observability, debuggability, operational excellence, and developer experience
Own large, ambiguous, cross-functional technical initiatives from strategy through execution
Define technical roadmap, perform tradeoff analysis, and deliver solutions
Influence platform direction by identifying long-term infrastructure investments and setting engineering standards
Collaborate across organizational boundaries to align requirements and integrate new capabilities
Mentor engineers through design reviews, technical guidance, and hands-on partnership

Additional responsibilities:

Travel to Sunnyvale, CA as needed
Operate in highly ambiguous and dynamic environments

Nice to have:

Deep expertise in PyTorch 2.x+ and distributed training frameworks
Experience with training platforms supporting FSDP, pipeline parallelism, and scalable solutions for large foundation models
Experience profiling, analyzing, debugging, and optimizing training and data loading performance at scale
Strong record of technical leadership through architecture reviews, roadmap influence, and cross-team execution
Excellent communication skills for building consensus and providing constructive technical feedback
Self-motivated and execution-oriented with broad organizational impact

Other:

Salary range $185,000 to $335,300 with bonus potential based on company and individual performance
Relocation benefits may be available
Benefits include medical, dental, vision, Health Savings Account, Flexible Spending Accounts, retirement savings plan, sickness and accident benefits, life insurance, paid vacation and holidays, tuition assistance, employee assistance program, and GM vehicle discounts
Company vehicle evaluation program available upon successful motor vehicle report review
Role categorized as remote with no expected onsite reporting unless directed
General Motors is committed to diversity, inclusion, equal employment opportunity, and accommodations for disabilities
Employment decisions are made without regard to protected status under federal, state, and local laws
