Job description
- Lead the architecture and implementation of MLOps/LLMOps systems within OpenShift AI, establishing best practices for scalability, reliability, and maintainability while actively contributing to relevant open-source communities
- Design and develop robust, production-grade features focused on AI trustworthiness, including model monitoring
- Drive technical decision-making around system architecture, technology selection, and implementation strategies for key MLOps components, with a focus on open-source technologies
- Define and implement technical standards for model deployment, monitoring, and validation pipelines, while mentoring team members on MLOps best practices and engineering excellence
- Collaborate with product management to translate customer requirements into technical specifications, architect solutions that address scalability and performance challenges, and provide technical leadership in customer-facing discussions
- Lead code reviews, architectural reviews, and technical documentation efforts to ensure high code quality and maintainable systems across distributed engineering teams
- Identify and resolve complex technical challenges in production environments, particularly around model serving, scaling, and reliability in enterprise Kubernetes deployments
- Partner with cross-functional teams to establish technical roadmaps, evaluate build-vs-buy decisions, and ensure alignment between engineering capabilities and product vision
- Provide technical mentorship to team members, including code review feedback, architecture guidance, and career development support while fostering a culture of engineering excellence
Required Experience/Skills:
- 5+ years of software engineering experience, with at least 4 years focused on ML/AI systems in production environments
- Strong expertise in Python, with demonstrated experience building and deploying production ML systems
- Deep understanding of Kubernetes and container orchestration, particularly in ML workload contexts
- Extensive experience with MLOps tools and frameworks (e.g., KServe, Kubeflow, MLflow, or similar)
- Track record of technical leadership in open-source projects, including significant contributions and community engagement
- Proven experience architecting and implementing large-scale distributed systems
- Strong background in software engineering best practices, including CI/CD, testing, and monitoring
- Experience mentoring engineers and driving technical decisions in a team environment
Advantageous Experience/Skills:
- Experience with Red Hat OpenShift or similar enterprise Kubernetes platforms
- Contributions to ML/AI open-source projects, particularly in the MLOps/GitOps space
- Background in implementing ML model monitoring
- Experience with LLM operations and deployment at scale
- Public speaking experience at technical conferences
- Advanced degree in Computer Science, Machine Learning, or a related field
- Experience working with distributed engineering teams across multiple time zones