Job description

About Blueprint

At Blueprint, we’re on a mission to empower therapists with world-class tools so they can focus on what matters most—delivering exceptional mental health care.

Our AI assistant is purpose-built for therapists, automating the administrative tasks that slow them down and enabling them to operate at the top of their license. With Blueprint, therapists aren’t just managing their work; they’re supported by tools that understand the context of each client interaction. Compared to legacy software tools, Blueprint feels more like having the world’s best executive assistant at your side.

Today, over 50,000 therapists are on Blueprint, leveraging our platform to enhance care for hundreds of thousands of clients. We’ve found strong product-market fit and are scaling rapidly to meet demand.

Our organization is very flat, and our team is intentionally small and talent-dense. We like people who are truth-seekers, creative, and passionate about improving mental health care.

We’re a remote-first company (US and Canada only, for now) and come together in person a few times a year to connect, have fun, and help shape the future of mental health care.

About the role

We’re looking for an experienced AI/ML Engineer to take ownership of evaluation and quality across our AI systems. At Blueprint, AI isn’t a bolt-on — it’s the foundation of our product. We use LLMs to automate clinical documentation, deliver clinical insights, and reimagine how therapists work.

This role is about making sure those systems work reliably, safely, and well. You’ll design the evaluation infrastructure that helps us measure what “good” looks like across subjective, human-centered workflows and build the tools to track, test, and improve model outputs over time.

You’ll work closely with engineering, product, and clinical leaders to define quality in practical, therapist-facing terms and make sure we have the systems in place to deliver it consistently.

This is a highly cross-functional, high-impact role. Your work will directly shape what tens of thousands of therapists experience when they use our product every day.

What You’ll Do

Design and build our end-to-end evaluation infrastructure: LLM-as-a-judge, human QA pipelines, offline scoring, and more
Define and implement application-specific quality metrics — not just accuracy, but tone, structure, clinical alignment, and more
Collaborate with product and clinical leads to turn subjective requirements into structured evaluation criteria
Monitor and analyze model performance across different therapist cohorts and workflows
Build tools and processes to capture in-the-wild feedback from clinicians and route it back into model and product improvement loops
Work closely with engineers to integrate eval into our CI, deployment, and iteration cycles
Help shape data labeling, prompt evaluation, experiment design, and prompt tuning frameworks

Who We’re Looking For

You’re a hands-on ML/AI practitioner who’s passionate about building high-quality systems that actually get used — not just optimizing for benchmark scores. You’ve worked with LLMs in production at scale and know the hard part is making outputs reliable, human-aligned, and easy to evaluate. You’re motivated by impact, comfortable with ambiguity, and thrive in early-stage, fast-paced environments.

You might be a fit if:

You’ve built or owned evaluation infrastructure for LLMs or generative AI products
You have experience designing QA workflows, human-in-the-loop systems, or LLM-as-a-judge pipelines
You think in terms of feedback loops — and can turn fuzzy product goals into testable quality metrics
You write code, ship experiments, and are comfortable working across the stack to get the right signals flowing
You’re excited about working closely with product, design, and domain experts to define and refine what “good” means in a real-world AI application

Bonus if you have:

Experience in healthcare, mental health, or other high-trust environments
Familiarity with labeling, data QA, or prompt engineering at scale
A strong POV on eval tools, metrics, or best practices — and a willingness to invent new ones where needed

Benefits

Competitive salary and equity
100% remote – no office, no commuting
Health, dental, and vision insurance, with 75% of your premium covered by Blueprint
Semi-annual team gatherings (in Chicago!)
Unlimited PTO
Opportunities to grow with the company and shape our product
Hardworking, mission-driven, friendly coworkers

Blueprint is an equal opportunity employer and does not discriminate on the basis of race, gender, sexual orientation, gender identity/expression, national origin, disability, age, genetic information, veteran status, marital status, pregnancy or related condition, or any other basis protected by law.
