Senior Data Scientist

💰 $180k-$230k

Job Description

About Probably Genetic

Probably Genetic is changing the lives of patients living with severe, complex diseases. Our data platform is used by drug developers and patient advocacy groups to develop and launch treatments for these patients. Our technology discovers undiagnosed patients online, analyzes their disease state using machine learning and at-home testing, and enables compliant communication with patients. In doing so, we help patients access diagnoses, clinical trials, and treatments as early as possible.

We are a tight-knit group of hard-working, ambitious problem solvers united by a mission greater than ourselves. We do well by doing right by patients. We are developing some of the most cutting-edge solutions in healthcare, and our roadmap is packed with innovations in bioinformatics, AI, and drug development. We have built a lean, all-star team to help us bring our vision to life, and we want you to be a part of it.

Probably Genetic has raised multiple rounds of funding from Silicon Valley’s best investors, including Threshold, Khosla, and Y Combinator, and offers competitive salaries, comprehensive benefits, and meaningful early-stage equity.

About the role

We are looking for a Senior Data Scientist who will own some of the most consequential diagnostic AI in rare disease: building, validating, and operationalizing the models that help us find and diagnose patients who have never had a name for their disease, powering the analytical rigor behind our testing programs, and shaping how we use data to make smarter product decisions.

What you will do

  • Own the end-to-end development, validation, and operationalization of PG’s predictive diagnostic AI models, from feature engineering through production deployment, that power program eligibility decisions and clinical decisions for patients

  • Run prospective testing experiments: apply diagnostic models to undiagnosed patients, coordinate testing, and track outcomes to continuously improve model performance

  • Build and maintain PG’s synthetic patient data pipeline, a critical deliverable for our research programs and a key input to our own model development lifecycle

  • Optimize our patient intake experience using NLP and multimodal data analysis to determine which questions to ask, in what order, to maximize data quality and conversion

  • Own API usage and cost optimization across PG’s AI stack, including prompt engineering, model evaluation, and ongoing performance monitoring

  • Conduct ad hoc strategic analyses that inform product prioritization, support causality assessment, and generate customer-facing program insights

  • Establish MLOps infrastructure: model monitoring, drift detection, API observability, and lightweight but durable operational processes

  • Have the freedom to conduct blue-sky research initiatives aimed at creating value from our data

  • Work with Data Engineering to build a robust, scalable data foundation that supports all of the above

Who you are

We are looking for a few specific things that will help you succeed in this role:

  • 7+ years of experience in data science, machine learning engineering, or a closely related field

  • Strong Python proficiency and fluency across the core data science stack: pandas, NumPy, scikit-learn, PySpark, and SQL

  • Demonstrated end-to-end ML experience: you have taken models from problem definition through feature engineering, validation, deployment, and monitoring in a production environment

  • Experience with NLP techniques and applying language models to real-world problems

  • Comfort with prompt engineering and evaluating external AI API performance (e.g., OpenAI)

  • A track record of operating with high ownership in lean, fast-moving environments where you have had to build structure as much as execute within it

  • Strong analytical communication skills — you can translate complex model outputs and data findings into clear, actionable narratives for technical and non-technical audiences alike

Some things that are not required but that you will learn on the job:

  • Experience with Databricks or similar lakehouse/ML platform environments

  • Familiarity with synthetic data generation techniques

  • Domain knowledge in healthcare, rare disease, genomics, or clinical research

  • Experience with MLOps tooling and building observability infrastructure from scratch

  • Exposure to biopharma or insurance analytics use cases

As with all new hires at Probably Genetic, you will also need to be:

  • A good person. We work with some of the most marginalized populations on the planet and empathy is key

  • Patient-focused and motivated to have a lasting, positive impact on humanity

  • Comfortable in a fast-paced, often ambiguous environment with rapid change

  • Action-oriented and excited to build a company from the ground up

The salary range for this role is $180,000-$230,000 annually. Actual compensation offered will depend on several factors including but not limited to: work experience, education, skill level, and/or other business and organizational needs.

What we offer at Probably Genetic:

  • An engaging and supportive team all on a mission to improve lives

  • Fair and equitable compensation with competitive early-stage equity grants

  • Generous flexible time-off policy that we actually use

  • Parental leave benefits (12 weeks for both birthing and non-birthing parents)

  • Hybrid, flexible work with high-trust and autonomy

  • A bright, inviting, pet-friendly office in Downtown SF near transit

  • A “work from anywhere” policy, up to 4 weeks a year

  • Regular team retreats in exciting destinations

  • Health Benefits including medical, dental, vision, therapy, FSA, and 401k

  • And so much more!

Probably Genetic is committed to fostering a welcoming and inclusive work environment for people of all genders, sexualities, ethnicities, socioeconomic backgrounds, and life experiences. We urge candidates of all backgrounds to apply. If you require specific accommodations as you interview or consider working with us, please let us know.
