Data Engineer AI Compliance

🇫🇷 France - Remote
📊 Data 🔵 Mid-level

Job description

About the Company

Cephalgo is a Strasbourg-based technology company founded in 2020, focused on developing AI solutions that ensure safety, compliance, and trust in human-AI interactions. Originally rooted in healthcare innovation, Cephalgo’s platform helps organizations securely analyze and monitor voice and emotion data while meeting privacy, security, and regulatory standards.

Backed by over €3 million in funding, Cephalgo combines deep expertise in voice AI, data protection, and compliance frameworks to help enterprises build and deploy responsible AI systems. The company collaborates with leading European partners in AI ethics, healthcare, and regulatory technology.

About the Role

We are seeking a Data Engineer to build and scale systems that support text and voice analysis, risk detection, and classifier training workflows. You will own production-grade machine learning pipelines from the ground up (0 → 1) and collaborate closely with data scientists and AI engineers to deliver compliant, reliable data infrastructure and services.

What You’ll Do

Pipeline Development

  • Build and maintain end-to-end ML pipelines: data ingestion, preprocessing, feature extraction, model training, evaluation and deployment.

  • Develop reliable workflows specifically for voice and text analysis models.

Data Infrastructure

  • Design and maintain data storage, ETL workflows, and streaming/batch systems.

  • Implement data-quality, data-labeling, versioning and governance practices.

ML Collaboration

  • Work with data scientists and AI engineers to productionize models (e.g., text classifiers, anomaly-detection models, compliance-scoring models).

  • Support model monitoring and performance tracking once models are live.

Scalability & Reliability

  • Build robust, scalable, fault-tolerant pipelines.

  • Add observability layers: logging, monitoring, alerting for data and model pipelines.

Documentation & Governance

  • Document ETL processes, schemas, architecture and workflows.

  • Support compliance, data governance, and security standards in data pipelines and infrastructure.

You Might Be a Fit If You Have:

Experience

  • 3+ years in data engineering or ML engineering roles.

  • Proven experience building ML pipelines from scratch.

  • Experience with text classification, voice analysis or similar ML tasks is a strong plus.

Technical Skills

  • Strong programming skills (Python, Scala or Java).

  • Experience with big-data/streaming frameworks (Spark, Beam, Kafka or similar).

  • Familiarity with ML frameworks (PyTorch, TensorFlow, scikit-learn).

  • Experience with cloud data infrastructure and production deployment.

Soft Skills

  • Strong analytical and problem-solving skills.

  • Excellent collaborator and communicator, able to work with data scientists, engineers, and product/compliance stakeholders.

  • Detail-oriented, documentation-focused and comfortable in a fast-paced environment.

Education

  • Degree in Data Engineering, Computer Science, Machine Learning or related field (or equivalent experience).

Why Join Cephalgo?

  • Be at the intersection of cutting-edge AI/voice technology and compliance.
  • Make an impact by shaping a growing brand in a high-growth market.
  • Work with a collaborative, high-energy remote team driving forward-thinking solutions.
  • Grow your career and influence across product, marketing and business domains.