Job description
CodaMetrix is revolutionizing Revenue Cycle Management with its AI-powered autonomous coding solution, a multi-specialty AI platform that translates clinical information into accurate sets of medical codes. CodaMetrix’s autonomous coding drives efficiency under fee-for-service and value-based care models and supports improved patient care. We are passionate about getting physicians and healthcare providers away from the keyboard and back to clinical care.
Overview
The Senior Data Engineer is a member of the Data & Analytics team, which is responsible for executing the organization’s data strategy. The team’s goal is to ensure the high quality of external data ingested into the Lakehouse so that it can be turned into powerful insights for our internal and external customers, while ensuring our ML/AI teams have the data they need to develop and train their models.
The Senior Data Engineer is responsible for designing, developing, and managing the systems and architecture required to process and analyze large datasets. They will play a key role in building scalable and efficient data pipelines, integrating various data sources, and ensuring that data is clean, reliable, and easily accessible for reporting and analytics. They will work closely with other teams, including data scientists, business intelligence developers, and DevOps, to ensure that the organization’s data infrastructure supports both current and future needs.
They support our analytics and customer onboarding teams, data scientists, and software engineers on data initiatives and ensure optimal data access across the organization. As a team member, you will populate and maintain our data and data pipeline architecture and optimize data flow and collection for cross-functional teams. You are an experienced data pipeline author and data wrangler who enjoys optimizing and evolving data systems, and who brings a customer-centric approach to the various teams that provide and consume data.
Responsibilities
- Create, maintain, populate and optimize the CodaMetrix data platform and analytics architecture.
- Assemble large, complex data sets that meet functional and non-functional business requirements using the Databricks platform.
- Develop and manage ETL processes using Spark and Kafka to ingest, clean, and transform data from different sources (databases, APIs, external feeds, etc.) into usable formats for downstream analysis and reporting.
- Implement data quality checks and ensure that data is accurate, consistent, and free from errors.
- Implement data governance in compliance with data privacy and security regulations (e.g., GDPR, HIPAA).
- Identify, design, and implement internal process improvements such as automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Collaborate with software engineers to ensure that data infrastructure is compatible with applications and services that rely on data.
- Optimize data processing workflows for speed, efficiency, and scalability.
- Work with stakeholders including the Analytics, Machine Learning, Executive and Product teams to assist with data-related technical issues and support their data infrastructure needs.
- Ensure that data infrastructure supports real-time and batch data processing.
- Work with structured, semi-structured, and unstructured data, managing large volumes of data and ensuring its accessibility.
- Review code, provide constructive feedback, and ensure high standards of engineering excellence within the team.
- Lead and mentor junior and mid-level data engineers, providing guidance and training on best practices, architecture design, and data pipeline management.
- Establish best practices for data engineering and promote their adoption across teams.
Requirements
- Required
- BS or MS degree in Computer Science, Informatics, Information Systems, or another related field, or equivalent work experience
- 5+ years of hands-on experience with the Databricks platform using PySpark/Scala
- 5+ years of experience with big data ingestion technologies such as Apache Spark, Apache Kafka, and other distributed computing tools
- 5+ years of strong SQL experience across relational and non-relational (NoSQL) databases such as MongoDB
- Experience with object-oriented or functional programming languages (Scala, Java, and Python are all preferred)
- Experience with structured and semi-structured data formats such as Parquet, CSV, JSON, and XML
- Experience working with Terraform to provision cloud infrastructure.
- Experience with GitHub for version control, collaborative development, and CI/CD pipelines.
- Hands-on experience building and managing data pipelines in large-scale, cloud-based environments.
- Good knowledge of BI tools; experience with Tableau is a huge plus
- Experience with Agile development (SDLC, Scrum, Kanban)
- You have experience building and optimizing ‘big data’ pipelines, architectures, and data sets, with strong analytical skills for working with both structured and unstructured datasets. You have built processes supporting data transformation, data structures, metadata, dependency management, and workload management, and you bring strong project management and interpersonal skills, along with experience supporting and working with cross-functional teams in a dynamic environment.
- Preferred
- Knowledge of HIPAA compliance requirements, as well as other security/compliance practices such as PII handling and SOC 2, is a big plus
- Experience with streaming workloads and integrating Spark with Apache Kafka
- Experience with consuming or authoring REST and/or SOAP web service APIs
- Familiarity with machine learning concepts or AI applications in the context of data engineering
- Experience with Infrastructure as Code (IaC) and the common tools used to implement it
The estimated hiring range for this role is $115,000 - $170,000 (plus applicable bonus and equity). This hiring range may vary by region based upon local market data. Final salary is determined by taking into account a wide range of factors, including but not limited to: skills and experience, licensure and certifications, education, specific location, and dynamic market data.
What CodaMetrix can offer you:
Learn more about our full-time employee benefits and how we take care of our team.
- Health Insurance: We cover 80% of the cost of medical and dental insurance and offer vision insurance
- Retirement: We offer a 401(k) plan that eligible employees can contribute to one month after their first day
- Flexibility: We have a generous Paid Time Off policy, which is managed rather than capped, so you can take the time you need to relax and rejuvenate
- Learning: All new hires complete our 7-week Onboarding Program where they learn about our company and each of our departments through live sessions hosted by a variety of our leaders
- Development: We provide annual performance evaluations and prioritize working with employees on what their individual growth looks like
- Recognition: We recognize the outstanding achievements of our team through annual company awards where employees have the opportunity to nominate their peers
- Office Location: A modern open plan workspace located in the bustling Back Bay neighborhood of Boston
- Additional Employer Paid Benefits: We offer employer-paid life insurance and short-term and long-term disability insurance
Background Check Notice
All candidates will be required to complete a background check upon acceptance of a job offer.
Equal Employment Opportunity
Our company, as well as our products, are made better because we embrace diverse skills, perspectives, and ideas. CodaMetrix is an Equal Employment Opportunity Employer and all qualified applicants will receive consideration for employment.
Don’t meet every requirement? We invite you to apply anyway. Studies have shown that women, communities of color, and historically underrepresented talent are less likely to apply to jobs unless they meet every single qualification. At CodaMetrix we are committed to building a diverse, inclusive, and authentic workplace, and we encourage you to consider joining us.
Location: Boston, MA/Remote - Hybrid
Job Type: Full-time, exempt, regular