Job Description
Data Engineer (Python, SQL, ETL, Airflow, Snowflake, BigQuery)
Full-Time | Remote | U.S. Business Hours
About the Role
We're hiring a highly technical Data Engineer to build and maintain scalable data pipelines, cloud data infrastructure, and analytics-ready datasets that power business decision-making.
This role is focused on:
- ETL/ELT pipeline development
- Data warehouse architecture
- SQL optimization
- Cloud-based data infrastructure
- Pipeline reliability & monitoring
- Scalable analytics systems
You'll work closely with:
- Data Analysts
- Data Scientists
- Engineering Teams
- BI & Leadership Teams
to ensure the organization always has accurate, clean, and trustworthy data.
If you:
- enjoy building robust data systems,
- love optimizing pipelines and queries,
- and care deeply about data quality and scalability,
this role is a strong fit.
What You'll Own
ETL / ELT Pipeline Development
Build and maintain scalable ETL/ELT pipelines using:
- Python
- SQL
- Scala
Ingest data from:
- APIs
- SaaS platforms
- relational databases
- cloud applications
- streaming systems
Develop reliable workflows (see the sketch after this list) for:
- data extraction
- transformation
- loading
- validation
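By way of illustration, here is a minimal extract-transform-load sketch in Python. It shows the flavor of the work rather than a prescribed stack; the API endpoint, field names, and loader are hypothetical placeholders:

```python
import logging
import requests

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("etl")

API_URL = "https://api.example.com/v1/orders"  # hypothetical source endpoint


def extract(session: requests.Session) -> list[dict]:
    """Pull raw records from the source API, failing loudly on HTTP errors."""
    resp = session.get(API_URL, timeout=30)
    resp.raise_for_status()
    return resp.json()["results"]


def transform(records: list[dict]) -> list[dict]:
    """Normalize field names and drop records that fail basic validation."""
    clean = []
    for r in records:
        if r.get("order_id") is None:  # validation: reject rows missing the key
            log.warning("Dropping record without order_id: %s", r)
            continue
        clean.append({
            "order_id": r["order_id"],
            "amount_usd": float(r.get("amount", 0.0)),
            "created_at": r.get("created_at"),
        })
    return clean


def load(rows: list[dict]) -> None:
    """Stub loader; in production this would write to Snowflake or BigQuery."""
    log.info("Loading %d rows into analytics.orders", len(rows))


def run() -> None:
    with requests.Session() as session:
        load(transform(extract(session)))


if __name__ == "__main__":
    run()
```

In practice a job like this would run under an orchestrator rather than by hand, which is where the next section comes in.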
Workflow Orchestration & Automation
Manage orchestration platforms such as:
- Apache Airflow
- Prefect
- Dagster
- Luigi
Monitor:
- pipeline health
- failed jobs
- scheduling reliability
Build automated workflows (see the example DAG after this list) with:
- retries
- alerting
- dependency management
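For a sense of the orchestration work, a minimal Airflow DAG with retries, failure alerting, and task dependencies might look like the sketch below. The DAG id, schedule, callables, and alert address are hypothetical, and the `schedule` parameter assumes Airflow 2.4+:

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_fn(**context):
    ...  # pull raw data (placeholder)


def load_fn(**context):
    ...  # write to the warehouse (placeholder)


default_args = {
    "retries": 3,                          # automatic retries on failure
    "retry_delay": timedelta(minutes=5),
    "email_on_failure": True,              # alerting via Airflow's email backend
    "email": ["data-alerts@example.com"],  # hypothetical alert address
}

with DAG(
    dag_id="orders_daily",                 # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
    default_args=default_args,
) as dag:
    extract = PythonOperator(task_id="extract", python_callable=extract_fn)
    load = PythonOperator(task_id="load", python_callable=load_fn)
    extract >> load                        # dependency management
```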
Data Warehousing & Modeling
Design and optimize cloud data warehouses using:
- Snowflake
- BigQuery
- Redshift
Develop:
- star schemas
- snowflake schemas
- analytics-ready data models
Improve (see the example after this list):
- query performance
- clustering
- partitioning
- warehouse efficiency
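As one concrete example of the partitioning and clustering work, here is a hypothetical sketch using the BigQuery Python client; the project, dataset, and column names are placeholders:

```python
from google.cloud import bigquery

client = bigquery.Client()  # uses application-default credentials

table_id = "my-project.analytics.fact_orders"  # hypothetical fact table

schema = [
    bigquery.SchemaField("order_id", "STRING", mode="REQUIRED"),
    bigquery.SchemaField("customer_id", "STRING"),
    bigquery.SchemaField("amount_usd", "NUMERIC"),
    bigquery.SchemaField("order_date", "DATE", mode="REQUIRED"),
]

table = bigquery.Table(table_id, schema=schema)

# Partition by date so queries that filter on order_date scan less data...
table.time_partitioning = bigquery.TimePartitioning(
    type_=bigquery.TimePartitioningType.DAY,
    field="order_date",
)
# ...and cluster by customer_id so per-customer lookups stay cheap.
table.clustering_fields = ["customer_id"]

client.create_table(table, exists_ok=True)
```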
Data Quality & Governance
Implement (see the sketch at the end of this section):
- validation checks
- anomaly detection
- logging systems
- lineage tracking
Use tools such as:
- dbt
- Great Expectations
Ensure:
- consistent naming conventions
- clean transformations
- audit-ready datasets
Support compliance requirements:
- GDPR
- HIPAA
- industry-specific governance standards
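To give a flavor of the validation work, here is a small hand-rolled check in the spirit of Great Expectations-style suites. It is a sketch with hypothetical column names, not the Great Expectations API itself:

```python
import logging

import pandas as pd

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("dq")


def validate_orders(df: pd.DataFrame) -> list[str]:
    """Run basic data quality checks and return a list of failure messages."""
    failures = []
    if df["order_id"].isnull().any():          # completeness check
        failures.append("order_id contains nulls")
    if df["order_id"].duplicated().any():      # uniqueness check
        failures.append("order_id contains duplicates")
    if (df["amount_usd"] < 0).any():           # range/anomaly check
        failures.append("amount_usd contains negative values")
    return failures


# Tiny demonstration frame; a real check would run against warehouse extracts.
df = pd.DataFrame({
    "order_id": ["a1", "a2", "a2"],
    "amount_usd": [10.0, -3.0, 7.5],
})
for failure in validate_orders(df):
    log.error("Data quality check failed: %s", failure)
```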
Streaming & Real-Time Data
Build and maintain streaming pipelines (see the consumer sketch below) using:
- Kafka
- Kinesis
- Pub/Sub
Support:
- real-time ingestion
- event-driven processing
- low-latency analytics workflows
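For illustration, a minimal real-time ingestion loop using the kafka-python client might look like this; the topic, brokers, and consumer group are hypothetical:

```python
import json

from kafka import KafkaConsumer  # kafka-python package

# Topic, bootstrap servers, and group id are placeholder values.
consumer = KafkaConsumer(
    "order-events",
    bootstrap_servers=["localhost:9092"],
    group_id="analytics-ingest",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    auto_offset_reset="earliest",
)

for message in consumer:
    event = message.value
    # A real pipeline would validate the event and write it to the
    # warehouse or a staging area; here we just print a couple of fields.
    print(event.get("order_id"), event.get("amount"))
```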
Infrastructure & DevOps
Containerize services using:
- Docker
- Kubernetes
Build CI/CD workflows with:
- GitHub Actions
- Jenkins
- GitLab CI
Manage cloud infrastructure using:
- Terraform
- CloudFormation
Improve scalability, reliability, and deployment automation
Cross-Functional Collaboration
Partner with:
- analysts
- data scientists
- BI teams
- product teams
Deliver curated datasets for:
- dashboards
- analytics
- machine learning workflows
Support BI tools such as:
- Tableau
- Looker
- Power BI
Maintain documentation for:
- pipelines
- schemas
- workflows
- data definitions
Required Experience & Skills
3+ years of data engineering or backend engineering experience
Strong proficiency with:
- Python
- SQL
Experience with:
- Snowflake
- BigQuery
- Redshift
Familiarity with:
- Airflow
- Prefect
- workflow orchestration tools
Strong understanding of:
- ETL pipelines
- data modeling
- cloud infrastructure
- warehouse optimization
Ideal Experience
Experience using:
- dbt
- Great Expectations
- data lineage tools
Streaming experience with:
- Kafka
- Kinesis
- Pub/Sub
Experience with:
- AWS Glue
- GCP Dataflow
- Azure Data Factory
Background in:
- healthcare
- fintech
- regulated environments
Experience optimizing large-scale warehouse costs and performance
What Makes You a Great Fit
- You care deeply about clean and reliable data
- You enjoy debugging complex pipeline and infrastructure issues
- You think about scalability and long-term maintainability
- You combine engineering rigor with analytical thinking
- You communicate effectively across technical and non-technical teams
What a Typical Day Looks Like
- Review Airflow/Prefect pipeline health and resolve failures
- Build connectors for new APIs or SaaS platforms
- Optimize SQL queries and warehouse performance
- Collaborate with analysts and data scientists on datasets
- Improve validation and monitoring systems
- Document pipelines and warehouse structures
- Reduce warehouse costs and improve pipeline reliability
In short:
You build the data infrastructure that powers analytics, reporting, automation, and business intelligence across the organization.
Key Success Metrics (KPIs)
- Pipeline uptime ≥ 99%
- Data freshness within SLA
- Zero critical data quality issues reaching production
- Query performance & warehouse cost optimization
- Reliable and scalable pipeline infrastructure
- Positive feedback from analysts, BI teams, and leadership
Why This Role Stands Out
Work on modern cloud-native data infrastructure
Build scalable ETL and analytics systems
Exposure to:
- streaming pipelines
- cloud data platforms
- orchestration frameworks
- warehouse optimization
Opportunity to grow into:
- Senior Data Engineer
- Analytics Engineering
- Platform Engineering
- Data Architecture
Fully remote flexibility with collaborative engineering teams
Interview Process
- Initial Phone Screen
- Video Interview with Pavago Recruiter
- Technical Task (build a small ETL pipeline or optimize a SQL query)
- Client Interview with Engineering/Data Team
- Offer & Background Verification
Apply Now
If you:
- love building scalable data systems,
- enjoy solving complex pipeline problems,
- and want to work with modern data infrastructure,
this role is a strong fit for you.