Senior Data Collection Engineer

at Centric Software
Remote - Spain


Job description

In today’s complex retail landscape, characterized by economic fluctuations and supply chain challenges, consumers are more discerning, often comparing prices and seeking compelling products. Centric Pricing™ addresses this by enabling retailers and brands to gain a deep understanding of the competitive landscape post-product launch. By leveraging AI-driven insights, businesses can make informed decisions quickly, aligning product development, sourcing, costing, and pricing strategies with real-time market demands.

The integration of Centric Pricing™ into Centric Software’s platform provides an end-to-end solution that combines intelligence and execution capabilities. This empowers brands and retailers to optimize product availability, reduce time to market, and enhance product quality, ultimately improving the consumer experience and driving profitability.

We are a key innovation partner for iconic and emerging brands across the world. Our platform analyzes data from more than 1,000 retailers and more than 600,000 brands, tracking millions of products.

As a Data Collection Engineer, you will be instrumental in building scalable and high-quality data collection systems, collaborating across teams to drive innovation and maintain the robustness of our data pipeline.

What you’ll do:

  • Design and Build Robust Web Crawlers

    • Develop and maintain spiders for high-scale data extraction using Scrapy.
    • Ensure spiders are modular, reusable, and easy to maintain with components such as loaders, middlewares, and pipelines.
    • Apply advanced techniques to bypass anti-bot mechanisms, including rotating proxies, captcha-solving strategies, and fingerprinting.
  • Enhance and Maintain Infrastructure

    • Build scalable CI/CD pipelines for automated testing, deployment, and monitoring of spiders.

    • Leverage tools like Scrapyd for centralized spider scheduling and lifecycle management.

    • Ensure efficient parallelization and cloud deployment for high-throughput crawling.

  • Code Quality and Consistency

    • Uphold coding standards and implement consistent practices across teams.

    • Conduct thorough code reviews and mentor junior engineers on clean code principles.

    • Maintain version control and detailed change logs for spider development.

  • Monitoring, Maintenance & Reliability

    • Integrate performance monitoring systems to ensure real-time alerts and health checks.

    • Schedule periodic spider audits to handle site structure changes and improve reliability.

    • Troubleshoot failures and optimize resource usage (CPU/network) for crawling efficiency.

  • Data Integrity and Accuracy

    • Build robust data validation mechanisms to guarantee quality outputs.

    • Collaborate with internal consumers to ensure data collected aligns with business requirements.

    • Continuously track data anomalies and automate recovery strategies.

  • Collaboration and Knowledge Sharing

    • Work cross-functionally with product, engineering, and other data teams.

    • Promote a culture of documentation, onboarding tools, and internal knowledge bases.

    • Contribute to training initiatives, helping the team stay current on scraping techniques and technologies.
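To give a flavor of the anti-bot work described above, here is a minimal sketch of a rotating-proxy downloader middleware in the style Scrapy uses. The proxy URLs and pool-management strategy are illustrative assumptions, not Centric's actual setup; Scrapy middlewares are plain classes with a `process_request` hook, so the rotation logic can be demonstrated without the framework installed.

```python
import itertools
from types import SimpleNamespace

class RotatingProxyMiddleware:
    """Assigns each outgoing request the next proxy from a round-robin pool."""

    def __init__(self, proxies):
        self._pool = itertools.cycle(proxies)

    def process_request(self, request, spider):
        # Scrapy's HTTP downloader reads the proxy for a request
        # from request.meta["proxy"].
        request.meta["proxy"] = next(self._pool)
        return None  # returning None lets the request continue through the chain

# Quick demonstration with a stand-in for scrapy.Request:
middleware = RotatingProxyMiddleware(["http://proxy-a:8080", "http://proxy-b:8080"])
requests = [SimpleNamespace(meta={}) for _ in range(3)]
for r in requests:
    middleware.process_request(r, spider=None)
```

In production, the pool would typically be fed from a proxy-provider API, and proxies that start returning bans would be retired rather than cycled forever.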
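The data-integrity responsibilities above can likewise be sketched as a Scrapy-style item pipeline. The field names (`product_id`, `retailer`, `price`, `currency`) are hypothetical, not Centric's actual schema; in a real Scrapy project the rejection path would raise `scrapy.exceptions.DropItem`, while this sketch uses `ValueError` so it runs standalone.

```python
class PriceValidationPipeline:
    """Rejects items with missing fields or implausible prices; normalizes the rest."""

    REQUIRED_FIELDS = ("product_id", "retailer", "price", "currency")

    def process_item(self, item, spider):
        missing = [f for f in self.REQUIRED_FIELDS if not item.get(f)]
        if missing:
            raise ValueError(f"missing fields: {missing}")
        price = float(item["price"])
        if price <= 0:
            raise ValueError(f"non-positive price: {price}")
        item["price"] = round(price, 2)  # normalize to two decimal places
        return item
```

Because the pipeline is a plain class, the validation rules can be unit-tested in isolation, which is what makes "robust data validation mechanisms" maintainable at this scale.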

Desired technical skills and experience

Core requirements:

  • Comfort with Git workflows, code reviews, and CI/CD pipelines

  • Experience with cloud infrastructure like AWS

  • Experience with monitoring/observability systems such as Grafana and Sentry.

  • Knowledge of the web environment (standards, the DOM, the request/response model, cookies, JavaScript, browsers, headers, XHR, etc.).

  • Familiarity with TLS/SSL, TCP/IP stack, and low-level web networking is a strong plus.

Bonus / Senior-Level expectations:

  • Proficient in designing fault-tolerant systems and deploying them at scale

  • Familiarity with containerized deployments

  • Proficient in developing scalable web crawlers and data pipelines using Python and Scrapy.

  • Experience building resilient scraping systems across diverse web architectures

  • Prior experience mentoring or leading junior developers

Soft Skills and Work Ethic

  • Excellent communication skills in English, both written and spoken.

  • A collaborative mindset with a proactive approach to knowledge sharing.

  • Strong analytical thinking and problem-solving abilities.

  • Commitment to continuous improvement, mentoring, and agile team dynamics.

  • Commitment to staying up to date with technology trends to keep our software as innovative as possible.

Centric Software provides equal employment opportunities to all qualified applicants without regard to race, sex, sexual orientation, gender identity, national origin, color, age, religion, protected veteran or disability status or genetic information.
