AI Engineer - LLM Quantization Specialist

🇮🇩 Indonesia - Remote
📊 Data 🔵 Mid-level

Job description

Who We Are

At CloudFactory, we are a mission-driven team passionate about unlocking the disruptive potential of AI for the world. By combining advanced technology with a global network of talented experts, we make unusable data usable and inference reliable and trustworthy, driving real-world business value at scale.

More than just a workplace, we’re a global community founded on strong relationships and the belief that meaningful work transforms lives. Our commitment to earning, learning, and serving fuels everything we do, as we strive to connect one million people to meaningful work and build leaders worth following.

Our Culture

At CloudFactory, we believe in building a workplace where everyone feels empowered, valued, and inspired to bring their authentic selves to work. We are:

  • Mission-Driven: We focus on creating economic and social impact.
  • People-Centric: We care deeply about our team’s growth, well-being, and sense of belonging.
  • Innovative: We embrace change and find better ways to do things, together.
  • Globally Connected: We foster collaboration between diverse cultures and perspectives.

If you’re ready to earn, learn, serve, and be part of a vibrant global community, CloudFactory is your place!

Position Overview

This is a full-time role based in Jakarta, Indonesia. You’ll begin with an on-site phase for the first six months to collaborate closely with our client team, then transition to a hybrid schedule (three days on-site each week). The initial contract is for one year, with the possibility of extension.

About the Role

We are seeking an experienced AI Engineer to join our international AI research and engineering team. Based in Indonesia, you will collaborate closely with our Berlin-based core AI group to develop, optimize, test, and deploy large language models (LLMs) at scale.

Your primary focus will be on quantization, model compression, and performance optimization to ensure efficient inference.

This is an exciting opportunity to be part of a global AI innovation hub, working on next-generation model efficiency and serving a global user base.

Key Responsibilities

  • Develop and implement quantization and pruning strategies for large language models (LLMs) to improve runtime efficiency and reduce memory footprint.

  • Collaborate with the AI Research team on model architecture, fine-tuning, and deployment of multilingual and multimodal models.

  • Evaluate and benchmark quantized models across hardware platforms (GPU, TPU, CPU, edge accelerators).

  • Contribute to the design and maintenance of model optimization pipelines (training, evaluation, conversion, inference).

  • Stay current with cutting-edge research on model compression, distillation, and efficient inference frameworks.

  • Support continuous integration of optimized models into production and internal tools.

  • Document methodologies and share insights across global teams to promote technical excellence and reproducibility.

Required Qualifications:

  • Bachelor’s or Master’s degree in Computer Science, Machine Learning, Electrical Engineering, or related field.

  • 5+ years of professional experience in AI/ML engineering, with a focus on deep learning model optimization.

  • Hands-on experience with quantization techniques (e.g., PTQ, QAT, INT8/FP16 quantization) using frameworks like PyTorch, TensorFlow, or ONNX Runtime.

  • Solid understanding of LLM architectures (e.g., Transformer-based models such as GPT, LLaMA, Mistral, Falcon).

  • Strong programming skills in Python, plus proficiency with CUDA, NumPy, and PyTorch internals.

  • Experience with distributed training/inference and deployment on cloud or edge infrastructure.

  • Excellent communication skills and comfort working in a remote, cross-functional, international environment.

Preferred Qualifications:

  • Experience with quantization-aware training (QAT) and post-training quantization (PTQ).
  • Familiarity with Hugging Face Transformers, DeepSpeed, or TensorRT.
  • Contributions to open-source ML optimization libraries or toolkits.
  • Knowledge of low-level performance profiling and benchmarking (e.g., NVIDIA Nsight, PyTorch Profiler).
  • Prior experience collaborating with global AI research teams across time zones.