Founding AI Engineer, Speech Recognition

Job Description

At Toku, we create bespoke cloud communications and customer engagement solutions to reimagine customer experiences for enterprises. We provide an end-to-end approach to help businesses overcome the complexity of digital transformation and deliver mission-critical CX through cloud communication solutions. Toku combines local strategic consulting expertise, bespoke technology, regional in-country infrastructure, connectivity, and global reach to serve the diverse needs of enterprises operating at scale. Headquartered in Singapore, Toku supports customers across APAC and beyond, with a growing footprint across global markets.

As a Founding AI Engineer, you will lead the development of our speech recognition capabilities, including contributing to open-source models optimised for APAC languages and telephony environments. You will own the entire machine learning pipeline, from model architecture through to deployment and publication on Hugging Face and GitHub. This is a unique opportunity to build technology that will serve billions of people across the Asia-Pacific region and beyond.

Requirements

What you will be doing

Model Development & Training

  • Design and implement telephony-optimised speech recognition models for APAC languages (English variants, Mandarin, Thai, Vietnamese, Indonesian, and more)

  • Develop comprehensive AI model training frameworks using PyTorch on local and cloud GPU infrastructure

  • Create and optimise data augmentation pipelines addressing telephony-specific challenges (8kHz audio, codec artefacts, background noise, SNR optimisation)

  • Build models that handle code-switching common in APAC contexts (Singlish, Hinglish, Taglish)

APAC-Specific Optimisation

  • Address tonal language challenges for Mandarin, Thai, Vietnamese, and other tonal languages

  • Optimise for regional accent variations across target markets

  • Develop evaluation benchmarks specific to APAC telephony contexts, including SNR and audio quality metrics

  • Implement techniques for low-resource language support

Infrastructure & Deployment

  • Build scalable inference systems for real-time and batch processing

  • Create containerised applications for model demonstration and testing

  • Develop APIs for integration with telephony systems

  • Deploy models on local and cloud GPU infrastructure

  • Integrate with Toku’s existing Llama 8B deployment for language model capabilities

Open-Source Contribution (Future)

  • Contribute to the preparation of open-source releases

  • Write comprehensive technical documentation and user guides

  • Conduct performance benchmarking and validation studies

  • Contribute to the broader speech recognition community through publications and presentations

We’d love to hear from you if you have

Required Qualifications

  • Bachelor’s or Master’s degree in Computer Science, Engineering, or related technical field with strong ML foundations

  • 1-3 years of hands-on experience in machine learning projects

  • Excellent Python programming skills

  • Experience with PyTorch and deep learning model training

  • Proficiency in handling large datasets and data preprocessing

  • Understanding of speech processing concepts and techniques

  • Experience with cloud platforms and GPU computing

  • Familiarity with containerisation (Docker) and deployment practices

Preferred Qualifications

  • Portfolio of AI projects (open-source contributions highly valued)

  • Familiarity with OpenAI Whisper and transformer-based architectures

  • Previous experience with speech-to-text or audio processing projects

  • Experience with open-source project development and collaboration

  • Strong technical writing and documentation skills

  • Familiarity with at least one APAC language’s phonological characteristics

  • Understanding of telephony audio characteristics (8kHz sampling, codec artefacts, SNR considerations)

  • Publication history in speech recognition or related fields

Personal Attributes

  • Independent and ownership-driven: ability to take projects from conception to completion

  • Growth-oriented: enthusiasm for learning new technologies

  • Quality-focused: commitment to robust, well-documented code

  • Strong communication and presentation skills

Location:

  • This is a remote/hybrid role based in Singapore, Hong Kong, or the Netherlands (Rotterdam preferred)

Why join Toku?

Mission-Driven Impact: Contribute to democratising speech AI for APAC’s diverse linguistic landscape

Open-Source Leadership: Build your reputation through contributions to bespoke model development

Technical Growth: Work with experienced engineers on state-of-the-art speech AI technologies

Regional Expertise: Become a specialist in an underserved but massive market

Autonomy: Take ownership of significant technical challenges with support to succeed

Benefits and Perks: Training and development, annual bonus and salary review, healthcare coverage based on location, 20 days of paid annual leave plus other leave allowances, and more

Toku has been recognised as a LinkedIn Top Startup and by the Financial Times as one of APAC’s Top 500 High Growth Companies. If you’re looking to be part of a company on a strong growth trajectory while working on meaningful, real-world challenges, we’d love to hear from you.
