Performance Engineer

  • Remote - United States

Remote

Software Development

Mid-level

Job description

About this role

Writer is seeking a highly skilled and motivated Principal Performance Engineer to lead the performance optimization of our cutting-edge Generative AI technology stack. This role is critical in ensuring the scalability, efficiency, and reliability of our Large Language Models (LLMs) and Retrieval Augmented Generation (RAG) systems. You will be a key driver in identifying and resolving performance bottlenecks, optimizing resource utilization, and ensuring a seamless user experience. You will work closely with our AI research, software engineering, and infrastructure teams to deliver world-class AI solutions.

🦸🏻‍♀️ Your responsibilities

  • Performance Leadership:

    • Define and implement performance engineering strategies for our full Generative AI stack, including services, applications, LLMs, RAG pipelines, and related infrastructure.

    • Lead performance testing, profiling, and analysis efforts to identify and resolve performance bottlenecks.

    • Establish and maintain performance benchmarks and SLAs for critical AI services.

    • Provide technical leadership and mentorship to performance engineering team members.

  • LLM Capacity and Tuning:

    • Analyze and improve LLM inference performance, including latency, throughput, and resource utilization.

    • Develop and implement strategies for LLM capacity planning and scaling.

    • Collaborate with AI researchers to evaluate and improve LLM architectures and training techniques for performance.

    • Optimize LLM inference through techniques such as quantization, distillation, and optimized kernel implementation.

  • RAG Performance Optimization:

    • Design and implement performance tests for RAG pipelines, including retrieval, ranking, and generation components.

    • Identify and optimize performance bottlenecks in RAG systems, such as database queries, vector search, and document processing.

    • Evaluate and optimize RAG system architectures for scalability and efficiency.

    • Tune vector databases for optimal recall and latency.

  • Infrastructure Optimization:

    • Collaborate with infrastructure teams to optimize hardware and software configurations for AI workloads.

    • Evaluate and recommend new technologies and tools for performance monitoring and analysis.

    • Develop and maintain performance dashboards and reports to track key metrics.

    • Optimize GPU utilization and memory management for LLM inference.

  • Collaboration and Communication:

    • Work closely with AI researchers, software engineers, and product managers to ensure performance requirements are met.

    • Communicate performance findings and recommendations to stakeholders at all levels.

    • Stay up to date with the latest developments in Generative AI and performance engineering.

⭐️ Is this you?

  • Education:

    • Bachelor’s degree in Computer Science, Engineering, or a related field (Master’s preferred).

  • Experience:

    • 10+ years of experience in performance engineering, with a focus on large-scale distributed systems.

    • 2+ years of experience working with AI/ML technologies.

    • Proven experience in performance testing, profiling, and analysis of complex software systems.

    • Deep understanding of NLP architectures, training, and inference.

    • Experience with vector databases and search technologies.

    • Experience with cloud computing platforms (e.g., AWS, Azure, GCP) and containerization technologies (e.g., Docker, Kubernetes).

    • Strong programming skills in Python.

    • Experience with performance analysis tools (e.g., profilers, debuggers, monitoring tools).

  • Skills:

    • Strong analytical and problem-solving skills.

    • Excellent communication and collaboration skills.

    • Ability to work in a fast-paced and dynamic environment.

    • Passion for AI and a desire to push the boundaries of performance engineering.

🍩 Benefits & perks (US Full-time employees)

  • Generous PTO, plus company holidays

  • Medical, dental, and vision coverage for you and your family

  • Paid parental leave for all parents (12 weeks)

  • Fertility and family planning support

  • Early-detection cancer testing through Galleri

  • Flexible spending account and dependent FSA options

  • Health savings account for eligible plans with company contribution

  • Annual work-life stipends for:

    • Home office setup, cell phone, internet

    • Wellness stipend for gym, massage/chiropractor, personal training, etc.

    • Learning and development stipend

  • Company-wide off-sites and team off-sites

  • Competitive compensation, company stock options, and 401(k)

Writer is an equal-opportunity employer and is committed to diversity. We don’t make hiring or employment decisions based on race, color, religion, creed, gender, national origin, age, disability, veteran status, marital status, pregnancy, sex, gender expression or identity, sexual orientation, citizenship, or any other basis protected by applicable local, state or federal law. Under the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.

By submitting your application on the application page, you acknowledge and agree to Writer’s Global Candidate Privacy Notice.
