Summary
Join Perplexity's small team in revolutionizing the way people search and interact with the internet. As a Site Reliability Engineer (SRE), you will lead the design, implementation, and scaling of infrastructure and systems that support web and mobile products.
Requirements
- Strong experience with cloud infrastructure built on AWS
- Proficiency in database management and caching strategies
- Excellent problem-solving and troubleshooting skills, with the ability to analyze, debug, and resolve complex technical issues
- Experience working with containers and orchestration tools (Docker, Kubernetes)
- Excellent communication and collaboration skills
- Experience with Python and Terraform
- 4+ years of SRE experience
Responsibilities
- Design and implement highly available, high-performance, and scalable systems
- Maintain and optimize key-value and relational databases
- Scale and load-balance web server backends to meet rapidly changing needs
- Monitor systems and applications, proactively identifying and resolving reliability, scalability, or performance issues
- Develop monitoring tools, alerts, and dashboards to provide visibility into system health and performance
- Collaborate with product engineering and security engineering teams to develop scalable automations
Benefits
Comprehensive health, dental, and vision insurance for you and your dependents, plus a 401(k) plan.