Summary
This role is for a DevOps Engineer responsible for managing and maintaining cloud environments, running pre-production testing, ensuring system uptime, supporting CI/CD, enforcing security, troubleshooting, improving architecture, evaluating new technology, and following up on support tickets. It requires at least 3 years’ experience with a range of cloud technologies, core experience as a DevOps Engineer in a 24x7 uptime environment, experience with monitoring tools, strong scripting skills, and an AWS or Azure certification.
Requirements
- At least 3 years’ experience using a broad range of cloud technologies (e.g. EC2, ECR, ECS, EKS, Lambda, RDS, ELB, EFS, EBS, S3, VPC, SNS, SQS, SES, CloudWatch, Route 53) to develop and maintain Amazon AWS-based and/or Microsoft Azure-based cloud solutions
- Core experience as a DevOps Engineer in a 24x7 uptime Amazon AWS / Microsoft Azure environment, including automation experience with configuration management tools
- Monitoring Tools: Experience with machine-generated big data searching, monitoring and analysis using tools such as Splunk
- Strong experience building CI/CD pipelines with tools such as Jenkins, GitLab, SonarQube, Maven, and Twistlock, with special emphasis on integrating the tools to form the pipeline
- Linux and Windows system administration
- Understanding network topologies and common network protocols and services (DNS, HTTP(S), SSH, FTP, SMTP)
- Problem Solving: Ability to analyze and resolve complex infrastructure resource and application deployment issues
- Strong scripting (e.g. CDK, CloudFormation, Terraform) and automation skills
- Certification: AWS or Azure technical certification
Responsibilities
- Deploy, automate, maintain, and manage cloud-based production, pre-production and dev environments
- Build, release, and configuration management of the above environments
- Pre-production Acceptance Testing to help assure the quality of products/services
- Ensure 24/7 system uptime
- Ensure smooth 24/7 operation of CI/CD pipelines and DevOps processes
- Design, implement, and execute Backup and Recovery and Business Continuity processes
- Design, implement, and execute security standards
- System troubleshooting and problem solving across platform and application domains
- Suggest architecture improvements and recommend process improvements
- Evaluate new technology options and vendor products
- Ensure critical system security using best-in-class cloud security solutions
- Monitor maintenance and outages, assess impact, and develop strategies to minimize impact
- Create and follow up on AWS / Azure support tickets
Preferred Qualifications
- OO Programming Skills: Strong object-oriented programming skills (Java or C#)
- DB Skills: Basic DB administration experience (Oracle, SQL Server). Experience with NoSQL DBs
- Agile Methodologies: Experience with Agile project management methodologies