Job Description
This position is ideal for a driven DevOps professional passionate about building resilient cloud-native infrastructure and automation pipelines and ensuring high availability across environments.
Infrastructure & Automation
- Lead the design and management of highly available, scalable Kubernetes infrastructure using Amazon EKS.
- Implement and manage Infrastructure as Code (IaC) with Terraform and Terragrunt.
- Champion a Git-first approach to infrastructure management and CI/CD automation.
- Build and maintain robust CI/CD pipelines using Harness to streamline delivery across environments.
- Automate infrastructure provisioning, configuration, and deployment for consistent environment management.
- Design and implement monitoring and observability solutions using New Relic.
- Securely manage secrets and credentials using AWS Secrets Manager.
- Support production systems through incident management, root cause analysis, and proactive reliability improvements.
Collaboration & Continuous Improvement
- Work closely with software engineering and security teams to define best practices for deployment, scalability, and security.
- Drive infrastructure automation, performance optimization, and environment consistency.
- Mentor junior DevOps engineers and developers on modern infrastructure practices.
- Contribute to documentation, monitoring, and operational excellence across product releases.
What You Know
- 4+ years of experience in DevOps, Site Reliability, or Platform Engineering roles supporting cloud-native applications.
- Strong hands-on experience with Kubernetes (EKS), including Helm, networking, and security.
- Deep expertise in Terraform and Terragrunt for AWS infrastructure provisioning.
- Proven experience with CI/CD pipelines using Harness, Git, and related automation tools.
- Proficiency in scripting and automation using Python or Golang.
- Strong understanding of AWS services (EKS, EC2, S3, IAM, RDS, VPC, Route53, CloudWatch, etc.).
- Proficient with Linux systems administration, troubleshooting, and performance tuning.
- Solid knowledge of networking fundamentals (DNS, TLS, HTTP/S, Load Balancing).
- Experience with Ansible for configuration management and provisioning.
- Working knowledge of MySQL and PostgreSQL.
- Strong experience with monitoring, logging, and alerting, particularly using New Relic.
- Familiarity with Kafka, RabbitMQ, or other pub/sub systems is a plus.
Education
- Bachelor’s degree in Computer Science, Engineering, or a related field.