Staff Engineer-DevOPs
Position summary:
We are seeking a Staff Engineer – DevOps with 8-12 years of experience in designing, implementing, and optimizing CI/CD pipelines, cloud infrastructure, and automation frameworks. The ideal candidate will have expertise in Kubernetes, Terraform, CI/CD, Security, Observability, and Cloud Platforms (AWS, Azure, GCP). You will play a key role in scaling and securing our infrastructure, improving developer productivity, and ensuring high availability and performance.
Key Roles & Responsibilities:
- Design, implement, and maintain CI/CD pipelines using tools like Jenkins, GitLab CI/CD, ArgoCD, and Tekton.
- Deploy and manage Kubernetes clusters (EKS, AKS, GKE) and containerized workloads.
- Automate infrastructure provisioning using Terraform, Ansible, Pulumi, or CloudFormation.
- Implement observability and monitoring solutions using Prometheus, Grafana, ELK, OpenTelemetry, or Datadog.
- Ensure security best practices in DevOps, including IAM, secrets management, container security, and vulnerability scanning.
- Optimize cloud infrastructure (AWS, Azure, GCP) for performance, cost efficiency, and scalability.
- Develop and manage GitOps workflows and infrastructure-as-code (IaC) automation.
- Implement zero-downtime deployment strategies, including blue-green deployments, canary releases, and feature flags.
- Work closely with development teams to optimize build pipelines, reduce deployment time, and improve system reliability.
Basic Qualifications:
- A bachelor’s or master’s degree in computer science, electronics engineering or a related field
- 8-12 years of experience in DevOps, Site Reliability Engineering (SRE), or Infrastructure Automation.
- Strong expertise in CI/CD pipelines, version control (Git), and release automation.
- Hands-on experience with Kubernetes (EKS, AKS, GKE) and container orchestration.
- Proficiency in Terraform, Ansible for infrastructure automation.
- Experience with AWS, Azure, or GCP services (EC2, S3, IAM, VPC, Lambda, API Gateway, etc.).
- Expertise in monitoring/logging tools such as Prometheus, Grafana, ELK, OpenTelemetry, or Datadog.
- Strong scripting and automation skills in Python, Bash, or Go.
- Knowledge of secrets management tools (Vault, AWS Secrets Manager, etc.).
- Strong troubleshooting and debugging skills in distributed systems and cloud environments.
Preferred Qualifications
- Experience in FinOps Cloud Cost Optimization) and Kubernetes cluster scaling.
- Exposure to serverless architectures and event-driven workflows.
- Contributions to open-source DevOps projects.