Site Reliability Engineer
Adobe
- San Jose, CA
- Permanent
- Full-time
- Drive initiatives in collaboration with multi-functional teams across RTCDP and beyond to define, implement, measure, and report on SLx (SLO, SLT, etc.) for our infrastructure and services.
- Participate in the design, implementation, and support of core database infrastructure based on FoundationDB, Aerospike, and Postgres, serving as the primary point of contact and domain expert.
- Participate in capacity planning, performance analysis, and tuning activities to optimize system performance and resource utilization.
- Design, develop, and improve operational processes and automation using infrastructure-as-code (IaC) tools such as Terraform, Pulumi, or Crossplane.
- Collaborate with multi-functional teams, contribute to architectural decisions, and ensure the reliability and scalability of our infrastructure and services.
- Participate in the on-call rotation, providing 24/7 support (follow-the-sun model) to maintain the stability and availability of our infrastructure and services.
- Experience working as a Site Reliability Engineer or in a similar role.
- Proficiency and hands-on experience with container orchestration platforms such as Kubernetes (e.g., AKS, OpenShift).
- Technical expertise in areas such as observability (e.g., Prometheus, Alertmanager), infrastructure as code (IaC) (e.g., Terraform, Pulumi, Crossplane), and CI/CD (e.g., Argo, Jenkins).
- Coding and scripting skills, with proficiency in Python, Go, or Bash.