Staff Site Reliability Engineer (Infra)

Bengaluru, India · On-site · 5d ago

Secure Every Identity, from AI to Human

Identity is the key to unlocking the potential of AI. Okta secures AI by building the trusted, neutral infrastructure that enables organizations to safely embrace this new era. This work requires a relentless drive to solve complex challenges with real-world stakes. We are looking for builders and owners who operate with speed and urgency and execute with excellence.

This is an opportunity to do career-defining work. We're all in on this mission. If you are too, let's talk.

What You’ll Be Doing

Design, build, and operate highly scalable, reliable, and secure infrastructure powering our production systems across AWS and GCP.
Lead major reliability and modernization initiatives, including container platform migrations (e.g., ECS to EKS/GKE) and microservice enablement across multi-cloud environments.
Serve as a technical authority in Kubernetes (EKS and GKE), cloud infrastructure (AWS and GCP), and modern CI/CD practices (GitOps, automation pipelines).
Partner with development teams to architect and enable microservice-based applications, ensuring production readiness, scalability, and observability.
Implement and manage infrastructure as code (Terraform, Ansible) to automate provisioning, scaling, and configuration management across multiple cloud providers.
Drive improvements in observability, performance, and cost efficiency through robust monitoring, logging, and alerting systems that span AWS and GCP.
Champion SRE best practices — defining SLOs/SLIs, conducting blameless postmortems, and continuously improving incident response.
Lead complex technical projects from conception to completion, managing timelines and technical dependencies across teams.
Mentor engineers across teams, fostering a culture of reliability, automation, and continuous learning.
Collaborate with security and compliance partners to ensure infrastructure adheres to best practices and standards (e.g., IAM Federation, Workload Identity).
Participate in the on-call rotation, using incidents as learning opportunities to enhance systems and processes.

What You’ll Bring to the Role

Strong hands-on experience architecting and operating cloud-native distributed systems (AWS and GCP).
Deep expertise with Kubernetes (EKS and GKE) — design, provisioning, scaling, and advanced troubleshooting in production.
Proven experience leading ECS to EKS/GKE migrations and driving microservice enablement initiatives at scale.

About the company

Okta

Identity and access management.
