Senior Site Reliability Engineer

Bengaluru, India · On-site · 5d ago

Secure Every Identity, from AI to Human

Identity is the key to unlocking the potential of AI. Okta secures AI by building the trusted, neutral infrastructure that enables organizations to safely embrace this new era. This work requires a relentless drive to solve complex challenges with real-world stakes. We are looking for builders and owners who operate with speed and urgency and execute with excellence.

This is an opportunity to do career-defining work. We're all in on this mission. If you are too, let's talk.

As a Senior Site Reliability Engineer you will champion all things pertaining to reliability at Okta for Auth0. Working closely with the Product Engineers, Quality Engineers, Platform Engineers and Architecture teams, your primary focus will be on ensuring production systems remain operational at all times, while continually setting and achieving long-term performance, reliability and scalability goals in a platform with an exponential growth plan for the coming years.

With Okta’s increased dedication to ensuring customer availability expectations are exceeded in every way, you will play a key role as we evolve our system architecture to meet the demands of enormous growth and support the hundreds of millions of users who rely on us to provide uninterrupted access to business-critical enterprise and consumer applications.

Skills

Exceptional communication skills, including technical writing in the English language
Systematic problem-solving approach, coupled with a strong sense of ownership and drive
Understanding of microservices, cloud infrastructure (AWS, Azure), databases (SQL, NoSQL, key/value), containers (Docker, Kubernetes), web technologies (WebSockets, HTTP) and networking (SSL, routing, VPN)
Live and breathe SLIs, SLOs, error budgets and SLAs
Strong belief in automating everything and reducing toil for yourself and teammates
Enjoys working as part of a team, but can also work effectively in a remote environment where tasks may be self-driven
Knowledge of Datadog or another observability platform is desired
Ability to handle 24/7 on-call duty (on a rotational basis) independently

Responsibilities

Work with other teams to run, own and improve incident response processes
Participate in regular on-call rotations to ensure 24/7 coverage of all critical systems
Use existing monitoring tools to identify problems, then resolve them or escalate to service teams
Implement changes

About the company

Okta

Identity and access management.
