Software Engineer, Caching Infrastructure
OpenAI
About the Team
At OpenAI, we’re building safe and beneficial artificial general intelligence. We deploy our models through ChatGPT, our APIs, and other cutting-edge products. Behind the scenes, making these systems fast, reliable, and cost-efficient requires world-class infrastructure.
The Caching Infrastructure team builds the caching layer that powers many critical systems at OpenAI. We aim to provide a high-availability, multi-tenant cache platform that scales automatically with workload, minimizes tail latency, and supports a diverse range of use cases.
We’re looking for an experienced engineer to help design and scale this critical infrastructure. The ideal candidate has deep experience in distributed caching systems (e.g., Redis, Memcached), networking fundamentals, and Kubernetes-based service orchestration.
In This Role, You Will:
Design, build, and operate OpenAI’s multi-tenant caching platform used across inference, identity, quota, and product experiences.
Define the long-term vision and roadmap for caching as a core infrastructure capability, balancing performance, durability, and cost.
Collaborate with other infrastructure teams (e.g., networking, observability, databases) and product teams to ensure our caching platform meets their needs.
You Might Thrive In This Role If You:
Have 5+ years of experience building and scaling distributed systems, with a strong focus on caching, load balancing, or storage systems.
Have deep expertise with Redis, Memcached, or similar solutions, including clustering, durability configurations, client-side connection patterns, and performance tuning.
Have production experience with Kubernetes, service meshes (e.g., Envoy), and autoscaling systems.
Think rigorously about latency, reliability, throughput, and cost in designing platform capabilities.
Thrive in a fast-paced environment and enjoy balancing pragmatic engineering with long-term technical excellence.
About OpenAI
OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its