Site Reliability Engineer (m/f/d)

Köln, DE · On-site · Engineering · Posted today

At our company, it’s all about #OneTeam! Join gridscale and help shape the future of the cloud together with OVH.

As a leading tech company, we’ve been working for over two decades to reduce our environmental footprint - with innovative solutions and an open cloud designed to be sustainable from the ground up: #SustainableByDesign.

Our Tech Stack 🚀

OpenStack · Kubernetes · KVM · Linux · Bare-metal · Ansible · Terraform · FluxCD / ArgoCD · Git · Go · Python · Claude Code / Cursor / agentic coding tooling

Your Role 💻

You'll help build, operate, and industrialize OVHcloud's on-premise cloud platform (OPCP). You'll join a small, senior team that owns the OpenStack-based infrastructure and the Kubernetes / GitOps stack our customer-facing cloud runs on, and that treats AI-assisted engineering as a first-class part of how it works.

The platform is actively in build mode, so joining now means real influence on the architecture, the automation strategy, and how we adopt AI in platform engineering. As a senior engineer, you shape the focus of your role around your strengths and interests: there's a clear backbone of automation, compute-lifecycle, and platform work, plus an explicit AI-substrate workstream. You're at home in a security-oriented, highly automated (GitOps) environment, keep a clear overview in ambiguous situations, and make well-founded decisions on that basis.

Your Tasks

Design and build OpenStack-based on-prem infrastructure that deploys itself autonomously - discovering available hardware and bringing up a functional datacenter in minutes.
Develop Infrastructure as Code with Ansible and Terraform - typically spec-first with LLM assistance, then human-validated; push this further via custom agent / sub-agent setups, agentic test generation, and prompt-engineered review loops.
Drive the ongoing development of our Kubernetes stack and GitOps workflows (FluxCD / ArgoCD).
Own the full lifecycle of our compute infrastructure - from bare-metal (firmware, provisioning, hardware health) through hypervisors to virtual compute nodes - and build the automation that keeps capacity healthy and rolls out updates without disturbing tenant workloads.
Build and extend the AI substrate that compounds our output: Markdown knowledge bases as retrieval substrate, agentic prototypes for incident triage and capacity planning, and deeper integration of agentic coding tools into daily work.
Contribute to the self-healing direction, turning today's manual runbooks into tomorrow's reasoning agents. Auto-remediation isn't a separate team here - it's how platform work is meant to land.
Design and implement test suites aligne

About the company

gridscale GmbH

Listed via

Arbeitnow

arbeitnow.com
