
(Senior) Site Reliability Engineer (m/f/d) - Platform & Agentic Operations
1KOMMA5°
At 1KOMMA5°, we pursue a clear vision: Living on wind and sunlight forever for free. To make this a reality, we are building the energy system of the future with Heartbeat AI. Want to be part of it?
We bring together regional craftsmanship and scalable software: We don't think of solar, batteries, heat pumps, and e-mobility as isolated components, but control them as an intelligent, integrated overall system in our virtual power plant. Directly connected to the electricity market, in real time and fully automated. This way, energy is used when it is available from renewables and particularly cost-effective. By 2030, our goal is to transition 1.5 million households to renewable energy. Over 3,000 people are working towards this every day, at more than 80 locations worldwide, from Finland to Australia.
Want to take responsibility and build solutions that truly matter? Apply now and help us shape the energy world of tomorrow.
Your Role
1KOMMA5° is building Europe’s largest virtual power plant ("Heartbeat AI"). As a Senior SRE in our Platform team, you will bridge classic infrastructure with Agentic Engineering, specifically focusing on leveraging AI agents to eliminate developer friction, optimize CI/CD pipelines, and automate the resolution of code review and deployment bottlenecks.
Tech Stack
Cloud & Infra: GCP (Cloud Run, GKE), Terraform, Terramate
Reliability: Incident.io, Datadog (OpenTelemetry)
Agentic: Cursor
CI/CD & DevEx: GitHub Actions, Backstage
Languages: Python, Go, TypeScript
Key responsibilities include, but are not limited to:
Implement and improve monitoring, alerting, and incident response systems and processes to ensure high reliability for our customers and meet defined SLOs
Design, build, and maintain resilient, scalable infrastructure utilizing SRE principles and best practices
Attend post-incident reviews, detect patterns and contribute to continuous improvement efforts
Execute performance testing, analyze system bottlenecks, and formulate strategies for capacity planning to ensure our systems meet current and future demands effectively
Build systems where CI/CD test failures serve as immediate, real-time context for agents, enabling them to analyze logs, trace dependencies, and suggest or apply instant code fixes
Your Profile
6+ years in SRE, DevOps, or Platform Engineering
Strong understanding and practical application of Site Reliability Engineering (SRE) principles, methodologies, and best practices
Proficiency in programming/scripting languages such
Listed via Jobicy (jobicy.com)