Site Reliability Engineer

Salary Competitive

All Gravy is the all-in-one employee app designed for front-line, hospitality, and retail workers. 🥞

The next generation of employees is used to apps like Instagram, TikTok and ChatGPT in their personal lives, but when they go to work at a restaurant chain or retail store, they're met with software created before they were born. We are using AI to modernize the tools frontline employees rely on, turning work software into something engaging and motivating that makes their lives easier - from coordinating work to developing their careers.

We've grown 3x over the last year and already work with some of the most exciting brands in the Nordics and the UK (Pizza Pilgrims, Honest Burgers, Ottolenghi, Dishoom, Q8, 7-Eleven). Now we're looking to scale even faster.

🤩 About the role

We're looking for an ambitious Site Reliability Engineer to join our Platform team, which owns the foundations the rest of engineering builds on. Our products already power operations for some of the most recognizable brands in the Nordics and the UK, and the bar for uptime, performance, and security keeps rising. Within the team, you'll be the main person responsible for application security, performance, and production health.

As a Site Reliability Engineer on the Platform team, you'll work primarily in the codebase - not in infrastructure configs. Your focus is on how our application behaves in production: whether it's secure, whether it's fast, and whether we know about it when it isn't.

You'll work closely with the rest of the Platform team and with engineering leadership to shape strategy, but this is a hands-on role. You'll be reading code, profiling queries, investigating security vulnerabilities, building monitoring, and responding to incidents - not configuring cloud resources or writing Terraform.

🎯 What You Will Do

Security: Finding, Investigating, and Resolving Issues: Own application security across our stack. Proactively hunt for vulnerabilities in our codebase, triage and resolve security alerts, manage dependency patching, and run point on security incidents. You'll review code and architecture for security risks, and work with product teams to fix issues at the source rather than papering over them at the perimeter.
Performance: Query Optimization and Application Speed: Make our product fast. Profile and optimize database queries, identify slow endpoints and bottlenecks, and work with product teams to fix what matters. From MongoDB query patterns to API response times to mobile app cold starts - you'll dig into the code and data to understand where time is being spent and eliminate waste.
Monitoring & Observability: Define what "good" looks like for our services. Build and maintain the metrics, logs, traces, and alerts that tell the story of what the system is doing. Create dashboards that surface real problems rather than noise, and ensure that when something goes wrong, the right people know about it quickly and have the context to act.
Incident Response: Build and run our incident management practice. Lead incidents when they happen, drive blameless postmortems, and turn findings into durable improvements. Make on-call a reasonable experience for every engineer.
Reliability Culture: Be the person who makes reliability feel like a partner rather than a blocker. Teach engineers to own their services, help them understand production behavior, and level up the whole engineering org on how their code actually runs.

Our Ideal Candidate has

Code-First SRE Mindset: You solve reliability problems by reading and improving application code, not just by tuning infrastructure. You're comfortable diving into a TypeScript codebase, understanding query patterns, and profiling performance at the application layer.
Security Instinct: You can look at a codebase and spot problems - injection risks, auth gaps, insecure defaults, dangerous dependencies. You stay current on common vulnerability patterns and know how to systematically find and fix them.
Database & Query Expertise: You understand how queries translate to performance. You can read an explain plan, spot a missing index, recognize an N+1 problem, and reason about data access patterns at scale.
Pragmatism at Startup Pace: You understand that perfect is the enemy of shipped. You can identify what actually matters, right-size solutions to the stage of the company, and revisit decisions as we grow.
Influence Without Authority: Owning reliability and security across a product means your impact depends on bringing engineers along with you rather than policing them. You teach, you explain the "why," and you build trust with product teams so that reliability and security become shared values.
AI-Native Development: A versatile understanding of how to leverage AI in your day-to-day work. The best candidates are top-tier at working with AI tools, can demonstrate concrete ways they've integrated them into how they operate, and actively contribute back to how the wider team adopts and improves its AI-assisted ways of working.

We have a full-fledged TypeScript-based tech stack at All Gravy. We are looking for exceptional people with great energy and a good sense of humour. The ideal candidate will love to think about how to solve

🙌 You'll thrive here if you

Have solid hands-on experience in SRE, backend engineering, or a security-focused engineering role.
Are comfortable working in application code daily; this is not a pure infrastructure role.
Have a startup or scale-up background. You know what it means to move fast, make pragmatic calls, and revisit them when the time is right.
Have a good sense of humour and are willing to contribute to making this a fun place to work.
Are excited about building a big company and want to grow personally on the way there.
Are passionate, organized and self-driven.
Speak and write fluent English.
Are based within European time zones, whether working from our Copenhagen or London office or remotely.

The ideal candidate has experience with:

Application security: vulnerability discovery, dependency auditing, secure coding practices.
Database performance: query optimization, indexing strategies, profiling slow queries.
Observability tooling: metrics, logs, traces, and alerting (e.g. Datadog, Grafana, or similar).
Incident response and on-call practice.
Working in production TypeScript / Node.js codebases.
MongoDB or other document databases.

An advantage if you have experience with:

Multi-tenant SaaS products.
Performance engineering on mobile or web products.
Penetration testing or security auditing.
Cost optimization at the application layer (query efficiency, caching strategies).

💎 Why All Gravy

Grow fast, for real. We promote from within, move quickly, and give you real responsibility from day one.
Top equipment. You'll receive brand-new equipment to do your best work.
A culture that actually develops you. Honest feedback, active coaching, and the autonomy to experiment. We invest in you because your growth is our growth.
Work that matters. Your work will directly impact the daily lives of millions of frontline workers.
You'll work with people who've built, scaled, and won. Ex-founders, operators, all here because they want to do it again, bigger.
The good stuff too. Prime office location, team lunches, Friday bars, padel tournaments, wine tastings, offsites, top-tier equipment, and a fridge that's always stocked. We work hard and celebrate harder.

💡 Even if you do not tick every box on this page, you might be the perfect fit for the job! We treasure our learning culture and encourage all humble people to apply.

Perks and benefits

This job comes with several perks and benefits

Free coffee / tea

Flexible working hours

Social gatherings

Free office snacks

Near public transit

Free Friday beers

Working at All Gravy

All Gravy: All the people stuff in one employee app - built for hospitality

Hospitality is one of the largest employment sectors in Europe. It's also one of the most chaotic to operate. The average multi-site restaurant group or hotel brand manages hundreds of hourly workers across dozens of locations - none of whom sit at a desk, have a work email, or receive information through the same channel twice.

The result is predictable. New starters show up on day one not knowing what to do. Training happens inconsistently, or not at all. Managers spend half their day answering the same questions on repeat. Communication happens in WhatsApp groups that the business doesn't own, can't audit, and can't control. Staff churn - already the highest of any industry, averaging 75% annually - gets worse. And operators have no way of knowing why.

The tools that exist to solve this were built for office workers. They assume laptops, corporate email, and people who sit still. Hospitality doesn't work like that.

All Gravy is the fix. We're a communications and learning app built specifically for multi-site hospitality operators - restaurant groups, hotel brands, contract caterers, quick-service chains. Our customers typically manage between 200 and 1,000 staff across multiple locations in the UK, Scandinavia, and Germany.

The product gives operators one place to run everything people-related:

Staff communications - a branded, structured feed that works like social media. Managers post to the right people by location, role, or team. Employees are notified. Everything is documented. No more WhatsApp chaos.
Onboarding journeys - automated, role-specific sequences that start the moment a new hire is added to the system. They arrive on day one prepared, connected to the brand, and knowing what's expected of them.
Training and learning - a full LMS designed for deskless workers, supporting both digital courses and in-person sessions, with attendance tracking, completion records, and automated reminders built in.
Digital handbooks - a living, searchable library that employees access from their phone. Update once, everyone sees it instantly. No more outdated PDFs or laminated binders no one reads.
AI assistant - trained on the operator's own content, not the internet. Employees get instant, accurate answers to questions about policies, procedures, and their role. Managers stop answering the same things over and over.

The employee experience looks and feels like a social media app - because that's what people actually use. Adoption is high because the product meets people where they already are. The admin side is a web dashboard where operators build and manage everything centrally, with full visibility into engagement, training completion, onboarding progress, and team sentiment across every location.

The problem is big. There are roughly 90 million deskless workers in Europe. Hospitality alone employs millions, with vacancy and churn rates that cost the sector billions in lost revenue every year. The tools available to operators haven't kept up with what the problem actually requires.

The opportunity is clear. Operators who invest in their people's experience - making them feel informed, connected, and supported from day one - see measurably lower turnover, better training compliance, and lower management overhead. All Gravy makes that investment easy, scalable, and consistent across every location in a business.

We've raised €5.8M+ to date, backed by Moonfire Ventures, Scale Capital, and a syndicate of angels including senior operators and investors from Google, Deliveroo, and Peakon. Our customers include some of the largest hospitality operators in Northern Europe.

We're building the operating layer for the front-line hospitality workforce - the infrastructure that makes it possible to run a great team at scale, not just at one site, but across all of them.