Switch from Support to SRE

Translate your incident expertise into SRE engineering skills and land your first reliability engineering role.

CREATED BY

Dev R. ★ 4.8

Senior Data Engineer at StreamBase | 10+ years of experience

About this Path

Purpose-built for L1/L2/L3 support engineers who live in incident queues and want to cross over into SRE or Platform Engineering. You already understand production pain; this path teaches you to address it systematically through SLOs and error budgets, automation with Python and Bash, infrastructure as code with Terraform, observability, containers and Kubernetes, and blameless postmortems. You finish with a GitHub portfolio and the vocabulary to pass a technical SRE screen.

Path Overview

Intermediate Level
Certificate of Completion
About 48 hours to complete
English language
20+ curated videos
Learn online at your own pace
6 modules with resources
Gamified & interactive

Path Curriculum

Support Engineer to SRE: what actually changes

Map your existing incident skills to the SRE job ladder and identify gaps.

SLIs, SLOs, and Error Budgets from Scratch

Define availability, latency, and throughput SLIs and set realistic SLO targets.

Error Budget Policy: when to freeze releases

Write a policy that halts feature work when the remaining error budget drops below 10%.
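
As a rough illustration of the check such a policy implies, the sketch below computes the fraction of error budget still unspent over a window and compares it to the 10% threshold. The function names and the request-count inputs are assumptions made for this example, not material from the lesson.

```python
def error_budget_remaining(slo_target, total_requests, failed_requests):
    """Return the unspent fraction of the error budget (0.0 to 1.0).

    slo_target is a fraction such as 0.999; the budget is the number of
    failures the SLO tolerates over the measurement window.
    """
    allowed_failures = (1 - slo_target) * total_requests
    if allowed_failures <= 0:
        # A 100% target or an empty window leaves no budget to spend.
        return 0.0
    consumed = failed_requests / allowed_failures
    return max(0.0, 1 - consumed)


def should_freeze_releases(remaining, threshold=0.10):
    """Hypothetical policy check: freeze when less than 10% of budget is left."""
    return remaining < threshold


# 99.9% SLO over 1,000,000 requests tolerates about 1,000 failures.
remaining = error_budget_remaining(0.999, 1_000_000, 950)
freeze = should_freeze_releases(remaining)
```

With 950 failures against roughly 1,000 allowed, about 5% of the budget remains, so this sketch would signal a freeze.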

Blameless Postmortems and the Five Whys

Facilitate a postmortem and produce action items with owners and due dates.

Identifying and Measuring Toil in Your Queue

Categorize tickets by type and calculate hours per week lost to repetitive tasks.
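
One way to approach this calculation, sketched under the assumption that each ticket has already been exported as a record with a category and the minutes spent on it. The field names and sample data are invented for illustration.

```python
from collections import defaultdict


def weekly_toil_hours(tickets, weeks):
    """Sum handling time per category and average it to hours per week."""
    minutes_by_category = defaultdict(float)
    for ticket in tickets:
        minutes_by_category[ticket["category"]] += ticket["minutes"]
    return {
        category: round(minutes / 60 / weeks, 2)
        for category, minutes in minutes_by_category.items()
    }


# Hypothetical two-week export from a ticket queue.
sample_tickets = [
    {"category": "password reset", "minutes": 15},
    {"category": "password reset", "minutes": 15},
    {"category": "password reset", "minutes": 15},
    {"category": "password reset", "minutes": 15},
    {"category": "disk cleanup", "minutes": 45},
    {"category": "disk cleanup", "minutes": 45},
]
toil = weekly_toil_hours(sample_tickets, weeks=2)
```

Sorting the result by hours gives a first-pass ranking of which repetitive task to automate first.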

Python Scripting for Ops: boto3, requests, and subprocess

Automate AWS resource queries, webhook calls, and shell command execution.
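
For the shell-command part of this lesson, a minimal sketch using only the standard library's subprocess module might look like the following. The wrapper name and its return shape are assumptions for this example; the boto3 and requests parts are not shown because they depend on credentials and external endpoints.

```python
import subprocess
import sys


def run_command(args, timeout=10):
    """Run a command without a shell and return (exit_code, stdout, stderr).

    Passing a list of arguments avoids shell-injection issues, and the
    timeout keeps a hung command from blocking the script indefinitely.
    """
    result = subprocess.run(
        args,
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,  # return the exit code instead of raising
    )
    return result.returncode, result.stdout.strip(), result.stderr.strip()


# Use the current Python interpreter as a portable stand-in for a real command.
code, out, err = run_command([sys.executable, "-c", "print('ok')"])
```

Returning the exit code rather than raising lets the caller decide whether a non-zero status is an error worth paging on or just a condition to log.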

Writing Robust Bash Scripts with Error Handling

Use set -euo pipefail, traps, and logging to make scripts safe in production.

Building a Simple CLI Tool and Publishing it Internally

Package a Python script as a CLI with argparse and document it in a README.
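
A minimal sketch of what an argparse-based CLI could look like. The tool name `svc-check`, its arguments, and the placeholder check are hypothetical and chosen only to show the structure.

```python
import argparse


def build_parser():
    """Define the command-line interface in one place so it is easy to test."""
    parser = argparse.ArgumentParser(
        prog="svc-check",  # hypothetical tool name
        description="Run a basic health check against a named service.",
    )
    parser.add_argument("service", help="name of the service to check")
    parser.add_argument(
        "--timeout", type=float, default=5.0, help="seconds to wait (default: 5)"
    )
    parser.add_argument(
        "--verbose", action="store_true", help="print extra detail"
    )
    return parser


def main(argv=None):
    """Parse arguments and run the (placeholder) check; return an exit code."""
    args = build_parser().parse_args(argv)
    if args.verbose:
        print(f"timeout set to {args.timeout}s")
    print(f"checking {args.service}")
    return 0
```

Keeping the parser in its own function and letting `main` accept an argument list makes the tool testable without touching `sys.argv`; a real entry point would call `main()` and pass its return value to `sys.exit`.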

Terraform Core Workflow: init, plan, apply, destroy

Provision an EC2 instance and VPC from scratch and inspect the state file.

Modules, Variables, and Remote State with S3

Refactor inline config into reusable modules and store state in a shared backend.

Terraform in CI: plan on PR, apply on merge

Wire a GitHub Actions workflow to show a plan diff before any infrastructure change.

Prometheus Data Model and PromQL Fundamentals

Query with rate() and histogram_quantile(), and write recording rules for SLO dashboards.

Grafana Dashboard Design for SRE

Build a USE method (Utilization, Saturation, Errors) dashboard from real metrics.

Structured Logging with Python and the ELK Stack

Emit JSON logs, ship them to Elasticsearch, and build a Kibana error-rate view.
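
The "emit JSON logs" step can be sketched with the standard logging module alone. The formatter class, field names, and logger name below are assumptions for illustration; shipping to Elasticsearch and the Kibana view are outside this sketch.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record):
        payload = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def build_logger(name="demo-service", stream=None):
    """Return a logger that writes JSON lines to the given stream (stdout by default)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.propagate = False
    return logger
```

Because each line is a self-contained JSON object with a `level` field, a log shipper can index it without custom parsing, and an error-rate view reduces to counting records where `level` is `ERROR`.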

Distributed Tracing with OpenTelemetry and Jaeger

Instrument a Python service and trace a request across three microservices.

Docker Essentials Refresher for SREs

Build a container image and debug it with exec, logs, and inspect in 30 minutes.

Kubernetes Architecture from an Operator Perspective

Understand the control plane, kubelet, and etcd, and how each relates to incident diagnosis.

kubectl Survival Guide: get, describe, logs, exec, top

Diagnose CrashLoopBackOff, OOMKilled, and Pending pods with five commands.

Liveness, Readiness, and Startup Probes

Configure probes to prevent bad deploys from receiving traffic prematurely.

Building Your SRE Portfolio: three GitHub projects to have

Publish three projects: a monitored app deployed on Kubernetes, an IaC repo, and a toil-reduction script.

SRE Interview Formats: design, troubleshooting, and coding screens

Walk through sample questions on error budgets, postmortems, and Python scripting.

Negotiating the SRE Title and Compensation Jump

Position your support background as production depth and anchor your number correctly.

What you'll learn

✓ Define SLIs, SLOs, and error budgets for a real service and use them to prioritize reliability work over feature requests.
✓ Write production-grade Python and Bash automation scripts that replace repetitive toil with reliable, tested tooling.
✓ Provision and manage cloud infrastructure using Terraform, applying the same change-control discipline you use for incidents.
✓ Instrument applications with Prometheus metrics and build Grafana dashboards that surface error rate, latency, and saturation.
✓ Lead blameless postmortems and write action items that durably fix root causes instead of patching symptoms.
✓ Containerize a workload with Docker and deploy it to a Kubernetes cluster, interpreting pod events and logs to diagnose failures.
