Staff Cloud Native Engineer: Role Blueprint, Responsibilities, Skills, KPIs, and Career Path
A **Staff Cloud Native Engineer** is a senior individual contributor (IC) who designs, builds, and continuously improves the cloud-native foundations that enable engineering teams to ship reliable software quickly and safely. This role is accountable for the technical direction and hands-on delivery of platform capabilities such as Kubernetes orchestration, infrastructure-as-code, CI/CD enablement, service-to-service networking, observability, and reliability practices.
Staff Cloud Engineer: Role Blueprint, Responsibilities, Skills, KPIs, and Career Path
The **Staff Cloud Engineer** is a senior individual contributor in the **Cloud & Infrastructure** department responsible for designing, building, and evolving the company’s cloud platform capabilities so product engineering teams can deliver secure, reliable, and cost-effective services at scale. The role exists to translate business and engineering goals (speed, availability, compliance, cost) into **repeatable cloud patterns, automation, and platform guardrails** that reduce operational toil and risk.
SRE Engineer: Role Blueprint, Responsibilities, Skills, KPIs, and Career Path
The **SRE Engineer** (Site Reliability Engineer) is a hands-on reliability practitioner responsible for keeping production systems **available, performant, scalable, and cost-effective** while enabling frequent, safe software delivery. This role applies software engineering approaches to operational problems, using automation, observability, and reliability design patterns to reduce incidents and accelerate recovery when they occur.
Site Reliability Engineer: Role Blueprint, Responsibilities, Skills, KPIs, and Career Path
A Site Reliability Engineer (SRE) ensures that customer-facing and internal services remain reliable, performant, secure, and cost-effective at scale by applying software engineering to operations. This role exists to reduce operational risk, improve service availability, and create leverage through automation, observability, and disciplined incident/problem management.
Senior Systems Reliability Engineer: Role Blueprint, Responsibilities, Skills, KPIs, and Career Path
The **Senior Systems Reliability Engineer** is a senior individual contributor in the **Cloud & Infrastructure** organization responsible for ensuring that production systems are **reliable, resilient, observable, performant, and cost-effective** at scale. This role blends deep systems engineering with SRE practice: defining service reliability targets (SLOs), strengthening operational readiness, driving automation, and leading complex incident response to protect customer experience and revenue.
Senior Storage Engineer: Role Blueprint, Responsibilities, Skills, KPIs, and Career Path
The Senior Storage Engineer designs, implements, and operates enterprise-grade storage and data protection platforms that underpin application availability, performance, and recoverability across on-premises and cloud environments. This role exists to ensure that data services (block, file, object, backup, and replication) are reliable, secure, cost-effective, and scalable—while meeting evolving product and engineering demands.
Senior SRE Engineer: Role Blueprint, Responsibilities, Skills, KPIs, and Career Path
The **Senior SRE Engineer** is an experienced individual contributor responsible for designing, improving, and operating the reliability practices, platforms, and automation that keep customer-facing services available, performant, and cost-effective. This role blends software engineering with systems engineering, with a focus on **SLOs/SLIs, error budgets, observability, incident response, toil reduction, and resilient architecture** across cloud and infrastructure layers.
Senior Site Reliability Engineer: Role Blueprint, Responsibilities, Skills, KPIs, and Career Path
The **Senior Site Reliability Engineer (SRE)** ensures that customer-facing and internal cloud services are **reliable, performant, resilient, and cost-effective** at scale. This role applies software engineering principles to operations—designing reliability into systems through automation, observability, incident management rigor, and continuous improvement.
Senior Reliability Engineer: Role Blueprint, Responsibilities, Skills, KPIs, and Career Path
The **Senior Reliability Engineer** is a senior individual contributor in the **Cloud & Infrastructure** organization responsible for ensuring production services meet defined reliability, availability, performance, and recoverability targets. This role designs and operates reliability mechanisms (SLOs, error budgets, observability, automation, incident response, resilience engineering) to reduce customer-impacting outages and improve operational efficiency at scale.
Senior Production Engineer: Role Blueprint, Responsibilities, Skills, KPIs, and Career Path
A **Senior Production Engineer** is a senior individual contributor in the Cloud & Infrastructure organization responsible for ensuring that production systems are **reliable, scalable, secure, and cost-efficient** while enabling fast, safe delivery of software changes. The role blends software engineering, systems engineering, and operational excellence to reduce downtime, improve performance, and increase developer velocity through automation and well-defined production practices.
Senior Observability Engineer: Role Blueprint, Responsibilities, Skills, KPIs, and Career Path
A **Senior Observability Engineer** designs, builds, and operates the monitoring, logging, tracing, and alerting capabilities that enable engineering teams to **detect, diagnose, and resolve production issues quickly** while meeting reliability and performance objectives. The role sits at the intersection of platform engineering, SRE/operations, and software engineering, translating system behavior into actionable signals and standards that scale across teams and services.
Senior Network Engineer: Role Blueprint, Responsibilities, Skills, KPIs, and Career Path
The Senior Network Engineer designs, builds, and operates reliable, secure, and scalable network connectivity across cloud and on-prem environments to enable product delivery, internal engineering productivity, and enterprise-grade service reliability. This role balances deep hands-on engineering (routing/switching, WAN, firewalls, load balancing, DNS, connectivity) with operational excellence (monitoring, incident response, change management, capacity planning) and modern automation practices (Infrastructure as Code, configuration management, CI/CD integration).
Senior Network Automation Engineer: Role Blueprint, Responsibilities, Skills, KPIs, and Career Path
The **Senior Network Automation Engineer** is a senior individual contributor in the **Cloud & Infrastructure** organization responsible for designing, building, and operating automation systems that provision, configure, validate, and continuously manage network infrastructure at scale. The role bridges traditional network engineering and modern software engineering practices (NetDevOps), enabling safe, repeatable, and observable network change through code, pipelines, and policy-driven controls.
Senior Monitoring Engineer: Role Blueprint, Responsibilities, Skills, KPIs, and Career Path
The Senior Monitoring Engineer designs, implements, and continuously improves the organization’s monitoring and observability capabilities across cloud infrastructure, platforms, and production services. This role ensures that engineering teams can detect incidents early, diagnose issues quickly, and measure reliability through actionable metrics, logs, traces, and service-level objectives (SLOs).
Senior Linux Systems Engineer: Role Blueprint, Responsibilities, Skills, KPIs, and Career Path
The **Senior Linux Systems Engineer** is a senior individual contributor responsible for the reliability, security, performance, and lifecycle management of Linux-based compute platforms that power production services, internal engineering systems, and core infrastructure. This role designs and operates scalable Linux environments across on-premises and cloud, automates system configuration and fleet operations, and hardens platforms to meet uptime and security requirements.
Senior Kubernetes Engineer: Role Blueprint, Responsibilities, Skills, KPIs, and Career Path
The Senior Kubernetes Engineer designs, builds, secures, and operates Kubernetes platforms that reliably run production workloads at scale. This role exists to provide a standardized, automated, and supportable container orchestration foundation—so application teams can ship faster while meeting enterprise expectations for availability, security, cost, and compliance.
Senior Infrastructure Engineer: Role Blueprint, Responsibilities, Skills, KPIs, and Career Path
The Senior Infrastructure Engineer designs, builds, and operates reliable, secure, and scalable infrastructure platforms that enable product engineering teams to ship and run software with confidence. This role is accountable for improving availability, performance, and operational efficiency across cloud and/or hybrid environments, while reducing risk through automation, standardization, and strong operational controls.
Senior DevOps Engineer: Role Blueprint, Responsibilities, Skills, KPIs, and Career Path
The **Senior DevOps Engineer** is a senior individual contributor in the **Cloud & Infrastructure** department responsible for building, operating, and continuously improving the platforms, automation, and operational practices that enable engineering teams to deliver software safely, quickly, and reliably. This role designs and runs cloud infrastructure, CI/CD systems, observability, and operational controls that reduce lead time and change risk while improving availability and performance.
Senior Cloud Native Engineer: Role Blueprint, Responsibilities, Skills, KPIs, and Career Path
The **Senior Cloud Native Engineer** designs, builds, and operates cloud-native platforms and runtime capabilities that enable application teams to ship secure, scalable, reliable software with high delivery velocity. This role sits in the **Cloud & Infrastructure** department and focuses on modern infrastructure engineering: containers, Kubernetes, service networking, infrastructure-as-code, CI/CD enablement, observability, and reliability practices.
Senior Cloud Engineer: Role Blueprint, Responsibilities, Skills, KPIs, and Career Path
The **Senior Cloud Engineer** designs, builds, and operates secure, reliable, and cost-efficient cloud infrastructure that enables product engineering teams to deliver software quickly and safely. This role is accountable for production-grade cloud foundations (networking, compute, identity, observability, automation) and for evolving them into scalable internal platforms and patterns.
Reliability Engineer: Role Blueprint, Responsibilities, Skills, KPIs, and Career Path
The Reliability Engineer ensures that cloud-based services and the infrastructure they run on are available, performant, resilient, and recoverable under real-world conditions—including failures, traffic spikes, deployments, and dependency issues. This role blends software engineering, operational excellence, and systems thinking to reduce customer-impacting incidents, improve mean time to restore (MTTR), and raise the reliability baseline through automation and engineering standards.
Reliability and Platform Engineering Leader: Role Blueprint, Responsibilities, Skills, KPIs, and Career Path
The Reliability and Platform Engineering Leader is accountable for the reliability, scalability, and operational readiness of the company’s production systems while building a developer platform that enables fast, safe, and cost-effective software delivery. This role leads Site Reliability Engineering (SRE) and Platform Engineering capabilities across cloud infrastructure, Kubernetes/container platforms, CI/CD foundations, and observability—balancing uptime, feature velocity, security, and cost.
Production Engineer: Role Blueprint, Responsibilities, Skills, KPIs, and Career Path
A **Production Engineer** ensures that customer-facing services and internal platforms run safely, reliably, and efficiently in live (“production”) environments. The role blends software engineering, systems engineering, and operational excellence to reduce downtime, improve performance, increase deployment safety, and minimize manual operational toil through automation.
Principal Systems Reliability Engineer: Role Blueprint, Responsibilities, Skills, KPIs, and Career Path
The **Principal Systems Reliability Engineer** is a senior individual-contributor (IC) role responsible for designing, governing, and continuously improving reliability outcomes across cloud infrastructure and the production systems that run on it. This role sets reliability strategy, defines measurable reliability standards (SLOs/SLIs/error budgets), and drives systemic improvements that reduce incidents, accelerate recovery, and increase customer trust.
Principal Storage Engineer: Role Blueprint, Responsibilities, Skills, KPIs, and Career Path
The Principal Storage Engineer is the senior individual-contributor authority for enterprise storage platforms that underpin application reliability, data durability, performance, and cost efficiency across on-prem, hybrid, and cloud environments. The role designs, standardizes, automates, and continuously improves storage services (block, file, object) and data protection capabilities (backup, replication, archive) to meet production-grade requirements.
Principal SRE Engineer: Role Blueprint, Responsibilities, Skills, KPIs, and Career Path
The **Principal SRE Engineer** is a senior individual contributor (IC) responsible for shaping, scaling, and continuously improving the reliability, performance, and operational excellence of cloud-hosted products and core infrastructure. This role drives enterprise-grade Site Reliability Engineering practices—particularly SLO-based reliability management, resilient architectures, high-quality observability, and automated operations—across multiple teams and services.
Principal Site Reliability Engineer: Role Blueprint, Responsibilities, Skills, KPIs, and Career Path
The Principal Site Reliability Engineer (SRE) is a senior individual contributor responsible for ensuring that critical cloud services are reliable, scalable, secure, and cost-efficient, while enabling rapid product delivery. This role designs and governs reliability engineering practices (SLOs/SLIs, error budgets, incident management, observability, resilience testing) and drives cross-team execution of reliability improvements across the platform.
Principal Reliability Engineer: Role Blueprint, Responsibilities, Skills, KPIs, and Career Path
The **Principal Reliability Engineer** is a senior individual-contributor (IC) role responsible for **setting reliability strategy and technical direction** across critical cloud infrastructure and production services, while directly improving **availability, latency, scalability, incident response maturity, and operational efficiency**. This role exists to ensure that engineering teams can ship changes quickly **without compromising production stability**, and that reliability is designed, measured, and governed as a first-class product attribute.
Principal Production Engineer: Role Blueprint, Responsibilities, Skills, KPIs, and Career Path
The Principal Production Engineer is a senior individual contributor in the Cloud & Infrastructure organization responsible for ensuring that customer-facing and internal production systems are reliable, scalable, secure, and cost-efficient. This role blends deep systems engineering with operational excellence and influences architecture and engineering practices across multiple teams and services.
Principal Observability Engineer: Role Blueprint, Responsibilities, Skills, KPIs, and Career Path
The Principal Observability Engineer is a senior individual contributor (IC) in the Cloud & Infrastructure organization accountable for the end-to-end observability strategy, platform architecture, and operational outcomes across distributed systems. This role builds and evolves the telemetry foundations (metrics, logs, traces, profiling, synthetics) that enable engineering teams to detect, understand, and remediate reliability, performance, and customer-impacting issues quickly and safely.
