Site Reliability Engineering (SRE) Certified Professional Training

Course Duration

72 hours/6 Days

Live Project

Certification

Industry recognized

Training Format

Online/Classroom/Corporate

8000+

Certified Learners

15+

Years Avg. faculty experience

40+

Happy Clients

4.5/5.0

Average class rating

How DevOpsSchool will help in SRE Certification & Courses

The Site Reliability Engineering Certified Professional (SRECP) certification course by DevOpsSchool teaches the principles and practices that allow an organization to scale critical services reliably and economically. SRE is an approach to operations that applies software engineering and automation to ensure that continuously delivered applications run efficiently and reliably. The SRECP course covers the evolution of SRE in modern software engineering and its future direction, and equips learners with the methods, practices, and tools to engage people across the organization who are involved in reliability and stability, using real-life scenarios and case stories.

Site Reliability Engineering (SRE) Intermediate Certification - Instructor-led, Live & Interactive Training

AGENDA	MODE	DURATION
Site Reliability Engineering (SRE)	Online (Instructor-led)	69 Hours
Site Reliability Engineering (SRE)	Classroom Public (not available due to the pandemic)	6 Days (Weekend)
Site Reliability Engineering (SRE)	Corp Classroom	5 Days

Course Price at

49,999/-

No Negotiation

What is Site Reliability Engineering (SRE)?

Site Reliability Engineering (SRE) is a discipline that incorporates aspects of software engineering and applies them to infrastructure and operations problems. The main goals are to create scalable and highly reliable software systems. According to Ben Treynor, founder of Google's Site Reliability Team, SRE is "what happens when a software engineer is tasked with what used to be called operations."

What is the Site Reliability Engineering Certified Professional (SRECP) Certification?

Site Reliability Engineering Certified Professional (SRECP) is a certification from DevOpsCertification.co. The objective of this certification and its associated course is to impart, test, and validate knowledge of SRE vocabulary, principles, and practices. SRECP is intended to give individuals an understanding of basic SRE concepts and of how SRE principles and engineering practices can be applied across the software development lifecycle to improve operational activities.

This course teaches the theory of Service Level Objectives (SLOs), a principled way of describing and measuring the desired reliability of a service. Upon completion, a Certified Professional should be able to apply these principles to develop the first SLOs for services they are familiar with in their own organization.

A Certified Professional will also learn how to use Service Level Indicators (SLIs) to quantify reliability and Error Budgets to drive business decisions around engineering for greater reliability. The learner will understand the components of a meaningful SLI and walk through the process of developing SLIs and SLOs for an example service.
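The relationship between an SLI, an SLO, and an error budget can be illustrated with a small calculation. The following is a minimal sketch, assuming a request-based availability SLI (good events divided by total events); the function name `evaluate_slo`, the `SloReport` fields, and the sample numbers are hypothetical and are not taken from the course material.

```python
from dataclasses import dataclass


@dataclass
class SloReport:
    sli: float               # measured fraction of good events
    error_budget: float      # number of bad events the SLO tolerates in the window
    budget_remaining: float  # fraction of the budget still unspent (negative if overspent)


def evaluate_slo(good_events: int, total_events: int, slo_target: float) -> SloReport:
    """Compute an availability-style SLI and the remaining error budget."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")

    sli = good_events / total_events
    allowed_bad = (1.0 - slo_target) * total_events  # the error budget, in events
    actual_bad = total_events - good_events
    remaining = 1.0 - actual_bad / allowed_bad
    return SloReport(sli=sli, error_budget=allowed_bad, budget_remaining=remaining)


# Hypothetical window: one million requests, 500 failures, 99.9% SLO target.
report = evaluate_slo(good_events=999_500, total_events=1_000_000, slo_target=0.999)
print(f"SLI={report.sli:.4%} remaining budget={report.budget_remaining:.0%}")
```

In this example the SLO tolerates about 1,000 failed requests and 500 have occurred, so roughly half of the error budget is left. A team could use that remaining fraction to decide whether to keep shipping features or to prioritize reliability work.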

What is the Advantage of SRECP Certification?

A Site Reliability Engineering Certified Professional (SRECP) is a professional who understands the principles of performance evaluation and prediction and uses them to improve the safety, reliability, and maintainability of products and systems.

How to become a Site Reliability Engineering Certified Professional?

Please contact contact@DevOpsSchool.com

What will you learn?

You'll learn:

How to run reliable services in environments you don't completely control, such as the cloud
Practical applications of how to create, monitor, and run your services via service level objectives
How to convert existing ops teams to SRE, including how to dig out of operational overload
Methods for starting SRE from either greenfield or brownfield

Agenda of the Site Reliability Engineering Certified Professional Course

SDLC Models & Architecture with Agile, DevOps, SRE & DevSecOps, SOA & Microservices - Concept

Let’s Understand the Software Development Model
Overview of Waterfall Development Model
Challenges of Waterfall Development Model
Overview of Agile Development Model
Challenges of Agile Development Model
Requirement of New Software Development Model
Understanding the Existing Pain and Waste in the Current Software Development Model
What is DevOps?

Transition in Software development model
Waterfall -> Agile -> CI/CD -> DevOps -> DevSecOps

Understand DevOps values and principles
Culture and organizational considerations
Communication and collaboration practices
Improve your effectiveness and productivity
DevOps Automation practices and technology considerations
DevOps Adoption considerations in an enterprise environment
Challenges, risks and critical success factors
What is DevSecOps?

Let’s Understand DevSecOps Practices and Toolsets.

What is SRE?

Let’s Understand SRE Practices and Toolsets.

List of Tools to become Full Stack Developer/QA/SRE/DevOps/DevSecOps
Microservices Fundamentals
Microservices Patterns

Choreographing Services
Presentation components
Business Logic
Database access logic
Application Integration
Modelling Microservices
Integrating multiple Microservices

Keeping it simple

Avoiding Breaking Changes
Choosing the right protocols
Sync & Async
Dealing with legacy systems
Testing

What and When to test
Preparing for deployment
Monitoring Microservice Performance
Tools used for Microservices Demo using container

Platform - Operating Systems - CentOS/Ubuntu & VirtualBox & Vagrant

Ubuntu

Installing CentOS7 and Ubuntu
Accessing Servers with SSH
Working at the Command Line
Reading Files
Using the vi Text Editor
Piping and Redirection
Archiving Files
Accessing Command Line Help
Understanding File Permissions
Accessing the Root Account
Using Screen and Script
Overview of Hypervisor
Introduction of VirtualBox
Installing VirtualBox and Creating CentOS7 and Ubuntu VMs

Vagrant

Understanding Vagrant
Basic Vagrant Workflow
Advanced Vagrant Workflow
Working with Vagrant VMs
The Vagrantfile
Installing Nginx
Provisioning
Networking
Sharing and Versioning Web Site Files
Vagrant Share
Vagrant Status
Sharing and Versioning Nginx Config Files
Configuring Synced Folders

Platform - Cloud - AWS

Introduction of AWS
Understanding AWS infrastructure
Understanding AWS Free Tier
IAM: Understanding IAM Concepts
IAM: A Walkthrough IAM
IAM: Demo & Lab
Computing:EC2: Understanding EC2 Concepts
Computing:EC2: A Walkthrough EC2
Computing:EC2: Demo & Lab
Storage:EBS: Understanding EBS Concepts
Storage:EBS: A Walkthrough EBS
Storage:EBS: Demo & Lab
Storage:S3: Understanding S3 Concepts
Storage:S3: A Walkthrough S3
Storage:S3: Demo & Lab

Storage:EFS: Understanding EFS Concepts
Storage:EFS: A Walkthrough EFS
Storage:EFS: Demo & Lab
Database:RDS: Understanding RDS MySQL Concepts
Database:RDS: A Walkthrough RDS MySQL
Database:RDS: Demo & Lab
ELB: Elastic Load Balancer Concepts
ELB: Elastic Load Balancer Implementation
ELB: Elastic Load Balancer: Demo & Lab
Networking:VPC: Understanding VPC Concepts
Networking:VPC: Understanding VPC components
Networking:VPC: Demo & Lab

Platform - Containers - Docker

What is Containerization?
Why Containerization?
How is Docker a good fit for Containerization?
How does Docker work?
Docker Architecture
Docker Installations & Configurations
Docker Components
Docker Engine
Docker Image
Docker Containers
Docker Registry
Docker Basic Workflow
Managing Docker Containers
Creating our First Image
Understanding Docker Images
Creating Images using Dockerfile
Managing Docker Images
Using Docker Hub registry

Docker Networking
Docker Volumes
Deepdive into Docker Images
Deepdive into Dockerfile
Deepdive into Docker Containers
Deepdive into Docker Networks
Deepdive into Docker Volumes
Deepdive into Docker CPU and RAM allocations
Deepdive into Docker Config
Docker Compose Overview
Install & Configure Compose
Understanding Docker Compose Workflow
Understanding Docker Compose Services
Writing Docker Compose Yaml file
Using Docker Compose Commands
Docker Compose with Java Stack
Docker Compose with Rails Stack
Docker Compose with PHP Stack
Docker Compose with Nodejs Stack

Planning and Designing - Jira & Confluence

a. Jira

Overview of Jira
Use cases of Jira
Architecture of Jira
Installation and Configuration of Jira in Linux
Installation and Configuration of Jira in Windows
Jira Terminologies
Understanding Types of Jira Projects
Working with Projects
Working with Jira Issues
Adding Project Components and Versions
Use Subtasks to Better Manage and Structure Your Issues
Link Issues to Other Resources
Working in an Agile project
Working with Issues Types by Adding/Editing/Deleting
Working with Custom Fields by Adding/Editing/Deleting
Working with Screens by Adding/Editing/Deleting
Searching and Filtering Issues
Working with Workflow basic
Introduction of Jira Plugins and Addons.
Jira Integration with Github

b. Confluence

Exploring Confluence benefits and resources
Configuring Confluence
Navigating the dashboard, spaces, and pages
Creating users and groups
Creating pages from templates and blueprints
Importing, updating, and removing content
Giving content feedback
Watching pages, spaces, and blogs
Managing tasks and notifications
Backing up and restoring a site
Admin tasks

Add/Edit/Delete new users
Adding group and setting permissions
Managing user permissions
Managing addons or plugins
Customizing confluence site

Installing Confluence

Evaluation options for Confluence
Supported platforms
Installing Confluence on Windows
Activating Confluence trial license
Finalizing Confluence Installation

Backend Programming Language - Python/Flask with mysql DB

Planning - Discuss a small project's requirements, which include Login/Registration and CRUD operations on student records.
Design a Method --> Classes -> Interface using Core Python

Fundamental of Core Python with Hello-world Program with Method --> Classes

Coding in Flask using HTML - CSS - JS - MySQL

Fundamental of Flask Tutorial of Hello-World APP

UT - 2 Sample Unit Tests using pytest
Package a Python App
AT - 2 Sample Acceptance Tests using Selenium

Technology Demonstration

Software Planning and Designing using JAVA
Core Python
Flask
MySQL
pytest
Selenium
HTML
CSS
JS

Source Code Versioning - Git using Github

Introduction of Git
Installing Git
Configuring Git
Git Concepts and Architecture
How does Git work?
The Git workflow

Working with Files in Git

Adding files
Editing files
Viewing changes with diff
Viewing only staged changes
Deleting files
Moving and renaming files
Making Changes to Files

Undoing Changes

- Reset
- Revert

Amending commits
Ignoring Files
Branching and Merging using Git
Working with Conflict Resolution
Comparing commits, branches and workspace
Working with Remote Git repo using Github
Push - Pull - Fetch using Github
Tagging with Git

Code Analysis & Securing Code (SAST) - SonarQube & Coverity Scan & Snyk

What is SonarQube?
Benefits of SonarQube?
Alternatives to SonarQube
Understanding Various Licenses of SonarQube
Architecture of SonarQube
How does SonarQube work?
Components of SonarQube
SonarQube runtime requirements
Installing and configuring SonarQube in Linux
Basic Workflow in SonarQube using Command line
Working with Issues in SonarQube

Working with Rules in SonarQube
Working with Quality Profiles in SonarQube
Working with Quality Gates in SonarQube
Deep Dive into SonarQube Dashboard
Understanding Seven Axes of SonarQube Quality
Workflow in SonarQube with Maven Project
Workflow in SonarQube with Gradle Project
OWASP Top 10 with SonarQube

Build Management - Maven and Gradle

Maven

Introduction to Apache Maven
Advantage of Apache Maven over other build tools
Understanding the Maven Lifecycle and Phase
Understanding the Maven Goals
Understanding the Maven Plugins
Understanding the Maven Repository
Understanding Maven Release and Version
Prerequisite and Installing Apache Maven
Understanding and using Maven Archetypes
Understanding pom.xml and settings.xml
Playing with multiple Maven Goals
Introducing Maven Dependencies
Introducing Maven Properties
Introducing Maven Modules
Introducing Maven Profile
Introducing Maven Plugins
How can Maven benefit my development process?
How do I setup Maven?
How do I make my first Maven project?
How do I compile my application sources?
How do I compile my test sources and run my unit tests?
How do I create a JAR and install it in my local repository?
How do I use plugins?
How do I add resources to my JAR?
How do I filter resource files?
How do I use external dependencies?
How do I deploy my jar in my remote repository?
How do I create documentation?
How do I build other types of projects?
How do I build more than one project at once?

Gradle

What is Gradle?
Why Gradle?
Installing and Configuring Gradle
Build Java Project with Gradle
Build C++ Project with Gradle
Build Python Project with Gradle
Dependency Management in Gradle
Project Structure in Gradle
Gradle Tasks
Gradle Profile and Cloud
Gradle Properties
Gradle Plugins

Package Management - Packer & Artifactory

Artifactory

Artifactory Overview
Understanding a role of Artifactory in DevOps
System Requirements
Installing Artifactory in Linux
Using Artifactory
Getting Started
General Information
Artifactory Terminology
Artifactory Repository Types
Artifactory Authentication
Deploying Artifacts using Maven
Download Artifacts using Maven
Browsing Artifactory
Viewing Packages
Searching for Artifacts
Manipulating Artifacts

Packer

Getting to Know Packer

What is Packer?
Installing Packer
The Packer workflow and components
The Packer CLI

Baking a Website Image for EC2
Select an AWS AMI base
Automate AWS AMI base build
Using build variables
Provision Hello World
Provision a basic site
Customization with a Config Management Tool

Simplify provisioning with a config tool
Use Ansible to install the webserver
Debugging

Building Hardened Images

Use Ansible modules to harden our image
Baking a Jenkins image

Building a Pipeline for Packer Image

Validate Packer templates
Create a manifest profile
Testing
CI pipeline

Unit Testing & Acceptance Testing & Coverage - Junit & Selenium & Jmeter & Jacoco

Junit

- What is Unit Testing?
- Tools for Unit Testing
- What is Junit?
- How to configure Junit?
- Writing Basic Junit Test cases
- Running Basic Junit Test cases
- Junit Test Results

Selenium

Introduction to Selenium

Components of Selenium

- Selenium IDE
- Selenium WebDriver
- Selenium Grid

Installing and Configuring Selenium

Working with Selenium IDE

Working with Selenium WebDriver with Java Test Cases

Setup and Working with Selenium Grid

Jacoco

Overview of Code coverage process
Introduction of Jacoco
How Jacoco works
How to install Jacoco?
Setup testing Environment with Jacoco
Create test data files using Jacoco and Maven
Create a Report using Jacoco
Demo - Complete workflow of Jacoco with Maven and Java Project

Configuration & Deployment Management - Ansible

Overview of Configuration Management
Introduction of Ansible
Ansible Architecture
Let’s get started with Ansible
Ansible Authentication & Authorization
Let’s start with Ansible Ad hoc commands
Let’s write Ansible Inventory

Let’s write Ansible Playbook
Working with Popular Modules in Ansible
Deep Dive into Ansible Playbooks
Working with Ansible Variables
Working with Ansible Template
Working with Ansible Handlers
Roles in Ansible
Ansible Galaxy

Container Orchestration - Kubernetes & Helm Introduction

Understanding the Need of Kubernetes
Understanding Kubernetes Architecture
Understanding Kubernetes Concepts
Kubernetes and Microservices
Understanding Kubernetes Masters and its Component

kube-apiserver
etcd
kube-scheduler
kube-controller-manager

Understanding Kubernetes Nodes and its Component

kubelet
kube-proxy
Container Runtime

Understanding Kubernetes Addons

DNS
Web UI (Dashboard)
Container Resource Monitoring
Cluster-level Logging

Understand Kubernetes Terminology
Kubernetes Pod Overview
Kubernetes Replication Controller Overview
Kubernetes Deployment Overview
Kubernetes Service Overview
Understanding Kubernetes running environment options
Working with first Pods
Working with first Replication Controller
Working with first Deployment
Working with first Services
Introducing Helm
Basic working with Helm

Infrastructure Coding - Terraform

Deploying Your First Terraform Configuration

Introduction
What's the Scenario?
Terraform Components

Updating Your Configuration with More Resources

Introduction
Terraform State and Update
What's the Scenario?
Data Type and Security Groups

Configuring Resources After Creation

Introduction
What's the Scenario?
Terraform Provisioners
Terraform Syntax

Adding a New Provider to Your Configuration

Introduction
What's the Scenario?
Terraform Providers
Terraform Functions
Intro and Variable
Resource Creation
Deployment and Terraform Console
Updated Deployment and Terraform Commands

Continuous Integration - Jenkins

Let's understand Continuous Integration
What is Continuous Integration
Benefits of Continuous Integration
What is Continuous Delivery
What is Continuous Deployment
Continuous Integration Tools

What is Jenkins
History of Jenkins
Jenkins Architecture
Jenkins Vs Jenkins Enterprise
Jenkins Installation and Configurations

Jenkins Dashboard Tour
Understand Freestyle Project
Freestyle General Tab
Freestyle Source Code Management Tab
Freestyle Build Triggers Tab
Freestyle Build Environment
Freestyle Build
Freestyle Post-build Actions
Manage Jenkins
My Views
Credentials
People
Build History

Creating a Simple Job
Simple Java and Maven Based Application
Simple Java and Gradle Based Application
Simple DOTNET and MSBuild Based Application

Jobs Scheduling in Jenkins
Manually Building
Build Trigger based on fixed schedule
Build Trigger by script
Build Trigger Based on pushed to git
Useful Jobs Configuration
Jenkins Jobs parameterised
Execute concurrent builds
Jobs Executors
Build Other Projects
Build after other projects are built
Throttle Builds

Jenkins Plugins
Installing a Plugin
Plugin Configuration
Updating a Plugin
Plugin Wiki
Top 20 Useful Jenkins Plugins
Best Practices for Using Jenkins Plugins

Jenkins Node Management
Adding a Linux Node
Adding a Windows Node
Nodes Management using Jenkins
Jenkins Nodes High Availability

Jenkins Integration with other tools
Jira
Git
SonarQube
Maven
Junit
Ansible
Docker
AWS
Jacoco
Coverity
Selenium
Gradle

Reports in Jenkins
Junit Report
SonarQube Reports
Jacoco Reports
Coverity Reports
Selenium Reports
Test Results
Cucumber Reports

Notification & Feedback in Jenkins
CI Build Pipeline & Dashboard
Email Notification
Advanced Email Notification
Slack Notification

Jenkins Advanced - Administrator
Security in Jenkins
Authorization in Jenkins
Authentication in Jenkins
Managing folder/subfolder
Jenkins Upgrade
Jenkins Backup
Jenkins Restore
Jenkins Command Line

Infrastructure Monitoring Tool 1 - Datadog

Getting started
Integrations
Infrastructure
Host Map
Events
Dashboards

Datadog Tagging
Assigning Tags
Using Tags

Agent
Datadog Agent Usage
Datadog Agent Docker
Datadog Agent Kubernetes
Datadog Agent Cluster Agent
Datadog Agent Log Collection
Datadog Agent Proxy
Datadog Agent Versions
Datadog Agent Troubleshooting

Datadog Integrations
Apache
Tomcat
AWS
MySQL

Datadog Metrics
Metrics Introduction
Metrics Explorer
Metrics Summary

Datadog Graphing
Dashboards
Metrics

Datadog Alerting
Monitors
Manage Monitors
Monitor Status

Log Monitoring Tool 1 - Splunk

What Is Splunk?
Overview
Machine Data
Splunk Architecture
Careers in Splunk

Setting up the Splunk Environment
Overview
Splunk Licensing
Getting Splunk
Installing Splunk
Adding Data to Splunk

Basic Searching Techniques
Adding More Data
Search in Splunk
Demo: Splunk Search
Splunk Search Commands
Splunk Processing Language
Splunk Reports
Reporting in Splunk
Splunk Alerts
Alerts in Splunk

Enterprise Splunk Architecture
Overview
Forwarders
Enterprise Splunk Architecture
Installing Forwarders
Troubleshooting Forwarder Installation

Splunking for DevOps and Security
Splunk in DevOps
DevOps Demo
Splunk in Security
Enterprise Use Cases
Application Development in Splunkbase
What Is Splunkbase?

Navigating the Splunkbase
Creating Apps for Splunk
Benefits of Building in Splunkbase
Splunking on Hadoop with Hunk
What Is Hadoop?
Running HDFS Commands
What Is Hunk?
Installing Hunk
Moving Data from HDFS to Hunk
Composing Advanced Searches
Splunk Searching

Introduction to Advanced Searching
Eval and Fillnull Commands
Other Splunk Command Usage
Filter Those Results!
The Search Job Inspector
Creating Search Macros
What Are Search Macros?
Using Search Macros within Splunk
Macro Command Options and Arguments
Other Advanced Searching within Splunk

Performance & RUM Monitoring - NewRelic

Introduction and Overview of NewRelic
What is Application Performance Management?
Understanding a need of APM
Understanding transaction traces
What is Application Performance?
APM Benefits
APM Selection Criteria
Why NewRelic is best for APM?
What is NewRelic APM?
How does NewRelic APM work?
NewRelic Architecture
NewRelic Terminology
Installing and Configuring NewRelic APM Agents for Application
Register a Newrelic Trial account
Installing a JAVA Agent to Monitor your Java Application
Installing a PHP Agent to Monitor your PHP Application
Installing New Relic Agent for .NET Framework Application
Installing a Docker based Agent to Monitor your Docker based Application
Understanding NewRelic Configuration settings in newrelic.yml
Understanding NewRelic Agent Configuration settings
Working with NewRelic Dashboard
Understanding transactions
Understanding Apdex and Calculating and Setting Apdex Threshold
Understanding Circuit Breaker
Understanding Throughput
Newrelic default graphs
Understanding and Configuring Service Maps
Understanding and Configuring JVM
Understanding Error Analytics
Understanding Violations
Understanding and Configuring Deployments
Understanding and Configuring Thread Profiler
Deep Dive into Transaction Traces
Profiling with New Relic

Creating and managing Alerts
Working with Incidents
Sending NewRelic Alerts to Slack
Assessing the quality of application deployments
Monitoring using Newrelic
View your applications index
APM Overview page
New Relic APM data in Infrastructure
Transactions page
Databases and slow queries
Viewing slow query details
External services page
Agent-specific UI
Viewing the transaction map

Deep Dive into Newrelic Advanced
Newrelic transaction alerts
Configure and Troubleshoot Cross Application Traces
NewRelic Service Level Agreements
Troubleshooting NewRelic
Understanding and Configuring NewRelic X-Ray Sessions
Deep Dive into NewRelic Agent Configuration
Adding Custom Data with the APM Agent
Extending Newrelic using Plugins
Finding and Fixing Application Performance Issues with New Relic APM
Setting up database monitoring using Newrelic APM
Setting up and Configuring Newrelic Alerts

Working with NewRelic Performance Reports
Availability report
Background jobs analysis report
Capacity analysis report
Database analysis report
Host usage report
Scalability analysis report
Web transactions analysis report
Weekly performance report

Webserver - Apache HTTP & Nginx

Apache HTTP

Introduction to web server
Install Apache on CentOS 7.4
Enable Apache to start automatically when the system boots
Configure the firewall service
Where is Apache?
Directory structure

Apache directory structure
Configuration file
Create your first page

Virtual hosts

Setting up the virtual host - name based
Setting up the virtual host - port based

Using aliases and redirecting
Configuring an alias for a url
Redirects
Logging

The error log
The access log
Custom log
Log rotation

Security

Basic Security - Part 1
Basic Security - Part 2
Set up TLS/SSL for free
Basic authentication
Digest authentication
Access Control
.htaccess (Administrator Side)
.htaccess (User Side)
Install and Configure antivirus
Mitigate DoS attacks - mod_evasive

Apache Performance and Troubleshooting

Apache Multi-Processing Modules (MPMs)
Adjusting httpd.conf - Part 1
Adjusting httpd.conf - Part 2
Troubleshoot Apache (Analyze Access Log) - Part 1
Troubleshoot Apache (Analyze Access Log) - Part 2
Use Apachetop to monitor web server traffic

Nginx

Overview

Introduction
About NGINX
NGINX vs Apache
Test your knowledge

Installation

Server Overview
Installing with a Package Manager
Building Nginx from Source & Adding Modules
Adding an NGINX Service
Nginx for Windows
Test your knowledge

Configuration

Understanding Configuration Terms
Creating a Virtual Host
Location blocks
Variables
Rewrites & Redirects
Try Files & Named Locations
Logging
Inheritance & Directive types
PHP Processing
Worker Processes
Buffers & Timeouts
Adding Dynamic Modules
Test your knowledge

Performance

Headers & Expires
Compressed Responses with gzip
FastCGI Cache
HTTP2
Server Push

Security

HTTPS (SSL)
Rate Limiting
Basic Auth
Hardening Nginx
Test your knowledge
Let's Encrypt - SSL Certificates

Multi-cluster Kubernetes orchestration platform - Rancher

Multi-cluster management

Rancher provides a unified interface for managing multiple Kubernetes clusters across different environments, including on-premises, cloud, and hybrid.

Centralized administration

With Rancher, you can manage user access, security policies, and cluster settings from a central location, making it easier to maintain a consistent and secure deployment across all clusters.

Automated deployment

Rancher streamlines the application deployment process by providing built-in automation tools that allow you to deploy applications to multiple clusters with just a few clicks.

Monitoring and logging

Rancher provides a built-in monitoring and logging system that enables you to monitor the health and performance of your applications and clusters in real-time.

Application catalog

Rancher offers a curated catalog of pre-configured application templates that enable you to deploy and manage popular applications such as databases, web servers, and messaging queues.

Scalability and resilience

Rancher is designed to be highly scalable and resilient, enabling you to easily add new clusters or nodes to your deployment as your needs grow.

Extensibility

Rancher provides an open API and a rich ecosystem of plugins and extensions, enabling you to customize and extend the platform to meet your specific needs.

Service Mesh Data Planes & Control Planes - Envoy & Istio

Envoy:

Data Plane:

Envoy is a high-performance proxy that is deployed as a sidecar to each microservice in the infrastructure.
Envoy manages all inbound and outbound traffic for the microservice and provides features like load balancing, circuit breaking, and health checks.
Envoy can also be used as a standalone proxy outside of a service mesh architecture.
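Circuit breaking, one of the features listed above, can be illustrated with a short sketch of the general pattern: after repeated failures, calls to an upstream are rejected immediately for a cool-down period instead of being attempted. This is a simplified illustration only; it is not Envoy's implementation or configuration, and the class name, parameters, and thresholds are hypothetical.

```python
import time
from typing import Callable, Optional


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock  # injectable so the behavior can be tested without sleeping
        self._failures = 0
        self._opened_at: Optional[float] = None

    @property
    def is_open(self) -> bool:
        if self._opened_at is None:
            return False
        # Once the timeout has elapsed, a trial call is allowed through ("half-open").
        return self._clock() - self._opened_at < self.reset_timeout

    def call(self, func: Callable[[], object]) -> object:
        if self.is_open:
            raise CircuitOpenError("upstream marked unhealthy; failing fast")
        try:
            result = func()
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # open (or re-open) the circuit
            raise
        # A successful call closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

A caller would wrap each upstream request, for example `breaker.call(lambda: fetch_profile(user_id))`, where `fetch_profile` stands in for any function that may raise on failure.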

Control Plane:

Envoy does not have a built-in control plane.
It can be integrated with service mesh management solutions like Istio or Consul, which provide a central point of management for the Envoy proxies.
These control planes enable features like traffic management, security, and observability.

Istio:

Data Plane:

Istio uses Envoy as its data plane, which means that each microservice has an Envoy sidecar proxy that manages the inbound and outbound traffic for that service.
Envoy is configured and managed by Istio's control plane components.

Control Plane:

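Istio provides a built-in control plane. In releases before Istio 1.5 it consisted of the following separate components (later releases consolidate the control plane into a single istiod binary and no longer include Mixer):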
Pilot: responsible for managing the configuration of the Envoy proxies and enabling features like traffic routing and load balancing.
Mixer: provides policy enforcement, telemetry collection, and access control for the microservices in the service mesh.
Citadel: responsible for managing the security of the service mesh, including mutual TLS encryption and identity-based access control.

Network configurations and Service Discovery - Consul

Network Configurations:

Consul provides a central service registry that keeps track of all the services in the infrastructure.
Each microservice in the infrastructure registers itself with Consul, providing information like its IP address, port, and health status.
Consul also supports multiple datacenters, allowing for the deployment of services across different regions or availability zones.
Consul provides a DNS interface that can be used to discover services in the infrastructure. Applications can use this interface to resolve service names to IP addresses and connect to the appropriate service.

Service Discovery:

Consul provides a service discovery mechanism that enables microservices to discover and communicate with each other.
Consul supports different service discovery methods, including DNS, HTTP, and gRPC.
Consul can perform health checks on the services in the infrastructure to ensure that they are functioning properly. If a service fails a health check, it is removed from the service registry until it is healthy again.
Consul also supports service segmentation, allowing services to be grouped into logical subsets based on tags or other attributes. This enables more fine-grained control over service discovery and traffic routing.
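The registry, health-check, and tag-filtering behavior described above can be sketched with a toy in-memory model. This is an illustration of the concepts only and does not use or mirror Consul's API; the class names, method names, and addresses are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class ServiceInstance:
    name: str
    address: str
    port: int
    tags: Set[str] = field(default_factory=set)
    healthy: bool = True


class ServiceRegistry:
    """Toy in-memory registry; a real deployment would query Consul instead."""

    def __init__(self) -> None:
        self._instances: Dict[str, List[ServiceInstance]] = {}

    def register(self, instance: ServiceInstance) -> None:
        self._instances.setdefault(instance.name, []).append(instance)

    def report_health(self, name: str, address: str, healthy: bool) -> None:
        # Record the latest health-check result for one instance.
        for inst in self._instances.get(name, []):
            if inst.address == address:
                inst.healthy = healthy

    def discover(self, name: str, tag: Optional[str] = None) -> List[ServiceInstance]:
        """Return only healthy instances, optionally filtered by tag."""
        return [
            inst for inst in self._instances.get(name, [])
            if inst.healthy and (tag is None or tag in inst.tags)
        ]


registry = ServiceRegistry()
registry.register(ServiceInstance("web", "10.0.0.1", 8080, {"v1"}))
registry.register(ServiceInstance("web", "10.0.0.2", 8080, {"v2"}))
registry.report_health("web", "10.0.0.1", healthy=False)
print([inst.address for inst in registry.discover("web")])
```

After the failed health check, discovery returns only the second instance; reporting the first instance healthy again would make it discoverable once more.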

Securing Credentials - HashiCorp Vault

Secret Storage:

Vault provides a secure storage mechanism for sensitive data, including credentials, API keys, and other secrets.
Vault uses encryption and access control policies to ensure that secrets are protected both at rest and in transit.
Vault supports different storage backends, including disk, cloud storage, and key management systems.

Authentication:

Vault provides several authentication methods that can be used to validate user or machine identity.
These methods include LDAP, Active Directory, Kubernetes, and token-based authentication.
Vault also supports multi-factor authentication (MFA) to provide an additional layer of security.

Access Control:

Vault provides fine-grained access control policies that can be used to restrict access to specific secrets or resources.
These policies can be based on user or machine identity, time of day, and other factors.
Vault supports role-based access control (RBAC) and attribute-based access control (ABAC) policies.
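Fine-grained, path-based access control of the kind described above can be sketched as a lookup from secret paths to allowed capabilities, with everything denied by default. This is a simplified model for illustration and not Vault's policy engine: the policy contents, the `is_allowed` function, and the "longest matching pattern wins" rule are assumptions made for this sketch.

```python
from typing import Dict, List

# Hypothetical policy: path pattern -> capabilities granted on matching paths.
POLICY: Dict[str, List[str]] = {
    "secret/data/app/*": ["read"],
    "secret/data/app/admin/*": ["deny"],
    "secret/data/ci/token": ["read", "update"],
}


def _matches(pattern: str, path: str) -> bool:
    # Only a trailing "*" is treated as a wildcard (prefix match).
    if pattern.endswith("*"):
        return path.startswith(pattern[:-1])
    return pattern == path


def is_allowed(policy: Dict[str, List[str]], path: str, capability: str) -> bool:
    """Return True if the most specific matching rule grants the capability."""
    candidates = [pattern for pattern in policy if _matches(pattern, path)]
    if not candidates:
        return False  # default deny: no rule mentions this path
    most_specific = max(candidates, key=len)  # assumption: longest pattern wins
    capabilities = policy[most_specific]
    return "deny" not in capabilities and capability in capabilities


print(is_allowed(POLICY, "secret/data/app/db", "read"))
print(is_allowed(POLICY, "secret/data/app/admin/key", "read"))
```

The first check passes because the broad application rule grants read access, while the second is rejected because the more specific admin rule denies it.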

Encryption:

Vault provides end-to-end encryption for all secrets stored in its storage backend.
Vault uses encryption keys that are stored separately from the secrets themselves, providing an additional layer of security.
Vault supports different encryption algorithms and key management systems.

Auditing and Logging:

Vault provides detailed auditing and logging capabilities that can be used to track access to secrets and detect potential security threats.
Vault logs all user and system activity, including authentication events, secret access, and configuration changes.
Vault also supports integration with popular logging and monitoring tools.

Infrastructure Monitoring Tool - Prometheus with Grafana

Prometheus

Introduction
Introduction to Prometheus
Prometheus installation
Grafana with Prometheus Installation

Monitoring
Introduction to Monitoring
Client Libraries
Pushing Metrics
Querying
Service Discovery
Exporters

Alerting
Introduction to Alerting
Setting up Alerts

Internals
Prometheus Storage
Prometheus Security
TLS & Authentication on Prometheus Server
Mutual TLS for Prometheus Targets

Use Cases
Monitoring a web application
Calculating Apdex score
CloudWatch Exporter
Grafana Provisioning
Consul Integration with Prometheus
EC2 Auto Discovery
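
For the "Calculating Apdex score" use case, the standard Apdex formula is (satisfied + tolerating / 2) / total, where a request is satisfied if it completes within a threshold T and tolerating if it completes within 4T. The sketch below computes it from raw latencies in plain Python. In Prometheus the same ratio is normally derived from histogram buckets, so the `apdex` function and its sample values are only an illustrative assumption.

```python
from typing import Sequence


def apdex(latencies_s: Sequence[float], threshold_s: float) -> float:
    """Apdex score for request latencies (seconds) and a target threshold T."""
    if not latencies_s:
        raise ValueError("at least one latency sample is required")
    satisfied = sum(1 for x in latencies_s if x <= threshold_s)
    tolerating = sum(1 for x in latencies_s if threshold_s < x <= 4 * threshold_s)
    # Frustrated requests (slower than 4T) contribute nothing to the score.
    return (satisfied + tolerating / 2) / len(latencies_s)


# Two satisfied, two tolerating, one frustrated request with T = 0.5 s.
score = apdex([0.1, 0.2, 0.6, 1.0, 3.0], threshold_s=0.5)
```

A score of 1.0 means every request was satisfied; lower scores indicate a larger share of slow requests.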

Grafana

Installation
Installing on Ubuntu / Debian
Installing on CentOS / Red Hat
Installing on Windows
Installing on Mac
Installing using Docker
Building from source
Upgrading

Administration
Configuration
Authentication
Permissions
Grafana CLI
Internal metrics
Provisioning
Troubleshooting

Log Monitoring Tool - Elasticsearch, Logstash, Kibana (ELK Stack)

Introduction to Elasticsearch
Overview of the Elastic Stack (ELK+)
Elastic Stack

Architecture of Elasticsearch
Nodes & Clusters
Indices & Documents
A word on types
Another word on types
Sharding
Replication
Keeping replicas synchronized
Searching for data
Distributing documents across shards

Installing Elasticsearch & Kibana
Running Elasticsearch & Kibana in Elastic Cloud
Installing Elasticsearch on Mac/Linux
Using the MSI installer on Windows
Installing Elasticsearch on Windows
Configuring Elasticsearch
Installing Kibana on Mac/Linux
Installing Kibana on Windows
Configuring Kibana
Kibana now requires data to be available
Introduction to Kibana and dev tools

Managing Documents
Creating an index
Adding documents
Retrieving documents by ID
Replacing documents
Updating documents
Scripted updates
Upserts
Deleting documents
Deleting indices
Batch processing
Importing test data with cURL
Exploring the cluster

Mapping
Introduction to mapping
Dynamic mapping
Meta fields
Field data types
Adding mappings to existing indices
Changing existing mappings
Mapping parameters
Adding multi-fields mappings
Defining custom date formats
Picking up new fields without dynamic mapping

Analysis & Analyzers
Introduction to the analysis process
A closer look at analyzers
Using the Analyze API
Understanding the inverted index
Analyzers
Overview of character filters
Overview of tokenizers
Overview of token filters
Overview of built-in analyzers
Configuring built-in analyzers and token filters
Creating custom analyzers
Using analyzers in mappings
Adding analyzers to existing indices
A word on stop words

Introduction to Searching
Search methods
Searching with the request URI
Introducing the Query DSL
Understanding query results
Understanding relevance scores
Debugging unexpected search results
Query contexts
Full text queries vs term level queries
Basics of searching

Term Level Queries
Introduction to term level queries
Searching for a term
Searching for multiple terms
Retrieving documents based on IDs
Matching documents with range values
Working with relative dates (date math)
Matching documents with non-null values
Matching based on prefixes
Searching with wildcards
Searching with regular expressions
Term Level Queries

Full Text Queries
Introduction to full text queries
Flexible matching with the match query
Matching phrases
Searching multiple fields
Full Text Queries

Adding Boolean Logic to Queries
Introduction to compound queries
Querying with boolean logic
Debugging bool queries with named queries
How the “match” query works

Incident Response Tool - PagerDuty & Opsgenie

Alert Management:

Both PagerDuty and Opsgenie provide powerful alert management capabilities, allowing teams to configure alerts based on specific criteria, such as event severity, priority, and more.
Alerts can be sent to multiple channels, including email, SMS, voice, and mobile push notifications.
Both tools also provide support for escalation policies, allowing teams to ensure that critical alerts are addressed promptly.
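
The idea of an escalation policy can be sketched in a few lines. This is not the PagerDuty or Opsgenie API: `EscalationLevel`, `current_responder`, and the responder names are hypothetical, and the model simply assumes that an unacknowledged alert moves to the next level once that level's delay has elapsed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class EscalationLevel:
    after_minutes: int  # delay since the alert fired
    responder: str      # who is notified at this level


def current_responder(
    levels: List[EscalationLevel], minutes_unacknowledged: int
) -> Optional[str]:
    """Return the responder for the highest level whose delay has elapsed."""
    responder = None
    for level in sorted(levels, key=lambda lvl: lvl.after_minutes):
        if minutes_unacknowledged >= level.after_minutes:
            responder = level.responder
    return responder


policy = [
    EscalationLevel(0, "primary-on-call"),
    EscalationLevel(15, "secondary-on-call"),
    EscalationLevel(30, "engineering-manager"),
]
```

With this policy, an alert that has been unacknowledged for 20 minutes would be routed to `secondary-on-call`.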

Incident Management:

Both PagerDuty and Opsgenie provide incident management capabilities, allowing teams to track incidents and collaborate on resolving them.
Incident management features include creating incidents, adding notes, assigning owners, and tracking status changes.
Both tools also provide support for incident timelines, allowing teams to visualize the progress of an incident over time.

Integration:

Both PagerDuty and Opsgenie provide extensive integration capabilities, allowing teams to integrate with a wide range of tools and technologies.
Integrations include popular monitoring tools, such as Nagios, New Relic, and AWS CloudWatch, as well as IT service management (ITSM) tools like JIRA and ServiceNow.
Both tools also provide REST APIs for custom integrations.

Analytics and Reporting:

Both PagerDuty and Opsgenie provide analytics and reporting capabilities, allowing teams to track performance metrics and identify areas for improvement.
Analytics and reporting features include incident duration, resolution times, and other key performance indicators (KPIs).
Both tools also provide support for custom dashboards and reports.

Automation:

Both PagerDuty and Opsgenie provide automation capabilities, allowing teams to automate repetitive tasks and streamline incident response processes.
Automation features include auto-acknowledgment of alerts, auto-escalation of incidents, and auto-remediation of issues.
Both tools also provide support for scripting and custom automation workflows.

Production Environment Job Scheduler and Run Book Automation - RunDeck

Job Scheduling:

RunDeck provides powerful job scheduling capabilities, allowing teams to schedule jobs based on specific criteria, such as time, date, and recurrence.
Jobs can be executed on multiple platforms, including Windows, Linux, and macOS.
RunDeck also provides support for job dependencies, allowing teams to ensure that jobs are executed in the correct order.
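
Executing jobs in dependency order is a topological sort. The sketch below uses Python's standard `graphlib` module (Python 3.9+) rather than anything RunDeck-specific. The job names and the `job_dependencies` mapping are hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Hypothetical jobs: each key lists the jobs that must finish before it runs.
job_dependencies: Dict[str, Set[str]] = {
    "deploy": {"build", "migrate"},
    "migrate": {"backup"},
    "build": set(),
    "backup": set(),
}


def execution_order(dependencies: Dict[str, Set[str]]) -> List[str]:
    """Return one valid run order; raises graphlib.CycleError on circular dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(job_dependencies)
```

Any order returned this way runs `backup` before `migrate`, and both `build` and `migrate` before `deploy`.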

Run Book Automation:

RunDeck provides run book automation capabilities, allowing teams to automate repetitive tasks and streamline operations.
Run book automation features include executing commands, scripts, and workflows on multiple systems, as well as orchestrating complex processes across multiple systems.
RunDeck also provides support for auditing and logging, allowing teams to track changes and monitor system activity.

Integration:

RunDeck provides extensive integration capabilities, allowing teams to integrate with a wide range of tools and technologies.
Integrations include popular configuration management tools, such as Ansible and Puppet, as well as monitoring tools like Nagios and Zabbix.
RunDeck also provides REST APIs for custom integrations.

Access Control:

RunDeck provides access control capabilities, allowing teams to control who can access and execute jobs and workflows.
Access control features include role-based access control (RBAC), LDAP integration, and multi-factor authentication (MFA).
RunDeck also provides support for audit logging, allowing teams to track user activity and changes to system configurations.

Notifications and Reporting:

RunDeck provides notifications and reporting capabilities, allowing teams to track performance metrics and identify areas for improvement.
Notifications and reporting features include job execution status, error notifications, and custom reports.
RunDeck also provides support for custom dashboards and reports.

Application Performance Monitoring - AppDynamics

Application Performance Monitoring:

AppDynamics provides powerful application performance monitoring capabilities, allowing teams to monitor the performance of their applications in real-time.
APM features include application topology maps, transaction tracing, code-level diagnostics, and performance baselines.
AppDynamics also provides support for identifying and troubleshooting performance issues, such as slow database queries, inefficient code, and memory leaks.

End-User Monitoring:

AppDynamics provides end-user monitoring capabilities, allowing teams to track the performance of their applications from the end-user perspective.
End-user monitoring features include real-user monitoring, synthetic monitoring, and business transaction monitoring.
AppDynamics also provides support for identifying and troubleshooting end-user issues, such as slow page load times and errors.

Infrastructure Monitoring:

AppDynamics provides infrastructure monitoring capabilities, allowing teams to monitor the health and performance of their infrastructure.
Infrastructure monitoring features include server monitoring, container monitoring, and cloud monitoring.
AppDynamics also provides support for identifying and troubleshooting infrastructure issues, such as high CPU usage, low memory, and network latency.

Integration:

AppDynamics provides extensive integration capabilities, allowing teams to integrate with a wide range of tools and technologies.
Integrations include popular monitoring tools, such as Splunk and Elasticsearch, as well as IT service management (ITSM) tools like ServiceNow and Remedy.
AppDynamics also provides REST APIs for custom integrations.

Analytics and Reporting:

AppDynamics provides analytics and reporting capabilities, allowing teams to track performance metrics and identify areas for improvement.
Analytics and reporting features include transaction analysis, error analysis, and custom dashboards.
AppDynamics also provides support for machine learning and predictive analytics, allowing teams to proactively identify and address performance issues.

ArgoCD - Agenda

Session 1: Introduction to ArgoCD

Overview of ArgoCD and its features
Understanding the role of ArgoCD in GitOps workflows
Key concepts and components of ArgoCD

Session 2: Installing and Configuring ArgoCD

Preparing the environment for ArgoCD installation
Step-by-step installation guide for ArgoCD
Configuring ArgoCD server and connecting it to the Git repository

Session 3: ArgoCD Architecture and Components

Understanding the architecture of ArgoCD
Exploring the various components of ArgoCD, such as the API server, controller, and repository server

Session 4: Deploying Applications with ArgoCD

Creating applications in ArgoCD
Configuring application specifications using GitOps manifests
Deploying applications and managing their lifecycle with ArgoCD

Session 5: Continuous Delivery with ArgoCD

Implementing continuous delivery pipelines using ArgoCD
Automating application updates and rollbacks with ArgoCD
Monitoring and managing application deployments with ArgoCD

Session 6: Advanced ArgoCD Features

Exploring advanced features of ArgoCD, such as RBAC and secrets management
Integrating ArgoCD with other tools and services, like Kubernetes, Helm, and Prometheus

Session 7: Troubleshooting and Best Practices

Common issues and troubleshooting techniques in ArgoCD
Best practices for managing and maintaining ArgoCD deployments
Tips for optimizing performance and scalability in ArgoCD

Conclusion

The attributes of SRE

“There are a lot of attributes SRE would share with any engineering discipline: pragmatic, objective, articulate, expressive,” says Theo Schlossnagle, founder of Circonus. “However, one that sets itself apart is a desire to straddle layers of abstraction.”

1. Operations is a software problem

“The basic tenet of SRE is that doing operations well is a software problem. SRE should therefore use software engineering approaches to solve that problem.”

2. Manage by Service Level Objectives (SLOs)

Maintaining 100% availability isn’t the goal of SRE. “Instead, the product team and the SRE team select an appropriate availability target for the service and its user base, and the service is managed to that SLO. Deciding on such a target requires strong collaboration from the business.”

3. Work to minimize toil

Toil is tedious, manual work. SRE doesn’t accept toil as the default. “We believe that if a machine can perform a desired operation, then a machine often should. This is a distinction (and a value) not often seen in other organizations, where toil is the job, and that’s what you’re paying a person to do.”

4. Automate this year’s job away

Automation goes hand-in-hand with reducing toil by “determining what to automate, under what conditions, and how to automate it.”

5. Move fast by reducing the cost of failure

The later a problem is discovered, the harder it is to fix. SRE addresses this issue. “SREs are specifically charged with improving undesirably late problem discovery, yielding benefits for the company as a whole.”

6. Share ownership with developers

SRE aims to reduce boundaries. “Ideally, both product development and SRE teams should have a holistic view of the stack—the frontend, backend, libraries, storage, kernels, and physical machine—and no team should jealously own single components.”

7. Use the same tooling, regardless of function or job title

In SRE, you can’t have different teams using different sets of tools. “There is no good way to manage a service that has one tool for the SREs and another for the product developers, behaving differently (and potentially catastrophically so) in different situations. The more divergence you have, the less your company benefits from each effort to improve each individual tool.”
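
To make attribute 2 (managing by SLOs) concrete, an availability target translates directly into an error budget: the amount of downtime the service may accumulate in a window before it misses its SLO. The sketch below shows the arithmetic. The function names and the 30-day window are assumptions chosen for illustration.

```python
def error_budget_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Allowed downtime in minutes for an availability SLO over a window."""
    if not 0 < slo_percent <= 100:
        raise ValueError("slo_percent must be in the range (0, 100]")
    window_minutes = window_days * 24 * 60
    return window_minutes * (100 - slo_percent) / 100


def budget_remaining(
    slo_percent: float, downtime_minutes: float, window_days: int = 30
) -> float:
    """Minutes of budget left; a negative value means the SLO has been missed."""
    return error_budget_minutes(slo_percent, window_days) - downtime_minutes
```

For example, a 99.9% target over 30 days allows roughly 43.2 minutes of downtime, so 10 minutes of outage leaves about 33.2 minutes of budget for releases and experiments.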

INTERVIEW

As part of this course, you will be given a complete interview preparation kit to get you ready for the SRE hot seat. The kit draws on 200+ years of combined industry experience and the experiences of nearly 10,000 DevOpsSchool SRE learners worldwide.

PROJECTS

To put your knowledge into action, you will be required to work on one real-time, scenario-based industry project that covers significant real-world use cases. The project is fully in line with the modules in the curriculum and will help you understand a real work environment.

OUR COURSE IN COMPARISON

FEATURES	DEVOPSSCHOOL	OTHERS
Faculty Profile Check
Lifetime Technical Support
Lifetime LMS access
Top 26 Tools
Training + Additional Videos
Real time scenario projects
Interview KIT (Q&A)
Training Notes
Step by Step Web Based Tutorials
Training Slides

AQA vs ADEV vs SRE vs DEVOPS vs DSOCP vs MDE

Why are SRE skills essential for software engineers?

This is the era of IT, and the whole world has moved online. For shops, banks, service industries, and every other kind of business, it is crucial to bring services up quickly and to prevent subsequent failures for as long as possible.

Consider services such as Gmail, Google, Walmart, Netflix, Facebook, and Twitter, or the operations behind e-commerce sites, global banks, and search engines: they run for long periods without visible failure, and we hardly remember the last time they were down. According to Gartner, the average cost of downtime is around $5,600 per minute, rising to $2 million per minute in the case of Amazon. The way we manage systems and their workloads has changed. How is it possible to keep all these services running continuously, 24x7, 365 days a year, under enormous volumes of requests and clicks and with continuous change and improvement? Behind the scenes, the principles of Site Reliability Engineering (SRE) are at work.

The reliability of websites, cloud applications, and cloud infrastructure has become an important business need. Today we rarely think about high-performance servers; instead we use cloud services that pool commodity servers through virtualization. The focus has shifted from hardware to software-defined infrastructure, and from inconsistent, error-prone manual processes to consistent, reliable, and repeatable automated tasks. A Site Reliability Engineer (SRE) is someone who takes care of, and is accountable for, the availability, performance, monitoring, and incident response, among other things, of the platforms and services that a business runs and owns.

The SRE methodology and principles establish a healthy and productive interaction between the development and SRE teams, using SLOs and error budgets to balance the speed of new features against the work needed to make the software reliable. SREs care about every step of the process, from source code to deployment. To succeed, they therefore need specialized expertise and a range of tools in their arsenal, along with strong trust between teams.

How will our SRECP course help?

The goal of our SRE course is to turn a software engineer or operations engineer into a Certified SRE Engineer. Our curriculum covers the skills you need to develop, the mindset shift that needs to take place, and the practical work experience you should pursue before moving into an SRE role.
Our SRE training walks you through the concepts, principles, and approach to service management, from the basics to the advanced topics of site reliability engineering. You will see real-world examples and use cases of how companies use the SRE approach to ensure that their services are exactly as reliable as they need to be. You will also learn which technical and professional skills an SRE needs in order to embed within development teams, and the cultural and human aspects that make up a good SRE team and drive successful implementation.
Our SRE curriculum and certification are accredited by DevOpsCertification.co.
The SRE training is delivered by accredited trainers, highly experienced professionals with 15+ years of industry experience who have trained more than 5000 professionals.

Pre-requisites

There are no specific prerequisites, but IT experience, operations experience, or DevOps knowledge is recommended.

UPCOMING EVENTS - OTHER CERTIFICATION COURSES

TRACK	COURSE	DATE
SRE	Site Reliability Engineering	January 2021
DSOCP	DevSecOps Certified Professional	January 2021
DCA	Docker Certified Associate	January 2021
CKA	Certified Kubernetes Administrator	January 2021
Splunk	Master in Splunk Engineering	January 2021
Python	Master in Python Programming	January 2021

Need Assistance

Feel Free To Contact Us -

1800 889 7977

(India Toll Free)

+91 7004 215 841

(Worldwide)

For More Queries-

Contact@DevOpsSchool.com

RELATED COURSE

DevOps Certified Professional
Site Reliability Engineering Courses
Master in DevOps Engineering (MDE)
DevSecOps Certified Professional
Agile QA
Full Stack Developers Training
Azure DevOps Training
Docker Certified Associate
Kubernetes Certification Courses
Terraform Training
Ansible Training
AWS Certified Solution Architect Associate