Automatically Detecting Anomalies and Outliers in Real-Time


by DevOpsSchool.com

Rajesh Kumar

(Senior DevOps Manager & Principal Architect)


Rajesh Kumar is an award-winning academician and consultant trainer with more than 15 years of experience in diverse skill management and more than a decade of experience training large and diverse groups across multiple industry sectors.

Outline



  • Monitoring
  • Alerting
  • Outlier vs. Anomaly Detection
  • Outlier Detection Algorithms
  • Anomaly Detection Algorithms

Monitor Everything


Datadog gathers performance data from all your application components.


Alerting


Alerting?


Outlier and Anomaly Detection


Outlier Detection


Outlier Detection Algorithms



Robust Outlier Detection Algorithms


Median Absolute Deviation





Parameters: Tolerance, Pct
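
The slide names the two parameters without defining them. The following is a minimal sketch of MAD-based outlier detection across a group of series, under two assumptions that are not stated in the slides: that Tolerance is the number of median absolute deviations a point must lie from the pooled median to count as outlying, and that Pct is the percentage of a series' points that must be outlying before the whole series is flagged. The function name `mad_outlier_series` and the default values are hypothetical.

```python
from statistics import median

def mad_outlier_series(series_by_host, tolerance=3.0, pct=50.0):
    """Return the names of series flagged as outliers (assumed semantics)."""
    # Pool every point from every series to estimate what "normal" looks like.
    pooled = [x for values in series_by_host.values() for x in values]
    center = median(pooled)
    # Median absolute deviation: the median distance from the pooled median.
    mad = median([abs(x - center) for x in pooled])
    flagged = []
    for name, values in series_by_host.items():
        if not values:
            continue  # nothing to judge for an empty series
        if mad == 0:
            # Degenerate case: at least half the pooled points equal the median.
            outlying = sum(1 for x in values if x != center)
        else:
            outlying = sum(1 for x in values
                           if abs(x - center) > tolerance * mad)
        # Flag the series when enough of its points are outlying.
        if 100.0 * outlying / len(values) >= pct:
            flagged.append(name)
    return flagged

hosts = {
    "web-1": [10, 11, 9, 10],
    "web-2": [10, 10, 11, 9],
    "web-3": [30, 31, 29, 30],
}
print(mad_outlier_series(hosts))
```

Because both the center and the spread are medians, a minority of misbehaving hosts barely moves the baseline, which is the sense in which the measure is robust.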

DBSCAN


Parameters: epsilon, min_samples
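
In the standard formulation of DBSCAN, epsilon is the neighborhood radius and min_samples is the number of points (counting the point itself) that must fall inside that radius for a point to be a core point. Points that are not reachable from any core point are labelled as noise, which is what makes the algorithm usable for outlier detection. The sketch below is a plain-Python illustration of that textbook algorithm, not the implementation used in the talk; it runs in quadratic time and the helper names are hypothetical.

```python
import math

NOISE = -1

def dbscan(points, epsilon, min_samples):
    """Label each point with a cluster id, or NOISE (-1) if it is an outlier."""
    def neighbors(i):
        # Indices of all points within epsilon of point i (including i itself).
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= epsilon]

    labels = [None] * len(points)    # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = NOISE        # may be relabelled later as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not core
            if labels[j] is not None:
                continue
            labels[j] = cluster
            more = neighbors(j)
            if len(more) >= min_samples:
                queue.extend(more)   # j is a core point, so keep expanding
    return labels

points = [(1.0,), (1.1,), (0.9,), (1.05,), (5.0,)]
print(dbscan(points, epsilon=0.3, min_samples=3))
```

To apply this to a group of hosts, one option (an assumption here) is to treat each host's series over a window as a single point whose coordinates are its values at each timestamp.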

MAD or DBSCAN?



Some subtleties



Anomaly Detection


An Investigation



Anomalies



A time series point is an anomaly if:


  • given the past points in the series, the point in question is unlikely under your model of the past;

and you should alert on a set of anomalies if:


  • they are a symptom of an issue you care about.
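
The two conditions above can be sketched in code. This is a minimal illustration under stated assumptions, not the model from the talk: the "model of the past" is taken to be the median and median absolute deviation of a history window, a point is "unlikely" when it lies more than `threshold` scaled deviations from the median, and `min_fraction` is a hypothetical stand-in for the user's judgment about which sets of anomalies are worth an alert.

```python
from statistics import median

def is_anomaly(history, point, threshold=3.0):
    """Return True if `point` is unlikely given a median/MAD model of `history`."""
    center = median(history)
    mad = median([abs(x - center) for x in history])
    if mad == 0:
        # The history is essentially constant, so any change stands out.
        return point != center
    # 1.4826 scales the MAD so it is comparable to a standard deviation
    # when the history is roughly normally distributed.
    return abs(point - center) / (1.4826 * mad) > threshold

def should_alert(flags, min_fraction=0.5):
    """Alert only when enough of the recent points were flagged as anomalies."""
    return bool(flags) and sum(flags) / len(flags) >= min_fraction

history = [10, 11, 9, 10, 12, 10, 11]
recent = [11, 25, 27, 26]
flags = [is_anomaly(history, p) for p in recent]
print(flags, should_alert(flags))
```

Separating the per-point test from the alerting rule keeps a single unusual point from paging anyone while still recording it.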

Our Approach



  1. Extract as much signal as we can from the time series.

  2. Use robust statistical measures when creating the model.

  3. Give the user control over when they get alerted.

What’s Normal?



Past Performance...



Decomposition


Autocorrelation
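
Only the slide title survives here, so the following is an illustration of the general technique rather than of the slide's content. One common use of autocorrelation in this setting is estimating the seasonal period of a metric: the series is compared with lagged copies of itself, and the lag with the strongest correlation is taken as the period. The function names and the lag search range are assumptions.

```python
import math

def autocorrelation(values, lag):
    """Sample autocorrelation of `values` at the given positive lag."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values)
    if variance == 0 or not 0 < lag < n:
        return 0.0
    # Covariance between the series and itself shifted by `lag` steps.
    covariance = sum((values[t] - mean) * (values[t + lag] - mean)
                     for t in range(n - lag))
    return covariance / variance

def dominant_period(values, min_lag, max_lag):
    """Lag in [min_lag, max_lag] with the highest autocorrelation."""
    return max(range(min_lag, max_lag + 1),
               key=lambda lag: autocorrelation(values, lag))

# Ten synthetic "days" of an hourly metric with a 24-hour cycle.
values = [math.sin(2 * math.pi * t / 24) for t in range(240)]
print(dominant_period(values, min_lag=12, max_lag=36))
```

The search starts at `min_lag` rather than 1 because smooth series are always highly correlated at very short lags, which would otherwise mask the seasonal peak.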


Signal vs. Noise


Signal vs. Noise vs. Signal


Real-time Anomaly Detection



Anomaly Detection



Robust Anomaly Detection



Alerting



Recap



  • Extract as much signal as you can.
  • Use robust statistical measures.
  • Alert judiciously.
  • Don’t over-optimize.

Anomalies or Noise?



Thanks!


Appendix


See It All In One Place


Your Servers, Your Clouds, Your Metrics, Your Apps, Your Team. Together.

Flexible Pricing


To Match Your Dynamic Infrastructure.

DevOpsSchool Community Networks


These platforms let you connect with peers and industry DevOps leaders, share and discuss the latest topics and developments in DevOps culture, and grow your professional DevOps network.

DevOps
Build & Release
DevOpsSchool
DevOps Group
BestDevOps.com
      

Any Questions?


Thank You!


DevOpsSchool: Let's Learn, Share & Practice DevOps

www.devopsschool.com

Connect with us on
contact@devopsschool.com | +91 700 483 5930
     
