It is constructed specifically to demonstrate real-world performance engineering concepts in .NET, using a scenario every enterprise system deals with: bulk inserts, EF Core behavior, SQL bottlenecks, GC pressure, and threadpool starvation.

Below is the clear breakdown of what each endpoint proves, what performance concept it demonstrates, and what your trainees should observe.

Sample Code – https://github.com/devopsschool-demo-labs-projects/Performance-Engineering-in-.NET/tree/master/Lab1/PerfLabDemov1

✅ 1. What Naive Bulk Insert Proves

Endpoint:

api/orders/bulk-naive?count=1000

What the code does:

Loops 1000 times.
Adds 1 entity.
Calls SaveChanges() 1000 times.

What it demonstrates:

A. EF Core is very slow when SaveChanges() is called repeatedly

SaveChanges triggers:

Change tracking validation
DB command generation
Opening a DB connection (sometimes pooled)
Executing the SQL INSERT
Committing a transaction

Doing it 1000 times = big overhead.

B. SQL Server becomes chatty

You generate 1000 round-trips to the DB engine.

This simulates real-world anti-patterns:

Chatty repositories
Overuse of Insert() inside loops
Poor DDD aggregates causing many tiny writes
Repository pattern over-abstraction

C. ThreadPool starvation / long-running synchronous IO

SaveChanges() waits for the DB:

Can exhaust threadpool threads
Causes slower Kestrel handling
Latency increases dramatically

D. GC Pressure

Creating many objects inside a loop causes:

More allocations
More Gen0/Gen1 collections
More pauses

Which you can show live using:

dotnet-counters monitor --process-id {pid}

Scientific Output Audience Should Observe

Metric	Naive Bulk	Impact
CPU	Low (waiting on DB)	Slow
Memory	Higher	More GC cycles
Time	Slowest	5–30 sec
DB Calls	1000 INSERTs	Very chatty
Locks	More	Slows everyone else

2. What Optimized Bulk Insert Proves

Endpoint:

api/orders/bulk-optimized?count=1000

What the code does:

Adds 1000 entities into DbContext
Calls SaveChanges once

What it demonstrates:

A. EF Core’s Unit-of-Work Model Works Best

One SaveChanges:

One transaction
Batch of INSERTs
Less change tracker overhead

B. Up to 50x Faster Performance

Typical behavior:

Naive: 10–40 seconds
Optimized: 100–400 ms

C. Far fewer SQL round-trips

Instead of 1000 statements:

SQL batches efficiently
Reduces latency
Reduces network round-trips

D. Lower GC Pressure

One batch = fewer objects alive at once → fewer collections.

Scientific Output Audience Should Observe

Metric	Optimized Bulk	Improvement
CPU	Slightly higher	Good (doing real work)
Memory	Lower	Steady
Time	Very fast	10–40x faster
DB Calls	1	Huge gain
Locks	Minimal	Helps concurrency

3. Check Count — Proves Persistence

Endpoint:

api/orders/count

Purpose:

Prove that both naive & optimized inserts actually inserted data.
Visualize DB growth.
Show effect of concurrent inserts.

In training, we proved:

DB is storing state
Optimized insert quickly increases count
Naive insert slowly increases count

This makes the performance difference real and visible.

Overall What This Demo Proves

This small demo actually teaches 6 core performance engineering concepts:

✔ 1. Impact of chatty database I/O

1000 tiny writes vs 1 batch write.

✔ 2. EF Core change tracking and SaveChanges cost

Understanding EF internals.

✔ 3. Async vs Sync I/O behavior

Bulk naive blocks threads → causes starvation.

✔ 4. Garbage Collection impact

Small object allocation patterns in loops.

✔ 5. SQL Server round-trip cost

Network + parsing + locking overhead.

✔ 6. Importance of batching & correct architectural patterns

Avoid anti-patterns like:

Repository over-abstraction
N+1 operations
Per-row SaveChanges

How to Experience Demo

I. Run Naive Bulk Insert

It hangs or takes long.
Show CPU, memory, threadpool, GC logs.

II. Run Optimized Bulk Insert

Instant.
Show difference in metrics.

III. Ask the audience: Why did it happen?

Discuss:

DB round-trips
EF Core internals
Transaction management
GC
ThreadPool starvation

IV. Tie it to real architecture

“Your microservice might look healthy, but if its DB access pattern is naive, you will suffer scaling issues.”

This is exactly the pain point of real enterprise systems.

I’ll give you:

A simple performance chart dataset (Naive vs Optimized) you can paste into Excel/PowerPoint.
A perf counters command list (dotnet-counters + PerfMon).
Ready-to-use scripts for Postman, k6, and JMeter.

🟦 1. Process Memory (Private Bytes)

In your screenshot:

Memory rises gradually → EF Core and SQL client allocating objects.
Yellow triangles (GC events) → frequent Gen0/Gen1 garbage collections.
After a spike, memory stabilizes because GC reclaimed space.

🎯 Training Insight:

Naive bulk insert = more allocations = more frequent GC = lower throughput

Each SaveChanges call causes allocations of:
- SQL parameters
- Commands
- Transaction objects
- Change tracker nodes
So memory climbs quickly and GC fires often.

Optimized bulk will show:

A single memory spike
Fewer GC events
Faster completion

🟩 2. CPU (% of all processors)

Your screenshot shows:

CPU is mostly idle
Small bumps, but never sustained high usage

🎯 Training Insight:

Naive bulk insert does not maximize CPU — it blocks waiting for DB I/O.

That’s why:

CPU stays low → app is waiting on the database
The bottleneck is not CPU → it’s SQL round-trips

Optimized bulk will show:

Higher CPU bursts (doing real work)
Shorter total duration

🟦 3. ThreadPool Thread Count

Graph shows slow growth from 8 → ~12 threads.

🎯 Training Insight:

This indicates ThreadPool grows because threads are waiting on database I/O.

Naive bulk:

1000 calls to SaveChangesAsync() = 1000 I/O waits
ThreadPool sees “too many blocking tasks” and tries to grow

This is how real production APIs experience:

ThreadPool starvation
Request queues
High latency

Optimized bulk stays closer to 8 threads (default), because the work is not blocking.

🟧 4. GC Heap Size

Increasing heap → many small allocations from:

EF Core
SQL Client
LINQ
Loops & entity allocations

Frequent GC pauses slow down naive bulk significantly.

Optimized bulk = fewer allocations, fewer collections.

1️⃣ Performance Chart – Naive vs Optimized (Sample Data)

These are example numbers you can either:

Use as-is for slides, or
Replace with your actual measurements from your machine.

Suggested Scenario

1000 orders (count=1000)
Single request to /bulk-naive and /bulk-optimized

Example Results (for chart)

Metric	Naive Bulk Insert	Optimized Bulk Insert
Total time (ms)	12000	600
Requests/sec (effective)	~83	~1666
SQL INSERT statements executed	1000	1 (batched)
CPU utilization (peak, %)	25%	40% (short burst)
GC Gen0 collections during run	15	4
Avg DB round-trip time (ms)	8–12	5–8

How to turn this into a chart

Chart 1 – Response Time (ms)

X-axis: Naive, Optimized
Y-axis: Total time (ms)
Values: 12000 vs 600 (log scale works nicely)

Chart 2 – SQL Round-Trips

X-axis: Naive, Optimized
Y-axis: Number of DB round-trips
Values: 1000 vs 1

Chart 3 – Requests/sec (effective throughput)

X-axis: Naive, Optimized
Y-axis: Requests/sec
Values: 83 vs 1666

On the slide, your message is:

“Same functionality. Only difference is how we use EF Core and SaveChanges – and performance differs by ~20x.”

2️⃣ Perf Counters Command List

A. Using `dotnet-counters` (recommended in session)

Find the process ID of your running API:

dotnet tool install --global dotnet-counters
dotnet-counters ps
Code language: PHP (php)

Note the PID for Orders.Api.

Monitor runtime + ASP.NET + GC counters live:

dotnet-counters monitor \
  --process-id <PID> \
  System.Runtime \
  Microsoft.AspNetCore.Hosting \
  Microsoft.AspNetCore.Http.Connections \
  Microsoft.EntityFrameworkCore

dotnet-counters monitor --process-id 55788 System.Runtime Microsoft.AspNetCore.Hosting Microsoft.AspNetCore.Http.Connections Microsoft.EntityFrameworkCore

Code language: CSS (css)

Useful counters to point out:

System.Runtime
- cpu-usage
- gc-heap-size
- gen-0-gc-count, gen-1-gc-count, gen-2-gc-count
- threadpool-thread-count
Microsoft.AspNetCore.Hosting
- requests-per-second
- total-requests
- current-requests
Microsoft.EntityFrameworkCore
- active-db-contexts
- queries-per-second (depending on provider/version)

Run dotnet-counters while you hit:

curl -X POST "http://localhost:5000/api/orders/bulk-naive?count=1000"
Code language: JavaScript (javascript)

then:

curl -X POST "http://localhost:5000/api/orders/bulk-optimized?count=1000"
Code language: JavaScript (javascript)

…and talk through the visible differences (GC counts, RPS, etc.).

B. Windows PerfMon (classic counters)

Open perfmon.exe → Add these counters:

Processor

Processor(_Total)\% Processor Time

Process (dotnet / w3wp if hosted under IIS)

Process(dotnet)\% Processor Time
Process(dotnet)\Private Bytes
Process(dotnet)\Working Set

.NET CLR Memory

.NET CLR Memory(Orders.Api)\# Gen 0 Collections
.NET CLR Memory(Orders.Api)\# Gen 2 Collections
.NET CLR Memory(Orders.Api)\% Time in GC

SQL Server

SQLServer:SQL Statistics\Batch Requests/sec
SQLServer:SQL Statistics\SQL Compilations/sec
SQLServer:SQL Statistics\SQL Re-Compilations/sec

This lets you show:

Naive = many batches/sec, more compilations, more GC.
Optimized = fewer batches/sec, smoother GC.

✅ How to Add These Counters in Windows PerfMon

1. Open PerfMon

Press Win + R
Type: perfmon.exe
Press Enter

2. Add Counters

In the left panel:

Expand Monitoring Tools
Click Performance Monitor
Click the green “+” button (Add Counters)

Now you will add counters category by category.

✅ A. Processor Counters

Category: Processor

Select Processor
Select counter:
- % Processor Time
Select instance:
- _Total
Click Add

✅ B. Process Counters (dotnet / w3wp)

Category: Process

Select Process
Select counters:
- % Processor Time
- Private Bytes
- Working Set
Select instance:
- If running as console → dotnet
- If hosted in IIS → w3wp
Click Add

(You can add multiple counters at once before clicking Add.)

✅ C. .NET CLR Memory Counters

Category: .NET CLR Memory

Select .NET CLR Memory
Select counters:
- # Gen 0 Collections
- # Gen 2 Collections
- % Time in GC
Select instance:
- Your process name → Orders.Api
  (or the DLL name without .dll)
Click Add

✅ D. SQL Server Counters

Category: SQLServer:SQL Statistics

Select SQLServer:SQL Statistics
Select counters:
- Batch Requests/sec
- SQL Compilations/sec
- SQL Re-Compilations/sec
Instance = (select default instance / MSSQLSERVER)
Click Add

🎯 Result Summary

You will now be monitoring:

CPU

Processor(_Total)% Processor Time
Process(dotnet)% Processor Time

Memory

Process(dotnet)\Private Bytes
Process(dotnet)\Working Set

.NET GC

CLR Memory Gen0 collections
CLR Memory Gen2 collections
CLR % Time in GC

SQL Server Load

Batch Requests/sec
SQL Compilations/sec
SQL Re-Compilations/sec

🎯 The Story Your Graphs Will Show

Naive Insert (1000 × SaveChanges)

🔺 High Batch Requests/sec (many round-trips)
🔺 Higher SQL compilations (same query many times)
🔺 More GC activity (more allocations)
🔺 Higher CPU usage

Optimized Insert (1 SaveChanges with AddRange)

🔻 Lower Batch Requests/sec (single round-trip)
🔻 Minimal SQL compilations
🔻 Less GC pressure
🔻 Lower CPU, smoother curve

3️⃣ Postman Collection (JSON)

You can import this directly into Postman.

Create a file PerfLab.postman_collection.json with:

{
  "info": {
    "name": "PerfLab Orders API",
    "_postman_id": "e9d1d5f3-0000-0000-0000-000000000001",
    "description": "Naive vs Optimized bulk insert demo for Orders API",
    "schema": "https://schema.getpostman.com/json/collection/v2.1.0/collection.json"
  },
  "item": [
    {
      "name": "Bulk Naive (1000)",
      "request": {
        "method": "POST",
        "header": [],
        "url": {
          "raw": "{{baseUrl}}/api/orders/bulk-naive?count=1000",
          "host": ["{{baseUrl}}"],
          "path": ["api", "orders", "bulk-naive"],
          "query": [
            {
              "key": "count",
              "value": "1000"
            }
          ]
        }
      }
    },
    {
      "name": "Bulk Optimized (1000)",
      "request": {
        "method": "POST",
        "header": [],
        "url": {
          "raw": "{{baseUrl}}/api/orders/bulk-optimized?count=1000",
          "host": ["{{baseUrl}}"],
          "path": ["api", "orders", "bulk-optimized"],
          "query": [
            {
              "key": "count",
              "value": "1000"
            }
          ]
        }
      }
    },
    {
      "name": "Get Count",
      "request": {
        "method": "GET",
        "header": [],
        "url": {
          "raw": "{{baseUrl}}/api/orders/count",
          "host": ["{{baseUrl}}"],
          "path": ["api", "orders", "count"]
        }
      }
    }
  ],
  "variable": [
    {
      "key": "baseUrl",
      "value": "http://localhost:5000"
    }
  ]
}
Code language: JSON / JSON with Comments (json)

How to use:

Postman → Import → Select this JSON.
Use baseUrl = http://localhost:5000 or https://localhost:5001.

4️⃣ k6 Load Test Script

Create k6-orders-perf.js:

import http from 'k6/http';
import { check, sleep } from 'k6';
import { Trend } from 'k6/metrics';

const naiveTrend = new Trend('naive_duration');
const optTrend = new Trend('optimized_duration');

export const options = {
  scenarios: {
    naive: {
      executor: 'constant-vus',
      vus: 5,
      duration: '30s',
      exec: 'testNaive',
    },
    optimized: {
      executor: 'constant-vus',
      vus: 5,
      duration: '30s',
      exec: 'testOptimized',
      startTime: '35s'
    }
  }
};

const BASE_URL = __ENV.BASE_URL || 'http://localhost:5000';

export function testNaive() {
  const res = http.post(`${BASE_URL}/api/orders/bulk-naive?count=100`, null);
  naiveTrend.add(res.timings.duration);
  check(res, {
    'status is 200': r => r.status === 200
  });
  sleep(1);
}

export function testOptimized() {
  const res = http.post(`${BASE_URL}/api/orders/bulk-optimized?count=100`, null);
  optTrend.add(res.timings.duration);
  check(res, {
    'status is 200': r => r.status === 200
  });
  sleep(1);
}
Code language: JavaScript (javascript)

Run:

k6 run k6-orders-perf.js
# or explicitly
BASE_URL=http://localhost:5000 k6 run k6-orders-perf.js
Code language: PHP (php)

In the output, point at:

naive_duration{} summary vs optimized_duration{}
http_reqs, http_req_duration, http_req_failed

5️⃣ JMeter Test Plan (Concept + Minimal XML)

Concept for your training

Create 2 Thread Groups:

NaiveBulk:
- 10 threads (users)
- 10 loops
- HTTP Request Sampler → POST /api/orders/bulk-naive?count=100
OptimizedBulk:
- 10 threads
- 10 loops
- HTTP Request Sampler → POST /api/orders/bulk-optimized?count=100

Add:

View Results in Table
Summary Report
Aggregate Report

Compare:

Avg response time
90th/95th percentile
of samples
Throughput

Minimal `.jmx` skeleton (for reference)

I won’t blow this up with full XML, but here is the key idea you can recreate in the JMeter GUI:

Test Plan
- HTTP Request Defaults: Server Name or IP = localhost, Port = 5000
- Thread Group: NaiveGroup
  - HTTP Sampler: POST /api/orders/bulk-naive with param count=100
- Thread Group: OptimizedGroup
  - HTTP Sampler: POST /api/orders/bulk-optimized with param count=100
- Listeners: Summary Report, Aggregate Report

You can save it as PerfLabNaiveVsOptimized.jmx and share with students.

Rajesh Kumar

I’m a DevOps/SRE/DevSecOps/Cloud Expert passionate about sharing knowledge and experiences. I have worked at Cotocus. I share tech blog at DevOps School, travel stories at Holiday Landmark, stock market tips at Stocks Mantra, health and fitness guidance at My Medic Plus, product reviews at TrueReviewNow , and SEO strategies at Wizbrand.

Do you want to learn Quantum Computing?

Please find my social handles as below;

Rajesh Kumar Personal Website
Rajesh Kumar at YOUTUBE
Rajesh Kumar at INSTAGRAM
Rajesh Kumar at X
Rajesh Kumar at FACEBOOK
Rajesh Kumar at LINKEDIN
Rajesh Kumar at WIZBRAND

Rajesh Kumar DailyLogs

Find Trusted Cardiac Hospitals

Compare heart hospitals by city and services — all in one place.

Explore Hospitals

Find Trusted Cardiac Hospitals

✅ 1. What Naive Bulk Insert Proves

Endpoint:

What the code does:

What it demonstrates:

A. EF Core is very slow when SaveChanges() is called repeatedly

B. SQL Server becomes chatty

C. ThreadPool starvation / long-running synchronous IO

D. GC Pressure

Scientific Output Audience Should Observe

2. What Optimized Bulk Insert Proves

Endpoint:

What the code does:

What it demonstrates:

A. EF Core’s Unit-of-Work Model Works Best

B. Up to 50x Faster Performance

C. Far fewer SQL round-trips

D. Lower GC Pressure

Scientific Output Audience Should Observe

3. Check Count — Proves Persistence

Endpoint:

Overall What This Demo Proves

✔ 1. Impact of chatty database I/O

✔ 2. EF Core change tracking and SaveChanges cost

✔ 3. Async vs Sync I/O behavior

✔ 4. Garbage Collection impact

✔ 5. SQL Server round-trip cost

✔ 6. Importance of batching & correct architectural patterns

How to Experience Demo

I. Run Naive Bulk Insert

II. Run Optimized Bulk Insert

III. Ask the audience: Why did it happen?

IV. Tie it to real architecture

🟦 1. Process Memory (Private Bytes)

🎯 Training Insight:

🟩 2. CPU (% of all processors)

🎯 Training Insight:

🟦 3. ThreadPool Thread Count

🎯 Training Insight:

🟧 4. GC Heap Size

1️⃣ Performance Chart – Naive vs Optimized (Sample Data)

Suggested Scenario

Example Results (for chart)

How to turn this into a chart

2️⃣ Perf Counters Command List

A. Using dotnet-counters (recommended in session)

B. Windows PerfMon (classic counters)

✅ How to Add These Counters in Windows PerfMon

1. Open PerfMon

2. Add Counters

✅ A. Processor Counters

Category: Processor

✅ B. Process Counters (dotnet / w3wp)

Category: Process

✅ C. .NET CLR Memory Counters

Category: .NET CLR Memory

✅ D. SQL Server Counters

Category: SQLServer:SQL Statistics

🎯 Result Summary

CPU

Memory

.NET GC

SQL Server Load

🎯 The Story Your Graphs Will Show

Naive Insert (1000 × SaveChanges)

Optimized Insert (1 SaveChanges with AddRange)

3️⃣ Postman Collection (JSON)

4️⃣ k6 Load Test Script

5️⃣ JMeter Test Plan (Concept + Minimal XML)

Concept for your training

Minimal .jmx skeleton (for reference)

Find Trusted Cardiac Hospitals

Certification Courses

A. Using `dotnet-counters` (recommended in session)

Minimal `.jmx` skeleton (for reference)