Category
Management and Governance
1. Introduction
Azure Network Watcher is Azure’s native network monitoring and diagnostics service for virtual networks. It helps you understand, diagnose, and gain visibility into network traffic, routing, and connectivity issues inside Azure networking.
In simple terms: Azure Network Watcher tells you what the network is doing and why—whether a VM can (or cannot) reach another endpoint, which route is being used, whether an NSG rule is blocking traffic, and what flows are actually traversing your network.
Technically, Azure Network Watcher is a regional service that provides a set of tools (connectivity tests, packet capture, NSG flow logs, topology views, routing diagnostics, and VPN troubleshooting) that interact with Azure networking resources such as VNets, NICs, NSGs, route tables, and VPN gateways. It integrates with Azure Monitor and Log Analytics for alerting, visualization, and long-term analysis.
What problem it solves: network problems are often the hardest to troubleshoot in cloud environments because the “network” is a mix of distributed components (NSGs, routes, DNS, NAT, firewalls, private endpoints, gateways). Azure Network Watcher provides first-party, Azure-aware diagnostics so you can troubleshoot faster, reduce downtime, and improve governance over network operations.
Naming and lifecycle note (verify in official docs for the latest): Azure Network Watcher is an active service. Some experiences within it have evolved over time (for example, “Connection Monitor” has newer versions, and older “classic” monitoring experiences have been retired or deprecated). When using this tutorial, always prefer the latest workflow shown in Azure’s official documentation.
2. What is Azure Network Watcher?
Official purpose: Azure Network Watcher is designed to monitor, diagnose, view metrics, and enable or disable logs for resources in an Azure virtual network.
Core capabilities (what it does)
Azure Network Watcher commonly provides:
- Connectivity diagnostics (why a connection succeeds or fails)
- Traffic flow logging (NSG flow logs)
- Network topology visualization (resource relationship mapping)
- Routing diagnostics (effective routes, next-hop checks)
- Security rule evaluation (IP flow verify, security group view)
- Deep packet-level capture (packet capture on supported VMs)
- VPN troubleshooting (for supported VPN gateway scenarios)
Major components (how it’s organized)
In practice, you’ll interact with Azure Network Watcher through:
– Azure Portal (Network Watcher blade)
– Azure CLI / PowerShell (network watcher commands)
– Azure Resource Manager (ARM) APIs (management-plane operations)
– Regional Network Watcher resource that Azure creates/uses (often visible in a NetworkWatcherRG resource group in the region)
Service type
- Service category fit: Although it’s closely tied to networking, Azure Network Watcher is also a key Management and Governance tool because it supports operational governance: troubleshooting, logging, auditing network changes, and enforcing visibility standards across environments.
- Type: Primarily a management-plane diagnostic service that orchestrates data collection from networking resources and agents/extensions on VMs for certain features.
Scope: regional vs global
- Regional service: Azure Network Watcher is enabled per region. If you troubleshoot resources in multiple Azure regions, you typically ensure Network Watcher is enabled in each of those regions.
- Subscription context: It operates within the context of a subscription (and the resources you have access to via RBAC).
How it fits into the Azure ecosystem
Azure Network Watcher complements and integrates with:
- Azure Virtual Network (VNet), NICs, NSGs, route tables, Private Endpoints
- VPN Gateway (and certain gateway diagnostic scenarios)
- Azure Monitor / Log Analytics for querying and alerting on logs
- Azure Storage (common destination for NSG flow logs and packet captures)
- Microsoft Sentinel (optional downstream consumption of logs, depending on your logging architecture)
3. Why use Azure Network Watcher?
Business reasons
- Reduce downtime and MTTR: Faster root cause analysis for network incidents.
- Improve service reliability: Proactive monitoring (for example, continuous connection monitoring) can detect degradations before users report them.
- Operational standardization: Establish repeatable troubleshooting and logging patterns across teams and subscriptions.
Technical reasons
- Azure-aware diagnostics: The tools understand Azure networking objects (NSGs, UDRs, NAT behavior, gateways).
- Pinpoint “where it broke”: Identify whether failure is due to NSG rules, routing, DNS resolution, or unreachable next hop.
- Evidence-driven troubleshooting: Flow logs and packet capture provide concrete proof rather than assumptions.
Operational reasons
- Self-service for platform teams: Provide engineers a consistent diagnostic toolkit.
- Integrates with runbooks: CLI/PowerShell support enables automation in incident response.
- Supports governance: Enabling consistent logging (like NSG flow logs) supports audits and investigations.
Security/compliance reasons
- Network forensics: Flow logs help investigate suspicious traffic patterns and policy violations.
- Change validation: Verify the effect of NSG/route changes without waiting for user impact.
- Segmentation assurance: Validate that segmentation rules actually block prohibited traffic.
Scalability/performance reasons
- Designed for cloud scale: Continuous monitoring and logging can scale across many VNets and workloads (with careful cost control).
- Targeted deep dives: Packet capture is available when you need detail, rather than always-on full capture.
When teams should choose it
Choose Azure Network Watcher when you need:
- Repeatable network troubleshooting for Azure IaaS and hybrid networking
- NSG flow visibility for detection, investigation, and governance
- Connection monitoring between endpoints (Azure-to-Azure and potentially hybrid, depending on your design)
- A first-party approach aligned with Azure RBAC and the resource model
When teams should not choose it
Consider alternatives or complements when:
- You need full NPM/APM across applications (use Azure Monitor Application Insights or third-party APM)
- You need full network IDS/IPS or advanced L7 inspection (consider Azure Firewall, a third-party NVA, or dedicated security tooling)
- Your environment is mostly PaaS-only with minimal VNets/NSGs (you may rely more on service-specific diagnostics and Azure Monitor)
- You require long-term, centralized SIEM correlation: Network Watcher is a source; Sentinel/SIEM is the analysis plane
4. Where is Azure Network Watcher used?
Industries
Common in any industry operating regulated or mission-critical networks:
- Finance and insurance (segmentation, auditability)
- Healthcare (compliance logging, incident investigations)
- Retail/e-commerce (availability and performance)
- SaaS providers (multi-tenant segmentation and operational monitoring)
- Public sector (governance, audit trails)
Team types
- Cloud platform teams managing shared networking
- SRE/operations teams responding to incidents
- Security engineering and SOC teams investigating network events
- DevOps teams validating connectivity for deployments
- Network engineers extending on-prem patterns into Azure
Workloads and architectures
- Hub-and-spoke VNets with centralized firewalls
- Multi-region active-active or active-passive designs
- Hybrid connectivity (VPN/ExpressRoute plus on-prem DNS)
- Microsegmented environments using NSGs and UDRs
- Kubernetes clusters (AKS) that depend on underlying VNet routing and security (note: some diagnostics are at VM/NIC/NSG level; interpret accordingly)
Real-world deployment contexts
- Production: Continuous connection monitoring, NSG flow logs to a centralized logging account, alerting on failures
- Dev/test: Ad-hoc troubleshooting during provisioning, validating NSG rules, debugging routes during lab builds
5. Top Use Cases and Scenarios
Below are realistic scenarios where Azure Network Watcher is commonly used.
1) Diagnose “VM can’t reach VM” inside a VNet
- Problem: Two VMs in the same VNet can’t connect on a specific port.
- Why it fits: Connection troubleshoot + IP flow verify can quickly isolate NSG/routing issues.
- Example: App VM can’t reach DB VM on TCP 1433 after an NSG change.
2) Validate NSG rules before and after deployments
- Problem: Engineers deploy new rules but aren’t sure which rule will match traffic.
- Why it fits: IP flow verify and security group view help confirm effective security rules.
- Example: Confirm that only the load balancer subnet can reach backend VMs on port 443.
3) Capture packets to debug intermittent TCP resets
- Problem: Users see timeouts or resets, but logs aren’t conclusive.
- Why it fits: Packet capture provides packet-level evidence (SYN/SYN-ACK, retransmits, resets).
- Example: A Linux VM intermittently fails to establish TLS sessions to an internal API.
4) Audit traffic patterns with NSG flow logs
- Problem: Need to know which sources are talking to a subnet and on which ports.
- Why it fits: NSG flow logs provide structured flow telemetry for investigation and baselining.
- Example: Detect unexpected inbound attempts on SSH from unapproved source ranges.
5) Confirm routing and next hop after UDR changes
- Problem: Routing changes accidentally send traffic to the wrong appliance or blackhole.
- Why it fits: Next hop and effective routes show the chosen path.
- Example: After adding a default route to a firewall, a subnet loses access to Azure services.
6) Troubleshoot VPN connectivity issues
- Problem: On-premises can’t reach Azure subnets through VPN.
- Why it fits: VPN troubleshoot can help identify common tunnel and configuration issues.
- Example: A site-to-site VPN drops after an on-prem network device update.
7) Continuous monitoring of critical dependencies
- Problem: Need early warning if a critical service becomes unreachable.
- Why it fits: Connection Monitor supports continuous tests and integration with alerts (via Azure Monitor).
- Example: Monitor connectivity between web tier and database tier across regions.
8) Validate segmentation in hub-and-spoke environments
- Problem: Must prove that spokes are isolated except through shared services.
- Why it fits: IP flow verify, next hop, and flow logs help validate and document segmentation.
- Example: Ensure Spoke-A cannot reach Spoke-B directly, only via firewall.
9) Investigate suspected data exfiltration paths
- Problem: Security suspects a VM is sending data to an unauthorized destination.
- Why it fits: NSG flow logs and connection monitoring help confirm egress paths and destinations.
- Example: A workload unexpectedly initiates outbound connections to unknown IPs.
10) Troubleshoot DNS-related connectivity symptoms (indirectly)
- Problem: “Connection fails” but root cause is name resolution.
- Why it fits: Connection troubleshooting workflows can reveal whether failure is at DNS vs network.
- Example: App can reach an IP directly but fails when using hostname after DNS changes.
11) Validate Private Endpoint and NSG/UDR interactions
- Problem: Private Endpoint traffic doesn’t behave as expected; access is denied.
- Why it fits: Effective security rules and route diagnostics clarify whether traffic is blocked.
- Example: Private Endpoint access fails from a locked-down subnet with strict NSGs.
12) Standardize incident runbooks for network triage
- Problem: Different engineers troubleshoot differently, wasting time.
- Why it fits: Network Watcher provides consistent tools that can be embedded in runbooks.
- Example: “Tier-1 network triage” checklist using next hop + IP flow verify + test connectivity.
6. Core Features
This section lists key Azure Network Watcher features commonly available in current Azure deployments. Availability and exact UI naming can change—verify in official docs for your region and subscription.
Network topology
- What it does: Visualizes network resources and their relationships (VNets, subnets, NICs, NSGs, route tables, gateways).
- Why it matters: Helps quickly understand “what’s connected to what.”
- Practical benefit: Faster onboarding and troubleshooting—especially in shared hub-and-spoke networks.
- Limitations/caveats: Topology is a view, not a source of truth for traffic flows; it shows resource relationships, not packet paths.
Connection Monitor (connectivity monitoring)
- What it does: Continuously monitors connectivity between endpoints and collects latency/availability metrics; can integrate with alerting.
- Why it matters: Moves teams from reactive troubleshooting to proactive detection.
- Practical benefit: Detect dependency failures between tiers (web → API → DB), across subnets/regions/hybrid (depending on endpoint type and agent support).
- Limitations/caveats: Often requires an agent/extension on VMs for certain endpoint types; monitor design affects cost and data volume.
Connection troubleshoot / Test connectivity
- What it does: Performs on-demand connectivity checks between a source and destination and provides diagnostic output (reachable/unreachable, hops, potential blocking).
- Why it matters: Quickly answers “is it network or not?”
- Practical benefit: Identifies NSG/UDR issues without manually correlating rules and routes.
- Limitations/caveats: Some checks require VM agent/extension; results are point-in-time.
IP flow verify
- What it does: Validates whether traffic (5-tuple) is allowed or denied by NSG rules for a VM NIC at a given time.
- Why it matters: NSG rule evaluation is one of the most common causes of connectivity issues.
- Practical benefit: Pinpoints the exact NSG rule (allow/deny) affecting traffic.
- Limitations/caveats: Focused on NSG evaluation; doesn’t prove the remote endpoint is listening or that routing is correct.
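The evaluation model behind IP flow verify is simple to reason about: NSG rules are processed in ascending priority order (lower number wins), and the first matching rule decides the outcome. A minimal conceptual sketch in Python—the rule set, field names, and the simplified matching (exact port, single source prefix) are illustrative only, not the Azure API; real NSGs also carry built-in default rules:

```python
import ipaddress

# Illustrative NSG rules; lowest priority number is evaluated first.
rules = [
    {"name": "Allow-SSH-From-Subnet1", "priority": 90, "access": "Allow",
     "protocol": "Tcp", "source_prefix": "10.10.1.0/24", "dest_port": 22},
    {"name": "Deny-SSH-Inbound", "priority": 100, "access": "Deny",
     "protocol": "Tcp", "source_prefix": "0.0.0.0/0", "dest_port": 22},
]

def evaluate(src_ip, dest_port, protocol):
    """Return (access, rule_name) for the first rule matching the flow."""
    for rule in sorted(rules, key=lambda r: r["priority"]):
        if (rule["protocol"] == protocol
                and rule["dest_port"] == dest_port
                and ipaddress.ip_address(src_ip) in ipaddress.ip_network(rule["source_prefix"])):
            return rule["access"], rule["name"]
    return "Deny", "DefaultRule"  # implicit deny if nothing matches

print(evaluate("10.10.1.4", 22, "Tcp"))    # allowed by the subnet-scoped rule
print(evaluate("203.0.113.9", 22, "Tcp"))  # caught by the broad deny rule
```

This is why rule priorities matter so much in practice: a broad deny at priority 100 is harmless if a narrower allow sits at priority 90, and IP flow verify tells you exactly which rule won.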
Next hop
- What it does: Shows the next hop type and IP for traffic from a VM to a destination, based on effective routes.
- Why it matters: Routing surprises are common in hub-and-spoke networks with UDRs.
- Practical benefit: Confirms whether traffic is going to Internet, a virtual appliance, a gateway, or staying within VNet.
- Limitations/caveats: Again, route choice isn’t the same as end-to-end success; downstream devices can still drop traffic.
Effective routes
- What it does: Displays the effective route table applied to a NIC, including system routes and user-defined routes (UDRs).
- Why it matters: Many outages come from unintended route propagation or UDR mistakes.
- Practical benefit: Enables deterministic verification of routing behavior.
- Limitations/caveats: Must interpret with knowledge of peering, gateways, and appliance routing.
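The next hop and effective routes tools both rest on the same selection rule: among routes whose prefix contains the destination, the most specific (longest) prefix wins. A simplified Python sketch of that selection—the route entries are illustrative, and real effective-route evaluation also breaks prefix-length ties by route origin (user-defined before BGP before system):

```python
import ipaddress

# Illustrative effective route table for a NIC.
routes = [
    {"prefix": "10.10.0.0/16", "next_hop": "VnetLocal"},
    {"prefix": "0.0.0.0/0",    "next_hop": "VirtualAppliance"},  # UDR to a firewall
]

def next_hop(dest_ip):
    """Pick the matching route with the longest prefix (most specific wins)."""
    dest = ipaddress.ip_address(dest_ip)
    candidates = [r for r in routes if dest in ipaddress.ip_network(r["prefix"])]
    return max(candidates,
               key=lambda r: ipaddress.ip_network(r["prefix"]).prefixlen)["next_hop"]

print(next_hop("10.10.2.5"))  # in-VNet destination stays local
print(next_hop("8.8.8.8"))    # everything else falls through to the default route
```

This also explains a classic outage pattern in the use cases above: adding a 0.0.0.0/0 UDR changes the next hop for every destination not covered by a more specific route.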
Security group view (effective NSG rules)
- What it does: Shows the effective inbound/outbound security rules applied to a NIC from associated NSGs.
- Why it matters: Multiple NSGs (subnet + NIC) can make effective policy unclear.
- Practical benefit: Quickly review the rules that actually apply.
- Limitations/caveats: Effective rules are still just rules; they don’t validate remote service health.
NSG flow logs
- What it does: Logs network flows that pass through an NSG, typically to a storage account; optionally integrated into broader log analytics pipelines.
- Why it matters: Provides network visibility and supports investigations.
- Practical benefit: Audit traffic patterns, detect anomalies, and validate segmentation.
- Limitations/caveats: Can generate large volumes; requires careful retention, storage security, and cost management.
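Flow log records are JSON documents whose `flowTuples` entries encode each flow as a comma-separated string, which makes them easy to post-process. A minimal parsing sketch—the field order follows the documented version 2 layout, but verify it against the flow log version you actually enable:

```python
# Minimal parser for one NSG flow log v2 tuple. The field order below follows
# the documented v2 layout; confirm against your configured flow log version.
FIELDS = ["timestamp", "src_ip", "dest_ip", "src_port", "dest_port",
          "protocol", "direction", "decision", "state",
          "packets_out", "bytes_out", "packets_in", "bytes_in"]

def parse_flow_tuple(raw):
    parts = raw.split(",")
    # Begin records may omit the traffic counters; pad so every field exists.
    parts += [""] * (len(FIELDS) - len(parts))
    return dict(zip(FIELDS, parts))

sample = "1668591885,203.0.113.9,10.10.2.4,44321,22,T,I,D,B,,,,"
flow = parse_flow_tuple(sample)
print(flow["decision"])   # 'D' = denied
print(flow["direction"])  # 'I' = inbound
```

A few lines like this, fed from the storage account, are often enough to answer audit questions such as "which sources attempted SSH into this subnet and were denied."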
Packet capture
- What it does: Captures packets on a VM (often via a Network Watcher agent/extension) and stores captures for analysis.
- Why it matters: When logs and flow summaries aren’t enough, packets provide ground truth.
- Practical benefit: Diagnose TCP handshakes, MTU issues, retransmissions, and TLS negotiation problems.
- Limitations/caveats: Sensitive data risk; capture files can be large; requires strict access control and short retention.
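Captures come back as standard .pcap files, so any packet tooling can open them. As a quick sanity check before pulling a large file into an analyzer, you can validate the 24-byte classic libpcap global header with the standard library (this sketch covers the classic format only; pcapng files use a different layout):

```python
import struct

# Classic libpcap magic numbers (microsecond vs nanosecond timestamp variants).
PCAP_MAGICS = {0xA1B2C3D4: "microseconds", 0xA1B23C4D: "nanoseconds"}

def read_pcap_header(data):
    """Parse the 24-byte classic libpcap global header from raw bytes."""
    if len(data) < 24:
        raise ValueError("truncated pcap header")
    magic = struct.unpack("<I", data[:4])[0]
    endian = "<"
    if magic not in PCAP_MAGICS:                 # retry as big-endian
        magic = struct.unpack(">I", data[:4])[0]
        endian = ">"
    if magic not in PCAP_MAGICS:
        raise ValueError("not a classic pcap file (pcapng?)")
    vmaj, vmin, _tz, _sig, snaplen, linktype = struct.unpack(endian + "HHiIII", data[4:24])
    return {"version": (vmaj, vmin), "snaplen": snaplen, "linktype": linktype}

# Synthetic little-endian header: v2.4, snaplen 65535, linktype 1 (Ethernet).
header = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
print(read_pcap_header(header))
```

Checking the snaplen here also tells you whether a capture was truncated per packet, which matters when diagnosing TLS negotiation or MTU issues.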
VPN troubleshoot (for supported VPN Gateway scenarios)
- What it does: Helps diagnose VPN tunnel connectivity issues using Azure-side diagnostics.
- Why it matters: Hybrid connectivity is business-critical and often complex.
- Practical benefit: Faster isolation of misconfigurations and tunnel state issues.
- Limitations/caveats: Not a replacement for on-prem device logs; scenario coverage varies—verify support for your gateway type and configuration.
7. Architecture and How It Works
High-level architecture
Azure Network Watcher is a regional orchestration service that:
1. Uses Azure control-plane APIs to inspect network configuration (NSGs, routes, NICs).
2. For certain features (packet capture, continuous connection monitoring), coordinates with an agent/VM extension to collect data from the guest/host boundary.
3. Stores outputs in Azure Storage and/or Log Analytics, depending on the feature and your configuration.
4. Surfaces results via the Portal, CLI/PowerShell, and APIs.
Request/data/control flow
- Control plane: You request a diagnostic action (e.g., next hop) → Network Watcher queries Azure networking configuration → returns results.
- Data plane (logging): NSG flow logs and packet captures generate data → written to storage/log destinations → queried/processed externally (Log Analytics, SIEM, notebooks, etc.).
Key integrations
- Azure Monitor / Log Analytics: alerting and analytics for connection monitoring and log queries.
- Azure Storage: common sink for NSG flow logs and packet capture files.
- Azure RBAC: governs who can run diagnostics and access captured data.
- Network resources: VNets, NSGs, NICs, route tables, gateways, load balancers (as applicable).
Dependency services
Typical dependencies you’ll see in real deployments:
- Storage accounts (logging destination)
- Log Analytics workspace (analytics, alerting)
- VM extensions/agents (for packet capture and some monitoring scenarios)
- Azure Policy (to enforce enabling flow logs or diagnostic settings—policy availability and effects vary by resource type; verify in official docs)
Security/authentication model
- Uses Azure AD authentication and Azure RBAC for permissions.
- Diagnostic actions are management operations; access is governed by roles on subscriptions/resource groups/resources.
- Data access (packet capture files, flow logs) is governed by permissions on the storage account or workspace.
Networking model
- Network Watcher does not “sit inline” in your traffic path.
- It observes configuration and logs, and for some features it triggers captures/agents on endpoints.
- NSG flow logs are generated as part of NSG processing and exported to configured destinations.
Monitoring/logging/governance considerations
- Decide upfront:
- Which VNets/subnets require flow logs (usually production and shared networks)
- Retention period and access controls for logs
- Whether logs go to centralized storage/workspaces per environment
- Who can run packet capture and where outputs are stored
- Use tags, naming standards, and consistent resource group layouts to make diagnostics repeatable.
Simple architecture diagram (Mermaid)
flowchart LR
User[Engineer / SRE] -->|Portal / CLI| NW["Azure Network Watcher (Regional)"]
NW --> ARM[Azure Resource Manager APIs]
ARM --> VNet[VNets / NICs / NSGs / Routes]
NW -->|Flow logs / Capture output| Storage[Azure Storage Account]
NW -->|"Metrics/Logs (optional)"| LA[Log Analytics Workspace]
User -->|Query| LA
User -->|Download| Storage
Production-style architecture diagram (Mermaid)
flowchart TB
subgraph Subscriptions["Azure Subscriptions"]
subgraph Hub["Hub Network (Prod)"]
FW[Azure Firewall / NVA]
GW[VPN Gateway]
NSGHub[NSGs + UDRs]
end
subgraph Spokes["Spoke VNets (Prod)"]
App[App VMs/VMSS]
DB[DB VMs]
NSGSpoke[NSGs per subnet/NIC]
end
end
subgraph NWRegion["Azure Network Watcher (per region)"]
CM[Connection Monitor]
Diag["Diagnostics: Next Hop / IP Flow Verify / Topology"]
PC["Packet Capture (on demand)"]
FL[NSG Flow Logs]
end
subgraph Observability["Observability & Governance"]
SA["Central Storage Account(s)"]
LAW[Log Analytics Workspace]
AM[Azure Monitor Alerts]
SIEM["Microsoft Sentinel (optional)"]
end
App <--> DB
App --> FW
FW --> GW
CM --> LAW
FL --> SA
PC --> SA
Diag --> App
Diag --> DB
LAW --> AM
LAW --> SIEM
8. Prerequisites
Before starting with Azure Network Watcher in a lab or production environment:
Azure account and subscription
- An Azure subscription where you can create:
- Resource groups
- VNets, subnets, NSGs
- Virtual machines
- Storage account (for logs)
- Billing must be enabled (even if using free credits), because:
- VMs cost money
- Storage and log ingestion can cost money
Permissions / IAM roles
You need sufficient Azure RBAC permissions for:
- Creating network and compute resources (e.g., Contributor on a resource group for the lab)
- Running Network Watcher operations (typically covered by Contributor/Network Contributor)
- Enabling NSG flow logs and accessing storage outputs

Common roles (choose the least privilege appropriate to your org):
- Network Contributor (network resources)
- Virtual Machine Contributor (VMs)
- Storage Blob Data Reader/Contributor (to view flow logs / packet captures in storage)
- Log Analytics Reader/Contributor (if using Log Analytics)
In production, separate “who can run packet capture” from “who can read capture files” to reduce sensitive data exposure.
Tools
For the hands-on lab:
– Azure CLI (recommended): https://learn.microsoft.com/cli/azure/install-azure-cli
– Optional: PowerShell Az module, or Portal-only workflow
Region availability
- Azure Network Watcher is regional and broadly available. Still:
- Verify your target region supports the features you need.
- Ensure Network Watcher is enabled in that region for your subscription (often automatic, but not guaranteed in every scenario).
Quotas/limits (high level)
- VM cores quota in the region
- Public IP quotas (if using public access)
- Storage account limits (IOPS/throughput) and retention
- Flow log volume and workspace ingestion limits (if integrating with Log Analytics)
Always verify exact service limits in official docs for the feature you’re using.
Prerequisite services/resources
- Virtual network and subnets
- NSG applied to a subnet or NIC (for flow logs and rule evaluation)
- VMs for connectivity and packet capture scenarios
- Storage account for NSG flow logs and packet capture outputs (recommended)
9. Pricing / Cost
Azure Network Watcher pricing is feature- and usage-dependent. The base “service” may appear free, but many capabilities generate costs via dependent services and data processing.
Official pricing page (verify current pricing and meters):
https://azure.microsoft.com/pricing/details/network-watcher/
Azure Pricing Calculator:
https://azure.microsoft.com/pricing/calculator/
Pricing dimensions (what you pay for)
Depending on what you enable, costs typically come from:
- NSG flow logs
  - Log generation/export and/or processing (metering depends on the current Azure pricing model—verify)
  - Storage costs (hot/cool/archive, transactions)
  - Optional analytics costs if you ingest into Log Analytics / SIEM
- Connection Monitor
  - Test runs, monitoring frequency, and data ingestion/storage (often via Azure Monitor/Log Analytics)
- Packet capture
  - Storage for capture files (pcap) and storage transactions
  - VM overhead (CPU/disk during capture) can be an indirect cost
- VMs used for monitoring
  - If you deploy “test agents” or monitor from VMs, VM runtime is a cost driver
- Log Analytics
  - Data ingestion, retention beyond free thresholds (if any), and queries (pricing varies by model and region)
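To make the flow log cost driver concrete, a back-of-envelope storage estimate can be sketched in a few lines. Every input below (flow rate, record size, retention) is an assumption for illustration—substitute measured values and current meters from the pricing calculator:

```python
# Rough storage-volume estimate for NSG flow logs.
# All inputs are illustrative assumptions, not Azure pricing data.
flows_per_second = 500   # average new flows across the logged NSGs (assumed)
bytes_per_record = 200   # approximate on-disk size of one flow tuple (assumed)
retention_days = 30

daily_gb = flows_per_second * 86_400 * bytes_per_record / 1e9
stored_gb = daily_gb * retention_days

print(f"~{daily_gb:.1f} GB/day, ~{stored_gb:.0f} GB held at {retention_days}-day retention")
```

Even at these modest assumed rates the volume lands in the hundreds of gigabytes per month, which is why scoping flow logs to the subnets that need them, and lifecycling older blobs to cooler tiers, are the first optimizations to apply.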
Free tier
- There is no universal “free tier” that makes all Network Watcher features free. Some components may not charge directly, but downstream storage and analytics almost always do.
- Always validate what is included for your subscription type and region in the official pricing page.
Main cost drivers
- Volume of flow logs (high traffic subnets generate lots of logs)
- Retention period (storage costs scale with time)
- Connection Monitor frequency and number of tests
- Log Analytics ingestion (if you centralize flow logs or monitoring into a workspace)
- Packet capture size and frequency
Hidden or indirect costs
- Data transfer charges (for moving logs between regions, or exporting out of Azure)
- Security overhead (Key rotation, access reviews, SIEM integration)
- Operational overhead (incident response processes, runbooks, tooling)
Network/data transfer implications
- If you centralize logs cross-region, or export to third-party tools, you may incur:
- Inter-region bandwidth costs
- Egress charges to the internet or to other clouds
How to optimize cost (practical guidance)
- Enable NSG flow logs only where needed (production, shared services, sensitive subnets).
- Use short retention in hot storage; lifecycle older logs to cool/archive when appropriate.
- Prefer targeted packet captures with strict time windows and filters.
- Right-size Connection Monitor: fewer endpoints, longer intervals, and focused tests for critical paths.
- If using Log Analytics:
- Control which logs are ingested
- Define retention intentionally
- Consider sampling strategies where appropriate (verify what is supported)
Example low-cost starter estimate (conceptual)
A minimal lab typically includes:
- 2 small Linux VMs for a short time (primary cost)
- 1 storage account with minimal logs/captures (small cost if volume and retention are limited)
- Optional Log Analytics workspace (can add cost if ingesting flow logs)
Because prices vary by region and change over time, do not assume fixed numbers—use the pricing calculator with your region and expected data volumes.
Example production cost considerations
In production, the largest costs usually come from:
- High-volume NSG flow logs across many subnets
- Centralized analytics (Log Analytics/SIEM ingestion and retention)
- Multiple regions and high-availability logging patterns

A cost-conscious production pattern is:
- Enable flow logs for critical NSGs only
- Centralize logs in a small number of storage accounts with lifecycle policies
- Ingest only necessary subsets into analytics platforms
- Use scheduled audits and on-demand deep diagnostics (packet capture) rather than always-on deep capture
10. Step-by-Step Hands-On Tutorial
This lab builds a small Azure network, intentionally blocks traffic with an NSG, and then uses Azure Network Watcher tools to identify the cause and validate the fix. It is designed to be safe and relatively low-cost if you delete resources afterward.
Objective
- Create two VMs in the same VNet on different subnets.
- Apply an NSG rule that blocks SSH to one VM.
- Use Azure Network Watcher to:
- Test connectivity
- Verify IP flow (NSG allow/deny)
- Check next hop (routing)
- Fix the NSG rule and confirm connectivity.
- (Optional) Enable NSG flow logs to a storage account and view generated log blobs.
Lab Overview
You will create:
– Resource group: rg-nw-lab
– VNet: vnet-nw-lab with two subnets
– NSG: nsg-vm2 applied to VM2 NIC (or subnet)
– VM1 (jump/test): vm1-nw (Linux)
– VM2 (target): vm2-nw (Linux)
– Public IP for VM1 to SSH in (optional but convenient)
– Use Azure Network Watcher diagnostics in the same region
Cost note: The biggest cost in this lab is VM runtime. Use small VM sizes and delete the resource group when finished.
Step 1: Set variables and sign in (Azure CLI)
- Open a terminal with Azure CLI installed.
- Sign in and pick a subscription.
az login
az account show
# If needed:
az account set --subscription "<YOUR_SUBSCRIPTION_ID_OR_NAME>"
Set variables (choose a region close to you):
RG="rg-nw-lab"
LOC="eastus" # change as needed
VNET="vnet-nw-lab"
SUBNET1="snet-vm1"
SUBNET2="snet-vm2"
NSG="nsg-vm2"
VM1="vm1-nw"
VM2="vm2-nw"
ADMINUSER="azureuser"
Expected outcome: You’re authenticated, and variables are defined.
Step 2: Create a resource group and VNet with two subnets
az group create -n "$RG" -l "$LOC"
az network vnet create \
-g "$RG" -n "$VNET" -l "$LOC" \
--address-prefixes 10.10.0.0/16 \
--subnet-name "$SUBNET1" --subnet-prefixes 10.10.1.0/24
az network vnet subnet create \
-g "$RG" --vnet-name "$VNET" -n "$SUBNET2" \
--address-prefixes 10.10.2.0/24
Expected outcome: Resource group, VNet, and two subnets exist.
Verify:
az network vnet show -g "$RG" -n "$VNET" --query "{addressSpace:addressSpace.addressPrefixes, subnets:subnets[].name}" -o table
Step 3: Create an NSG that blocks SSH inbound (intentionally)
Create an NSG and a deny rule for TCP 22 inbound:
az network nsg create -g "$RG" -n "$NSG" -l "$LOC"
az network nsg rule create \
-g "$RG" --nsg-name "$NSG" -n "Deny-SSH-Inbound" \
--priority 100 \
--direction Inbound --access Deny --protocol Tcp \
--source-address-prefixes "*" --source-port-ranges "*" \
--destination-address-prefixes "*" --destination-port-ranges 22
Expected outcome: VM2 will not allow inbound SSH (even from VM1) once the NSG is applied.
Verify:
az network nsg rule list -g "$RG" --nsg-name "$NSG" -o table
Step 4: Create VM1 (with public IP) and VM2 (private only)
Create VM1 in subnet1:
az vm create \
-g "$RG" -n "$VM1" -l "$LOC" \
--image Ubuntu2204 \
--admin-username "$ADMINUSER" \
--generate-ssh-keys \
--vnet-name "$VNET" --subnet "$SUBNET1" \
--public-ip-sku Standard
Create VM2 in subnet2 (no public IP):
az vm create \
-g "$RG" -n "$VM2" -l "$LOC" \
--image Ubuntu2204 \
--admin-username "$ADMINUSER" \
--generate-ssh-keys \
--vnet-name "$VNET" --subnet "$SUBNET2" \
--public-ip-address ""
Apply the NSG to VM2’s NIC (NIC-level association keeps the lab explicit and easy to reason about):
VM2_NIC_ID=$(az vm show -g "$RG" -n "$VM2" --query "networkProfile.networkInterfaces[0].id" -o tsv)
az network nic update \
--ids "$VM2_NIC_ID" \
--network-security-group "$NSG"
Expected outcome: VM1 is reachable via SSH from your machine; VM2 has no public IP and blocks SSH inbound due to the NSG.
Verify VM IPs:
VM1_PUBLIC_IP=$(az vm show -d -g "$RG" -n "$VM1" --query publicIps -o tsv)
VM1_PRIVATE_IP=$(az vm show -d -g "$RG" -n "$VM1" --query privateIps -o tsv)
VM2_PRIVATE_IP=$(az vm show -d -g "$RG" -n "$VM2" --query privateIps -o tsv)
echo "VM1 public: $VM1_PUBLIC_IP"
echo "VM1 private: $VM1_PRIVATE_IP"
echo "VM2 private: $VM2_PRIVATE_IP"
Step 5: Ensure Azure Network Watcher is enabled in the region
In many subscriptions, Azure enables Network Watcher automatically when networking resources exist. Still, explicitly enabling it avoids confusion.
Run:
az network watcher configure --locations "$LOC" --enabled true
Expected outcome: Network Watcher is enabled for the region.
Verify:
az network watcher list -g "NetworkWatcherRG" -o table 2>/dev/null || true
If the NetworkWatcherRG resource group name differs or isn’t visible due to permissions, verify in the Azure Portal: search for Network Watcher → ensure the region is enabled.
Step 6: Reproduce the problem (SSH from VM1 to VM2 should fail)
SSH into VM1 from your local machine:
ssh ${ADMINUSER}@${VM1_PUBLIC_IP}
From VM1, attempt to SSH to VM2 private IP:
ssh -o ConnectTimeout=5 ${ADMINUSER}@${VM2_PRIVATE_IP}
Expected outcome: SSH fails (timeout or connection failure), because VM2 inbound TCP 22 is denied by the NSG.
Exit VM1 (or keep it open for later):
exit
Step 7: Use Network Watcher “IP flow verify” to confirm NSG is denying
Run IP flow verify against the VM2 NIC for inbound port 22. You need:
- Target VM (VM2)
- Direction: inbound
- Protocol: TCP
- Local port: 22
- Remote IP: VM1 private IP (source)
In the Azure CLI, IP flow verify is exposed as the az network watcher test-ip-flow command. Exact parameters can vary by CLI version; use --help if needed.
Check help:
az network watcher test-ip-flow --help
A commonly used pattern is:
az network watcher test-ip-flow \
-g "$RG" \
--vm "$VM2" \
--direction Inbound \
--protocol Tcp \
--local "$VM2_PRIVATE_IP:22" \
--remote "$VM1_PRIVATE_IP:12345"
Expected outcome: Result indicates Deny and identifies the rule (for example, Deny-SSH-Inbound).
If the CLI syntax differs in your installed version, use the Azure Portal alternative: Network Watcher → IP flow verify → select VM2 → specify inbound, TCP, local port 22, remote IP VM1.
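When scripting this check, the JSON result can be reduced to just the verdict and the matched rule. A minimal sketch, using a hard-coded sample result instead of a live call; the `access` and `ruleName` field names are assumed from the typical CLI output shape, so verify them against your own run:

```shell
# Illustrative ip-flow-verify result; field names assumed from the
# typical CLI output shape -- verify against your own run.
result='{"access": "Deny", "ruleName": "securityRules/Deny-SSH-Inbound"}'

# Extract fields with sed (jq would also work if installed).
access=$(printf '%s' "$result" | sed -n 's/.*"access": *"\([^"]*\)".*/\1/p')
rule=$(printf '%s' "$result" | sed -n 's/.*"ruleName": *"\([^"]*\)".*/\1/p')

echo "verdict=$access rule=$rule"
# → verdict=Deny rule=securityRules/Deny-SSH-Inbound
```

In a real run you would pipe the `az network watcher ip-flow-verify` output into the same extraction, which makes the deny rule name easy to feed into tickets or runbooks.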
Step 8: Use Network Watcher “Next hop” to confirm routing is not the issue
Check next hop from VM1 to VM2 private IP:
az network watcher show-next-hop \
-g "$RG" \
--vm "$VM1" \
--destination-ip-address "$VM2_PRIVATE_IP"
Expected outcome: Next hop should indicate a VNet route (for example, VnetLocal) and show that routing is normal inside the VNet.
Step 9: Use Network Watcher “Test connectivity” (connection troubleshoot)
Run a connectivity test from VM1 to VM2:22.
Check help:
az network watcher test-connectivity --help
Run the test:
az network watcher test-connectivity \
-g "$RG" \
--source-resource "$(az vm show -g "$RG" -n "$VM1" --query id -o tsv)" \
--dest-address "$VM2_PRIVATE_IP" \
--dest-port 22
Expected outcome: Status should be Unreachable (or similar), and details may point to NSG denial.
If it reports agent/extension requirements, install the Network Watcher VM extension (next step) or use portal-based diagnostics which may guide you.
Step 10: Fix the NSG and re-test
Now allow SSH from VM1 subnet to VM2 on port 22 (more secure than allowing *).
Create an allow rule with higher priority (lower number) than the deny rule:
az network nsg rule create \
-g "$RG" --nsg-name "$NSG" -n "Allow-SSH-From-Subnet1" \
--priority 90 \
--direction Inbound --access Allow --protocol Tcp \
--source-address-prefixes 10.10.1.0/24 --source-port-ranges "*" \
--destination-address-prefixes "*" --destination-port-ranges 22
Expected outcome: SSH from VM1 to VM2 should succeed now.
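The allow rule wins because, among matching NSG rules, the rule with the lowest priority number is applied. A toy illustration of that tie-break, using this lab's rule names:

```shell
# Among matching NSG rules, the lowest priority number wins.
# Each line: <name> <priority> <access>
rules="Deny-SSH-Inbound 100 Deny
Allow-SSH-From-Subnet1 90 Allow"

# Sort numerically by priority and take the first match.
effective=$(printf '%s\n' "$rules" | sort -k2 -n | head -n1)
echo "effective rule: $effective"
# → effective rule: Allow-SSH-From-Subnet1 90 Allow
```

To see the real ordered list for the lab NSG, `az network nsg rule list -g "$RG" --nsg-name "$NSG" -o table` shows each rule with its priority.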
Re-SSH to VM1 and test again:
ssh ${ADMINUSER}@${VM1_PUBLIC_IP}
ssh -o ConnectTimeout=5 ${ADMINUSER}@${VM2_PRIVATE_IP}
You should get an SSH prompt on VM2. Exit both sessions:
exit
exit
Step 11 (Optional): Enable NSG flow logs to a storage account (Portal-first)
This optional step adds observability but can increase cost and complexity. It’s valuable to see real flow records.
- Create a storage account (CLI):
STORAGE="stnwl$RANDOM$RANDOM"
az storage account create \
-g "$RG" -n "$STORAGE" -l "$LOC" \
--sku Standard_LRS \
--kind StorageV2
- In the Azure Portal:
  - Go to Network Watcher → NSG flow logs
  - Select your NSG (`nsg-vm2`)
  - Set Flow logs = On
  - Choose the storage account you created
  - Choose retention (keep it short for the lab)
  - Save
- Generate some traffic (from VM1 to VM2):
  - SSH to VM1
  - SSH to VM2 a few times, or run `curl` against a port if you open one
  - Wait a few minutes for logs to appear
- View logs in the storage account:
  - Storage account → Containers
  - Look for the flow logs container/path created by the feature
  - Download a JSON log file and inspect it
Expected outcome: You’ll find flow log blobs that record allowed/denied flows through the NSG.
Exact container names and schema can evolve; use Microsoft’s documentation for current flow log format and fields.
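Each downloaded record contains comma-separated flow tuples. The sketch below decodes one illustrative tuple; the field order follows the commonly documented v2-style layout, so verify it against Microsoft's current flow-log schema before relying on it:

```shell
# One illustrative flow tuple (v2-style layout, assumed):
# timestamp,srcIP,dstIP,srcPort,dstPort,protocol(T/U),direction(I/O),decision(A/D),...
tuple="1542110377,10.10.1.4,10.10.2.4,44931,22,T,I,D,B,,,,"

echo "$tuple" | awk -F',' '{
  proto = ($6 == "T") ? "TCP" : "UDP"
  dir   = ($7 == "I") ? "inbound" : "outbound"
  dec   = ($8 == "A") ? "allowed" : "denied"
  printf "%s %s %s:%s -> %s:%s %s\n", proto, dir, $2, $4, $3, $5, dec
}'
# → TCP inbound 10.10.1.4:44931 -> 10.10.2.4:22 denied
```

A decoded tuple like this one is the evidence you would expect to see for the lab's deny rule before the fix in Step 10.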
Validation
You have successfully validated that:
- An NSG deny rule caused an SSH outage (reproduced).
- IP flow verify identified the deny action and (typically) the matching rule.
- Next hop confirmed routing was not the issue.
- Test connectivity confirmed the reachability status.
- Updating NSG rules restored connectivity.
- (Optional) NSG flow logs captured flow telemetry to storage.
Troubleshooting
Common issues and fixes:
- Network Watcher isn't enabled in the region
  - Symptom: Tools fail or the region isn't selectable.
  - Fix: Enable it for the region: `az network watcher configure --locations "$LOC" --enabled true`
- RBAC permissions
  - Symptom: Access denied when running diagnostics or configuring flow logs.
  - Fix: Ensure you have appropriate roles (Network Contributor or Contributor, plus storage/log roles for data access).
- NSG applied to the wrong place
  - Symptom: SSH isn't blocked even though you created a deny rule, or remains blocked after allowing.
  - Fix: Confirm the NSG is associated with the correct NIC or subnet, and verify effective rules:
    - Portal → VM → Networking → NIC NSG association
    - Network Watcher → Security group view (effective rules)
- SSH fails even after the allow rule
  - Potential causes:
    - VM2's OS firewall (UFW/iptables) blocks port 22
    - Wrong username or keys
    - You're testing from a different source IP range than the one allowed
  - Fix: Verify the OS firewall and confirm the rule's source prefix matches the VM1 subnet.
- Flow logs not appearing
  - Causes:
    - Flow logs not enabled on the correct NSG
    - Wrong storage account selected, or an access issue
    - Not enough time elapsed
  - Fix: Re-check the configuration, then generate traffic and wait several minutes.
Cleanup
Delete the entire resource group to avoid ongoing charges:
az group delete -n "$RG" --yes --no-wait
Expected outcome: All lab resources are removed (VMs, network, NSG, storage). Confirm in portal after deletion completes.
11. Best Practices
Architecture best practices
- Design for debuggability: Standardize NSG usage (subnet vs NIC) so effective policy is predictable.
- Hub-and-spoke clarity: In complex networks, document UDRs, firewall paths, and DNS—Network Watcher helps validate, but architecture clarity prevents incidents.
- Centralize logs deliberately: Decide whether flow logs go to per-subscription storage or centralized logging subscriptions.
IAM/security best practices
- Apply least privilege:
- Many engineers can run “read-only” diagnostics (topology, effective routes).
- Only a small group should run packet capture.
- Separate permissions for:
- Running packet capture
- Reading capture output in storage
- Use Privileged Identity Management (PIM) where appropriate for just-in-time elevation (verify applicability in your tenant).
Cost best practices
- Enable NSG flow logs selectively and review periodically.
- Control retention and storage lifecycle policies.
- Avoid indiscriminate Log Analytics ingestion for high-volume flow logs unless you have a clear detection/analytics need and budget.
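One way to implement the retention control above is a storage lifecycle rule that deletes flow-log blobs after a fixed window. A hedged sketch: the JSON follows the Azure Storage management-policy schema as commonly documented, and the container prefix is the one NSG flow logs typically write to; verify both against current docs before applying:

```shell
# Write a lifecycle policy that deletes flow-log blobs after 30 days.
# Schema and container prefix assumed from common documentation -- verify.
cat > flowlog-policy.json <<'EOF'
{
  "rules": [
    {
      "enabled": true,
      "name": "expire-flow-logs",
      "type": "Lifecycle",
      "definition": {
        "filters": {
          "blobTypes": ["blockBlob"],
          "prefixMatch": ["insights-logs-networksecuritygroupflowevent"]
        },
        "actions": {
          "baseBlob": { "delete": { "daysAfterModificationGreaterThan": 30 } }
        }
      }
    }
  ]
}
EOF

# Apply it (requires az and an existing storage account):
# az storage account management-policy create \
#   --account-name "$STORAGE" -g "$RG" --policy @flowlog-policy.json
echo "policy rules: $(grep -c '"name"' flowlog-policy.json)"
```

Keeping the window short (days, not months) directly limits both cost and the exposure window for sensitive flow data.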
Performance best practices
- Prefer flow logs for broad visibility and packet capture for targeted deep dives.
- Schedule captures for short windows; use filters where supported.
- For continuous monitoring, pick intervals appropriate for the SLO (don’t over-sample).
Reliability best practices
- Use Connection Monitor for critical dependencies with alerting via Azure Monitor.
- Run periodic “network health checks” as part of operational readiness.
Operations best practices
- Maintain an incident runbook:
- Step 1: test-connectivity
- Step 2: IP flow verify
- Step 3: next hop + effective routes
- Step 4: check NSG flow logs (if enabled)
- Step 5: packet capture (only if needed)
- Standardize naming conventions: `nsg-<app>-<env>-<region>`, `rt-<subnet>-<env>`
- Tags: `env`, `owner`, `costCenter`, `dataSensitivity`
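The five-step incident runbook above can be captured as a small script that emits the diagnostic commands for a given incident. This sketch is a dry run (it prints rather than executes, so it is safe to adapt), and every resource name and IP in it is an illustrative placeholder:

```shell
# Emit the runbook's first three diagnostic commands for an incident.
# All names/IPs are illustrative placeholders; review before executing.
RG="rg-prod"; SRC_VM="vm-web"; DST_IP="10.10.2.4"; DST_PORT="443"

runbook_commands() {
  cat <<EOF
az network watcher test-connectivity -g $RG --source-resource $SRC_VM --dest-address $DST_IP --dest-port $DST_PORT
az network watcher ip-flow-verify -g $RG --vm $SRC_VM --direction Outbound --protocol Tcp --local 10.10.1.4:50000 --remote $DST_IP:$DST_PORT
az network watcher show-next-hop -g $RG --vm $SRC_VM --destination-ip-address $DST_IP
EOF
}

runbook_commands
```

Versioning a generator like this alongside the runbook keeps the triage order (connectivity → flow verify → next hop) consistent across on-call engineers.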
Governance/tagging/naming best practices
- Tag NSGs and logging storage accounts with:
  - `dataClassification` (because flow logs can be sensitive)
  - `retentionPolicy`
  - `securityOwner`
- Use Azure Policy where possible to audit:
- NSG presence on subnets
- Flow log enablement (availability depends on policy definitions—verify current built-ins)
12. Security Considerations
Identity and access model
- Azure Network Watcher actions are controlled by Azure RBAC.
- Treat the ability to run packet capture and to read its output as privileged.
Encryption
- At rest: Azure Storage and Log Analytics encrypt data at rest by default (verify current guarantees and configuration options).
- In transit: Access to storage/workspaces uses TLS.
- Consider customer-managed keys (CMK) if required by policy (verify service support for your chosen storage/workspace configuration).
Network exposure
- Network Watcher does not expose inbound endpoints into your VNets, but:
- Packet capture outputs and flow logs are stored in storage accounts—secure those endpoints (private endpoints, firewall rules, least privilege).
- Avoid public access to storage where possible.
Secrets handling
- Prefer identity-based access (Microsoft Entra ID, formerly Azure AD) to storage over shared keys when possible.
- Rotate storage keys if they must be used (some workflows historically relied on keys; verify current options).
- Avoid sharing capture files outside controlled channels.
Audit/logging
- Log and audit:
- Who enabled flow logs
- Who ran packet captures and when
- Storage access (Blob access logs, Azure Activity Logs)
- Consider sending Activity Logs to a central workspace/SIEM.
Compliance considerations
- Flow logs and packet captures may contain:
- IP addresses, ports, and metadata
- Potential payload data (packet capture)
- Apply:
- Data retention limits
- Access reviews
- Incident handling procedures
Common security mistakes
- Storing packet captures in a broadly accessible storage account.
- Long retention without justification.
- Enabling flow logs everywhere without a plan, then failing to secure or review the data.
- Granting broad Contributor rights to too many people, enabling unintended data exposure.
Secure deployment recommendations
- Use dedicated, locked-down storage accounts for network logs.
- Apply private endpoints and storage firewall rules where possible.
- Define a minimal group for packet capture capability and enforce JIT access.
- Document data handling and retention policies for flow logs and captures.
13. Limitations and Gotchas
The exact limitations vary by feature and evolve over time. Always confirm in official documentation for your region and scenario.
Common, practical gotchas include:
- Regional enablement: Network Watcher is regional; troubleshooting a resource in a region where it’s not enabled can fail or be confusing.
- Agent/extension dependencies: Some features (notably packet capture and certain continuous monitoring scenarios) may require VM extensions/agents and proper VM access.
- Data volume growth: NSG flow logs can become massive in busy environments.
- Retention surprises: Keeping logs “forever” in hot storage is expensive and risky.
- Storage security: Flow logs and packet capture files are sensitive; a misconfigured storage account is a security incident waiting to happen.
- Point-in-time vs continuous: Tools like IP flow verify and next hop are point-in-time evaluations; they don’t replace continuous telemetry.
- Complex routing: Effective routes can be correct yet traffic still fails due to downstream appliances, asymmetric routing, or on-prem routes.
- Portal UX changes: Azure Portal often renames or reorganizes blades; rely on official docs when you can’t find a feature.
- CLI command variations: Azure CLI evolves; if a command differs, use `--help` and cross-check the official CLI reference pages.
14. Comparison with Alternatives
Azure Network Watcher is not the only way to monitor and troubleshoot networking. It’s often used alongside other tools.
Comparison table
| Option | Best For | Strengths | Weaknesses | When to Choose |
|---|---|---|---|---|
| Azure Network Watcher | Azure VNet diagnostics, NSG flow logs, routing/connectivity troubleshooting | First-party Azure-aware tools; deep diagnostics (IP flow verify, next hop); integrates with Azure Monitor | Some features require agents; logging can be high-volume; not a full SIEM/APM | Default choice for Azure IaaS network troubleshooting and baseline network visibility |
| Azure Monitor (Logs/Metrics/Alerts) | Central monitoring, alerting, query, dashboards | Strong analytics and alerting; cross-service observability | Needs data sources (flow logs, VM logs); may increase cost with ingestion | Use with Network Watcher for alerting and long-term analysis |
| Microsoft Sentinel | SIEM and security analytics | Correlation, detection rules, incident management | Additional cost and tuning; needs good data hygiene | Choose when security monitoring and SOC workflows are required |
| Azure Firewall logs / NVA logs | Centralized egress/ingress inspection | L3–L7 visibility at chokepoints | Doesn’t replace NSG-level visibility everywhere | Use when you enforce centralized inspection and want firewall-centric visibility |
| AWS VPC Reachability Analyzer + VPC Flow Logs | Similar capabilities in AWS | Strong path analysis; flow logging | Different cloud; not Azure-native | Choose for AWS environments |
| Google Network Intelligence Center (incl. Connectivity Tests) | Similar capabilities in GCP | Network insights and tests | Different cloud | Choose for GCP environments |
| Self-managed tcpdump/Wireshark/Zeek | Deep packet and protocol analysis | Maximum detail and control | Operational overhead; access challenges; not Azure-aware by default | Use when you need deep inspection beyond what managed tooling provides (often alongside Network Watcher) |
15. Real-World Example
Enterprise example (regulated, hub-and-spoke)
- Problem: A financial services company runs a hub-and-spoke Azure network with strict segmentation. Periodic incidents occur after NSG/UDR changes, and audits require evidence of traffic controls.
- Proposed architecture:
- Enable Azure Network Watcher in all production regions.
- Enable NSG flow logs for:
- Hub firewall subnets
- Spoke subnets containing regulated workloads
- Send flow logs to dedicated storage accounts with lifecycle policies.
- Use Connection Monitor for critical dependency paths (web → API → DB, and hybrid endpoints).
- Use Azure Monitor alerts for connection monitor failures.
- Why this service was chosen:
- Azure-native diagnostics map directly to NSGs, NICs, and UDRs.
- Supports audit and incident response with flow evidence.
- Expected outcomes:
- Faster resolution of “blocked traffic” incidents.
- Improved audit readiness with consistent, reviewable network logs.
- Reduced risk from misconfigured routes and security rules.
Startup/small-team example (lean ops)
- Problem: A startup hosts a small SaaS on a few VMs. They occasionally break internal connectivity when tightening NSG rules and need a fast way to debug without a dedicated network team.
- Proposed architecture:
- Use Azure Network Watcher’s IP flow verify + next hop as standard incident steps.
- Enable NSG flow logs only on the production backend subnet NSG with short retention.
- Use Connection Monitor only for the single most critical dependency path.
- Why this service was chosen:
- Low operational overhead; first-party tool integrated into Azure Portal.
- Targeted logging keeps cost manageable.
- Expected outcomes:
- Fewer “mystery outages” after configuration changes.
- Better confidence during deployments and security hardening.
16. FAQ
1) Is Azure Network Watcher free?
The “service” may not have a flat monthly fee, but many features generate usage-based costs, especially NSG flow logs, storage, and any analytics ingestion. Always check the official pricing page.
2) Is Azure Network Watcher global or regional?
It is regional. You typically enable and use it per Azure region.
3) Do I need to enable Azure Network Watcher manually?
Often it is enabled automatically when you create networking resources, but not always in every scenario. If diagnostics aren’t working, explicitly enable it for the region.
4) What’s the fastest way to see if an NSG is blocking traffic?
Use IP flow verify (and optionally security group view) to see whether a specific flow is allowed or denied and which rule matches.
5) What’s the fastest way to check routing issues?
Use Next hop and effective routes. These show where Azure will send traffic and which routes apply.
6) What is the difference between NSG flow logs and packet capture?
- NSG flow logs: summarized flow records at the NSG level (who talked to whom, allowed/denied).
- Packet capture: packet-level data from a VM (deep inspection, payload risk).
7) Can Azure Network Watcher troubleshoot PaaS services directly?
Network Watcher primarily targets VNet-attached resources (VMs, NICs, NSGs, routes). For PaaS, you often combine it with service-specific diagnostics and Azure Monitor.
8) Does Connection Monitor replace the older Network Performance Monitor?
Historically, Azure offered Network Performance Monitor (NPM) and earlier “classic” experiences. Today, Connection Monitor is the primary approach under Azure Network Watcher/Azure Monitor. Verify current migration guidance in official docs.
9) Can I use Connection Monitor for hybrid connectivity?
Often yes, depending on endpoint types and agent support. Verify current supported endpoints and requirements in the official Connection Monitor documentation.
10) Where should I store NSG flow logs?
Commonly in an Azure Storage account with restricted access and lifecycle policies. Some organizations centralize logs into a dedicated logging subscription.
11) How long should I retain flow logs?
Keep the minimum required for:
- troubleshooting (often days/weeks)
- compliance (varies)
Use storage lifecycle policies to reduce cost and exposure.
12) Can Network Watcher prove my application is healthy?
No. It can show network reachability and network-level symptoms, but application health depends on app logs, dependencies, and performance telemetry.
13) Why does “next hop” look correct but traffic still fails?
Because routing can be correct while:
- NSGs deny the flow
- a firewall/NVA drops traffic
- asymmetric routing breaks return traffic
- DNS resolves incorrectly
- the destination service isn't listening
14) Is it safe to run packet capture in production?
It can be, if you:
- restrict access
- run short captures with filters
- store outputs securely
- have an approved data handling policy
Either way, treat captures as sensitive.
15) How do I operationalize Azure Network Watcher for governance?
Standardize:
- which NSGs have flow logs enabled
- where logs are stored
- retention and access controls
- runbooks for incident triage using IP flow verify, next hop, and test connectivity
Then audit regularly.
16) Does Azure Network Watcher work across subscriptions?
It operates within what you have RBAC access to. Cross-subscription scenarios are common in enterprises, but you must design permissions, logging destinations, and operational processes accordingly.
17) Can I automate Network Watcher diagnostics?
Yes. Many actions are exposed via Azure CLI, PowerShell, and ARM APIs, enabling scripted troubleshooting and runbook automation.
17. Top Online Resources to Learn Azure Network Watcher
| Resource Type | Name | Why It Is Useful |
|---|---|---|
| Official documentation | Azure Network Watcher documentation: https://learn.microsoft.com/azure/network-watcher/ | Authoritative reference for features, requirements, and latest updates |
| Official documentation | Connection Monitor: https://learn.microsoft.com/azure/network-watcher/connection-monitor | How to set up continuous connectivity monitoring and interpret results |
| Official documentation | IP flow verify / traffic filtering diagnostics: https://learn.microsoft.com/azure/network-watcher/diagnose-network-traffic-filtering-problem | Step-by-step NSG deny/allow troubleshooting |
| Official documentation | Next hop / routing diagnostics: https://learn.microsoft.com/azure/network-watcher/diagnose-vm-network-routing-problem | Understand effective routes and route-related failures |
| Official documentation | Packet capture: https://learn.microsoft.com/azure/network-watcher/packet-capture-overview | How to safely capture packets and manage outputs |
| Official documentation | NSG flow logs: https://learn.microsoft.com/azure/network-watcher/nsg-flow-logs | Configuration, log format, and operational guidance |
| Official documentation | VPN troubleshooting: https://learn.microsoft.com/azure/network-watcher/network-watcher-troubleshoot-vpn | Supported scenarios and troubleshooting steps |
| Official pricing page | Azure Network Watcher pricing: https://azure.microsoft.com/pricing/details/network-watcher/ | Current meters and billing model |
| Pricing tool | Azure Pricing Calculator: https://azure.microsoft.com/pricing/calculator/ | Estimate total cost including storage and Log Analytics |
| Official CLI reference | Azure CLI Network Watcher commands: https://learn.microsoft.com/cli/azure/network/watcher | Exact CLI syntax and parameters by version |
| Architecture guidance | Azure Architecture Center: https://learn.microsoft.com/azure/architecture/ | Patterns for hub-spoke, governance, and observability (combine with Network Watcher) |
| Official videos | Microsoft Azure YouTube channel: https://www.youtube.com/@MicrosoftAzure | Search for Network Watcher/Connection Monitor walkthroughs and updates |
| Samples (verify) | Azure samples on GitHub: https://github.com/Azure | Find scripts and examples; validate they match current docs before using |
18. Training and Certification Providers
The following training providers may offer Azure, networking, or operations courses. Verify current course availability and delivery modes on their websites.
| Institute | Suitable Audience | Likely Learning Focus | Mode | Website URL |
|---|---|---|---|---|
| DevOpsSchool.com | DevOps engineers, cloud engineers, SREs | Azure operations, DevOps practices, monitoring fundamentals | Check website | https://www.devopsschool.com/ |
| ScmGalaxy.com | Beginners to intermediate engineers | DevOps, SCM, cloud fundamentals | Check website | https://www.scmgalaxy.com/ |
| CLoudOpsNow.in | Cloud operations teams | Cloud ops, monitoring, governance | Check website | https://www.cloudopsnow.in/ |
| SreSchool.com | SREs, platform engineers | Reliability engineering, incident response, observability | Check website | https://www.sreschool.com/ |
| AiOpsSchool.com | Ops and monitoring teams | AIOps concepts, automation, monitoring analytics | Check website | https://www.aiopsschool.com/ |
19. Top Trainers
These sites may list trainers, coaching, or training services. Verify background and course relevance directly on each site.
| Platform/Site | Likely Specialization | Suitable Audience | Website URL |
|---|---|---|---|
| RajeshKumar.xyz | DevOps/cloud training content (verify specifics) | Beginners to intermediate | https://rajeshkumar.xyz/ |
| devopstrainer.in | DevOps training services (verify specifics) | DevOps engineers, admins | https://devopstrainer.in/ |
| devopsfreelancer.com | Freelance DevOps guidance (verify specifics) | Teams needing short-term help | https://devopsfreelancer.com/ |
| devopssupport.in | DevOps support/training resources (verify specifics) | Ops/DevOps teams | https://devopssupport.in/ |
20. Top Consulting Companies
These consulting organizations may help with Azure architecture, operations, and governance initiatives. Confirm service scope, references, and delivery model directly with each provider.
| Company Name | Likely Service Area | Where They May Help | Consulting Use Case Examples | Website URL |
|---|---|---|---|---|
| cotocus.com | Cloud/DevOps consulting (verify service catalog) | Cloud adoption, operations setup, governance | Designing network observability approach; implementing logging storage and access controls; runbooks for incident response | https://cotocus.com/ |
| DevOpsSchool.com | DevOps/cloud consulting and training (verify service catalog) | DevOps transformation, platform practices | Building operational playbooks; implementing monitoring strategy; standardizing Azure RBAC and tagging | https://www.devopsschool.com/ |
| DEVOPSCONSULTING.IN | DevOps consulting (verify service catalog) | CI/CD, cloud operations | Operationalizing network diagnostics; integrating logs into monitoring workflows; improving MTTR processes | https://devopsconsulting.in/ |
21. Career and Learning Roadmap
What to learn before Azure Network Watcher
To use Azure Network Watcher effectively, you should understand:
- Azure fundamentals: subscriptions, resource groups, RBAC
- Azure networking basics:
  - VNets, subnets
  - NSGs and rule evaluation
  - Route tables (UDRs) and system routes
  - VNet peering
  - DNS basics in Azure
- Basic Linux/Windows networking tools:
  - ping, traceroute, ss/netstat, curl, tcpdump (even if you plan to use managed tools)
What to learn after Azure Network Watcher
To mature beyond ad-hoc troubleshooting, learn:
- Azure Monitor (Logs, Metrics, Alerts)
- Log Analytics / KQL querying
- Microsoft Sentinel (if security monitoring is a requirement)
- Azure Firewall and/or NVA patterns
- Infrastructure as Code (Bicep/Terraform) for consistent NSG/flow log provisioning
- Azure Policy for governance and auditing
Job roles that use it
- Cloud engineer / cloud operations engineer
- Network engineer (cloud)
- SRE / platform engineer
- DevOps engineer
- Security engineer / SOC analyst (as a data source for investigations)
- Solutions architect (designing observability and governance)
Certification path (Azure)
Azure certifications change over time; check Microsoft's certification pages for current tracks. Commonly relevant areas:
- Azure Fundamentals (baseline)
- Azure Administrator (operations)
- Azure Network Engineer (network specialization)
- Azure Security Engineer (security monitoring and governance)
Project ideas for practice
- Build a hub-and-spoke lab and validate:
- UDR routing through a firewall/NVA
- segmentation using IP flow verify
- flow log baselining for allowed/denied traffic
- Create an “incident runbook” repository with scripts:
- test-connectivity wrapper
- next hop + effective routes export
- NSG rule evaluation helper
- Build a cost-controlled logging design:
- flow logs → storage with lifecycle rules
- a small subset → Log Analytics for alerts
22. Glossary
- Azure Network Watcher: Azure service for network monitoring and diagnostics in VNets.
- VNet (Virtual Network): Private network in Azure for hosting resources.
- Subnet: A range within a VNet where resources are placed.
- NIC (Network Interface): Network adapter attached to a VM.
- NSG (Network Security Group): L3/L4 stateful filtering rules controlling inbound/outbound traffic.
- UDR (User Defined Route): Custom route table entries to override system routing.
- Effective routes: The final set of routes applied to a NIC (system + UDR).
- Next hop: The next routing destination Azure selects for traffic to a destination IP.
- IP flow verify: A check that returns whether a given 5-tuple is allowed/denied by NSG rules.
- NSG flow logs: Logs that record flows through an NSG (allowed/denied).
- Packet capture: Capturing packets (pcap) from a VM for deep network analysis.
- Connection Monitor: Continuous connectivity monitoring with metrics and (typically) alert integration.
- Log Analytics workspace: Azure Monitor logs store queried with KQL.
- KQL (Kusto Query Language): Query language for Log Analytics and related services.
- Azure RBAC: Role-based access control for managing access to Azure resources.
- MTTR: Mean time to recovery/resolve; a key operations metric.
23. Summary
Azure Network Watcher is Azure’s built-in, regional network diagnostics and visibility service. It matters because cloud networking failures are often caused by subtle interactions between NSGs, routes, and gateways—and Network Watcher provides Azure-native tools (IP flow verify, next hop, connectivity tests, flow logs, and packet capture) to troubleshoot quickly and consistently.
From a cost perspective, the key is understanding that logging and analytics drive spend: NSG flow logs can generate large volumes, and storage/Log Analytics ingestion and retention can become significant. From a security perspective, treat flow logs and especially packet captures as sensitive data, and lock down access to both the diagnostic actions and the stored outputs.
Use Azure Network Watcher when you need reliable, first-party network troubleshooting and governance for Azure VNets. Pair it with Azure Monitor and (optionally) Sentinel when you need alerting and centralized security analytics.
Next step: implement a small production-ready pattern—selective NSG flow logs + Connection Monitor for critical paths + runbooks—and validate it against your organization’s operational and compliance requirements using Microsoft’s official documentation.