Category
Management and Governance
1. Introduction
Azure Network Watcher is Azure’s native network monitoring and diagnostics service for virtual networks. It helps you understand, diagnose, and gain visibility into network traffic, routing, and connectivity issues inside Azure networking.
In simple terms: Azure Network Watcher tells you what the network is doing and why—whether a VM can (or cannot) reach another endpoint, which route is being used, whether an NSG rule is blocking traffic, and what flows are actually traversing your network.
Technically, Azure Network Watcher is a regional service that provides a set of tools (connectivity tests, packet capture, NSG flow logs, topology views, routing diagnostics, and VPN troubleshooting) that interact with Azure networking resources such as VNets, NICs, NSGs, route tables, and VPN gateways. It integrates with Azure Monitor and Log Analytics for alerting, visualization, and long-term analysis.
What problem it solves: network problems are often the hardest to troubleshoot in cloud environments because the “network” is a mix of distributed components (NSGs, routes, DNS, NAT, firewalls, private endpoints, gateways). Azure Network Watcher provides first-party, Azure-aware diagnostics so you can troubleshoot faster, reduce downtime, and improve governance over network operations.
Naming and lifecycle note (verify in official docs for the latest): Azure Network Watcher is an active service. Some experiences within it have evolved over time (for example, “Connection Monitor” has newer versions, and older “classic” monitoring experiences have been retired or deprecated). When using this tutorial, always prefer the latest workflow shown in Azure’s official documentation.
2. What is Azure Network Watcher?
Official purpose: Azure Network Watcher is designed to monitor, diagnose, view metrics, and enable or disable logs for resources in an Azure virtual network.
Core capabilities (what it does)
Azure Network Watcher commonly provides:
- Connectivity diagnostics (why a connection succeeds or fails)
- Traffic flow logging (NSG flow logs)
- Network topology visualization (resource relationship mapping)
- Routing diagnostics (effective routes, next-hop checks)
- Security rule evaluation (IP flow verify, security group view)
- Deep packet-level capture (packet capture on supported VMs)
- VPN troubleshooting (for supported VPN gateway scenarios)
Major components (how it’s organized)
In practice, you’ll interact with Azure Network Watcher through:
– Azure Portal (Network Watcher blade)
– Azure CLI / PowerShell (network watcher commands)
– Azure Resource Manager (ARM) APIs (management-plane operations)
– Regional Network Watcher resource that Azure creates/uses (often visible in a NetworkWatcherRG resource group in the region)
Service type
- Service category fit: Although it’s closely tied to networking, Azure Network Watcher is also a key Management and Governance tool because it supports operational governance: troubleshooting, logging, auditing network changes, and enforcing visibility standards across environments.
- Type: Primarily a management-plane diagnostic service that orchestrates data collection from networking resources and agents/extensions on VMs for certain features.
Scope: regional vs global
- Regional service: Azure Network Watcher is enabled per region. If you troubleshoot resources in multiple Azure regions, you typically ensure Network Watcher is enabled in each of those regions.
- Subscription context: It operates within the context of a subscription (and the resources you have access to via RBAC).
How it fits into the Azure ecosystem
Azure Network Watcher complements and integrates with:
- Azure Virtual Network (VNet), NICs, NSGs, route tables, Private Endpoints
- VPN Gateway (and certain gateway diagnostic scenarios)
- Azure Monitor / Log Analytics for querying and alerting on logs
- Azure Storage (common destination for NSG flow logs and packet captures)
- Microsoft Sentinel (optional downstream consumption of logs, depending on your logging architecture)
3. Why use Azure Network Watcher?
Business reasons
- Reduce downtime and MTTR: Faster root cause analysis for network incidents.
- Improve service reliability: Proactive monitoring (for example, continuous connection monitoring) can detect degradations before users report them.
- Operational standardization: Establish repeatable troubleshooting and logging patterns across teams and subscriptions.
Technical reasons
- Azure-aware diagnostics: The tools understand Azure networking objects (NSGs, UDRs, NAT behavior, gateways).
- Pinpoint “where it broke”: Identify whether failure is due to NSG rules, routing, DNS resolution, or unreachable next hop.
- Evidence-driven troubleshooting: Flow logs and packet capture provide concrete proof rather than assumptions.
Operational reasons
- Self-service for platform teams: Provide engineers a consistent diagnostic toolkit.
- Integrates with runbooks: CLI/PowerShell support enables automation in incident response.
- Supports governance: Enabling consistent logging (like NSG flow logs) supports audits and investigations.
Security/compliance reasons
- Network forensics: Flow logs help investigate suspicious traffic patterns and policy violations.
- Change validation: Verify the effect of NSG/route changes without waiting for user impact.
- Segmentation assurance: Validate that segmentation rules actually block prohibited traffic.
Scalability/performance reasons
- Designed for cloud scale: Continuous monitoring and logging can scale across many VNets and workloads (with careful cost control).
- Targeted deep dives: Packet capture is available when you need detail, rather than always-on full capture.
When teams should choose it
Choose Azure Network Watcher when you need:
- Repeatable network troubleshooting for Azure IaaS and hybrid networking
- NSG flow visibility for detection, investigation, and governance
- Connection monitoring between endpoints (Azure-to-Azure and potentially hybrid, depending on your design)
- A first-party approach aligned with Azure RBAC and the resource model
When teams should not choose it
Consider alternatives or complements when:
- You need full NPM/APM across applications (use Azure Monitor Application Insights or third-party APM)
- You need full network IDS/IPS or advanced L7 inspection (consider Azure Firewall, a third-party NVA, or dedicated security tooling)
- Your environment is mostly PaaS-only with minimal VNets/NSGs (you may rely more on service-specific diagnostics and Azure Monitor)
- You require long-term, centralized SIEM correlation: Network Watcher is a source; Sentinel/SIEM is the analysis plane
4. Where is Azure Network Watcher used?
Industries
Common in any industry operating regulated or mission-critical networks:
- Finance and insurance (segmentation, auditability)
- Healthcare (compliance logging, incident investigations)
- Retail/e-commerce (availability and performance)
- SaaS providers (multi-tenant segmentation and operational monitoring)
- Public sector (governance, audit trails)
Team types
- Cloud platform teams managing shared networking
- SRE/operations teams responding to incidents
- Security engineering and SOC teams investigating network events
- DevOps teams validating connectivity for deployments
- Network engineers extending on-prem patterns into Azure
Workloads and architectures
- Hub-and-spoke VNets with centralized firewalls
- Multi-region active-active or active-passive designs
- Hybrid connectivity (VPN/ExpressRoute plus on-prem DNS)
- Microsegmented environments using NSGs and UDRs
- Kubernetes clusters (AKS) that depend on underlying VNet routing and security (note: some diagnostics are at VM/NIC/NSG level; interpret accordingly)
Real-world deployment contexts
- Production: Continuous connection monitoring, NSG flow logs to a centralized logging account, alerting on failures
- Dev/test: Ad-hoc troubleshooting during provisioning, validating NSG rules, debugging routes during lab builds
5. Top Use Cases and Scenarios
Below are realistic scenarios where Azure Network Watcher is commonly used.
1) Diagnose “VM can’t reach VM” inside a VNet
- Problem: Two VMs in the same VNet can’t connect on a specific port.
- Why it fits: Connection troubleshoot + IP flow verify can quickly isolate NSG/routing issues.
- Example: App VM can’t reach DB VM on TCP 1433 after an NSG change.
2) Validate NSG rules before and after deployments
- Problem: Engineers deploy new rules but aren’t sure which rule will match traffic.
- Why it fits: IP flow verify and security group view help confirm effective security rules.
- Example: Confirm that only the load balancer subnet can reach backend VMs on port 443.
3) Capture packets to debug intermittent TCP resets
- Problem: Users see timeouts or resets, but logs aren’t conclusive.
- Why it fits: Packet capture provides packet-level evidence (SYN/SYN-ACK, retransmits, resets).
- Example: A Linux VM intermittently fails to establish TLS sessions to an internal API.
4) Audit traffic patterns with NSG flow logs
- Problem: Need to know which sources are talking to a subnet and on which ports.
- Why it fits: NSG flow logs provide structured flow telemetry for investigation and baselining.
- Example: Detect unexpected inbound attempts on SSH from unapproved source ranges.
5) Confirm routing and next hop after UDR changes
- Problem: Routing changes accidentally send traffic to the wrong appliance or blackhole.
- Why it fits: Next hop and effective routes show the chosen path.
- Example: After adding a default route to a firewall, a subnet loses access to Azure services.
6) Troubleshoot VPN connectivity issues
- Problem: On-premises can’t reach Azure subnets through VPN.
- Why it fits: VPN troubleshoot can help identify common tunnel and configuration issues.
- Example: A site-to-site VPN drops after an on-prem network device update.
7) Continuous monitoring of critical dependencies
- Problem: Need early warning if a critical service becomes unreachable.
- Why it fits: Connection Monitor supports continuous tests and integration with alerts (via Azure Monitor).
- Example: Monitor connectivity between web tier and database tier across regions.
8) Validate segmentation in hub-and-spoke environments
- Problem: Must prove that spokes are isolated except through shared services.
- Why it fits: IP flow verify, next hop, and flow logs help validate and document segmentation.
- Example: Ensure Spoke-A cannot reach Spoke-B directly, only via firewall.
9) Investigate suspected data exfiltration paths
- Problem: Security suspects a VM is sending data to an unauthorized destination.
- Why it fits: NSG flow logs and connection monitoring help confirm egress paths and destinations.
- Example: A workload unexpectedly initiates outbound connections to unknown IPs.
10) Troubleshoot DNS-related connectivity symptoms (indirectly)
- Problem: “Connection fails” but root cause is name resolution.
- Why it fits: Connection troubleshooting workflows can reveal whether failure is at DNS vs network.
- Example: App can reach an IP directly but fails when using hostname after DNS changes.
11) Validate Private Endpoint and NSG/UDR interactions
- Problem: Private Endpoint traffic doesn’t behave as expected; access is denied.
- Why it fits: Effective security rules and route diagnostics clarify whether traffic is blocked.
- Example: Private Endpoint access fails from a locked-down subnet with strict NSGs.
12) Standardize incident runbooks for network triage
- Problem: Different engineers troubleshoot differently, wasting time.
- Why it fits: Network Watcher provides consistent tools that can be embedded in runbooks.
- Example: “Tier-1 network triage” checklist using next hop + IP flow verify + test connectivity.
6. Core Features
This section lists key Azure Network Watcher features commonly available in current Azure deployments. Availability and exact UI naming can change—verify in official docs for your region and subscription.
Network topology
- What it does: Visualizes network resources and their relationships (VNets, subnets, NICs, NSGs, route tables, gateways).
- Why it matters: Helps quickly understand “what’s connected to what.”
- Practical benefit: Faster onboarding and troubleshooting—especially in shared hub-and-spoke networks.
- Limitations/caveats: Topology is a view, not a source of truth for traffic flows; it shows resource relationships, not packet paths.
Connection Monitor (connectivity monitoring)
- What it does: Continuously monitors connectivity between endpoints and collects latency/availability metrics; can integrate with alerting.
- Why it matters: Moves teams from reactive troubleshooting to proactive detection.
- Practical benefit: Detect dependency failures between tiers (web → API → DB), across subnets/regions/hybrid (depending on endpoint type and agent support).
- Limitations/caveats: Often requires an agent/extension on VMs for certain endpoint types; monitor design affects cost and data volume.
Connection troubleshoot / Test connectivity
- What it does: Performs on-demand connectivity checks between a source and destination and provides diagnostic output (reachable/unreachable, hops, potential blocking).
- Why it matters: Quickly answers “is it network or not?”
- Practical benefit: Identifies NSG/UDR issues without manually correlating rules and routes.
- Limitations/caveats: Some checks require VM agent/extension; results are point-in-time.
IP flow verify
- What it does: Validates whether traffic (5-tuple) is allowed or denied by NSG rules for a VM NIC at a given time.
- Why it matters: NSG rule evaluation is one of the most common causes of connectivity issues.
- Practical benefit: Pinpoints the exact NSG rule (allow/deny) affecting traffic.
- Limitations/caveats: Focused on NSG evaluation; doesn’t prove the remote endpoint is listening or that routing is correct.
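The evaluation model behind IP flow verify is simple to reason about: NSG rules are processed in ascending priority order (lower number wins), and the first matching rule decides the outcome. A minimal conceptual sketch in Python—the rule set, field names, and the simplified matching (exact port, single source prefix) are illustrative only, not the Azure API; real NSGs also carry built-in default rules:

```python
import ipaddress

# Illustrative NSG rules; lowest priority number is evaluated first.
rules = [
    {"name": "Allow-SSH-From-Subnet1", "priority": 90, "access": "Allow",
     "protocol": "Tcp", "source_prefix": "10.10.1.0/24", "dest_port": 22},
    {"name": "Deny-SSH-Inbound", "priority": 100, "access": "Deny",
     "protocol": "Tcp", "source_prefix": "0.0.0.0/0", "dest_port": 22},
]

def evaluate(src_ip, dest_port, protocol):
    """Return (access, rule_name) for the first rule matching the flow."""
    for rule in sorted(rules, key=lambda r: r["priority"]):
        if (rule["protocol"] == protocol
                and rule["dest_port"] == dest_port
                and ipaddress.ip_address(src_ip) in ipaddress.ip_network(rule["source_prefix"])):
            return rule["access"], rule["name"]
    return "Deny", "DefaultRule"  # implicit deny if nothing matches

print(evaluate("10.10.1.4", 22, "Tcp"))    # allowed by the subnet-scoped rule
print(evaluate("203.0.113.9", 22, "Tcp"))  # caught by the broad deny rule
```

This is why rule priorities matter so much in practice: a broad deny at priority 100 is harmless if a narrower allow sits at priority 90, and IP flow verify tells you exactly which rule won.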
Next hop
- What it does: Shows the next hop type and IP for traffic from a VM to a destination, based on effective routes.
- Why it matters: Routing surprises are common in hub-and-spoke networks with UDRs.
- Practical benefit: Confirms whether traffic is going to Internet, a virtual appliance, a gateway, or staying within VNet.
- Limitations/caveats: Again, route choice isn’t the same as end-to-end success; downstream devices can still drop traffic.
Effective routes
- What it does: Displays the effective route table applied to a NIC, including system routes and user-defined routes (UDRs).
- Why it matters: Many outages come from unintended route propagation or UDR mistakes.
- Practical benefit: Enables deterministic verification of routing behavior.
- Limitations/caveats: Must interpret with knowledge of peering, gateways, and appliance routing.
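The next hop and effective routes tools both rest on the same selection rule: among routes whose prefix contains the destination, the most specific (longest) prefix wins. A simplified Python sketch of that selection—the route entries are illustrative, and real effective-route evaluation also breaks prefix-length ties by route origin (user-defined before BGP before system):

```python
import ipaddress

# Illustrative effective route table for a NIC.
routes = [
    {"prefix": "10.10.0.0/16", "next_hop": "VnetLocal"},
    {"prefix": "0.0.0.0/0",    "next_hop": "VirtualAppliance"},  # UDR to a firewall
]

def next_hop(dest_ip):
    """Pick the matching route with the longest prefix (most specific wins)."""
    dest = ipaddress.ip_address(dest_ip)
    candidates = [r for r in routes if dest in ipaddress.ip_network(r["prefix"])]
    return max(candidates,
               key=lambda r: ipaddress.ip_network(r["prefix"]).prefixlen)["next_hop"]

print(next_hop("10.10.2.5"))  # in-VNet destination stays local
print(next_hop("8.8.8.8"))    # everything else falls through to the default route
```

This also explains a classic outage pattern in the use cases above: adding a 0.0.0.0/0 UDR changes the next hop for every destination not covered by a more specific route.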
Security group view (effective NSG rules)
- What it does: Shows the effective inbound/outbound security rules applied to a NIC from associated NSGs.
- Why it matters: Multiple NSGs (subnet + NIC) can make effective policy unclear.
- Practical benefit: Quickly review the rules that actually apply.
- Limitations/caveats: Effective rules are still just rules; they don’t validate remote service health.
NSG flow logs
- What it does: Logs network flows that pass through an NSG, typically to a storage account; optionally integrated into broader log analytics pipelines.
- Why it matters: Provides network visibility and supports investigations.
- Practical benefit: Audit traffic patterns, detect anomalies, and validate segmentation.
- Limitations/caveats: Can generate large volumes; requires careful retention, storage security, and cost management.
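Flow log records are JSON documents whose `flowTuples` entries encode each flow as a comma-separated string, which makes them easy to post-process. A minimal parsing sketch—the field order follows the documented version 2 layout, but verify it against the flow log version you actually enable:

```python
# Minimal parser for one NSG flow log v2 tuple. The field order below follows
# the documented v2 layout; confirm against your configured flow log version.
FIELDS = ["timestamp", "src_ip", "dest_ip", "src_port", "dest_port",
          "protocol", "direction", "decision", "state",
          "packets_out", "bytes_out", "packets_in", "bytes_in"]

def parse_flow_tuple(raw):
    parts = raw.split(",")
    # Begin records may omit the traffic counters; pad so every field exists.
    parts += [""] * (len(FIELDS) - len(parts))
    return dict(zip(FIELDS, parts))

sample = "1668591885,203.0.113.9,10.10.2.4,44321,22,T,I,D,B,,,,"
flow = parse_flow_tuple(sample)
print(flow["decision"])   # 'D' = denied
print(flow["direction"])  # 'I' = inbound
```

A few lines like this, fed from the storage account, are often enough to answer audit questions such as "which sources attempted SSH into this subnet and were denied."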
Packet capture
- What it does: Captures packets on a VM (often via a Network Watcher agent/extension) and stores captures for analysis.
- Why it matters: When logs and flow summaries aren’t enough, packets provide ground truth.
- Practical benefit: Diagnose TCP handshakes, MTU issues, retransmissions, and TLS negotiation problems.
- Limitations/caveats: Sensitive data risk; capture files can be large; requires strict access control and short retention.
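Captures come back as standard .pcap files, so any packet tooling can open them. As a quick sanity check before pulling a large file into an analyzer, you can validate the 24-byte classic libpcap global header with the standard library (this sketch covers the classic format only; pcapng files use a different layout):

```python
import struct

# Classic libpcap magic numbers (microsecond vs nanosecond timestamp variants).
PCAP_MAGICS = {0xA1B2C3D4: "microseconds", 0xA1B23C4D: "nanoseconds"}

def read_pcap_header(data):
    """Parse the 24-byte classic libpcap global header from raw bytes."""
    if len(data) < 24:
        raise ValueError("truncated pcap header")
    magic = struct.unpack("<I", data[:4])[0]
    endian = "<"
    if magic not in PCAP_MAGICS:                 # retry as big-endian
        magic = struct.unpack(">I", data[:4])[0]
        endian = ">"
    if magic not in PCAP_MAGICS:
        raise ValueError("not a classic pcap file (pcapng?)")
    vmaj, vmin, _tz, _sig, snaplen, linktype = struct.unpack(endian + "HHiIII", data[4:24])
    return {"version": (vmaj, vmin), "snaplen": snaplen, "linktype": linktype}

# Synthetic little-endian header: v2.4, snaplen 65535, linktype 1 (Ethernet).
header = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
print(read_pcap_header(header))
```

Checking the snaplen here also tells you whether a capture was truncated per packet, which matters when diagnosing TLS negotiation or MTU issues.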
VPN troubleshoot (for supported VPN Gateway scenarios)
- What it does: Helps diagnose VPN tunnel connectivity issues using Azure-side diagnostics.
- Why it matters: Hybrid connectivity is business-critical and often complex.
- Practical benefit: Faster isolation of misconfigurations and tunnel state issues.
- Limitations/caveats: Not a replacement for on-prem device logs; scenario coverage varies—verify support for your gateway type and configuration.
7. Architecture and How It Works
High-level architecture
Azure Network Watcher is a regional orchestration service that:
1. Uses Azure control-plane APIs to inspect network configuration (NSGs, routes, NICs).
2. For certain features (packet capture, continuous connection monitoring), coordinates with an agent/VM extension to collect data from the guest/host boundary.
3. Stores outputs in Azure Storage and/or Log Analytics, depending on the feature and your configuration.
4. Surfaces results via the Portal, CLI/PowerShell, and APIs.
Request/data/control flow
- Control plane: You request a diagnostic action (e.g., next hop) → Network Watcher queries Azure networking configuration → returns results.
- Data plane (logging): NSG flow logs and packet captures generate data → written to storage/log destinations → queried/processed externally (Log Analytics, SIEM, notebooks, etc.).
Key integrations
- Azure Monitor / Log Analytics: alerting and analytics for connection monitoring and log queries.
- Azure Storage: common sink for NSG flow logs and packet capture files.
- Azure RBAC: governs who can run diagnostics and access captured data.
- Network resources: VNets, NSGs, NICs, route tables, gateways, load balancers (as applicable).
Dependency services
Typical dependencies you’ll see in real deployments:
- Storage accounts (logging destination)
- Log Analytics workspace (analytics, alerting)
- VM extensions/agents (for packet capture and some monitoring scenarios)
- Azure Policy (to enforce enabling flow logs or diagnostic settings—policy availability and effects vary by resource type; verify in official docs)
Security/authentication model
- Uses Azure AD authentication and Azure RBAC for permissions.
- Diagnostic actions are management operations; access is governed by roles on subscriptions/resource groups/resources.
- Data access (packet capture files, flow logs) is governed by permissions on the storage account or workspace.
Networking model
- Network Watcher does not “sit inline” in your traffic path.
- It observes configuration and logs, and for some features it triggers captures/agents on endpoints.
- NSG flow logs are generated as part of NSG processing and exported to configured destinations.
Monitoring/logging/governance considerations
- Decide upfront:
- Which VNets/subnets require flow logs (usually production and shared networks)
- Retention period and access controls for logs
- Whether logs go to centralized storage/workspaces per environment
- Who can run packet capture and where outputs are stored
- Use tags, naming standards, and consistent resource group layouts to make diagnostics repeatable.
Simple architecture diagram (Mermaid)
flowchart LR
User[Engineer / SRE] -->|Portal / CLI| NW["Azure Network Watcher (Regional)"]
NW --> ARM[Azure Resource Manager APIs]
ARM --> VNet[VNets / NICs / NSGs / Routes]
NW -->|Flow logs / Capture output| Storage[Azure Storage Account]
NW -->|"Metrics/Logs (optional)"| LA[Log Analytics Workspace]
User -->|Query| LA
User -->|Download| Storage
Production-style architecture diagram (Mermaid)
flowchart TB
subgraph Subscriptions["Azure Subscriptions"]
subgraph Hub["Hub Network (Prod)"]
FW[Azure Firewall / NVA]
GW[VPN Gateway]
NSGHub[NSGs + UDRs]
end
subgraph Spokes["Spoke VNets (Prod)"]
App[App VMs/VMSS]
DB[DB VMs]
NSGSpoke[NSGs per subnet/NIC]
end
end
subgraph NWRegion["Azure Network Watcher (per region)"]
CM[Connection Monitor]
Diag["Diagnostics: Next Hop / IP Flow Verify / Topology"]
PC["Packet Capture (on demand)"]
FL[NSG Flow Logs]
end
subgraph Observability["Observability & Governance"]
SA["Central Storage Account(s)"]
LAW[Log Analytics Workspace]
AM[Azure Monitor Alerts]
SIEM["Microsoft Sentinel (optional)"]
end
App <--> DB
App --> FW
FW --> GW
CM --> LAW
FL --> SA
PC --> SA
Diag --> App
Diag --> DB
LAW --> AM
LAW --> SIEM
8. Prerequisites
Before starting with Azure Network Watcher in a lab or production environment:
Azure account and subscription
- An Azure subscription where you can create:
- Resource groups
- VNets, subnets, NSGs
- Virtual machines
- Storage account (for logs)
- Billing must be enabled (even if using free credits), because:
- VMs cost money
- Storage and log ingestion can cost money
Permissions / IAM roles
You need sufficient Azure RBAC permissions for:
- Creating network and compute resources (e.g., Contributor on a resource group for the lab)
- Running Network Watcher operations (typically covered by Contributor/Network Contributor)
- Enabling NSG flow logs and accessing storage outputs

Common roles (choose the least privilege appropriate to your org):
- Network Contributor (network resources)
- Virtual Machine Contributor (VMs)
- Storage Blob Data Reader/Contributor (to view flow logs / packet captures in storage)
- Log Analytics Reader/Contributor (if using Log Analytics)
In production, separate “who can run packet capture” from “who can read capture files” to reduce sensitive data exposure.
Tools
For the hands-on lab:
– Azure CLI (recommended): https://learn.microsoft.com/cli/azure/install-azure-cli
– Optional: PowerShell Az module, or Portal-only workflow
Region availability
- Azure Network Watcher is regional and broadly available. Still:
- Verify your target region supports the features you need.
- Ensure Network Watcher is enabled in that region for your subscription (often automatic, but not guaranteed in every scenario).
Quotas/limits (high level)
- VM cores quota in the region
- Public IP quotas (if using public access)
- Storage account limits (IOPS/throughput) and retention
- Flow log volume and workspace ingestion limits (if integrating with Log Analytics)
Always verify exact service limits in official docs for the feature you’re using.
Prerequisite services/resources
- Virtual network and subnets
- NSG applied to a subnet or NIC (for flow logs and rule evaluation)
- VMs for connectivity and packet capture scenarios
- Storage account for NSG flow logs and packet capture outputs (recommended)
9. Pricing / Cost
Azure Network Watcher pricing is feature- and usage-dependent. The base “service” may appear free, but many capabilities generate costs via dependent services and data processing.
Official pricing page (verify current pricing and meters):
https://azure.microsoft.com/pricing/details/network-watcher/
Azure Pricing Calculator:
https://azure.microsoft.com/pricing/calculator/
Pricing dimensions (what you pay for)
Depending on what you enable, costs typically come from:
- NSG flow logs
  - Log generation/export and/or processing (metering depends on the current Azure pricing model—verify)
  - Storage costs (hot/cool/archive, transactions)
  - Optional analytics costs if you ingest into Log Analytics / SIEM
- Connection Monitor
  - Test runs, monitoring frequency, and data ingestion/storage (often via Azure Monitor/Log Analytics)
- Packet capture
  - Storage for capture files (pcap) and storage transactions
  - VM overhead (CPU/disk during capture) can be an indirect cost
- VMs used for monitoring
  - If you deploy “test agents” or monitor from VMs, VM runtime is a cost driver
- Log Analytics
  - Data ingestion, retention beyond free thresholds (if any), and queries (pricing varies by model and region)
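To make the flow log cost driver concrete, a back-of-envelope storage estimate can be sketched in a few lines. Every input below (flow rate, record size, retention) is an assumption for illustration—substitute measured values and current meters from the pricing calculator:

```python
# Rough storage-volume estimate for NSG flow logs.
# All inputs are illustrative assumptions, not Azure pricing data.
flows_per_second = 500   # average new flows across the logged NSGs (assumed)
bytes_per_record = 200   # approximate on-disk size of one flow tuple (assumed)
retention_days = 30

daily_gb = flows_per_second * 86_400 * bytes_per_record / 1e9
stored_gb = daily_gb * retention_days

print(f"~{daily_gb:.1f} GB/day, ~{stored_gb:.0f} GB held at {retention_days}-day retention")
```

Even at these modest assumed rates the volume lands in the hundreds of gigabytes per month, which is why scoping flow logs to the subnets that need them, and lifecycling older blobs to cooler tiers, are the first optimizations to apply.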
Free tier
- There is no universal “free tier” that makes all Network Watcher features free. Some components may not charge directly, but downstream storage and analytics almost always do.
- Always validate what is included for your subscription type and region in the official pricing page.
Main cost drivers
- Volume of flow logs (high traffic subnets generate lots of logs)
- Retention period (storage costs scale with time)
- Connection Monitor frequency and number of tests
- Log Analytics ingestion (if you centralize flow logs or monitoring into a workspace)
- Packet capture size and frequency
Hidden or indirect costs
- Data transfer charges (for moving logs between regions, or exporting out of Azure)
- Security overhead (Key rotation, access reviews, SIEM integration)
- Operational overhead (incident response processes, runbooks, tooling)
Network/data transfer implications
- If you centralize logs cross-region, or export to third-party tools, you may incur:
- Inter-region bandwidth costs
- Egress charges to the internet or to other clouds
How to optimize cost (practical guidance)
- Enable NSG flow logs only where needed (production, shared services, sensitive subnets).
- Use short retention in hot storage; lifecycle older logs to cool/archive when appropriate.
- Prefer targeted packet captures with strict time windows and filters.
- Right-size Connection Monitor: fewer endpoints, longer intervals, and focused tests for critical paths.
- If using Log Analytics:
- Control which logs are ingested
- Define retention intentionally
- Consider sampling strategies where appropriate (verify what is supported)
Example low-cost starter estimate (conceptual)
A minimal lab typically includes:
- 2 small Linux VMs for a short time (primary cost)
- 1 storage account with minimal logs/captures (small cost if volume and retention are limited)
- Optional Log Analytics workspace (can add cost if ingesting flow logs)
Because prices vary by region and change over time, do not assume fixed numbers—use the pricing calculator with your region and expected data volumes.
Example production cost considerations
In production, the largest costs usually come from:
- High-volume NSG flow logs across many subnets
- Centralized analytics (Log Analytics/SIEM ingestion and retention)
- Multiple regions and high-availability logging patterns

A cost-conscious production pattern is:
- Enable flow logs for critical NSGs only
- Centralize logs in a small number of storage accounts with lifecycle policies
- Ingest only necessary subsets into analytics platforms
- Use scheduled audits and on-demand deep diagnostics (packet capture) rather than always-on deep capture
10. Step-by-Step Hands-On Tutorial
This lab builds a small Azure network, intentionally blocks traffic with an NSG, and then uses Azure Network Watcher tools to identify the cause and validate the fix. It is designed to be safe and relatively low-cost if you delete resources afterward.
Objective
- Create two VMs in the same VNet on different subnets.
- Apply an NSG rule that blocks SSH to one VM.
- Use Azure Network Watcher to:
- Test connectivity
- Verify IP flow (NSG allow/deny)
- Check next hop (routing)
- Fix the NSG rule and confirm connectivity.
- (Optional) Enable NSG flow logs to a storage account and view generated log blobs.
Lab Overview
You will create:
– Resource group: rg-nw-lab
– VNet: vnet-nw-lab with two subnets
– NSG: nsg-vm2 applied to VM2 NIC (or subnet)
– VM1 (jump/test): vm1-nw (Linux)
– VM2 (target): vm2-nw (Linux)
– Public IP for VM1 to SSH in (optional but convenient)
– Use Azure Network Watcher diagnostics in the same region
Cost note: The biggest cost in this lab is VM runtime. Use small VM sizes and delete the resource group when finished.
Step 1: Set variables and sign in (Azure CLI)
- Open a terminal with Azure CLI installed.
- Sign in and pick a subscription.
az login
az account show
# If needed:
az account set --subscription "<YOUR_SUBSCRIPTION_ID_OR_NAME>"
Set variables (choose a region close to you):
RG="rg-nw-lab"
LOC="eastus" # change as needed
VNET="vnet-nw-lab"
SUBNET1="snet-vm1"
SUBNET2="snet-vm2"
NSG="nsg-vm2"
VM1="vm1-nw"
VM2="vm2-nw"
ADMINUSER="azureuser"
Expected outcome: You’re authenticated, and variables are defined.
Step 2: Create a resource group and VNet with two subnets
az group create -n "$RG" -l "$LOC"
az network vnet create \
-g "$RG" -n "$VNET" -l "$LOC" \
--address-prefixes 10.10.0.0/16 \
--subnet-name "$SUBNET1" --subnet-prefixes 10.10.1.0/24
az network vnet subnet create \
-g "$RG" --vnet-name "$VNET" -n "$SUBNET2" \
--address-prefixes 10.10.2.0/24
Expected outcome: Resource group, VNet, and two subnets exist.
Verify:
az network vnet show -g "$RG" -n "$VNET" --query "{addressSpace:addressSpace.addressPrefixes, subnets:subnets[].name}" -o table
Step 3: Create an NSG that blocks SSH inbound (intentionally)
Create an NSG and a deny rule for TCP 22 inbound:
az network nsg create -g "$RG" -n "$NSG" -l "$LOC"
az network nsg rule create \
-g "$RG" --nsg-name "$NSG" -n "Deny-SSH-Inbound" \
--priority 100 \
--direction Inbound --access Deny --protocol Tcp \
--source-address-prefixes "*" --source-port-ranges "*" \
--destination-address-prefixes "*" --destination-port-ranges 22
Expected outcome: VM2 will not allow inbound SSH (even from VM1) once the NSG is applied.
Verify:
az network nsg rule list -g "$RG" --nsg-name "$NSG" -o table
Step 4: Create VM1 (with public IP) and VM2 (private only)
Create VM1 in subnet1:
az vm create \
-g "$RG" -n "$VM1" -l "$LOC" \
--image Ubuntu2204 \
--admin-username "$ADMINUSER" \
--generate-ssh-keys \
--vnet-name "$VNET" --subnet "$SUBNET1" \
--public-ip-sku Standard
Create VM2 in subnet2 (no public IP):
az vm create \
-g "$RG" -n "$VM2" -l "$LOC" \
--image Ubuntu2204 \
--admin-username "$ADMINUSER" \
--generate-ssh-keys \
--vnet-name "$VNET" --subnet "$SUBNET2" \
--public-ip-address ""
Apply the NSG to VM2’s NIC (NIC-level association keeps the lab explicit and easy to reason about):
VM2_NIC_ID=$(az vm show -g "$RG" -n "$VM2" --query "networkProfile.networkInterfaces[0].id" -o tsv)
az network nic update \
--ids "$VM2_NIC_ID" \
--network-security-group "$NSG"
Expected outcome: VM1 is reachable via SSH from your machine; VM2 has no public IP and blocks SSH inbound due to the NSG.
Verify VM IPs:
VM1_PUBLIC_IP=$(az vm show -d -g "$RG" -n "$VM1" --query publicIps -o tsv)
VM1_PRIVATE_IP=$(az vm show -d -g "$RG" -n "$VM1" --query privateIps -o tsv)
VM2_PRIVATE_IP=$(az vm show -d -g "$RG" -n "$VM2" --query privateIps -o tsv)
echo "VM1 public: $VM1_PUBLIC_IP"
echo "VM1 private: $VM1_PRIVATE_IP"
echo "VM2 private: $VM2_PRIVATE_IP"
Step 5: Ensure Azure Network Watcher is enabled in the region
In many subscriptions, Azure enables Network Watcher automatically when networking resources exist. Still, explicitly enabling it avoids confusion.
Run:
az network watcher configure --locations "$LOC" --enabled true
Expected outcome: Network Watcher is enabled for the region.
Verify:
az network watcher list -g "NetworkWatcherRG" -o table 2>/dev/null || true
If the NetworkWatcherRG resource group name differs or isn’t visible due to permissions, verify in the Azure Portal: search for Network Watcher → ensure the region is enabled.
Step 6: Reproduce the problem (SSH from VM1 to VM2 should fail)
SSH into VM1 from your local machine:
ssh ${ADMINUSER}@${VM1_PUBLIC_IP}
From VM1, attempt to SSH to VM2 private IP:
ssh -o ConnectTimeout=5 ${ADMINUSER}@${VM2_PRIVATE_IP}
Expected outcome: SSH fails (timeout or connection failure), because VM2 inbound TCP 22 is denied by the NSG.
Exit VM1 (or keep it open for later):
exit
Step 7: Use Network Watcher “IP flow verify” to confirm NSG is denying
Run IP flow verify against the VM2 NIC for inbound port 22. You need:
- Target VM (VM2)
- Direction: inbound
- Protocol: TCP
- Local port: 22
- Remote IP: VM1 private IP (source)
In the Azure CLI, IP flow verify is exposed as the az network watcher test-ip-flow command. Exact parameters can vary by CLI version; use --help if needed.
Check help:
az network watcher test-ip-flow --help
A commonly used pattern is:
az network watcher test-ip-flow \
-g "$RG" \
--vm "$VM2" \
--direction Inbound \
--protocol Tcp \
--local "$VM2_PRIVATE_IP:22" \
--remote "$VM1_PRIVATE_IP:12345"
Expected outcome: Result indicates Deny and identifies the rule (for example, Deny-SSH-Inbound).
If the CLI syntax differs in your installed version, use the Azure Portal alternative: Network Watcher → IP flow verify → select VM2 → specify inbound, TCP, local port 22, remote IP VM1.
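When scripting this check, the JSON result can be reduced to just the verdict and the matched rule. A minimal sketch, using a hard-coded sample result instead of a live call; the `access` and `ruleName` field names are assumed from the typical CLI output shape, so verify them against your own run:

```shell
# Illustrative ip-flow-verify result; field names assumed from the
# typical CLI output shape -- verify against your own run.
result='{"access": "Deny", "ruleName": "securityRules/Deny-SSH-Inbound"}'

# Extract fields with sed (jq would also work if installed).
access=$(printf '%s' "$result" | sed -n 's/.*"access": *"\([^"]*\)".*/\1/p')
rule=$(printf '%s' "$result" | sed -n 's/.*"ruleName": *"\([^"]*\)".*/\1/p')

echo "verdict=$access rule=$rule"
# → verdict=Deny rule=securityRules/Deny-SSH-Inbound
```

In a real run you would pipe the `az network watcher ip-flow-verify` output into the same extraction, which makes the deny rule name easy to feed into tickets or runbooks.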
Step 8: Use Network Watcher “Next hop” to confirm routing is not the issue
Check next hop from VM1 to VM2 private IP:
az network watcher show-next-hop \
-g "$RG" \
--vm "$VM1" \
--destination-ip-address "$VM2_PRIVATE_IP"
Expected outcome: Next hop should indicate a VNet route (for example, VnetLocal) and show that routing is normal inside the VNet.
Step 9: Use Network Watcher “Test connectivity” (connection troubleshoot)
Run a connectivity test from VM1 to VM2:22.
Check help:
az network watcher test-connectivity --help
Run the test:
az network watcher test-connectivity \
-g "$RG" \
--source-resource "$(az vm show -g "$RG" -n "$VM1" --query id -o tsv)" \
--dest-address "$VM2_PRIVATE_IP" \
--dest-port 22
Expected outcome: Status should be Unreachable (or similar), and details may point to NSG denial.
If it reports agent/extension requirements, install the Network Watcher VM extension (next step) or use portal-based diagnostics which may guide you.
Step 10: Fix the NSG and re-test
Now allow SSH from VM1 subnet to VM2 on port 22 (more secure than allowing *).
Create an allow rule with higher priority (lower number) than the deny rule:
az network nsg rule create \
-g "$RG" --nsg-name "$NSG" -n "Allow-SSH-From-Subnet1" \
--priority 90 \
--direction Inbound --access Allow --protocol Tcp \
--source-address-prefixes 10.10.1.0/24 --source-port-ranges "*" \
--destination-address-prefixes "*" --destination-port-ranges 22
Expected outcome: SSH from VM1 to VM2 should succeed now.
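The allow rule wins because, among matching NSG rules, the rule with the lowest priority number is applied. A toy illustration of that tie-break, using this lab's rule names:

```shell
# Among matching NSG rules, the lowest priority number wins.
# Each line: <name> <priority> <access>
rules="Deny-SSH-Inbound 100 Deny
Allow-SSH-From-Subnet1 90 Allow"

# Sort numerically by priority and take the first match.
effective=$(printf '%s\n' "$rules" | sort -k2 -n | head -n1)
echo "effective rule: $effective"
# → effective rule: Allow-SSH-From-Subnet1 90 Allow
```

To see the real ordered list for the lab NSG, `az network nsg rule list -g "$RG" --nsg-name "$NSG" -o table` shows each rule with its priority.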
Re-SSH to VM1 and test again:
ssh ${ADMINUSER}@${VM1_PUBLIC_IP}
ssh -o ConnectTimeout=5 ${ADMINUSER}@${VM2_PRIVATE_IP}
You should get an SSH prompt on VM2. Exit both sessions:
exit
exit
Step 11 (Optional): Enable NSG flow logs to a storage account (Portal-first)
This optional step adds observability but can increase cost and complexity. It’s valuable to see real flow records.
- Create a storage account (CLI):
STORAGE="stnwl$RANDOM$RANDOM"
az storage account create \
-g "$RG" -n "$STORAGE" -l "$LOC" \
--sku Standard_LRS \
--kind StorageV2
- In the Azure Portal:
  - Go to Network Watcher → NSG flow logs
  - Select your NSG (`nsg-vm2`)
  - Set Flow logs = On
  - Choose the storage account you created
  - Choose retention (keep it short for the lab)
  - Save
- Generate some traffic (from VM1 to VM2):
  - SSH to VM1
  - SSH to VM2 a few times, or run `curl` against a port if you open one
  - Wait a few minutes for logs to appear
- View logs in the storage account:
  - Storage account → Containers
  - Look for the flow logs container/path created by the feature
  - Download a JSON log file and inspect it
Expected outcome: You’ll find flow log blobs that record allowed/denied flows through the NSG.
Exact container names and schema can evolve; use Microsoft’s documentation for current flow log format and fields.
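Each downloaded record contains comma-separated flow tuples. The sketch below decodes one illustrative tuple; the field order follows the commonly documented v2-style layout, so verify it against Microsoft's current flow-log schema before relying on it:

```shell
# One illustrative flow tuple (v2-style layout, assumed):
# timestamp,srcIP,dstIP,srcPort,dstPort,protocol(T/U),direction(I/O),decision(A/D),...
tuple="1542110377,10.10.1.4,10.10.2.4,44931,22,T,I,D,B,,,,"

echo "$tuple" | awk -F',' '{
  proto = ($6 == "T") ? "TCP" : "UDP"
  dir   = ($7 == "I") ? "inbound" : "outbound"
  dec   = ($8 == "A") ? "allowed" : "denied"
  printf "%s %s %s:%s -> %s:%s %s\n", proto, dir, $2, $4, $3, $5, dec
}'
# → TCP inbound 10.10.1.4:44931 -> 10.10.2.4:22 denied
```

A decoded tuple like this one is the evidence you would expect to see for the lab's deny rule before the fix in Step 10.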
Validation
You have successfully validated that:
- An NSG deny rule caused an SSH outage (reproduced).
- IP flow verify identified the deny action and (typically) the matching rule.
- Next hop confirmed routing was not the issue.
- Test connectivity confirmed the reachability status.
- Updating NSG rules restored connectivity.
- (Optional) NSG flow logs captured flow telemetry to storage.
Troubleshooting
Common issues and fixes:
- Network Watcher isn't enabled in the region
  - Symptom: Tools fail or the region isn't selectable.
  - Fix: Enable it for the region: `az network watcher configure --locations "$LOC" --enabled true`
- RBAC permissions
  - Symptom: Access denied when running diagnostics or configuring flow logs.
  - Fix: Ensure you have appropriate roles (Network Contributor or Contributor, plus storage/log roles for data access).
- NSG applied to the wrong place
  - Symptom: SSH isn't blocked even though you created a deny rule, or remains blocked after allowing.
  - Fix: Confirm the NSG is associated with the correct NIC or subnet, and verify effective rules:
    - Portal → VM → Networking → NIC NSG association
    - Network Watcher → Security group view (effective rules)
- SSH fails even after the allow rule
  - Potential causes:
    - VM2's OS firewall (UFW/iptables) blocks port 22
    - Wrong username or keys
    - You're testing from a different source IP range than the one allowed
  - Fix: Verify the OS firewall and confirm the rule's source prefix matches the VM1 subnet.
- Flow logs not appearing
  - Causes:
    - Flow logs not enabled on the correct NSG
    - Wrong storage account selected, or an access issue
    - Not enough time elapsed
  - Fix: Re-check the configuration, then generate traffic and wait several minutes.
Cleanup
Delete the entire resource group to avoid ongoing charges:
az group delete -n "$RG" --yes --no-wait
Expected outcome: All lab resources are removed (VMs, network, NSG, storage). Confirm in portal after deletion completes.
11. Best Practices
Architecture best practices
- Design for debuggability: Standardize NSG usage (subnet vs NIC) so effective policy is predictable.
- Hub-and-spoke clarity: In complex networks, document UDRs, firewall paths, and DNS—Network Watcher helps validate, but architecture clarity prevents incidents.
- Centralize logs deliberately: Decide whether flow logs go to per-subscription storage or centralized logging subscriptions.
IAM/security best practices
- Apply least privilege:
- Many engineers can run “read-only” diagnostics (topology, effective routes).
- Only a small group should run packet capture.
- Separate permissions for:
- Running packet capture
- Reading capture output in storage
- Use Privileged Identity Management (PIM) where appropriate for just-in-time elevation (verify applicability in your tenant).
Cost best practices
- Enable NSG flow logs selectively and review periodically.
- Control retention and storage lifecycle policies.
- Avoid indiscriminate Log Analytics ingestion for high-volume flow logs unless you have a clear detection/analytics need and budget.
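One way to implement the retention control above is a storage lifecycle rule that deletes flow-log blobs after a fixed window. A hedged sketch: the JSON follows the Azure Storage management-policy schema as commonly documented, and the container prefix is the one NSG flow logs typically write to; verify both against current docs before applying:

```shell
# Write a lifecycle policy that deletes flow-log blobs after 30 days.
# Schema and container prefix assumed from common documentation -- verify.
cat > flowlog-policy.json <<'EOF'
{
  "rules": [
    {
      "enabled": true,
      "name": "expire-flow-logs",
      "type": "Lifecycle",
      "definition": {
        "filters": {
          "blobTypes": ["blockBlob"],
          "prefixMatch": ["insights-logs-networksecuritygroupflowevent"]
        },
        "actions": {
          "baseBlob": { "delete": { "daysAfterModificationGreaterThan": 30 } }
        }
      }
    }
  ]
}
EOF

# Apply it (requires az and an existing storage account):
# az storage account management-policy create \
#   --account-name "$STORAGE" -g "$RG" --policy @flowlog-policy.json
echo "policy rules: $(grep -c '"name"' flowlog-policy.json)"
```

Keeping the window short (days, not months) directly limits both cost and the exposure window for sensitive flow data.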
Performance best practices
- Prefer flow logs for broad visibility and packet capture for targeted deep dives.
- Schedule captures for short windows; use filters where supported.
- For continuous monitoring, pick intervals appropriate for the SLO (don’t over-sample).
Reliability best practices
- Use Connection Monitor for critical dependencies with alerting via Azure Monitor.
- Run periodic “network health checks” as part of operational readiness.
Operations best practices
- Maintain an incident runbook:
- Step 1: test-connectivity
- Step 2: IP flow verify
- Step 3: next hop + effective routes
- Step 4: check NSG flow logs (if enabled)
- Step 5: packet capture (only if needed)
- Standardize naming conventions: `nsg-<app>-<env>-<region>`, `rt-<subnet>-<env>`
- Tags: `env`, `owner`, `costCenter`, `dataSensitivity`
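The five-step incident runbook above can be captured as a small script that emits the diagnostic commands for a given incident. This sketch is a dry run (it prints rather than executes, so it is safe to adapt), and every resource name and IP in it is an illustrative placeholder:

```shell
# Emit the runbook's first three diagnostic commands for an incident.
# All names/IPs are illustrative placeholders; review before executing.
RG="rg-prod"; SRC_VM="vm-web"; DST_IP="10.10.2.4"; DST_PORT="443"

runbook_commands() {
  cat <<EOF
az network watcher test-connectivity -g $RG --source-resource $SRC_VM --dest-address $DST_IP --dest-port $DST_PORT
az network watcher ip-flow-verify -g $RG --vm $SRC_VM --direction Outbound --protocol Tcp --local 10.10.1.4:50000 --remote $DST_IP:$DST_PORT
az network watcher show-next-hop -g $RG --vm $SRC_VM --destination-ip-address $DST_IP
EOF
}

runbook_commands
```

Versioning a generator like this alongside the runbook keeps the triage order (connectivity → flow verify → next hop) consistent across on-call engineers.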
Governance/tagging/naming best practices
- Tag NSGs and logging storage accounts with:
  - `dataClassification` (because flow logs can be sensitive)
  - `retentionPolicy`
  - `securityOwner`
- Use Azure Policy where possible to audit:
- NSG presence on subnets
- Flow log enablement (availability depends on policy definitions—verify current built-ins)
12. Security Considerations
Identity and access model
- Azure Network Watcher actions are controlled by Azure RBAC.
- Treat the ability to run packet capture and to read its output as privileged.
Encryption
- At rest: Azure Storage and Log Analytics encrypt data at rest by default (verify current guarantees and configuration options).
- In transit: Access to storage/workspaces uses TLS.
- Consider customer-managed keys (CMK) if required by policy (verify service support for your chosen storage/workspace configuration).
Network exposure
- Network Watcher does not expose inbound endpoints into your VNets, but:
- Packet capture outputs and flow logs are stored in storage accounts—secure those endpoints (private endpoints, firewall rules, least privilege).
- Avoid public access to storage where possible.
Secrets handling
- Prefer identity-based access (Microsoft Entra ID, formerly Azure AD) to storage over shared keys when possible.
- Rotate storage keys if they must be used (some workflows historically relied on keys; verify current options).
- Avoid sharing capture files outside controlled channels.
Audit/logging
- Log and audit:
- Who enabled flow logs
- Who ran packet captures and when
- Storage access (Blob access logs, Azure Activity Logs)
- Consider sending Activity Logs to a central workspace/SIEM.
Compliance considerations
- Flow logs and packet captures may contain:
- IP addresses, ports, and metadata
- Potential payload data (packet capture)
- Apply:
- Data retention limits
- Access reviews
- Incident handling procedures
Common security mistakes
- Storing packet captures in a broadly accessible storage account.
- Long retention without justification.
- Enabling flow logs everywhere without a plan, then failing to secure or review the data.
- Granting broad Contributor rights to too many people, enabling unintended data exposure.
Secure deployment recommendations
- Use dedicated, locked-down storage accounts for network logs.
- Apply private endpoints and storage firewall rules where possible.
- Define a minimal group for packet capture capability and enforce JIT access.
- Document data handling and retention policies for flow logs and captures.
13. Limitations and Gotchas
The exact limitations vary by feature and evolve over time. Always confirm in official documentation for your region and scenario.
Common, practical gotchas include:
- Regional enablement: Network Watcher is regional; troubleshooting a resource in a region where it’s not enabled can fail or be confusing.
- Agent/extension dependencies: Some features (notably packet capture and certain continuous monitoring scenarios) may require VM extensions/agents and proper VM access.
- Data volume growth: NSG flow logs can become massive in busy environments.
- Retention surprises: Keeping logs “forever” in hot storage is expensive and risky.
- Storage security: Flow logs and packet capture files are sensitive; a misconfigured storage account is a security incident waiting to happen.
- Point-in-time vs continuous: Tools like IP flow verify and next hop are point-in-time evaluations; they don’t replace continuous telemetry.
- Complex routing: Effective routes can be correct yet traffic still fails due to downstream appliances, asymmetric routing, or on-prem routes.
- Portal UX changes: Azure Portal often renames or reorganizes blades; rely on official docs when you can’t find a feature.
- CLI command variations: Azure CLI evolves; if a command differs, use `--help` and cross-check the official CLI reference pages.
14. Comparison with Alternatives
Azure Network Watcher is not the only way to monitor and troubleshoot networking. It’s often used alongside other tools.
Comparison table
| Option | Best For | Strengths | Weaknesses | When to Choose |
|---|---|---|---|---|
| Azure Network Watcher | Azure VNet diagnostics, NSG flow logs, routing/connectivity troubleshooting | First-party Azure-aware tools; deep diagnostics (IP flow verify, next hop); integrates with Azure Monitor | Some features require agents; logging can be high-volume; not a full SIEM/APM | Default choice for Azure IaaS network troubleshooting and baseline network visibility |
| Azure Monitor (Logs/Metrics/Alerts) | Central monitoring, alerting, query, dashboards | Strong analytics and alerting; cross-service observability | Needs data sources (flow logs, VM logs); may increase cost with ingestion | Use with Network Watcher for alerting and long-term analysis |
| Microsoft Sentinel | SIEM and security analytics | Correlation, detection rules, incident management | Additional cost and tuning; needs good data hygiene | Choose when security monitoring and SOC workflows are required |
| Azure Firewall logs / NVA logs | Centralized egress/ingress inspection | L3–L7 visibility at chokepoints | Doesn’t replace NSG-level visibility everywhere | Use when you enforce centralized inspection and want firewall-centric visibility |
| AWS VPC Reachability Analyzer + VPC Flow Logs | Similar capabilities in AWS | Strong path analysis; flow logging | Different cloud; not Azure-native | Choose for AWS environments |
| Google Network Intelligence Center (incl. Connectivity Tests) | Similar capabilities in GCP | Network insights and tests | Different cloud | Choose for GCP environments |
| Self-managed tcpdump/Wireshark/Zeek | Deep packet and protocol analysis | Maximum detail and control | Operational overhead; access challenges; not Azure-aware by default | Use when you need deep inspection beyond what managed tooling provides (often alongside Network Watcher) |
15. Real-World Example
Enterprise example (regulated, hub-and-spoke)
- Problem: A financial services company runs a hub-and-spoke Azure network with strict segmentation. Periodic incidents occur after NSG/UDR changes, and audits require evidence of traffic controls.
- Proposed architecture:
- Enable Azure Network Watcher in all production regions.
- Enable NSG flow logs for:
- Hub firewall subnets
- Spoke subnets containing regulated workloads
- Send flow logs to dedicated storage accounts with lifecycle policies.
- Use Connection Monitor for critical dependency paths (web → API → DB, and hybrid endpoints).
- Use Azure Monitor alerts for connection monitor failures.
- Why this service was chosen:
- Azure-native diagnostics map directly to NSGs, NICs, and UDRs.
- Supports audit and incident response with flow evidence.
- Expected outcomes:
- Faster resolution of “blocked traffic” incidents.
- Improved audit readiness with consistent, reviewable network logs.
- Reduced risk from misconfigured routes and security rules.
Startup/small-team example (lean ops)
- Problem: A startup hosts a small SaaS on a few VMs. They occasionally break internal connectivity when tightening NSG rules and need a fast way to debug without a dedicated network team.
- Proposed architecture:
- Use Azure Network Watcher’s IP flow verify + next hop as standard incident steps.
- Enable NSG flow logs only on the production backend subnet NSG with short retention.
- Use Connection Monitor only for the single most critical dependency path.
- Why this service was chosen:
- Low operational overhead; first-party tool integrated into Azure Portal.
- Targeted logging keeps cost manageable.
- Expected outcomes:
- Fewer “mystery outages” after configuration changes.
- Better confidence during deployments and security hardening.
16. FAQ
1) Is Azure Network Watcher free?
The “service” may not have a flat monthly fee, but many features generate usage-based costs, especially NSG flow logs, storage, and any analytics ingestion. Always check the official pricing page.
2) Is Azure Network Watcher global or regional?
It is regional. You typically enable and use it per Azure region.
3) Do I need to enable Azure Network Watcher manually?
Often it is enabled automatically when you create networking resources, but not always in every scenario. If diagnostics aren’t working, explicitly enable it for the region.
4) What’s the fastest way to see if an NSG is blocking traffic?
Use IP flow verify (and optionally security group view) to see whether a specific flow is allowed or denied and which rule matches.
5) What’s the fastest way to check routing issues?
Use Next hop and effective routes. These show where Azure will send traffic and which routes apply.
6) What is the difference between NSG flow logs and packet capture?
- NSG flow logs: summarized flow records at the NSG level (who talked to whom, allowed/denied).
- Packet capture: packet-level data from a VM (deep inspection, payload risk).
7) Can Azure Network Watcher troubleshoot PaaS services directly?
Network Watcher primarily targets VNet-attached resources (VMs, NICs, NSGs, routes). For PaaS, you often combine it with service-specific diagnostics and Azure Monitor.
8) Does Connection Monitor replace the older Network Performance Monitor?
Historically, Azure offered Network Performance Monitor (NPM) and earlier “classic” experiences. Today, Connection Monitor is the primary approach under Azure Network Watcher/Azure Monitor. Verify current migration guidance in official docs.
9) Can I use Connection Monitor for hybrid connectivity?
Often yes, depending on endpoint types and agent support. Verify current supported endpoints and requirements in the official Connection Monitor documentation.
10) Where should I store NSG flow logs?
Commonly in an Azure Storage account with restricted access and lifecycle policies. Some organizations centralize logs into a dedicated logging subscription.
11) How long should I retain flow logs?
Keep the minimum required for:
- troubleshooting (often days/weeks)
- compliance (varies)
Use storage lifecycle policies to reduce cost and exposure.
12) Can Network Watcher prove my application is healthy?
No. It can show network reachability and network-level symptoms, but application health depends on app logs, dependencies, and performance telemetry.
13) Why does “next hop” look correct but traffic still fails?
Because routing can be correct while:
- NSGs deny the flow
- a firewall/NVA drops traffic
- asymmetric routing breaks return traffic
- DNS resolves incorrectly
- the destination service isn't listening
14) Is it safe to run packet capture in production?
It can be, if you:
- restrict access
- run short captures with filters
- store outputs securely
- have an approved data handling policy
Either way, treat captures as sensitive.
15) How do I operationalize Azure Network Watcher for governance?
Standardize:
- which NSGs have flow logs enabled
- where logs are stored
- retention and access controls
- runbooks for incident triage using IP flow verify, next hop, and test connectivity
Then audit regularly.
16) Does Azure Network Watcher work across subscriptions?
It operates within what you have RBAC access to. Cross-subscription scenarios are common in enterprises, but you must design permissions, logging destinations, and operational processes accordingly.
17) Can I automate Network Watcher diagnostics?
Yes. Many actions are exposed via Azure CLI, PowerShell, and ARM APIs, enabling scripted troubleshooting and runbook automation.
17. Top Online Resources to Learn Azure Network Watcher
| Resource Type | Name | Why It Is Useful |
|---|---|---|
| Official documentation | Azure Network Watcher documentation: https://learn.microsoft.com/azure/network-watcher/ | Authoritative reference for features, requirements, and latest updates |
| Official documentation | Connection Monitor: https://learn.microsoft.com/azure/network-watcher/connection-monitor | How to set up continuous connectivity monitoring and interpret results |
| Official documentation | IP flow verify / traffic filtering diagnostics: https://learn.microsoft.com/azure/network-watcher/diagnose-network-traffic-filtering-problem | Step-by-step NSG deny/allow troubleshooting |
| Official documentation | Next hop / routing diagnostics: https://learn.microsoft.com/azure/network-watcher/diagnose-vm-network-routing-problem | Understand effective routes and route-related failures |
| Official documentation | Packet capture: https://learn.microsoft.com/azure/network-watcher/packet-capture-overview | How to safely capture packets and manage outputs |
| Official documentation | NSG flow logs: https://learn.microsoft.com/azure/network-watcher/nsg-flow-logs | Configuration, log format, and operational guidance |
| Official documentation | VPN troubleshooting: https://learn.microsoft.com/azure/network-watcher/network-watcher-troubleshoot-vpn | Supported scenarios and troubleshooting steps |
| Official pricing page | Azure Network Watcher pricing: https://azure.microsoft.com/pricing/details/network-watcher/ | Current meters and billing model |
| Pricing tool | Azure Pricing Calculator: https://azure.microsoft.com/pricing/calculator/ | Estimate total cost including storage and Log Analytics |
| Official CLI reference | Azure CLI Network Watcher commands: https://learn.microsoft.com/cli/azure/network/watcher | Exact CLI syntax and parameters by version |
| Architecture guidance | Azure Architecture Center: https://learn.microsoft.com/azure/architecture/ | Patterns for hub-spoke, governance, and observability (combine with Network Watcher) |
| Official videos | Microsoft Azure YouTube channel: https://www.youtube.com/@MicrosoftAzure | Search for Network Watcher/Connection Monitor walkthroughs and updates |
| Samples (verify) | Azure samples on GitHub: https://github.com/Azure | Find scripts and examples; validate they match current docs before using |
18. Training and Certification Providers
The following training providers may offer Azure, networking, or operations courses. Verify current course availability and delivery modes on their websites.
| Institute | Suitable Audience | Likely Learning Focus | Mode | Website URL |
|---|---|---|---|---|
| DevOpsSchool.com | DevOps engineers, cloud engineers, SREs | Azure operations, DevOps practices, monitoring fundamentals | Check website | https://www.devopsschool.com/ |
| ScmGalaxy.com | Beginners to intermediate engineers | DevOps, SCM, cloud fundamentals | Check website | https://www.scmgalaxy.com/ |
| CLoudOpsNow.in | Cloud operations teams | Cloud ops, monitoring, governance | Check website | https://www.cloudopsnow.in/ |
| SreSchool.com | SREs, platform engineers | Reliability engineering, incident response, observability | Check website | https://www.sreschool.com/ |
| AiOpsSchool.com | Ops and monitoring teams | AIOps concepts, automation, monitoring analytics | Check website | https://www.aiopsschool.com/ |
19. Top Trainers
These sites may list trainers, coaching, or training services. Verify background and course relevance directly on each site.
| Platform/Site | Likely Specialization | Suitable Audience | Website URL |
|---|---|---|---|
| RajeshKumar.xyz | DevOps/cloud training content (verify specifics) | Beginners to intermediate | https://rajeshkumar.xyz/ |
| devopstrainer.in | DevOps training services (verify specifics) | DevOps engineers, admins | https://devopstrainer.in/ |
| devopsfreelancer.com | Freelance DevOps guidance (verify specifics) | Teams needing short-term help | https://devopsfreelancer.com/ |
| devopssupport.in | DevOps support/training resources (verify specifics) | Ops/DevOps teams | https://devopssupport.in/ |
20. Top Consulting Companies
These consulting organizations may help with Azure architecture, operations, and governance initiatives. Confirm service scope, references, and delivery model directly with each provider.
| Company Name | Likely Service Area | Where They May Help | Consulting Use Case Examples | Website URL |
|---|---|---|---|---|
| cotocus.com | Cloud/DevOps consulting (verify service catalog) | Cloud adoption, operations setup, governance | Designing network observability approach; implementing logging storage and access controls; runbooks for incident response | https://cotocus.com/ |
| DevOpsSchool.com | DevOps/cloud consulting and training (verify service catalog) | DevOps transformation, platform practices | Building operational playbooks; implementing monitoring strategy; standardizing Azure RBAC and tagging | https://www.devopsschool.com/ |
| DEVOPSCONSULTING.IN | DevOps consulting (verify service catalog) | CI/CD, cloud operations | Operationalizing network diagnostics; integrating logs into monitoring workflows; improving MTTR processes | https://devopsconsulting.in/ |
21. Career and Learning Roadmap
What to learn before Azure Network Watcher
To use Azure Network Watcher effectively, you should understand:
- Azure fundamentals: subscriptions, resource groups, RBAC
- Azure networking basics:
  - VNets, subnets
  - NSGs and rule evaluation
  - Route tables (UDRs) and system routes
  - VNet peering
  - DNS basics in Azure
- Basic Linux/Windows networking tools:
  - ping, traceroute, ss/netstat, curl, tcpdump (even if you plan to use managed tools)
What to learn after Azure Network Watcher
To mature beyond ad-hoc troubleshooting, learn:
- Azure Monitor (Logs, Metrics, Alerts)
- Log Analytics / KQL querying
- Microsoft Sentinel (if security monitoring is a requirement)
- Azure Firewall and/or NVA patterns
- Infrastructure as Code (Bicep/Terraform) for consistent NSG/flow log provisioning
- Azure Policy for governance and auditing
Job roles that use it
- Cloud engineer / cloud operations engineer
- Network engineer (cloud)
- SRE / platform engineer
- DevOps engineer
- Security engineer / SOC analyst (as a data source for investigations)
- Solutions architect (designing observability and governance)
Certification path (Azure)
Azure certifications change over time; check Microsoft's certification pages for current tracks. Commonly relevant areas:
- Azure Fundamentals (baseline)
- Azure Administrator (operations)
- Azure Network Engineer (network specialization)
- Azure Security Engineer (security monitoring and governance)
Project ideas for practice
- Build a hub-and-spoke lab and validate:
- UDR routing through a firewall/NVA
- segmentation using IP flow verify
- flow log baselining for allowed/denied traffic
- Create an “incident runbook” repository with scripts:
- test-connectivity wrapper
- next hop + effective routes export
- NSG rule evaluation helper
- Build a cost-controlled logging design:
- flow logs → storage with lifecycle rules
- a small subset → Log Analytics for alerts
22. Glossary
- Azure Network Watcher: Azure service for network monitoring and diagnostics in VNets.
- VNet (Virtual Network): Private network in Azure for hosting resources.
- Subnet: A range within a VNet where resources are placed.
- NIC (Network Interface): Network adapter attached to a VM.
- NSG (Network Security Group): L3/L4 stateful filtering rules controlling inbound/outbound traffic.
- UDR (User Defined Route): Custom route table entries to override system routing.
- Effective routes: The final set of routes applied to a NIC (system + UDR).
- Next hop: The next routing destination Azure selects for traffic to a destination IP.
- IP flow verify: A check that returns whether a given 5-tuple is allowed/denied by NSG rules.
- NSG flow logs: Logs that record flows through an NSG (allowed/denied).
- Packet capture: Capturing packets (pcap) from a VM for deep network analysis.
- Connection Monitor: Continuous connectivity monitoring with metrics and (typically) alert integration.
- Log Analytics workspace: Azure Monitor logs store queried with KQL.
- KQL (Kusto Query Language): Query language for Log Analytics and related services.
- Azure RBAC: Role-based access control for managing access to Azure resources.
- MTTR: Mean time to recovery/resolve; a key operations metric.
23. Summary
Azure Network Watcher is Azure’s built-in, regional network diagnostics and visibility service. It matters because cloud networking failures are often caused by subtle interactions between NSGs, routes, and gateways—and Network Watcher provides Azure-native tools (IP flow verify, next hop, connectivity tests, flow logs, and packet capture) to troubleshoot quickly and consistently.
From a cost perspective, the key is understanding that logging and analytics drive spend: NSG flow logs can generate large volumes, and storage/Log Analytics ingestion and retention can become significant. From a security perspective, treat flow logs and especially packet captures as sensitive data, and lock down access to both the diagnostic actions and the stored outputs.
Use Azure Network Watcher when you need reliable, first-party network troubleshooting and governance for Azure VNets. Pair it with Azure Monitor and (optionally) Sentinel when you need alerting and centralized security analytics.
Next step: implement a small production-ready pattern—selective NSG flow logs + Connection Monitor for critical paths + runbooks—and validate it against your organization’s operational and compliance requirements using Microsoft’s official documentation.