Oracle Cloud Stack Monitoring Tutorial: Architecture, Pricing, Use Cases, and Hands-On Guide for Observability and Management

1. Introduction

Oracle Cloud Stack Monitoring is an Oracle Cloud Infrastructure (OCI) Observability and Management service that helps you monitor and troubleshoot the health and performance of your technology stack, from hosts to middleware and databases, using centralized discovery, metrics, topology views, and alerting integrations.

In simple terms: Stack Monitoring helps you see what you’re running, understand how components depend on each other, and detect problems early by collecting and organizing operational telemetry (primarily metrics) across supported resources.

Technically, Stack Monitoring uses the OCI Management Agent for agent-based data collection and resource discovery, builds a monitored resource inventory and topology (dependency) model, and surfaces metrics and status so operations teams can analyze behavior over time and respond to incidents. It integrates with other OCI services such as Monitoring (metrics/alarms), Notifications, and Logging (for agent/service logs), depending on your setup and workflows.

The problem it solves is common in real environments: when you have multiple layers (compute, OS, middleware, databases), troubleshooting becomes slow because telemetry is scattered and dependencies are unclear. Stack Monitoring focuses on stack-level visibility and operational clarity rather than requiring you to assemble everything yourself from raw metrics and logs.

Service name status: As of the latest publicly available OCI documentation, “Stack Monitoring” is an active OCI service name under Observability and Management. If you find a renamed UI label in your tenancy, verify in official docs and your region’s console because OCI UI groupings can evolve.

2. What is Stack Monitoring?

Official purpose (what it is for)

Stack Monitoring’s purpose is to provide monitoring for your application infrastructure stack by:
- Discovering supported resources (for example, hosts and certain Oracle middleware/database technologies)
- Collecting key performance and availability metrics
- Presenting topology and resource relationships
- Enabling alerting/incident workflows through OCI integrations

For the official product documentation, start here:
https://docs.oracle.com/en-us/iaas/stack-monitoring/

Core capabilities (high-level)

Resource discovery (agent-based) and creation of monitored resources
Inventory of monitored resources by compartment
Topology visualization to understand dependencies
Metrics collection and visualization for supported resource types
Alerting integration (commonly via OCI Monitoring alarms + Notifications, depending on how your tenancy is configured)
OCI-native governance using compartments, IAM policies, tags, and audit logging

Major components

While naming can vary slightly across console versions, Stack Monitoring solutions typically involve:

Stack Monitoring service (control plane + UI)
- Defines monitored resource types
- Maintains inventory and topology model
- Provides views, dashboards, and administration workflows
OCI Management Agent
- Installed on a host (OCI compute, on-prem VM/bare metal, or other cloud VM)
- Performs local discovery/collection and securely sends telemetry to OCI endpoints
- Docs: https://docs.oracle.com/en-us/iaas/management-agents/
Monitored resources
- Logical representations of discovered components (for example, a host, a middleware domain, or a database target, depending on supported types in your region/tenancy)
Metrics storage and alerting integrations
- Metrics are typically surfaced through OCI’s observability primitives.
- Alerting is commonly implemented using OCI Monitoring alarms and Notifications (implementation details can vary; verify exact integration behavior in official docs for your target types and region).

Service type

Managed OCI service (SaaS-like within your OCI tenancy)
Uses agent-based collection for many monitored resource types

Scope: regional vs global, and OCI resource boundaries

Regional service behavior: Stack Monitoring is operated within an OCI region. Agents register to a region endpoint. Telemetry and monitored resource inventory are associated with that region.
Tenancy and compartment scoped: You organize and control access using compartments and IAM policies.
Not zonal: OCI uses Availability Domains (ADs) and Fault Domains for compute, but Stack Monitoring itself is not “zonal” in the way some other clouds describe services.

How it fits into the Oracle Cloud ecosystem

Stack Monitoring sits inside OCI’s Observability and Management portfolio and complements:
- OCI Monitoring for metrics/alarms at OCI resource level
- OCI Logging and (optionally) Logging Analytics for logs
- OCI APM for application tracing and end-user performance (different focus)
- Database Management and Operations Insights for database-centric and capacity analytics (different scope)
- OCI Events + Notifications + Functions for incident automation patterns (integration pattern; verify specifics)

3. Why use Stack Monitoring?

Business reasons

Reduced downtime and faster incident resolution: Topology + curated metrics reduce “time to identify” and “time to resolve”.
Standardized operations across environments: A consistent approach for OCI + on-prem + hybrid (where agent connectivity is possible).
Lower operational overhead: Instead of assembling custom scripts/collectors for every layer, use a managed approach.

Technical reasons

Dependency awareness: Understanding which host supports which middleware component helps avoid treating symptoms instead of root cause.
Curated telemetry: For supported technologies, Stack Monitoring can surface meaningful metrics without you hand-crafting every collection pipeline.
Agent-based extensibility: You can often onboard new hosts quickly by installing an agent and running discovery.

Operational reasons (SRE/DevOps)

Inventory and ownership boundaries: Compartment-based organization aligns with teams and environments (prod/dev).
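Alerting workflows: Integrate alarms with on-call notification channels through Notifications subscriptions (for example email, HTTPS endpoints, or PagerDuty); available protocols and setup vary, so verify them in the Notifications documentation.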
Repeatable onboarding: Standardize how hosts and supported applications are discovered and monitored.

Security/compliance reasons

IAM-controlled access: Fine-grained control via OCI IAM policies for who can view or manage monitored resources.
Auditability: OCI Audit can capture administrative actions taken in OCI services (verify which Stack Monitoring actions are audited in your region).
Data residency: Choose region to align with residency needs; telemetry stored in-region (verify retention and residency details in official docs).

Scalability/performance reasons

Designed for fleets: Many organizations monitor large numbers of hosts and middleware instances.
Centralized operational view: Scales better than SSH-ing into servers and manually checking health.

When teams should choose Stack Monitoring

Choose Stack Monitoring when:
- You need stack-level monitoring (host + middleware + database layers) with dependency/topology views.
- You run supported Oracle technologies (and want Oracle-aligned monitoring semantics).
- You want OCI-native governance (compartments/tags/IAM) for observability operations.

When teams should not choose it

Avoid or deprioritize Stack Monitoring when:
- Your primary need is distributed tracing and code-level profiling (look at OCI APM).
- You only need basic infrastructure metrics for OCI resources (OCI Monitoring may be enough).
- Your stack is mostly unsupported components and you need deep, vendor-neutral integrations (you may prefer Prometheus/Grafana or a third-party platform).
- You cannot install or operate the required agent(s) due to policy or environment constraints.

4. Where is Stack Monitoring used?

Industries

Financial services (regulated ops, hybrid environments)
Telecom (large fleets, strict SLAs)
Retail/e-commerce (capacity and incident response)
Healthcare (compliance + high availability needs)
Public sector (governance + regional residency)
SaaS providers (standardizing operations)

Team types

SRE and platform engineering teams
DevOps/operations teams
Middleware administrators (e.g., WebLogic administrators)
Database operations teams (when combined with complementary services)
Cloud Center of Excellence (CCoE) and governance teams

Workloads and architectures

Three-tier enterprise apps: web tier + middleware + database
Oracle middleware-centric stacks (where supported)
Hybrid apps with on-prem databases and OCI compute front-ends
Lift-and-shift migrations needing operational parity
Shared services platforms hosting multiple applications

Real-world deployment contexts

OCI-only: All components in OCI VCNs, agents installed on compute instances
Hybrid: Agents installed on on-prem VMs/bare metal; private connectivity via VPN/FastConnect
Multi-cloud (partial): Agents on VMs in other clouds that can securely reach OCI endpoints (architecture and security review required)

Production vs dev/test usage

Production: Strong value due to alerting, topology, and operational dashboards
Dev/test: Useful for validating performance baselines and catching regressions. To limit cost and effort, some teams monitor only key shared environments.

5. Top Use Cases and Scenarios

Below are realistic scenarios that align with how Stack Monitoring is typically used in OCI.

1) Host fleet monitoring with consistent baselines

Problem: Teams manage many Linux hosts; CPU/memory/disk issues are detected late.
Why Stack Monitoring fits: Agent-based onboarding + standardized host metrics views.
Scenario: A platform team installs OCI Management Agent on 200 OCI compute instances and monitors capacity trends and availability.

2) Middleware dependency troubleshooting (host ↔ middleware)

Problem: Middleware performance issues get misdiagnosed because host saturation isn’t visible.
Why it fits: Topology helps correlate middleware symptoms with host metrics.
Scenario: WebLogic response time spikes are correlated to host memory pressure and swapping.

3) Migration validation (on-prem to OCI)

Problem: During migration, you need to ensure the OCI deployment behaves like the old environment.
Why it fits: Standard monitoring views across environments if agents are deployed similarly.
Scenario: During a phased cutover, both on-prem and OCI stacks are monitored to compare performance.

4) Standardized alerting for common failure modes

Problem: Alert rules differ per team and environment; on-call noise is high.
Why it fits: Central visibility + alarm integration patterns.
Scenario: A shared “CPU high for 15 minutes” rule is standardized and routed via Notifications topics.

5) Environment segmentation by compartment (prod vs non-prod)

Problem: Ops teams need separation of duties and visibility boundaries.
Why it fits: OCI compartments + IAM policies map to environments.
Scenario: Only SREs can manage discovery jobs in Prod; developers can view non-prod metrics.

6) Capacity planning signals for middleware hosts

Problem: Capacity planning is done from spreadsheets, not telemetry.
Why it fits: Trends and inventory help quantify growth.
Scenario: Quarterly planning uses host memory utilization trends and growth rates from monitoring.

7) Incident triage with topology-first navigation

Problem: During incidents, teams don’t know where to start.
Why it fits: Topology helps prioritize likely root causes.
Scenario: A database-dependent middleware cluster shows upstream degradation; topology points to the database host resource.

8) Compliance operations: auditable access to monitoring data

Problem: Monitoring access must be controlled and audited.
Why it fits: IAM policies and OCI Audit integrate with governance.
Scenario: Audit logs show who changed monitored resource associations and who altered alerting settings (verify audited events).

9) Monitoring of shared platform services

Problem: Shared services (identity, integration, messaging) affect many apps.
Why it fits: Central view and ownership tagging.
Scenario: Platform team monitors key middleware/hosts and shares read-only dashboards with app teams.

10) Post-patch validation

Problem: After OS or middleware patching, regressions occur.
Why it fits: Compare pre/post metrics and availability behavior.
Scenario: After kernel patching, file system IO wait increases; monitoring highlights the regression.

6. Core Features

Note: Exact supported target types and feature names can vary by region and OCI release. Use the official Stack Monitoring docs to confirm support for your specific technologies and versions: https://docs.oracle.com/en-us/iaas/stack-monitoring/

1) OCI Management Agent-based collection

What it does: Uses a locally installed agent to collect telemetry and perform discovery.
Why it matters: Enables monitoring for resources not natively emitting OCI metrics, including on-prem.
Practical benefit: Standard onboarding pattern; avoids building custom collectors.
Limitations/caveats: Requires OS-level installation and outbound connectivity to OCI endpoints; may require sudo/root privileges.

2) Resource discovery and onboarding

What it does: Discovers supported resources and creates them as monitored resources in Stack Monitoring.
Why it matters: Inventory creation is foundational—without it, you don’t have a reliable model of what exists.
Practical benefit: Faster onboarding than manual “register everything”.
Limitations/caveats: Discovery typically works for supported technologies only; discovery accuracy depends on permissions and local configuration.

3) Monitored resource inventory (by compartment)

What it does: Provides a list of monitored resources organized via OCI compartments.
Why it matters: Operations require an authoritative inventory.
Practical benefit: Filter by environment/team; align with IAM boundaries.
Limitations/caveats: Inventory reflects what’s discovered/onboarded, not necessarily every resource in your estate.

4) Topology and relationship modeling

What it does: Shows dependencies between monitored resources (for supported types).
Why it matters: Incidents often propagate across tiers; topology reduces guesswork.
Practical benefit: Faster root cause analysis and blast-radius assessment.
Limitations/caveats: Relationship depth depends on discovered resource types and supported integrations.
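To make the blast-radius idea concrete, here is a minimal sketch of how a dependency model can answer "what is affected if this resource fails?". It is not Stack Monitoring's internal model or API; the resource names and the `DEPENDS_ON` mapping are hypothetical, and the traversal is a plain breadth-first walk over inverted dependency edges.

```python
from collections import deque

# Hypothetical topology: each resource maps to the resources it depends on.
DEPENDS_ON = {
    "weblogic-domain-1": ["host-a", "db-1"],
    "weblogic-domain-2": ["host-b", "db-1"],
    "db-1": ["host-c"],
    "host-a": [],
    "host-b": [],
    "host-c": [],
}


def blast_radius(depends_on, failed):
    """Return every resource that directly or transitively depends on `failed`."""
    # Invert the edges so we can walk from a resource to its dependents.
    dependents = {name: [] for name in depends_on}
    for resource, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(resource)

    impacted, queue = set(), deque([failed])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, []):
            if parent not in impacted:
                impacted.add(parent)
                queue.append(parent)
    return sorted(impacted)


print(blast_radius(DEPENDS_ON, "host-c"))
# ['db-1', 'weblogic-domain-1', 'weblogic-domain-2']
```

In this toy model, a failure of the database host reaches both middleware domains through the shared database, which is the kind of relationship a topology view is meant to expose.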

5) Metrics visualization for supported resources

What it does: Surfaces performance/availability metrics for monitored resources.
Why it matters: Metrics are the core signal for performance troubleshooting.
Practical benefit: Quickly see CPU/memory/disk (hosts) and key middleware/database metrics (where supported).
Limitations/caveats: Metric names, granularity, and retention depend on OCI configurations and service defaults; verify retention and limits in official docs.

6) Alerting integration (alarms + notifications patterns)

What it does: Enables alerting on collected metrics, typically via OCI Monitoring alarms and Notifications.
Why it matters: Monitoring without alerting is mostly reactive.
Practical benefit: Route incidents to email/SMS/webhooks via Notifications topics; integrate into incident response.
Limitations/caveats: Alerting setup may span multiple services (Monitoring, Notifications). Confirm alarm creation workflow supported in your region.

7) OCI-native governance (IAM, compartments, tags)

What it does: Uses OCI IAM policies for access control and compartments for scoping.
Why it matters: In enterprises, observability access must be controlled.
Practical benefit: Separate duties between platform and app teams; enforce least privilege.
Limitations/caveats: Mis-scoped policies can block discovery/collection; you must plan policy boundaries.

8) APIs/automation support (where available)

What it does: OCI services typically provide APIs and SDKs for automation.
Why it matters: Fleet onboarding and policy-driven operations are hard to do manually.
Practical benefit: Automate agent rollout and discovery job creation using IaC/CI pipelines.
Limitations/caveats: API surface and SDK support vary; verify the Stack Monitoring API reference for current operations in your region.

7. Architecture and How It Works

High-level architecture

At a high level, Stack Monitoring looks like this:

You install an OCI Management Agent on a host.
The agent authenticates/identifies itself to OCI and sends telemetry to OCI endpoints in your region.
Stack Monitoring uses discovery to create monitored resources (inventory).
Metrics and health status are displayed in Stack Monitoring views.
Alerts are configured to notify teams via OCI’s notification and alarm mechanisms.

Data flow and control flow (typical)

Admin control plane actions
- IAM policies grant rights to manage agents and stack monitoring resources.
- You configure discovery and monitoring scope (compartments, agent groups, etc., depending on your setup).
Agent runtime
- Agent runs on monitored host, collects data periodically.
- Agent transmits telemetry to OCI endpoints over TLS.
Service-side processing
- Stack Monitoring associates telemetry with monitored resources.
- Metrics become available for visualization and alerting.
Alerting
- Alarm rules evaluate metrics.
- On trigger, Notifications routes messages to endpoints (email, HTTPS, etc.).
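The alerting step can be pictured with a simplified model of a "threshold sustained for N minutes" rule. This is only an illustration of the idea, not how OCI Monitoring evaluates alarms internally; the function name and the one-sample-per-minute assumption are hypothetical.

```python
def sustained_breach(samples, threshold, minutes):
    """Return True if the most recent `minutes` one-minute samples all exceed `threshold`."""
    if minutes <= 0 or len(samples) < minutes:
        return False  # not enough data to decide, so do not fire
    return all(value > threshold for value in samples[-minutes:])


# Hypothetical CPU utilization samples, one per minute, oldest first.
cpu = [40, 55, 91, 93, 95, 97]

print(sustained_breach(cpu, threshold=90, minutes=3))  # True
print(sustained_breach(cpu, threshold=90, minutes=5))  # False
```

Requiring the breach to persist is what keeps a short spike from paging anyone; the longer the window, the fewer false alerts and the slower the detection.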

Integrations with related OCI services

Common integrations/patterns include:
- OCI IAM: policies for managing agents and monitoring configurations
- OCI Monitoring: alarms and metric handling patterns
- OCI Notifications: delivering alerts to responders
- OCI Logging: agent logs and diagnostic logs
- OCI Events / Functions: automation pattern for remediation (for example: alarms → notifications/webhook → function runbook). This is an architectural pattern; verify supported direct triggers in your region.

Dependency services

OCI Management Agent service (agent lifecycle management)
Networking (VCN + routing/NAT/Service Gateway) to allow agents to reach OCI endpoints

Security/authentication model (practical view)

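Human access: controlled by OCI IAM policies written for groups and scoped to compartments. Dynamic groups apply to resource principals (for example, compute instances), not to people.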
Agent access: agent registration keys and agent identity managed through OCI agent management workflows. Exact mechanism can vary by agent install method; follow the Management Agent documentation for current registration methods: https://docs.oracle.com/en-us/iaas/management-agents/

Networking model

Agents typically require outbound connectivity to OCI service endpoints in the region.
In private networks, you may use:
NAT Gateway for outbound internet access, or
Service Gateway / private endpoints where supported (verify current requirements), or
Controlled egress via proxies/firewalls (ensure TLS inspection policies don’t break agent connectivity)
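Before installing an agent in a restricted network, it can help to confirm that the host can complete an outbound TLS handshake at all. The sketch below uses only the Python standard library; the function name is hypothetical, and the hostname you pass must come from the Management Agent documentation for your region (none is assumed here).

```python
import socket
import ssl


def can_reach_tls(host, port=443, timeout=5.0):
    """Return True if a TCP connection and a verified TLS handshake to host:port succeed."""
    context = ssl.create_default_context()  # verifies the certificate chain and hostname
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host):
                return True
    except OSError:
        # Covers DNS failures, refused or timed-out connections, and TLS errors
        # (ssl.SSLError is a subclass of OSError).
        return False


# Usage (replace the placeholder with the endpoint hostname from the official docs):
# print(can_reach_tls("<endpoint-hostname-from-docs>"))
```

A False result does not say which layer failed. If a TLS-inspecting proxy is in the path, certificate verification is a likely culprit, so check DNS, routing, and proxy trust separately.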

Monitoring/logging/governance considerations

Treat observability configuration as production infrastructure:
Tag monitored resources by environment and owner (where supported).
Centralize alert routing through Notifications topics.
Review IAM regularly and audit changes.
Use compartments to separate prod from non-prod.

Simple architecture diagram (Mermaid)

flowchart LR
  U[Operator / SRE] -->|Console / API| SM[OCI Stack Monitoring]

  H1[Host / VM] -->|Install| MA[OCI Management Agent]
  MA -->|TLS telemetry| SM

  SM --> M["OCI Monitoring (alarms/metrics patterns)"]
  M --> N[OCI Notifications]
  N --> O[Email / HTTPS / On-call tooling]

Production-style architecture diagram (Mermaid)

flowchart TB
  subgraph Tenancy[OCI Tenancy]
    subgraph CompProd[Compartment: Prod]
      SM["Stack Monitoring (Region)"]
      MON[OCI Monitoring]
      NOTIF[OCI Notifications Topic]
      LOG[OCI Logging]
    end

    IAM[OCI IAM Policies & Groups]
    AUD[OCI Audit]
  end

  subgraph Network[OCI VCNs / Hybrid Network]
    subgraph AppVCN[VCN: App]
      WLS1["Compute: Middleware Host(s)"]
      DBH["Compute/On-prem: DB Host (if supported)"]
      NAT[NAT Gateway / Controlled Egress]
    end

    OnPrem[On-Prem Network]
    VPN[VPN / FastConnect]
  end

  IAM --> SM
  SM --> AUD

  WLS1 -->|Agent installed| MA1[Management Agent]
  DBH -->|Agent installed| MA2[Management Agent]

  MA1 -->|TLS via NAT/egress| SM
  MA2 -->|TLS via VPN/FastConnect + egress| SM

  SM --> MON
  SM --> LOG
  MON --> NOTIF
  NOTIF --> OnCall[On-call Email/HTTPS Integration]

8. Prerequisites

Tenancy / account requirements

An active Oracle Cloud tenancy with permissions to use Observability and Management services.
A target OCI region where Stack Monitoring is available. Availability can vary; verify regional availability in your console or official docs.

Permissions / IAM roles (typical)

You need IAM permissions for:
- Managing Stack Monitoring resources in the target compartment
- Managing Management Agents and agent install keys
- Reading metrics (and creating alarms if you plan to)

OCI policies vary by org design. A common starting point (adapt to least privilege) is:

Allow group <YourGroup> to manage stack-monitoring-family in compartment <YourCompartment>
Allow group <YourGroup> to manage management-agents in compartment <YourCompartment>
Allow group <YourGroup> to read metrics in compartment <YourCompartment>
Allow group <YourGroup> to manage alarms in compartment <YourCompartment>
Allow group <YourGroup> to manage ons-topics in compartment <YourCompartment>

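Notes:
- Policy verbs and resource families must match OCI’s current IAM model. If a policy statement fails validation, use the policy builder and consult official IAM docs.
- Agent install keys are governed by their own resource type (management-agent-install-keys in the Management Agent documentation). If key creation is denied, add a statement for it and verify the exact name in the official docs.
- Some tenancies centralize Notifications topics in a shared compartment; scope accordingly.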
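If you manage several groups or compartments, generating the statements from one list keeps them consistent. This is a small sketch, not an OCI tool: the function name and `LAB_GRANTS` are hypothetical, and the resource types are simply copied from the example statements above, so validate the output in the console before applying it.

```python
# Verb/resource pairs copied from the example statements in this section.
# They are not verified against your tenancy; adjust them to least privilege.
LAB_GRANTS = [
    ("manage", "stack-monitoring-family"),
    ("manage", "management-agents"),
    ("read", "metrics"),
    ("manage", "alarms"),
    ("manage", "ons-topics"),
]


def policy_statements(group, compartment, grants=LAB_GRANTS):
    """Render one OCI policy statement per (verb, resource) pair."""
    return [
        f"Allow group {group} to {verb} {resource} in compartment {compartment}"
        for verb, resource in grants
    ]


for statement in policy_statements("ObservabilityAdmins", "obs-lab"):
    print(statement)
```

Keeping the grants in data rather than in hand-edited text also makes it easier to review what each group can do.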

Billing requirements

A billing-enabled tenancy is usually required for paid services.
Do not assume Stack Monitoring is included in Always Free. Verify free tier eligibility (if any) on official pricing pages.

Tools

OCI Console access (enough for this tutorial)
Optional:
OCI CLI: https://docs.oracle.com/en-us/iaas/Content/API/SDKDocs/cliinstall.htm
SSH client for connecting to the compute instance where you install the agent

Region availability

Choose a region where:
Stack Monitoring is enabled/available
Management Agent endpoints are reachable from your network

Quotas / limits

Expect limits around:
Number of management agents
Number of monitored resources
Discovery operations
Alarm counts
Review Limits, Quotas, and Usage in OCI Console and the service limits documentation. Verify current limits for Stack Monitoring in your tenancy.

Prerequisite services

For the hands-on lab you will need:
- An OCI Compute instance (Oracle Linux recommended for beginner lab)
- Network egress to reach OCI endpoints
- IAM permissions for agent management and stack monitoring

9. Pricing / Cost

Pricing changes over time and varies by region and contract. Do not rely on blog posts or old SKUs. Always confirm using official Oracle pricing sources.

Official pricing sources

Oracle Cloud pricing landing page: https://www.oracle.com/cloud/pricing/
Oracle Cloud Price List (search for “Stack Monitoring”): https://www.oracle.com/cloud/price-list/
Oracle Cloud Cost Estimator: https://www.oracle.com/cloud/costestimator.html

Pricing dimensions (how you are typically billed)

Stack Monitoring pricing is generally usage-based (metered). Common metering dimensions for monitoring services include:
- Number/type of monitored resources (for example, per monitored host or per monitored instance) and duration (per hour/month)
- Associated service usage such as:
  - Alarms (OCI Monitoring)
  - Notifications deliveries
  - Logging ingestion and storage (if enabled)
- Compute/network required to run monitored workloads and agents

Because exact SKUs and meters can differ, verify the specific Stack Monitoring meters and units in the Oracle Cloud Price List for your region.

Free tier (if applicable)

Some OCI services have Always Free quotas, but Stack Monitoring free eligibility is not guaranteed.
Treat Stack Monitoring as potentially billable even if you run it on Always Free compute. Verify in the price list and your tenancy’s billing dashboard.

Main cost drivers

Monitored resource-hours: The more hosts/middleware instances you monitor and the longer you keep them onboarded, the higher the cost.
Environment sprawl: Monitoring dev/test/stage/prod equally can multiply resource counts.
Alarm and notification volume: Large fleets with noisy thresholds can drive indirect operational cost and possibly service usage charges (depending on SKUs).
Logging ingestion: If you forward agent logs or enable additional diagnostic logs, ingestion and retention can add cost.
Network egress architecture: Agents need outbound access; NAT Gateways and data transfer patterns can add cost (particularly in hybrid designs).

Hidden or indirect costs

Compute cost for monitored workloads is usually far larger than monitoring cost.
Operations cost due to alert noise and insufficient routing.
Change management overhead: agent deployment, patching, and inventory drift.

Network/data transfer implications

Agent telemetry is outbound. In private subnets, you may pay for NAT Gateway usage and data processing (pricing varies).
For on-prem → OCI telemetry, consider VPN/FastConnect costs and egress from your on-prem environment (not an OCI charge but a real cost).

How to optimize cost

Monitor what matters:
Start with production and critical shared services.
Add dev/test selectively or use shorter retention and fewer alarms.
Use compartments and tags to track ownership and chargeback.
Standardize alert thresholds and use suppression or maintenance-window patterns (where supported) to reduce noise.
Periodically remove stale monitored resources (decommissioned hosts).
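One way to find candidates for removal is to compare each resource's last reported time against a cutoff. The sketch below works on a hypothetical exported inventory (a list of dicts with `name` and `last_seen` fields); it does not call any Stack Monitoring API, and the 14-day default is an arbitrary assumption.

```python
from datetime import datetime, timedelta, timezone


def stale_resources(inventory, now, max_age_days=14):
    """Return names of resources whose last telemetry is older than `max_age_days`."""
    cutoff = now - timedelta(days=max_age_days)
    return [item["name"] for item in inventory if item["last_seen"] < cutoff]


# Hypothetical inventory export; field names are illustrative only.
NOW = datetime(2024, 6, 30, tzinfo=timezone.utc)
INVENTORY = [
    {"name": "sm-lab-host-01", "last_seen": datetime(2024, 6, 29, tzinfo=timezone.utc)},
    {"name": "old-batch-host", "last_seen": datetime(2024, 5, 1, tzinfo=timezone.utc)},
]

print(stale_resources(INVENTORY, NOW))  # ['old-batch-host']
```

Treat the result as a review list rather than a delete list, since a quiet resource may be in planned maintenance rather than decommissioned.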

Example low-cost starter estimate (no fabricated numbers)

A reasonable starter approach:
- 1 monitored host for a short test period (1–3 days)
- Minimal alarms (1–2)
- Basic agent logs only

To estimate:
1. Find the Stack Monitoring meter in the Oracle Cloud Price List (your region).
2. Multiply: rate_per_monitored_resource_hour × 24 × number_of_days × number_of_resources
3. Add:
   - Notifications deliveries (if any)
   - Logging ingestion/retention (if enabled)
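The estimate above is simple arithmetic, sketched here so you can plug in the rate you find in the price list. The function names are hypothetical, no real price is assumed, and `extras` stands for whatever Notifications and Logging charges apply to you.

```python
def resource_hours(days, resources):
    """Monitored resource-hours for `resources` resources kept onboarded for `days` days."""
    return 24 * days * resources


def estimate_cost(rate_per_resource_hour, days, resources, extras=0.0):
    """Apply the formula above: rate x 24 x days x resources, plus any extra charges."""
    if rate_per_resource_hour < 0 or days < 0 or resources < 0 or extras < 0:
        raise ValueError("inputs must be non-negative")
    return rate_per_resource_hour * resource_hours(days, resources) + extras


# One host for three days is 72 resource-hours; multiply by the rate from the price list.
print(resource_hours(days=3, resources=1))  # 72
```

Computing resource-hours separately makes it easy to see how quickly the total grows when the same setup is copied across environments.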

Example production cost considerations

For production you should model:
- N hosts + M middleware instances (as monitored resources)
- Multiple environments (prod + DR)
- Alarm count and notification throughput
- Hybrid connectivity and NAT/data costs
- Potential need for separate compartments per app domain (organizational overhead more than service cost)

10. Step-by-Step Hands-On Tutorial

Objective

Onboard a single OCI Compute Linux host into Oracle Cloud Stack Monitoring using the OCI Management Agent, verify that the host appears as a monitored resource, and configure a basic alert notification workflow.

Lab Overview

You will:
1. Create a compartment and IAM policies (or validate existing access).
2. Launch a small OCI compute instance.
3. Install and register the OCI Management Agent.
4. Run Stack Monitoring discovery (host).
5. Validate resource inventory and basic metrics visibility.
6. Configure a simple notification channel and an alarm (pattern).
7. Clean up all resources to avoid ongoing charges.

This lab is designed to be low-risk and low-cost. Actual charges depend on your tenancy and region—verify pricing before enabling monitoring in production.

Step 1: Prepare a compartment and IAM access

In the OCI Console, open Identity & Security → Compartments.
Create a compartment, for example:
- Name: obs-lab
- Description: Stack Monitoring lab compartment

Expected outcome: You have a dedicated compartment to keep lab resources isolated.

Ensure your user (or group) has permissions. In Identity & Security → Policies, create or update a policy in the parent compartment (often the root compartment) that allows your group to manage Stack Monitoring and Management Agents in obs-lab.

Example policy statements (adjust names):

Allow group ObservabilityAdmins to manage stack-monitoring-family in compartment obs-lab
Allow group ObservabilityAdmins to manage management-agents in compartment obs-lab
Allow group ObservabilityAdmins to read metrics in compartment obs-lab
Allow group ObservabilityAdmins to manage alarms in compartment obs-lab
Allow group ObservabilityAdmins to manage ons-topics in compartment obs-lab
Allow group ObservabilityAdmins to manage ons-subscriptions in compartment obs-lab

Verification:
- You can open the Stack Monitoring pages without authorization errors.
- You can access Observability & Management → Management Agents.

If policy validation fails: OCI policy families can differ by tenancy features. Use the console’s policy syntax helper and verify in official IAM docs.

Step 2: Create a compute instance for the agent

Go to Compute → Instances and ensure you are in the correct region.
Click Create instance.
Choose:
- Name: sm-lab-host-01
- Compartment: obs-lab
- Image: Oracle Linux (a current supported version)
- Shape: a small VM shape suitable for labs
- Networking: create a new VCN or use an existing lab VCN
- Ensure outbound connectivity:
  - Easiest: public subnet with public IP (lab only)
  - More realistic: private subnet + NAT Gateway
Create or use an existing SSH keypair and save the private key securely.

Expected outcome: Instance is in RUNNING state and you can SSH to it.

Verification (SSH):

ssh -i /path/to/private_key opc@<public_ip>
uname -a

If you used a private subnet, connect through a bastion or VPN as required.

Step 3: Install and register the OCI Management Agent

You must install the agent using the official install steps for your OS. The safest, most current method is to copy the install command directly from the OCI Console.

In the OCI Console, go to Observability & Management → Management Agents.
Select the compartment obs-lab.
Click Download and install agent (or similarly named action).
Select your platform (Oracle Linux) and follow the wizard to:
- Create or select an Agent Install Key (naming varies by console version)
- Copy the installation command
SSH into your instance and run the copied command (as instructed by OCI). This often requires sudo.

Expected outcome: The agent installs and registers, and the instance appears as an agent in the console.

Verification:
- In Management Agents, your agent status becomes Active (may take a few minutes).
- On the host, verify the agent service is running. The exact service name can differ by version; check the install output and official docs. Common checks include:

sudo systemctl status <agent-service-name>

If you’re not sure, list services:

sudo systemctl list-units --type=service | grep -i agent

Note: If the agent stays “Inactive”, the most common causes are missing outbound connectivity, DNS issues, time drift, or missing IAM permissions.

Step 4: Enable Stack Monitoring and run discovery for the host

Go to Observability & Management → Stack Monitoring.
Ensure compartment is set to obs-lab.
Find the Discovery area (often under “Monitored Resources” or “Administration” depending on console layout).
Create a discovery job (name examples vary). Choose:
- Discovery type: Host-based (or similar)
- Agent: select your registered management agent
- Scope/compartment: obs-lab
- Schedule: one-time for this lab
Run the discovery job and wait for completion.

Expected outcome: The host becomes visible in Stack Monitoring as a monitored resource.

Verification:
- Go to Stack Monitoring → Monitored Resources.
- Filter by type (Host) and search for sm-lab-host-01.
- Open the resource and confirm you see:
  - Resource details (name, compartment)
  - Status/availability indicators (if provided)
  - Metrics charts/tabs (exact charts depend on supported metrics)

If you do not see metrics immediately, wait several collection intervals.

Step 5: Create a notification topic and subscription (for alerts)

This step sets up where alarms will send notifications.

Go to Developer Services → Notifications.
In compartment obs-lab, create a Topic:
- Name: sm-lab-alerts
Create a Subscription:
- Protocol: Email
- Email: your address
Confirm the subscription from the email you receive.

Expected outcome: Topic exists and your subscription is confirmed.

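Verification: The topic lists your subscription as confirmed. The console typically shows it as Active, and as Pending until you click the link in the confirmation email (the exact label may vary by console version).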
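
The same topic and subscription can be created from the OCI CLI instead of the console. The following is a minimal sketch, assuming the OCI CLI is installed and configured. The two helper functions are hypothetical, and the OCIDs and email address are placeholders. The script only prints the commands so you can review them and check the flags with `oci ons topic create --help` and `oci ons subscription create --help` before running anything.

```shell
#!/bin/sh
# Print (not run) the OCI CLI commands for the Step 5 topic and subscription.
# plan_topic and plan_subscription are illustrative helpers, not OCI tooling.

plan_topic() {
  # $1 = compartment OCID (placeholder), $2 = topic name
  printf 'oci ons topic create --compartment-id %s --name %s\n' "$1" "$2"
}

plan_subscription() {
  # $1 = compartment OCID, $2 = topic OCID, $3 = email address (all placeholders)
  printf 'oci ons subscription create --compartment-id %s --topic-id %s --protocol EMAIL --subscription-endpoint %s\n' "$1" "$2" "$3"
}

plan_topic "<compartment-ocid>" "sm-lab-alerts"
plan_subscription "<compartment-ocid>" "<topic-ocid>" "you@example.com"
```

The subscription still has to be confirmed from the email, exactly as in the console flow.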

Step 6: Create a basic alarm for the monitored host (pattern)

Alarm creation can be done either:
- Directly in OCI Monitoring (common), or
- From within Stack Monitoring metric views (if your console provides a shortcut)

Because metric namespaces and names can vary by resource type and release, the most reliable approach is to create an alarm from an existing metric chart in the console (where available), or to locate the relevant metric in OCI Monitoring.

Option A (recommended for beginners): create from a chart
1. In Stack Monitoring, open your monitored host resource.
2. Open a metric chart (for example CPU utilization if available).
3. Use the action menu (often “Create alarm”) to create an alarm based on that metric.
4. Configure:
   - Alarm name: sm-lab-host-cpu-high
   - Severity: Warning
   - Trigger: CPU > threshold for N minutes (choose a conservative lab value)
   - Notification: select topic sm-lab-alerts

Option B: create in OCI Monitoring
1. Go to Observability & Management → Monitoring → Alarms.
2. Click Create Alarm and select the metric emitted for your monitored resource.
3. Attach the Notifications topic.

Expected outcome: Alarm exists and is in “OK” state initially.
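
Under the hood, an OCI Monitoring alarm evaluates a Monitoring Query Language (MQL) expression of the form metric[interval]{dimensions}.statistic() operator value. The sketch below only assembles such a string; build_alarm_query is a hypothetical helper, and the metric name CpuUtilization and the dimension resourceName are assumptions. Use the names you actually see on the metric chart for your host.

```shell
#!/bin/sh
# build_alarm_query METRIC INTERVAL STATISTIC OPERATOR THRESHOLD [DIMENSION_FILTER]
# Assemble an MQL alarm expression as plain text. Nothing is sent to OCI.
build_alarm_query() {
  metric=$1 interval=$2 stat=$3 op=$4 threshold=$5 dims=${6:-}
  if [ -n "$dims" ]; then
    printf '%s[%s]{%s}.%s() %s %s\n' "$metric" "$interval" "$dims" "$stat" "$op" "$threshold"
  else
    printf '%s[%s].%s() %s %s\n' "$metric" "$interval" "$stat" "$op" "$threshold"
  fi
}

# Metric and dimension names below are assumptions for illustration.
build_alarm_query CpuUtilization 1m mean '>' 80
# prints: CpuUtilization[1m].mean() > 80
build_alarm_query CpuUtilization 1m mean '>' 80 'resourceName = "sm-lab-host-01"'
# prints: CpuUtilization[1m]{resourceName = "sm-lab-host-01"}.mean() > 80
```

The resulting string is what goes into the alarm's query field when you define the alarm in OCI Monitoring rather than from a chart.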

Verification:
- View the alarm’s status in OCI Monitoring.
- If you want to force a test, create a temporary CPU load on the host:

# Install stress-ng if it is available in your distro repositories
sudo dnf -y install stress-ng || sudo yum -y install stress-ng || true
# Run a short CPU load; on multi-core hosts use --cpu 0 to start one worker per online CPU
command -v stress-ng >/dev/null 2>&1 && stress-ng --cpu 1 --timeout 180s

Then observe whether the alarm triggers (it may take several minutes depending on evaluation window).
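
Whether the induced load triggers the alarm depends on how the "for N minutes" part of the trigger compares with the length of the load. The sketch below is a simplified model of that rule, not OCI's actual evaluation logic: would_fire is a hypothetical helper that treats each input line as one per-minute datapoint and reports "fire" only when the most recent N datapoints are all above the threshold.

```shell
#!/bin/sh
# would_fire THRESHOLD MINUTES
# Reads one per-minute metric value per line on stdin.
# Exit status 0 means the last MINUTES values are all above THRESHOLD.
would_fire() {
  awk -v t="$1" -v n="$2" '
    { v[NR] = $1 }
    END {
      if (NR < n) exit 1                     # not enough datapoints yet
      for (i = NR - n + 1; i <= NR; i++)
        if (v[i] + 0 <= t + 0) exit 1        # one low value breaks the run
      exit 0
    }'
}

# Three minutes of high CPU after an idle minute, threshold 80 for 3 minutes.
if printf '%s\n' 12 95 97 96 | would_fire 80 3; then echo "fires"; else echo "does not fire"; fi
# prints: fires

# Same data, but the alarm requires 5 minutes above the threshold.
if printf '%s\n' 12 95 97 96 | would_fire 80 5; then echo "fires"; else echo "does not fire"; fi
# prints: does not fire
```

Under this model, a 180-second load produces only about three high datapoints, so for the lab either keep N at 3 minutes or less or run the load longer.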

Validation

Use this checklist:

Agent validation
- Management Agent appears in Management Agents with status Active.
Discovery validation
- Discovery job completed successfully.
- Host appears under Stack Monitoring → Monitored Resources.
Metrics validation
- You can view at least some host metrics charts (exact set varies).
Alerting validation
- Notifications topic and confirmed subscription exist.
- Alarm exists and shows evaluation/OK state.
- Optional: alarm triggers under induced load (may take time).

Troubleshooting

Common issues and fixes:

Agent is not Active
- Check outbound connectivity (security list/NSG, route tables, NAT/internet gateway).
- Check DNS resolution from the host.
- Ensure system time is correct (NTP/chrony).
- Confirm IAM policies allow management agent operations in the compartment.
Discovery job fails
- Verify the correct compartment is selected.
- Verify you selected the correct agent.
- Check agent logs (often available via local files and/or OCI Logging if enabled).
- Confirm the resource type you’re trying to discover is supported (host discovery is the simplest baseline).
Host appears but no metrics
- Wait for collection intervals.
- Confirm the agent is running and not blocked by host firewall rules.
- Verify whether additional plugins/config are required for metrics collection (depends on resource type; verify in official docs).
No alert emails received
- Confirm the subscription has been confirmed (it should no longer show as pending in the console).
- Check spam filters.
- Ensure the alarm is actually firing and routed to the correct topic.
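
The host-side checks listed under "Agent is not Active" (DNS, outbound HTTPS, time sync) can be run in one pass. This is a minimal sketch with hypothetical helper names. The endpoint hostname is a placeholder you take from the Management Agent documentation for your region, and the time check assumes a systemd version whose timedatectl supports the show subcommand.

```shell
#!/bin/sh
# run_check LABEL COMMAND...: run COMMAND quietly and print LABEL with OK or FAIL.
run_check() {
  label=$1; shift
  if "$@" >/dev/null 2>&1; then
    printf '%-8s OK\n' "$label"
  else
    printf '%-8s FAIL\n' "$label"
    return 1
  fi
}

# DNS: can the host resolve the endpoint name?
check_dns()   { getent hosts "$1"; }
# HTTPS: can the host complete a TLS connection? Any HTTP status counts as reachable;
# a proxy that breaks certificate validation makes curl fail here.
check_https() { curl -sS --max-time 10 -o /dev/null "https://$1/"; }
# Time: does systemd report the clock as NTP-synchronized?
check_time()  { [ "$(timedatectl show -p NTPSynchronized --value)" = "yes" ]; }

# agent_preflight HOST: HOST is the OCI endpoint hostname the agent must reach.
agent_preflight() {
  rc=0
  run_check dns   check_dns "$1"   || rc=1
  run_check https check_https "$1" || rc=1
  run_check time  check_time       || rc=1
  return "$rc"
}

# Example (placeholder hostname): agent_preflight "<oci-endpoint-hostname>"
```

A FAIL on dns or https points at network configuration, while a FAIL on time points at NTP/chrony. IAM policy problems are not covered by these host-side checks.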

Cleanup

To avoid ongoing charges and clutter:

Delete alarm
- Monitoring → Alarms → delete sm-lab-host-cpu-high
Delete Notifications subscription and topic
- Notifications → Topic sm-lab-alerts → delete subscription
- Delete the topic
Deregister/delete management agent record (optional)
- Observability & Management → Management Agents → select agent → delete/deregister (if your console supports it)
Terminate compute instance
- Compute → Instances → sm-lab-host-01 → Terminate
- Choose to delete boot volume if you do not need it
Delete compartment (optional)
- Delete resources first; compartments can only be deleted when empty
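
If you prefer the OCI CLI for cleanup, the order matters in the same way: alarm first, then subscription and topic, then the agent record, then the instance. The sketch below prints the commands in that order without running them. plan_cleanup is a hypothetical helper, every argument is a placeholder OCID, and the subcommands and flags should be confirmed with each command's --help before use.

```shell
#!/bin/sh
# plan_cleanup ALARM_ID SUBSCRIPTION_ID TOPIC_ID AGENT_ID INSTANCE_ID
# Print (not run) OCI CLI cleanup commands in dependency order.
plan_cleanup() {
  printf 'oci monitoring alarm delete --alarm-id %s\n' "$1"
  printf 'oci ons subscription delete --subscription-id %s\n' "$2"
  printf 'oci ons topic delete --topic-id %s\n' "$3"
  printf 'oci management-agent agent delete --agent-id %s\n' "$4"
  # Deleting the boot volume with the instance avoids leftover storage charges.
  printf 'oci compute instance terminate --instance-id %s --preserve-boot-volume false\n' "$5"
}

plan_cleanup "<alarm-ocid>" "<subscription-ocid>" "<topic-ocid>" "<agent-ocid>" "<instance-ocid>"
```

Each printed command prompts for confirmation when run interactively, which is a useful safety net for destructive operations.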

11. Best Practices

Architecture best practices

Start with a reference compartment model
prod, nonprod, shared-services, security are common patterns.
Standardize onboarding
Use consistent naming: env-app-tier-## (e.g., prod-payments-wls-01)
Use tags: CostCenter, Owner, Environment, AppName
Design for hybrid
If monitoring on-prem, plan network connectivity and egress carefully.
Use VPN/FastConnect and tightly controlled outbound rules.
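
A naming convention such as env-app-tier-## is easier to keep consistent when it is checked automatically during onboarding. The following is a minimal sketch; is_valid_resource_name is a hypothetical helper, and the allowed environment prefixes (prod, nonprod, dev, test) are an assumption you should replace with your own list.

```shell
#!/bin/sh
# is_valid_resource_name NAME
# Succeeds when NAME matches env-app-tier-## (lowercase, two-digit suffix).
is_valid_resource_name() {
  printf '%s\n' "$1" | grep -Eq '^(prod|nonprod|dev|test)-[a-z0-9]+-[a-z0-9]+-[0-9]{2}$'
}

for name in prod-payments-wls-01 nonprod-risk-db-12 payments-wls-1; do
  if is_valid_resource_name "$name"; then
    echo "ok      $name"
  else
    echo "reject  $name"
  fi
done
# prints:
# ok      prod-payments-wls-01
# ok      nonprod-risk-db-12
# reject  payments-wls-1
```

A check like this fits naturally into the provisioning pipeline or a review step, before the host is registered for monitoring.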

IAM/security best practices

Least privilege
Separate roles:
- Agent installers (can manage management-agents)
- Monitoring admins (can manage stack-monitoring)
- Readers (can read monitored resources/metrics)
Compartment scoping
Keep production monitoring configuration changes limited to a smaller admin group.
Audit changes
Review OCI Audit logs for changes to monitoring resources and policies.
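
The role separation above maps onto compartment-scoped IAM policy statements. The sketch below renders one possible set for a given compartment. The group names are hypothetical, and the resource-type names (management-agents, management-agent-install-keys, stack-monitoring-family, metrics) are assumptions to verify against the current OCI IAM policy reference before you apply them.

```shell
#!/bin/sh
# render_policies COMPARTMENT
# Print example policy statements for the three roles, scoped to one compartment.
render_policies() {
  cat <<EOF
Allow group AgentInstallers to manage management-agents in compartment $1
Allow group AgentInstallers to manage management-agent-install-keys in compartment $1
Allow group MonitoringAdmins to manage stack-monitoring-family in compartment $1
Allow group MonitoringReaders to read stack-monitoring-family in compartment $1
Allow group MonitoringReaders to read metrics in compartment $1
EOF
}

render_policies obs-lab
```

Keeping the statements scoped to a compartment, rather than the tenancy, is what limits each role to the environments it is responsible for.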

Cost best practices

Monitor production first
Expand to non-prod based on value and budget.
Prune stale monitored resources
Decommissioned instances should be removed from monitoring.
Reduce alert noise
Use sane thresholds, longer evaluation windows, and alarm suppression during planned maintenance (where supported).

Performance best practices

Agent resource footprint
Validate CPU/memory overhead on small shapes.
Keep OS patched and avoid resource contention.
Collection interval tuning
Use defaults first; tighten only when you have a clear need (and understand cost/noise implications).

Reliability best practices

Egress resilience
Ensure agents can reach OCI endpoints consistently (redundant NAT if required, stable DNS).
Document runbooks
For common alerts, include “first checks” and “escalation path”.

Operations best practices

Golden signals
Latency, traffic, errors, saturation—map host/middleware metrics to these signals.
Unified incident routing
Route alarms through centralized Notifications topics with consistent naming.
Change management
Treat monitoring changes like code where possible (review, test in non-prod).

Governance/tagging/naming best practices

Use tags consistently:
Environment=Prod/Dev
OwnerEmail=team@company.com
DataClassification=Internal/Restricted
Use naming conventions for alarms and topics:
alarm-prod-payments-host-cpu-high
topic-prod-oncall

12. Security Considerations

Identity and access model

Stack Monitoring is governed by OCI IAM.
Use:
Groups for human users
Policies scoped to compartments
Where automation is used, consider OCI-native identities (for example instance principals/dynamic groups) as appropriate—verify the supported automation identity patterns for Stack Monitoring APIs you plan to use.

Encryption

OCI services use encryption at rest and TLS in transit as standard practice, but details can vary per service and region.
For sensitive environments, confirm:
In-transit TLS requirements
At-rest encryption behavior
Whether customer-managed keys (KMS) are supported for any stored data (verify in official docs)

Network exposure

Agents require outbound access to OCI endpoints.
Avoid placing monitored hosts on open public networks in production.
Prefer private subnets + NAT, or private connectivity designs (VPN/FastConnect).

Secrets handling

Do not store agent install keys in source code repositories.
Treat registration keys like credentials:
Limit who can create them
Rotate if exposure is suspected
Store in a secure secrets manager (OCI Vault) if you must distribute them
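
One way to enforce the first rule is to scan a working tree for install keys before committing. This is a minimal sketch: find_install_keys is a hypothetical helper, and the ManagementAgentInstallKey parameter name is an assumption based on the agent's response-file format, so adjust the pattern to match how keys appear in your own files.

```shell
#!/bin/sh
# find_install_keys DIR
# List text files under DIR that contain an install key assignment.
# Exit status 0 means at least one match was found.
find_install_keys() {
  grep -rIlE '^[[:space:]]*ManagementAgentInstallKey[[:space:]]*=' "$1" 2>/dev/null
}

# Example use in a pre-commit hook (run from the repository root):
#   if find_install_keys .; then
#     echo "Install key found in the files listed above; remove it before committing." >&2
#     exit 1
#   fi
```

A pattern scan only catches keys stored in the expected form, so it complements access control and rotation rather than replacing them.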

Audit/logging

Enable and review:
OCI Audit logs for administrative actions
Agent logs for connectivity/collection issues
If you centralize logging, ensure logs don’t leak secrets or internal hostnames beyond approved boundaries.

Compliance considerations

Validate:
Data residency (region)
Retention policies for metrics/logs
Access review processes and separation of duties

Common security mistakes

Overbroad policies like manage all-resources in tenancy
Agents installed with excessive OS privileges without change control
Sending alerts to public webhooks without authentication
Monitoring production from a shared non-prod compartment

Secure deployment recommendations

Use separate compartments for prod and non-prod.
Lock down who can:
Register agents
Run discovery
Create/modify alarms
Require MFA for privileged accounts.
Use private networking patterns and controlled egress.

13. Limitations and Gotchas

The most important limitation to understand: Stack Monitoring only provides deep monitoring for supported resource types and versions. Always check compatibility before you standardize on it.

Known limitations (verify current list in docs)

Supported technologies are specific
Not every middleware/database/platform is supported.
Agent requirement
Many onboarding flows require an agent and local access.
Cross-region visibility
Stack Monitoring is regional; if you operate multi-region, plan per-region monitoring and dashboards.
Topology completeness
Relationships depend on discovery and supported types; topology may not show every dependency in complex microservices.

Quotas

You may encounter limits on:
Agent counts
Monitored resources
Discovery jobs
Alarm resources
Review in OCI Console: Governance & Administration → Limits, Quotas and Usage (menu names may vary).

Regional constraints

Service availability and supported target types can vary by region.
Some features appear earlier in commercial regions than in restricted/government regions.

Pricing surprises

Monitoring non-prod at the same scale as prod can multiply costs.
NAT gateways and logging ingestion can add unexpected cost.
Long retention and high-frequency collection can increase telemetry volume (depending on service behavior).
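
The first point is simple arithmetic once you know your meter. The sketch below assumes a per-resource-hour meter and uses a made-up rate of 0.01 per resource-hour purely for illustration; estimate_monthly_cost is a hypothetical helper, and the real meter and rate must come from the official price list.

```shell
#!/bin/sh
# estimate_monthly_cost RESOURCES HOURS RATE
# RATE is a price per resource-hour. The value used below is hypothetical.
estimate_monthly_cost() {
  awk -v n="$1" -v h="$2" -v r="$3" 'BEGIN { printf "%.2f\n", n * h * r }'
}

# 10 production hosts for a 744-hour month at the hypothetical rate.
estimate_monthly_cost 10 744 0.01
# prints: 74.40

# Adding 10 non-prod hosts at the same scale doubles the estimate.
estimate_monthly_cost 20 744 0.01
# prints: 148.80
```

Related charges such as NAT gateways and log ingestion are not included and need their own estimate.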

Compatibility issues

OS version and package compatibility for management agent installation.
Corporate proxy/TLS inspection breaking agent connectivity.
Host hardening policies blocking agent services or required system calls.

Operational gotchas

Alarm fatigue if thresholds are too aggressive.
Discovery drift if environments are frequently rebuilt (immutable infrastructure)—consider automating agent install and cleanup.

Migration challenges

Migrating from Enterprise Manager or Prometheus requires:
Metric mapping
Alarm re-creation
On-call process updates
Don’t attempt a “big bang” cutover; run in parallel until confidence is high.

Vendor-specific nuances

Stack Monitoring aligns well with Oracle technology stacks, but for heterogeneous stacks you may need additional tools.

14. Comparison with Alternatives

Nearest services in Oracle Cloud

OCI Monitoring: great for OCI resource metrics and alarms; less focused on topology for application stacks.
OCI Logging / Logging Analytics: logs-centric (search, parsing, analytics) rather than stack topology.
OCI APM: code-level tracing and performance for applications; different focus than infrastructure stack monitoring.
Database Management / Operations Insights: database operations and capacity analytics; complementary.

Nearest services in other clouds (not OCI)

AWS CloudWatch (metrics/logs/alarms) + CloudWatch Application Insights (application-centric patterns)
Azure Monitor + Application Insights
Google Cloud Operations Suite (Cloud Monitoring/Logging/Trace)

Open-source or self-managed alternatives

Prometheus + Alertmanager + Grafana
OpenTelemetry collectors + backends (Prometheus, Tempo, Loki, etc.)
Zabbix, Nagios (legacy but common)

Comparison table

Option	Best For	Strengths	Weaknesses	When to Choose
OCI Stack Monitoring	Stack-level monitoring with supported resource discovery/topology in Oracle Cloud	Topology + inventory model; OCI IAM/compartments; agent-based hybrid potential	Supported targets are specific; requires agent rollout; pricing depends on meters	You run supported Oracle stacks and want OCI-native operations
OCI Monitoring	OCI resource metrics + alarms	Simple, native, scalable; integrates with Notifications	Less “application stack topology” context by itself	You mainly monitor OCI services and need alarms quickly
OCI Logging / Logging Analytics	Central log collection and analysis	Great for troubleshooting text logs and searches	Not a replacement for metrics/topology; can become expensive at scale	You need log search, parsing, and forensic investigation
OCI APM	Distributed tracing and app performance	Deep application insight; traces, spans, service maps (APM-specific)	Requires instrumentation; different focus than host/middleware ops	You need code-level insight and distributed tracing
Oracle Enterprise Manager (self-managed)	Deep Oracle ecosystem monitoring on-prem	Mature enterprise monitoring for Oracle tech	You manage infrastructure and upgrades; licensing/ops overhead	You need on-prem centralized monitoring and already run OEM
Prometheus + Grafana (self-managed)	Vendor-neutral metrics at scale	Flexible, open ecosystem; strong community	You operate and scale it; topology modeling is custom	You need portability and have platform engineering bandwidth
Datadog / New Relic (SaaS)	Full-stack observability across vendors	Quick onboarding, broad integrations	Subscription costs; data residency concerns	You want fast time-to-value across heterogeneous stacks

15. Real-World Example

Enterprise example (hybrid Oracle middleware estate)

Problem
A financial services organization runs:
- On-prem WebLogic clusters
- Oracle databases
- A growing OCI footprint for new services
Incidents take too long because ownership is split and dependencies aren’t visible.

Proposed architecture
- Install OCI Management Agent on:
  - OCI compute hosts running middleware
  - On-prem VMs (where allowed)
- Use Stack Monitoring in an OCI region aligned to compliance needs
- Organize monitored resources by compartments:
  - prod-payments, prod-risk, shared-services
- Centralize alarms through:
  - OCI Monitoring alarms
  - OCI Notifications topics per domain (payments-oncall, risk-oncall)
- Enable OCI Logging for agent/service diagnostics; send to centralized logging compartment if required

Why Stack Monitoring was chosen
- Need stack-centric visibility and topology for supported Oracle technologies
- Strong alignment with OCI governance (IAM + compartments)
- Supports hybrid patterns via agent connectivity

Expected outcomes
- Faster triage via topology and consistent dashboards
- Better separation of duties and auditability
- More predictable onboarding for new environments

Startup/small-team example (lean operations on OCI compute)

Problem
A small SaaS team runs several OCI compute instances hosting:
- A small middleware tier
- Background jobs
They lack a dedicated ops team and want early warnings for host saturation.

Proposed architecture
- Install Management Agent on each instance
- Use Stack Monitoring to maintain inventory and basic host health
- Create a few alarms:
  - CPU sustained high
  - Disk usage above threshold
- Send alerts to a single Notifications topic subscribed by on-call email

Why Stack Monitoring was chosen
- Low friction to onboard hosts (agent + discovery)
- Central view in OCI without building a full Prometheus stack

Expected outcomes
- Reduced “surprise outages” from disk exhaustion
- Clearer visibility into which instance is unhealthy
- A baseline monitoring foundation that can expand later

16. FAQ

1) Is Stack Monitoring the same as OCI Monitoring?
No. OCI Monitoring is a general metrics and alarms service for OCI resources. Stack Monitoring focuses on stack-level monitored resources, discovery, and topology across supported components, typically using the Management Agent.

2) Do I need to install an agent?
Often yes. Stack Monitoring commonly relies on the OCI Management Agent for discovery and telemetry collection.

3) Can I monitor on-premises servers with Stack Monitoring?
Potentially, if you can install the Management Agent and provide secure outbound connectivity to OCI endpoints. Verify supported OS and network requirements in the Management Agent docs.

4) Is Stack Monitoring regional?
Yes, it operates within an OCI region. Plan multi-region monitoring accordingly.

5) Does Stack Monitoring support Kubernetes monitoring?
Do not assume so. OCI has Kubernetes observability patterns using other services (and open standards). Check the Stack Monitoring supported resources list in official docs.

6) What Oracle technologies are supported?
Support is specific (and changes over time). Use the official Stack Monitoring documentation for the current supported resource types and versions.

7) How do alarms work with Stack Monitoring?
In many OCI setups, metrics can be used with OCI Monitoring alarms and routed via OCI Notifications. Exact workflows can vary; verify in official docs for your resource type.

8) Can I automate onboarding?
Yes, typically through agent installation automation (cloud-init/Ansible) and OCI APIs/SDKs where available. Verify the Stack Monitoring API reference for supported operations.

9) What’s the difference between Stack Monitoring and OCI APM?
APM is for application-level observability (traces, spans, service performance). Stack Monitoring focuses more on infrastructure/middleware stack monitoring with topology and curated metrics.

10) Does Stack Monitoring store logs?
Stack Monitoring is primarily metrics/topology-oriented. Logs are usually handled by OCI Logging / Logging Analytics. Agents generate logs that you can collect and analyze separately.

11) How secure is agent communication?
Agents communicate with OCI endpoints over TLS. For strict environments, validate certificates, proxy behavior, and network paths per your security policies.

12) Can I use private networking only (no public internet) for agents?
Often you can use controlled egress (NAT, proxies, private connectivity). Whether fully private endpoints are supported depends on region/service capabilities—verify in official docs.

13) How quickly do metrics appear after onboarding?
Usually within minutes, but depends on collection intervals and discovery completion. If metrics don’t appear, check agent status and logs.

14) What should I monitor first?
Start with:
- Availability
- CPU/memory saturation
- Disk utilization
Then expand into middleware/database signals as supported.

15) How do I avoid alert fatigue?
Use longer evaluation windows, tune thresholds based on baselines, route alerts by ownership, and silence alerts during planned maintenance (using your organization’s standard process and supported OCI features).

17. Top Online Resources to Learn Stack Monitoring

Resource Type	Name	Why It Is Useful
Official documentation	Stack Monitoring docs — https://docs.oracle.com/en-us/iaas/stack-monitoring/	Primary source for supported resources, onboarding, discovery, and concepts
Official documentation	Management Agent docs — https://docs.oracle.com/en-us/iaas/management-agents/	Installation, registration, troubleshooting, and OS support for the agent
Official pricing	Oracle Cloud Pricing — https://www.oracle.com/cloud/pricing/	High-level pricing entry point and service pricing navigation
Official price list	Oracle Cloud Price List — https://www.oracle.com/cloud/price-list/	Authoritative SKU/meter definitions (search for “Stack Monitoring”)
Official cost tool	Oracle Cloud Cost Estimator — https://www.oracle.com/cloud/costestimator.html	Practical way to model costs without guessing
Official docs	OCI CLI installation — https://docs.oracle.com/en-us/iaas/Content/API/SDKDocs/cliinstall.htm	Useful if you want to script onboarding and operations
Official architecture	Oracle Architecture Center — https://docs.oracle.com/en/solutions/	Reference architectures and best practices (use search for observability/monitoring patterns)
Official product overview	Observability and Management overview — https://www.oracle.com/cloud/observability/	Portfolio context; helps choose the right OCI observability services
Community (use with care)	Oracle Cloud customer/community blogs (search)	Practical war stories and setups; always validate against official docs
Community (tooling)	OpenTelemetry documentation — https://opentelemetry.io/docs/	Helpful for broader observability strategy; complementary to OCI-native services

18. Training and Certification Providers

Institute	Suitable Audience	Likely Learning Focus	Mode	Website URL
DevOpsSchool.com	DevOps engineers, SREs, platform teams	DevOps practices, cloud operations, observability fundamentals	Check website	https://www.devopsschool.com/
ScmGalaxy.com	Beginners to intermediate engineers	DevOps/SCM concepts, CI/CD, tooling foundations	Check website	https://www.scmgalaxy.com/
CLoudOpsNow.in	Cloud engineers, operations teams	Cloud operations and production support practices	Check website	https://www.cloudopsnow.in/
SreSchool.com	SREs, reliability engineers, ops leads	SRE principles, incident response, monitoring/alerting strategy	Check website	https://www.sreschool.com/
AiOpsSchool.com	Ops teams exploring automation	AIOps concepts, automation patterns, event correlation	Check website	https://www.aiopsschool.com/

19. Top Trainers

Platform/Site	Likely Specialization	Suitable Audience	Website URL
RajeshKumar.xyz	DevOps/cloud training content (verify offerings)	Beginners to practitioners	https://rajeshkumar.xyz/
devopstrainer.in	DevOps coaching and workshops (verify offerings)	DevOps engineers and teams	https://www.devopstrainer.in/
devopsfreelancer.com	Freelance DevOps enablement (verify offerings)	Small teams needing practical guidance	https://www.devopsfreelancer.com/
devopssupport.in	DevOps support and training resources (verify offerings)	Ops/DevOps teams	https://www.devopssupport.in/

20. Top Consulting Companies

Company	Likely Service Area	Where They May Help	Consulting Use Case Examples	Website URL
cotocus.com	Cloud/DevOps consulting (verify specialties)	Architecture, implementation, operations	Designing observability baselines; onboarding environments; building runbooks	https://cotocus.com/
DevOpsSchool.com	DevOps consulting and enablement	DevOps transformations, tooling, training	Implementing monitoring strategy; alerting governance; operational playbooks	https://www.devopsschool.com/
DEVOPSCONSULTING.IN	DevOps consulting (verify specialties)	CI/CD, cloud ops, observability foundations	Standardizing metrics/alarms; integrating notifications; operational maturity uplift	https://www.devopsconsulting.in/

21. Career and Learning Roadmap

What to learn before Stack Monitoring

OCI fundamentals:
Compartments, VCNs, subnets, routing, NAT/IGW
IAM policies, groups, and least privilege
Linux fundamentals:
systemd services, packages, logs, firewall basics
Observability basics:
Metrics vs logs vs traces
SLI/SLO concepts and alert hygiene

What to learn after Stack Monitoring

OCI Monitoring deep dive:
Metric queries, alarms, notification routing patterns
OCI Logging / Logging Analytics:
Centralized log ingestion, parsing, and search
OCI APM (if you need app-level telemetry)
Infrastructure as Code:
Terraform for OCI (to manage alarms, topics, compartments, and compute)
Incident management:
Runbooks, escalation policies, postmortems, error budgets

Job roles that use it

Cloud Engineer (operations)
DevOps Engineer
Site Reliability Engineer (SRE)
Platform Engineer
Middleware Administrator / Operations
Cloud Solution Architect (operational design)

Certification path (if available)

Oracle certifications evolve. For current OCI certification tracks, verify on Oracle University: https://education.oracle.com/

Stack Monitoring is often learned as part of broader OCI operations/observability skillsets rather than a single dedicated certification.

Project ideas for practice

Build a “prod vs non-prod” compartment model and onboard 5 hosts with consistent tags.
Create standard alarms for CPU, memory, disk and route to different on-call topics.
Implement a maintenance workflow (document + test) for patch windows to reduce noise.
Run a failure injection exercise (disk fill, CPU load) and validate alerting + triage steps.
Hybrid practice: install an agent on a VM outside OCI (lab) and validate secure connectivity.

22. Glossary

Agent: A software component installed on a host to collect telemetry and interact with a management service.
OCI Management Agent: Oracle-provided agent used for data collection and discovery for management/observability services.
Monitored Resource: A logical object in Stack Monitoring representing a discovered component (host, middleware, etc.).
Discovery Job: A process that scans/identifies supported resources and onboards them as monitored resources.
Topology: A representation of relationships and dependencies between monitored resources.
Compartment: OCI’s logical container for organizing and isolating resources and access control.
IAM Policy: A statement defining what actions a group or dynamic group can perform on which resource types in a scope.
Metric: A time-series numerical measurement (CPU %, memory usage, response time).
Alarm: A rule that evaluates metric data and triggers notifications when conditions are met.
Notifications Topic: A message distribution channel in OCI Notifications, used to deliver alarm messages to subscribers.
SLI/SLO: Service Level Indicator / Objective, used to define and measure reliability targets.
Alert fatigue: Operational risk when too many alerts reduce the ability to respond effectively.
NAT Gateway: Provides outbound internet connectivity for resources in private subnets without inbound exposure.
Service Gateway: Allows private access from a VCN to certain OCI services without using the public internet (availability depends on service).

23. Summary

Oracle Cloud Stack Monitoring is an Observability and Management service that helps you discover, monitor, and troubleshoot supported stack components using agent-based telemetry, monitored resource inventory, and topology views. It matters because real incidents are rarely isolated to one layer—Stack Monitoring is designed to make dependencies and operational signals easier to interpret.

Cost is typically driven by how many resources you monitor and for how long, plus any related usage in alarms, notifications, logging, and network egress. Security and governance rely heavily on OCI IAM, compartment design, controlled agent rollout, and audited operational processes.

Use Stack Monitoring when you want OCI-native, stack-aware monitoring (especially in Oracle technology environments). If you need deep application tracing, consider pairing it with OCI APM, and if you need log forensics, pair it with OCI Logging / Logging Analytics.

Next step: review the official Stack Monitoring docs and expand the lab to include your real production compartment model and a standardized alarm/notification routing strategy:
https://docs.oracle.com/en-us/iaas/stack-monitoring/

rajeshkumar
