Category
Analytics and AI
1. Introduction
Oracle Cloud Data Science (often referred to in documentation as OCI Data Science) is a managed service for building, training, evaluating, deploying, and operating machine learning (ML) models on Oracle Cloud Infrastructure (OCI). It provides cloud-native workflows—projects, notebook sessions, jobs, models, and model deployments—so teams can move from experimentation to production without assembling every component from scratch.
In simple terms: Data Science gives you a managed Jupyter-based environment to explore data and train models, and a managed deployment mechanism to expose trained models as scalable endpoints—while integrating with core OCI services such as Object Storage, IAM, Vault, Logging, and Monitoring.
Technically, Data Science is an OCI control-plane service that orchestrates ML workloads on OCI compute shapes (CPU/GPU). It tracks ML assets (notebooks, jobs, models), supports reproducible environments (Conda environments and curated runtimes), and provides managed online inference through model deployments. You typically store datasets and model artifacts in OCI Object Storage and secure everything through OCI IAM policies and network controls (VCNs, subnets, security lists/NSGs).
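All of these control-plane operations (creating projects, notebook sessions, jobs, deployments) are exposed through the OCI API. As a quick illustration, here is a hedged sketch of listing projects with the OCI Python SDK; client and method names follow the SDK as of this writing, so verify against the current SDK reference before relying on them:

```python
def list_data_science_projects(compartment_id):
    """List Data Science project names in a compartment via the control-plane API.

    Sketch only: assumes the OCI Python SDK (`pip install oci`) and a
    configured ~/.oci/config file with a valid API signing key.
    """
    import oci

    config = oci.config.from_file()  # default profile in ~/.oci/config
    client = oci.data_science.DataScienceClient(config)
    projects = client.list_projects(compartment_id=compartment_id).data
    return [p.display_name for p in projects]
```

The same operations are available in the Console and CLI; the SDK form matters mainly for automation and CI/CD.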
What problem it solves: teams often struggle with “last-mile ML” problems—consistent environments, repeatable training runs, artifact management, secure deployments, and operational monitoring. Data Science reduces that friction by standardizing these workflows on Oracle Cloud while keeping integration points (Object Storage, VCN, IAM) explicit and governable.
2. What is Data Science?
Official purpose
Oracle Cloud Data Science is designed to help you build, train, deploy, and manage machine learning models using managed notebook-based development, managed/batch execution (jobs), and managed model deployments for online inference—integrated with OCI’s identity, networking, observability, and storage foundations.
Service naming note: The current service is commonly labeled OCI Data Science in Oracle documentation. In this tutorial, “Data Science” refers specifically to Oracle Cloud Infrastructure Data Science.
Core capabilities (what you can do)
- Create Projects to organize ML assets (notebooks, jobs, models)
- Use managed Notebook Sessions (JupyterLab) for exploration and training
- Run repeatable Jobs for batch training/inference and scheduled runs
- Register trained models in a Model Catalog
- Create Model Deployments (managed HTTPS endpoints) for online inference
- Use curated Conda environments and Oracle’s ML tooling (for example, the OCI Accelerated Data Science library—often referenced as ADS in Oracle materials)
- Integrate with OCI services: Object Storage, IAM, VCN, Logging, Monitoring, Vault, Container Registry (where applicable)
Major components (conceptual model)
- Project: logical container for organizing Data Science resources
- Notebook Session: managed Jupyter environment running on a chosen compute shape
- Job: a managed run of code (often training or batch scoring), typically more reproducible than ad-hoc notebook execution
- Model: a registered ML artifact with metadata; typically stored in Object Storage
- Model Deployment: managed online inference endpoint backed by configurable compute
Service type
- Managed ML platform service (control plane) orchestrating compute for notebooks, jobs, and deployments.
Scope: regional and compartment-based
- Data Science resources are typically regional and created within an OCI compartment.
- Access is governed by OCI IAM (users, groups, policies) and often by resource principals (workload identities) for notebooks/jobs/deployments.
How it fits into the Oracle Cloud ecosystem
Data Science sits in the Analytics and AI category and connects naturally to:
- OCI Object Storage for datasets and model artifacts
- OCI Data Flow / big data services for large-scale processing (when needed)
- Autonomous Database / OCI databases as data sources
- VCN and private networking for secure access to data sources
- OCI Logging/Monitoring for operational visibility
- OCI Vault for managing secrets/keys used by applications and pipelines
3. Why use Data Science?
Business reasons
- Faster path from proof-of-concept to production with standardized ML workflows
- Reduced platform engineering effort compared to building your own notebooks + model serving + IAM + monitoring stack
- Better governance: projects, compartments, policies, tagging, and auditable operations
Technical reasons
- Managed notebooks and deployments on OCI compute (CPU/GPU) using consistent environments
- Model registration and lifecycle management via a model catalog
- Integration with OCI-native networking and identity (private endpoints, resource principals)
- Supports common Python ML stacks and reproducible environments
Operational reasons
- Clear separation of dev (notebooks), batch (jobs), and prod (deployments)
- Use OCI monitoring/logging patterns to run ML endpoints like any other production service
- Easier cleanup and cost control: stop sessions, delete deployments, remove artifacts
Security/compliance reasons
- Central IAM policy enforcement (least privilege at compartment level)
- Private networking options using VCN/subnets (avoid public exposure)
- Encryption at rest and in transit via OCI services (verify specifics per resource in official docs)
- Auditing through OCI Audit logs for API actions
Scalability/performance reasons
- Choose shapes appropriate to workload (small CPU for dev, GPU for training, scalable deployments for inference)
- Managed model deployments can be sized and scaled based on endpoint needs (verify current scaling options in official docs)
When teams should choose Data Science
- You want a managed ML workflow tied to OCI primitives (VCN/IAM/Object Storage)
- You need secure, controlled access to data sources inside OCI
- You want managed online model inference without running Kubernetes/model servers yourself
- You want consistent ML environments and repeatable runs for teams
When teams should not choose Data Science
- You require a fully open, cloud-agnostic ML platform with minimal coupling to a specific cloud’s IAM/networking model
- Your organization already standardized on another ML platform (for example, SageMaker/Vertex AI/Azure ML) and migration cost outweighs benefits
- You need highly specialized custom serving stacks that the managed deployment patterns don’t support (confirm current deployment customization options in official docs)
4. Where is Data Science used?
Industries
- Financial services (risk scoring, fraud detection, credit models)
- Retail/e-commerce (recommendations, demand forecasting, pricing models)
- Healthcare/life sciences (readmission risk, triage support, operational analytics)
- Manufacturing (predictive maintenance, quality inspection, anomaly detection)
- Telecommunications (churn prediction, network anomaly detection)
- Energy/utilities (load forecasting, outage prediction)
- Public sector (resource optimization, fraud/waste detection)
Team types
- Data scientists and ML engineers
- Cloud engineers and platform teams supporting ML workloads
- DevOps/SRE teams operating inference endpoints
- Security teams implementing IAM/network controls for analytics/AI workloads
Workloads
- Exploratory analysis and feature engineering in notebooks
- Batch training runs and evaluation with jobs
- Batch inference (scoring) for periodic pipelines
- Real-time inference via model deployments
- Model registry/catalog for governance and reuse
Architectures
- “Lake-first”: Object Storage data lake → notebooks/jobs → model catalog → deployment
- “Database-first”: Autonomous Database → notebooks/jobs → model deployment close to private network
- Event-driven: object upload triggers pipeline (often via OCI Events/Functions—verify exact integration patterns in official docs)
Real-world deployment contexts
- Private enterprise networks with VCN peering, private endpoints, and strict IAM
- Multi-compartment environments (dev/test/prod segregation)
- CI/CD for ML artifacts where models and deployments are versioned and promoted
Production vs dev/test usage
- Dev/test: small notebook sessions, minimal shapes, experimental projects
- Production: jobs with reproducible environments, model catalog governance, controlled deployments, monitoring/alerts, private endpoints, rigorous IAM
5. Top Use Cases and Scenarios
Below are realistic scenarios where Oracle Cloud Data Science is a good fit.
1) Customer churn prediction
- Problem: identify customers likely to churn to target retention offers.
- Why Data Science fits: notebooks for exploration, jobs for scheduled retraining, deployments for real-time scoring in apps.
- Example: telecom customer profile + usage data stored in Object Storage; churn model deployed as an HTTPS endpoint consumed by CRM.
2) Fraud detection scoring service
- Problem: score transactions for fraud risk with low latency.
- Why Data Science fits: managed model deployments with controlled networking and IAM.
- Example: a payment service calls the deployment endpoint; only private VCN access is allowed.
3) Demand forecasting for supply chain
- Problem: forecast demand to reduce stockouts and overstock.
- Why Data Science fits: jobs for periodic training/inference and artifact tracking.
- Example: nightly job trains a forecasting model and writes forecasts back to Object Storage or a database.
4) Predictive maintenance (IoT)
- Problem: predict equipment failure from sensor data.
- Why Data Science fits: notebooks for feature engineering; jobs for batch scoring; deployments for near-real-time inference.
- Example: an ingestion pipeline stores sensor windows in Object Storage; a job scores anomalies and alerts operations.
5) Document classification (lightweight)
- Problem: classify documents into business categories.
- Why Data Science fits: train classical ML or smaller NLP models; deploy for inference.
- Example: new documents uploaded to Object Storage are batch-classified nightly.
6) Credit risk scoring
- Problem: predict default risk for loan applicants.
- Why Data Science fits: strong governance needs (IAM, compartments) and reproducibility.
- Example: underwriting system calls a private model deployment endpoint.
7) Recommendation model prototype to production
- Problem: convert a notebook-based prototype into a reliable service.
- Why Data Science fits: model catalog + deployment workflow encourages operationalization.
- Example: data scientist prototypes in notebook; ML engineer packages model artifact and deploys to production endpoint.
8) Retail price optimization experiment
- Problem: evaluate price elasticity and optimize pricing.
- Why Data Science fits: quick notebook iteration; jobs for large evaluations.
- Example: train models on historical sales data and run batch simulations as jobs.
9) Anomaly detection for logs/metrics
- Problem: detect unusual patterns in operational telemetry.
- Why Data Science fits: batch training on historical data; deployment used by an internal tool.
- Example: nightly job updates anomaly thresholds; endpoint used by ops dashboards.
10) Compliance and audit-friendly model registry
- Problem: enforce traceability of model versions and metadata.
- Why Data Science fits: model catalog plus OCI governance patterns (tags, compartments, audit logs).
- Example: register each model with versioning metadata and link to training job output stored in Object Storage.
11) Computer vision experimentation (GPU-based)
- Problem: train vision models that need GPUs.
- Why Data Science fits: choose GPU shapes for notebooks/jobs; later deploy a smaller model for inference.
- Example: train on labeled images in Object Storage; deploy an inference endpoint for internal QA.
12) Feature engineering sandbox with secure data access
- Problem: analysts need to experiment without exporting sensitive data.
- Why Data Science fits: notebooks inside private subnets with restricted egress; IAM controls.
- Example: notebook session runs in a private subnet and reads data from private DB endpoints.
6. Core Features
Feature availability can vary by region and over time. For the most current details, verify in the official OCI Data Science documentation: https://docs.oracle.com/en-us/iaas/data-science/using/
Projects
- What it does: organizes related Data Science resources (notebooks, jobs, models).
- Why it matters: reduces sprawl and supports team-based governance.
- Practical benefit: consistent compartment/tagging and clearer lifecycle management.
- Caveats: projects don’t replace compartments; use compartments for environment isolation (dev/test/prod).
Notebook Sessions (managed JupyterLab)
- What it does: provides a managed development environment for Python-based ML.
- Why it matters: accelerates experimentation while keeping compute selection and network placement explicit.
- Practical benefit: quick start with curated environments; easy stop/start.
- Caveats: notebook sessions incur compute/storage costs while running; ensure you stop them when idle.
Jobs (managed batch runs)
- What it does: executes code in a managed, repeatable way (training or batch inference).
- Why it matters: notebooks are not ideal for repeatable production runs; jobs help standardize execution.
- Practical benefit: consistent environment, better automation patterns, and clearer auditability.
- Caveats: job setup requires packaging code and dependencies; plan artifact storage and logging upfront.
Model Catalog (model registration)
- What it does: registers model artifacts and metadata, usually backed by Object Storage.
- Why it matters: enables versioning, discovery, governance, and consistent deployment inputs.
- Practical benefit: easier promotion of a known model version to staging/production.
- Caveats: you must manage artifact structure and metadata discipline; the catalog doesn’t automatically ensure model quality.
Model Deployments (managed online inference)
- What it does: hosts a model as an HTTPS endpoint for real-time predictions.
- Why it matters: removes the need to manage your own model-serving infrastructure for many standard use cases.
- Practical benefit: consistent deployment workflow, IAM/network controls, and operational visibility.
- Caveats: ensure your model artifact includes correct scoring/inference code; deployment failures are often packaging-related.
Curated environments (Conda)
- What it does: provides prebuilt Conda environments commonly used in ML.
- Why it matters: reduces dependency conflicts and improves reproducibility.
- Practical benefit: faster onboarding and fewer “works on my laptop” issues.
- Caveats: if you need niche libraries, you may need a custom environment; keep security patching in mind.
OCI Accelerated Data Science (ADS) tooling (where available)
- What it does: Oracle-provided Python tooling to support Data Science workflows (packaging, connectors, common ML tasks).
- Why it matters: encourages consistent patterns for moving from notebook → model → deployment.
- Practical benefit: speeds up artifact creation and metadata handling.
- Caveats: verify current ADS capabilities and supported patterns in official docs and repos; don’t assume parity with MLflow/SageMaker tooling.
Identity and Access Management (IAM) integration
- What it does: controls who can create/manage Data Science resources and what notebook/jobs can access.
- Why it matters: ML systems often touch sensitive data; least-privilege is essential.
- Practical benefit: compartment scoping + policies provide strong governance.
- Caveats: misconfigured policies are a common source of “permission denied” errors.
Networking integration (VCN/subnets, private endpoints)
- What it does: allows placing notebooks and deployments in specific network contexts.
- Why it matters: many data sources are private (databases, internal APIs).
- Practical benefit: private inference endpoints and controlled egress reduce exposure.
- Caveats: incorrect route tables/NSGs can block package installs, Object Storage access, or endpoint invocation.
Observability integration (Logging/Monitoring)
- What it does: integrates workloads with OCI monitoring/logging patterns.
- Why it matters: production inference needs SLOs, alerts, and traceability.
- Practical benefit: align ML endpoints with standard ops practices.
- Caveats: ensure you design log retention and protect sensitive data in logs.
7. Architecture and How It Works
High-level service architecture
At a high level:
- You create a Project in a compartment.
- You start a Notebook Session on an OCI compute shape to explore data and train models.
- You store datasets and artifacts in Object Storage (commonly).
- You register a model in the Model Catalog.
- You create a Model Deployment to serve predictions via HTTPS.
Request/data/control flow
- Control plane: OCI API/Console calls create and manage resources (projects, sessions, jobs, models, deployments).
- Data plane (training): notebook or job reads data (Object Storage, DBs), trains model, writes artifacts (Object Storage).
- Data plane (inference): clients call deployment endpoint → service loads model artifact → returns prediction response.
Integrations with related OCI services
Common integrations:
- Object Storage: datasets, model artifacts, logs/outputs
- VCN/Subnets/NSGs: private network placement and access control
- IAM: users/groups/policies, dynamic groups, resource principals
- Vault: storing secrets (API keys, DB passwords) when needed
- Logging/Monitoring: logs, metrics, alarms (verify which metrics are emitted for deployments in your region)
- Container Registry (OCIR): may be used in advanced workflows (verify exact support for custom containers in current docs)
Dependency services
- Compute (for notebooks/jobs/deployments)
- Storage (Object Storage; Block Volumes for notebook storage)
- Networking (VCN, subnets, gateways as required)
- IAM and Audit
Security/authentication model
- Human access via IAM users/groups and compartment-scoped policies.
- Workload access via resource principals (recommended), enabled through dynamic groups + policies.
- API access via OCI SDK/CLI using config files, instance principals, or resource principals (pattern depends on where code runs).
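For workload access, the resource-principal pattern means code running inside a notebook session or job authenticates as the resource itself, with no API keys on disk. A hedged sketch using the OCI Python SDK (the signer and client names follow the SDK as of this writing; the bucket and object names are placeholders):

```python
def read_training_data(bucket, object_name):
    """Fetch an Object Storage object using a resource principal.

    Sketch only: assumes this runs inside an OCI Data Science notebook
    session or job whose dynamic group has been granted read access to
    the bucket via an IAM policy.
    """
    import oci

    # The workload authenticates as itself -- no user API keys required.
    signer = oci.auth.signers.get_resource_principals_signer()
    client = oci.object_storage.ObjectStorageClient(config={}, signer=signer)
    namespace = client.get_namespace().data
    return client.get_object(namespace, bucket, object_name).data.content
```

Outside OCI (for example, on a laptop), the same client would instead be built from a config file and API key; the point of resource principals is removing that credential management inside the platform.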
Networking model
You typically choose one of:
- Public access (simpler) for notebooks and endpoints, with careful IP restrictions if available and strong authentication.
- Private access (recommended for production): notebooks and deployments in private subnets, accessed via bastion/VPN/FastConnect, with controlled egress (NAT) and private service access patterns.
Monitoring/logging/governance considerations
- Tag resources (project, notebook session, model, deployment) with environment and cost center.
- Centralize logs and set retention policies.
- Use alarms on deployment health/latency metrics if available in your region (verify in official docs).
- Use compartments for isolation and limit broad IAM policies.
Simple architecture diagram (learning lab)
flowchart LR
U[User / Data Scientist] -->|OCI Console| DS[OCI Data Science Project]
DS --> NB["Notebook Session (JupyterLab)"]
NB --> OS[(OCI Object Storage)]
NB --> MC[Model Catalog]
MC --> MD["Model Deployment (HTTPS Endpoint)"]
App[Client App] -->|Predict Request| MD
MD -->|Reads artifact| OS
Production-style architecture diagram (enterprise)
flowchart TB
subgraph Tenancy[OCI Tenancy]
subgraph Net["VCN (Prod)"]
subgraph Priv[Private Subnet]
MD[Model Deployment<br/>Private Endpoint]
NB[Notebook/Job Subnet Access]
end
subgraph Sec[Security Controls]
NSG[NSGs / Security Lists]
RT[Route Tables]
NAT[NAT Gateway]
SGW[Service Gateway]
end
end
OS[(Object Storage Bucket<br/>Datasets + Model Artifacts)]
VAULT[OCI Vault]
LOG[Logging]
MON[Monitoring/Alarms]
AUD[Audit Logs]
IAM[IAM Policies<br/>Compartment Isolation]
end
App["Internal Apps (VCN)"] --> MD
MD --> OS
NB --> OS
NB --> VAULT
MD --> LOG
MD --> MON
IAM --> NB
IAM --> MD
AUD --> IAM
Priv --- NSG
Priv --- RT
RT --> NAT
RT --> SGW
8. Prerequisites
OCI account/tenancy requirements
- An active Oracle Cloud tenancy with billing enabled (or Free Tier where applicable).
- Access to a region where Data Science is available. Region availability can change—verify in OCI docs and your tenancy’s region subscriptions.
Permissions / IAM roles
You need permissions to:
- Create/manage Data Science resources (projects, notebook sessions, models, deployments)
- Create/use networking resources (VCN/subnet) if you’re placing sessions/deployments in your own subnets
- Use Object Storage buckets for datasets/artifacts

Typical IAM policy patterns (examples; adjust to least privilege and your compartment structure):
- Allow a group to manage Data Science resources in a compartment (often via a data-science-family policy).
- Allow access to Object Storage (read/write to buckets used for artifacts).
- If using resource principals: create a dynamic group for Data Science resources and grant it access to Object Storage, Vault, etc.
Policy syntax and resource types evolve. Always verify policy examples in official IAM and Data Science docs.
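As a concrete starting point, the patterns above might look like the following policy statements. The group, dynamic-group, and compartment names here are hypothetical placeholders, and you should confirm the exact resource-family names in the current IAM reference:

```
Allow group ds-scientists to manage data-science-family in compartment ds-lab
Allow group ds-scientists to manage objects in compartment ds-lab
Allow dynamic-group ds-lab-resources to read objects in compartment ds-lab
Allow dynamic-group ds-lab-resources to read secret-family in compartment ds-lab
```

The first two grant a human group control over Data Science resources and artifact buckets; the last two grant the workloads themselves (via a dynamic group and resource principals) read access to data and secrets.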
Billing requirements
- Data Science itself is commonly billed based on underlying resources you use (compute/storage/network). Ensure your tenancy can create the required compute shapes.
Tools
- OCI Console (web UI)
- Optional:
- OCI CLI: https://docs.oracle.com/en-us/iaas/Content/API/SDKDocs/cliinstall.htm
- OCI SDK for Python (if building automation)
- Git (for code)
- A local terminal for curl testing of endpoints
Region availability
- Verify Data Science availability in your target region in official OCI documentation or the Console service list.
Quotas/limits
Common limit areas (varies by tenancy/region):
- Maximum number of notebook sessions
- Compute shape quotas (OCPU/GPU)
- Maximum number of model deployments
- Object Storage bucket limits

Check in OCI Console: Governance & Administration → Limits, Quotas and Usage
Prerequisite services
- Object Storage bucket for datasets/model artifacts (recommended)
- VCN/subnet if using private networking (recommended for production)
9. Pricing / Cost
Pricing changes and varies by region and contract. Do not rely on blog posts for exact numbers. Use the official pricing pages and your tenancy’s cost tools.
Current pricing model (how you’re charged)
In Oracle Cloud, Data Science costs typically come from the underlying resources you provision and run, such as:
- Compute for notebook sessions, jobs, and model deployments (OCPU/GPU hours)
- Block Volume for notebook session storage (boot/attached storage depending on configuration)
- Object Storage for datasets, model artifacts, logs, and outputs
- Network egress (data leaving the OCI region to the public internet or cross-region), depending on architecture
As with many OCI services, the control plane itself may not be billed separately; charges accrue through the compute, storage, and network resources your workloads consume. Verify the Data Science section of the official OCI price list for how Oracle currently states this.
Official resources:
- OCI pricing overview / price list: https://www.oracle.com/cloud/price-list/
- OCI cost estimator: https://www.oracle.com/cloud/costestimator.html
Pricing dimensions to understand
| Cost Area | What drives cost | Practical examples |
|---|---|---|
| Notebook Sessions | Shape (CPU/GPU), hours running, attached storage | Leaving a notebook running overnight is a common cost leak |
| Jobs | Shape, runtime duration, number of runs | Scheduled training daily vs weekly can multiply cost |
| Model Deployments | Shape, number of instances (if supported), uptime | 24/7 deployments cost more than on-demand |
| Object Storage | GB stored, requests | Large datasets + many artifacts over time |
| Block Volume | GB provisioned, performance tier (if applicable) | Over-provisioning notebook volumes |
| Network | Egress to internet/cross-region | Calling endpoints from outside OCI; downloading large artifacts |
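To make the compute dimension concrete, a back-of-the-envelope estimator helps show why "hours running" dominates. The rates below are placeholders, not Oracle prices; substitute real per-OCPU-hour and per-GB-month figures from the official price list:

```python
# Placeholder unit rates -- NOT real OCI prices; look them up on the
# official price list for your region before budgeting.
OCPU_HOUR_RATE = 0.05    # hypothetical USD per OCPU-hour
STORAGE_GB_MONTH = 0.03  # hypothetical USD per GB-month

def monthly_estimate(ocpus, hours_per_day, days, storage_gb):
    """Rough monthly cost: compute hours plus storage."""
    compute = ocpus * OCPU_HOUR_RATE * hours_per_day * days
    storage = storage_gb * STORAGE_GB_MONTH
    return round(compute + storage, 2)

# An always-on 1-OCPU endpoint vs. one stopped outside working hours.
always_on = monthly_estimate(1, 24, 30, 50)
work_hours = monthly_estimate(1, 8, 22, 50)
print(always_on, work_hours)
```

Whatever the real rates, the ratio between the two scenarios is what matters: stopping idle sessions and deleting unused deployments cuts the dominant term directly.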
Free Tier (if applicable)
Oracle Cloud offers a Free Tier program, but eligibility and included services change. Verify current Free Tier coverage for Data Science-related resources (compute shapes, storage) in Oracle’s Free Tier documentation.
Cost drivers (what surprises people)
- Always-on model deployments: online endpoints running continuously can be the biggest driver.
- Idle notebooks: developers forget to stop sessions.
- Over-sized shapes: using GPU shapes for CPU-only workloads.
- Data transfer: large dataset movement across regions or to on-prem without planning.
- Storage sprawl: repeated model artifacts and intermediate outputs left in Object Storage.
Hidden/indirect costs
- Logging and monitoring retention (if you export logs or keep long retention)
- CI/CD runners or external build systems
- NAT Gateway and outbound traffic costs (architecture dependent)
- Data labeling tools (if you add a separate labeling workflow)
Network/data transfer implications
- Keep datasets and endpoints in the same region where possible.
- Prefer private connectivity (FastConnect/VPN) or in-VCN calling patterns for sensitive workloads.
- Minimize cross-region transfers for large artifacts.
How to optimize cost
- Use the smallest practical notebook shape; scale up only when necessary.
- Stop notebook sessions when not in use.
- Use jobs for batch workloads and schedule appropriately.
- Delete unused model deployments; recreate when needed.
- Use lifecycle policies on Object Storage buckets to transition or delete old artifacts.
- Tag resources and review cost reports by tag/compartment.
Example low-cost starter estimate (no fabricated numbers)
A low-cost learning setup typically includes:
- One small CPU notebook session running only during lab time
- A small Object Storage bucket for a few MB–GB of artifacts
- No GPU shapes
- A model deployment created briefly for validation, then deleted
To get real numbers:
1. Select your region and shapes in the OCI Cost Estimator: https://www.oracle.com/cloud/costestimator.html
2. Add compute for notebook + deployment hours, plus storage.
Example production cost considerations
In production, plan for:
- Always-on model deployment(s) with predictable baseline usage
- Separate staging and prod deployments
- Monitoring/alerting overhead
- Retraining jobs (daily/weekly) on larger shapes
- Data retention policies for model artifacts and training data snapshots
10. Step-by-Step Hands-On Tutorial
This lab builds and deploys a small ML model end-to-end using Oracle Cloud Data Science. It’s designed to be beginner-friendly and low-cost by using CPU shapes and a small dataset.
Objective
- Create an OCI Data Science Project
- Launch a Notebook Session (JupyterLab)
- Train a simple scikit-learn model
- Package a minimal inference script
- Register the model in the Model Catalog
- Create a Model Deployment
- Invoke the endpoint and validate predictions
- Clean up resources to avoid ongoing costs
Lab Overview
You will:
1. Create a project
2. Create a notebook session
3. Train and export a model artifact (model.joblib) and scoring code (score.py)
4. Register the model
5. Deploy it as an HTTPS endpoint
6. Test with curl
7. Delete resources
Notes before you start:
- Console screens can change. Use the service navigation and search if labels differ.
- If any deployment packaging requirements differ in your region/tenancy (for example, required artifact structure), verify in official docs and adjust accordingly.
Step 1: Create a compartment (recommended) and a project
Goal: keep lab resources isolated for cleanup and governance.
- In OCI Console, create or choose a compartment (for example: ds-lab).
- Navigate to Analytics & AI → Data Science.
- Click Projects → Create project.
- Name: ds-iris-lab (or similar)
- Select the compartment (for example: ds-lab)
- Create.
Expected outcome: A Data Science project exists and appears in the Projects list.
Verification: Open the project details page and confirm it’s in the correct compartment.
Step 2: Create a notebook session (JupyterLab)
Goal: get a managed environment to run Python ML.
- Inside your project, go to Notebook Sessions → Create notebook session.
- Provide:
  - Name: iris-notebook
  - Compute shape: choose a small CPU shape (avoid GPU for this lab).
  - Networking:
    - For a quick lab, you may use a basic/public option if offered.
    - For stricter security, select a VCN + subnet you control.
    - If you place it in a private subnet, ensure it has access to required OCI services and package repositories (via NAT/service gateway patterns as appropriate).
- Create the notebook session.
- Wait until the notebook session status is Active (or equivalent).
- Click Open (JupyterLab).
Expected outcome: JupyterLab opens in your browser.
Verification: In JupyterLab, open a Terminal and run:
python3 --version
You should see a Python version output.
Step 3: Train a simple model in the notebook
Goal: create a working model artifact with minimal dependencies.
In JupyterLab, create a new notebook (Python 3). Run the following cells.
3.1 Install/verify Python packages
Many notebook environments already include these. If not, install them:
import sys
!{sys.executable} -m pip install -q scikit-learn joblib pandas numpy
Expected outcome: Packages install successfully.
Verification: Import works:
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
import joblib
3.2 Train and export the model
# Load a small built-in dataset
iris = load_iris()
X = iris["data"]
y = iris["target"]
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=0.2, random_state=42, stratify=y
)
# Simple model
clf = LogisticRegression(max_iter=200)
clf.fit(X_train, y_train)
pred = clf.predict(X_test)
acc = accuracy_score(y_test, pred)
acc
Expected outcome: You get an accuracy value (commonly > 0.8 for this simple setup).
Export the model:
joblib.dump(clf, "model.joblib")
Expected outcome: A file named model.joblib exists in the notebook working directory.
Verification:
!ls -lh model.joblib
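Beyond checking that the file exists, it’s worth confirming the dumped artifact reloads cleanly and still predicts. The round-trip check below is self-contained (it retrains the same tiny model so the snippet runs anywhere, including outside the notebook session):

```python
import joblib
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Train the same model as in the lab and write the artifact.
iris = load_iris()
clf = LogisticRegression(max_iter=200).fit(iris["data"], iris["target"])
joblib.dump(clf, "model.joblib")

# Reload from disk and confirm the restored model predicts identically.
restored = joblib.load("model.joblib")
pred = restored.predict(np.array([[5.1, 3.5, 1.4, 0.2]]))
print(pred.tolist())  # the first iris sample is class 0 (setosa)
```

Catching a broken or version-mismatched artifact here is much cheaper than debugging a failed model deployment later.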
Step 4: Create a minimal scoring script (score.py)
Goal: provide inference code that the deployment runtime can call.
Create a file named score.py in the same directory as model.joblib.
In JupyterLab, you can create a new text file, or use a notebook cell:
score_py = r'''
import json
import joblib
import numpy as np

# Load the model once at startup
MODEL = joblib.load("model.joblib")

def predict(data):
    """
    A simple prediction function.
    Expected input (JSON):
    {
        "data": [[5.1, 3.5, 1.4, 0.2]]
    }
    Output:
    {
        "predictions": [0]
    }
    """
    # Accept either dict or JSON string depending on runtime caller
    if isinstance(data, (str, bytes)):
        payload = json.loads(data)
    else:
        payload = data
    X = np.array(payload["data"], dtype=float)
    y = MODEL.predict(X)
    return {"predictions": y.tolist()}
'''
with open("score.py", "w") as f:
    f.write(score_py)
!ls -lh score.py
Expected outcome: score.py exists.
Verification test locally:
import score
score.predict({"data": [[5.1, 3.5, 1.4, 0.2]]})
You should see a response like:
{'predictions': [0]}
Important: OCI Data Science model deployments can have specific handler signatures and artifact structures. If your deployment later fails because it can’t find the correct entry point, verify the required scoring interface in the official docs and adjust the script accordingly.
Step 5: Package the model artifact for upload
Goal: create a single archive with model + scoring code.
Create a zip file:
!zip -r model_artifact.zip model.joblib score.py
!ls -lh model_artifact.zip
Expected outcome: model_artifact.zip exists.
Step 6: Register the model in the Model Catalog
There are multiple ways to register a model. The most universally accessible approach for beginners is to download the artifact and upload it via the OCI Console.
6.1 Download the artifact from JupyterLab
- In the JupyterLab file browser, right-click model_artifact.zip
- Choose Download
- Save it locally
Expected outcome: The zip file is on your machine.
6.2 Create a model entry in the project
- In OCI Console → Data Science → your project
- Go to Models (Model Catalog within the project/compartment context)
- Click Create model
- Provide:
– Name: iris-logreg
– Description: Iris classifier logistic regression
– Artifact: upload model_artifact.zip
- Click Create.
Expected outcome: A model appears in the Models list with a version and artifact stored (typically in Object Storage managed/linked by OCI).
Verification:
– Open the model details and confirm the artifact is present and the model state is available/active.
Step 7: Create a model deployment (online endpoint)
Goal: serve predictions via HTTPS.
- In the model details page, choose Create deployment (or navigate to Model Deployments and create from model).
- Provide:
– Name: iris-endpoint
– Compute shape: small CPU shape (for low cost)
– Replica count / scaling: keep minimal (often 1)
– Networking:
- For quick validation, a public endpoint may be easiest.
- For production, use a private endpoint in a VCN subnet and call it from inside the network.
- Create deployment.
- Wait until the deployment is Active.
Expected outcome: The deployment shows an endpoint URL.
Verification:
– Confirm deployment health/status is successful.
– Note the endpoint URL and any required authentication method.
Authentication note: Depending on your deployment configuration, invoking the endpoint may require OCI IAM auth (request signing) or another supported method. If the console provides a “Test” feature, use it first. If you need request signing, verify the official invocation method in the Data Science docs for your deployment type.
Step 8: Invoke the endpoint (prediction test)
Use either:
– The Console “Test” option (if available), or
– curl from an environment that can reach the endpoint (your laptop for public endpoints; a VM in the VCN for private endpoints).
Example request body
{
"data": [[5.1, 3.5, 1.4, 0.2]]
}
If the console provides a built-in test
- Paste the JSON body
- Run prediction
Expected outcome: Response includes predictions with a class id (0, 1, or 2).
If using curl (endpoint must be reachable, auth must match)
Because authentication requirements can vary, a generic unauthenticated curl might fail with 401/403. If your deployment is configured to require OCI IAM signed requests, you’ll need an SDK/CLI-based signed call pattern.
If your deployment supports a simple HTTPS call (verify in your console/docs), it may look like:
curl -X POST "https://<your-model-deployment-endpoint>" \
-H "Content-Type: application/json" \
-d '{"data": [[5.1, 3.5, 1.4, 0.2]]}'
Expected outcome: JSON response with predictions.
If this step fails with authorization errors, do not weaken security. Instead, use the official signed-request method or call from an authorized OCI environment. Verify the correct invocation procedure in the Data Science model deployment documentation.
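When calling from code, it can help to separate the portable payload assembly from the OCI-specific signing. The helper below is a local sketch: build_invoke_request is an illustrative name, the endpoint URL is a placeholder, and the commented signing snippet should be checked against the current OCI Python SDK documentation before use.

```python
import json

def build_invoke_request(endpoint_url, feature_rows):
    """Assemble the URL, headers, and JSON body for a prediction call."""
    body = json.dumps({"data": feature_rows})
    headers = {"Content-Type": "application/json"}
    return endpoint_url, headers, body

url, headers, body = build_invoke_request(
    "https://modeldeployment.example.oraclecloud.com/predict",  # placeholder URL
    [[5.1, 3.5, 1.4, 0.2]],
)
print(body)  # {"data": [[5.1, 3.5, 1.4, 0.2]]}

# With the OCI Python SDK (verify in official docs), the signed call is roughly:
#   import oci, requests
#   config = oci.config.from_file()
#   signer = oci.signer.Signer(
#       tenancy=config["tenancy"], user=config["user"],
#       fingerprint=config["fingerprint"],
#       private_key_file_location=config["key_file"])
#   resp = requests.post(url, data=body, headers=headers, auth=signer)
```

The signed-request portion is deliberately left as a comment: it requires valid OCI credentials and network reachability, and the exact invocation method depends on your deployment configuration.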
Validation
You have successfully completed the lab if:
– The notebook session ran training code and created model.joblib
– The model artifact zip uploaded to the Model Catalog successfully
– The model deployment reached an Active/Healthy state
– A test invocation returned a prediction
Quick checklist:
– [ ] Project created
– [ ] Notebook session Active and JupyterLab accessible
– [ ] Model trained, model.joblib created
– [ ] score.py works locally
– [ ] model_artifact.zip uploaded to Model Catalog
– [ ] Deployment Active
– [ ] Endpoint returns predictions
Troubleshooting
Common issues and fixes:
- Notebook session won’t start (shape unavailable / quota exceeded)
  – Check Limits, Quotas and Usage
  – Choose a smaller shape
  – Request a quota increase if needed
- Can’t install Python packages
  – If the notebook is in a private subnet, ensure outbound access (NAT) or access to an approved package repository
  – Consider using preinstalled curated environments
  – Verify DNS and route tables
- Model deployment fails to become active
  – Artifact structure may not match required runtime expectations
  – Ensure model.joblib and score.py are at the root of the zip (as you packaged them)
  – Verify the required handler signature/entrypoint in official docs for your deployment type
  – Check deployment logs (if available) in OCI Logging
- 401/403 when invoking the endpoint
  – The endpoint likely requires OCI IAM signed requests
  – Use the Console test feature or the official signed-request procedure
  – Ensure the caller is authorized and the network path is correct
- Timeouts when calling the endpoint
  – If it is a private endpoint, call from within the VCN (or through VPN/FastConnect)
  – Check NSGs/security lists and route tables
  – Confirm DNS resolution and that the endpoint is reachable
Cleanup
To avoid ongoing costs, clean up in this order:
- Delete the model deployment (iris-endpoint)
  – This stops continuous compute billing for online inference.
- Stop and delete the notebook session
  – Stop first if required; then delete.
- Delete model(s) from the Model Catalog (optional)
  – If you want to remove artifacts and metadata.
- Delete Object Storage artifacts/buckets (if you created your own)
  – Ensure no required data remains.
- Delete the project (optional)
- Delete the compartment (only if it contains nothing else)
11. Best Practices
Architecture best practices
- Separate dev/test/prod using compartments (and often separate VCNs/subnets).
- Keep data, training, and deployments in the same region to reduce latency and transfer costs.
- Use jobs for repeatable training and batch inference rather than long-running notebooks.
- Use Object Storage as the durable artifact store; treat notebook storage as ephemeral.
IAM/security best practices
- Apply least privilege policies at the compartment level.
- Prefer resource principals for notebooks/jobs accessing OCI services (instead of embedding API keys).
- Use dynamic groups to scope workload identities and restrict what they can access.
- Enforce tagging policies for ownership and environment classification.
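As a concrete (hypothetical) illustration of the resource-principal pattern above, a least-privilege setup typically pairs a dynamic group matching rule with narrowly scoped policy statements. The group and compartment names below are placeholders, and the exact resource-type names should be verified in the OCI IAM documentation:

```
# Dynamic group matching rule (placeholder compartment OCID):
ALL {resource.type = 'datasciencenotebooksession',
     resource.compartment.id = '<dev-compartment-ocid>'}

# Policy statements scoped to a single compartment:
Allow dynamic-group ds-dev-workloads to read objects in compartment ds-dev
Allow dynamic-group ds-dev-workloads to manage data-science-models in compartment ds-dev
```

Note that nothing here grants tenancy-wide access: the dynamic group only matches notebook sessions in one compartment, and the policies only allow what the workload actually needs.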
Cost best practices
- Default to small CPU shapes; scale up only when evidence justifies it.
- Stop notebooks immediately after use.
- Delete model deployments when not needed (especially in dev).
- Apply Object Storage lifecycle policies to old artifacts and logs.
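For the lifecycle point, a single rule that expires stale artifacts might look like the sketch below (the rule name, 90-day retention, and `artifacts/` prefix are placeholders; verify the current lifecycle policy schema in the Object Storage documentation):

```json
{
  "items": [
    {
      "name": "expire-old-model-artifacts",
      "action": "DELETE",
      "timeAmount": 90,
      "timeUnit": "DAYS",
      "isEnabled": true,
      "objectNameFilter": { "inclusionPrefixes": ["artifacts/"] }
    }
  ]
}
```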
Performance best practices
- Right-size shapes based on dataset size and algorithm needs.
- Use GPUs only when the training workload benefits (deep learning, large compute).
- Cache intermediate datasets carefully; avoid repeated downloads/processing.
Reliability best practices
- Use jobs and versioned artifacts so you can reproduce a model build.
- Keep multiple model versions in the catalog; promote via controlled processes.
- Plan rollback: deployment should be able to revert to the last known good model.
Operations best practices
- Centralize logs and define retention.
- Set alarms for deployment health and latency if metrics are available (verify).
- Document runbooks for common failures: deployment build errors, endpoint auth errors, quota issues.
Governance/tagging/naming best practices
- Naming conventions (example):
  – Project: ds-<team>-<env>-<project>
  – Notebook: nb-<project>-<user>
  – Model: mdl-<usecase>-<version>
  – Deployment: dep-<usecase>-<env>
- Tagging keys: CostCenter, Owner, Environment, DataSensitivity
12. Security Considerations
Identity and access model
- Human access: OCI IAM users/groups and policies.
- Workload access: resource principals via dynamic groups (recommended).
- Avoid distributing long-lived API keys to notebooks unless necessary.
Encryption
- OCI services generally provide encryption at rest and TLS in transit.
- Confirm KMS/Vault integrations and encryption specifics for:
- Object Storage buckets
- Block volumes used by notebook sessions
- Deployment endpoints
Verify in official docs for your region and configuration.
Network exposure
- Prefer private subnets for:
- notebooks accessing sensitive data
- production model deployments
- Use NSGs/security lists to restrict inbound/outbound traffic.
- Avoid public endpoints unless required; if public, enforce strict auth and monitoring.
Secrets handling
- Store secrets in OCI Vault, not in notebooks or code.
- Avoid printing secrets in logs.
- Rotate credentials and use short-lived access patterns where possible.
Audit/logging
- Use OCI Audit to track API actions (who created deployments, changed policies, etc.).
- Enable and review logs for deployment failures and invocation patterns.
- Treat inference request/response logs as sensitive; avoid logging PII.
Compliance considerations
- Data residency: keep regulated data in approved regions.
- Access reviews: periodic review of IAM policies and dynamic groups.
- Artifact governance: model artifacts may embed training data patterns; control distribution.
Common security mistakes
- Leaving notebooks running in public networks with weak access controls
- Overbroad policies like “manage all-resources in tenancy”
- Public model endpoints without strong authentication and rate limiting (where applicable)
- Logging sensitive payloads
Secure deployment recommendations
- Private endpoints for production deployments
- Signed requests / IAM-based auth (or the officially recommended secure auth pattern)
- Strict compartment isolation and CI/CD-based promotion
- Regular patching/updates of environments and dependencies
13. Limitations and Gotchas
Specific limits change over time and differ by region/tenancy. Always check OCI limits and current Data Science docs.
Known limitations / common gotchas
- Quota limits on shapes (OCPU/GPU) can block notebook sessions or deployments.
- Packaging requirements for model deployment artifacts can be strict (entrypoints, file layout).
- Network configuration can break common workflows:
- private subnet without NAT/service gateway may block installs or Object Storage access
- Always-on deployment cost can accumulate quickly.
- Artifact sprawl: too many model versions and intermediate outputs in Object Storage.
- Auth mismatch: endpoint invocation often fails when callers don’t use the required auth/signing method.
- Region feature skew: not all regions get the same feature updates at the same time (verify in docs).
Migration challenges
- Moving from other platforms often requires:
- repackaging model artifacts
- reworking pipelines around OCI IAM and Object Storage
- rebuilding CI/CD and monitoring patterns for OCI
Vendor-specific nuances
- OCI compartments are central to governance; design them early.
- Resource principals/dynamic groups are powerful, but require careful policy design to avoid privilege creep.
14. Comparison with Alternatives
Within Oracle Cloud
- Oracle Machine Learning (OML) in Autonomous Database: strong when your data is in the database and you want in-db ML patterns.
- OCI AI Services: better if you want prebuilt AI APIs (vision, language, speech) rather than building your own models.
- Oracle Analytics Cloud: analytics and BI platform; not a substitute for full ML development/deployment workflows.
Other clouds
- AWS SageMaker, Azure Machine Learning, Google Vertex AI provide similar managed ML platform capabilities with cloud-specific integrations.
Open-source/self-managed
- Kubeflow, MLflow + Kubernetes, self-managed JupyterHub can provide flexibility but require more operations.
Comparison table
| Option | Best For | Strengths | Weaknesses | When to Choose |
|---|---|---|---|---|
| Oracle Cloud Data Science | Teams building/deploying custom ML models on OCI | Tight OCI integration (IAM/VCN/Object Storage), managed notebooks & deployments | Cloud-specific workflows; packaging/deployment patterns must be learned | You run on OCI and want managed ML lifecycle + secure OCI-native ops |
| Oracle Machine Learning (OML) | ML close to Autonomous Database data | In-database ML and governance around DB | Not a general-purpose model serving platform | Data stays in ADB and you want ML without moving it |
| OCI AI Services | Using pretrained AI APIs | Fast time-to-value, no training required | Limited to offered APIs/capabilities | You need OCR/NLP/vision APIs more than custom model training |
| AWS SageMaker | AWS-native ML platform | Mature ecosystem, many managed features | AWS coupling; migration effort | Your infrastructure is primarily on AWS |
| Azure Machine Learning | Azure-native ML platform | Strong MLOps integrations | Azure coupling; migration effort | Your infrastructure is primarily on Azure |
| Google Vertex AI | GCP-native ML platform | Unified training + serving + MLOps | GCP coupling; migration effort | Your infrastructure is primarily on GCP |
| Kubeflow / MLflow (self-managed) | Maximum control and portability | Flexible, open tooling | High ops burden (K8s, upgrades, security) | You have platform engineering capacity and need cloud portability |
15. Real-World Example
Enterprise example: Private credit risk scoring platform
- Problem: A bank needs to deploy a credit risk model with strong security controls, private data sources, and auditability.
- Proposed architecture:
- Data stored in Autonomous Database and Object Storage (feature snapshots)
- Data Science jobs run training monthly in a controlled subnet
- Model artifacts registered in the Model Catalog
- Model deployments exposed via private endpoints in a VCN
- Access controlled by IAM policies and dynamic groups (resource principals)
- Logs routed to OCI Logging; alarms on error rates/latency
- Why Data Science was chosen: OCI-native identity/networking integration aligns with strict security requirements; managed deployment avoids self-hosting.
- Expected outcomes:
- Reduced time to deploy model updates
- Clear audit trail for model versions and promotions
- Private inference with controlled access paths
Startup/small-team example: Churn model MVP
- Problem: A SaaS startup wants a churn prediction endpoint for internal dashboards and customer success workflows.
- Proposed architecture:
- Weekly export of customer metrics to Object Storage
- Data scientist trains in a small notebook session and runs scheduled jobs
- Model registered in the catalog; deployment kept small and scaled minimally
- Endpoint called by an internal service
- Why Data Science was chosen: quick setup, low operational overhead, and direct path from notebook to deployment.
- Expected outcomes:
- MVP endpoint in days instead of weeks
- Controlled costs by stopping notebooks and deleting/recreating deployments as needed
- A foundation for future MLOps improvements
16. FAQ
- Is “Data Science” the official Oracle Cloud service name?
  Yes. In Oracle Cloud Infrastructure, the service is commonly documented as OCI Data Science. This tutorial uses “Data Science” to mean that OCI service.
- Is Data Science a fully managed ML platform like SageMaker/Vertex AI?
  It provides managed notebooks, jobs, a model catalog, and model deployments. Exact feature parity differs across clouds—evaluate based on your workflow needs.
- Do I pay separately for Data Science?
  Costs typically come from the underlying compute, storage, and network resources used by notebooks, jobs, and deployments. Confirm current billing behavior in the official OCI price list.
- What’s the difference between a notebook session and a job?
  Notebook sessions are interactive (ideal for exploration). Jobs are for repeatable, managed runs (better for scheduled training and batch inference).
- Where should I store training data and artifacts?
  Commonly in OCI Object Storage. Treat notebook storage as non-authoritative and keep artifacts versioned.
- Can I deploy private endpoints for inference?
  Private networking is a common production pattern. Exact configuration depends on your VCN/subnet setup and current Data Science deployment options—verify in official docs.
- How do I authenticate a notebook to access Object Storage securely?
  Prefer resource principals with dynamic groups and least-privilege policies. Avoid embedding API keys in notebooks.
- How do I version models?
  Use the Model Catalog and adopt naming/version metadata conventions. Store training metadata and code commit references alongside model versions.
- Can I run GPU training?
  OCI supports GPU shapes, but availability depends on region and quota. Use GPUs when the workload benefits (deep learning, large training jobs).
- How do I monitor model deployments?
  Use OCI monitoring/logging where supported. Track latency, error rates, and request volume; define alarms and runbooks. Verify available metrics in your region.
- What’s the best way to reduce cost?
  Stop notebooks when idle, keep deployments minimal, delete unused endpoints, and avoid over-sizing shapes.
- Can I integrate CI/CD for model promotion?
  Yes, using OCI DevOps or external CI systems, with artifacts stored in Object Storage and controlled promotion processes. Implementation details vary.
- How do I keep sensitive data out of logs?
  Don’t log raw payloads by default. Mask PII, control log retention, and restrict log access via IAM.
- Can I call a model deployment from outside OCI?
  If the endpoint is public and authentication allows it, yes. For sensitive workloads, prefer private access (VPN/FastConnect or in-VCN callers).
- What causes most deployment failures?
  Packaging/entrypoint mismatches, missing dependencies, incorrect artifact structure, and networking restrictions during startup.
- How do compartments help?
  Compartments provide isolation boundaries for IAM policies, cost reporting, and environment separation (dev/test/prod).
- Can Data Science replace a feature store or full MLOps suite?
  It provides core ML workflow pieces, but you may still need additional governance, feature management, drift monitoring, and CI/CD patterns depending on requirements.
17. Top Online Resources to Learn Data Science
| Resource Type | Name | Why It Is Useful |
|---|---|---|
| Official documentation | OCI Data Science Docs — https://docs.oracle.com/en-us/iaas/data-science/using/ | Primary source for current features, concepts, and step-by-step guidance |
| Official pricing | OCI Price List — https://www.oracle.com/cloud/price-list/ | Authoritative pricing reference (region/SKU dependent) |
| Pricing calculator | OCI Cost Estimator — https://www.oracle.com/cloud/costestimator.html | Estimate costs for notebook shapes, deployments, storage, and network |
| Official CLI docs | OCI CLI Installation — https://docs.oracle.com/en-us/iaas/Content/API/SDKDocs/cliinstall.htm | Automate Data Science and related OCI resources |
| Architecture guidance | OCI Architecture Center — https://docs.oracle.com/en/solutions/ | Reference architectures for OCI networking, security, and deployment patterns |
| Hands-on labs | Oracle LiveLabs — https://livelabs.oracle.com/ | Guided labs; search for “Data Science” and related ML labs |
| Official samples (GitHub) | Oracle OCI Data Science AI Samples — https://github.com/oracle/oci-data-science-ai-samples | Practical notebooks and examples aligned to OCI Data Science tooling |
| Observability docs | OCI Logging — https://docs.oracle.com/en-us/iaas/Content/Logging/home.htm | Learn how to route and manage logs from OCI services |
| IAM docs | OCI IAM — https://docs.oracle.com/en-us/iaas/Content/Identity/home.htm | Policies, dynamic groups, least-privilege patterns |
| Community learning | Oracle Cloud Community — https://community.oracle.com/ | Discussions and real-world tips (validate against official docs) |
18. Training and Certification Providers
| Institute | Suitable Audience | Likely Learning Focus | Mode | Website URL |
|---|---|---|---|---|
| DevOpsSchool.com | DevOps, cloud engineers, platform teams, beginners | OCI fundamentals, DevOps/MLOps adjacent practices, structured training | Check website | https://www.devopsschool.com/ |
| ScmGalaxy.com | Students, engineers learning tooling and platforms | Software lifecycle, DevOps and platform practices that can support ML delivery | Check website | https://www.scmgalaxy.com/ |
| CLoudOpsNow.in | Cloud operations and engineering roles | Cloud operations, reliability, governance and cost practices | Check website | https://www.cloudopsnow.in/ |
| SreSchool.com | SREs, operations, reliability engineers | Monitoring, reliability, incident response patterns applicable to ML services | Check website | https://www.sreschool.com/ |
| AiOpsSchool.com | Ops + AI/automation learners | AIOps concepts, operational analytics, automation foundations | Check website | https://www.aiopsschool.com/ |
19. Top Trainers
| Platform/Site | Likely Specialization | Suitable Audience | Website URL |
|---|---|---|---|
| RajeshKumar.xyz | DevOps/cloud training and guidance (verify offerings) | Beginners to intermediate engineers | https://rajeshkumar.xyz/ |
| devopstrainer.in | DevOps training platform (verify OCI coverage) | DevOps engineers and cloud practitioners | https://www.devopstrainer.in/ |
| devopsfreelancer.com | Freelance DevOps consulting/training marketplace (verify scope) | Teams needing practical guidance | https://www.devopsfreelancer.com/ |
| devopssupport.in | DevOps support/training services (verify scope) | Engineers seeking hands-on operational help | https://www.devopssupport.in/ |
20. Top Consulting Companies
| Company | Likely Service Area | Where They May Help | Consulting Use Case Examples | Website URL |
|---|---|---|---|---|
| cotocus.com | Cloud/DevOps consulting (verify specific OCI services) | Architecture, automation, platform setup | Cost governance, IaC foundations, secure networking for ML platforms | https://cotocus.com/ |
| DevOpsSchool.com | Training and consulting services | Enablement + implementation support | Setting up OCI landing zones, IAM patterns, operational runbooks for Data Science deployments | https://www.devopsschool.com/ |
| DEVOPSCONSULTING.IN | DevOps consulting services | CI/CD, observability, operations | Building deployment pipelines, monitoring/alerting integration, policy and tagging standards | https://www.devopsconsulting.in/ |
21. Career and Learning Roadmap
What to learn before Data Science (recommended prerequisites)
- OCI fundamentals: compartments, VCN, IAM policies, Object Storage
- Basic Linux and networking concepts (subnets, routing, security lists/NSGs)
- Python fundamentals: functions, packages, environments
- ML basics: supervised learning, train/test split, evaluation metrics
- Git basics for version control
What to learn after Data Science (to become production-ready)
- MLOps practices: reproducible training, artifact versioning, promotion workflows
- Infrastructure as Code (Terraform on OCI) for repeatable environments
- Secure networking: private endpoints, service gateways, bastions
- Observability: logging, metrics, alarms, SLOs
- Model governance: drift monitoring patterns, bias testing, lineage (may require additional tooling)
Job roles that use it
- Data Scientist
- Machine Learning Engineer
- Cloud Engineer (Analytics/AI)
- DevOps Engineer / Platform Engineer supporting ML platforms
- SRE/Operations Engineer for ML inference services
- Security Engineer (IAM, network security for AI workloads)
Certification path (if available)
Oracle certification offerings change. Check Oracle University / OCI certification listings for current credentials relevant to:
– OCI foundations
– OCI architect tracks
– Analytics/AI specialization
Verify current certification options in official Oracle training portals.
Project ideas for practice
- Build and deploy a churn prediction endpoint using Object Storage datasets
- Create a scheduled job that retrains a model weekly and registers a new model version
- Secure a deployment with private networking and demonstrate in-VCN invocation
- Implement cost tagging + cleanup automation for non-prod deployments
- Create a lightweight CI pipeline that packages a model artifact and triggers deployment updates (verify supported automation patterns)
22. Glossary
- OCI: Oracle Cloud Infrastructure.
- Compartment: OCI governance boundary for organizing resources and applying IAM policies.
- Project (Data Science): Logical container for Data Science resources.
- Notebook Session: Managed Jupyter environment on OCI compute.
- Job (Data Science): Managed execution of code for repeatable training/inference.
- Model Catalog: Registry of model artifacts and metadata used for governance and deployment.
- Model Artifact: Packaged files needed for inference (model binary, scoring code, metadata).
- Model Deployment: Managed online inference endpoint hosting a model.
- VCN: Virtual Cloud Network—your private network in OCI.
- Subnet: A segment of a VCN where resources are placed.
- NSG: Network Security Group—virtual firewall rules applied to resources.
- Dynamic Group: IAM construct for grouping resources (workloads) by matching rules.
- Resource Principal: Workload identity mechanism for OCI resources to call OCI APIs without user API keys.
- Object Storage: OCI service for storing unstructured data (datasets, artifacts).
- OCPU: Oracle CPU unit used for compute billing and sizing.
- Egress: Outbound network traffic leaving a region or to the public internet (potential cost driver).
23. Summary
Oracle Cloud Data Science in the Analytics and AI category is OCI’s managed service for building, training, registering, and deploying machine learning models. It fits best when you want an OCI-native workflow: notebooks and jobs for development and repeatable runs, a model catalog for governance, and model deployments for online inference—secured with OCI IAM and VCN networking.
Cost management is primarily about controlling compute runtime (notebooks and always-on deployments), right-sizing shapes, and governing storage and artifacts in Object Storage. Security success depends on compartment isolation, least-privilege IAM policies, resource principals/dynamic groups, private networking where appropriate, and disciplined logging/auditing.
Next step: follow the official documentation for your region/tenancy and extend this lab into a repeatable pipeline using jobs, versioned artifacts, and production-grade IAM/networking patterns: https://docs.oracle.com/en-us/iaas/data-science/using/