Category
Analytics and AI
1. Introduction
Oracle Cloud Data Science (often referred to in documentation as OCI Data Science) is a managed service for building, training, evaluating, deploying, and operating machine learning (ML) models on Oracle Cloud Infrastructure (OCI). It provides cloud-native workflows—projects, notebook sessions, jobs, models, and model deployments—so teams can move from experimentation to production without assembling every component from scratch.
In simple terms: Data Science gives you a managed Jupyter-based environment to explore data and train models, and a managed deployment mechanism to expose trained models as scalable endpoints—while integrating with core OCI services such as Object Storage, IAM, Vault, Logging, and Monitoring.
Technically, Data Science is an OCI control-plane service that orchestrates ML workloads on OCI compute shapes (CPU/GPU). It tracks ML assets (notebooks, jobs, models), supports reproducible environments (Conda environments and curated runtimes), and provides managed online inference through model deployments. You typically store datasets and model artifacts in OCI Object Storage and secure everything through OCI IAM policies and network controls (VCNs, subnets, security lists/NSGs).
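All of these control-plane operations (creating projects, notebook sessions, jobs, deployments) are exposed through the OCI API. As a quick illustration, here is a hedged sketch of listing projects with the OCI Python SDK; client and method names follow the SDK as of this writing, so verify against the current SDK reference before relying on them:

```python
def list_data_science_projects(compartment_id):
    """List Data Science project names in a compartment via the control-plane API.

    Sketch only: assumes the OCI Python SDK (`pip install oci`) and a
    configured ~/.oci/config file with a valid API signing key.
    """
    import oci

    config = oci.config.from_file()  # default profile in ~/.oci/config
    client = oci.data_science.DataScienceClient(config)
    projects = client.list_projects(compartment_id=compartment_id).data
    return [p.display_name for p in projects]
```

The same operations are available in the Console and CLI; the SDK form matters mainly for automation and CI/CD.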
What problem it solves: teams often struggle with “last-mile ML” problems—consistent environments, repeatable training runs, artifact management, secure deployments, and operational monitoring. Data Science reduces that friction by standardizing these workflows on Oracle Cloud while keeping integration points (Object Storage, VCN, IAM) explicit and governable.
2. What is Data Science?
Official purpose
Oracle Cloud Data Science is designed to help you build, train, deploy, and manage machine learning models using managed notebook-based development, managed/batch execution (jobs), and managed model deployments for online inference—integrated with OCI’s identity, networking, observability, and storage foundations.
Service naming note: The current service is commonly labeled OCI Data Science in Oracle documentation. In this tutorial, “Data Science” refers specifically to Oracle Cloud Infrastructure Data Science.
Core capabilities (what you can do)
- Create Projects to organize ML assets (notebooks, jobs, models)
- Use managed Notebook Sessions (JupyterLab) for exploration and training
- Run repeatable Jobs for batch training/inference and scheduled runs
- Register trained models in a Model Catalog
- Create Model Deployments (managed HTTPS endpoints) for online inference
- Use curated Conda environments and Oracle’s ML tooling (for example, the OCI Accelerated Data Science library—often referenced as ADS in Oracle materials)
- Integrate with OCI services: Object Storage, IAM, VCN, Logging, Monitoring, Vault, Container Registry (where applicable)
Major components (conceptual model)
- Project: logical container for organizing Data Science resources
- Notebook Session: managed Jupyter environment running on a chosen compute shape
- Job: a managed run of code (often training or batch scoring), typically more reproducible than ad-hoc notebook execution
- Model: a registered ML artifact with metadata; typically stored in Object Storage
- Model Deployment: managed online inference endpoint backed by configurable compute
Service type
- Managed ML platform service (control plane) orchestrating compute for notebooks, jobs, and deployments.
Scope: regional and compartment-based
- Data Science resources are typically regional and created within an OCI compartment.
- Access is governed by OCI IAM (users, groups, policies) and often by resource principals (workload identities) for notebooks/jobs/deployments.
How it fits into the Oracle Cloud ecosystem
Data Science sits in the Analytics and AI category and connects naturally to:
- OCI Object Storage for datasets and model artifacts
- OCI Data Flow / big data services for large-scale processing (when needed)
- Autonomous Database / OCI databases as data sources
- VCN and private networking for secure access to data sources
- OCI Logging/Monitoring for operational visibility
- OCI Vault for managing secrets/keys used by applications and pipelines
3. Why use Data Science?
Business reasons
- Faster path from proof-of-concept to production with standardized ML workflows
- Reduced platform engineering effort compared to building your own notebooks + model serving + IAM + monitoring stack
- Better governance: projects, compartments, policies, tagging, and auditable operations
Technical reasons
- Managed notebooks and deployments on OCI compute (CPU/GPU) using consistent environments
- Model registration and lifecycle management via a model catalog
- Integration with OCI-native networking and identity (private endpoints, resource principals)
- Supports common Python ML stacks and reproducible environments
Operational reasons
- Clear separation of dev (notebooks), batch (jobs), and prod (deployments)
- Use OCI monitoring/logging patterns to run ML endpoints like any other production service
- Easier cleanup and cost control: stop sessions, delete deployments, remove artifacts
Security/compliance reasons
- Central IAM policy enforcement (least privilege at compartment level)
- Private networking options using VCN/subnets (avoid public exposure)
- Encryption at rest and in transit via OCI services (verify specifics per resource in official docs)
- Auditing through OCI Audit logs for API actions
Scalability/performance reasons
- Choose shapes appropriate to workload (small CPU for dev, GPU for training, scalable deployments for inference)
- Managed model deployments can be sized and scaled based on endpoint needs (verify current scaling options in official docs)
When teams should choose Data Science
- You want a managed ML workflow tied to OCI primitives (VCN/IAM/Object Storage)
- You need secure, controlled access to data sources inside OCI
- You want managed online model inference without running Kubernetes/model servers yourself
- You want consistent ML environments and repeatable runs for teams
When teams should not choose Data Science
- You require a fully open, cloud-agnostic ML platform with minimal coupling to a specific cloud’s IAM/networking model
- Your organization already standardized on another ML platform (for example, SageMaker/Vertex AI/Azure ML) and migration cost outweighs benefits
- You need highly specialized custom serving stacks that the managed deployment patterns don’t support (confirm current deployment customization options in official docs)
4. Where is Data Science used?
Industries
- Financial services (risk scoring, fraud detection, credit models)
- Retail/e-commerce (recommendations, demand forecasting, pricing models)
- Healthcare/life sciences (readmission risk, triage support, operational analytics)
- Manufacturing (predictive maintenance, quality inspection, anomaly detection)
- Telecommunications (churn prediction, network anomaly detection)
- Energy/utilities (load forecasting, outage prediction)
- Public sector (resource optimization, fraud/waste detection)
Team types
- Data scientists and ML engineers
- Cloud engineers and platform teams supporting ML workloads
- DevOps/SRE teams operating inference endpoints
- Security teams implementing IAM/network controls for analytics/AI workloads
Workloads
- Exploratory analysis and feature engineering in notebooks
- Batch training runs and evaluation with jobs
- Batch inference (scoring) for periodic pipelines
- Real-time inference via model deployments
- Model registry/catalog for governance and reuse
Architectures
- “Lake-first”: Object Storage data lake → notebooks/jobs → model catalog → deployment
- “Database-first”: Autonomous Database → notebooks/jobs → model deployment close to private network
- Event-driven: object upload triggers pipeline (often via OCI Events/Functions—verify exact integration patterns in official docs)
Real-world deployment contexts
- Private enterprise networks with VCN peering, private endpoints, and strict IAM
- Multi-compartment environments (dev/test/prod segregation)
- CI/CD for ML artifacts where models and deployments are versioned and promoted
Production vs dev/test usage
- Dev/test: small notebook sessions, minimal shapes, experimental projects
- Production: jobs with reproducible environments, model catalog governance, controlled deployments, monitoring/alerts, private endpoints, rigorous IAM
5. Top Use Cases and Scenarios
Below are realistic scenarios where Oracle Cloud Data Science is a good fit.
1) Customer churn prediction
- Problem: identify customers likely to churn to target retention offers.
- Why Data Science fits: notebooks for exploration, jobs for scheduled retraining, deployments for real-time scoring in apps.
- Example: telecom customer profile + usage data stored in Object Storage; churn model deployed as an HTTPS endpoint consumed by CRM.
2) Fraud detection scoring service
- Problem: score transactions for fraud risk with low latency.
- Why Data Science fits: managed model deployments with controlled networking and IAM.
- Example: a payment service calls the deployment endpoint; only private VCN access is allowed.
3) Demand forecasting for supply chain
- Problem: forecast demand to reduce stockouts and overstock.
- Why Data Science fits: jobs for periodic training/inference and artifact tracking.
- Example: nightly job trains a forecasting model and writes forecasts back to Object Storage or a database.
4) Predictive maintenance (IoT)
- Problem: predict equipment failure from sensor data.
- Why Data Science fits: notebooks for feature engineering; jobs for batch scoring; deployments for near-real-time inference.
- Example: an ingestion pipeline stores sensor windows in Object Storage; a job scores anomalies and alerts operations.
5) Document classification (lightweight)
- Problem: classify documents into business categories.
- Why Data Science fits: train classical ML or smaller NLP models; deploy for inference.
- Example: new documents uploaded to Object Storage are batch-classified nightly.
6) Credit risk scoring
- Problem: predict default risk for loan applicants.
- Why Data Science fits: strong governance needs (IAM, compartments) and reproducibility.
- Example: underwriting system calls a private model deployment endpoint.
7) Recommendation model prototype to production
- Problem: convert a notebook-based prototype into a reliable service.
- Why Data Science fits: model catalog + deployment workflow encourages operationalization.
- Example: data scientist prototypes in notebook; ML engineer packages model artifact and deploys to production endpoint.
8) Retail price optimization experiment
- Problem: evaluate price elasticity and optimize pricing.
- Why Data Science fits: quick notebook iteration; jobs for large evaluations.
- Example: train models on historical sales data and run batch simulations as jobs.
9) Anomaly detection for logs/metrics
- Problem: detect unusual patterns in operational telemetry.
- Why Data Science fits: batch training on historical data; deployment used by an internal tool.
- Example: nightly job updates anomaly thresholds; endpoint used by ops dashboards.
10) Compliance and audit-friendly model registry
- Problem: enforce traceability of model versions and metadata.
- Why Data Science fits: model catalog plus OCI governance patterns (tags, compartments, audit logs).
- Example: register each model with versioning metadata and link to training job output stored in Object Storage.
11) Computer vision experimentation (GPU-based)
- Problem: train vision models that need GPUs.
- Why Data Science fits: choose GPU shapes for notebooks/jobs; later deploy a smaller model for inference.
- Example: train on labeled images in Object Storage; deploy an inference endpoint for internal QA.
12) Feature engineering sandbox with secure data access
- Problem: analysts need to experiment without exporting sensitive data.
- Why Data Science fits: notebooks inside private subnets with restricted egress; IAM controls.
- Example: notebook session runs in a private subnet and reads data from private DB endpoints.
6. Core Features
Feature availability can vary by region and over time. For the most current details, verify in the official OCI Data Science documentation: https://docs.oracle.com/en-us/iaas/data-science/using/
Projects
- What it does: organizes related Data Science resources (notebooks, jobs, models).
- Why it matters: reduces sprawl and supports team-based governance.
- Practical benefit: consistent compartment/tagging and clearer lifecycle management.
- Caveats: projects don’t replace compartments; use compartments for environment isolation (dev/test/prod).
Notebook Sessions (managed JupyterLab)
- What it does: provides a managed development environment for Python-based ML.
- Why it matters: accelerates experimentation while keeping compute selection and network placement explicit.
- Practical benefit: quick start with curated environments; easy stop/start.
- Caveats: notebook sessions incur compute/storage costs while running; ensure you stop them when idle.
Jobs (managed batch runs)
- What it does: executes code in a managed, repeatable way (training or batch inference).
- Why it matters: notebooks are not ideal for repeatable production runs; jobs help standardize execution.
- Practical benefit: consistent environment, better automation patterns, and clearer auditability.
- Caveats: job setup requires packaging code and dependencies; plan artifact storage and logging upfront.
Model Catalog (model registration)
- What it does: registers model artifacts and metadata, usually backed by Object Storage.
- Why it matters: enables versioning, discovery, governance, and consistent deployment inputs.
- Practical benefit: easier promotion of a known model version to staging/production.
- Caveats: you must manage artifact structure and metadata discipline; the catalog doesn’t automatically ensure model quality.
Model Deployments (managed online inference)
- What it does: hosts a model as an HTTPS endpoint for real-time predictions.
- Why it matters: removes the need to manage your own model-serving infrastructure for many standard use cases.
- Practical benefit: consistent deployment workflow, IAM/network controls, and operational visibility.
- Caveats: ensure your model artifact includes correct scoring/inference code; deployment failures are often packaging-related.
Curated environments (Conda)
- What it does: provides prebuilt Conda environments commonly used in ML.
- Why it matters: reduces dependency conflicts and improves reproducibility.
- Practical benefit: faster onboarding and fewer “works on my laptop” issues.
- Caveats: if you need niche libraries, you may need a custom environment; keep security patching in mind.
OCI Accelerated Data Science (ADS) tooling (where available)
- What it does: Oracle-provided Python tooling to support Data Science workflows (packaging, connectors, common ML tasks).
- Why it matters: encourages consistent patterns for moving from notebook → model → deployment.
- Practical benefit: speeds up artifact creation and metadata handling.
- Caveats: verify current ADS capabilities and supported patterns in official docs and repos; don’t assume parity with MLflow/SageMaker tooling.
Identity and Access Management (IAM) integration
- What it does: controls who can create/manage Data Science resources and what notebook/jobs can access.
- Why it matters: ML systems often touch sensitive data; least-privilege is essential.
- Practical benefit: compartment scoping + policies provide strong governance.
- Caveats: misconfigured policies are a common source of “permission denied” errors.
Networking integration (VCN/subnets, private endpoints)
- What it does: allows placing notebooks and deployments in specific network contexts.
- Why it matters: many data sources are private (databases, internal APIs).
- Practical benefit: private inference endpoints and controlled egress reduce exposure.
- Caveats: incorrect route tables/NSGs can block package installs, Object Storage access, or endpoint invocation.
Observability integration (Logging/Monitoring)
- What it does: integrates workloads with OCI monitoring/logging patterns.
- Why it matters: production inference needs SLOs, alerts, and traceability.
- Practical benefit: align ML endpoints with standard ops practices.
- Caveats: ensure you design log retention and protect sensitive data in logs.
7. Architecture and How It Works
High-level service architecture
At a high level:
- You create a Project in a compartment.
- You start a Notebook Session on an OCI compute shape to explore data and train models.
- You store datasets and artifacts in Object Storage (commonly).
- You register a model in the Model Catalog.
- You create a Model Deployment to serve predictions via HTTPS.
Request/data/control flow
- Control plane: OCI API/Console calls create and manage resources (projects, sessions, jobs, models, deployments).
- Data plane (training): notebook or job reads data (Object Storage, DBs), trains model, writes artifacts (Object Storage).
- Data plane (inference): clients call deployment endpoint → service loads model artifact → returns prediction response.
Integrations with related OCI services
Common integrations:
- Object Storage: datasets, model artifacts, logs/outputs
- VCN/Subnets/NSGs: private network placement and access control
- IAM: users/groups/policies, dynamic groups, resource principals
- Vault: storing secrets (API keys, DB passwords) when needed
- Logging/Monitoring: logs, metrics, alarms (verify which metrics are emitted for deployments in your region)
- Container Registry (OCIR): may be used in advanced workflows (verify exact support for custom containers in current docs)
Dependency services
- Compute (for notebooks/jobs/deployments)
- Storage (Object Storage; Block Volumes for notebook storage)
- Networking (VCN, subnets, gateways as required)
- IAM and Audit
Security/authentication model
- Human access via IAM users/groups and compartment-scoped policies.
- Workload access via resource principals (recommended), enabled through dynamic groups + policies.
- API access via OCI SDK/CLI using config files, instance principals, or resource principals (pattern depends on where code runs).
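For workload access, the resource-principal pattern means code running inside a notebook session or job authenticates as the resource itself, with no API keys on disk. A hedged sketch using the OCI Python SDK (the signer and client names follow the SDK as of this writing; the bucket and object names are placeholders):

```python
def read_training_data(bucket, object_name):
    """Fetch an Object Storage object using a resource principal.

    Sketch only: assumes this runs inside an OCI Data Science notebook
    session or job whose dynamic group has been granted read access to
    the bucket via an IAM policy.
    """
    import oci

    # The workload authenticates as itself -- no user API keys required.
    signer = oci.auth.signers.get_resource_principals_signer()
    client = oci.object_storage.ObjectStorageClient(config={}, signer=signer)
    namespace = client.get_namespace().data
    return client.get_object(namespace, bucket, object_name).data.content
```

Outside OCI (for example, on a laptop), the same client would instead be built from a config file and API key; the point of resource principals is removing that credential management inside the platform.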
Networking model
You typically choose one of:
- Public access (simpler) for notebooks and endpoints, with careful IP restrictions if available and strong authentication.
- Private access (recommended for production): notebooks and deployments in private subnets, accessed via bastion/VPN/FastConnect, with controlled egress (NAT) and private service access patterns.
Monitoring/logging/governance considerations
- Tag resources (project, notebook session, model, deployment) with environment and cost center.
- Centralize logs and set retention policies.
- Use alarms on deployment health/latency metrics if available in your region (verify in official docs).
- Use compartments for isolation and limit broad IAM policies.
Simple architecture diagram (learning lab)
flowchart LR
U[User / Data Scientist] -->|OCI Console| DS[OCI Data Science Project]
DS --> NB["Notebook Session (JupyterLab)"]
NB --> OS[(OCI Object Storage)]
NB --> MC[Model Catalog]
MC --> MD["Model Deployment (HTTPS Endpoint)"]
App[Client App] -->|Predict Request| MD
MD -->|Reads artifact| OS
Production-style architecture diagram (enterprise)
flowchart TB
subgraph Tenancy[OCI Tenancy]
subgraph Net["VCN (Prod)"]
subgraph Priv[Private Subnet]
MD[Model Deployment<br/>Private Endpoint]
NB[Notebook/Job Subnet Access]
end
subgraph Sec[Security Controls]
NSG[NSGs / Security Lists]
RT[Route Tables]
NAT[NAT Gateway]
SGW[Service Gateway]
end
end
OS[(Object Storage Bucket<br/>Datasets + Model Artifacts)]
VAULT[OCI Vault]
LOG[Logging]
MON[Monitoring/Alarms]
AUD[Audit Logs]
IAM[IAM Policies<br/>Compartment Isolation]
end
App["Internal Apps (VCN)"] --> MD
MD --> OS
NB --> OS
NB --> VAULT
MD --> LOG
MD --> MON
IAM --> NB
IAM --> MD
AUD --> IAM
Priv --- NSG
Priv --- RT
RT --> NAT
RT --> SGW
8. Prerequisites
OCI account/tenancy requirements
- An active Oracle Cloud tenancy with billing enabled (or Free Tier where applicable).
- Access to a region where Data Science is available. Region availability can change—verify in OCI docs and your tenancy’s region subscriptions.
Permissions / IAM roles
You need permissions to:
- Create/manage Data Science resources (projects, notebook sessions, models, deployments)
- Create/use networking resources (VCN/subnet) if you’re placing sessions/deployments in your own subnets
- Use Object Storage buckets for datasets/artifacts

Typical IAM policy patterns (examples; adjust to least privilege and your compartment structure):
- Allow a group to manage Data Science resources in a compartment (often via a data-science-family policy).
- Allow access to Object Storage (read/write to buckets used for artifacts).
- If using resource principals: create a dynamic group for Data Science resources and grant it access to Object Storage, Vault, etc.
Policy syntax and resource types evolve. Always verify policy examples in official IAM and Data Science docs.
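As a concrete starting point, the patterns above might look like the following policy statements. The group, dynamic-group, and compartment names here are hypothetical placeholders, and you should confirm the exact resource-family names in the current IAM reference:

```
Allow group ds-scientists to manage data-science-family in compartment ds-lab
Allow group ds-scientists to manage objects in compartment ds-lab
Allow dynamic-group ds-lab-resources to read objects in compartment ds-lab
Allow dynamic-group ds-lab-resources to read secret-family in compartment ds-lab
```

The first two grant a human group control over Data Science resources and artifact buckets; the last two grant the workloads themselves (via a dynamic group and resource principals) read access to data and secrets.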
Billing requirements
- Data Science itself is commonly billed based on underlying resources you use (compute/storage/network). Ensure your tenancy can create the required compute shapes.
Tools
- OCI Console (web UI)
- Optional:
- OCI CLI: https://docs.oracle.com/en-us/iaas/Content/API/SDKDocs/cliinstall.htm
- OCI SDK for Python (if building automation)
- Git (for code)
- A local terminal for curl testing of endpoints
Region availability
- Verify Data Science availability in your target region in official OCI documentation or the Console service list.
Quotas/limits
Common limit areas (varies by tenancy/region):
- Maximum number of notebook sessions
- Compute shape quotas (OCPU/GPU)
- Maximum number of model deployments
- Object Storage bucket limits

Check in OCI Console: Governance & Administration → Limits, Quotas and Usage
Prerequisite services
- Object Storage bucket for datasets/model artifacts (recommended)
- VCN/subnet if using private networking (recommended for production)
9. Pricing / Cost
Pricing changes and varies by region and contract. Do not rely on blog posts for exact numbers. Use the official pricing pages and your tenancy’s cost tools.
Current pricing model (how you’re charged)
In Oracle Cloud, Data Science costs typically come from the underlying resources you provision and run, such as:
- Compute for notebook sessions, jobs, and model deployments (OCPU/GPU hours)
- Block Volume for notebook session storage (boot/attached storage depending on configuration)
- Object Storage for datasets, model artifacts, logs, and outputs
- Network egress (data leaving the OCI region to the public internet or cross-region), depending on architecture
As with many OCI services, the control plane itself may not be billed separately; charges accrue through the compute, storage, and network resources your workloads consume. Verify the Data Science section of the official OCI price list for how Oracle currently states this.
Official resources:
- OCI pricing overview / price list: https://www.oracle.com/cloud/price-list/
- OCI cost estimator: https://www.oracle.com/cloud/costestimator.html
Pricing dimensions to understand
| Cost Area | What drives cost | Practical examples |
|---|---|---|
| Notebook Sessions | Shape (CPU/GPU), hours running, attached storage | Leaving a notebook running overnight is a common cost leak |
| Jobs | Shape, runtime duration, number of runs | Scheduled training daily vs weekly can multiply cost |
| Model Deployments | Shape, number of instances (if supported), uptime | 24/7 deployments cost more than on-demand |
| Object Storage | GB stored, requests | Large datasets + many artifacts over time |
| Block Volume | GB provisioned, performance tier (if applicable) | Over-provisioning notebook volumes |
| Network | Egress to internet/cross-region | Calling endpoints from outside OCI; downloading large artifacts |
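To make the compute dimension concrete, a back-of-the-envelope estimator helps show why "hours running" dominates. The rates below are placeholders, not Oracle prices; substitute real per-OCPU-hour and per-GB-month figures from the official price list:

```python
# Placeholder unit rates -- NOT real OCI prices; look them up on the
# official price list for your region before budgeting.
OCPU_HOUR_RATE = 0.05    # hypothetical USD per OCPU-hour
STORAGE_GB_MONTH = 0.03  # hypothetical USD per GB-month

def monthly_estimate(ocpus, hours_per_day, days, storage_gb):
    """Rough monthly cost: compute hours plus storage."""
    compute = ocpus * OCPU_HOUR_RATE * hours_per_day * days
    storage = storage_gb * STORAGE_GB_MONTH
    return round(compute + storage, 2)

# An always-on 1-OCPU endpoint vs. one stopped outside working hours.
always_on = monthly_estimate(1, 24, 30, 50)
work_hours = monthly_estimate(1, 8, 22, 50)
print(always_on, work_hours)
```

Whatever the real rates, the ratio between the two scenarios is what matters: stopping idle sessions and deleting unused deployments cuts the dominant term directly.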
Free Tier (if applicable)
Oracle Cloud offers a Free Tier program, but eligibility and included services change. Verify current Free Tier coverage for Data Science-related resources (compute shapes, storage) in Oracle’s Free Tier documentation.
Cost drivers (what surprises people)
- Always-on model deployments: online endpoints running continuously can be the biggest driver.
- Idle notebooks: developers forget to stop sessions.
- Over-sized shapes: using GPU shapes for CPU-only workloads.
- Data transfer: large dataset movement across regions or to on-prem without planning.
- Storage sprawl: repeated model artifacts and intermediate outputs left in Object Storage.
Hidden/indirect costs
- Logging and monitoring retention (if you export logs or keep long retention)
- CI/CD runners or external build systems
- NAT Gateway and outbound traffic costs (architecture dependent)
- Data labeling tools (if you add a separate labeling workflow)
Network/data transfer implications
- Keep datasets and endpoints in the same region where possible.
- Prefer private connectivity (FastConnect/VPN) or in-VCN calling patterns for sensitive workloads.
- Minimize cross-region transfers for large artifacts.
How to optimize cost
- Use the smallest practical notebook shape; scale up only when necessary.
- Stop notebook sessions when not in use.
- Use jobs for batch workloads and schedule appropriately.
- Delete unused model deployments; recreate when needed.
- Use lifecycle policies on Object Storage buckets to transition or delete old artifacts.
- Tag resources and review cost reports by tag/compartment.
Example low-cost starter estimate (no fabricated numbers)
A low-cost learning setup typically includes:
- One small CPU notebook session running only during lab time
- A small Object Storage bucket for a few MB–GB of artifacts
- No GPU shapes
- A model deployment created briefly for validation, then deleted
To get real numbers:
1. Select your region and shapes in the OCI Cost Estimator: https://www.oracle.com/cloud/costestimator.html
2. Add compute for notebook + deployment hours, plus storage.
Example production cost considerations
In production, plan for:
- Always-on model deployment(s) with predictable baseline usage
- Separate staging and prod deployments
- Monitoring/alerting overhead
- Retraining jobs (daily/weekly) on larger shapes
- Data retention policies for model artifacts and training data snapshots
10. Step-by-Step Hands-On Tutorial
This lab builds and deploys a small ML model end-to-end using Oracle Cloud Data Science. It’s designed to be beginner-friendly and low-cost by using CPU shapes and a small dataset.
Objective
- Create an OCI Data Science Project
- Launch a Notebook Session (JupyterLab)
- Train a simple scikit-learn model
- Package a minimal inference script
- Register the model in the Model Catalog
- Create a Model Deployment
- Invoke the endpoint and validate predictions
- Clean up resources to avoid ongoing costs
Lab Overview
You will:
1. Create a project
2. Create a notebook session
3. Train and export a model artifact (model.joblib) and scoring code (score.py)
4. Register the model
5. Deploy it as an HTTPS endpoint
6. Test with curl
7. Delete resources
Notes before you start:
- Console screens can change. Use the service navigation and search if labels differ.
- If any deployment packaging requirements differ in your region/tenancy (for example, required artifact structure), verify in official docs and adjust accordingly.
Step 1: Create a compartment (recommended) and a project
Goal: keep lab resources isolated for cleanup and governance.
- In OCI Console, create or choose a compartment (for example: ds-lab).
- Navigate to Analytics & AI → Data Science.
- Click Projects → Create project.
- Name: ds-iris-lab (or similar)
- Select the compartment (for example: ds-lab)
- Create.
Expected outcome: A Data Science project exists and appears in the Projects list.
Verification: Open the project details page and confirm it’s in the correct compartment.
Step 2: Create a notebook session (JupyterLab)
Goal: get a managed environment to run Python ML.
- Inside your project, go to Notebook Sessions → Create notebook session.
- Provide:
  - Name: iris-notebook
  - Compute shape: choose a small CPU shape (avoid GPU for this lab).
  - Networking:
    - For a quick lab, you may use a basic/public option if offered.
    - For stricter security, select a VCN + subnet you control.
    - If you place it in a private subnet, ensure it has access to required OCI services and package repositories (via NAT/service gateway patterns as appropriate).
- Create the notebook session.
- Wait until the notebook session status is Active (or equivalent).
- Click Open (JupyterLab).
Expected outcome: JupyterLab opens in your browser.
Verification: In JupyterLab, open a Terminal and run:
python3 --version
You should see a Python version output.
Step 3: Train a simple model in the notebook
Goal: create a working model artifact with minimal dependencies.
In JupyterLab, create a new notebook (Python 3). Run the following cells.
3.1 Install/verify Python packages
Many notebook environments already include these. If not, install them:
import sys
!{sys.executable} -m pip install -q scikit-learn joblib pandas numpy
Expected outcome: Packages install successfully.
Verification: Import works:
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
import joblib
3.2 Train and export the model
# Load a small built-in dataset
iris = load_iris()
X = iris["data"]
y = iris["target"]
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=0.2, random_state=42, stratify=y
)
# Simple model
clf = LogisticRegression(max_iter=200)
clf.fit(X_train, y_train)
pred = clf.predict(X_test)
acc = accuracy_score(y_test, pred)
acc
Expected outcome: You get an accuracy value (commonly > 0.8 for this simple setup).
Export the model:
joblib.dump(clf, "model.joblib")
Expected outcome: A file named model.joblib exists in the notebook working directory.
Verification:
!ls -lh model.joblib
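Beyond checking that the file exists, it’s worth confirming the dumped artifact reloads cleanly and still predicts. The round-trip check below is self-contained (it retrains the same tiny model so the snippet runs anywhere, including outside the notebook session):

```python
import joblib
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Train the same model as in the lab and write the artifact.
iris = load_iris()
clf = LogisticRegression(max_iter=200).fit(iris["data"], iris["target"])
joblib.dump(clf, "model.joblib")

# Reload from disk and confirm the restored model predicts identically.
restored = joblib.load("model.joblib")
pred = restored.predict(np.array([[5.1, 3.5, 1.4, 0.2]]))
print(pred.tolist())  # the first iris sample is class 0 (setosa)
```

Catching a broken or version-mismatched artifact here is much cheaper than debugging a failed model deployment later.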
Step 4: Create a minimal scoring script (score.py)
Goal: provide inference code that the deployment runtime can call.
Create a file named score.py in the same directory as model.joblib.
In JupyterLab, you can create a new text file, or use a notebook cell:
score_py = r'''
import json
import joblib
import numpy as np

# Load the model once at startup
MODEL = joblib.load("model.joblib")

def predict(data):
    """
    A simple prediction function.
    Expected input (JSON):
    {
        "data": [[5.1, 3.5, 1.4, 0.2]]
    }
    Output:
    {
        "predictions": [0]
    }
    """
    # Accept either dict or JSON string depending on runtime caller
    if isinstance(data, (str, bytes)):
        payload = json.loads(data)
    else:
        payload = data
    X = np.array(payload["data"], dtype=float)
    y = MODEL.predict(X)
    return {"predictions": y.tolist()}
'''
with open("score.py", "w") as f:
    f.write(score_py)
!ls -lh score.py
Expected outcome: score.py exists.
Verification test locally:
import score
score.predict({"data": [[5.1, 3.5, 1.4, 0.2]]})
You should see a response like:
{'predictions': [0]}
Important: OCI Data Science model deployments can have specific handler signatures and artifact structures. If your deployment later fails because it can’t find the correct entry point, verify the required scoring interface in the official docs and adjust the script accordingly.
Step 5: Package the model artifact for upload
Goal: create a single archive with model + scoring code.
Create a zip file:
!zip -r model_artifact.zip model.joblib score.py
!ls -lh model_artifact.zip
Expected outcome: model_artifact.zip exists.
Step 6: Register the model in the Model Catalog
There are multiple ways to register a model. The most universally accessible approach for beginners is to download the artifact and upload it via the OCI Console.
6.1 Download the artifact from JupyterLab
- In the JupyterLab file browser, right-click model_artifact.zip
- Choose Download
- Save it locally
Expected outcome: The zip file is on your machine.
6.2 Create a model entry in the project
- In OCI Console → Data Science → your project
- Go to Models (Model Catalog within the project/compartment context)
- Click Create model
- Provide:
– Name: iris-logreg
– Description: Iris classifier logistic regression
– Artifact: upload model_artifact.zip
- Click Create.
Expected outcome: A model appears in the Models list with a version and artifact stored (typically in Object Storage managed/linked by OCI).
Verification:
– Open the model details and confirm the artifact is present and the model state is available/active.
Step 7: Create a model deployment (online endpoint)
Goal: serve predictions via HTTPS.
- In the model details page, choose Create deployment (or navigate to Model Deployments and create from model).
- Provide:
– Name: iris-endpoint
– Compute shape: small CPU shape (for low cost)
– Replica count / scaling: keep minimal (often 1)
– Networking:
- For quick validation, a public endpoint may be easiest.
- For production, use a private endpoint in a VCN subnet and call it from inside the network.
- Create deployment.
- Wait until the deployment is Active.
Expected outcome: The deployment shows an endpoint URL.
Verification:
– Confirm deployment health/status is successful.
– Note the endpoint URL and any required authentication method.
Authentication note: Depending on your deployment configuration, invoking the endpoint may require OCI IAM auth (request signing) or another supported method. If the console provides a “Test” feature, use it first. If you need request signing, verify the official invocation method in the Data Science docs for your deployment type.
Step 8: Invoke the endpoint (prediction test)
Use either:
– The Console “Test” option (if available), or
– curl from an environment that can reach the endpoint (your laptop for public endpoints; a VM in the VCN for private endpoints).
Example request body
{
"data": [[5.1, 3.5, 1.4, 0.2]]
}
If the console provides a built-in test
- Paste the JSON body
- Run prediction
Expected outcome: Response includes predictions with a class id (0, 1, or 2).
If using curl (endpoint must be reachable, auth must match)
Because authentication requirements can vary, a generic unauthenticated curl might fail with 401/403. If your deployment is configured to require OCI IAM signed requests, you’ll need an SDK/CLI-based signed call pattern.
If your deployment supports a simple HTTPS call (verify in your console/docs), it may look like:
curl -X POST "https://<your-model-deployment-endpoint>" \
-H "Content-Type: application/json" \
-d '{"data": [[5.1, 3.5, 1.4, 0.2]]}'
Expected outcome: JSON response with predictions.
If this step fails with authorization errors, do not weaken security. Instead, use the official signed-request method or call from an authorized OCI environment. Verify the correct invocation procedure in the Data Science model deployment documentation.
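When calling from code, it can help to separate the portable payload assembly from the OCI-specific signing. The helper below is a local sketch: build_invoke_request is an illustrative name, the endpoint URL is a placeholder, and the commented signing snippet should be checked against the current OCI Python SDK documentation before use.

```python
import json

def build_invoke_request(endpoint_url, feature_rows):
    """Assemble the URL, headers, and JSON body for a prediction call."""
    body = json.dumps({"data": feature_rows})
    headers = {"Content-Type": "application/json"}
    return endpoint_url, headers, body

url, headers, body = build_invoke_request(
    "https://modeldeployment.example.oraclecloud.com/predict",  # placeholder URL
    [[5.1, 3.5, 1.4, 0.2]],
)
print(body)  # {"data": [[5.1, 3.5, 1.4, 0.2]]}

# With the OCI Python SDK (verify in official docs), the signed call is roughly:
#   import oci, requests
#   config = oci.config.from_file()
#   signer = oci.signer.Signer(
#       tenancy=config["tenancy"], user=config["user"],
#       fingerprint=config["fingerprint"],
#       private_key_file_location=config["key_file"])
#   resp = requests.post(url, data=body, headers=headers, auth=signer)
```

The signed-request portion is deliberately left as a comment: it requires valid OCI credentials and network reachability, and the exact invocation method depends on your deployment configuration.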
Validation
You have successfully completed the lab if:
– The notebook session ran training code and created model.joblib
– The model artifact zip uploaded to the Model Catalog successfully
– The model deployment reached an Active/Healthy state
– A test invocation returned a prediction
Quick checklist:
– [ ] Project created
– [ ] Notebook session Active and JupyterLab accessible
– [ ] Model trained, model.joblib created
– [ ] score.py works locally
– [ ] model_artifact.zip uploaded to Model Catalog
– [ ] Deployment Active
– [ ] Endpoint returns predictions
Troubleshooting
Common issues and fixes:
- Notebook session won’t start (shape unavailable / quota exceeded)
  – Check Limits, Quotas and Usage
  – Choose a smaller shape
  – Request a quota increase if needed
- Can’t install Python packages
  – If the notebook is in a private subnet, ensure outbound access (NAT) or access to an approved package repository
  – Consider using preinstalled curated environments
  – Verify DNS and route tables
- Model deployment fails to become active
  – Artifact structure may not match required runtime expectations
  – Ensure model.joblib and score.py are at the root of the zip (as you packaged them)
  – Verify the required handler signature/entrypoint in official docs for your deployment type
  – Check deployment logs (if available) in OCI Logging
- 401/403 when invoking the endpoint
  – The endpoint likely requires OCI IAM signed requests
  – Use the Console test feature or the official signed-request procedure
  – Ensure the caller is authorized and the network path is correct
- Timeouts when calling the endpoint
  – If it is a private endpoint, call from within the VCN (or through VPN/FastConnect)
  – Check NSGs/security lists and route tables
  – Confirm DNS resolution and that the endpoint is reachable
Cleanup
To avoid ongoing costs, clean up in this order:
- Delete the model deployment (iris-endpoint)
  – This stops continuous compute billing for online inference.
- Stop and delete the notebook session
  – Stop first if required; then delete.
- Delete model(s) from the Model Catalog (optional)
  – If you want to remove artifacts and metadata.
- Delete Object Storage artifacts/buckets (if you created your own)
  – Ensure no required data remains.
- Delete the project (optional)
- Delete the compartment (only if it contains nothing else)
11. Best Practices
Architecture best practices
- Separate dev/test/prod using compartments (and often separate VCNs/subnets).
- Keep data, training, and deployments in the same region to reduce latency and transfer costs.
- Use jobs for repeatable training and batch inference rather than long-running notebooks.
- Use Object Storage as the durable artifact store; treat notebook storage as ephemeral.
IAM/security best practices
- Apply least privilege policies at the compartment level.
- Prefer resource principals for notebooks/jobs accessing OCI services (instead of embedding API keys).
- Use dynamic groups to scope workload identities and restrict what they can access.
- Enforce tagging policies for ownership and environment classification.
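As a concrete (hypothetical) illustration of the resource-principal pattern above, a least-privilege setup typically pairs a dynamic group matching rule with narrowly scoped policy statements. The group and compartment names below are placeholders, and the exact resource-type names should be verified in the OCI IAM documentation:

```
# Dynamic group matching rule (placeholder compartment OCID):
ALL {resource.type = 'datasciencenotebooksession',
     resource.compartment.id = '<dev-compartment-ocid>'}

# Policy statements scoped to a single compartment:
Allow dynamic-group ds-dev-workloads to read objects in compartment ds-dev
Allow dynamic-group ds-dev-workloads to manage data-science-models in compartment ds-dev
```

Note that nothing here grants tenancy-wide access: the dynamic group only matches notebook sessions in one compartment, and the policies only allow what the workload actually needs.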
Cost best practices
- Default to small CPU shapes; scale up only when evidence justifies it.
- Stop notebooks immediately after use.
- Delete model deployments when not needed (especially in dev).
- Apply Object Storage lifecycle policies to old artifacts and logs.
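For the lifecycle point, a single rule that expires stale artifacts might look like the sketch below (the rule name, 90-day retention, and `artifacts/` prefix are placeholders; verify the current lifecycle policy schema in the Object Storage documentation):

```json
{
  "items": [
    {
      "name": "expire-old-model-artifacts",
      "action": "DELETE",
      "timeAmount": 90,
      "timeUnit": "DAYS",
      "isEnabled": true,
      "objectNameFilter": { "inclusionPrefixes": ["artifacts/"] }
    }
  ]
}
```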
Performance best practices
- Right-size shapes based on dataset size and algorithm needs.
- Use GPUs only when the training workload benefits (deep learning, large compute).
- Cache intermediate datasets carefully; avoid repeated downloads/processing.
Reliability best practices
- Use jobs and versioned artifacts so you can reproduce a model build.
- Keep multiple model versions in the catalog; promote via controlled processes.
- Plan rollback: deployment should be able to revert to the last known good model.
Operations best practices
- Centralize logs and define retention.
- Set alarms for deployment health and latency if metrics are available (verify).
- Document runbooks for common failures: deployment build errors, endpoint auth errors, quota issues.
Governance/tagging/naming best practices
- Naming conventions (example):
  – Project: ds-<team>-<env>-<project>
  – Notebook: nb-<project>-<user>
  – Model: mdl-<usecase>-<version>
  – Deployment: dep-<usecase>-<env>
- Tagging keys: CostCenter, Owner, Environment, DataSensitivity
12. Security Considerations
Identity and access model
- Human access: OCI IAM users/groups and policies.
- Workload access: resource principals via dynamic groups (recommended).
- Avoid distributing long-lived API keys to notebooks unless necessary.
Encryption
- OCI services generally provide encryption at rest and TLS in transit.
- Confirm KMS/Vault integrations and encryption specifics for:
- Object Storage buckets
- Block volumes used by notebook sessions
- Deployment endpoints
Verify in official docs for your region and configuration.
Network exposure
- Prefer private subnets for:
- notebooks accessing sensitive data
- production model deployments
- Use NSGs/security lists to restrict inbound/outbound traffic.
- Avoid public endpoints unless required; if public, enforce strict auth and monitoring.
Secrets handling
- Store secrets in OCI Vault, not in notebooks or code.
- Avoid printing secrets in logs.
- Rotate credentials and use short-lived access patterns where possible.
Audit/logging
- Use OCI Audit to track API actions (who created deployments, changed policies, etc.).
- Enable and review logs for deployment failures and invocation patterns.
- Treat inference request/response logs as sensitive; avoid logging PII.
Compliance considerations
- Data residency: keep regulated data in approved regions.
- Access reviews: periodic review of IAM policies and dynamic groups.
- Artifact governance: model artifacts may embed training data patterns; control distribution.
Common security mistakes
- Leaving notebooks running in public networks with weak access controls
- Overbroad policies like “manage all-resources in tenancy”
- Public model endpoints without strong authentication and rate limiting (where applicable)
- Logging sensitive payloads
Secure deployment recommendations
- Private endpoints for production deployments
- Signed requests / IAM-based auth (or the officially recommended secure auth pattern)
- Strict compartment isolation and CI/CD-based promotion
- Regular patching/updates of environments and dependencies
13. Limitations and Gotchas
Specific limits change over time and differ by region/tenancy. Always check OCI limits and current Data Science docs.
Known limitations / common gotchas
- Quota limits on shapes (OCPU/GPU) can block notebook sessions or deployments.
- Packaging requirements for model deployment artifacts can be strict (entrypoints, file layout).
- Network configuration can break common workflows:
- private subnet without NAT/service gateway may block installs or Object Storage access
- Always-on deployment cost can accumulate quickly.
- Artifact sprawl: too many model versions and intermediate outputs in Object Storage.
- Auth mismatch: endpoint invocation often fails when callers don’t use the required auth/signing method.
- Region feature skew: not all regions get the same feature updates at the same time (verify in docs).
Migration challenges
- Moving from other platforms often requires:
- repackaging model artifacts
- reworking pipelines around OCI IAM and Object Storage
- rebuilding CI/CD and monitoring patterns for OCI
Vendor-specific nuances
- OCI compartments are central to governance; design them early.
- Resource principals/dynamic groups are powerful, but require careful policy design to avoid privilege creep.
14. Comparison with Alternatives
Within Oracle Cloud
- Oracle Machine Learning (OML) in Autonomous Database: strong when your data is in the database and you want in-db ML patterns.
- OCI AI Services: better if you want prebuilt AI APIs (vision, language, speech) rather than building your own models.
- Oracle Analytics Cloud: analytics and BI platform; not a substitute for full ML development/deployment workflows.
Other clouds
- AWS SageMaker, Azure Machine Learning, Google Vertex AI provide similar managed ML platform capabilities with cloud-specific integrations.
Open-source/self-managed
- Kubeflow, MLflow + Kubernetes, self-managed JupyterHub can provide flexibility but require more operations.
Comparison table
| Option | Best For | Strengths | Weaknesses | When to Choose |
|---|---|---|---|---|
| Oracle Cloud Data Science | Teams building/deploying custom ML models on OCI | Tight OCI integration (IAM/VCN/Object Storage), managed notebooks & deployments | Cloud-specific workflows; packaging/deployment patterns must be learned | You run on OCI and want managed ML lifecycle + secure OCI-native ops |
| Oracle Machine Learning (OML) | ML close to Autonomous Database data | In-database ML and governance around DB | Not a general-purpose model serving platform | Data stays in ADB and you want ML without moving it |
| OCI AI Services | Using pretrained AI APIs | Fast time-to-value, no training required | Limited to offered APIs/capabilities | You need OCR/NLP/vision APIs more than custom model training |
| AWS SageMaker | AWS-native ML platform | Mature ecosystem, many managed features | AWS coupling; migration effort | Your infrastructure is primarily on AWS |
| Azure Machine Learning | Azure-native ML platform | Strong MLOps integrations | Azure coupling; migration effort | Your infrastructure is primarily on Azure |
| Google Vertex AI | GCP-native ML platform | Unified training + serving + MLOps | GCP coupling; migration effort | Your infrastructure is primarily on GCP |
| Kubeflow / MLflow (self-managed) | Maximum control and portability | Flexible, open tooling | High ops burden (K8s, upgrades, security) | You have platform engineering capacity and need cloud portability |
15. Real-World Example
Enterprise example: Private credit risk scoring platform
- Problem: A bank needs to deploy a credit risk model with strong security controls, private data sources, and auditability.
- Proposed architecture:
- Data stored in Autonomous Database and Object Storage (feature snapshots)
- Data Science jobs run training monthly in a controlled subnet
- Model artifacts registered in the Model Catalog
- Model deployments exposed via private endpoints in a VCN
- Access controlled by IAM policies and dynamic groups (resource principals)
- Logs routed to OCI Logging; alarms on error rates/latency
- Why Data Science was chosen: OCI-native identity/networking integration aligns with strict security requirements; managed deployment avoids self-hosting.
- Expected outcomes:
- Reduced time to deploy model updates
- Clear audit trail for model versions and promotions
- Private inference with controlled access paths
Startup/small-team example: Churn model MVP
- Problem: A SaaS startup wants a churn prediction endpoint for internal dashboards and customer success workflows.
- Proposed architecture:
- Weekly export of customer metrics to Object Storage
- Data scientist trains in a small notebook session and runs scheduled jobs
- Model registered in the catalog; deployment kept small and scaled minimally
- Endpoint called by an internal service
- Why Data Science was chosen: quick setup, low operational overhead, and direct path from notebook to deployment.
- Expected outcomes:
- MVP endpoint in days instead of weeks
- Controlled costs by stopping notebooks and deleting/recreating deployments as needed
- A foundation for future MLOps improvements
16. FAQ
- Is “Data Science” the official Oracle Cloud service name?
  Yes. In Oracle Cloud Infrastructure, the service is commonly documented as OCI Data Science. This tutorial uses “Data Science” to mean that OCI service.
- Is Data Science a fully managed ML platform like SageMaker/Vertex AI?
  It provides managed notebooks, jobs, a model catalog, and model deployments. Exact feature parity differs across clouds—evaluate based on your workflow needs.
- Do I pay separately for Data Science?
  Costs typically come from the underlying compute, storage, and network resources used by notebooks, jobs, and deployments. Confirm current billing behavior in the official OCI price list.
- What’s the difference between a notebook session and a job?
  Notebook sessions are interactive (ideal for exploration). Jobs are for repeatable, managed runs (better for scheduled training and batch inference).
- Where should I store training data and artifacts?
  Commonly in OCI Object Storage. Treat notebook storage as non-authoritative and keep artifacts versioned.
- Can I deploy private endpoints for inference?
  Private networking is a common production pattern. Exact configuration depends on your VCN/subnet setup and current Data Science deployment options—verify in official docs.
- How do I authenticate a notebook to access Object Storage securely?
  Prefer resource principals with dynamic groups and least-privilege policies. Avoid embedding API keys in notebooks.
- How do I version models?
  Use the Model Catalog and adopt naming/version metadata conventions. Store training metadata and code commit references alongside model versions.
- Can I run GPU training?
  OCI supports GPU shapes, but availability depends on region and quota. Use GPUs when the workload benefits (deep learning, large training jobs).
- How do I monitor model deployments?
  Use OCI monitoring/logging where supported. Track latency, error rates, and request volume; define alarms and runbooks. Verify available metrics in your region.
- What’s the best way to reduce cost?
  Stop notebooks when idle, keep deployments minimal, delete unused endpoints, and avoid over-sizing shapes.
- Can I integrate CI/CD for model promotion?
  Yes, using OCI DevOps or external CI systems, with artifacts stored in Object Storage and controlled promotion processes. Implementation details vary.
- How do I keep sensitive data out of logs?
  Don’t log raw payloads by default. Mask PII, control log retention, and restrict log access via IAM.
- Can I call a model deployment from outside OCI?
  If the endpoint is public and authentication allows it, yes. For sensitive workloads, prefer private access (VPN/FastConnect or in-VCN callers).
- What causes most deployment failures?
  Packaging/entrypoint mismatches, missing dependencies, incorrect artifact structure, and networking restrictions during startup.
- How do compartments help?
  Compartments provide isolation boundaries for IAM policies, cost reporting, and environment separation (dev/test/prod).
- Can Data Science replace a feature store or full MLOps suite?
  It provides core ML workflow pieces, but you may still need additional governance, feature management, drift monitoring, and CI/CD patterns depending on requirements.
17. Top Online Resources to Learn Data Science
| Resource Type | Name | Why It Is Useful |
|---|---|---|
| Official documentation | OCI Data Science Docs — https://docs.oracle.com/en-us/iaas/data-science/using/ | Primary source for current features, concepts, and step-by-step guidance |
| Official pricing | OCI Price List — https://www.oracle.com/cloud/price-list/ | Authoritative pricing reference (region/SKU dependent) |
| Pricing calculator | OCI Cost Estimator — https://www.oracle.com/cloud/costestimator.html | Estimate costs for notebook shapes, deployments, storage, and network |
| Official CLI docs | OCI CLI Installation — https://docs.oracle.com/en-us/iaas/Content/API/SDKDocs/cliinstall.htm | Automate Data Science and related OCI resources |
| Architecture guidance | OCI Architecture Center — https://docs.oracle.com/en/solutions/ | Reference architectures for OCI networking, security, and deployment patterns |
| Hands-on labs | Oracle LiveLabs — https://livelabs.oracle.com/ | Guided labs; search for “Data Science” and related ML labs |
| Official samples (GitHub) | Oracle OCI Data Science AI Samples — https://github.com/oracle/oci-data-science-ai-samples | Practical notebooks and examples aligned to OCI Data Science tooling |
| Observability docs | OCI Logging — https://docs.oracle.com/en-us/iaas/Content/Logging/home.htm | Learn how to route and manage logs from OCI services |
| IAM docs | OCI IAM — https://docs.oracle.com/en-us/iaas/Content/Identity/home.htm | Policies, dynamic groups, least-privilege patterns |
| Community learning | Oracle Cloud Community — https://community.oracle.com/ | Discussions and real-world tips (validate against official docs) |
18. Training and Certification Providers
| Institute | Suitable Audience | Likely Learning Focus | Mode | Website URL |
|---|---|---|---|---|
| DevOpsSchool.com | DevOps, cloud engineers, platform teams, beginners | OCI fundamentals, DevOps/MLOps adjacent practices, structured training | Check website | https://www.devopsschool.com/ |
| ScmGalaxy.com | Students, engineers learning tooling and platforms | Software lifecycle, DevOps and platform practices that can support ML delivery | Check website | https://www.scmgalaxy.com/ |
| CLoudOpsNow.in | Cloud operations and engineering roles | Cloud operations, reliability, governance and cost practices | Check website | https://www.cloudopsnow.in/ |
| SreSchool.com | SREs, operations, reliability engineers | Monitoring, reliability, incident response patterns applicable to ML services | Check website | https://www.sreschool.com/ |
| AiOpsSchool.com | Ops + AI/automation learners | AIOps concepts, operational analytics, automation foundations | Check website | https://www.aiopsschool.com/ |
19. Top Trainers
| Platform/Site | Likely Specialization | Suitable Audience | Website URL |
|---|---|---|---|
| RajeshKumar.xyz | DevOps/cloud training and guidance (verify offerings) | Beginners to intermediate engineers | https://rajeshkumar.xyz/ |
| devopstrainer.in | DevOps training platform (verify OCI coverage) | DevOps engineers and cloud practitioners | https://www.devopstrainer.in/ |
| devopsfreelancer.com | Freelance DevOps consulting/training marketplace (verify scope) | Teams needing practical guidance | https://www.devopsfreelancer.com/ |
| devopssupport.in | DevOps support/training services (verify scope) | Engineers seeking hands-on operational help | https://www.devopssupport.in/ |
20. Top Consulting Companies
| Company | Likely Service Area | Where They May Help | Consulting Use Case Examples | Website URL |
|---|---|---|---|---|
| cotocus.com | Cloud/DevOps consulting (verify specific OCI services) | Architecture, automation, platform setup | Cost governance, IaC foundations, secure networking for ML platforms | https://cotocus.com/ |
| DevOpsSchool.com | Training and consulting services | Enablement + implementation support | Setting up OCI landing zones, IAM patterns, operational runbooks for Data Science deployments | https://www.devopsschool.com/ |
| DEVOPSCONSULTING.IN | DevOps consulting services | CI/CD, observability, operations | Building deployment pipelines, monitoring/alerting integration, policy and tagging standards | https://www.devopsconsulting.in/ |
21. Career and Learning Roadmap
What to learn before Data Science (recommended prerequisites)
- OCI fundamentals: compartments, VCN, IAM policies, Object Storage
- Basic Linux and networking concepts (subnets, routing, security lists/NSGs)
- Python fundamentals: functions, packages, environments
- ML basics: supervised learning, train/test split, evaluation metrics
- Git basics for version control
What to learn after Data Science (to become production-ready)
- MLOps practices: reproducible training, artifact versioning, promotion workflows
- Infrastructure as Code (Terraform on OCI) for repeatable environments
- Secure networking: private endpoints, service gateways, bastions
- Observability: logging, metrics, alarms, SLOs
- Model governance: drift monitoring patterns, bias testing, lineage (may require additional tooling)
Job roles that use it
- Data Scientist
- Machine Learning Engineer
- Cloud Engineer (Analytics/AI)
- DevOps Engineer / Platform Engineer supporting ML platforms
- SRE/Operations Engineer for ML inference services
- Security Engineer (IAM, network security for AI workloads)
Certification path (if available)
Oracle certification offerings change. Check Oracle University / OCI certification listings for current credentials relevant to:
– OCI foundations
– OCI architect tracks
– Analytics/AI specialization
Verify current certification options in official Oracle training portals.
Project ideas for practice
- Build and deploy a churn prediction endpoint using Object Storage datasets
- Create a scheduled job that retrains a model weekly and registers a new model version
- Secure a deployment with private networking and demonstrate in-VCN invocation
- Implement cost tagging + cleanup automation for non-prod deployments
- Create a lightweight CI pipeline that packages a model artifact and triggers deployment updates (verify supported automation patterns)
22. Glossary
- OCI: Oracle Cloud Infrastructure.
- Compartment: OCI governance boundary for organizing resources and applying IAM policies.
- Project (Data Science): Logical container for Data Science resources.
- Notebook Session: Managed Jupyter environment on OCI compute.
- Job (Data Science): Managed execution of code for repeatable training/inference.
- Model Catalog: Registry of model artifacts and metadata used for governance and deployment.
- Model Artifact: Packaged files needed for inference (model binary, scoring code, metadata).
- Model Deployment: Managed online inference endpoint hosting a model.
- VCN: Virtual Cloud Network—your private network in OCI.
- Subnet: A segment of a VCN where resources are placed.
- NSG: Network Security Group—virtual firewall rules applied to resources.
- Dynamic Group: IAM construct for grouping resources (workloads) by matching rules.
- Resource Principal: Workload identity mechanism for OCI resources to call OCI APIs without user API keys.
- Object Storage: OCI service for storing unstructured data (datasets, artifacts).
- OCPU: Oracle CPU unit used for compute billing and sizing.
- Egress: Outbound network traffic leaving a region or to the public internet (potential cost driver).
23. Summary
Oracle Cloud Data Science in the Analytics and AI category is OCI’s managed service for building, training, registering, and deploying machine learning models. It fits best when you want an OCI-native workflow: notebooks and jobs for development and repeatable runs, a model catalog for governance, and model deployments for online inference—secured with OCI IAM and VCN networking.
Cost management is primarily about controlling compute runtime (notebooks and always-on deployments), right-sizing shapes, and governing storage and artifacts in Object Storage. Security success depends on compartment isolation, least-privilege IAM policies, resource principals/dynamic groups, private networking where appropriate, and disciplined logging/auditing.
Next step: follow the official documentation for your region/tenancy and extend this lab into a repeatable pipeline using jobs, versioned artifacts, and production-grade IAM/networking patterns: https://docs.oracle.com/en-us/iaas/data-science/using/