Category
Integration
1. Introduction
Oracle Cloud Data Integration is a managed service in Oracle Cloud Infrastructure (OCI) for building, running, and monitoring data movement and transformation workflows—commonly called ETL/ELT—across Oracle and non-Oracle systems.
In simple terms: Data Integration helps you pull data from one place, clean/transform it, and load it into another place, using a visual designer and managed execution so you don’t have to run your own ETL servers.
Technically, Data Integration provides a workspace-based design environment (projects, folders, data assets/connections, tasks, and pipelines) plus a managed runtime for executing data flows and orchestration. It integrates with OCI Identity and Access Management (IAM), compartments, policies, and OCI governance services such as Audit. You typically use it to implement ingestion into analytics platforms (like Autonomous Data Warehouse), operational reporting stores, or curated data lakes.
The problem it solves: teams need repeatable, secure, observable, cost-controlled ways to integrate data across applications and databases—without building a patchwork of scripts, cron jobs, and long-lived ETL servers.
Naming check: The service is commonly referred to as OCI Data Integration in official Oracle documentation. It is distinct from Oracle Data Integrator (ODI) (a separate product) and from Oracle Integration (application integration/iPaaS). This tutorial focuses only on Oracle Cloud (OCI) Data Integration.
2. What is Data Integration?
Official purpose (what Oracle positions it to do)
Oracle Cloud Data Integration is a fully managed cloud service for designing and running data pipelines that ingest, transform, and load data between heterogeneous sources and targets. It is intended to support common data engineering patterns—batch ingestion, transformations, incremental loads (where supported by source/target patterns), and orchestration—using a visual, metadata-driven approach.
For the canonical definition and current scope, verify in the official docs:
https://docs.oracle.com/en-us/iaas/data-integration/home.htm
Core capabilities (high level)
- Design-time tooling in the OCI Console: create projects, define sources/targets, build data flows and pipelines.
- Connections to data systems via “data assets” (connectors vary by environment and Oracle updates; verify supported connectors in docs for your region).
- Transformations using data-flow steps (select, filter, join, aggregate, derive columns, mapping, etc.—exact transformation set depends on current release).
- Orchestration with pipelines: chain tasks, manage dependencies, handle failures and retries (capabilities vary; verify current pipeline controls in docs).
- Operational execution and monitoring: run tasks, view runs, check statuses, troubleshoot failures.
- OCI-native governance: IAM policies, compartments, tagging, Audit integration.
Major components (how you work with the service)
While exact UI labels evolve, the core concepts in Data Integration generally include:
- Workspace: the top-level container where you design and operate. Usually created per environment (dev/test/prod) and per domain/team.
- Projects and folders: organize integration assets by subject area (finance, customer, telemetry, etc.).
- Data assets / connections: represent sources/targets and how to connect (credentials, endpoints, wallets, etc.).
- Tasks:
- Data flows: transformation logic (mapping and shaping data).
- Pipelines: orchestration logic (sequence, dependency, branching where supported).
- Other task types may exist depending on current release; verify in docs.
- Applications / publications (if present in your tenancy): promote or package artifacts for deployment between environments. Verify the current lifecycle model in official docs.
- Work requests / runs: execution records you monitor for success/failure and runtime metrics.
Service type
- Managed cloud service (serverless-style from the user perspective): you design and trigger jobs; Oracle operates the underlying service components.
- Strongly aligned with the Integration category, but focused specifically on data integration rather than application/event integration.
Scope: regional vs global; tenancy/compartment model
- Data Integration is an OCI regional service: a workspace exists in a specific OCI region.
- Resources are governed using tenancy, compartments, and IAM policies.
- You usually design separate workspaces per region and per environment.
Always validate current regional availability in OCI documentation and the OCI Console region selector.
How it fits into the Oracle Cloud ecosystem
Data Integration often sits in the middle of these OCI building blocks:
- Sources/targets: Autonomous Database (ATP/ADW), Oracle Database on OCI, Object Storage, and potentially other supported systems/connectors.
- Data lake and analytics: Object Storage (raw/curated zones), Autonomous Data Warehouse, Oracle Analytics Cloud (downstream).
- Governance: IAM policies, compartments, tagging, Audit.
- Operations: OCI Monitoring/Logging (where supported), Notifications/Alarms around job states (often via integration patterns; verify supported hooks).
3. Why use Data Integration?
Business reasons
- Faster delivery of data pipelines: visual development and reusable assets reduce time-to-value.
- Lower operational overhead: less infrastructure to manage compared to self-hosted ETL servers.
- Standardization: consistent patterns for ingestion and transformations across teams.
- Auditability: better traceability than scattered scripts.
Technical reasons
- Metadata-driven development: organize connections, schemas, and tasks as managed artifacts.
- Repeatable orchestration: schedule/trigger workflows (depending on available scheduling features and your orchestration approach).
- OCI-native integration: works naturally with compartments, IAM, and OCI database services.
- Separation of design and execution: build once, run reliably.
Operational reasons
- Central monitoring: view execution status, runs, failures, and (where available) logs.
- Environment separation: manage dev/test/prod with compartments and workspaces.
- Governance: tagging, access control, and audit trails.
Security/compliance reasons
- IAM-based access: least-privilege policies per compartment/team.
- Audit events: OCI Audit captures relevant API activity.
- Network control patterns: can be paired with private endpoints and VCN designs depending on sources/targets (verify per connector).
Scalability/performance reasons
- Managed scaling: avoids fixed-capacity ETL servers.
- Parallelism patterns: data flows typically support distributed processing patterns for transformations (verify current runtime details and limits).
When teams should choose Data Integration
Choose Oracle Cloud Data Integration when:
– You need batch ingestion + transformation in OCI.
– Your primary targets are Autonomous Data Warehouse or other OCI data platforms.
– You want OCI-governed pipelines managed via compartments/IAM.
– You want to reduce custom scripting and improve reliability.
When teams should not choose it
Consider alternatives when:
– You need real-time CDC replication with low latency (evaluate Oracle GoldenGate for OCI).
– You need event-driven application integration and SaaS connectors at the application workflow level (evaluate Oracle Integration).
– You need fully custom Spark control, notebooks, or bespoke code-first pipelines (evaluate OCI Data Flow, or code-first orchestration like Airflow on Kubernetes/Compute).
– You need complex cross-cloud networking patterns that aren’t supported by the connectors/runtime model (validate connector and networking support first).
4. Where is Data Integration used?
Industries
- Financial services: daily regulatory reports, risk aggregation, customer 360.
- Retail/e-commerce: sales/returns analytics, inventory reconciliation, clickstream batch ingestion.
- Healthcare/life sciences: claims data normalization, batch de-identification staging (with strict governance).
- Telecom: CDR aggregation, churn analytics.
- Manufacturing/IoT: batch ingestion of telemetry files, quality metrics.
- Public sector: data consolidation for reporting, data lake standardization.
Team types
- Data engineering teams building ingestion/transform pipelines.
- Platform teams standardizing integration patterns.
- Analytics engineering teams curating dimensional models.
- DevOps/SRE teams supporting reliability and cost governance.
- Security teams enforcing IAM, encryption, and auditing.
Workloads
- Batch ELT/ETL: nightly loads, hourly loads, backfills.
- Data lake zone processing: raw → staged → curated.
- Warehouse loading: star schema, slowly changing dimensions (implementation depends on design patterns).
- Operational reporting extracts.
Architectures
- Lakehouse-style: Object Storage as lake + curated ADW marts.
- Hub-and-spoke integration: standardize ingestion into a central curated store.
- Multi-compartment enterprise governance: separate domains with shared platform.
Real-world deployment contexts
- Production: scheduled and monitored, strict IAM, alarms, runbooks, cost controls.
- Dev/test: smaller datasets, sandbox workspaces, experimental transformations.
- Migration: moving from ODI/Informatica/Talend-style on-prem ETL to managed OCI patterns.
5. Top Use Cases and Scenarios
Below are realistic scenarios where Oracle Cloud Data Integration is commonly a good fit.
1) Load CSV files from Object Storage into Autonomous Data Warehouse
- Problem: Analysts drop files into a bucket; the warehouse needs structured tables.
- Why Data Integration fits: Visual mapping + managed runs; integrates naturally with OCI.
- Example: A daily orders_YYYYMMDD.csv lands in Object Storage → Data Integration loads it into DW_ORDERS_STAGE.
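To make the file-per-day convention concrete, here is a small Python sketch. The helper name and the input/ prefix are illustrative assumptions, not part of the service; in Data Integration itself the matching object name would come from the source configuration.

```python
from datetime import date

def daily_object_name(run_date: date, prefix: str = "input/") -> str:
    """Build the Object Storage key for a given run date, following the
    orders_YYYYMMDD.csv naming convention from the example above."""
    return f"{prefix}orders_{run_date.strftime('%Y%m%d')}.csv"

print(daily_object_name(date(2024, 3, 5)))  # input/orders_20240305.csv
```

The same convention is what makes backfills easy later: one object name per calendar day.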
2) Curate a raw data lake into a “silver” zone
- Problem: Raw files are messy (types, missing fields, inconsistent formats).
- Why it fits: Data flow transformations can standardize and validate data.
- Example: Raw JSON exports → normalized Parquet-like structures (format support varies; verify) → curated bucket prefix.
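The kind of standardization a curation step performs can be sketched in plain Python. The field names and rules below are illustrative assumptions, not a real schema; in Data Integration this logic would live in data-flow transformation steps.

```python
import json

def normalize(record: dict) -> dict:
    """Coerce a raw export record into the curated ('silver') shape:
    enforce types, normalize case/whitespace, and default missing fields."""
    return {
        "customer_id": int(record["customer_id"]),                      # enforce numeric type
        "email": (record.get("email") or "").strip().lower() or None,   # blank/missing -> NULL
        "country": (record.get("country") or "UNKNOWN").upper(),        # default missing values
    }

raw = json.loads('{"customer_id": "42", "email": " Ana@Example.COM ", "country": "us"}')
print(normalize(raw))
```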
3) Join multiple source tables into a reporting mart
- Problem: Reporting needs denormalized tables for BI.
- Why it fits: Visual join/aggregate transformations.
- Example: Join customers + orders + payments → MART_CUSTOMER_REVENUE.
4) Standardize dimensions (conformed dimensions)
- Problem: Different systems represent “product” differently.
- Why it fits: Central transformation logic with reusable components.
- Example: ERP products + e-commerce products mapped to a single DIM_PRODUCT.
5) Batch ingestion from Oracle Database into ADW
- Problem: Operational Oracle DB data must be copied nightly to analytics.
- Why it fits: Strong Oracle-to-Oracle integration patterns; managed credentials and execution.
- Example: Nightly extract of SALES_TXN → transform → load into DW_SALES_FACT.
6) Mask or tokenize data before analytics (basic patterns)
- Problem: Sensitive fields must not reach general analytics tables.
- Why it fits: Transformations can remove/hash fields; governance via compartments and IAM.
- Example: Hash email, truncate addresses, remove SSNs before loading curated tables. (For strong masking, evaluate Oracle Data Safe and database-native controls too.)
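A minimal sketch of the masking idea, with illustrative field names; note that an unsalted hash is still linkable across datasets, which is one reason the stronger database-native controls mentioned above matter.

```python
import hashlib

def mask_row(row: dict) -> dict:
    """Basic pre-analytics masking: hash the email, truncate the address
    to a coarse prefix, and drop the SSN entirely."""
    out = dict(row)
    out["email"] = hashlib.sha256(row["email"].lower().encode()).hexdigest()
    out["address"] = row["address"][:10]   # keep only a coarse prefix
    out.pop("ssn", None)                   # never load SSNs downstream
    return out
```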
7) Build a parameterized pipeline for multiple regions/business units
- Problem: Same pipeline must run for BU=A, BU=B with different source paths.
- Why it fits: Parameterization patterns reduce duplication (verify exact parameter features).
- Example: A source_prefix=/raw/bu=${BU}/ parameter drives the ingestion path.
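The substitution mechanics can be illustrated with Python's standard-library Template, which happens to use the same ${BU}-style placeholders as the example above; how parameters are actually declared and bound in Data Integration must be verified in the docs.

```python
from string import Template

def resolve_prefix(template: str, **params) -> str:
    """Substitute run-time parameters (e.g. BU) into a source path template."""
    return Template(template).substitute(**params)

for bu in ("A", "B"):
    print(resolve_prefix("/raw/bu=${BU}/", BU=bu))
```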
8) Backfill historical data with controlled runs
- Problem: Need to load 2 years of historical files without breaking production.
- Why it fits: Managed runs + organized projects; easier run tracking and retry.
- Example: Run pipeline per month partition, validate counts, then proceed.
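Enumerating the month partitions for a controlled backfill is simple enough to script; this sketch (helper name is an assumption) generates one (year, month) pair per run, oldest first:

```python
from datetime import date

def month_partitions(start: date, end: date):
    """Yield (year, month) pairs from start to end inclusive, one per backfill run."""
    y, m = start.year, start.month
    while (y, m) <= (end.year, end.month):
        yield y, m
        y, m = (y + 1, 1) if m == 12 else (y, m + 1)

# Two years of history -> 24 controlled runs.
parts = list(month_partitions(date(2022, 1, 1), date(2023, 12, 1)))
print(len(parts), parts[0], parts[-1])
```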
9) Data quality checks as part of pipeline (basic validation)
- Problem: Downstream BI breaks when null rates spike or schema changes.
- Why it fits: Add validation steps and fail-fast patterns (implementation depends on supported transforms).
- Example: If the order_id null rate > 0, stop the pipeline and notify.
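The fail-fast logic itself is trivial, which is the point: this hedged Python sketch shows the check; in Data Integration you would express it with whatever validation transforms and failure handling the current release supports.

```python
def null_rate(rows: list[dict], column: str) -> float:
    """Fraction of rows where `column` is missing or None."""
    if not rows:
        return 0.0
    return sum(1 for r in rows if r.get(column) is None) / len(rows)

def check_or_fail(rows: list[dict], column: str, threshold: float = 0.0) -> float:
    """Fail fast: raise so the pipeline stops and downstream BI is never fed bad data."""
    rate = null_rate(rows, column)
    if rate > threshold:
        raise ValueError(f"{column} null rate {rate:.1%} exceeds threshold {threshold:.1%}")
    return rate
```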
10) Replace cron + SQL scripts with governed orchestration
- Problem: “Works on my VM” scripts are hard to maintain and audit.
- Why it fits: Centralized jobs, IAM access, run tracking, and repeatability.
- Example: Replace shell scripts that call SQL*Plus with a pipeline that runs consistently.
6. Core Features
Feature availability can change by region and over time. For the most accurate list, verify in official docs: https://docs.oracle.com/en-us/iaas/data-integration/home.htm
Workspaces (design + operations boundary)
- What it does: Provides an isolated environment to create and operate integration artifacts.
- Why it matters: Enables clean separation between teams and environments.
- Practical benefit: Easier governance (IAM/tagging), predictable organization.
- Caveats: Workspaces are regional; cross-region designs need explicit data movement patterns.
Projects and folders (asset organization)
- What it does: Lets you group pipelines, flows, connections, and related artifacts.
- Why it matters: Keeps large integration estates manageable.
- Practical benefit: Teams can align projects to domains (Finance, HR, Sales).
- Caveats: Organization doesn’t replace IAM; use compartments and policies for access control.
Data assets / connections (source/target definitions)
- What it does: Stores metadata and connection details for sources/targets.
- Why it matters: Reuse connection definitions and manage credentials centrally.
- Practical benefit: Faster onboarding; fewer hardcoded secrets in scripts.
- Caveats: Supported connectors vary; validate that your exact source/target and auth method are supported.
Data flows (transformations)
- What it does: Implements transformation logic—mapping columns, filtering, joining, aggregating, deriving fields.
- Why it matters: Converts raw data into analytics-ready datasets.
- Practical benefit: Visual logic is easier to review and maintain than ad-hoc scripts for many teams.
- Caveats: Not every transformation pattern is available visually; complex logic might require database-side SQL transformations or alternative services.
Pipelines (orchestration)
- What it does: Orchestrates multiple tasks with dependencies.
- Why it matters: Real pipelines need steps: ingest → transform → load → validate → publish.
- Practical benefit: One place to manage execution order and outcomes.
- Caveats: Advanced branching/looping patterns may be limited; verify current orchestration capabilities.
Parameterization and reusability (where supported)
- What it does: Allows using parameters for environment-specific or run-specific values (paths, table names, dates).
- Why it matters: Promotes reuse and reduces duplication.
- Practical benefit: Same pipeline can run for different partitions or BUs.
- Caveats: Parameter scoping rules and supported parameter types vary—confirm in docs.
Execution management and run history
- What it does: Provides a history of runs (status, timing, failures).
- Why it matters: Troubleshooting depends on visibility.
- Practical benefit: Faster root cause analysis than searching through VM logs.
- Caveats: Log detail and retention may vary; confirm how to export logs and what is retained.
IAM integration (compartment-based governance)
- What it does: Uses OCI IAM for authentication/authorization to create/manage DI resources.
- Why it matters: Least privilege and separation of duties.
- Practical benefit: Platform teams can restrict production changes.
- Caveats: Access to external data sources also requires correct policies and networking patterns.
OCI Audit integration
- What it does: Records relevant API events for governance and compliance.
- Why it matters: You need a trail of who changed what.
- Practical benefit: Supports compliance controls and investigations.
- Caveats: Audit captures control-plane events, not necessarily every row-level data operation.
7. Architecture and How It Works
High-level service architecture
Data Integration typically separates concerns into:
– Control plane: UI/API actions (create workspace, define assets, run tasks). Governed by IAM, recorded by Audit.
– Data plane (runtime): executes flows/pipelines and reads/writes data to configured systems.
You design integrations as artifacts in a workspace. When you trigger a run, the managed runtime connects to your sources/targets using the configuration and credentials, performs transformations, and writes results. Operational metadata (run status) is tracked for monitoring and troubleshooting.
Request / data / control flow (conceptual)
- User (or automation) calls OCI APIs / Console to create and configure DI artifacts.
- User triggers a task run (manual, scheduled, or programmatic—verify scheduling and APIs).
- DI runtime reads from source(s), transforms, writes to target(s).
- Run status and logs are stored for inspection; Audit logs capture changes.
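The run lifecycle above lends itself to simple automation. Below is a minimal polling sketch; the stubbed get_status callable stands in for the real Data Integration status API/SDK call, and the state names are plausible assumptions to verify against the docs.

```python
import time

TERMINAL = {"SUCCEEDED", "FAILED", "CANCELED"}

def wait_for_run(get_status, poll_seconds: float = 0.0, max_polls: int = 100) -> str:
    """Poll a task-run status function until it reaches a terminal state."""
    for _ in range(max_polls):
        status = get_status()
        if status in TERMINAL:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("run did not finish within the polling budget")

# Stubbed status sequence for illustration: two in-flight checks, then success.
statuses = iter(["ACCEPTED", "IN_PROGRESS", "SUCCEEDED"])
print(wait_for_run(lambda: next(statuses)))  # SUCCEEDED
```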
Integrations with related OCI services (common patterns)
- Object Storage: staging and lake storage for raw/curated files.
- Autonomous Database (ATP/ADW): common targets for analytics and marts.
- Oracle Database on Compute/Exadata Cloud Service: operational sources/targets.
- OCI Vault: store secrets/keys (where supported by the connection model and your design).
- IAM/Compartments/Tags: governance and access control.
- VCN / Private Endpoints: private connectivity patterns for databases (depends on target configuration and service capabilities—verify in docs).
Dependency services (what you still need)
Data Integration does not replace:
– Your storage (Object Storage buckets) and lifecycle policies.
– Your database (Autonomous or DB on OCI) and its scaling/backups.
– Your network architecture (VCNs, subnets, routing, DNS).
– Your operations (alerting, runbooks, on-call).
Security/authentication model (practical view)
- Access to Data Integration resources is controlled by OCI IAM policies.
- Access from the runtime to sources/targets depends on:
- How the connector authenticates (user/password, wallet, token, etc.).
- Whether the runtime can reach the endpoint (public vs private networking).
- Policies that allow required OCI operations (for example, reading objects from a bucket).
Because IAM policy statements and connector auth differ by scenario, always confirm the exact policy examples for Data Integration in official docs.
Networking model (practical view)
- If your source/target is public (public Object Storage endpoint, public database endpoint), connectivity is simpler—but may not be acceptable for production security.
- For production, many teams prefer private endpoints and VCN-only access to databases and services. Confirm whether and how Data Integration supports private connectivity in your region and for your connector types.
Monitoring/logging/governance considerations
- OCI Audit: captures API changes and access patterns.
- Run monitoring: check task run states and errors in the Data Integration UI.
- OCI Logging/Monitoring: integration points vary; if you need centralized logs/metrics, verify current capabilities and consider exporting run outcomes to a monitoring system.
Simple architecture diagram (conceptual)
flowchart LR
U[Engineer / Analyst] -->|Console/API| DI[OCI Data Integration Workspace]
DI -->|Run Data Flow / Pipeline| RT[Managed Runtime]
RT --> OS[(OCI Object Storage)]
RT --> ADW[(Autonomous Data Warehouse)]
DI --> AUD[OCI Audit]
Production-style architecture diagram (more realistic)
flowchart TB
subgraph Tenancy[OCI Tenancy]
subgraph Net[VCN / Networking]
DBP[(Private ADW / DB Endpoint)]
NAT[NAT Gateway or Service Gateway]
end
subgraph Gov[Governance]
IAM[IAM Policies & Compartments]
AUD[OCI Audit]
TAG[Tags / Cost Tracking]
end
subgraph Data[Data Layer]
OSRAW[(Object Storage - Raw Zone)]
OSCUR[(Object Storage - Curated Zone)]
DW[(Autonomous Data Warehouse - Marts)]
end
subgraph DI[Data Integration]
WS["Workspace (Dev/Test/Prod)"]
PJ[Projects / Folders]
DF[Data Flows]
PL[Pipelines]
RUN[Runs / Work Requests]
end
end
IAM --> WS
WS --> DF
WS --> PL
DF --> RUN
PL --> RUN
RUN --> OSRAW
RUN --> OSCUR
RUN --> DBP
DBP --> DW
WS --> AUD
TAG --> WS
NAT --> DBP
8. Prerequisites
Tenancy/account requirements
- An Oracle Cloud (OCI) tenancy with permission to use Data Integration in a region where it is available.
- A compartment strategy (at minimum: one compartment for this lab).
Permissions / IAM roles
You need permissions to:
– Create/manage Data Integration workspaces and artifacts.
– Read/write to Object Storage (for source/target files).
– Connect to and create objects in the target database (Autonomous Database recommended for the lab).
OCI policies for Data Integration use specific resource types and verbs. Because policy syntax can evolve, use Oracle’s official policy examples as the source of truth and adapt to your compartments and groups:
- Data Integration documentation home: https://docs.oracle.com/en-us/iaas/data-integration/home.htm
- OCI IAM policy reference: https://docs.oracle.com/en-us/iaas/Content/Identity/home.htm
Billing requirements
- Data Integration is a paid OCI service (unless covered by specific promotions). You need a valid billing setup.
- If you use Autonomous Database Always Free, that can reduce costs for the target, but Data Integration usage may still generate charges depending on tenancy and region. Verify current Free Tier eligibility: https://www.oracle.com/cloud/free/
Tools needed
- OCI Console access (web browser).
- Optional but useful:
- OCI CLI (for Object Storage operations and automation): https://docs.oracle.com/en-us/iaas/Content/API/SDKDocs/cliinstall.htm
- A SQL client for Autonomous Database:
- Database Actions (web UI) or SQL Developer. Database Actions is typically easiest.
Region availability
- Choose an OCI region where Data Integration is available.
- Confirm via the OCI Console service list or the service documentation.
Quotas/limits
- OCI enforces service limits/quotas (workspaces, runs, concurrency, etc.).
- Check in Console: Governance & Administration → Limits, Quotas and Usage.
- If you hit limits, request an increase via OCI support (process depends on your account).
Prerequisite services for this lab
- Object Storage bucket (for source CSV file).
- Autonomous Database (ATP or ADW; Always Free works well for learning).
- Data Integration workspace.
9. Pricing / Cost
Do not rely on any blog for pricing numbers. OCI pricing is region-specific and may change. Use official pages.
Current pricing model (how costs are typically measured)
Oracle Cloud Data Integration pricing is usage-based. In practice, your bill is driven by:
– Data Integration job execution consumption (often measured in compute/time units for the managed runtime). Verify the exact billing metric and unit names (for example, OCPU-hours or equivalent) on the official pricing page for your region.
– Underlying services you use:
  – Object Storage capacity and requests
  – Autonomous Database compute and storage (if not Always Free)
  – Data transfer (cross-region, internet egress)
  – Logging/Monitoring ingestion (if exporting logs)
Official pricing sources
- OCI price list (search for “Data Integration”): https://www.oracle.com/cloud/price-list/
- OCI Cost Estimator: https://www.oracle.com/cloud/costestimator.html
- OCI Free Tier overview (to reduce lab cost): https://www.oracle.com/cloud/free/
Pricing dimensions to understand (cost drivers)
- Number of runs and runtime duration – More frequent pipelines (e.g., every 5 minutes) cost more than nightly batch.
- Data volume processed – Larger datasets typically increase runtime and consumption.
- Transformation complexity – Joins, aggregations, and wide transformations usually cost more than simple copies.
- Concurrency – Running many pipelines simultaneously can increase consumption and hit limits.
- Network placement – Moving data across regions or out to the internet can add data transfer charges.
- Source/target performance – Slow databases can increase runtime and therefore cost.
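To reason about the first two drivers, a back-of-envelope calculation helps. The $1/hour unit rate below is a deliberately made-up placeholder, not an Oracle price; look up the real billing metric and regional rate on the official price list.

```python
def monthly_runtime_cost(runs_per_day: int, minutes_per_run: float,
                         unit_cost_per_hour: float, days: int = 30) -> float:
    """Back-of-envelope runtime cost: total billed hours x hypothetical unit rate."""
    hours = runs_per_day * days * minutes_per_run / 60
    return round(hours * unit_cost_per_hour, 2)

# Nightly batch vs. every-5-minutes micro-batching, same hypothetical $1/hour rate.
print(monthly_runtime_cost(1, 20, 1.0))    # 10.0
print(monthly_runtime_cost(288, 2, 1.0))   # 288.0
```

Even with made-up rates, the ratio makes the point: micro-batching can cost an order of magnitude more than a nightly window.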
Free tier considerations
- OCI Free Tier is mainly about compute/database/storage products. Data Integration may not be Always Free in your region/tenancy.
Verify current Free Tier eligibility for Data Integration specifically in official pages or your tenancy’s subscription details.
Hidden/indirect costs (common surprises)
- Autonomous Database scaling: if you auto-scale or choose higher CPU/storage, DI loads may trigger more DB usage.
- Object Storage request costs: frequent small-file processing increases request counts.
- Logging ingestion/retention: exporting logs at high volume can cost money.
- Cross-region traffic: replicating data or reading across regions can add transfer fees.
Network/data transfer implications
- In OCI, egress to the internet and some inter-region transfers are charged.
- Keep your Data Integration workspace, Object Storage, and database in the same region for cost and performance unless you have a strong reason not to.
How to optimize cost (practical checklist)
- Prefer batch windows over continuous micro-batching unless you truly need it.
- Consolidate small files into fewer larger files (where your pipeline supports it).
- Push heavy transformations into the database when it’s cheaper/faster and fits governance.
- Use partitioned loads (date partitions) and incremental patterns where feasible.
- Separate dev/test/prod and turn off non-production schedules.
- Tag resources for cost tracking (Cost Analysis works best with consistent tags).
Example low-cost starter estimate (how to think about it)
A learning lab typically includes:
– 1 Data Integration workspace
– A few small runs (MBs of CSV)
– Always Free Autonomous Database (if eligible)
– Minimal Object Storage
Your cost will depend on the minimum billable runtime units and the billing metric for Data Integration in your region. Use the Cost Estimator and run a small test, then check Billing → Cost Analysis.
Example production cost considerations (what to plan for)
In production, plan for:
– Daily/hourly schedules across multiple domains (runs/day)
– Backfills (temporary cost spikes)
– Separate environments (dev/test/prod)
– Higher log retention and monitoring exports
– Stronger networking (private endpoints) and possible added network components
10. Step-by-Step Hands-On Tutorial
This lab builds a simple but real pipeline: load a CSV file from OCI Object Storage into an Autonomous Database table using Oracle Cloud Data Integration.
Objective
Create an OCI Data Integration workspace and a basic data flow that:
1. Reads a CSV file from an Object Storage bucket
2. Maps columns to a target table
3. Loads the data into an Autonomous Database table
4. Verifies row counts and cleans up
Lab Overview
You will:
1. Create/prepare an Autonomous Database table
2. Create an Object Storage bucket and upload a sample CSV
3. Create a Data Integration workspace and project
4. Create connections (data assets) to Object Storage and Autonomous Database
5. Build and run a Data Flow (or equivalent task) to load data
6. Validate results
7. Clean up resources to avoid ongoing cost
Notes before you start:
– UI labels can vary slightly with OCI Console updates.
– If any option differs in your tenancy, follow the closest equivalent and verify with the official docs: https://docs.oracle.com/en-us/iaas/data-integration/home.htm
Step 1: Create (or choose) a compartment for the lab
- In the OCI Console, open Identity & Security → Compartments.
- Create a compartment such as lab-data-integration (or reuse an existing lab compartment).
- Record:
  – Compartment name
  – Compartment OCID (optional but useful)
Expected outcome: You have a compartment where you will create the bucket, database, and Data Integration workspace.
Step 2: Create an Autonomous Database (ATP or ADW) and a target table
If you already have an Autonomous Database, you can reuse it.
- Go to Oracle Database → Autonomous Database.
- Click Create Autonomous Database.
- For low cost, choose an Always Free option if available in your region/tenancy.
- Set:
– Display name: adb-di-lab
– Database name: something short, like DILAB
– Admin password: store securely
- Create the database and wait for it to become Available.
Now create a table using Database Actions:
1. Open the Autonomous Database details page.
2. Click Database Actions → SQL.
3. Run:
CREATE TABLE DI_CUSTOMERS (
CUSTOMER_ID NUMBER,
FIRST_NAME VARCHAR2(100),
LAST_NAME VARCHAR2(100),
EMAIL VARCHAR2(200),
SIGNUP_DATE DATE
);
Expected outcome: Autonomous Database is running and has an empty DI_CUSTOMERS table.
Verification:
SELECT COUNT(*) FROM DI_CUSTOMERS;
Should return 0.
Step 3: Create an Object Storage bucket and upload a sample CSV
- Go to Storage → Buckets.
- Ensure you are in the same region and compartment.
- Click Create Bucket:
– Name: di-lab-bucket-<unique-suffix>
– Default storage tier is fine for a lab.
- Open the bucket → Upload.
Create a local file named customers.csv with this content:
CUSTOMER_ID,FIRST_NAME,LAST_NAME,EMAIL,SIGNUP_DATE
1,Ana,Gomez,ana.gomez@example.com,2024-01-15
2,Sam,Lee,sam.lee@example.com,2024-02-20
3,Priya,Shah,priya.shah@example.com,2024-03-05
4,Noah,Kim,noah.kim@example.com,2024-03-18
Upload it to the bucket (root or a prefix like input/).
Expected outcome: The bucket contains customers.csv.
Verification: Click the object and confirm size and last modified timestamp.
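If you prefer to script the sample file rather than hand-edit it, this Python sketch writes an identical customers.csv locally; you could then upload it with the OCI CLI (for example, oci os object put --bucket-name <your-bucket> --file customers.csv, adjusting names to your tenancy).

```python
import csv
import pathlib

ROWS = [
    (1, "Ana", "Gomez", "ana.gomez@example.com", "2024-01-15"),
    (2, "Sam", "Lee", "sam.lee@example.com", "2024-02-20"),
    (3, "Priya", "Shah", "priya.shah@example.com", "2024-03-05"),
    (4, "Noah", "Kim", "noah.kim@example.com", "2024-03-18"),
]

# Write the sample file with the exact header the lab's DI_CUSTOMERS table expects.
path = pathlib.Path("customers.csv")
with path.open("w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["CUSTOMER_ID", "FIRST_NAME", "LAST_NAME", "EMAIL", "SIGNUP_DATE"])
    writer.writerows(ROWS)

print(path.read_text().splitlines()[0])
```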
Step 4: Create a Data Integration workspace
- In the OCI Console, go to Data Integration.
- Click Create workspace.
- Choose the lab compartment.
- Name: di-workspace-lab
- Click Create.
Wait until the workspace is active.
Expected outcome: Workspace exists and you can open it.
Step 5: Create a Data Integration project
- Open the workspace.
- Create a Project:
– Name: customer-load-lab
- Optionally create folders such as:
  – connections
  – dataflows
  – pipelines
Expected outcome: You have a project where you’ll build assets.
Step 6: Configure access (IAM and policies) for Object Storage and Autonomous Database
Data Integration needs permission to interact with OCI resources (like Object Storage), and it needs valid database credentials/connectivity for Autonomous Database.
Because the exact policy statements and resource types can vary, follow Oracle’s official policy examples for Data Integration and apply least privilege in your compartment.
Start here:
– Data Integration docs: https://docs.oracle.com/en-us/iaas/data-integration/home.htm
– IAM policy reference: https://docs.oracle.com/en-us/iaas/Content/Identity/home.htm
Common pattern to validate in docs (do not copy blindly):
– Allow your admin group to manage Data Integration resources in the lab compartment.
– Allow the Data Integration service/workspace to read objects from the specific bucket (or bucket compartment).
– Ensure database connectivity and credentials are available for the connector method used.
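As one illustrative shape only: the statements below reflect the kind of examples Oracle's Data Integration policy documentation shows, but the resource-type names (dis-workspaces) and the disworkspace principal pattern must be checked against the current policy reference before use, and the group/compartment names here are this lab's.

```
Allow group di-lab-admins to manage dis-workspaces in compartment lab-data-integration
Allow any-user to read objects in compartment lab-data-integration
  where ALL {request.principal.type = 'disworkspace'}
```

Scope the second statement further (specific bucket, specific workspace OCID) before using anything like it in production.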
Expected outcome: Policies are in place; no authorization errors when testing connections later.
Verification: You should be able to create data assets and browse/select the Object Storage object from within Data Integration (or at least run a task that reads it).
Step 7: Create a connection (data asset) to Object Storage
In the Data Integration workspace (inside your project):
- Navigate to Data Assets or Connections (terminology may vary).
- Create a new data asset for Object Storage.
- Provide:
  – Compartment/bucket details
  – Namespace (the Object Storage namespace from your tenancy)
  – Bucket name
  – Authentication method per the UI (often OCI-native/IAM-based)
Expected outcome: Object Storage data asset is created.
Verification: Use any available Test Connection or browse feature (if provided) to confirm you can locate customers.csv.
If you cannot browse but creation succeeds, proceed; the real test is running the flow.
Step 8: Create a connection (data asset) to Autonomous Database
- Create a new data asset for Autonomous Database (or Oracle Database).
- Provide connection details:
– Database OCID or connection string (depending on the UI)
– Username (e.g., `ADMIN` or a dedicated ETL user)
– Password (store securely)
– Wallet/SSL settings if required by the connector
Recommended for production: create a dedicated database user with least privileges (create session + insert/select on target schema), not ADMIN.
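A minimal sketch of such a dedicated user, assuming the lab table lives in the `ADMIN` schema (the user name and password are placeholders — adapt them to your environment):

```sql
-- Run as ADMIN in Database Actions -> SQL. DI_ETL and the password are placeholders.
CREATE USER di_etl IDENTIFIED BY "ChangeMe#Lab2024";
GRANT CREATE SESSION TO di_etl;

-- Least privilege: grant only what the load needs, on the target table only.
GRANT SELECT, INSERT, DELETE ON admin.di_customers TO di_etl;
```

Then use `di_etl` (not `ADMIN`) in the Data Integration connection.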
Expected outcome: Autonomous Database data asset is created.
Verification: Use Test Connection if available.
Step 9: Create a Data Flow to load customers.csv into DI_CUSTOMERS
- Create a new Data Flow in your project.
- Add a Source:
– Source type: Object Storage
– Select the bucket and the object `customers.csv`
– Configure the CSV format:
- Header row present: yes
- Delimiter: comma
- Date format: `YYYY-MM-DD` (or configure a parsing rule if the UI requires it)
- Add transformations as needed:
– Ensure column names map correctly:
- `CUSTOMER_ID` → number
- `FIRST_NAME`, `LAST_NAME`, `EMAIL` → strings
- `SIGNUP_DATE` → date (parse from string)
- Add a Target:
– Target type: Autonomous Database
– Target table: `DI_CUSTOMERS`
– Write mode: for a lab, choose a safe mode:
- If you want repeatable runs: TRUNCATE then INSERT (if supported), or delete rows before loading.
- If you want append-only: INSERT.
- Save the Data Flow.
Expected outcome: A saved Data Flow that reads from Object Storage and writes to the database.
Verification: Validate the data flow graph (most UIs provide a validation step). Resolve schema/type mapping warnings.
Step 10: Run the Data Flow (or create a Task and run it)
Depending on your UI, you may:
– Run the Data Flow directly, or
– Create a Task from the Data Flow and run the task.
- Click Run.
- Observe the run status (Submitted → Running → Succeeded/Failed).
- Open run details if available.
Expected outcome: Run completes successfully.
Validation
Validate in Autonomous Database
In Database Actions → SQL:
```sql
SELECT COUNT(*) AS row_count FROM DI_CUSTOMERS;
```
Expected: 4
Check the data:
```sql
SELECT
  CUSTOMER_ID, FIRST_NAME, LAST_NAME, EMAIL,
  TO_CHAR(SIGNUP_DATE, 'YYYY-MM-DD') AS SIGNUP_DATE
FROM DI_CUSTOMERS
ORDER BY CUSTOMER_ID;
```
Expected: rows 1–4 with correct values.
Validate in Data Integration
- The run should show Succeeded.
- If a run history is available, confirm runtime and any warnings.
Troubleshooting
Error: Authorization failed / NotAuthorizedOrNotFound
- Cause: missing IAM policy for Data Integration to access the bucket or DI resources.
- Fix:
- Confirm you are in the correct compartment.
- Review policies using official policy examples for Data Integration.
- Verify the bucket is in the same compartment you granted access to.
Error: Cannot connect to Autonomous Database
- Cause: wrong credentials, missing wallet/SSL config, network restrictions (private endpoint).
- Fix:
- Re-test with Database Actions using the same user.
- If the DB is private, confirm Data Integration supports the required private connectivity pattern and that your VCN/security lists/NSGs allow it.
- Confirm the connector’s required connection string/wallet details in the docs.
Error: Date parsing / invalid month
- Cause: CSV date format doesn’t match parsing rule.
- Fix:
- Ensure the date format is `YYYY-MM-DD`.
- Add an explicit cast/parse transformation (if available).
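To confirm the format mask matches the CSV's string representation, you can run a standalone parse check in Database Actions:

```sql
-- Succeeds if the mask matches; raises ORA-01861/ORA-01843-style errors if not.
SELECT TO_DATE('2024-03-07', 'YYYY-MM-DD') AS parsed FROM dual;
```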
Error: Column mapping mismatch
- Cause: CSV headers don’t match target columns, or inferred types differ.
- Fix:
- Ensure `customers.csv` header names match the expected mappings.
- Add an explicit mapping step and cast types.
Error: Duplicate rows on re-run
- Cause: using INSERT append mode.
- Fix:
- Use a truncate + load pattern (if supported), or run `TRUNCATE TABLE DI_CUSTOMERS` before re-running.
Cleanup
To avoid ongoing cost and clutter, delete lab resources:
- Data Integration
– Delete the task(s), data flows, and project (optional).
– Delete the workspace `di-workspace-lab` (if not needed).
- Object Storage
– Delete `customers.csv`.
– Delete the bucket.
- Autonomous Database
– Drop the table (optional):
```sql
DROP TABLE DI_CUSTOMERS PURGE;
```
– Terminate the Autonomous Database if it was created only for this lab (unless it is Always Free and you want to keep it).
- IAM policies
– Remove any lab-only policies you created (keep least privilege).
11. Best Practices
Architecture best practices
- Separate dev/test/prod using compartments and separate workspaces.
- Adopt a layered data architecture (raw → staged → curated → marts).
- Keep data close to compute: same region for workspace, buckets, and DB targets.
- Prefer idempotent designs:
- Partitioned loads (by date)
- Merge/upsert patterns (when supported and appropriate)
- Staging + swap for stable publishing
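As an illustration of the merge/upsert pattern in the database — `DI_CUSTOMERS_STG` is an assumed staging table loaded by the data flow; the target and columns follow this tutorial's lab schema:

```sql
-- Upsert staged rows into the target, keyed by the business key.
-- Re-running this after reloading the stage produces the same end state.
MERGE INTO di_customers t
USING di_customers_stg s
  ON (t.customer_id = s.customer_id)
WHEN MATCHED THEN UPDATE SET
  t.first_name  = s.first_name,
  t.last_name   = s.last_name,
  t.email       = s.email,
  t.signup_date = s.signup_date
WHEN NOT MATCHED THEN INSERT
  (customer_id, first_name, last_name, email, signup_date)
VALUES
  (s.customer_id, s.first_name, s.last_name, s.email, s.signup_date);
```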
IAM/security best practices
- Use least privilege:
- Separate “designers” (create/update flows) from “operators” (run/monitor).
- Use dedicated DB users for Data Integration with minimal privileges.
- Store secrets appropriately:
- Prefer OCI Vault patterns where supported; otherwise restrict who can view/edit connection assets.
- Apply tagging consistently (environment, cost center, owner, data domain).
Cost best practices
- Avoid high-frequency schedules for batch workloads.
- Reduce small-file overhead by consolidating files upstream.
- Monitor job runtimes and tune transformations.
- Use Cost Analysis with tags to detect runaway costs early.
Performance best practices
- Push down transformations to the database when that is faster/cheaper and aligns with governance.
- Use partitioned reads/writes where supported.
- Avoid unnecessary wide joins; pre-filter data early in the flow.
Reliability best practices
- Build pipelines with:
- Clear failure handling (stop on critical step failure)
- Retries for transient errors (where supported)
- Validation steps (row counts, null checks)
- Maintain runbooks:
- What to do on failure
- How to replay/backfill safely
- Escalation path
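A validation step of the kind listed above can be as simple as one query whose results you compare against thresholds (column names follow this tutorial's lab table):

```sql
-- Post-load sanity check: total rows and nulls in key columns.
SELECT COUNT(*) AS total_rows,
       SUM(CASE WHEN customer_id IS NULL THEN 1 ELSE 0 END) AS null_ids,
       SUM(CASE WHEN email       IS NULL THEN 1 ELSE 0 END) AS null_emails
FROM   di_customers;
```

Fail the pipeline (or raise an alert) when `total_rows` is outside the expected range or any null count is nonzero.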
Operations best practices
- Standardize naming: `di-<env>-<domain>-<purpose>`.
- Keep an asset inventory per workspace (projects, connections, schedules).
- Establish change management:
- Peer reviews of flows/pipelines
- Controlled promotion to production (verify DI’s promotion model in your tenancy)
Governance best practices
- Use compartments to model ownership and data domains.
- Tag everything for cost and ownership.
- Document data lineage externally if you need full lineage (Data Integration alone may not cover enterprise lineage requirements; consider OCI Data Catalog patterns where appropriate).
12. Security Considerations
Identity and access model
- Data Integration uses OCI IAM for:
- User authentication to the Console/API
- Authorization to manage workspaces and artifacts
- Use groups and policies rather than individual user grants.
- Separate duties:
- Data engineers: design assets
- Operators: run/monitor
- Security/admin: manage policies
Encryption
- OCI services typically encrypt data at rest by default (service-dependent). Confirm encryption behavior for:
- Object Storage buckets
- Autonomous Database
- For sensitive workloads, use customer-managed keys where required (OCI Vault + KMS), and verify Data Integration compatibility with CMEK scenarios for each dependent service.
Network exposure
- Prefer private connectivity for production databases where possible.
- If your DB is publicly accessible, restrict with:
- IP allowlists (if applicable)
- Strong credentials
- Minimal privileges
- Keep the Data Integration workspace and data sources in the same region to minimize exposure and transfer.
Secrets handling
- Avoid embedding secrets in scripts; store them in managed connection objects with restricted access.
- Rotate DB passwords and update connection assets as part of your security hygiene.
- Consider using database auth patterns that reduce static secrets (availability varies; verify in docs).
Audit/logging
- OCI Audit captures relevant administrative operations.
- Ensure Audit logs are retained per compliance requirements.
- If you need centralized observability, integrate with OCI Logging/Monitoring where supported and define alert rules around job failures.
Compliance considerations
- Map controls to:
- Access control (IAM policies)
- Change management (who can modify flows)
- Data protection (encryption, masking)
- Logging and retention (Audit, run history)
- For regulated data, ensure the entire path (source, transport, target, backups) meets compliance requirements.
Common security mistakes
- Using `ADMIN` for database loads in production.
- Overly broad IAM policies at the tenancy root.
- Leaving public endpoints open without strong restrictions.
- Allowing all developers to edit production connections and credentials.
- No audit review process.
Secure deployment recommendations
- Use compartment isolation per environment.
- Create a dedicated “integration runtime” DB user per pipeline domain.
- Apply tagging and resource naming conventions.
- Regularly review policies and connection assets permissions.
13. Limitations and Gotchas
Limits and capabilities vary by region and release. Always verify in official docs and your tenancy’s service limits page.
Known limitations (categories)
- Connector availability: not every data source is supported natively; some require staging via Object Storage or database links. Verify connector list.
- Private networking: private endpoint support depends on the connector and service capabilities; validate before committing to architecture.
- Advanced orchestration: complex branching/looping and event triggers may be limited compared to dedicated orchestrators.
- Real-time CDC: Data Integration is generally a batch integration tool; for CDC replication, evaluate GoldenGate.
Quotas and concurrency
- Workspaces, projects, tasks, and concurrent runs may have limits.
- Concurrency spikes during backfills can hit limits and increase costs.
Regional constraints
- Workspaces are regional; cross-region pipelines require explicit patterns and may incur data transfer charges.
Pricing surprises
- Backfills can run for hours/days, driving consumption.
- Many small files can inflate processing overhead and Object Storage request costs.
- Storing extensive logs externally can add Logging costs.
Compatibility issues
- CSV/JSON schema drift: header changes can break mappings.
- Date/time parsing differences between source formats and database types.
- Character set issues (UTF-8 vs other encodings) if files originate from legacy systems.
Operational gotchas
- Re-runs can cause duplicates without idempotent design.
- Credential rotations can silently break scheduled loads if not updated.
- Lack of standardized naming makes incident response slower.
Migration challenges
- Migrating from ODI/Informatica/Talend may require redesign:
- Different transformation semantics
- Different operational model (managed vs self-hosted)
- Different scheduling/orchestration patterns
Vendor-specific nuances
- Oracle database targets can be very fast, but you must still design:
- Load strategy (append vs merge)
- Index maintenance timing
- Constraints handling
14. Comparison with Alternatives
Data Integration is one option in a broader integration and data engineering toolbox.
Alternatives in Oracle Cloud (OCI)
- Oracle GoldenGate (OCI): best for real-time CDC replication.
- OCI Data Flow: serverless Apache Spark jobs for code-first transformations.
- Oracle Integration: iPaaS for application/SaaS integration and process automation (not a data engineering ETL tool first).
- OCI Data Catalog: metadata management and governance (complements DI; not an ETL runtime).
Alternatives in other clouds
- AWS Glue: managed ETL + data catalog integration.
- Azure Data Factory: orchestration + connectors + mapping data flows.
- Google Cloud Data Fusion / Dataflow: visual pipeline (Data Fusion) and managed stream/batch processing (Dataflow).
Open-source / self-managed
- Apache Airflow (self-managed or managed elsewhere): orchestration (not ETL itself).
- Apache NiFi: flow-based ingestion.
- dbt: SQL-based transformations in the warehouse (often complements ingestion tools).
- Spark on Kubernetes: maximum control, maximum ops overhead.
Comparison table
| Option | Best For | Strengths | Weaknesses | When to Choose |
|---|---|---|---|---|
| OCI Data Integration | Batch ingestion + transformation in OCI | Managed design/runtime, OCI-native governance, good fit with ADW/Object Storage | Connector/networking constraints, not CDC-first, orchestration depth may be limited | You want OCI-governed ETL/ELT without running servers |
| Oracle GoldenGate (OCI) | Real-time replication / CDC | Low-latency change capture, replication patterns | More specialized, can be costlier/complex | You need near real-time data movement with CDC |
| OCI Data Flow | Code-first Spark processing | Flexible, scalable Spark jobs | More engineering/ops than visual ETL | You need custom Spark logic beyond visual transforms |
| Oracle Integration | App/SaaS integration | SaaS adapters, process automation | Not designed primarily for large-scale data engineering | You integrate applications and events, not bulk analytics loads |
| AWS Glue | ETL on AWS | Strong AWS ecosystem integration | Different cloud; migration overhead | Your platform is AWS-first |
| Azure Data Factory | Data integration on Azure | Mature orchestration + connectors | Different cloud | Your platform is Azure-first |
| Airflow (self-managed) | Orchestration across tools | Very flexible DAG orchestration | You manage infra and reliability | You need multi-tool orchestration and have ops maturity |
15. Real-World Example
Enterprise example: retail analytics modernization
Problem: A retail enterprise has:
– Oracle E-Business Suite/ERP on Oracle Database
– Daily store sales files landing in Object Storage
– A mandate to build a governed analytics platform on Autonomous Data Warehouse
They need consistent, auditable pipelines with environment separation and access controls.
Proposed architecture
– Object Storage:
– /raw/pos/ for store extracts
– /raw/erp/ for exports
– /curated/ for standardized datasets
– OCI Data Integration:
– Separate workspaces per environment (dev/test/prod)
– Projects per domain: sales, inventory, customer
– Pipelines orchestrating: ingest → transform → load ADW → validate
– Autonomous Data Warehouse:
– Staging schema + curated marts
– Governance:
– IAM policies per team
– Tags for cost allocation
– Audit reviews for production changes
Why Data Integration was chosen:
– Visual development accelerates delivery across multiple teams.
– OCI-native IAM and compartments align with enterprise governance.
– Managed runtime reduces operational burden versus self-hosted ETL servers.
Expected outcomes:
– Reduced pipeline failures via standardized orchestration and monitoring
– Faster onboarding for new subject areas
– Improved auditability and controlled promotion to production
– Predictable costs through tagging and run discipline
Startup/small-team example: SaaS product usage analytics
Problem: A startup collects daily usage exports (CSV) and wants to build KPIs in a warehouse without hiring a full-time platform engineer to manage ETL servers.
Proposed architecture:
– Object Storage bucket receives daily exports from the application.
– OCI Data Integration:
– Single workspace for staging + transformations
– A small set of data flows loading into ADW
– Autonomous Database (Always Free initially; scale up later):
– Simple schema for dashboards
Why Data Integration was chosen:
– Minimal infrastructure management.
– Quick to build and change transformations.
– Strong fit with the OCI-native services already used by the startup.
Expected outcomes:
– Working dashboards in days rather than weeks
– Low operational overhead
– Smooth scaling path as data volume grows
16. FAQ
1) Is Oracle Cloud Data Integration the same as Oracle Data Integrator (ODI)?
No. OCI Data Integration is a managed OCI service. ODI is a separate product (often on-prem or self-managed on cloud). Validate product scope in Oracle docs for your exact environment.
2) Is Data Integration the same as Oracle Integration?
No. Oracle Integration is an iPaaS focused on application integration and process automation. Data Integration is focused on data ingestion and transformation pipelines.
3) Do I need to run servers or clusters for Data Integration?
Typically no—Data Integration is managed. You design and run jobs; Oracle manages service infrastructure. Verify runtime characteristics and limits in official docs.
4) Is Data Integration regional?
Yes, workspaces are created in a specific OCI region. Keep sources/targets in-region when possible.
5) Can Data Integration load into Autonomous Data Warehouse?
Yes, ADW is a common target. You configure a connection and load into tables.
6) Can Data Integration read files from Object Storage?
Yes, Object Storage is a common source/landing zone for CSV and other file-based ingestion patterns (format support depends on connector features).
7) How do I schedule pipelines?
Scheduling options depend on current service features and your chosen approach. If native scheduling is limited for your needs, orchestrate runs externally (for example, with OCI services or CI/CD). Verify current scheduling features in docs.
8) How do I implement incremental loads?
Common patterns include:
– Partitioned loads by date
– Change-tracking columns in source tables
– Staging + merge/upsert in the database
Exact implementation depends on connectors and transformation features.
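A common watermark-based incremental read can be sketched in SQL — `LAST_MODIFIED`, `SOURCE_CUSTOMERS`, and the `ETL_WATERMARKS` control table are assumptions for illustration, not Data Integration features:

```sql
-- Read only rows changed since the last successful run.
SELECT c.*
FROM   source_customers c
WHERE  c.last_modified >
       (SELECT w.last_loaded_at
        FROM   etl_watermarks w
        WHERE  w.table_name = 'SOURCE_CUSTOMERS');

-- After a successful load, advance the watermark:
UPDATE etl_watermarks
SET    last_loaded_at = SYSTIMESTAMP
WHERE  table_name = 'SOURCE_CUSTOMERS';
```

Advancing the watermark only after a verified load keeps the pattern safe to re-run.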
9) Does Data Integration support CDC?
Data Integration is generally batch-oriented. For CDC replication, evaluate Oracle GoldenGate.
10) Can I deploy the same pipeline to dev/test/prod?
Yes, typically by using separate workspaces and consistent naming/parameters. Confirm current promotion/export/import capabilities in your tenancy.
11) How do I secure database credentials used by Data Integration?
Use dedicated DB users with least privileges and restrict access to connection assets. Use OCI Vault patterns where supported by your connector model.
12) What’s the best way to avoid duplicates when re-running?
Use idempotent patterns:
– Truncate-and-load for full refresh tables
– Partition overwrite
– Merge/upsert keyed by business key and effective dates
13) How do I monitor failures?
Use Data Integration run history and error details. For production, integrate job outcomes with your alerting process (Notifications/alarms patterns vary—verify available integrations).
14) How do I estimate costs?
Use the official pricing page and the OCI Cost Estimator:
– https://www.oracle.com/cloud/price-list/
– https://www.oracle.com/cloud/costestimator.html
Then validate by running a small workload and reviewing Billing → Cost Analysis.
15) What’s the easiest beginner lab?
Load a small CSV from Object Storage into an Always Free Autonomous Database table (the lab in this tutorial).
17. Top Online Resources to Learn Data Integration
| Resource Type | Name | Why It Is Useful |
|---|---|---|
| Official documentation | OCI Data Integration Docs | Source of truth for concepts, features, limits, and how-to steps. https://docs.oracle.com/en-us/iaas/data-integration/home.htm |
| Official documentation | OCI IAM Docs | Required for correct policies, compartments, dynamic groups, and security model. https://docs.oracle.com/en-us/iaas/Content/Identity/home.htm |
| Official pricing | OCI Price List | Find Data Integration pricing dimensions for your region. https://www.oracle.com/cloud/price-list/ |
| Official calculator | OCI Cost Estimator | Model Data Integration + Object Storage + DB costs. https://www.oracle.com/cloud/costestimator.html |
| Official free tier | OCI Free Tier | Reduce lab cost; check what is Always Free. https://www.oracle.com/cloud/free/ |
| Architecture guidance | OCI Architecture Center | Reference architectures and best practices (search for data integration and analytics patterns). https://docs.oracle.com/en/solutions/ |
| Tutorials | OCI Tutorials (Oracle) | Step-by-step labs for OCI services; search for Data Integration. https://docs.oracle.com/en/learn/ |
| Videos | Oracle Cloud YouTube channel | Product overviews and demos; verify freshness by date. https://www.youtube.com/@OracleCloudInfrastructure |
| Samples | Oracle GitHub (official org) | Some OCI services provide samples; search repositories for “data integration”. https://github.com/oracle |
| Community (reputable) | Oracle Cloud Customer Connect | Practical discussions and Q&A; validate answers against the docs. https://cloudcustomerconnect.oracle.com/ |
18. Training and Certification Providers
| Institute | Suitable Audience | Likely Learning Focus | Mode | Website URL |
|---|---|---|---|---|
| DevOpsSchool.com | Engineers, DevOps, cloud practitioners | Cloud/DevOps training; may include OCI and integration fundamentals | Check website | https://www.devopsschool.com/ |
| ScmGalaxy.com | Beginners to intermediate IT professionals | DevOps, SCM, automation foundations that support integration operations | Check website | https://www.scmgalaxy.com/ |
| CloudOpsNow.in | Cloud ops and platform teams | Cloud operations practices, monitoring, governance | Check website | https://www.cloudopsnow.in/ |
| SreSchool.com | SREs, operations, reliability engineers | Reliability engineering practices for running production pipelines | Check website | https://www.sreschool.com/ |
| AiOpsSchool.com | Ops and platform teams exploring AIOps | Monitoring/automation concepts that can complement data pipeline ops | Check website | https://www.aiopsschool.com/ |
19. Top Trainers
| Platform/Site | Likely Specialization | Suitable Audience | Website URL |
|---|---|---|---|
| RajeshKumar.xyz | DevOps/cloud training content (verify current offerings) | Beginners to intermediate | https://rajeshkumar.xyz/ |
| devopstrainer.in | DevOps training and coaching (verify OCI coverage) | DevOps engineers and students | https://www.devopstrainer.in/ |
| devopsfreelancer.com | Freelance DevOps/platform help (verify services) | Teams needing short-term guidance | https://www.devopsfreelancer.com/ |
| devopssupport.in | DevOps support/training style services (verify scope) | Ops teams and learners | https://www.devopssupport.in/ |
20. Top Consulting Companies
| Company | Likely Service Area | Where They May Help | Consulting Use Case Examples | Website URL |
|---|---|---|---|---|
| cotocus.com | Cloud/DevOps consulting (verify service catalog) | Architecture, implementation support, operations setup | Landing zone setup, CI/CD for data pipelines, monitoring runbooks | https://cotocus.com/ |
| DevOpsSchool.com | Training + consulting (verify offerings) | Enablement + implementation guidance | Platform standardization, governance/tagging strategy, operational maturity | https://www.devopsschool.com/ |
| DEVOPSCONSULTING.IN | DevOps consulting (verify service catalog) | Delivery assistance and ops processes | Automation, infrastructure-as-code support, operational playbooks | https://www.devopsconsulting.in/ |
21. Career and Learning Roadmap
What to learn before Data Integration
- OCI fundamentals:
- Tenancy, compartments, IAM users/groups/policies
- VCN basics (subnets, routing, security lists/NSGs)
- Object Storage concepts (buckets, namespaces, lifecycle)
- Data fundamentals:
- Relational modeling, SQL basics
- CSV/file formats and schema basics
- ETL vs ELT patterns
- Security basics:
- Least privilege
- Secret management patterns
What to learn after Data Integration
- Advanced analytics architecture:
- Data lakehouse patterns on OCI
- Dimensional modeling (Kimball) and data vault concepts
- Orchestration and platform engineering:
- CI/CD for data pipelines (Git-based workflows)
- Testing strategies for data (unit tests, reconciliation)
- Specialized tools:
- Oracle GoldenGate for CDC
- OCI Data Flow for advanced Spark workloads
- Data governance tools (OCI Data Catalog) for lineage and discovery
Job roles that use it
- Data Engineer (OCI)
- Analytics Engineer
- Cloud Engineer / Platform Engineer (data platform)
- DevOps/SRE supporting data pipelines
- Solution Architect (data and analytics)
Certification path (if available)
Oracle certification offerings change over time. For current OCI certification paths, verify on Oracle University: https://education.oracle.com/
Look for OCI-focused tracks related to data management, integration, and analytics.
Project ideas for practice
- Build a raw-to-curated pipeline with partitioned loads (daily folders).
- Implement an idempotent load pattern (staging + merge).
- Add validation steps (row counts, null checks) and a failure notification pattern.
- Create separate dev/prod workspaces and practice promoting artifacts.
- Cost governance: tag everything and produce a weekly cost report by tag.
22. Glossary
- ADW (Autonomous Data Warehouse): Oracle’s managed analytics database service on OCI.
- ATP (Autonomous Transaction Processing): Oracle’s managed transactional database service on OCI.
- Bucket: Object Storage container for objects (files).
- Compartment: OCI logical container for resources and access control.
- Control plane: Management layer (create/update/run configuration).
- Data asset / connection: Definition of a source/target system and how to connect to it.
- Data flow: A transformation pipeline that reads, transforms, and writes data.
- Data plane/runtime: Execution layer that moves/transforms data.
- ETL/ELT: Extract-Transform-Load / Extract-Load-Transform integration patterns.
- IAM policy: Rules that define who can do what to which resources.
- Idempotent load: A load that can be rerun without creating duplicates or incorrect results.
- Object Storage namespace: Tenancy-level identifier used in Object Storage endpoints.
- Pipeline: Orchestration of tasks with dependencies and run order.
- Run / work request: Execution record of a task/pipeline.
- VCN: Virtual Cloud Network—your private network in OCI.
23. Summary
Oracle Cloud Data Integration is OCI’s managed service in the Integration category for building and operating batch-oriented data ingestion and transformation pipelines. It fits best when you want OCI-native governance (IAM/compartments/tags), a visual development experience, and a managed runtime—especially for common patterns like Object Storage to Autonomous Database loads.
Cost is primarily driven by job execution consumption (verify exact billing units on the official pricing page) and by dependent services like Object Storage and Autonomous Database. Security and compliance depend on least-privilege IAM policies, careful credential handling, network design (public vs private endpoints), and using Audit/run history for traceability.
Use Data Integration when you need governed ETL/ELT in OCI; consider GoldenGate for CDC and Data Flow for code-first Spark. Next, deepen skills by implementing idempotent patterns, environment promotion, and operational monitoring runbooks—then validate everything against the official docs: https://docs.oracle.com/en-us/iaas/data-integration/home.htm