{"id":917,"date":"2026-04-16T16:51:28","date_gmt":"2026-04-16T16:51:28","guid":{"rendered":"https:\/\/www.devopsschool.com\/tutorials\/oracle-cloud-data-integration-tutorial-architecture-pricing-use-cases-and-hands-on-guide-for-integration\/"},"modified":"2026-04-16T16:51:28","modified_gmt":"2026-04-16T16:51:28","slug":"oracle-cloud-data-integration-tutorial-architecture-pricing-use-cases-and-hands-on-guide-for-integration","status":"publish","type":"post","link":"https:\/\/www.devopsschool.com\/tutorials\/oracle-cloud-data-integration-tutorial-architecture-pricing-use-cases-and-hands-on-guide-for-integration\/","title":{"rendered":"Oracle Cloud Data Integration Tutorial: Architecture, Pricing, Use Cases, and Hands-On Guide for Integration"},"content":{"rendered":"\n<h2 class=\"wp-block-heading\">Category<\/h2>\n\n\n\n<p>Integration<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">1. Introduction<\/h2>\n\n\n\n<p>Oracle Cloud <strong>Data Integration<\/strong> is a managed service in Oracle Cloud Infrastructure (OCI) for building, running, and monitoring data movement and transformation workflows\u2014commonly called ETL\/ELT\u2014across Oracle and non-Oracle systems.<\/p>\n\n\n\n<p>In simple terms: <strong>Data Integration helps you pull data from one place, clean\/transform it, and load it into another place<\/strong>, using a visual designer and managed execution so you don\u2019t have to run your own ETL servers.<\/p>\n\n\n\n<p>Technically, Data Integration provides a <strong>workspace-based design environment<\/strong> (projects, folders, data assets\/connections, tasks, and pipelines) plus a <strong>managed runtime<\/strong> for executing data flows and orchestration. It integrates with OCI Identity and Access Management (IAM), compartments, policies, and OCI governance services such as Audit. 
You typically use it to implement ingestion into analytics platforms (like Autonomous Data Warehouse), operational reporting stores, or curated data lakes.<\/p>\n\n\n\n<p>The problem it solves: teams need repeatable, secure, observable, cost-controlled ways to integrate data across applications and databases\u2014without building a patchwork of scripts, cron jobs, and long-lived ETL servers.<\/p>\n\n\n\n<blockquote>\n<p>Naming check: The service is commonly referred to as <strong>OCI Data Integration<\/strong> in official Oracle documentation. It is distinct from <strong>Oracle Data Integrator (ODI)<\/strong> (a separate product) and from <strong>Oracle Integration<\/strong> (application integration\/iPaaS). This tutorial focuses only on <strong>Oracle Cloud (OCI) Data Integration<\/strong>.<\/p>\n<\/blockquote>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">2. What is Data Integration?<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Official purpose (what Oracle positions it to do)<\/h3>\n\n\n\n<p>Oracle Cloud <strong>Data Integration<\/strong> is a <strong>fully managed<\/strong> cloud service for <strong>designing and running data pipelines<\/strong> that ingest, transform, and load data between heterogeneous sources and targets. 
It is intended to support common data engineering patterns\u2014batch ingestion, transformations, incremental loads (where supported by source\/target patterns), and orchestration\u2014using a visual, metadata-driven approach.<\/p>\n\n\n\n<p>For the canonical definition and current scope, verify in the official docs:<br\/>\nhttps:\/\/docs.oracle.com\/en-us\/iaas\/data-integration\/home.htm<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Core capabilities (high level)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Design-time tooling<\/strong> in the OCI Console: create projects, define sources\/targets, build data flows and pipelines.<\/li>\n<li><strong>Connections to data systems<\/strong> via \u201cdata assets\u201d (connectors vary by environment and Oracle updates; verify supported connectors in docs for your region).<\/li>\n<li><strong>Transformations<\/strong> using data-flow steps (select, filter, join, aggregate, derive columns, mapping, etc.\u2014exact transformation set depends on current release).<\/li>\n<li><strong>Orchestration<\/strong> with pipelines: chain tasks, manage dependencies, handle failures and retries (capabilities vary; verify current pipeline controls in docs).<\/li>\n<li><strong>Operational execution and monitoring<\/strong>: run tasks, view runs, check statuses, troubleshoot failures.<\/li>\n<li><strong>OCI-native governance<\/strong>: IAM policies, compartments, tagging, Audit integration.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Major components (how you work with the service)<\/h3>\n\n\n\n<p>While exact UI labels evolve, the core concepts in Data Integration generally include:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Workspace<\/strong>: the top-level container where you design and operate. 
Usually created per environment (dev\/test\/prod) and per domain\/team.<\/li>\n<li><strong>Projects and folders<\/strong>: organize integration assets by subject area (finance, customer, telemetry, etc.).<\/li>\n<li><strong>Data assets \/ connections<\/strong>: represent sources\/targets and how to connect (credentials, endpoints, wallets, etc.).<\/li>\n<li><strong>Tasks<\/strong>:<ul>\n<li><strong>Data flows<\/strong>: transformation logic (mapping and shaping data).<\/li>\n<li><strong>Pipelines<\/strong>: orchestration logic (sequence, dependency, branching where supported).<\/li>\n<li>Other task types may exist depending on current release; verify in docs.<\/li>\n<\/ul>\n<\/li>\n<li><strong>Applications \/ publications<\/strong> (if present in your tenancy): promote or package artifacts for deployment between environments. Verify the current lifecycle model in official docs.<\/li>\n<li><strong>Work requests \/ runs<\/strong>: execution records you monitor for success\/failure and runtime metrics.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Service type<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Managed cloud service<\/strong> (serverless-style from the user perspective): you design and trigger jobs; Oracle operates the underlying service components.<\/li>\n<li>Strongly aligned with the <strong>Integration<\/strong> category, but focused specifically on <strong>data integration<\/strong> rather than application\/event integration.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Scope: regional vs global; tenancy\/compartment model<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Data Integration is an <strong>OCI regional service<\/strong>: a workspace exists in a specific OCI region.<\/li>\n<li>Resources are governed using <strong>tenancy<\/strong>, <strong>compartments<\/strong>, and <strong>IAM policies<\/strong>.<\/li>\n<li>You usually design separate workspaces per region and per environment.<\/li>\n<\/ul>\n\n\n\n<p>Always validate current regional 
availability in OCI documentation and the OCI Console region selector.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">How it fits into the Oracle Cloud ecosystem<\/h3>\n\n\n\n<p>Data Integration often sits in the middle of these OCI building blocks:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Sources\/targets<\/strong>: Autonomous Database (ATP\/ADW), Oracle Database on OCI, Object Storage, and potentially other supported systems\/connectors.<\/li>\n<li><strong>Data lake and analytics<\/strong>: Object Storage (raw\/curated zones), Autonomous Data Warehouse, Oracle Analytics Cloud (downstream).<\/li>\n<li><strong>Governance<\/strong>: IAM policies, compartments, tagging, Audit.<\/li>\n<li><strong>Operations<\/strong>: OCI Monitoring\/Logging (where supported), Notifications\/Alarms around job states (often via integration patterns; verify supported hooks).<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">3. Why use Data Integration?<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Business reasons<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Faster delivery of data pipelines<\/strong>: visual development and reusable assets reduce time-to-value.<\/li>\n<li><strong>Lower operational overhead<\/strong>: less infrastructure to manage compared to self-hosted ETL servers.<\/li>\n<li><strong>Standardization<\/strong>: consistent patterns for ingestion and transformations across teams.<\/li>\n<li><strong>Auditability<\/strong>: better traceability than scattered scripts.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Technical reasons<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Metadata-driven development<\/strong>: organize connections, schemas, and tasks as managed artifacts.<\/li>\n<li><strong>Repeatable orchestration<\/strong>: schedule\/trigger workflows (depending on available scheduling features and your orchestration approach).<\/li>\n<li><strong>OCI-native integration<\/strong>: works 
naturally with compartments, IAM, and OCI database services.<\/li>\n<li><strong>Separation of design and execution<\/strong>: build once, run reliably.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Operational reasons<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Central monitoring<\/strong>: view execution status, runs, failures, and (where available) logs.<\/li>\n<li><strong>Environment separation<\/strong>: manage dev\/test\/prod with compartments and workspaces.<\/li>\n<li><strong>Governance<\/strong>: tagging, access control, and audit trails.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Security\/compliance reasons<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>IAM-based access<\/strong>: least-privilege policies per compartment\/team.<\/li>\n<li><strong>Audit events<\/strong>: OCI Audit captures relevant API activity.<\/li>\n<li><strong>Network control patterns<\/strong>: can be paired with private endpoints and VCN designs depending on sources\/targets (verify per connector).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Scalability\/performance reasons<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Managed scaling<\/strong>: avoids fixed-capacity ETL servers.<\/li>\n<li><strong>Parallelism patterns<\/strong>: data flows typically support distributed processing patterns for transformations (verify current runtime details and limits).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">When teams should choose Data Integration<\/h3>\n\n\n\n<p>Choose Oracle Cloud Data Integration when:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>You need <strong>batch ingestion + transformation<\/strong> in OCI.<\/li>\n<li>Your primary targets are <strong>Autonomous Data Warehouse<\/strong> or other OCI data platforms.<\/li>\n<li>You want <strong>OCI-governed<\/strong> pipelines managed via compartments\/IAM.<\/li>\n<li>You want to reduce custom scripting and improve reliability.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">When teams should not choose 
it<\/h3>\n\n\n\n<p>Consider alternatives when:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>You need <strong>real-time CDC replication<\/strong> with low latency (evaluate <strong>Oracle GoldenGate<\/strong> for OCI).<\/li>\n<li>You need event-driven application integration and SaaS connectors at the application workflow level (evaluate <strong>Oracle Integration<\/strong>).<\/li>\n<li>You need fully custom Spark control, notebooks, or bespoke code-first pipelines (evaluate <strong>OCI Data Flow<\/strong>, or code-first orchestration like Airflow on Kubernetes\/Compute).<\/li>\n<li>You need complex cross-cloud networking patterns that aren\u2019t supported by the connectors\/runtime model (validate connector and networking support first).<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">4. Where is Data Integration used?<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Industries<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Financial services: daily regulatory reports, risk aggregation, customer 360.<\/li>\n<li>Retail\/e-commerce: sales\/returns analytics, inventory reconciliation, clickstream batch ingestion.<\/li>\n<li>Healthcare\/life sciences: claims data normalization, batch de-identification staging (with strict governance).<\/li>\n<li>Telecom: CDR aggregation, churn analytics.<\/li>\n<li>Manufacturing\/IoT: batch ingestion of telemetry files, quality metrics.<\/li>\n<li>Public sector: data consolidation for reporting, data lake standardization.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Team types<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Data engineering teams building ingestion\/transform pipelines.<\/li>\n<li>Platform teams standardizing integration patterns.<\/li>\n<li>Analytics engineering teams curating dimensional models.<\/li>\n<li>DevOps\/SRE teams supporting reliability and cost governance.<\/li>\n<li>Security teams enforcing IAM, encryption, and auditing.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Workloads<\/h3>\n\n\n\n<ul 
class=\"wp-block-list\">\n<li>Batch ELT\/ETL: nightly loads, hourly loads, backfills.<\/li>\n<li>Data lake zone processing: raw \u2192 staged \u2192 curated.<\/li>\n<li>Warehouse loading: star schema, slowly changing dimensions (implementation depends on design patterns).<\/li>\n<li>Operational reporting extracts.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Architectures<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Lakehouse-style: Object Storage as lake + curated ADW marts.<\/li>\n<li>Hub-and-spoke integration: standardize ingestion into a central curated store.<\/li>\n<li>Multi-compartment enterprise governance: separate domains with shared platform.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Real-world deployment contexts<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Production<\/strong>: scheduled and monitored, strict IAM, alarms, runbooks, cost controls.<\/li>\n<li><strong>Dev\/test<\/strong>: smaller datasets, sandbox workspaces, experimental transformations.<\/li>\n<li><strong>Migration<\/strong>: moving from ODI\/Informatica\/Talend-style on-prem ETL to managed OCI patterns.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">5. 
Top Use Cases and Scenarios<\/h2>\n\n\n\n<p>Below are realistic scenarios where Oracle Cloud Data Integration is commonly a good fit.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">1) Load CSV files from Object Storage into Autonomous Data Warehouse<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem:<\/strong> Analysts drop files into a bucket; the warehouse needs structured tables.<\/li>\n<li><strong>Why Data Integration fits:<\/strong> Visual mapping + managed runs; integrates naturally with OCI.<\/li>\n<li><strong>Example:<\/strong> Daily <code>orders_YYYYMMDD.csv<\/code> lands in Object Storage \u2192 Data Integration loads into <code>DW_ORDERS_STAGE<\/code>.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">2) Curate a raw data lake into a \u201csilver\u201d zone<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem:<\/strong> Raw files are messy (types, missing fields, inconsistent formats).<\/li>\n<li><strong>Why it fits:<\/strong> Data flow transformations can standardize and validate data.<\/li>\n<li><strong>Example:<\/strong> Raw JSON exports \u2192 normalized Parquet-like structures (format support varies; verify) \u2192 curated bucket prefix.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">3) Join multiple source tables into a reporting mart<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem:<\/strong> Reporting needs denormalized tables for BI.<\/li>\n<li><strong>Why it fits:<\/strong> Visual join\/aggregate transformations.<\/li>\n<li><strong>Example:<\/strong> Join customers + orders + payments \u2192 <code>MART_CUSTOMER_REVENUE<\/code>.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">4) Standardize dimensions (conformed dimensions)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem:<\/strong> Different systems represent \u201cproduct\u201d differently.<\/li>\n<li><strong>Why it fits:<\/strong> Central transformation logic with reusable components.<\/li>\n<li><strong>Example:<\/strong> ERP products + 
e-commerce products mapped to a single <code>DIM_PRODUCT<\/code>.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">5) Batch ingestion from Oracle Database into ADW<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem:<\/strong> Operational Oracle DB data must be copied nightly to analytics.<\/li>\n<li><strong>Why it fits:<\/strong> Strong Oracle-to-Oracle integration patterns; managed credentials and execution.<\/li>\n<li><strong>Example:<\/strong> Nightly extract of <code>SALES_TXN<\/code> \u2192 transform \u2192 load into <code>DW_SALES_FACT<\/code>.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">6) Mask or tokenize data before analytics (basic patterns)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem:<\/strong> Sensitive fields must not reach general analytics tables.<\/li>\n<li><strong>Why it fits:<\/strong> Transformations can remove\/hash fields; governance via compartments and IAM.<\/li>\n<li><strong>Example:<\/strong> Hash email, truncate addresses, remove SSNs before loading curated tables. 
(For strong masking, evaluate Oracle Data Safe and database-native controls too.)<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">7) Build a parameterized pipeline for multiple regions\/business units<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem:<\/strong> Same pipeline must run for BU=A, BU=B with different source paths.<\/li>\n<li><strong>Why it fits:<\/strong> Parameterization patterns reduce duplication (verify exact parameter features).<\/li>\n<li><strong>Example:<\/strong> <code>source_prefix=\/raw\/bu=${BU}\/<\/code> parameter drives the ingestion path.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">8) Backfill historical data with controlled runs<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem:<\/strong> Need to load 2 years of historical files without breaking production.<\/li>\n<li><strong>Why it fits:<\/strong> Managed runs + organized projects; easier run tracking and retry.<\/li>\n<li><strong>Example:<\/strong> Run pipeline per month partition, validate counts, then proceed.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">9) Data quality checks as part of pipeline (basic validation)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem:<\/strong> Downstream BI breaks when null rates spike or schema changes.<\/li>\n<li><strong>Why it fits:<\/strong> Add validation steps and fail-fast patterns (implementation depends on supported transforms).<\/li>\n<li><strong>Example:<\/strong> If <code>order_id<\/code> null rate &gt; 0, stop pipeline and notify.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">10) Replace cron + SQL scripts with governed orchestration<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem:<\/strong> \u201cWorks on my VM\u201d scripts are hard to maintain and audit.<\/li>\n<li><strong>Why it fits:<\/strong> Centralized jobs, IAM access, run tracking, and repeatability.<\/li>\n<li><strong>Example:<\/strong> Replace shell scripts that call SQL*Plus with a pipeline that 
runs consistently.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">6. Core Features<\/h2>\n\n\n\n<blockquote>\n<p>Feature availability can change by region and over time. For the most accurate list, verify in official docs: https:\/\/docs.oracle.com\/en-us\/iaas\/data-integration\/home.htm<\/p>\n<\/blockquote>\n\n\n\n<h3 class=\"wp-block-heading\">Workspaces (design + operations boundary)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does:<\/strong> Provides an isolated environment to create and operate integration artifacts.<\/li>\n<li><strong>Why it matters:<\/strong> Enables clean separation between teams and environments.<\/li>\n<li><strong>Practical benefit:<\/strong> Easier governance (IAM\/tagging), predictable organization.<\/li>\n<li><strong>Caveats:<\/strong> Workspaces are regional; cross-region designs need explicit data movement patterns.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Projects and folders (asset organization)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does:<\/strong> Lets you group pipelines, flows, connections, and related artifacts.<\/li>\n<li><strong>Why it matters:<\/strong> Keeps large integration estates manageable.<\/li>\n<li><strong>Practical benefit:<\/strong> Teams can align projects to domains (Finance, HR, Sales).<\/li>\n<li><strong>Caveats:<\/strong> Organization doesn\u2019t replace IAM; use compartments and policies for access control.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Data assets \/ connections (source\/target definitions)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does:<\/strong> Stores metadata and connection details for sources\/targets.<\/li>\n<li><strong>Why it matters:<\/strong> Reuse connection definitions and manage credentials centrally.<\/li>\n<li><strong>Practical benefit:<\/strong> Faster onboarding; fewer hardcoded secrets in scripts.<\/li>\n<li><strong>Caveats:<\/strong> Supported 
connectors vary; validate that your exact source\/target and auth method are supported.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Data flows (transformations)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does:<\/strong> Implements transformation logic\u2014mapping columns, filtering, joining, aggregating, deriving fields.<\/li>\n<li><strong>Why it matters:<\/strong> Converts raw data into analytics-ready datasets.<\/li>\n<li><strong>Practical benefit:<\/strong> Visual logic is easier to review and maintain than ad-hoc scripts for many teams.<\/li>\n<li><strong>Caveats:<\/strong> Not every transformation pattern is available visually; complex logic might require database-side SQL transformations or alternative services.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Pipelines (orchestration)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does:<\/strong> Orchestrates multiple tasks with dependencies.<\/li>\n<li><strong>Why it matters:<\/strong> Real pipelines need steps: ingest \u2192 transform \u2192 load \u2192 validate \u2192 publish.<\/li>\n<li><strong>Practical benefit:<\/strong> One place to manage execution order and outcomes.<\/li>\n<li><strong>Caveats:<\/strong> Advanced branching\/looping patterns may be limited; verify current orchestration capabilities.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Parameterization and reusability (where supported)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does:<\/strong> Allows using parameters for environment-specific or run-specific values (paths, table names, dates).<\/li>\n<li><strong>Why it matters:<\/strong> Promotes reuse and reduces duplication.<\/li>\n<li><strong>Practical benefit:<\/strong> Same pipeline can run for different partitions or BUs.<\/li>\n<li><strong>Caveats:<\/strong> Parameter scoping rules and supported parameter types vary\u2014confirm in docs.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Execution management and 
run history<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does:<\/strong> Provides a history of runs (status, timing, failures).<\/li>\n<li><strong>Why it matters:<\/strong> Troubleshooting depends on visibility.<\/li>\n<li><strong>Practical benefit:<\/strong> Faster root cause analysis than searching through VM logs.<\/li>\n<li><strong>Caveats:<\/strong> Log detail and retention may vary; confirm how to export logs and what is retained.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">IAM integration (compartment-based governance)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does:<\/strong> Uses OCI IAM for authentication\/authorization to create\/manage DI resources.<\/li>\n<li><strong>Why it matters:<\/strong> Least privilege and separation of duties.<\/li>\n<li><strong>Practical benefit:<\/strong> Platform teams can restrict production changes.<\/li>\n<li><strong>Caveats:<\/strong> Access to external data sources also requires correct policies and networking patterns.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">OCI Audit integration<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does:<\/strong> Records relevant API events for governance and compliance.<\/li>\n<li><strong>Why it matters:<\/strong> You need a trail of who changed what.<\/li>\n<li><strong>Practical benefit:<\/strong> Supports compliance controls and investigations.<\/li>\n<li><strong>Caveats:<\/strong> Audit captures control-plane events, not necessarily every row-level data operation.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">7. Architecture and How It Works<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">High-level service architecture<\/h3>\n\n\n\n<p>Data Integration typically separates concerns into:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Control plane:<\/strong> UI\/API actions (create workspace, define assets, run tasks). 
Governed by IAM, recorded by Audit.<\/li>\n<li><strong>Data plane (runtime):<\/strong> Executes flows\/pipelines and reads\/writes data to configured systems.<\/li>\n<\/ul>\n\n\n\n<p>You design integrations as artifacts in a workspace. When you trigger a run, the managed runtime connects to your sources\/targets using the configuration and credentials, performs transformations, and writes results. Operational metadata (run status) is tracked for monitoring and troubleshooting.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Request \/ data \/ control flow (conceptual)<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li>User (or automation) calls OCI APIs \/ Console to create and configure DI artifacts.<\/li>\n<li>User triggers a task run (manual, scheduled, or programmatic; verify scheduling and APIs).<\/li>\n<li>DI runtime reads from source(s), transforms, writes to target(s).<\/li>\n<li>Run status and logs are stored for inspection; Audit logs capture changes.<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\">Integrations with related OCI services (common patterns)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Object Storage<\/strong>: staging and lake storage for raw\/curated files.<\/li>\n<li><strong>Autonomous Database (ATP\/ADW)<\/strong>: common targets for analytics and marts.<\/li>\n<li><strong>Oracle Database on Compute\/Exadata Cloud Service<\/strong>: operational sources\/targets.<\/li>\n<li><strong>OCI Vault<\/strong>: store secrets\/keys (where supported by the connection model and your design).<\/li>\n<li><strong>IAM\/Compartments\/Tags<\/strong>: governance and access control.<\/li>\n<li><strong>VCN \/ Private Endpoints<\/strong>: private connectivity patterns for databases (depends on target configuration and service capabilities; verify in docs).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Dependency services (what you still need)<\/h3>\n\n\n\n<p>Data Integration does not replace:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Your storage<\/strong> (Object Storage 
buckets) and lifecycle policies.<\/li>\n<li><strong>Your database<\/strong> (Autonomous or DB on OCI) and its scaling\/backups.<\/li>\n<li><strong>Your network architecture<\/strong> (VCNs, subnets, routing, DNS).<\/li>\n<li><strong>Your operations<\/strong> (alerting, runbooks, on-call).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Security\/authentication model (practical view)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Access to Data Integration resources is controlled by <strong>OCI IAM policies<\/strong>.<\/li>\n<li>Access from the runtime to sources\/targets depends on:<ul>\n<li>How the connector authenticates (user\/password, wallet, token, etc.).<\/li>\n<li>Whether the runtime can reach the endpoint (public vs private networking).<\/li>\n<li>Policies that allow required OCI operations (for example, reading objects from a bucket).<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<p>Because IAM policy statements and connector auth differ by scenario, always confirm the exact policy examples for Data Integration in official docs.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Networking model (practical view)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>If your source\/target is public (public Object Storage endpoint, public database endpoint), connectivity is simpler, but may not be acceptable for production security.<\/li>\n<li>For production, many teams prefer <strong>private endpoints<\/strong> and <strong>VCN-only access<\/strong> to databases and services. 
Confirm whether and how Data Integration supports private connectivity in your region and for your connector types.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Monitoring\/logging\/governance considerations<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>OCI Audit<\/strong>: captures API changes and access patterns.<\/li>\n<li><strong>Run monitoring<\/strong>: check task run states and errors in the Data Integration UI.<\/li>\n<li><strong>OCI Logging\/Monitoring<\/strong>: integration points vary; if you need centralized logs\/metrics, verify current capabilities and consider exporting run outcomes to a monitoring system.<\/li>\n<\/ul>\n\n\n\n<h4 class=\"wp-block-heading\">Simple architecture diagram (conceptual)<\/h4>\n\n\n\n<pre><code class=\"language-mermaid\">flowchart LR\n  U[Engineer \/ Analyst] --&gt;|Console\/API| DI[OCI Data Integration Workspace]\n  DI --&gt;|Run Data Flow \/ Pipeline| RT[Managed Runtime]\n  RT --&gt; OS[(OCI Object Storage)]\n  RT --&gt; ADW[(Autonomous Data Warehouse)]\n  DI --&gt; AUD[OCI Audit]\n<\/code><\/pre>\n\n\n\n<h4 class=\"wp-block-heading\">Production-style architecture diagram (more realistic)<\/h4>\n\n\n\n<pre><code class=\"language-mermaid\">flowchart TB\n  subgraph Tenancy[OCI Tenancy]\n    subgraph Net[VCN \/ Networking]\n      DBP[(Private ADW \/ DB Endpoint)]\n      NAT[NAT Gateway or Service Gateway]\n    end\n\n    subgraph Gov[Governance]\n      IAM[IAM Policies &amp; Compartments]\n      AUD[OCI Audit]\n      TAG[Tags \/ Cost Tracking]\n    end\n\n    subgraph Data[Data Layer]\n      OSRAW[(Object Storage - Raw Zone)]\n      OSCUR[(Object Storage - Curated Zone)]\n      DW[(Autonomous Data Warehouse - Marts)]\n    end\n\n    subgraph DI[Data Integration]\n      WS[\"Workspace (Dev\/Test\/Prod)\"]\n      PJ[Projects \/ Folders]\n      DF[Data Flows]\n      PL[Pipelines]\n      RUN[Runs \/ Work Requests]\n    end\n  end\n\n  IAM --&gt; WS\n  WS --&gt; DF\n  WS --&gt; PL\n  DF --&gt; RUN\n  PL --&gt; RUN\n\n 
 RUN --&gt; OSRAW\n  RUN --&gt; OSCUR\n  RUN --&gt; DBP\n  DBP --&gt; DW\n\n  WS --&gt; AUD\n  TAG --&gt; WS\n  NAT --&gt; DBP\n<\/code><\/pre>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">8. Prerequisites<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Tenancy\/account requirements<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>An <strong>Oracle Cloud (OCI) tenancy<\/strong> with permission to use Data Integration in a region where it is available.<\/li>\n<li>A compartment strategy (at minimum: one compartment for this lab).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Permissions \/ IAM roles<\/h3>\n\n\n\n<p>You need permissions to:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Create\/manage Data Integration workspaces and artifacts.<\/li>\n<li>Read\/write to Object Storage (for source\/target files).<\/li>\n<li>Connect to and create objects in the target database (Autonomous Database recommended for the lab).<\/li>\n<\/ul>\n\n\n\n<p>OCI policies for Data Integration use specific resource types and verbs. Because policy syntax can evolve, use Oracle\u2019s official policy examples as the source of truth and adapt to your compartments and groups:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Data Integration documentation home: https:\/\/docs.oracle.com\/en-us\/iaas\/data-integration\/home.htm<\/li>\n<li>OCI IAM policy reference: https:\/\/docs.oracle.com\/en-us\/iaas\/Content\/Identity\/home.htm<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Billing requirements<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Data Integration is a paid OCI service (unless covered by specific promotions). You need a valid billing setup.<\/li>\n<li>If you use Autonomous Database Always Free, that can reduce costs for the target, but Data Integration usage may still generate charges depending on tenancy and region. 
Verify current Free Tier eligibility: https:\/\/www.oracle.com\/cloud\/free\/<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Tools needed<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>OCI Console access (web browser).<\/li>\n<li>Optional but useful:<ul>\n<li><strong>OCI CLI<\/strong> (for Object Storage operations and automation): https:\/\/docs.oracle.com\/en-us\/iaas\/Content\/API\/SDKDocs\/cliinstall.htm<\/li>\n<li>A SQL client for Autonomous Database:<ul>\n<li>Database Actions (web UI) or SQL Developer. Database Actions is typically easiest.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Region availability<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Choose an OCI region where <strong>Data Integration<\/strong> is available.<\/li>\n<li>Confirm via the OCI Console service list or the service documentation.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Quotas\/limits<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>OCI enforces service limits\/quotas (workspaces, runs, concurrency, etc.).<\/li>\n<li>Check in Console: <strong>Governance &amp; Administration \u2192 Limits, Quotas and Usage<\/strong>.<\/li>\n<li>If you hit limits, request an increase via OCI support (process depends on your account).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Prerequisite services for this lab<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Object Storage<\/strong> bucket (for source CSV file).<\/li>\n<li><strong>Autonomous Database<\/strong> (ATP or ADW; Always Free works well for learning).<\/li>\n<li><strong>Data Integration workspace<\/strong>.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">9. Pricing \/ Cost<\/h2>\n\n\n\n<blockquote>\n<p>Do not rely on any blog for pricing numbers. OCI pricing is region-specific and may change. 
Use official pages.<\/p>\n<\/blockquote>\n\n\n\n<h3 class=\"wp-block-heading\">Current pricing model (how costs are typically measured)<\/h3>\n\n\n\n<p>Oracle Cloud <strong>Data Integration<\/strong> pricing is <strong>usage-based<\/strong>. In practice, your bill is driven by:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Data Integration job execution consumption<\/strong> (often measured in compute\/time units for the managed runtime).<br\/>\n<strong>Verify the exact billing metric and unit names<\/strong> (for example, OCPU-hours or equivalent) on the official pricing page for your region.<\/li>\n<li><strong>Underlying services you use<\/strong>:<ul>\n<li>Object Storage capacity and requests<\/li>\n<li>Autonomous Database compute and storage (if not Always Free)<\/li>\n<li>Data transfer (cross-region, internet egress)<\/li>\n<li>Logging\/Monitoring ingestion (if exporting logs)<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Official pricing sources<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>OCI price list (search for \u201cData Integration\u201d): https:\/\/www.oracle.com\/cloud\/price-list\/<\/li>\n<li>OCI Cost Estimator: https:\/\/www.oracle.com\/cloud\/costestimator.html<\/li>\n<li>OCI Free Tier overview (to reduce lab cost): https:\/\/www.oracle.com\/cloud\/free\/<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Pricing dimensions to understand (cost drivers)<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Number of runs and runtime duration<\/strong>\n   &#8211; More frequent pipelines (e.g., every 5 minutes) cost more than nightly batch.<\/li>\n<li><strong>Data volume processed<\/strong>\n   &#8211; Larger datasets typically increase runtime and consumption.<\/li>\n<li><strong>Transformation complexity<\/strong>\n   &#8211; Joins, aggregations, and wide transformations usually cost more than simple copies.<\/li>\n<li><strong>Concurrency<\/strong>\n   &#8211; Running many pipelines simultaneously can increase consumption and hit limits.<\/li>\n<li><strong>Network 
placement<\/strong>\n   &#8211; Moving data across regions or out to the internet can add data transfer charges.<\/li>\n<li><strong>Source\/target performance<\/strong>\n   &#8211; Slow databases can increase runtime and therefore cost.<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\">Free tier considerations<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>OCI Free Tier is mainly about compute\/database\/storage products. Data Integration may not be Always Free in your region\/tenancy.<br\/>\n<strong>Verify current Free Tier eligibility<\/strong> for Data Integration specifically in official pages or your tenancy\u2019s subscription details.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Hidden\/indirect costs (common surprises)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Autonomous Database scaling<\/strong>: if you auto-scale or choose higher CPU\/storage, DI loads may trigger more DB usage.<\/li>\n<li><strong>Object Storage request costs<\/strong>: frequent small-file processing increases request counts.<\/li>\n<li><strong>Logging ingestion\/retention<\/strong>: exporting logs at high volume can cost money.<\/li>\n<li><strong>Cross-region traffic<\/strong>: replicating data or reading across regions can add transfer fees.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Network\/data transfer implications<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>In OCI, <strong>egress to the internet<\/strong> and some inter-region transfers are charged.<\/li>\n<li>Keep your Data Integration workspace, Object Storage, and database <strong>in the same region<\/strong> for cost and performance unless you have a strong reason not to.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">How to optimize cost (practical checklist)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Prefer <strong>batch windows<\/strong> over continuous micro-batching unless you truly need it.<\/li>\n<li>Consolidate small files into fewer larger files (where your pipeline supports 
it).<\/li>\n<li>Push heavy transformations <strong>into the database<\/strong> when it\u2019s cheaper\/faster and fits governance.<\/li>\n<li>Use <strong>partitioned loads<\/strong> (date partitions) and incremental patterns where feasible.<\/li>\n<li>Separate dev\/test\/prod and <strong>turn off non-production schedules<\/strong>.<\/li>\n<li>Tag resources for cost tracking (Cost Analysis works best with consistent tags).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Example low-cost starter estimate (how to think about it)<\/h3>\n\n\n\n<p>A learning lab typically includes:\n&#8211; 1 Data Integration workspace\n&#8211; A few small runs (MBs of CSV)\n&#8211; Always Free Autonomous Database (if eligible)\n&#8211; Minimal Object Storage<\/p>\n\n\n\n<p>Your cost will depend on the <strong>minimum billable runtime units<\/strong> and the <strong>billing metric<\/strong> for Data Integration in your region. Use the Cost Estimator and run a small test, then check <strong>Billing \u2192 Cost Analysis<\/strong>.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Example production cost considerations (what to plan for)<\/h3>\n\n\n\n<p>In production, plan for:\n&#8211; Daily\/hourly schedules across multiple domains (runs\/day)\n&#8211; Backfills (temporary cost spikes)\n&#8211; Separate environments (dev\/test\/prod)\n&#8211; Higher log retention and monitoring exports\n&#8211; Stronger networking (private endpoints) and possible added network components<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">10. Step-by-Step Hands-On Tutorial<\/h2>\n\n\n\n<p>This lab builds a simple but real pipeline: <strong>load a CSV file from OCI Object Storage into an Autonomous Database table<\/strong> using Oracle Cloud Data Integration.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Objective<\/h3>\n\n\n\n<p>Create an OCI Data Integration workspace and a basic data flow that:\n1. Reads a CSV file from an Object Storage bucket\n2. 
Maps columns to a target table\n3. Loads the data into an Autonomous Database table\n4. Verifies row counts and cleans up<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Lab Overview<\/h3>\n\n\n\n<p>You will:\n1. Create\/prepare an Autonomous Database table\n2. Create an Object Storage bucket and upload a sample CSV\n3. Create a Data Integration workspace and project\n4. Create connections (data assets) to Object Storage and Autonomous Database\n5. Build and run a Data Flow (or equivalent task) to load data\n6. Validate results\n7. Clean up resources to avoid ongoing cost<\/p>\n\n\n\n<blockquote>\n<p>Notes before you start:\n&#8211; UI labels can vary slightly by OCI Console updates.\n&#8211; If any option differs in your tenancy, follow the closest equivalent and verify with the official docs: https:\/\/docs.oracle.com\/en-us\/iaas\/data-integration\/home.htm<\/p>\n<\/blockquote>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 1: Create (or choose) a compartment for the lab<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li>In the OCI Console, open <strong>Identity &amp; Security \u2192 Compartments<\/strong>.<\/li>\n<li>Create a compartment such as <code>lab-data-integration<\/code> (or reuse an existing lab compartment).<\/li>\n<li>Record:\n   &#8211; Compartment name\n   &#8211; Compartment OCID (optional but useful)<\/li>\n<\/ol>\n\n\n\n<p><strong>Expected outcome:<\/strong> You have a compartment where you will create the bucket, database, and Data Integration workspace.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 2: Create an Autonomous Database (ATP or ADW) and a target table<\/h3>\n\n\n\n<p>If you already have an Autonomous Database, you can reuse it.<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Go to <strong>Oracle Database \u2192 Autonomous Database<\/strong>.<\/li>\n<li>Click <strong>Create Autonomous Database<\/strong>.<\/li>\n<li>For low cost, choose an <strong>Always 
Free<\/strong> option if available in your region\/tenancy.<\/li>\n<li>Set:\n   &#8211; Display name: <code>adb-di-lab<\/code>\n   &#8211; Database name: something short like <code>DILAB<\/code>\n   &#8211; Admin password: store securely<\/li>\n<li>Create the database and wait for it to become <strong>Available<\/strong>.<\/li>\n<\/ol>\n\n\n\n<p>Now create a table using <strong>Database Actions<\/strong>:\n1. Open the Autonomous Database details page.\n2. Click <strong>Database Actions<\/strong> \u2192 <strong>SQL<\/strong>.\n3. Run:<\/p>\n\n\n\n<pre><code class=\"language-sql\">CREATE TABLE DI_CUSTOMERS (\n  CUSTOMER_ID   NUMBER,\n  FIRST_NAME    VARCHAR2(100),\n  LAST_NAME     VARCHAR2(100),\n  EMAIL         VARCHAR2(200),\n  SIGNUP_DATE   DATE\n);\n<\/code><\/pre>\n\n\n\n<p><strong>Expected outcome:<\/strong> Autonomous Database is running and has an empty <code>DI_CUSTOMERS<\/code> table.<\/p>\n\n\n\n<p><strong>Verification:<\/strong><\/p>\n\n\n\n<pre><code class=\"language-sql\">SELECT COUNT(*) FROM DI_CUSTOMERS;\n<\/code><\/pre>\n\n\n\n<p>Should return <code>0<\/code>.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 3: Create an Object Storage bucket and upload a sample CSV<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Go to <strong>Storage \u2192 Buckets<\/strong>.<\/li>\n<li>Ensure you are in the same <strong>region<\/strong> and <strong>compartment<\/strong>.<\/li>\n<li>Click <strong>Create Bucket<\/strong>:\n   &#8211; Name: <code>di-lab-bucket-&lt;unique-suffix&gt;<\/code>\n   &#8211; Default storage tier is fine for a lab.<\/li>\n<li>Open the bucket \u2192 <strong>Upload<\/strong>.<\/li>\n<\/ol>\n\n\n\n<p>Create a local file named <code>customers.csv<\/code> with this content:<\/p>\n\n\n\n<pre><code 
class=\"language-csv\">CUSTOMER_ID,FIRST_NAME,LAST_NAME,EMAIL,SIGNUP_DATE\n1,Ana,Gomez,ana.gomez@example.com,2024-01-15\n2,Sam,Lee,sam.lee@example.com,2024-02-20\n3,Priya,Shah,priya.shah@example.com,2024-03-05\n4,Noah,Kim,noah.kim@example.com,2024-03-18\n<\/code><\/pre>\n\n\n\n<p>Upload it to the bucket (root or a prefix like <code>input\/<\/code>).<\/p>\n\n\n\n<p><strong>Expected outcome:<\/strong> The bucket contains <code>customers.csv<\/code>.<\/p>\n\n\n\n<p><strong>Verification:<\/strong> Click the object and confirm size and last modified timestamp.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 4: Create a Data Integration workspace<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li>In the OCI Console, go to <strong>Data Integration<\/strong>.<\/li>\n<li>Click <strong>Create workspace<\/strong>.<\/li>\n<li>Choose the lab compartment.<\/li>\n<li>Name: <code>di-workspace-lab<\/code><\/li>\n<li>Create.<\/li>\n<\/ol>\n\n\n\n<p>Wait until the workspace is active.<\/p>\n\n\n\n<p><strong>Expected outcome:<\/strong> Workspace exists and you can open it.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 5: Create a Data Integration project<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Open the workspace.<\/li>\n<li>Create a <strong>Project<\/strong>:\n   &#8211; Name: <code>customer-load-lab<\/code><\/li>\n<li>Optionally create folders such as:\n   &#8211; <code>connections<\/code>\n   &#8211; <code>dataflows<\/code>\n   &#8211; <code>pipelines<\/code><\/li>\n<\/ol>\n\n\n\n<p><strong>Expected outcome:<\/strong> You have a project where you\u2019ll build assets.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 6: Configure access (IAM and policies) for Object Storage and Autonomous Database<\/h3>\n\n\n\n<p>Data Integration needs permission to interact with OCI resources (like Object Storage), and it needs valid database 
credentials\/connectivity for Autonomous Database.<\/p>\n\n\n\n<p>Because the exact policy statements and resource types can vary, follow Oracle\u2019s official policy examples for Data Integration and apply least privilege in your compartment.<\/p>\n\n\n\n<p>Start here:\n&#8211; Data Integration docs: https:\/\/docs.oracle.com\/en-us\/iaas\/data-integration\/home.htm\n&#8211; IAM policy reference: https:\/\/docs.oracle.com\/en-us\/iaas\/Content\/Identity\/home.htm<\/p>\n\n\n\n<p><strong>Common pattern to validate in docs (do not copy blindly):<\/strong>\n&#8211; Allow your admin group to manage Data Integration resources in the lab compartment.\n&#8211; Allow Data Integration service access to read objects from the specific bucket (or bucket compartment).\n&#8211; Ensure database connectivity and credentials are available for the connector method used.<\/p>\n\n\n\n<p><strong>Expected outcome:<\/strong> Policies are in place; no authorization errors when testing connections later.<\/p>\n\n\n\n<p><strong>Verification:<\/strong> You should be able to create data assets and browse\/select the Object Storage object from within Data Integration (or at least run a task that reads it).<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 7: Create a connection (data asset) to Object Storage<\/h3>\n\n\n\n<p>In the Data Integration workspace (inside your project):<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Navigate to <strong>Data Assets<\/strong> or <strong>Connections<\/strong> (terminology may vary).<\/li>\n<li>Create a new data asset for <strong>Object Storage<\/strong>.<\/li>\n<li>Provide:\n   &#8211; Compartment\/bucket details\n   &#8211; Namespace (Object Storage namespace from tenancy)\n   &#8211; Bucket name\n   &#8211; Authentication method per the UI (often OCI-native\/IAM-based)<\/li>\n<\/ol>\n\n\n\n<p><strong>Expected outcome:<\/strong> Object Storage data asset is 
created.<\/p>\n\n\n\n<p><strong>Verification:<\/strong> Use any available <strong>Test Connection<\/strong> or browse feature (if provided) to confirm you can locate <code>customers.csv<\/code>.<\/p>\n\n\n\n<p>If you cannot browse but creation succeeds, proceed; the real test is running the flow.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 8: Create a connection (data asset) to Autonomous Database<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Create a new data asset for <strong>Autonomous Database<\/strong> (or Oracle Database).<\/li>\n<li>Provide connection details:\n   &#8211; Database OCID or connection string (depending on UI)\n   &#8211; Username (e.g., <code>ADMIN<\/code> or a dedicated ETL user)\n   &#8211; Password (store securely)\n   &#8211; Wallet\/SSL settings if required by the connector<\/li>\n<\/ol>\n\n\n\n<p><strong>Recommended for production:<\/strong> create a dedicated database user with least privileges (create session + insert\/select on target schema), not ADMIN.<\/p>\n\n\n\n<p><strong>Expected outcome:<\/strong> Autonomous Database data asset is created.<\/p>\n\n\n\n<p><strong>Verification:<\/strong> Use <strong>Test Connection<\/strong> if available.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 9: Create a Data Flow to load <code>customers.csv<\/code> into <code>DI_CUSTOMERS<\/code><\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Create a new <strong>Data Flow<\/strong> in your project.<\/li>\n<li>Add a <strong>Source<\/strong>:\n   &#8211; Source type: Object Storage\n   &#8211; Select the bucket and object <code>customers.csv<\/code>\n   &#8211; Configure CSV format:<ul>\n<li>Header row present: yes<\/li>\n<li>Delimiter: comma<\/li>\n<li>Date format: <code>YYYY-MM-DD<\/code> (or configure a parsing rule if the UI requires it)<\/li>\n<\/ul>\n<\/li>\n<li>Add transformations as needed:\n   &#8211; Ensure column names map 
correctly:<ul>\n<li><code>CUSTOMER_ID<\/code> \u2192 number<\/li>\n<li><code>FIRST_NAME<\/code>, <code>LAST_NAME<\/code>, <code>EMAIL<\/code> \u2192 strings<\/li>\n<li><code>SIGNUP_DATE<\/code> \u2192 date (parse from string)<\/li>\n<\/ul>\n<\/li>\n<li>Add a <strong>Target<\/strong>:\n   &#8211; Target type: Autonomous Database\n   &#8211; Target table: <code>DI_CUSTOMERS<\/code>\n   &#8211; Write mode: for a lab, choose a safe mode:<ul>\n<li>If you want repeatable runs: TRUNCATE then INSERT (if supported), or delete rows before loading.<\/li>\n<li>If you want append-only: INSERT.<\/li>\n<\/ul>\n<\/li>\n<li>Save the Data Flow.<\/li>\n<\/ol>\n\n\n\n<p><strong>Expected outcome:<\/strong> A saved Data Flow that reads from Object Storage and writes to the database.<\/p>\n\n\n\n<p><strong>Verification:<\/strong> Validate the data flow graph (most UIs provide a validation step). Resolve schema\/type mapping warnings.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 10: Run the Data Flow (or create a Task and run it)<\/h3>\n\n\n\n<p>Depending on your UI, you may:\n&#8211; Run the Data Flow directly, or\n&#8211; Create a <strong>Task<\/strong> from the Data Flow and run the task<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Click <strong>Run<\/strong>.<\/li>\n<li>Observe the run status (Submitted \u2192 Running \u2192 Succeeded\/Failed).<\/li>\n<li>Open run details if available.<\/li>\n<\/ol>\n\n\n\n<p><strong>Expected outcome:<\/strong> Run completes successfully.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Validation<\/h3>\n\n\n\n<h4 class=\"wp-block-heading\">Validate in Autonomous Database<\/h4>\n\n\n\n<p>In Database Actions \u2192 SQL:<\/p>\n\n\n\n<pre><code class=\"language-sql\">SELECT COUNT(*) AS row_count FROM DI_CUSTOMERS;\n<\/code><\/pre>\n\n\n\n<p>Expected: <code>4<\/code><\/p>\n\n\n\n<p>Check the data:<\/p>\n\n\n\n<pre><code class=\"language-sql\">SELECT\n  CUSTOMER_ID, 
FIRST_NAME, LAST_NAME, EMAIL,\n  TO_CHAR(SIGNUP_DATE, 'YYYY-MM-DD') AS SIGNUP_DATE\nFROM DI_CUSTOMERS\nORDER BY CUSTOMER_ID;\n<\/code><\/pre>\n\n\n\n<p>Expected: rows 1\u20134 with correct values.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Validate in Data Integration<\/h4>\n\n\n\n<ul class=\"wp-block-list\">\n<li>The run should show <strong>Succeeded<\/strong>.<\/li>\n<li>If a run history is available, confirm runtime and any warnings.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Troubleshooting<\/h3>\n\n\n\n<h4 class=\"wp-block-heading\">Error: Authorization failed \/ NotAuthorizedOrNotFound<\/h4>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Cause: missing IAM policy for Data Integration to access the bucket or DI resources.<\/li>\n<li>Fix:<ul>\n<li>Confirm you are in the correct compartment.<\/li>\n<li>Review policies using official policy examples for Data Integration.<\/li>\n<li>Verify the bucket is in the same compartment you granted access to.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h4 class=\"wp-block-heading\">Error: Cannot connect to Autonomous Database<\/h4>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Cause: wrong credentials, missing wallet\/SSL config, or network restrictions (private endpoint).<\/li>\n<li>Fix:<ul>\n<li>Re-test with Database Actions using the same user.<\/li>\n<li>If the DB is private, confirm Data Integration supports the required private connectivity pattern and that your VCN\/security lists\/NSGs allow it.<\/li>\n<li>Confirm the connector\u2019s required connection string\/wallet details in the docs.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h4 class=\"wp-block-heading\">Error: Date parsing \/ invalid month<\/h4>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Cause: the CSV date format doesn\u2019t match the parsing rule.<\/li>\n<li>Fix:<ul>\n<li>Ensure the dates in the CSV use the <code>YYYY-MM-DD<\/code> format.<\/li>\n<li>Add an explicit cast\/parse transformation (if available).<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h4 class=\"wp-block-heading\">Error: Column mapping 
mismatch<\/h4>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Cause: CSV headers don\u2019t match target columns, or inferred types differ.<\/li>\n<li>Fix:<ul>\n<li>Ensure <code>customers.csv<\/code> header names match expected mappings.<\/li>\n<li>Add an explicit mapping step and cast types.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h4 class=\"wp-block-heading\">Error: Duplicate rows on re-run<\/h4>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Cause: using INSERT append mode.<\/li>\n<li>Fix:<ul>\n<li>Use truncate + load pattern (if supported), or run <code>TRUNCATE TABLE DI_CUSTOMERS<\/code> before re-running.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Cleanup<\/h3>\n\n\n\n<p>To avoid ongoing cost and clutter, delete lab resources:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>\n<p><strong>Data Integration<\/strong>\n   &#8211; Delete the task(s), data flows, and project (optional).\n   &#8211; Delete the workspace <code>di-workspace-lab<\/code> (if not needed).<\/p>\n<\/li>\n<li>\n<p><strong>Object Storage<\/strong>\n   &#8211; Delete <code>customers.csv<\/code>.\n   &#8211; Delete the bucket.<\/p>\n<\/li>\n<li>\n<p><strong>Autonomous Database<\/strong>\n   &#8211; Drop the table (optional): <code>DROP TABLE DI_CUSTOMERS PURGE;<\/code>\n   &#8211; Terminate the Autonomous Database if it was created only for this lab (unless Always Free and you want to keep it).<\/p>\n<\/li>\n<li>\n<p><strong>IAM policies<\/strong>\n   &#8211; Remove any lab-only policies you created (keep least privilege).<\/p>\n<\/li>\n<\/ol>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">11. 
Best Practices<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Architecture best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Separate dev\/test\/prod<\/strong> using compartments and separate workspaces.<\/li>\n<li>Adopt a <strong>layered data architecture<\/strong> (raw \u2192 staged \u2192 curated \u2192 marts).<\/li>\n<li>Keep <strong>data close to compute<\/strong>: same region for workspace, buckets, and DB targets.<\/li>\n<li>Prefer <strong>idempotent designs<\/strong>:<ul>\n<li>Partitioned loads (by date)<\/li>\n<li>Merge\/upsert patterns (when supported and appropriate)<\/li>\n<li>Staging + swap for stable publishing<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">IAM\/security best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Use <strong>least privilege<\/strong>:<ul>\n<li>Separate \u201cdesigners\u201d (create\/update flows) from \u201coperators\u201d (run\/monitor).<\/li>\n<\/ul>\n<\/li>\n<li>Use <strong>dedicated DB users<\/strong> for Data Integration with minimal privileges.<\/li>\n<li>Store secrets appropriately:<ul>\n<li>Prefer OCI Vault patterns where supported; otherwise restrict who can view\/edit connection assets.<\/li>\n<\/ul>\n<\/li>\n<li>Apply <strong>tagging<\/strong> consistently (environment, cost center, owner, data domain).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Cost best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Avoid high-frequency schedules for batch workloads.<\/li>\n<li>Reduce small-file overhead by consolidating files upstream.<\/li>\n<li>Monitor job runtimes and tune transformations.<\/li>\n<li>Use Cost Analysis with tags to detect runaway costs early.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Performance best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Push down transformations to the database when that is faster\/cheaper and aligns with governance.<\/li>\n<li>Use partitioned reads\/writes where supported.<\/li>\n<li>Avoid unnecessary wide joins; pre-filter data early in 
the flow.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Reliability best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Build pipelines with:<ul>\n<li>Clear failure handling (stop on critical step failure)<\/li>\n<li>Retries for transient errors (where supported)<\/li>\n<li>Validation steps (row counts, null checks)<\/li>\n<\/ul>\n<\/li>\n<li>Maintain runbooks:<ul>\n<li>What to do on failure<\/li>\n<li>How to replay\/backfill safely<\/li>\n<li>Escalation path<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Operations best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Standardize naming:<ul>\n<li><code>di-&lt;env&gt;-&lt;domain&gt;-&lt;purpose&gt;<\/code><\/li>\n<\/ul>\n<\/li>\n<li>Keep an <strong>asset inventory<\/strong> per workspace (projects, connections, schedules).<\/li>\n<li>Establish change management:<ul>\n<li>Peer reviews of flows\/pipelines<\/li>\n<li>Controlled promotion to production (verify DI\u2019s promotion model in your tenancy)<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Governance best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Use compartments to model ownership and data domains.<\/li>\n<li>Tag everything for cost and ownership.<\/li>\n<li>Document data lineage externally if you need full lineage (Data Integration alone may not cover enterprise lineage requirements; consider OCI Data Catalog patterns where appropriate).<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">12. 
Security Considerations<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Identity and access model<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Data Integration uses <strong>OCI IAM<\/strong> for:<ul>\n<li>User authentication to the Console\/API<\/li>\n<li>Authorization to manage workspaces and artifacts<\/li>\n<\/ul>\n<\/li>\n<li>Use <strong>groups and policies<\/strong> rather than individual user grants.<\/li>\n<li>Separate duties:<ul>\n<li>Data engineers: design assets<\/li>\n<li>Operators: run\/monitor<\/li>\n<li>Security\/admin: manage policies<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Encryption<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>OCI services typically encrypt data at rest by default (service-dependent). Confirm encryption behavior for:<ul>\n<li>Object Storage buckets<\/li>\n<li>Autonomous Database<\/li>\n<\/ul>\n<\/li>\n<li>For sensitive workloads, use <strong>customer-managed keys<\/strong> where required (OCI Vault + KMS), and verify Data Integration compatibility with CMEK scenarios for each dependent service.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Network exposure<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Prefer private connectivity for production databases where possible.<\/li>\n<li>If your DB is publicly accessible, restrict with:<ul>\n<li>IP allowlists (if applicable)<\/li>\n<li>Strong credentials<\/li>\n<li>Minimal privileges<\/li>\n<\/ul>\n<\/li>\n<li>Keep the Data Integration workspace and data sources in the same region to minimize exposure and transfer.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Secrets handling<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Avoid embedding secrets in scripts; store them in managed connection objects with restricted access.<\/li>\n<li>Rotate DB passwords and update connection assets as part of your security hygiene.<\/li>\n<li>Consider using database auth patterns that reduce static secrets (availability varies; verify in docs).<\/li>\n<\/ul>\n\n\n\n<h3 
class=\"wp-block-heading\">Audit\/logging<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>OCI Audit<\/strong> captures relevant administrative operations.<\/li>\n<li>Ensure Audit logs are retained per compliance requirements.<\/li>\n<li>If you need centralized observability, integrate with OCI Logging\/Monitoring where supported and define alert rules around job failures.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Compliance considerations<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Map controls to:<ul>\n<li>Access control (IAM policies)<\/li>\n<li>Change management (who can modify flows)<\/li>\n<li>Data protection (encryption, masking)<\/li>\n<li>Logging and retention (Audit, run history)<\/li>\n<\/ul>\n<\/li>\n<li>For regulated data, ensure the entire path (source, transport, target, backups) meets compliance requirements.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Common security mistakes<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Using <code>ADMIN<\/code> for database loads in production.<\/li>\n<li>Overly broad IAM policies at the tenancy root.<\/li>\n<li>Leaving public endpoints open without strong restrictions.<\/li>\n<li>Allowing all developers to edit production connections and credentials.<\/li>\n<li>No audit review process.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Secure deployment recommendations<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Use compartment isolation per environment.<\/li>\n<li>Create a dedicated \u201cintegration runtime\u201d DB user per pipeline domain.<\/li>\n<li>Apply tagging and resource naming conventions.<\/li>\n<li>Regularly review policies and connection asset permissions.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">13. Limitations and Gotchas<\/h2>\n\n\n\n<blockquote>\n<p>Limits and capabilities vary by region and release. 
Always verify in official docs and your tenancy\u2019s service limits page.<\/p>\n<\/blockquote>\n\n\n\n<h3 class=\"wp-block-heading\">Known limitations (categories)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Connector availability<\/strong>: not every data source is supported natively; some require staging via Object Storage or database links. Verify connector list.<\/li>\n<li><strong>Private networking<\/strong>: private endpoint support depends on the connector and service capabilities; validate before committing to architecture.<\/li>\n<li><strong>Advanced orchestration<\/strong>: complex branching\/looping and event triggers may be limited compared to dedicated orchestrators.<\/li>\n<li><strong>Real-time CDC<\/strong>: Data Integration is generally a batch integration tool; for CDC replication, evaluate GoldenGate.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Quotas and concurrency<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Workspaces, projects, tasks, and concurrent runs may have limits.<\/li>\n<li>Concurrency spikes during backfills can hit limits and increase costs.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Regional constraints<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Workspaces are regional; cross-region pipelines require explicit patterns and may incur data transfer charges.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Pricing surprises<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Backfills can run for hours\/days, driving consumption.<\/li>\n<li>Many small files can inflate processing overhead and Object Storage request costs.<\/li>\n<li>Storing extensive logs externally can add Logging costs.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Compatibility issues<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>CSV\/JSON schema drift: header changes can break mappings.<\/li>\n<li>Date\/time parsing differences between source formats and database types.<\/li>\n<li>Character set issues (UTF-8 vs other 
encodings) if files originate from legacy systems.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Operational gotchas<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Re-runs can cause duplicates without idempotent design.<\/li>\n<li>Credential rotations can silently break scheduled loads if the affected connections are not updated.<\/li>\n<li>Lack of standardized naming makes incident response slower.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Migration challenges<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Migrating from ODI\/Informatica\/Talend may require redesign:<ul class=\"wp-block-list\">\n<li>Different transformation semantics<\/li>\n<li>Different operational model (managed vs self-hosted)<\/li>\n<li>Different scheduling\/orchestration patterns<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Vendor-specific nuances<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Oracle database targets can be very fast, but you must still design:<ul class=\"wp-block-list\">\n<li>Load strategy (append vs merge)<\/li>\n<li>Index maintenance timing<\/li>\n<li>Constraint handling<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">14. 
Comparison with Alternatives<\/h2>\n\n\n\n<p>Data Integration is one option in a broader integration and data engineering toolbox.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Alternatives in Oracle Cloud (OCI)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Oracle GoldenGate (OCI)<\/strong>: best for real-time CDC replication.<\/li>\n<li><strong>OCI Data Flow<\/strong>: serverless Apache Spark jobs for code-first transformations.<\/li>\n<li><strong>Oracle Integration<\/strong>: iPaaS for application\/SaaS integration and process automation (not a data engineering ETL tool first).<\/li>\n<li><strong>OCI Data Catalog<\/strong>: metadata management and governance (complements DI; not an ETL runtime).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Alternatives in other clouds<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>AWS Glue<\/strong>: managed ETL + data catalog integration.<\/li>\n<li><strong>Azure Data Factory<\/strong>: orchestration + connectors + mapping data flows.<\/li>\n<li><strong>Google Cloud Data Fusion \/ Dataflow<\/strong>: visual pipeline (Data Fusion) and managed stream\/batch processing (Dataflow).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Open-source \/ self-managed<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Apache Airflow<\/strong> (self-managed or managed elsewhere): orchestration (not ETL itself).<\/li>\n<li><strong>Apache NiFi<\/strong>: flow-based ingestion.<\/li>\n<li><strong>dbt<\/strong>: SQL-based transformations in the warehouse (often complements ingestion tools).<\/li>\n<li><strong>Spark on Kubernetes<\/strong>: maximum control, maximum ops overhead.<\/li>\n<\/ul>\n\n\n\n<h4 class=\"wp-block-heading\">Comparison table<\/h4>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<thead>\n<tr>\n<th>Option<\/th>\n<th>Best For<\/th>\n<th>Strengths<\/th>\n<th>Weaknesses<\/th>\n<th>When to Choose<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td><strong>OCI Data Integration<\/strong><\/td>\n<td>Batch ingestion 
+ transformation in OCI<\/td>\n<td>Managed design\/runtime, OCI-native governance, good fit with ADW\/Object Storage<\/td>\n<td>Connector\/networking constraints, not CDC-first, orchestration depth may be limited<\/td>\n<td>You want OCI-governed ETL\/ELT without running servers<\/td>\n<\/tr>\n<tr>\n<td><strong>Oracle GoldenGate (OCI)<\/strong><\/td>\n<td>Real-time replication \/ CDC<\/td>\n<td>Low-latency change capture, replication patterns<\/td>\n<td>More specialized, can be costlier\/complex<\/td>\n<td>You need near real-time data movement with CDC<\/td>\n<\/tr>\n<tr>\n<td><strong>OCI Data Flow<\/strong><\/td>\n<td>Code-first Spark processing<\/td>\n<td>Flexible, scalable Spark jobs<\/td>\n<td>More engineering\/ops than visual ETL<\/td>\n<td>You need custom Spark logic beyond visual transforms<\/td>\n<\/tr>\n<tr>\n<td><strong>Oracle Integration<\/strong><\/td>\n<td>App\/SaaS integration<\/td>\n<td>SaaS adapters, process automation<\/td>\n<td>Not designed primarily for large-scale data engineering<\/td>\n<td>You integrate applications and events, not bulk analytics loads<\/td>\n<\/tr>\n<tr>\n<td><strong>AWS Glue<\/strong><\/td>\n<td>ETL on AWS<\/td>\n<td>Strong AWS ecosystem integration<\/td>\n<td>Different cloud; migration overhead<\/td>\n<td>Your platform is AWS-first<\/td>\n<\/tr>\n<tr>\n<td><strong>Azure Data Factory<\/strong><\/td>\n<td>Data integration on Azure<\/td>\n<td>Mature orchestration + connectors<\/td>\n<td>Different cloud<\/td>\n<td>Your platform is Azure-first<\/td>\n<\/tr>\n<tr>\n<td><strong>Airflow (self-managed)<\/strong><\/td>\n<td>Orchestration across tools<\/td>\n<td>Very flexible DAG orchestration<\/td>\n<td>You manage infra and reliability<\/td>\n<td>You need multi-tool orchestration and have ops maturity<\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">15. 
Real-World Example<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Enterprise example: retail analytics modernization<\/h3>\n\n\n\n<p><strong>Problem<\/strong><\/p>\n\n\n\n<p>A retail enterprise has:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Oracle E-Business Suite\/ERP on Oracle Database<\/li>\n<li>Daily store sales files landing in Object Storage<\/li>\n<li>A mandate to build a governed analytics platform on Autonomous Data Warehouse<\/li>\n<\/ul>\n\n\n\n<p>They need consistent, auditable pipelines with environment separation and access controls.<\/p>\n\n\n\n<p><strong>Proposed architecture<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Object Storage:<ul class=\"wp-block-list\">\n<li><code>\/raw\/pos\/<\/code> for store extracts<\/li>\n<li><code>\/raw\/erp\/<\/code> for exports<\/li>\n<li><code>\/curated\/<\/code> for standardized datasets<\/li>\n<\/ul>\n<\/li>\n<li>OCI Data Integration:<ul class=\"wp-block-list\">\n<li>Separate workspaces per environment (dev\/test\/prod)<\/li>\n<li>Projects per domain: <code>sales<\/code>, <code>inventory<\/code>, <code>customer<\/code><\/li>\n<li>Pipelines orchestrating: ingest \u2192 transform \u2192 load ADW \u2192 validate<\/li>\n<\/ul>\n<\/li>\n<li>Autonomous Data Warehouse:<ul class=\"wp-block-list\">\n<li>Staging schema + curated marts<\/li>\n<\/ul>\n<\/li>\n<li>Governance:<ul class=\"wp-block-list\">\n<li>IAM policies per team<\/li>\n<li>Tags for cost allocation<\/li>\n<li>Audit reviews for production changes<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<p><strong>Why Data Integration was chosen<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Visual development accelerates delivery across multiple teams.<\/li>\n<li>OCI-native IAM and compartments align with enterprise governance.<\/li>\n<li>Managed runtime reduces operational burden versus self-hosted ETL servers.<\/li>\n<\/ul>\n\n\n\n<p><strong>Expected outcomes<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Reduced pipeline failures via standardized orchestration and monitoring<\/li>\n<li>Faster onboarding for new subject areas<\/li>\n<li>Improved auditability and controlled promotion to production<\/li>\n<li>Predictable costs through tagging and run discipline<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Startup\/small-team example: SaaS product usage 
analytics<\/h3>\n\n\n\n<p><strong>Problem<\/strong><\/p>\n\n\n\n<p>A startup collects daily usage exports (CSV) and wants to build KPIs in a warehouse without hiring a full-time platform engineer to manage ETL servers.<\/p>\n\n\n\n<p><strong>Proposed architecture<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Object Storage bucket receives daily exports from the application.<\/li>\n<li>OCI Data Integration:<ul class=\"wp-block-list\">\n<li>Single workspace for staging + transformations<\/li>\n<li>A small set of data flows loading into ADW<\/li>\n<\/ul>\n<\/li>\n<li>Autonomous Database (Always Free initially; later scale up):<ul class=\"wp-block-list\">\n<li>Simple schema for dashboards<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<p><strong>Why Data Integration was chosen<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Minimal infrastructure management.<\/li>\n<li>Quick to build and change transformations.<\/li>\n<li>Strong fit with OCI-native services used by the startup.<\/li>\n<\/ul>\n\n\n\n<p><strong>Expected outcomes<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Working dashboards in days rather than weeks<\/li>\n<li>Low operational overhead<\/li>\n<li>Smooth scaling path as data volume grows<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">16. FAQ<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">1) Is Oracle Cloud Data Integration the same as Oracle Data Integrator (ODI)?<\/h3>\n\n\n\n<p>No. <strong>OCI Data Integration<\/strong> is a managed OCI service. <strong>ODI<\/strong> is a separate product (often on-prem or self-managed on cloud). Validate product scope in Oracle docs for your exact environment.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">2) Is Data Integration the same as Oracle Integration?<\/h3>\n\n\n\n<p>No. <strong>Oracle Integration<\/strong> is an iPaaS focused on application integration and process automation. <strong>Data Integration<\/strong> is focused on data ingestion and transformation pipelines.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">3) Do I need to run servers or clusters for Data Integration?<\/h3>\n\n\n\n<p>Typically no: Data Integration is managed. 
You design and run jobs; Oracle manages service infrastructure. Verify runtime characteristics and limits in official docs.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">4) Is Data Integration regional?<\/h3>\n\n\n\n<p>Yes, workspaces are created in a specific OCI region. Keep sources\/targets in-region when possible.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">5) Can Data Integration load into Autonomous Data Warehouse?<\/h3>\n\n\n\n<p>Yes, ADW is a common target. You configure a connection and load into tables.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">6) Can Data Integration read files from Object Storage?<\/h3>\n\n\n\n<p>Yes, Object Storage is a common source\/landing zone for CSV and other file-based ingestion patterns (format support depends on connector features).<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">7) How do I schedule pipelines?<\/h3>\n\n\n\n<p>Data Integration provides native schedules and task schedules for tasks published to an application. If native scheduling does not cover your needs, trigger runs externally (for example, from a CI\/CD job that calls the OCI CLI, SDK, or REST API). Verify current scheduling features in the official docs.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">8) How do I implement incremental loads?<\/h3>\n\n\n\n<p>Common patterns include:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Partitioned loads by date<\/li>\n<li>Change-tracking columns in source tables<\/li>\n<li>Staging + merge\/upsert in the database<\/li>\n<\/ul>\n\n\n\n<p>Exact implementation depends on connectors and transformation features.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">9) Does Data Integration support CDC?<\/h3>\n\n\n\n<p>Data Integration is generally batch-oriented. For CDC replication, evaluate <strong>Oracle GoldenGate<\/strong>.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">10) Can I deploy the same pipeline to dev\/test\/prod?<\/h3>\n\n\n\n<p>Yes, typically by using separate workspaces and consistent naming\/parameters. 
Confirm current promotion\/export\/import capabilities in your tenancy.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">11) How do I secure database credentials used by Data Integration?<\/h3>\n\n\n\n<p>Use dedicated DB users with least privilege and restrict access to connection assets. Use OCI Vault patterns where supported by your connector model.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">12) What\u2019s the best way to avoid duplicates when re-running?<\/h3>\n\n\n\n<p>Use idempotent patterns:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Truncate-and-load for full refresh tables<\/li>\n<li>Partition overwrite<\/li>\n<li>Merge\/upsert keyed by business key and effective dates<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">13) How do I monitor failures?<\/h3>\n\n\n\n<p>Use Data Integration run history and error details. For production, integrate job outcomes with your alerting process (Notifications\/alarms patterns vary; verify available integrations).<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">14) How do I estimate costs?<\/h3>\n\n\n\n<p>Use the official pricing page and the OCI Cost Estimator:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>https:\/\/www.oracle.com\/cloud\/price-list\/<\/li>\n<li>https:\/\/www.oracle.com\/cloud\/costestimator.html<\/li>\n<\/ul>\n\n\n\n<p>Then validate by running a small workload and reviewing Billing \u2192 Cost Analysis.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">15) What\u2019s the easiest beginner lab?<\/h3>\n\n\n\n<p>Load a small CSV from Object Storage into an Always Free Autonomous Database table (the lab in this tutorial).<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">17. Top Online Resources to Learn Data Integration<\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<thead>\n<tr>\n<th>Resource Type<\/th>\n<th>Name<\/th>\n<th>Why It Is Useful<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Official documentation<\/td>\n<td>OCI Data Integration Docs<\/td>\n<td>Source of truth for concepts, features, limits, and how-to steps. 
https:\/\/docs.oracle.com\/en-us\/iaas\/data-integration\/home.htm<\/td>\n<\/tr>\n<tr>\n<td>Official documentation<\/td>\n<td>OCI IAM Docs<\/td>\n<td>Required for correct policies, compartments, dynamic groups, and security model. https:\/\/docs.oracle.com\/en-us\/iaas\/Content\/Identity\/home.htm<\/td>\n<\/tr>\n<tr>\n<td>Official pricing<\/td>\n<td>OCI Price List<\/td>\n<td>Find Data Integration pricing dimensions for your region. https:\/\/www.oracle.com\/cloud\/price-list\/<\/td>\n<\/tr>\n<tr>\n<td>Official calculator<\/td>\n<td>OCI Cost Estimator<\/td>\n<td>Model Data Integration + Object Storage + DB costs. https:\/\/www.oracle.com\/cloud\/costestimator.html<\/td>\n<\/tr>\n<tr>\n<td>Official free tier<\/td>\n<td>OCI Free Tier<\/td>\n<td>Reduce lab cost; check what is Always Free. https:\/\/www.oracle.com\/cloud\/free\/<\/td>\n<\/tr>\n<tr>\n<td>Architecture guidance<\/td>\n<td>OCI Architecture Center<\/td>\n<td>Reference architectures and best practices (search for data integration and analytics patterns). https:\/\/docs.oracle.com\/en\/solutions\/<\/td>\n<\/tr>\n<tr>\n<td>Tutorials<\/td>\n<td>OCI Tutorials (Oracle)<\/td>\n<td>Step-by-step labs for OCI services; search for Data Integration. https:\/\/docs.oracle.com\/en\/learn\/<\/td>\n<\/tr>\n<tr>\n<td>Videos<\/td>\n<td>Oracle Cloud YouTube channel<\/td>\n<td>Product overviews and demos; verify freshness by date. https:\/\/www.youtube.com\/@OracleCloudInfrastructure<\/td>\n<\/tr>\n<tr>\n<td>Samples<\/td>\n<td>Oracle GitHub (official org)<\/td>\n<td>Some OCI services provide samples; search repositories for \u201cdata integration\u201d. https:\/\/github.com\/oracle<\/td>\n<\/tr>\n<tr>\n<td>Community (reputable)<\/td>\n<td>Oracle Cloud Customer Connect<\/td>\n<td>Practical discussions and Q&amp;A; validate answers against docs. https:\/\/cloudcustomerconnect.oracle.com\/<\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">18. 
Training and Certification Providers<\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<thead>\n<tr>\n<th>Institute<\/th>\n<th>Suitable Audience<\/th>\n<th>Likely Learning Focus<\/th>\n<th>Mode<\/th>\n<th>Website URL<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>DevOpsSchool.com<\/td>\n<td>Engineers, DevOps, cloud practitioners<\/td>\n<td>Cloud\/DevOps training; may include OCI and integration fundamentals<\/td>\n<td>Check website<\/td>\n<td>https:\/\/www.devopsschool.com\/<\/td>\n<\/tr>\n<tr>\n<td>ScmGalaxy.com<\/td>\n<td>Beginners to intermediate IT professionals<\/td>\n<td>DevOps, SCM, automation foundations that support integration operations<\/td>\n<td>Check website<\/td>\n<td>https:\/\/www.scmgalaxy.com\/<\/td>\n<\/tr>\n<tr>\n<td>CLoudOpsNow.in<\/td>\n<td>Cloud ops and platform teams<\/td>\n<td>Cloud operations practices, monitoring, governance<\/td>\n<td>Check website<\/td>\n<td>https:\/\/www.cloudopsnow.in\/<\/td>\n<\/tr>\n<tr>\n<td>SreSchool.com<\/td>\n<td>SREs, operations, reliability engineers<\/td>\n<td>Reliability engineering practices for running production pipelines<\/td>\n<td>Check website<\/td>\n<td>https:\/\/www.sreschool.com\/<\/td>\n<\/tr>\n<tr>\n<td>AiOpsSchool.com<\/td>\n<td>Ops and platform teams exploring AIOps<\/td>\n<td>Monitoring\/automation concepts that can complement data pipeline ops<\/td>\n<td>Check website<\/td>\n<td>https:\/\/www.aiopsschool.com\/<\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">19. 
Top Trainers<\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<thead>\n<tr>\n<th>Platform\/Site<\/th>\n<th>Likely Specialization<\/th>\n<th>Suitable Audience<\/th>\n<th>Website URL<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>RajeshKumar.xyz<\/td>\n<td>DevOps\/cloud training content (verify current offerings)<\/td>\n<td>Beginners to intermediate<\/td>\n<td>https:\/\/rajeshkumar.xyz\/<\/td>\n<\/tr>\n<tr>\n<td>devopstrainer.in<\/td>\n<td>DevOps training and coaching (verify OCI coverage)<\/td>\n<td>DevOps engineers and students<\/td>\n<td>https:\/\/www.devopstrainer.in\/<\/td>\n<\/tr>\n<tr>\n<td>devopsfreelancer.com<\/td>\n<td>Freelance DevOps\/platform help (verify services)<\/td>\n<td>Teams needing short-term guidance<\/td>\n<td>https:\/\/www.devopsfreelancer.com\/<\/td>\n<\/tr>\n<tr>\n<td>devopssupport.in<\/td>\n<td>DevOps support\/training style services (verify scope)<\/td>\n<td>Ops teams and learners<\/td>\n<td>https:\/\/www.devopssupport.in\/<\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">20. 
Top Consulting Companies<\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<thead>\n<tr>\n<th>Company<\/th>\n<th>Likely Service Area<\/th>\n<th>Where They May Help<\/th>\n<th>Consulting Use Case Examples<\/th>\n<th>Website URL<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>cotocus.com<\/td>\n<td>Cloud\/DevOps consulting (verify service catalog)<\/td>\n<td>Architecture, implementation support, operations setup<\/td>\n<td>Landing zone setup, CI\/CD for data pipelines, monitoring runbooks<\/td>\n<td>https:\/\/cotocus.com\/<\/td>\n<\/tr>\n<tr>\n<td>DevOpsSchool.com<\/td>\n<td>Training + consulting (verify offerings)<\/td>\n<td>Enablement + implementation guidance<\/td>\n<td>Platform standardization, governance\/tagging strategy, operational maturity<\/td>\n<td>https:\/\/www.devopsschool.com\/<\/td>\n<\/tr>\n<tr>\n<td>DEVOPSCONSULTING.IN<\/td>\n<td>DevOps consulting (verify service catalog)<\/td>\n<td>Delivery assistance and ops processes<\/td>\n<td>Automation, infrastructure-as-code support, operational playbooks<\/td>\n<td>https:\/\/www.devopsconsulting.in\/<\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">21. 
Career and Learning Roadmap<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">What to learn before Data Integration<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>OCI fundamentals:<ul class=\"wp-block-list\">\n<li>Tenancy, compartments, IAM users\/groups\/policies<\/li>\n<li>VCN basics (subnets, routing, security lists\/NSGs)<\/li>\n<li>Object Storage concepts (buckets, namespaces, lifecycle)<\/li>\n<\/ul>\n<\/li>\n<li>Data fundamentals:<ul class=\"wp-block-list\">\n<li>Relational modeling, SQL basics<\/li>\n<li>CSV\/file formats and schema basics<\/li>\n<li>ETL vs ELT patterns<\/li>\n<\/ul>\n<\/li>\n<li>Security basics:<ul class=\"wp-block-list\">\n<li>Least privilege<\/li>\n<li>Secret management patterns<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">What to learn after Data Integration<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Advanced analytics architecture:<ul class=\"wp-block-list\">\n<li>Data lakehouse patterns on OCI<\/li>\n<li>Dimensional modeling (Kimball) and data vault concepts<\/li>\n<\/ul>\n<\/li>\n<li>Orchestration and platform engineering:<ul class=\"wp-block-list\">\n<li>CI\/CD for data pipelines (Git-based workflows)<\/li>\n<li>Testing strategies for data (unit tests, reconciliation)<\/li>\n<\/ul>\n<\/li>\n<li>Specialized tools:<ul class=\"wp-block-list\">\n<li>Oracle GoldenGate for CDC<\/li>\n<li>OCI Data Flow for advanced Spark workloads<\/li>\n<li>Data governance tools (OCI Data Catalog) for lineage and discovery<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Job roles that use it<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Data Engineer (OCI)<\/li>\n<li>Analytics Engineer<\/li>\n<li>Cloud Engineer \/ Platform Engineer (data platform)<\/li>\n<li>DevOps\/SRE supporting data pipelines<\/li>\n<li>Solution Architect (data and analytics)<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Certification path (if available)<\/h3>\n\n\n\n<p>Oracle certification offerings change over time. 
For current OCI certification paths, verify on Oracle University: https:\/\/education.oracle.com\/<\/p>\n\n\n\n<p>Look for OCI-focused tracks related to data management, integration, and analytics.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Project ideas for practice<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Build a raw-to-curated pipeline with partitioned loads (daily folders).<\/li>\n<li>Implement an idempotent load pattern (staging + merge).<\/li>\n<li>Add validation steps (row counts, null checks) and a failure notification pattern.<\/li>\n<li>Create separate dev\/prod workspaces and practice promoting artifacts.<\/li>\n<li>Practice cost governance: tag every resource and produce a weekly cost report by tag.<\/li>\n<\/ol>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">22. Glossary<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>ADW (Autonomous Data Warehouse):<\/strong> Oracle\u2019s managed analytics database service on OCI.<\/li>\n<li><strong>ATP (Autonomous Transaction Processing):<\/strong> Oracle\u2019s managed transactional database service on OCI.<\/li>\n<li><strong>Bucket:<\/strong> Object Storage container for objects (files).<\/li>\n<li><strong>Compartment:<\/strong> OCI logical container for resources and access control.<\/li>\n<li><strong>Control plane:<\/strong> Management layer (create\/update\/run configuration).<\/li>\n<li><strong>Data asset \/ connection:<\/strong> Definition of a source\/target system and how to connect to it.<\/li>\n<li><strong>Data flow:<\/strong> A transformation pipeline that reads, transforms, and writes data.<\/li>\n<li><strong>Data plane\/runtime:<\/strong> Execution layer that moves\/transforms data.<\/li>\n<li><strong>ETL\/ELT:<\/strong> Extract-Transform-Load \/ Extract-Load-Transform integration patterns.<\/li>\n<li><strong>IAM policy:<\/strong> Rules that define who can do what to which resources.<\/li>\n<li><strong>Idempotent load:<\/strong> A load that can be rerun 
without creating duplicates or incorrect results.<\/li>\n<li><strong>Object Storage namespace:<\/strong> Tenancy-level identifier used in Object Storage endpoints.<\/li>\n<li><strong>Pipeline:<\/strong> Orchestration of tasks with dependencies and run order.<\/li>\n<li><strong>Run \/ work request:<\/strong> Execution record of a task\/pipeline.<\/li>\n<li><strong>VCN:<\/strong> Virtual Cloud Network\u2014your private network in OCI.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">23. Summary<\/h2>\n\n\n\n<p>Oracle Cloud <strong>Data Integration<\/strong> is OCI\u2019s managed service in the <strong>Integration<\/strong> category for building and operating batch-oriented data ingestion and transformation pipelines. It fits best when you want OCI-native governance (IAM\/compartments\/tags), a visual development experience, and a managed runtime\u2014especially for common patterns like Object Storage to Autonomous Database loads.<\/p>\n\n\n\n<p>Cost is primarily driven by <strong>job execution consumption<\/strong> (verify exact billing units on the official pricing page) and by dependent services like Object Storage and Autonomous Database. Security and compliance depend on least-privilege IAM policies, careful credential handling, network design (public vs private endpoints), and using Audit\/run history for traceability.<\/p>\n\n\n\n<p>Use Data Integration when you need governed ETL\/ELT in OCI; consider GoldenGate for CDC and Data Flow for code-first Spark. 
Next, deepen skills by implementing idempotent patterns, environment promotion, and operational monitoring runbooks\u2014then validate everything against the official docs: https:\/\/docs.oracle.com\/en-us\/iaas\/data-integration\/home.htm<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Integration<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[48,62],"tags":[],"class_list":["post-917","post","type-post","status-publish","format-standard","hentry","category-integration","category-oracle-cloud"],"_links":{"self":[{"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/posts\/917","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/comments?post=917"}],"version-history":[{"count":0,"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/posts\/917\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/media?parent=917"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/categories?post=917"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/tags?post=917"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}