{"id":125,"date":"2026-04-12T22:01:37","date_gmt":"2026-04-12T22:01:37","guid":{"rendered":"https:\/\/www.devopsschool.com\/tutorials\/aws-amazon-datazone-tutorial-architecture-pricing-use-cases-and-hands-on-guide-for-analytics\/"},"modified":"2026-04-12T22:01:37","modified_gmt":"2026-04-12T22:01:37","slug":"aws-amazon-datazone-tutorial-architecture-pricing-use-cases-and-hands-on-guide-for-analytics","status":"publish","type":"post","link":"https:\/\/www.devopsschool.com\/tutorials\/aws-amazon-datazone-tutorial-architecture-pricing-use-cases-and-hands-on-guide-for-analytics\/","title":{"rendered":"AWS Amazon DataZone Tutorial: Architecture, Pricing, Use Cases, and Hands-On Guide for Analytics"},"content":{"rendered":"\n<h2 class=\"wp-block-heading\">Category<\/h2>\n\n\n\n<p>Analytics<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">1. Introduction<\/h2>\n\n\n\n<p>Amazon DataZone is an AWS Analytics service for organizing, discovering, governing, and sharing data across teams\u2014using a business-friendly data portal backed by AWS-native access controls.<\/p>\n\n\n\n<p>In simple terms: Amazon DataZone helps producers publish trusted datasets (\u201cdata products\u201d) and helps consumers find and request access to them, with approvals and governance built in.<\/p>\n\n\n\n<p>Technically, Amazon DataZone provides a <strong>domain-based<\/strong> data management layer (portal, catalog, and workflows) that integrates with AWS data systems such as <strong>Amazon S3 + AWS Glue Data Catalog + AWS Lake Formation<\/strong> and <strong>Amazon Redshift<\/strong>. It captures metadata, supports curation and ownership, and orchestrates subscription workflows so access can be granted in the underlying data systems.<\/p>\n\n\n\n<p>It solves a common problem in modern analytics platforms: <strong>data exists everywhere, but people can\u2019t find it, trust it, or get access to it safely<\/strong>. 
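<\/p>\n\n\n\n<p>To make that domain-based layer concrete, the same primitives are exposed through the AWS CLI. The sketch below is illustrative only; the domain name, account ID, role name, and domain identifier are placeholders, and you should verify the exact options for your AWS CLI version:<\/p>\n\n\n\n<pre><code class=\"language-bash\"># Create a domain: the top-level governance boundary that hosts the portal\naws datazone create-domain \\\n  --name \"analytics-domain\" \\\n  --domain-execution-role \"arn:aws:iam::111122223333:role\/DataZoneExecutionRole\"\n\n# Create a project inside the domain (a producer or consumer workspace)\naws datazone create-project \\\n  --domain-identifier \"dzd_EXAMPLE1234\" \\\n  --name \"sales-analytics\"\n<\/code><\/pre>\n\n\n\n<p>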
Amazon DataZone addresses discovery (catalog\/search), trust (metadata\/ownership), and controlled sharing (requests\/approvals + integrated permissions).<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">2. What is Amazon DataZone?<\/h2>\n\n\n\n<p><strong>Official purpose (what AWS built it for):<\/strong> Amazon DataZone is a managed data management service that helps organizations catalog, discover, share, and govern data at scale using a data portal, data products, and workflow-driven access.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Core capabilities<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Data discovery and cataloging:<\/strong> Bring technical metadata from supported sources into a searchable catalog.<\/li>\n<li><strong>Business context:<\/strong> Add business metadata (descriptions, owners, glossary terms) so data is understandable to non-experts.<\/li>\n<li><strong>Data products:<\/strong> Package datasets with context and ownership so they can be shared reliably.<\/li>\n<li><strong>Subscription workflows:<\/strong> Let consumers request access; route requests to approvers; track status.<\/li>\n<li><strong>Governed access:<\/strong> Integrate with AWS governance services (notably AWS Lake Formation for data lakes) so approvals can map to permissions in the underlying data system (capabilities vary by source type\u2014verify in official docs for your connectors and blueprints).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Major components (mental model)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Domain:<\/strong> The top-level boundary (like a \u201cdata organization\u201d) that hosts the data portal and governance model.<\/li>\n<li><strong>Data portal:<\/strong> The user-facing web experience for search, publishing, and requesting access.<\/li>\n<li><strong>Projects:<\/strong> Collaboration spaces where teams publish data products (producers) or consume them (consumers).<\/li>\n<li><strong>Environments \/ Environment profiles \/ 
Blueprints:<\/strong> A structured way to attach real AWS data systems (for example, a data lake or data warehouse) to a project. Amazon DataZone uses these to understand where data lives and where access should be provisioned.<\/li>\n<li><strong>Data sources and runs:<\/strong> Connect to supported metadata sources (for example, AWS Glue Data Catalog) and import assets into the catalog.<\/li>\n<li><strong>Data assets:<\/strong> Catalog entries representing tables, views, files, etc. (depending on source).<\/li>\n<li><strong>Data products:<\/strong> Curated, publishable groupings of assets with business metadata and ownership.<\/li>\n<li><strong>Subscriptions:<\/strong> Requests and approvals for access to a data product (and, where supported, automated provisioning of permissions).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Service type<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Managed AWS service<\/strong> with a console experience (portal) and APIs.<\/li>\n<li>Typically used as a <strong>governance and enablement layer<\/strong> above data lakes\/warehouses, not as a storage or compute engine itself.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Scope (regional\/global, and how to think about boundaries)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Regional service in practice:<\/strong> A domain is created in a specific AWS Region and is managed there. 
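<\/li>\n<\/ul>\n\n\n\n<p>Because the domain is Region-scoped, a quick sanity check is to list the domains visible in the Region you target. This is a hedged sketch; the Region value is a placeholder:<\/p>\n\n\n\n<pre><code class=\"language-bash\"># Domains are managed per Region; list what exists where you plan to work\naws datazone list-domains --region us-east-1\n<\/code><\/pre>\n\n\n\n<ul class=\"wp-block-list\">\n<li>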
Data sources and environment integrations are typically Region-aligned.<br\/>\n<strong>Verify current Region support and cross-Region behavior in official docs<\/strong>, because supported Regions and cross-account\/cross-Region patterns evolve.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Fit in the AWS ecosystem<\/h3>\n\n\n\n<p>Amazon DataZone is not a replacement for your lake or warehouse; it sits \u201cabove\u201d them:\n&#8211; <strong>Amazon S3<\/strong> stores data.\n&#8211; <strong>AWS Glue Data Catalog<\/strong> stores table metadata for lake datasets.\n&#8211; <strong>AWS Lake Formation<\/strong> governs fine-grained access to S3\/Glue-based data lakes.\n&#8211; <strong>Amazon Athena<\/strong> queries S3 data (and uses Glue catalog metadata).\n&#8211; <strong>Amazon Redshift<\/strong> provides warehouse analytics and access controls.\n&#8211; <strong>AWS IAM Identity Center<\/strong> commonly provides workforce identity for the portal.\n&#8211; <strong>AWS CloudTrail<\/strong> provides audit trails of API activity.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">3. 
Why use Amazon DataZone?<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Business reasons<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Faster analytics outcomes:<\/strong> Teams spend less time hunting for data and negotiating access.<\/li>\n<li><strong>Improved trust:<\/strong> Data products introduce ownership, context, and quality signals (as defined by your org).<\/li>\n<li><strong>Scalable collaboration:<\/strong> Helps align many teams around a consistent catalog and request process.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Technical reasons<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Metadata unification:<\/strong> Connects technical metadata to business context without forcing you to move data.<\/li>\n<li><strong>Workflow-driven access:<\/strong> Standardizes \u201crequest \u2192 approve \u2192 provision\u201d patterns instead of ad hoc tickets.<\/li>\n<li><strong>AWS-native integration:<\/strong> Builds on AWS governance constructs (especially relevant for S3\/Glue\/Lake Formation-based lakes).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Operational reasons<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Repeatable governance model:<\/strong> Domains, projects, and roles formalize who can publish and approve.<\/li>\n<li><strong>Reduced manual access management:<\/strong> Where supported, approvals can translate into permissions automatically.<\/li>\n<li><strong>Self-service discovery:<\/strong> Portal experience reduces support load on platform teams.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Security\/compliance reasons<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Controlled sharing:<\/strong> Access is requested explicitly, often with approvers and audit trails.<\/li>\n<li><strong>Least privilege:<\/strong> Aligns with IAM\/Lake Formation\/Redshift controls rather than broad bucket sharing.<\/li>\n<li><strong>Auditability:<\/strong> Actions can be logged via AWS logging\/auditing tools 
(notably CloudTrail).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Scalability\/performance reasons<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Amazon DataZone primarily manages metadata and workflows; analytics performance depends on the underlying engines (Athena, Redshift, etc.). Scaling data discovery and governance becomes more manageable because it is centralized and standardized.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">When teams should choose Amazon DataZone<\/h3>\n\n\n\n<p>Choose it when you need:\n&#8211; A <strong>business-friendly data portal<\/strong> for discovery and sharing\n&#8211; A <strong>data product<\/strong> model to drive ownership and reuse\n&#8211; A <strong>standard access request process<\/strong> that can integrate with AWS governance controls\n&#8211; A <strong>multi-team or multi-account<\/strong> governance layer on AWS<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">When teams should not choose Amazon DataZone<\/h3>\n\n\n\n<p>Avoid (or delay) it when:\n&#8211; You only have a small number of datasets and informal sharing is sufficient.\n&#8211; You need a full enterprise data governance suite with deep, vendor-agnostic capabilities across many non-AWS platforms\u2014Amazon DataZone may still fit, but validate connectors and governance requirements first.\n&#8211; Your primary challenge is data transformation\/ETL\/ELT\u2014Amazon DataZone is not an ETL service (consider AWS Glue, EMR, or managed third-party tools).\n&#8211; You need an on-prem-first catalog with extensive custom integrations; Amazon DataZone is AWS-managed and AWS-integrated by design.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">4. 
Where is Amazon DataZone used?<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Industries<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Financial services:<\/strong> governed access, audit trails, segregation of duties.<\/li>\n<li><strong>Healthcare &amp; life sciences:<\/strong> controlled sharing of sensitive datasets and consistent dataset documentation.<\/li>\n<li><strong>Retail &amp; e-commerce:<\/strong> cross-team sharing (marketing, merchandising, supply chain) and reuse.<\/li>\n<li><strong>Media &amp; advertising:<\/strong> discoverability across event, audience, and campaign datasets.<\/li>\n<li><strong>Manufacturing &amp; IoT:<\/strong> sharing telemetry-derived datasets across engineering, operations, and analytics.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Team types<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Data platform teams building an internal data portal<\/li>\n<li>Data engineering teams publishing curated datasets<\/li>\n<li>Governance and security teams defining access patterns<\/li>\n<li>Analytics\/BI teams consuming governed datasets<\/li>\n<li>ML teams discovering feature-ready datasets<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Workloads and architectures<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Lake house<\/strong> on AWS: S3 + Glue + Lake Formation + Athena\/Redshift<\/li>\n<li>Data mesh-style organizations with domain-oriented ownership (marketing, finance, product, etc.)<\/li>\n<li>Multi-account analytics landing zones (central governance account + producer\/consumer accounts)<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Real-world deployment contexts<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Production:<\/strong> typically includes cross-account integration, formal roles\/groups, approval workflows, and integration with Lake Formation\/Redshift permissioning.<\/li>\n<li><strong>Dev\/test:<\/strong> often a single account and a simplified governance model to validate portal, 
metadata ingestion, and subscription flows.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">5. Top Use Cases and Scenarios<\/h2>\n\n\n\n<p>Below are realistic scenarios where Amazon DataZone is commonly applied.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">1) Enterprise data catalog for AWS lake house<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem:<\/strong> Data is spread across S3\/Glue\/Athena and Redshift; people can\u2019t find the right dataset.<\/li>\n<li><strong>Why Amazon DataZone fits:<\/strong> Central portal + metadata ingestion + ownership and descriptions.<\/li>\n<li><strong>Scenario:<\/strong> A platform team sets up one domain; each data domain creates projects to publish curated \u201cgold\u201d datasets as data products.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">2) Data mesh enablement (domain-based ownership)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem:<\/strong> Central data team becomes a bottleneck for publishing and access.<\/li>\n<li><strong>Why it fits:<\/strong> Projects + data products allow decentralization with governance guardrails.<\/li>\n<li><strong>Scenario:<\/strong> Finance, marketing, and product each publish their own data products; consumers request access through standardized workflows.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">3) Standardized access requests and approvals<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem:<\/strong> Access is handled via tickets and manual IAM\/Lake Formation changes.<\/li>\n<li><strong>Why it fits:<\/strong> Subscriptions with approval workflow; can reduce manual steps.<\/li>\n<li><strong>Scenario:<\/strong> A consumer requests access to a PII-limited dataset; approval routes to data owner and compliance approver.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">4) Curated \u201ccertified datasets\u201d program<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem:<\/strong> Analysts use inconsistent datasets 
and definitions.<\/li>\n<li><strong>Why it fits:<\/strong> Data products + metadata and ownership; \u201ccertified\u201d can be represented via naming\/tagging conventions and governance.<\/li>\n<li><strong>Scenario:<\/strong> The data governance team defines product templates for certified revenue metrics datasets.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">5) Cross-account dataset sharing within an AWS Organization<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem:<\/strong> Producer and consumer teams operate in separate AWS accounts for isolation.<\/li>\n<li><strong>Why it fits:<\/strong> Domain\/projects can be designed to map to multi-account environments (verify supported multi-account patterns in docs).<\/li>\n<li><strong>Scenario:<\/strong> A shared services account hosts the domain; producer accounts publish; consumer accounts subscribe.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">6) Onboarding new analysts and engineers faster<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem:<\/strong> New hires don\u2019t know what data exists or what it means.<\/li>\n<li><strong>Why it fits:<\/strong> Portal search + glossary + business descriptions reduce tribal knowledge.<\/li>\n<li><strong>Scenario:<\/strong> New analyst searches \u201ccustomer churn,\u201d finds a product with definition, owner, and sample query.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">7) Governance for sensitive datasets (PII\/PHI)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem:<\/strong> Sensitive datasets need controlled access and traceability.<\/li>\n<li><strong>Why it fits:<\/strong> Subscription workflows + integration with AWS governance controls.<\/li>\n<li><strong>Scenario:<\/strong> HR data products require manager approval and documented purpose before access.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">8) Rationalizing duplicate datasets<\/h3>\n\n\n\n<ul 
class=\"wp-block-list\">\n<li><strong>Problem:<\/strong> Multiple teams publish the same dataset in different places.<\/li>\n<li><strong>Why it fits:<\/strong> Central discovery and ownership highlights duplication.<\/li>\n<li><strong>Scenario:<\/strong> Two \u201corders\u201d tables exist; DataZone metadata helps identify authoritative vs legacy.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">9) Data producer\/consumer contract management<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem:<\/strong> Consumers break when producers change schema.<\/li>\n<li><strong>Why it fits:<\/strong> Data products provide a durable interface concept; governance can formalize change processes.<\/li>\n<li><strong>Scenario:<\/strong> Producer publishes v1 and v2 products; consumers migrate with communicated timelines.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">10) Audit-friendly reporting of who has access to what<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem:<\/strong> Compliance asks who accessed datasets and who approved.<\/li>\n<li><strong>Why it fits:<\/strong> Workflow trail + integration with AWS auditing (CloudTrail) and underlying service logs.<\/li>\n<li><strong>Scenario:<\/strong> Quarterly audit exports subscription approvals and validates Lake Formation grants.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">6. Core Features<\/h2>\n\n\n\n<blockquote>\n<p>Note: Feature availability and exact integrations can vary by Region and by source\/blueprint. 
Always validate in the official Amazon DataZone documentation for your Region and data systems.<\/p>\n<\/blockquote>\n\n\n\n<h3 class=\"wp-block-heading\">1) Domains (governance boundary)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does:<\/strong> Creates a dedicated Amazon DataZone space with a portal, user access model, and governance configuration.<\/li>\n<li><strong>Why it matters:<\/strong> Separates governance and cataloging across organizations\/units if needed.<\/li>\n<li><strong>Practical benefit:<\/strong> Clear boundary for policy, roles, and operational ownership.<\/li>\n<li><strong>Caveat:<\/strong> Domain is Region-scoped in practice; plan for Region strategy.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">2) Data portal (search and discovery UI)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does:<\/strong> User interface to search, browse, and understand data assets and data products.<\/li>\n<li><strong>Why it matters:<\/strong> Enables self-service discovery for analysts and engineers.<\/li>\n<li><strong>Practical benefit:<\/strong> Reduced dependency on data platform teams.<\/li>\n<li><strong>Caveat:<\/strong> Users typically authenticate via AWS workforce identity integration (commonly IAM Identity Center).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">3) Projects (team workspaces)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does:<\/strong> Organizes producers\/consumers into collaborative scopes for publishing and subscriptions.<\/li>\n<li><strong>Why it matters:<\/strong> Aligns governance and collaboration to team boundaries.<\/li>\n<li><strong>Practical benefit:<\/strong> Ownership, approvals, and assets can be managed per project.<\/li>\n<li><strong>Caveat:<\/strong> Project role design (owner\/contributor) must be planned to avoid over-permissioning.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">4) Environments and environment profiles (link projects to real data 
systems)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does:<\/strong> Connects a project to a target environment where data lives or where access is provisioned.<\/li>\n<li><strong>Why it matters:<\/strong> Data governance is only effective when tied to actual systems and permissions.<\/li>\n<li><strong>Practical benefit:<\/strong> Standardizes how teams onboard new data systems.<\/li>\n<li><strong>Caveat:<\/strong> Provisioning may require prerequisite IAM roles and (for lakes) Lake Formation configuration.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">5) Data sources and metadata ingestion (\u201cruns\u201d)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does:<\/strong> Imports technical metadata from supported systems into the Amazon DataZone catalog.<\/li>\n<li><strong>Why it matters:<\/strong> Keeps the portal aligned with real datasets.<\/li>\n<li><strong>Practical benefit:<\/strong> Reduces manual catalog entry and drift.<\/li>\n<li><strong>Caveat:<\/strong> Permissions must allow metadata read; ingestion is not the same as data movement.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">6) Data assets (catalog entries)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does:<\/strong> Represents datasets (tables, views, etc.) 
discovered from a data source and enriched with metadata.<\/li>\n<li><strong>Why it matters:<\/strong> Assets are the atomic discoverable items in the portal.<\/li>\n<li><strong>Practical benefit:<\/strong> Enables searching by name, description, schema, and other metadata.<\/li>\n<li><strong>Caveat:<\/strong> Asset types depend on the connected source.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">7) Data products (curated shareable packages)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does:<\/strong> Packages one or more assets with business metadata (purpose, owner, usage notes).<\/li>\n<li><strong>Why it matters:<\/strong> Promotes reliable reuse and establishes accountability.<\/li>\n<li><strong>Practical benefit:<\/strong> Consumers subscribe to a product rather than random tables.<\/li>\n<li><strong>Caveat:<\/strong> You need internal standards (naming, owners, lifecycle) to make products meaningful.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">8) Subscription requests and approval workflows<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does:<\/strong> Lets consumers request access; routes to approvers; tracks status.<\/li>\n<li><strong>Why it matters:<\/strong> Replaces ad hoc access paths with auditable workflows.<\/li>\n<li><strong>Practical benefit:<\/strong> Consistent governance and reduced operational chaos.<\/li>\n<li><strong>Caveat:<\/strong> Automated provisioning depends on environment\/source integration; otherwise approvals may require manual follow-up.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">9) Integrated access provisioning (where supported)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does:<\/strong> After approvals, configures access in underlying systems (for example, data lake permissions).<\/li>\n<li><strong>Why it matters:<\/strong> Governance must translate into enforceable permissions.<\/li>\n<li><strong>Practical benefit:<\/strong> Reduces manual policy 
updates.<\/li>\n<li><strong>Caveat:<\/strong> The exact behavior depends on the underlying systems and your setup; verify for your blueprint\/source.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">10) Business metadata and glossary concepts<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does:<\/strong> Enables consistent business definitions and context.<\/li>\n<li><strong>Why it matters:<\/strong> Technical schemas alone are not enough for trust and comprehension.<\/li>\n<li><strong>Practical benefit:<\/strong> Reduces ambiguity (\u201cWhat is \u2018customer\u2019?\u201d).<\/li>\n<li><strong>Caveat:<\/strong> Glossary success depends on governance ownership and upkeep.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">11) APIs, SDK support, and automation potential<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does:<\/strong> Offers programmatic access to manage domains\/projects\/assets\/subscriptions (via AWS APIs\/CLI where supported).<\/li>\n<li><strong>Why it matters:<\/strong> Enables infrastructure-as-code adjacent workflows and integration with internal tooling.<\/li>\n<li><strong>Practical benefit:<\/strong> Automate onboarding, metadata updates, and reporting.<\/li>\n<li><strong>Caveat:<\/strong> Not all portal actions may be exposed equally; validate API coverage.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">12) Auditing and observability via AWS-native tooling<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does:<\/strong> Integrates with AWS audit logging for API activity (notably CloudTrail).<\/li>\n<li><strong>Why it matters:<\/strong> Governance requires traceability.<\/li>\n<li><strong>Practical benefit:<\/strong> Supports compliance and incident investigations.<\/li>\n<li><strong>Caveat:<\/strong> You must configure organization-level logging and retention to meet requirements.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">7. 
Architecture and How It Works<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">High-level architecture<\/h3>\n\n\n\n<p>Amazon DataZone sits between users and data systems:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Producers<\/strong> connect data sources (like AWS Glue Data Catalog) to import metadata into Amazon DataZone.<\/li>\n<li>Producers curate <strong>assets<\/strong> into <strong>data products<\/strong> and publish them to the portal.<\/li>\n<li><strong>Consumers<\/strong> search the portal, evaluate metadata, and submit <strong>subscription requests<\/strong>.<\/li>\n<li>Approvers review requests; when approved, Amazon DataZone (where supported) provisions permissions in the underlying data system (for example, Lake Formation grants) and tracks subscription status.<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\">Control plane vs data plane<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Control plane (Amazon DataZone):<\/strong> metadata, workflows, approvals, catalog, and portal.<\/li>\n<li><strong>Data plane (your data systems):<\/strong> S3\/Glue\/Athena\/Redshift where data is stored and queried.<\/li>\n<li>Important: Amazon DataZone generally does not become your query engine; your analytics engines remain Athena, Redshift, etc.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Integrations and dependency services (common)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>AWS IAM Identity Center<\/strong> for user sign-in to the portal (verify identity prerequisites for your setup).<\/li>\n<li><strong>AWS Glue Data Catalog<\/strong> as a metadata source for lake datasets.<\/li>\n<li><strong>AWS Lake Formation<\/strong> for governed permissions in S3-based data lakes.<\/li>\n<li><strong>Amazon Athena<\/strong> for querying S3 datasets.<\/li>\n<li><strong>Amazon Redshift<\/strong> for warehouse datasets and access control (verify supported patterns).<\/li>\n<li><strong>AWS CloudTrail<\/strong> for 
auditing.<\/li>\n<li><strong>AWS KMS<\/strong> for encryption at rest (service-managed or customer-managed keys depending on configuration; verify options in your Region).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Security\/authentication model (typical)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>End users authenticate to the portal (commonly via IAM Identity Center).<\/li>\n<li>Amazon DataZone uses AWS IAM roles to access metadata sources and to provision permissions.<\/li>\n<li>Fine-grained access enforcement occurs in the underlying data system (Lake Formation, Redshift, etc.).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Networking model (typical)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Portal access is over standard AWS service endpoints.<\/li>\n<li>Access to your data plane remains within your AWS accounts and VPCs as configured for Athena\/Redshift and S3.<\/li>\n<li>If you require private connectivity (VPC endpoints\/PrivateLink), <strong>verify current support for Amazon DataZone endpoints<\/strong> in the official docs.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Monitoring\/logging\/governance considerations<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Use <strong>CloudTrail<\/strong> to log Amazon DataZone API calls and correlate with user activity.<\/li>\n<li>Monitor provisioning failures (environment creation, subscription provisioning) via the Amazon DataZone console and underlying services (CloudFormation\/Service Catalog\/Lake Formation\/Redshift logs depending on blueprint).<\/li>\n<li>Establish governance KPIs: number of published products, subscription turnaround time, rejected requests, stale assets.<\/li>\n<\/ul>\n\n\n\n<h4 class=\"wp-block-heading\">Simple architecture diagram (conceptual)<\/h4>\n\n\n\n<pre><code class=\"language-mermaid\">flowchart LR\n  U1[Producer] --&gt; P[Amazon DataZone Portal]\n  U2[Consumer] --&gt; P\n\n  subgraph DZ[Amazon DataZone Domain]\n    P --&gt; C[Catalog &amp; 
Metadata]\n    P --&gt; W[Subscription Workflow]\n  end\n\n  subgraph Lake[AWS Data Lake]\n    S3[(Amazon S3)]\n    Glue[(AWS Glue Data Catalog)]\n    LF[AWS Lake Formation]\n    Athena[Amazon Athena]\n    S3 --- Glue\n    LF --- S3\n    Athena --- Glue\n    Athena --- S3\n  end\n\n  C &lt;---&gt; Glue\n  W --&gt; LF\n<\/code><\/pre>\n\n\n\n<h4 class=\"wp-block-heading\">Production-style architecture diagram (multi-account, governed)<\/h4>\n\n\n\n<pre><code class=\"language-mermaid\">flowchart TB\n  subgraph IdP[Workforce Identity]\n    IIC[AWS IAM Identity Center]\n  end\n\n  subgraph GovAcct[Shared Services \/ Governance Account]\n    DZ[Amazon DataZone Domain + Portal]\n    CT[Org CloudTrail + Central Log Archive]\n  end\n\n  subgraph ProdAcct1[\"Producer Account(s)\"]\n    S3P[(S3 data lake buckets)]\n    GlueP[(Glue Data Catalog)]\n    LFP[AWS Lake Formation]\n    ETL[AWS Glue \/ ETL jobs]\n  end\n\n  subgraph ConsAcct1[\"Consumer Account(s)\"]\n    AthenaC[Amazon Athena]\n    RedshiftC[Amazon Redshift]\n    RoleC[IAM roles for consumers]\n  end\n\n  IIC --&gt; DZ\n  DZ --&gt;|Ingest metadata| GlueP\n  DZ --&gt;|Subscription approvals| DZ\n  DZ --&gt;|\"Provision governed access (where supported)\"| LFP\n  LFP --&gt;|Permissions| S3P\n\n  AthenaC --&gt; GlueP\n  AthenaC --&gt; S3P\n\n  DZ --&gt; CT\n  GlueP --&gt; CT\n  LFP --&gt; CT\n  AthenaC --&gt; CT\n  RedshiftC --&gt; CT\n<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">8. 
Prerequisites<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">AWS account and org prerequisites<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>An AWS account where you can create and manage Amazon DataZone resources.<\/li>\n<li>For enterprise patterns: AWS Organizations and multiple accounts are common, but not required for this lab.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Permissions \/ IAM roles<\/h3>\n\n\n\n<p>You typically need permissions to:\n&#8211; Create and manage an Amazon DataZone domain and projects\n&#8211; Configure IAM Identity Center (if not already configured)\n&#8211; Read metadata from AWS Glue Data Catalog\n&#8211; Create S3 buckets and upload a small dataset\n&#8211; Run Athena queries and create Glue catalog objects (database\/table)\n&#8211; (Optional but common) Configure Lake Formation administrators and grants<\/p>\n\n\n\n<p>A practical starting point for a lab:\n&#8211; Admin-level permissions in a sandbox account, or a role that includes:\n  &#8211; <code>AmazonDataZoneFullAccess<\/code> (or equivalent fine-grained permissions\u2014verify AWS managed policies available in your account)\n  &#8211; <code>IAMFullAccess<\/code> (for role creation\/inspection during lab)\n  &#8211; <code>AmazonS3FullAccess<\/code>\n  &#8211; <code>AmazonAthenaFullAccess<\/code>\n  &#8211; <code>AWSGlueConsoleFullAccess<\/code>\n  &#8211; <code>AWSLakeFormationDataAdmin<\/code> (if Lake Formation governance is part of your flow)<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Billing requirements<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>A billable AWS account.<\/li>\n<li>Amazon DataZone has its own pricing plus indirect costs from S3, Glue, Athena, Lake Formation, and data transfer.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Tools needed<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>AWS Console access<\/li>\n<li>AWS CLI v2 (optional but useful): https:\/\/docs.aws.amazon.com\/cli\/latest\/userguide\/getting-started-install.html<\/li>\n<li>A 
text editor for CSV data<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Region availability<\/h3>\n\n\n\n<p>Amazon DataZone is not available in every Region. Verify support at https:\/\/aws.amazon.com\/about-aws\/global-infrastructure\/regional-product-services\/<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Quotas\/limits<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Check <strong>Service Quotas<\/strong> for Amazon DataZone in your Region.<\/li>\n<li>Also consider quotas for Glue, Athena, Lake Formation, IAM Identity Center, and CloudFormation\/Service Catalog if used by environment provisioning.<br\/>\n  Verify quotas in the Service Quotas console and the Amazon DataZone documentation.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Prerequisite services<\/h3>\n\n\n\n<p>Common dependencies in real deployments:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>AWS IAM Identity Center<\/li>\n<li>AWS Glue Data Catalog<\/li>\n<li>Amazon S3<\/li>\n<li>Amazon Athena and\/or Amazon Redshift<\/li>\n<li>AWS Lake Formation (for governed S3 lake access)<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">9. Pricing \/ Cost<\/h2>\n\n\n\n<p>Amazon DataZone pricing is usage-based. 
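<\/p>\n\n\n\n<p>To make the usage-based model concrete, the following minimal Python sketch turns the main cost dimensions into a monthly estimate. Every unit rate and dimension name here is a placeholder invented for illustration; take the real dimensions and rates from the official pricing page before trusting any numbers.<\/p>

```python
# Rough monthly cost model for an Amazon DataZone pilot.
# All unit rates below are PLACEHOLDERS, not AWS prices -- verify the
# current pricing dimensions and rates on the official pricing page.

def estimate_monthly_cost(
    domain_rate_per_hour: float,   # hypothetical domain-hour rate
    domain_hours: float,           # hours the domain exists this month
    user_rate_per_month: float,    # hypothetical per-active-user rate
    active_users: int,
    athena_rate_per_tb: float,     # Athena charge per TB scanned
    athena_tb_scanned: float,
) -> dict:
    """Return a per-dimension breakdown plus a total."""
    breakdown = {
        'datazone_domain': domain_rate_per_hour * domain_hours,
        'datazone_users': user_rate_per_month * active_users,
        'athena_scans': athena_rate_per_tb * athena_tb_scanned,
    }
    breakdown['total'] = sum(breakdown.values())
    return breakdown

# Example with made-up unit rates, for a domain running a full month
# (roughly 730 hours) with 25 active portal users:
estimate = estimate_monthly_cost(0.25, 730, 9.00, 25, 5.00, 0.5)
```

<p>Swap in verified rates and compare the result against the AWS Pricing Calculator for your Region.<\/p>\n\n\n\n<p>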
The exact dimensions and rates can change and can be Region-dependent.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Official pricing page: https:\/\/aws.amazon.com\/datazone\/pricing\/<\/li>\n<li>AWS Pricing Calculator: https:\/\/calculator.aws\/#\/<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Pricing dimensions (what typically drives cost)<\/h3>\n\n\n\n<p>In line with AWS\u2019s usual approach to governance and catalog services, Amazon DataZone commonly charges on one or both of the following dimensions:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Domain usage<\/strong> (for example, an hourly \u201cdomain\u201d charge)<\/li>\n<li><strong>User-based access<\/strong> (for example, active users per month)<\/li>\n<\/ul>\n\n\n\n<p><strong>Verify the current pricing dimensions and definitions on the official pricing page<\/strong>, including how AWS defines \u201cusers\u201d (for example, portal users) and what constitutes billable domain time.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Free tier<\/h3>\n\n\n\n<p>Amazon DataZone may or may not offer a free tier or trial, depending on your Region and the current offer period. <strong>Verify on the official pricing page<\/strong>.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Direct cost drivers<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Number of domains<\/li>\n<li>Number of users accessing the portal (depending on pricing model)<\/li>\n<li>Duration domains are running (if domain-hour based)<\/li>\n<li>Automation\/API usage (if requests are billed\u2014verify)<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Hidden or indirect costs (often bigger than the DataZone line item)<\/h3>\n\n\n\n<p>Even if Amazon DataZone costs are modest, the total solution cost often comes from:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>S3 storage<\/strong> (data lake)<\/li>\n<li><strong>Athena query scanning<\/strong> (pay per data scanned)<\/li>\n<li><strong>Glue<\/strong> (crawlers, jobs, Data Catalog requests)<\/li>\n<li><strong>Redshift<\/strong> (clusters\/serverless usage)<\/li>\n<li><strong>CloudTrail + log storage<\/strong> (especially org-wide)<\/li>\n<li><strong>Data transfer<\/strong> (cross-Region, cross-account, NAT gateways; depends on architecture)<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Network\/data transfer implications<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>DataZone mostly handles metadata and control-plane actions, but if your consumers query data across accounts\/Regions or through NAT gateways, data transfer can dominate costs.<\/li>\n<li>Prefer same-Region access patterns for lake queries where possible.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">How to optimize cost<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Use <strong>one domain per governance boundary<\/strong>, not per team, unless required.<\/li>\n<li>Manage portal access through groups; remove inactive users promptly.<\/li>\n<li>Keep dev\/test domains short-lived if domain-hour pricing applies.<\/li>\n<li>Reduce Athena scan costs with partitioning and columnar formats (Parquet\/ORC) and use workgroup controls.<\/li>\n<li>Use Glue crawlers sparingly; prefer schema-on-read or controlled DDL where possible.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Example low-cost starter estimate (formula-based)<\/h3>\n\n\n\n<p>A minimal pilot often includes:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>1 Amazon DataZone domain<\/li>\n<li>10\u201350 portal users<\/li>\n<li>A small S3 bucket with sample data<\/li>\n<li>A few Athena queries and a Glue database\/table<\/li>\n<\/ul>\n\n\n\n<p>Estimate:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Amazon DataZone<\/strong>: (domain_rate \u00d7 hours) + (user_rate \u00d7 users) <em>(verify rates)<\/em><\/li>\n<li><strong>S3<\/strong>: storage_gb_month + requests<\/li>\n<li><strong>Athena<\/strong>: data_scanned_tb \u00d7 rate_per_tb <em>(verify)<\/em><\/li>\n<li><strong>Glue<\/strong>: crawler_minutes\/job_minutes <em>(if used)<\/em><\/li>\n<\/ul>\n\n\n\n<p>Use the AWS Pricing Calculator to model your specific Region and usage.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Example production cost considerations<\/h3>\n\n\n\n<p>In 
production, cost drivers shift:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Many more users (hundreds\/thousands)<\/li>\n<li>Multiple accounts and environments<\/li>\n<li>Larger query volumes (Athena\/Redshift)<\/li>\n<li>Higher logging and retention requirements<\/li>\n<li>More governance overhead (crawlers, metadata refresh schedules)<\/li>\n<\/ul>\n\n\n\n<p>A best practice is to track:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>cost per domain<\/li>\n<li>cost per active portal user<\/li>\n<li>cost per subscribed product<\/li>\n<li>cost per query workload (Athena\/Redshift)<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">10. Step-by-Step Hands-On Tutorial<\/h2>\n\n\n\n<p>This lab walks you through a small, realistic Amazon DataZone setup in a single AWS account:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Create an Amazon DataZone domain<\/li>\n<li>Create producer and consumer projects<\/li>\n<li>Register a Glue Data Catalog table as a discoverable asset<\/li>\n<li>Publish a data product<\/li>\n<li>Request and approve a subscription<\/li>\n<li>Validate outcomes and clean up safely<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Objective<\/h3>\n\n\n\n<p>Build a minimal governed data-sharing workflow using Amazon DataZone with an S3 + Athena + Glue dataset.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Lab Overview<\/h3>\n\n\n\n<p>You will:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Prepare a small dataset in Amazon S3 and register it in the AWS Glue Data Catalog (via Athena DDL).<\/li>\n<li>Create an Amazon DataZone domain (portal).<\/li>\n<li>Create two projects: <strong>Producer<\/strong> and <strong>Consumer<\/strong>.<\/li>\n<li>Configure a metadata source (Glue Data Catalog) and run ingestion.<\/li>\n<li>Create and publish a data product from the ingested asset.<\/li>\n<li>Request access to the data product from the consumer project and approve it.<\/li>\n<li>Validate that the subscription completes (and optionally validate Lake Formation grants if your setup uses LF).<\/li>\n<\/ol>\n\n\n\n<blockquote>\n<p>Notes before you begin:\n&#8211; Amazon DataZone requires a supported Region. Pick one where it is available.\n&#8211; Identity setup varies by account. 
Many setups use IAM Identity Center.\n&#8211; If any step differs in your console (AWS updates UI often), follow the closest equivalent in the official docs: https:\/\/docs.aws.amazon.com\/datazone\/<\/p>\n<\/blockquote>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 1: Choose a Region and confirm prerequisites<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li>In the AWS Console, switch to a Region where Amazon DataZone is supported.<\/li>\n<li>Confirm you can access:\n   &#8211; Amazon S3\n   &#8211; Amazon Athena\n   &#8211; AWS Glue Data Catalog\n   &#8211; Amazon DataZone<\/li>\n<\/ol>\n\n\n\n<p><strong>Expected outcome:<\/strong> You have selected a supported Region and can open the Amazon DataZone console.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 2: Create a small dataset in Amazon S3<\/h3>\n\n\n\n<p>Create a bucket (choose a globally unique name):<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Go to <strong>Amazon S3 \u2192 Buckets \u2192 Create bucket<\/strong><\/li>\n<li>Bucket name: <code>datazone-lab-&lt;yourname&gt;-&lt;region&gt;-&lt;unique&gt;<\/code><\/li>\n<li>Keep defaults (block public access ON).<\/li>\n<\/ol>\n\n\n\n<p>Create a small CSV file named <code>customers.csv<\/code>:<\/p>\n\n\n\n<pre><code class=\"language-csv\">customer_id,customer_name,segment,signup_date,country\n1,Acme Corp,enterprise,2024-01-10,US\n2,Bluebird Ltd,midmarket,2024-02-12,GB\n3,Cedar GmbH,enterprise,2024-03-05,DE\n4,Delta Co,smb,2024-03-18,US\n<\/code><\/pre>\n\n\n\n<p>Upload it to S3:\n&#8211; Create a folder prefix: <code>datasets\/customers\/<\/code>\n&#8211; Upload <code>customers.csv<\/code> to <code>s3:\/\/&lt;your-bucket&gt;\/datasets\/customers\/customers.csv<\/code><\/p>\n\n\n\n<p><strong>Expected outcome:<\/strong> The CSV file is stored in S3 at a stable location.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 3: Create an 
Athena database and external table (Glue Data Catalog)<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Open <strong>Amazon Athena<\/strong>.<\/li>\n<li>\n<p>Set a query results location (required):<br\/>\n   &#8211; Athena \u2192 Settings \u2192 Manage \u2192 set query result location to something like:<br\/>\n<code>s3:\/\/&lt;your-bucket&gt;\/athena-results\/<\/code><\/p>\n<\/li>\n<li>\n<p>In Athena, run:<\/p>\n<\/li>\n<\/ol>\n\n\n\n<pre><code class=\"language-sql\">CREATE DATABASE IF NOT EXISTS datazone_lab;\n<\/code><\/pre>\n\n\n\n<ol class=\"wp-block-list\" start=\"4\">\n<li>Create an external table pointing to your CSV:<\/li>\n<\/ol>\n\n\n\n<pre><code class=\"language-sql\">CREATE EXTERNAL TABLE IF NOT EXISTS datazone_lab.customers_csv (\n  customer_id int,\n  customer_name string,\n  segment string,\n  signup_date string,\n  country string\n)\nROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'\nWITH SERDEPROPERTIES (\n  'separatorChar' = ',',\n  'quoteChar'     = '\\\"',\n  'escapeChar'    = '\\\\'\n)\nSTORED AS TEXTFILE\nLOCATION 's3:\/\/&lt;your-bucket&gt;\/datasets\/customers\/'\nTBLPROPERTIES ('skip.header.line.count'='1');\n<\/code><\/pre>\n\n\n\n<p>Note: <code>signup_date<\/code> is declared as <code>string<\/code> because OpenCSVSerde only reads <code>date<\/code> columns in UNIX numeric format; for ISO dates like <code>2024-01-10<\/code>, keep the column as a string and use <code>CAST(signup_date AS DATE)<\/code> in queries when you need date semantics.<\/p>\n\n\n\n<ol class=\"wp-block-list\" start=\"5\">\n<li>Validate by querying:<\/li>\n<\/ol>\n\n\n\n<pre><code class=\"language-sql\">SELECT * FROM datazone_lab.customers_csv LIMIT 10;\n<\/code><\/pre>\n\n\n\n<p><strong>Expected outcome:<\/strong> Query returns 4 rows. 
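<\/p>\n\n\n\n<p>As a quick local sanity check, the same 4-row expectation can be confirmed with a short Python sketch. The CSV content is inlined here so the check is self-contained; it mirrors the <code>customers.csv<\/code> created in Step 2, and the helper name is just an illustration.<\/p>

```python
import csv
import io

# Inline copy of the customers.csv contents from Step 2.
CSV_TEXT = '''customer_id,customer_name,segment,signup_date,country
1,Acme Corp,enterprise,2024-01-10,US
2,Bluebird Ltd,midmarket,2024-02-12,GB
3,Cedar GmbH,enterprise,2024-03-05,DE
4,Delta Co,smb,2024-03-18,US
'''

EXPECTED_COLUMNS = ['customer_id', 'customer_name',
                    'segment', 'signup_date', 'country']

def check_dataset(text: str) -> int:
    """Validate the header and return the number of data rows."""
    rows = list(csv.DictReader(io.StringIO(text)))
    header = list(rows[0].keys())
    assert header == EXPECTED_COLUMNS, f'unexpected header: {header}'
    return len(rows)

row_count = check_dataset(CSV_TEXT)  # 4, matching the Athena result
```

<p>If the Athena row count ever disagrees with a check like this, look for stray header rows or extra files under the <code>datasets\/customers\/<\/code> prefix.<\/p>\n\n\n\n<p>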
The table appears in the <strong>AWS Glue Data Catalog<\/strong> database <code>datazone_lab<\/code>.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 4: Enable IAM Identity Center (if required) and create two users<\/h3>\n\n\n\n<p>Amazon DataZone commonly uses IAM Identity Center for portal authentication.<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Open <strong>IAM Identity Center<\/strong> in the same Region (or the Region required by your Identity Center setup).<\/li>\n<li>If not enabled, enable IAM Identity Center (console will guide you).<br\/>\n   &#8211; If your organization already has it enabled, reuse it.<\/li>\n<\/ol>\n\n\n\n<p>Create two users (example):\n&#8211; <code>producer.user@yourcompany.com<\/code>\n&#8211; <code>consumer.user@yourcompany.com<\/code><\/p>\n\n\n\n<p>Optionally create two groups:\n&#8211; <code>datazone-producers<\/code>\n&#8211; <code>datazone-consumers<\/code><\/p>\n\n\n\n<p>Assign the users to groups.<\/p>\n\n\n\n<p><strong>Expected outcome:<\/strong> You have two identities you can use to test producer and consumer workflows.<\/p>\n\n\n\n<blockquote>\n<p>If your organization uses an external IdP (Okta\/Azure AD\/etc.), follow your established process. 
The goal is simply: two distinct users can sign in to the DataZone portal.<\/p>\n<\/blockquote>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 5: Create an Amazon DataZone domain<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Open <strong>Amazon DataZone<\/strong> console.<\/li>\n<li>Choose <strong>Create domain<\/strong>.<\/li>\n<li>Provide:\n   &#8211; Domain name: <code>datazone-lab-domain<\/code>\n   &#8211; Description: optional<\/li>\n<li>Identity configuration:\n   &#8211; Select your IAM Identity Center instance as prompted.<\/li>\n<li>IAM role:\n   &#8211; If the console offers to create a domain execution role, allow it (recommended for labs).\n   &#8211; If you must specify a role, follow the console guidance and ensure it can access required services.<\/li>\n<\/ol>\n\n\n\n<p>Create the domain.<\/p>\n\n\n\n<p><strong>Expected outcome:<\/strong> Domain is created and you can open the <strong>Data portal<\/strong> for that domain.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 6: Create Producer and Consumer projects<\/h3>\n\n\n\n<p>In the domain:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Go to <strong>Projects \u2192 Create project<\/strong><\/li>\n<li>Create:\n   &#8211; Project 1: <code>ProducerProject<\/code>\n   &#8211; Project 2: <code>ConsumerProject<\/code><\/li>\n<\/ol>\n\n\n\n<p>Assign members:\n&#8211; Add <code>producer.user<\/code> to <code>ProducerProject<\/code> with a role that can publish (for example, project owner\/contributor as appropriate).\n&#8211; Add <code>consumer.user<\/code> to <code>ConsumerProject<\/code>.<\/p>\n\n\n\n<p><strong>Expected outcome:<\/strong> Two projects exist, each with the right user assigned.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 7: Configure a data source to ingest Glue Data Catalog metadata<\/h3>\n\n\n\n<p>Now connect Amazon DataZone to the AWS Glue Data Catalog 
database you created.<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Open the <strong>Data portal<\/strong> and switch to <code>ProducerProject<\/code>.<\/li>\n<li>Find <strong>Data sources<\/strong> (naming may vary slightly) and choose <strong>Create data source<\/strong>.<\/li>\n<li>Choose a source type that corresponds to <strong>AWS Glue Data Catalog<\/strong> (or \u201cData lake catalog\u201d\/Athena\/Glue integration\u2014follow console options).<\/li>\n<li>Configure:\n   &#8211; Target database: <code>datazone_lab<\/code>\n   &#8211; Scope: include the <code>customers_csv<\/code> table\n   &#8211; Schedule: \u201crun on demand\u201d for the lab<\/li>\n<\/ol>\n\n\n\n<p>Run the data source ingestion (\u201cRun\u201d \/ \u201cStart run\u201d).<\/p>\n\n\n\n<p><strong>Expected outcome:<\/strong> The ingestion run completes successfully and the <code>customers_csv<\/code> asset appears in the Amazon DataZone catalog for the domain.<\/p>\n\n\n\n<blockquote>\n<p>If the run fails due to permissions:\n&#8211; Confirm the domain execution role (or relevant DataZone roles) can read Glue catalog metadata.\n&#8211; If Lake Formation is enforcing permissions on the Data Catalog, you may need to grant metadata access. 
This is common in governed environments\u2014see Troubleshooting below.<\/p>\n<\/blockquote>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 8: Create and publish a data product<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li>In <code>ProducerProject<\/code>, find the ingested asset <code>customers_csv<\/code>.<\/li>\n<li>Add business metadata:\n   &#8211; Description: \u201cCustomer master list for analytics training\u201d\n   &#8211; Owner: assign yourself or the producer project owner\n   &#8211; Add keywords\/tags like <code>customer<\/code>, <code>training<\/code>, <code>demo<\/code><\/li>\n<li>Create a <strong>Data product<\/strong>:\n   &#8211; Name: <code>Customer Master (Lab)<\/code>\n   &#8211; Add the asset <code>datazone_lab.customers_csv<\/code>\n   &#8211; Add usage guidance: example query and data freshness notes<\/li>\n<li>Publish the data product to the catalog.<\/li>\n<\/ol>\n\n\n\n<p><strong>Expected outcome:<\/strong> The product is discoverable in the portal by other users\/projects.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 9: Request a subscription from the Consumer project<\/h3>\n\n\n\n<p>Sign in as the consumer user (or use portal impersonation if your org allows it):<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Open the Amazon DataZone portal.<\/li>\n<li>Switch to <code>ConsumerProject<\/code>.<\/li>\n<li>Search for the published data product: <code>Customer Master (Lab)<\/code>.<\/li>\n<li>Open it and choose <strong>Request subscription<\/strong> (or similar).<\/li>\n<li>Provide a reason:\n   &#8211; \u201cNeed to build a churn dashboard training exercise\u201d<\/li>\n<\/ol>\n\n\n\n<p>Submit the request.<\/p>\n\n\n\n<p><strong>Expected outcome:<\/strong> Subscription request is created and awaits approval.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 10: Approve the subscription 
request<\/h3>\n\n\n\n<p>Sign in as the producer (or the designated approver\/owner):<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Open the portal and go to notifications or subscription requests.<\/li>\n<li>Review the request details (requester, purpose, product).<\/li>\n<li>Approve the request.<\/li>\n<\/ol>\n\n\n\n<p><strong>Expected outcome:<\/strong> Subscription status changes to approved and eventually to fulfilled\/provisioned (wording varies).<\/p>\n\n\n\n<blockquote>\n<p>Provisioning behavior depends on how environments are configured and what source system is used. In some setups, approval triggers automated permissions in Lake Formation\/Redshift; in others, it may require additional steps. <strong>Verify your environment blueprint and integration behavior in the docs.<\/strong><\/p>\n<\/blockquote>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Validation<\/h3>\n\n\n\n<p>Use the following checks:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>\n<p><strong>Portal visibility<\/strong>\n   &#8211; As consumer, confirm the subscription is active and the product appears in your project\u2019s subscribed items.<\/p>\n<\/li>\n<li>\n<p><strong>Metadata correctness<\/strong>\n   &#8211; Verify the asset schema is visible in the portal.\n   &#8211; Verify business description, owner, and tags display.<\/p>\n<\/li>\n<li>\n<p><strong>Optional: Validate permissions at the data plane<\/strong>\n   &#8211; If using Lake Formation governed access, check <strong>Lake Formation \u2192 Permissions<\/strong> for grants related to the dataset and consumer principals\/roles.\n   &#8211; If using Athena, attempt to run:\n     <code>SELECT COUNT(*) FROM datazone_lab.customers_csv;<\/code>\n     using the consumer\u2019s authorized path (this may require role-based access configuration beyond the portal\u2014verify your organization\u2019s access model).<\/p>\n<\/li>\n<\/ol>\n\n\n\n<p><strong>Expected outcome:<\/strong> Data product 
is subscribed, and (where configured) access is granted in the underlying system.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Troubleshooting<\/h3>\n\n\n\n<p>Common issues and fixes:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>\n<p><strong>\u201cAmazon DataZone is not available in this Region\u201d<\/strong>\n   &#8211; Switch to a supported Region.\n   &#8211; Verify Region support list: https:\/\/aws.amazon.com\/about-aws\/global-infrastructure\/regional-product-services\/<\/p>\n<\/li>\n<li>\n<p><strong>Cannot create domain \/ identity center issues<\/strong>\n   &#8211; Confirm IAM Identity Center is enabled and you selected the correct instance.\n   &#8211; In AWS Organizations, ensure you are using the correct delegated admin settings for IAM Identity Center (org-specific).<\/p>\n<\/li>\n<li>\n<p><strong>Data source run fails with AccessDenied (Glue\/Lake Formation)<\/strong>\n   &#8211; Confirm the domain execution role (and\/or the configured DataZone roles) can read Glue metadata.\n   &#8211; If Lake Formation is enabled, you may need to grant the role permission to describe databases\/tables and\/or access underlying locations.<br\/>\n     Validate with Lake Formation administrators. 
(Exact grants depend on your LF mode\u2014IAM-only vs LF-managed\u2014verify in docs.)<\/p>\n<\/li>\n<li>\n<p><strong>Subscription approved but not provisioned<\/strong>\n   &#8211; Check whether you configured environments\/environment profiles needed for automated provisioning.\n   &#8211; Inspect underlying provisioning services used by your blueprint (often CloudFormation\/Service Catalog patterns\u2014verify for your setup).\n   &#8211; Check CloudTrail for failed API calls and relevant service logs.<\/p>\n<\/li>\n<li>\n<p><strong>Athena query fails after subscription<\/strong>\n   &#8211; Subscription in DataZone does not automatically guarantee your Athena console session has the right IAM\/Lake Formation permissions.\n   &#8211; Ensure the querying principal matches the role\/user to which permissions were granted.<\/p>\n<\/li>\n<\/ol>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Cleanup<\/h3>\n\n\n\n<p>To avoid ongoing charges and leftover resources:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>\n<p><strong>Amazon DataZone<\/strong>\n   &#8211; Delete subscription requests\/subscriptions (if required by UI).\n   &#8211; Delete data products (if required).\n   &#8211; Delete data sources.\n   &#8211; Delete projects.\n   &#8211; Delete the domain.<\/p>\n<\/li>\n<li>\n<p><strong>Athena\/Glue<\/strong>\n   &#8211; Drop the table:\n     <code>DROP TABLE IF EXISTS datazone_lab.customers_csv;<\/code>\n   &#8211; Drop the database:\n     <code>DROP DATABASE IF EXISTS datazone_lab;<\/code><\/p>\n<\/li>\n<li>\n<p><strong>S3<\/strong>\n   &#8211; Delete uploaded data: <code>datasets\/customers\/customers.csv<\/code>\n   &#8211; Delete Athena results prefix: <code>athena-results\/<\/code>\n   &#8211; Empty and delete the bucket (if it was created only for this lab).<\/p>\n<\/li>\n<li>\n<p><strong>Identity Center<\/strong>\n   &#8211; Delete lab users\/groups if they were created solely for the 
lab.<\/p>\n<\/li>\n<\/ol>\n\n\n\n<p><strong>Expected outcome:<\/strong> No active DataZone domain, no leftover S3 data, and no ongoing costs from lab resources.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">11. Best Practices<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Architecture best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Start with one domain per governance boundary.<\/strong> Too many domains fragment discovery.<\/li>\n<li><strong>Align projects to teams or data domains.<\/strong> Producer projects map to ownership boundaries; consumer projects map to consuming teams\/apps.<\/li>\n<li><strong>Standardize environments.<\/strong> Use consistent environment profiles\/blueprints to reduce drift.<\/li>\n<li><strong>Treat data products as stable contracts.<\/strong> Include ownership, SLA\/freshness, schema expectations, and change process.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">IAM\/security best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Use groups (Identity Center) not individual users<\/strong> for project membership where possible.<\/li>\n<li><strong>Separate producer and consumer roles<\/strong>; avoid letting consumers publish production products.<\/li>\n<li><strong>Minimize domain admin count.<\/strong> Domain admins typically have broad powers.<\/li>\n<li><strong>Use least privilege for execution\/provisioning roles.<\/strong> Scope to required services, databases, buckets, and LF permissions.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Cost best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Limit portal access to real users<\/strong> who need it; remove inactive accounts.<\/li>\n<li><strong>Keep dev\/test domains short-lived<\/strong> if domain-hour pricing applies.<\/li>\n<li><strong>Optimize Athena costs<\/strong> (partitioning, Parquet, workgroups, query limits).<\/li>\n<li><strong>Schedule metadata refreshes thoughtfully.<\/strong> Avoid overly frequent ingestion runs 
if they trigger other service costs.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Performance best practices<\/h3>\n\n\n\n<p>DataZone performance is mostly about catalog usability:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Keep naming consistent and searchable.<\/li>\n<li>Use tags\/keywords and strong descriptions.<\/li>\n<li>Avoid dumping raw assets without curation\u2014curated products reduce time-to-find.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Reliability best practices<\/h3>\n\n\n\n<p>Treat environment provisioning as infrastructure:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Standardize via templates and change control.<\/li>\n<li>Monitor provisioning failures and retry patterns.<\/li>\n<li>Ensure logging and audit trails are enabled org-wide.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Operations best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Define operational ownership:<\/strong> who maintains data sources, resolves ingestion failures, and manages domain config.<\/li>\n<li><strong>Create runbooks:<\/strong> common issues (LF permission issues, failed provisioning, stale metadata).<\/li>\n<li><strong>Measure governance:<\/strong> subscription lead time, number of certified products, stale products.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Governance\/tagging\/naming best practices<\/h3>\n\n\n\n<p>Establish conventions:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Product names: <code>Domain - Subject - Level (Bronze\/Silver\/Gold)<\/code><\/li>\n<li>Owners: always set owner group + primary contact<\/li>\n<li>Tags: <code>pii<\/code>, <code>confidential<\/code>, <code>public<\/code>, <code>finance<\/code>, <code>marketing<\/code>, <code>customer<\/code><\/li>\n<li>Add lifecycle metadata: deprecated date, replacement product, update cadence.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">12. 
Security Considerations<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Identity and access model<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Portal users typically authenticate via <strong>AWS IAM Identity Center<\/strong>.<\/li>\n<li>Permissions in Amazon DataZone control who can:\n<ul>\n<li>create\/manage domains<\/li>\n<li>create projects<\/li>\n<li>publish products<\/li>\n<li>approve subscriptions<\/li>\n<\/ul>\n<\/li>\n<li>Data access enforcement occurs in underlying systems:\n<ul>\n<li><strong>Lake Formation<\/strong> for S3\/Glue-based lakes<\/li>\n<li><strong>Redshift<\/strong> grants for warehouses (verify integration specifics)<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<p><strong>Recommendation:<\/strong> Treat Amazon DataZone as your governance UX and workflow, but validate that the data plane permissions match your compliance requirements.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Encryption<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>AWS services typically encrypt data at rest and in transit.<\/li>\n<li>Amazon DataZone metadata storage and integrations may support AWS KMS keys depending on configuration.<br\/>\n<strong>Verify current KMS options in the official docs for your Region.<\/strong><\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Network exposure<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Users access the portal via AWS endpoints.<\/li>\n<li>Your data remains in your accounts; network controls (VPC endpoints, private subnets, security groups) apply to Athena\/Redshift\/S3 as configured.<\/li>\n<li>If strict private connectivity is required, <strong>verify<\/strong> whether Amazon DataZone supports required VPC endpoint patterns in your Region.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Secrets handling<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Avoid embedding credentials in code for ingestion.<\/li>\n<li>Prefer IAM roles and service-to-service authorization.<\/li>\n<li>If external systems are integrated, use AWS Secrets Manager where supported by the 
connector pattern (verify connector requirements).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Audit\/logging<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Enable <strong>CloudTrail<\/strong> for Amazon DataZone and related services.<\/li>\n<li>Centralize logs in a dedicated log archive account.<\/li>\n<li>Retain logs per compliance requirements; control access to logs.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Compliance considerations<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Map your data classification scheme to product tags and access approval policies.<\/li>\n<li>Enforce separation of duties: producers shouldn\u2019t approve their own sensitive access requests (use multi-approver workflows where possible).<\/li>\n<li>Ensure your data retention and deletion policies cover the underlying data plane.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Common security mistakes<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Granting broad Lake Formation \u201csuper\u201d permissions to many users to \u201cmake it work.\u201d<\/li>\n<li>Publishing raw\/unvetted datasets as products without ownership.<\/li>\n<li>Using personal IAM users instead of Identity Center groups for portal membership.<\/li>\n<li>Forgetting to log\/retain subscription approvals and changes.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Secure deployment recommendations<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Use a dedicated governance\/shared services account for the domain in multi-account setups.<\/li>\n<li>Use customer-managed KMS keys if required by policy (verify supported configuration).<\/li>\n<li>Build a periodic access review process for subscriptions and underlying grants.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">13. Limitations and Gotchas<\/h2>\n\n\n\n<p>Because Amazon DataZone evolves quickly, validate the latest constraints in official docs. 
Common gotchas include:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Region availability:<\/strong> Not all Regions support Amazon DataZone.<\/li>\n<li><strong>Cross-Region patterns:<\/strong> Cataloging and provisioning may be Region-aligned; cross-Region data estates require careful design (verify support).<\/li>\n<li><strong>Identity prerequisites:<\/strong> Portal sign-in often depends on IAM Identity Center; org configurations can add complexity.<\/li>\n<li><strong>Lake Formation interaction:<\/strong> If Lake Formation is enforcing governance, metadata ingestion and access provisioning can fail without correct LF permissions.<\/li>\n<li><strong>\u201cSubscribed\u201d vs \u201cQueryable\u201d:<\/strong> A DataZone subscription may complete, but the consumer still needs the right execution role\/session to query in Athena\/Redshift.<\/li>\n<li><strong>Environment provisioning complexity:<\/strong> If your setup uses blueprints that create resources, failures may occur due to missing IAM permissions, Service Catalog\/CloudFormation limitations, or SCP restrictions.<\/li>\n<li><strong>Naming and sprawl:<\/strong> Without governance, catalogs can become noisy (too many raw assets).<\/li>\n<li><strong>Pricing surprises:<\/strong> Even if DataZone costs are controlled, Athena scans, Redshift usage, and logging can grow rapidly.<\/li>\n<li><strong>Quotas:<\/strong> Domains, projects, environments, and ingestion runs may have quotas; verify in Service Quotas and docs.<\/li>\n<li><strong>Connector\/source limitations:<\/strong> Not every data source type is supported; verify what you can ingest and what you can provision access for.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">14. 
Comparison with Alternatives<\/h2>\n\n\n\n<p>Amazon DataZone is one option in a broader ecosystem of cataloging and governance tools.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Key comparisons (AWS and non-AWS)<\/h3>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<thead>\n<tr>\n<th>Option<\/th>\n<th>Best For<\/th>\n<th>Strengths<\/th>\n<th>Weaknesses<\/th>\n<th>When to Choose<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td><strong>Amazon DataZone<\/strong><\/td>\n<td>AWS-centric data catalog + data products + subscriptions<\/td>\n<td>Portal + workflow model; integrates with AWS data governance patterns; designed for producer\/consumer sharing<\/td>\n<td>Region\/service availability; integrations depend on supported sources\/blueprints; still requires governance processes<\/td>\n<td>You want AWS-native data product discovery and governed access workflows<\/td>\n<\/tr>\n<tr>\n<td><strong>AWS Glue Data Catalog<\/strong><\/td>\n<td>Central technical catalog for Athena\/Glue ecosystem<\/td>\n<td>Foundational metadata store; widely integrated in AWS<\/td>\n<td>Not a business portal; limited workflow\/approval experience by itself<\/td>\n<td>You need a technical metastore but not a full portal\/workflow layer<\/td>\n<\/tr>\n<tr>\n<td><strong>AWS Lake Formation<\/strong><\/td>\n<td>Fine-grained governance for S3-based data lakes<\/td>\n<td>Strong permission model and auditing for lake access<\/td>\n<td>Not a discovery portal; user experience is admin-oriented<\/td>\n<td>You need enforced data lake permissions; use with DataZone for portal\/workflow<\/td>\n<\/tr>\n<tr>\n<td><strong>AWS Data Exchange<\/strong><\/td>\n<td>Third-party\/public data subscriptions<\/td>\n<td>Marketplace-style dataset procurement<\/td>\n<td>Not for internal data mesh governance<\/td>\n<td>You need to acquire external datasets via AWS marketplace workflows<\/td>\n<\/tr>\n<tr>\n<td><strong>Amazon Redshift (incl. 
sharing features)<\/strong><\/td>\n<td>Warehouse-centric sharing and governance<\/td>\n<td>Strong SQL warehouse, access controls, performance<\/td>\n<td>Not a cross-system catalog\/portal by itself<\/td>\n<td>Your primary governed sharing is inside Redshift and warehouse analytics<\/td>\n<\/tr>\n<tr>\n<td><strong>Microsoft Purview<\/strong><\/td>\n<td>Enterprise governance across Microsoft ecosystem<\/td>\n<td>Broad governance suite; MS integrations<\/td>\n<td>AWS-native access provisioning differs; connector strategy needed<\/td>\n<td>Your enterprise standard is Microsoft governance tooling<\/td>\n<\/tr>\n<tr>\n<td><strong>Google Cloud Dataplex \/ Data Catalog<\/strong><\/td>\n<td>GCP-centric governance<\/td>\n<td>Native GCP integration<\/td>\n<td>Not AWS-native; cross-cloud adds complexity<\/td>\n<td>Your primary estate is on GCP<\/td>\n<\/tr>\n<tr>\n<td><strong>Collibra \/ Alation \/ Atlan<\/strong><\/td>\n<td>Enterprise data governance &amp; cataloging across platforms<\/td>\n<td>Rich governance workflows, lineage, stewardship features<\/td>\n<td>Cost\/complexity; requires integration effort<\/td>\n<td>You need vendor-agnostic governance across many platforms<\/td>\n<\/tr>\n<tr>\n<td><strong>OpenMetadata \/ DataHub \/ Amundsen \/ Apache Atlas<\/strong><\/td>\n<td>Self-managed catalog<\/td>\n<td>Customizable; avoids vendor lock-in<\/td>\n<td>Operational burden; integration and security are on you<\/td>\n<td>You have strong platform engineering and want open-source control<\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">15. Real-World Example<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Enterprise example: Multi-account governed analytics for a regulated company<\/h3>\n\n\n\n<p><strong>Problem<\/strong><br\/>\nA financial services company runs analytics across multiple AWS accounts (segregated by department and environment). Data is in S3 and Redshift. 
Teams struggle with:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>finding authoritative datasets<\/li>\n<li>consistent access approvals<\/li>\n<li>audit demands for sensitive datasets<\/li>\n<\/ul>\n\n\n\n<p><strong>Proposed architecture<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Governance\/shared services account hosts the <strong>Amazon DataZone domain<\/strong> and portal.<\/li>\n<li>Producer accounts host S3 data lakes governed by <strong>Lake Formation<\/strong> and publish metadata via Glue Catalog.<\/li>\n<li>Consumer accounts query via Athena\/Redshift with role-based access.<\/li>\n<li>CloudTrail centralized to a log archive account.<\/li>\n<\/ul>\n\n\n\n<p><strong>Why Amazon DataZone was chosen<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>AWS-native portal and workflow model aligned with the existing AWS governance stack.<\/li>\n<li>Enables self-service discovery without granting broad lake access.<\/li>\n<li>Subscription workflow supports auditability and consistent approvals.<\/li>\n<\/ul>\n\n\n\n<p><strong>Expected outcomes<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Reduced time-to-access (days \u2192 hours)<\/li>\n<li>Improved dataset trust and reuse via data products<\/li>\n<li>Clear audit trail for approvals and provisioning events<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Startup\/small-team example: Internal data portal for rapid BI and ML iteration<\/h3>\n\n\n\n<p><strong>Problem<\/strong><br\/>\nA startup has data in S3 queried by Athena and a small Redshift warehouse. 
Analysts repeatedly ask engineers where data lives and what columns mean.<\/p>\n\n\n\n<p><strong>Proposed architecture<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Single Amazon DataZone domain<\/li>\n<li>Two projects: <code>DataEngineering<\/code> (producer) and <code>Analytics<\/code> (consumer)<\/li>\n<li>Glue Catalog ingestion for core datasets<\/li>\n<li>Curated \u201cgold\u201d data products for BI dashboards<\/li>\n<\/ul>\n\n\n\n<p><strong>Why Amazon DataZone was chosen<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Quick setup compared to building a custom data portal<\/li>\n<li>Standard request flow reduces Slack-based access chaos<\/li>\n<li>Improves documentation without heavy process overhead<\/li>\n<\/ul>\n\n\n\n<p><strong>Expected outcomes<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Faster analyst onboarding<\/li>\n<li>Better dataset documentation<\/li>\n<li>More consistent dashboard metrics<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">16. FAQ<\/h2>\n\n\n\n<ol class=\"wp-block-list\">\n<li>\n<p><strong>Is Amazon DataZone a data warehouse or data lake?<\/strong><br\/>\n   No. Amazon DataZone is a data management and governance service (portal, catalog, workflows). Your data remains in S3\/Redshift\/etc.<\/p>\n<\/li>\n<li>\n<p><strong>Does Amazon DataZone move my data into a new storage layer?<\/strong><br\/>\n   Typically no. 
It ingests and manages <strong>metadata<\/strong> and orchestrates access workflows.<\/p>\n<\/li>\n<li>\n<p><strong>What is a \u201cdomain\u201d in Amazon DataZone?<\/strong><br\/>\n   A domain is the top-level governance boundary that contains the portal, catalog, projects, and configurations.<\/p>\n<\/li>\n<li>\n<p><strong>What is a \u201cproject\u201d?<\/strong><br\/>\n   A project is a collaboration space for a team to publish data products or subscribe to them.<\/p>\n<\/li>\n<li>\n<p><strong>What is a \u201cdata product\u201d?<\/strong><br\/>\n   A curated package of one or more data assets with business metadata (owner, description, usage guidance) intended for reuse and governed sharing.<\/p>\n<\/li>\n<li>\n<p><strong>What is a \u201csubscription\u201d in Amazon DataZone?<\/strong><br\/>\n   A request by a consumer project to access a data product, typically involving approval and (where supported) automated permission provisioning.<\/p>\n<\/li>\n<li>\n<p><strong>Does Amazon DataZone integrate with AWS Lake Formation?<\/strong><br\/>\n   It is commonly used with Lake Formation for governed lake access. Exact integration behavior depends on your blueprint\/source setup\u2014verify in official docs.<\/p>\n<\/li>\n<li>\n<p><strong>Can Amazon DataZone manage access in Athena directly?<\/strong><br\/>\n   Athena access is controlled via IAM and (often) Lake Formation permissions. DataZone can help orchestrate workflows that result in those permissions\u2014verify your specific configuration.<\/p>\n<\/li>\n<li>\n<p><strong>Can I use Amazon DataZone without Lake Formation?<\/strong><br\/>\n   Some organizations run in IAM-only modes for the lake, but governance capabilities differ. Validate your requirements and confirm supported modes in the docs.<\/p>\n<\/li>\n<li>\n<p><strong>Is Amazon DataZone suitable for a data mesh?<\/strong><br\/>\n   Yes, it\u2019s often used to implement domain-based publishing and consumption. 
Success depends on your organizational processes and ownership model.<\/p>\n<\/li>\n<li>\n<p><strong>How do I automate Amazon DataZone setup?<\/strong><br\/>\n   Use AWS APIs\/CLI (where available) and standard AWS IaC for dependent resources (S3\/Glue\/LF\/Redshift). Validate API coverage and consider staged rollout.<\/p>\n<\/li>\n<li>\n<p><strong>How do I audit who approved access to a dataset?<\/strong><br\/>\n   Use the DataZone workflow records plus AWS CloudTrail for API activity. For full auditability, centralize logs and retain them per policy.<\/p>\n<\/li>\n<li>\n<p><strong>What are the most common reasons ingestion fails?<\/strong><br\/>\n   Missing permissions to read Glue catalog metadata, Lake Formation restrictions, and misconfigured execution roles.<\/p>\n<\/li>\n<li>\n<p><strong>How do I prevent the catalog from becoming cluttered?<\/strong><br\/>\n   Establish publishing standards: only curated products are \u201crecommended,\u201d tag raw assets as raw, and enforce ownership\/description requirements.<\/p>\n<\/li>\n<li>\n<p><strong>Does Amazon DataZone support multi-account architectures?<\/strong><br\/>\n   Multi-account usage is common in AWS governance designs, but supported patterns vary by feature and blueprint. Verify in official docs for your target architecture.<\/p>\n<\/li>\n<li>\n<p><strong>What\u2019s the difference between AWS Glue Data Catalog and Amazon DataZone?<\/strong><br\/>\n   Glue Data Catalog is a technical metastore; Amazon DataZone provides a portal, workflows, business metadata, and data product\/subscription concepts on top.<\/p>\n<\/li>\n<li>\n<p><strong>How do I estimate total cost?<\/strong><br\/>\n   Model both Amazon DataZone pricing dimensions (domain\/user) and the larger data plane costs (Athena scans, Redshift usage, S3 storage, logging). Use the AWS Pricing Calculator.<\/p>\n<\/li>\n<\/ol>\n\n\n\n<h2 class=\"wp-block-heading\">17. 
Top Online Resources to Learn Amazon DataZone<\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<thead>\n<tr>\n<th>Resource Type<\/th>\n<th>Name<\/th>\n<th>Why It Is Useful<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Official Documentation<\/td>\n<td>Amazon DataZone Documentation<\/td>\n<td>Canonical reference for concepts, setup, and integrations: https:\/\/docs.aws.amazon.com\/datazone\/<\/td>\n<\/tr>\n<tr>\n<td>Official User Guide<\/td>\n<td>Amazon DataZone User Guide (entry point)<\/td>\n<td>Service overview and workflows (start here): https:\/\/docs.aws.amazon.com\/datazone\/latest\/userguide\/what-is-datazone.html<\/td>\n<\/tr>\n<tr>\n<td>Official API Reference<\/td>\n<td>Amazon DataZone API Reference<\/td>\n<td>Programmatic operations and schemas: https:\/\/docs.aws.amazon.com\/datazone\/latest\/APIReference\/Welcome.html<\/td>\n<\/tr>\n<tr>\n<td>Official CLI Reference<\/td>\n<td>AWS CLI <code>datazone<\/code> command reference<\/td>\n<td>Automate and script operations (verify coverage): https:\/\/docs.aws.amazon.com\/cli\/latest\/reference\/datazone\/<\/td>\n<\/tr>\n<tr>\n<td>Official Pricing<\/td>\n<td>Amazon DataZone Pricing<\/td>\n<td>Current pricing dimensions and rates: https:\/\/aws.amazon.com\/datazone\/pricing\/<\/td>\n<\/tr>\n<tr>\n<td>Pricing Tool<\/td>\n<td>AWS Pricing Calculator<\/td>\n<td>Build environment-specific estimates: https:\/\/calculator.aws\/#\/<\/td>\n<\/tr>\n<tr>\n<td>Regions<\/td>\n<td>Regional Product &amp; Service Availability<\/td>\n<td>Confirm Region support: https:\/\/aws.amazon.com\/about-aws\/global-infrastructure\/regional-product-services\/<\/td>\n<\/tr>\n<tr>\n<td>AWS Videos<\/td>\n<td>AWS YouTube Channel<\/td>\n<td>Search for official walkthroughs and re:Invent sessions: https:\/\/www.youtube.com\/@amazonwebservices<\/td>\n<\/tr>\n<tr>\n<td>AWS Samples<\/td>\n<td>AWS Samples on GitHub<\/td>\n<td>Find official\/community labs by searching \u201cdatazone\u201d: 
https:\/\/github.com\/aws-samples<\/td>\n<\/tr>\n<tr>\n<td>Governance Reference<\/td>\n<td>AWS Lake Formation Documentation<\/td>\n<td>Understand the data lake permissions model often used with DataZone: https:\/\/docs.aws.amazon.com\/lake-formation\/<\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">18. Training and Certification Providers<\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<thead>\n<tr>\n<th>Institute<\/th>\n<th>Suitable Audience<\/th>\n<th>Likely Learning Focus<\/th>\n<th>Mode<\/th>\n<th>Website URL<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>DevOpsSchool.com<\/td>\n<td>Cloud engineers, architects, DevOps\/SRE, data platform teams<\/td>\n<td>AWS fundamentals, DevOps + cloud operations, potentially governance patterns<\/td>\n<td>Check website<\/td>\n<td>https:\/\/www.devopsschool.com\/<\/td>\n<\/tr>\n<tr>\n<td>ScmGalaxy.com<\/td>\n<td>Engineers and managers seeking process + tooling skills<\/td>\n<td>SCM\/DevOps practices, platform enablement concepts<\/td>\n<td>Check website<\/td>\n<td>https:\/\/www.scmgalaxy.com\/<\/td>\n<\/tr>\n<tr>\n<td>CloudOpsNow.in<\/td>\n<td>Cloud operations and platform teams<\/td>\n<td>CloudOps practices, monitoring, cost, security operations<\/td>\n<td>Check website<\/td>\n<td>https:\/\/cloudopsnow.in\/<\/td>\n<\/tr>\n<tr>\n<td>SreSchool.com<\/td>\n<td>SREs, reliability engineers, platform owners<\/td>\n<td>SRE practices, observability, reliability engineering<\/td>\n<td>Check website<\/td>\n<td>https:\/\/sreschool.com\/<\/td>\n<\/tr>\n<tr>\n<td>AiOpsSchool.com<\/td>\n<td>Ops, SRE, and IT teams adopting automation<\/td>\n<td>AIOps concepts, automation for operations and incident response<\/td>\n<td>Check website<\/td>\n<td>https:\/\/aiopsschool.com\/<\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">19. 
Top Trainers<\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<thead>\n<tr>\n<th>Platform\/Site<\/th>\n<th>Likely Specialization<\/th>\n<th>Suitable Audience<\/th>\n<th>Website URL<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>RajeshKumar.xyz<\/td>\n<td>Cloud\/DevOps training content (verify current offerings)<\/td>\n<td>Beginners to intermediate engineers<\/td>\n<td>https:\/\/rajeshkumar.xyz\/<\/td>\n<\/tr>\n<tr>\n<td>devopstrainer.in<\/td>\n<td>DevOps and cloud training (verify course catalog)<\/td>\n<td>DevOps engineers, platform teams<\/td>\n<td>https:\/\/www.devopstrainer.in\/<\/td>\n<\/tr>\n<tr>\n<td>devopsfreelancer.com<\/td>\n<td>Freelance DevOps guidance and services (verify offerings)<\/td>\n<td>Teams seeking practical help and coaching<\/td>\n<td>https:\/\/www.devopsfreelancer.com\/<\/td>\n<\/tr>\n<tr>\n<td>devopssupport.in<\/td>\n<td>Support\/training style resources (verify current focus)<\/td>\n<td>Ops\/DevOps teams needing hands-on assistance<\/td>\n<td>https:\/\/www.devopssupport.in\/<\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">20. 
Top Consulting Companies<\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<thead>\n<tr>\n<th>Company<\/th>\n<th>Likely Service Area<\/th>\n<th>Where They May Help<\/th>\n<th>Consulting Use Case Examples<\/th>\n<th>Website URL<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>cotocus.com<\/td>\n<td>Cloud\/DevOps\/IT services (verify service lines)<\/td>\n<td>Platform engineering, cloud adoption, operations<\/td>\n<td>Data platform setup, IAM\/landing zone design, automation<\/td>\n<td>https:\/\/cotocus.com\/<\/td>\n<\/tr>\n<tr>\n<td>DevOpsSchool.com<\/td>\n<td>Training + consulting (verify offerings)<\/td>\n<td>Enablement, DevOps practices, cloud skills uplift<\/td>\n<td>Governance operating model workshops, DevOps pipelines for data platforms<\/td>\n<td>https:\/\/www.devopsschool.com\/<\/td>\n<\/tr>\n<tr>\n<td>DEVOPSCONSULTING.IN<\/td>\n<td>DevOps consulting (verify portfolio)<\/td>\n<td>CI\/CD, cloud operations, reliability, security<\/td>\n<td>Multi-account AWS setup, cost optimization, operational readiness<\/td>\n<td>https:\/\/www.devopsconsulting.in\/<\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">21. 
Career and Learning Roadmap<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">What to learn before Amazon DataZone<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>AWS fundamentals: IAM, S3, networking basics<\/li>\n<li>Analytics basics: Athena, Glue Data Catalog, partitioning and file formats<\/li>\n<li>Data governance basics: RBAC\/ABAC, least privilege, audit logging<\/li>\n<li>Lake Formation fundamentals if you manage S3-based data lakes<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">What to learn after Amazon DataZone<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Lake Formation advanced permissions and cross-account sharing<\/li>\n<li>Redshift governance and workload management<\/li>\n<li>Data engineering pipelines (Glue\/EMR\/managed orchestration)<\/li>\n<li>Data quality and observability tooling (AWS-native or third-party)<\/li>\n<li>Enterprise logging and compliance patterns on AWS<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Job roles that use it<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Data platform engineer<\/li>\n<li>Cloud solutions architect (analytics)<\/li>\n<li>Data governance engineer\/steward (technical)<\/li>\n<li>Security engineer focused on data access governance<\/li>\n<li>Analytics engineer \/ BI platform engineer<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Certification path (AWS)<\/h3>\n\n\n\n<p>AWS certifications are role-based rather than service-specific. 
Relevant tracks:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>AWS Certified Solutions Architect \u2013 Associate\/Professional<\/li>\n<li>AWS Certified Data Engineer \u2013 Associate (if available in your region\/timeframe; verify the current AWS certification catalog)<\/li>\n<li>AWS Certified Security \u2013 Specialty<\/li>\n<\/ul>\n\n\n\n<p>Verify current AWS certifications: https:\/\/aws.amazon.com\/certification\/<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Project ideas for practice<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Build a \u201ccertified metrics\u201d program: publish 10 curated data products with glossary definitions.<\/li>\n<li>Implement a multi-account data mesh: one governance domain, 3 producer accounts, 2 consumer accounts.<\/li>\n<li>Automate onboarding: script creation of projects and metadata standards checks using APIs\/CLI (verify available operations).<\/li>\n<li>Governance dashboards: track subscription lead time and product adoption using exported events\/logs.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">22. 
Glossary<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Amazon DataZone Domain:<\/strong> Top-level construct that hosts portal, catalog, and governance configuration.<\/li>\n<li><strong>Data portal:<\/strong> Web UI for discovery, publishing, and access requests.<\/li>\n<li><strong>Project:<\/strong> Workspace for a producer or consumer team.<\/li>\n<li><strong>Data asset:<\/strong> Catalog entry representing a dataset (table\/view\/etc., depending on source).<\/li>\n<li><strong>Data product:<\/strong> Curated package of assets with business metadata and ownership for sharing.<\/li>\n<li><strong>Subscription:<\/strong> Request\/approval workflow for access to a data product.<\/li>\n<li><strong>AWS Glue Data Catalog:<\/strong> Metadata store for tables and databases used by Athena\/Glue and often lake architectures.<\/li>\n<li><strong>AWS Lake Formation:<\/strong> Service to manage fine-grained permissions for data lakes on S3 with Glue catalog integration.<\/li>\n<li><strong>Amazon Athena:<\/strong> Serverless query engine for S3 data.<\/li>\n<li><strong>Amazon Redshift:<\/strong> Managed data warehouse.<\/li>\n<li><strong>IAM Identity Center:<\/strong> AWS workforce identity and SSO service used to authenticate portal users.<\/li>\n<li><strong>Control plane:<\/strong> Management layer (metadata, workflows).<\/li>\n<li><strong>Data plane:<\/strong> Where data is stored and queried (S3\/Redshift\/Athena).<\/li>\n<li><strong>Least privilege:<\/strong> Security principle of granting only the permissions needed.<\/li>\n<li><strong>CloudTrail:<\/strong> AWS service for auditing API calls across AWS services.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">23. 
Summary<\/h2>\n\n\n\n<p>Amazon DataZone is an AWS Analytics service that provides a governed <strong>data portal, catalog, data product model, and subscription workflows<\/strong> so teams can discover and share data reliably across an AWS data estate.<\/p>\n\n\n\n<p>It matters because analytics programs fail when data is hard to find, poorly documented, and access is slow or unsafe. Amazon DataZone addresses those gaps by combining metadata ingestion, business context, ownership, and request\/approval workflows\u2014integrating with AWS services like Glue Data Catalog and (often) Lake Formation for enforceable access.<\/p>\n\n\n\n<p>Cost planning should cover both Amazon DataZone\u2019s own pricing dimensions (verify on the official pricing page) and the larger indirect costs of your data plane (Athena scans, Redshift usage, S3 storage, logging). Security planning should focus on identity (IAM Identity Center), least privilege roles, Lake Formation\/Redshift permissions, and audit logging.<\/p>\n\n\n\n<p>Use Amazon DataZone when you need an AWS-native governance layer for data discovery and controlled sharing. 
Next step: read the official user guide and validate supported sources\/blueprints for your environment, then run a pilot domain with 1\u20132 producer teams and a small set of curated data products: https:\/\/docs.aws.amazon.com\/datazone\/latest\/userguide\/what-is-datazone.html<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Analytics<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[21,20],"tags":[],"class_list":["post-125","post","type-post","status-publish","format-standard","hentry","category-analytics","category-aws"],"_links":{"self":[{"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/posts\/125","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/comments?post=125"}],"version-history":[{"count":0,"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/posts\/125\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/media?parent=125"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/categories?post=125"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/tags?post=125"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}