Category
AI + Machine Learning
1. Introduction
Azure AI Custom Vision is an Azure AI + Machine Learning service for training custom image classification and object detection models using your own labeled images—without needing to build a full ML pipeline from scratch.
In simple terms: you upload images, tag what’s in them (or draw boxes around objects), train a model, test it, and then deploy it behind an API endpoint (or export it to run on edge devices). This is useful when “generic” computer vision doesn’t recognize your domain-specific items (your products, your parts, your defects, your brand packaging, your lab samples).
Technically, Azure AI Custom Vision provides managed training and prediction capabilities. You create a Custom Vision project, upload and label images, train “iterations,” evaluate metrics (precision/recall), and publish a model to a prediction endpoint. You can then call the endpoint from apps and automation, or export certain model types to run offline (for example, on mobile/IoT/edge) depending on the chosen domain and model type.
It solves the problem of building reliable, repeatable image recognition for specialized scenarios—like detecting defects in a manufacturing line or classifying product variants—without needing a dedicated ML platform team, GPU cluster management, or bespoke training scripts for every iteration.
Naming note (important): Microsoft has rebranded many “Cognitive Services” offerings under Azure AI services. The service is commonly referred to as Azure AI Custom Vision in current documentation and the portal experience, while older materials may call it “Custom Vision” or “Custom Vision Service.” This tutorial uses Azure AI Custom Vision as the primary, exact name and calls out legacy terms only when helpful for navigation.
2. What is Azure AI Custom Vision?
Official purpose: Azure AI Custom Vision helps you build custom computer vision models for image classification and object detection using labeled training images, and then deploy them for inference via API or export formats (where supported).
Core capabilities
- Image classification
- Multiclass (one label per image)
- Multilabel (multiple labels per image)
- Object detection
- Detect and localize objects with bounding boxes
- Model training and iteration management
- Train multiple iterations, compare performance, choose the best
- Evaluation metrics
- Precision, recall, and probability thresholds
- Deployment options
- Hosted prediction endpoint
- Model export for certain scenarios (for example, “compact” domains), when supported in your project configuration (verify availability in official docs/portal)
Major components
- Custom Vision resources in Azure
- Typically separated into:
- Training resource (to train models)
- Prediction resource (to host prediction endpoints)
- The exact resource “kind”/creation flow can vary by portal UX updates; verify in official docs if your portal differs.
- Custom Vision portal
- Web UI to create projects, upload images, label/tag, train, evaluate, and publish
- REST APIs / SDKs
- Programmatic training, dataset upload, iteration management, and prediction calls
Service type
- Managed AI service (PaaS-like): you don’t manage VMs/containers/GPUs for training in the typical hosted workflow.
- Can support edge/offline scenarios via export, depending on project/domain settings and supported formats.
Scope: regional vs global, and what’s “scoped” to what
- Azure resource scope: created in a specific Azure region and attached to a subscription and resource group.
- Project scope: Custom Vision projects are associated with your training resource and exist logically within the Custom Vision service. Access is controlled through portal access and resource keys, plus any supported Azure identity integrations (verify current auth options in official docs).
- Endpoints: prediction endpoints are regional and tied to the prediction resource.
How it fits into the Azure ecosystem
Azure AI Custom Vision is commonly used alongside:
- Azure Blob Storage (store training images, inference images, outputs)
- Azure Functions / Container Apps / App Service (build APIs and automation around model inference)
- Azure IoT Edge (edge inference when exporting models/containers is supported)
- Azure Monitor (operational monitoring of dependent app services; direct service metrics/logging depend on the resource capabilities; verify diagnostic support in your region/SKU)
- Azure DevOps / GitHub Actions (CI/CD for apps that consume the model; model lifecycle can be integrated using APIs)
Official docs entry point: https://learn.microsoft.com/azure/ai-services/custom-vision-service/
3. Why use Azure AI Custom Vision?
Business reasons
- Faster time-to-value than building a bespoke CV training pipeline.
- Lower barrier for teams without deep ML expertise.
- Domain specialization: recognize your specific SKUs, parts, packaging, or defect types.
- Iterate quickly with new training images as your environment changes.
Technical reasons
- Managed training workflow with iterations and built-in evaluation metrics.
- Simple deployment via hosted prediction API.
- Supports both classification and detection, covering many practical computer vision needs.
- SDK/REST automation for repeatable training and deployment.
Operational reasons
- Clear separation between training and inference (common enterprise requirement).
- Model versioning via iterations helps operationalize rollbacks and A/B testing (implemented by deploying different published iterations or different project endpoints).
- Reduced infra management compared to self-managed GPU training.
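Iteration-based versioning can back a simple canary rollout: publish the candidate iteration under a second name and split traffic deterministically. A minimal sketch in Python (the iteration names and ID scheme are hypothetical; in practice the chosen name goes into the prediction URL path):

```python
import hashlib

def choose_published_iteration(request_id: str, canary_name: str,
                               stable_name: str, canary_percent: int) -> str:
    """Deterministically route a request to the canary or stable published
    iteration by hashing a stable ID (device, user, or camera)."""
    bucket = int(hashlib.sha256(request_id.encode()).hexdigest(), 16) % 100
    return canary_name if bucket < canary_percent else stable_name

# The same ID always lands in the same bucket, so each device sees a
# consistent model version for the duration of the rollout.
name = choose_published_iteration("device-0042", "mugclassifier-v2",
                                  "mugclassifier-v1", 10)
```

Because routing is hash-based rather than random, you can reproduce which model served any given request when investigating a disagreement.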
Security/compliance reasons
- Runs within Azure’s compliance boundary for Azure AI services (always validate your specific compliance requirements against Microsoft’s official compliance documentation for your tenant/region).
- Supports standard Azure security patterns: resource keys/secrets, RBAC around resource management, and private networking options for some AI services (verify whether Private Link is supported for your exact Custom Vision resources and region).
Scalability/performance reasons
- Hosted inference scales based on the service capabilities/SKU (exact scaling behavior is SKU- and region-dependent).
- Export/edge options (when available) can reduce latency and bandwidth cost by running inference locally.
When teams should choose it
Choose Azure AI Custom Vision when:
- You need custom classification/detection for a narrow domain.
- You have labeled images (or can label them).
- You want managed training and easy API deployment.
- You prefer a productized workflow over building everything in Azure Machine Learning.
When teams should not choose it
Avoid or reconsider Azure AI Custom Vision when:
- You need advanced architectures, full control over training code, or complex multi-stage pipelines (consider Azure Machine Learning).
- You require segmentation (pixel-level masks) rather than bounding boxes and labels (Custom Vision focuses on classification/detection; verify the current feature set if segmentation is required).
- Your environment needs strict offline-only operation with guaranteed export format availability (export support depends on project settings/domains; validate before committing).
- You have extremely large-scale datasets requiring bespoke data engineering and distributed training control.
4. Where is Azure AI Custom Vision used?
Industries
- Manufacturing (quality inspection, defect detection)
- Retail/e-commerce (product recognition, shelf compliance)
- Logistics (package type identification, label presence checks)
- Healthcare/life sciences (lab sample classification—subject to regulatory constraints)
- Agriculture (crop disease classification, pest detection)
- Construction (PPE detection, site safety checks)
- Automotive (part verification, damage detection)
- Security (object detection for restricted items—ensure lawful, ethical use)
Team types
- Application development teams integrating AI into products
- DevOps/platform teams operationalizing endpoints and deployments
- Data labeling teams supporting model improvements
- Innovation/PoC teams validating feasibility
- SRE/operations teams monitoring production services
Workloads and architectures
- Mobile apps calling hosted prediction endpoints
- Edge camera systems performing local inference (if export supported)
- Server-side batch inference for image archives (e.g., nightly classification)
- Real-time inspection pipelines connected to camera feeds (often via an intermediate service that extracts frames and calls the prediction API)
Real-world deployment contexts
- Production: stable iteration publishing, controlled access keys, monitoring/alerting, staged rollouts
- Dev/test: frequent retraining, experimental tagging strategies, limited images, free tier where possible
5. Top Use Cases and Scenarios
Below are realistic scenarios where Azure AI Custom Vision is a good fit.
1) Manufacturing defect detection (object detection)
- Problem: Identify scratches, dents, missing components on a production line.
- Why this service fits: Object detection supports locating defects/parts; iterative training improves accuracy over time.
- Example: A camera takes images of assembled units; a service calls the Custom Vision prediction endpoint to flag missing screws.
2) Product variant classification (multiclass classification)
- Problem: Distinguish between visually similar SKUs (size, label, color variants).
- Why this service fits: Custom classification learns subtle visual differences from your real product images.
- Example: Warehouse app classifies a box as “Model A / Model B / Model C” to reduce mis-picks.
3) Shelf compliance in retail (object detection)
- Problem: Detect if required products are present on a shelf and placed correctly.
- Why this service fits: Object detection with bounding boxes can identify items and count them.
- Example: Field staff use a mobile app to photograph shelves; backend checks planogram compliance.
4) PPE detection for safety checks (object detection)
- Problem: Detect helmets/vests/gloves in controlled work zones.
- Why this service fits: Object detection works well when PPE items are visually distinct and images represent real site conditions.
- Example: Site gate camera triggers an alert if a worker is missing a helmet.
5) Food sorting and grading (classification)
- Problem: Classify produce into grades (A/B/C) based on appearance.
- Why this service fits: Custom image classifiers can learn domain-specific visual cues.
- Example: A packing facility classifies apples into grade bins based on camera snapshots.
6) Document/photo triage (classification)
- Problem: Sort incoming photos into categories (receipt, invoice, ID, other).
- Why this service fits: Quick custom classification can route content to specialized downstream processors.
- Example: A customer support portal sorts attachments before sending them to OCR or human review.
7) Brand/logo detection (object detection)
- Problem: Detect whether a logo is present and where it appears.
- Why this service fits: Object detection finds logos even when small/rotated (within limits of training data).
- Example: Marketing team checks if partners used correct logo placement in store photos.
8) Equipment state recognition (classification)
- Problem: Identify if a machine is “on/off/error” based on indicator lights.
- Why this service fits: Custom classification learns indicator patterns from your exact device models.
- Example: A maintenance app classifies panel images to create incident tickets automatically.
9) Waste sorting assistance (classification/detection)
- Problem: Identify recyclable vs non-recyclable items (or detect contaminants).
- Why this service fits: A custom model trained on local waste streams improves relevance.
- Example: Kiosk app helps users sort items by photographing them.
10) Visual regression testing in software/hardware QA (classification)
- Problem: Detect changes in UI screenshots or physical assembly images.
- Why this service fits: Custom classification can flag “expected” vs “unexpected” states with curated training sets.
- Example: QA pipeline classifies screenshots into “pass/fail” categories for fast triage (note: specialized visual diff tools may be better; Custom Vision can complement).
11) Species/pest detection for agriculture (object detection)
- Problem: Detect specific pests on sticky traps or leaves.
- Why this service fits: Object detection for small targets works if images are high-quality and labeled carefully.
- Example: Field team uploads trap images; system counts pests per trap.
12) Packaging integrity checks (detection)
- Problem: Detect missing seals, misapplied labels, or damaged packaging.
- Why this service fits: Object detection can confirm presence/position of expected elements.
- Example: Automated line stops when “seal_missing” is detected above a threshold.
6. Core Features
Feature availability can vary by region/SKU and may evolve. Validate in the Azure portal and official docs.
6.1 Image classification (multiclass and multilabel)
- What it does: Predicts category labels for an entire image (one label or multiple).
- Why it matters: Many real problems are “what is this image?” rather than “where is the object?”
- Practical benefit: Simple labeling (tags) and fast iteration.
- Limitations/caveats: Performance depends heavily on representative training data (lighting, background, device camera differences).
6.2 Object detection with bounding boxes
- What it does: Detects objects and returns bounding boxes plus probabilities.
- Why it matters: Enables localization tasks (counting, presence checks, compliance).
- Practical benefit: Supports automation like “if missing part detected then fail.”
- Limitations/caveats: Requires more labeling effort; small objects and heavy occlusion can reduce accuracy.
6.3 Project-based workflow (portal + APIs)
- What it does: Organizes datasets, tags, iterations, and deployments in a single project.
- Why it matters: Improves repeatability and collaboration.
- Practical benefit: Easier lifecycle management than ad-hoc scripts alone.
- Limitations/caveats: Dataset governance and export/import processes should be planned for enterprise workflows (don’t treat the portal as your only “source of truth”).
6.4 Training iterations and publishing
- What it does: Train multiple model iterations; publish a chosen iteration to a prediction endpoint.
- Why it matters: Safe rollouts and easy rollback.
- Practical benefit: Keep a “known good” published iteration while experimenting.
- Limitations/caveats: Versioning semantics are tied to iterations/published names; you must document and automate your promotion process.
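Once you export each iteration's metrics via the training API, a promotion step like the one described above is easy to automate. A hedged sketch of a promotion gate (the metric dictionaries and thresholds are illustrative, not a service API):

```python
def should_promote(candidate: dict, baseline: dict,
                   min_gain: float = 0.0, max_recall_drop: float = 0.01) -> bool:
    """Gate an iteration promotion: require precision at least as good as the
    currently published baseline, and tolerate only a tiny recall regression."""
    precision_ok = candidate["precision"] >= baseline["precision"] + min_gain
    recall_ok = candidate["recall"] >= baseline["recall"] - max_recall_drop
    return precision_ok and recall_ok
```

Wiring this into CI lets you refuse to publish an iteration that silently trades recall for precision.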
6.5 Evaluation metrics and threshold tuning
- What it does: Provides precision/recall and probability thresholds for predictions.
- Why it matters: You can tune for fewer false positives vs fewer false negatives.
- Practical benefit: Aligns model behavior with business risk (e.g., safety checks prefer fewer false negatives).
- Limitations/caveats: Offline metrics may not match production performance if your real images drift from training distribution.
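Threshold tuning is easiest to reason about as a concrete calculation. Given per-image probabilities and ground-truth labels for one tag, precision and recall at a candidate threshold can be computed like this (a self-contained sketch, not a service API):

```python
def precision_recall_at(threshold: float, scored: list) -> tuple:
    """scored: list of (probability, is_actually_positive) pairs for one tag.
    Predictions at or above the threshold count as positive."""
    tp = sum(1 for p, pos in scored if p >= threshold and pos)
    fp = sum(1 for p, pos in scored if p >= threshold and not pos)
    fn = sum(1 for p, pos in scored if p < threshold and pos)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

Sweeping the threshold over a held-out set shows the precision/recall trade-off directly; a safety check would pick the threshold that keeps recall high even at the cost of precision.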
6.6 REST APIs and SDK support
- What it does: Programmatically upload images, train models, manage iterations, and run inference.
- Why it matters: Enables automation and CI/CD-style retraining workflows.
- Practical benefit: Repeatable pipelines; fewer manual portal steps.
- Limitations/caveats: API versions and endpoints can change; pin SDK versions and follow official version guidance.
6.7 Export / edge deployment (supported projects only)
- What it does: Exports trained models to formats suitable for running outside Azure (availability depends on the chosen domain/model settings).
- Why it matters: Low latency, offline inference, reduced network cost.
- Practical benefit: Run inference on factory floor devices, mobile, or IoT.
- Limitations/caveats: Export is not always available for all domains/project types; validate early. Exported models may require device-specific optimization and careful update management.
6.8 Separation of training and prediction resources
- What it does: Training and hosted prediction typically use different Azure resources.
- Why it matters: Security boundary and cost management (training is spiky; prediction is steady).
- Practical benefit: Lock down training keys; scale prediction independently.
- Limitations/caveats: Requires planning for resource creation, key management, and region selection.
7. Architecture and How It Works
High-level architecture
At a high level, Azure AI Custom Vision has two primary workflows:
- Training workflow
  - You upload labeled images to a project
  - The service trains an iteration
  - You evaluate and optionally publish the iteration
- Prediction workflow
  - Your app sends an image to the prediction endpoint
  - The service returns predicted labels (classification) or bounding boxes (detection)
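The prediction endpoint returns JSON containing a predictions array with tag names and probabilities (the real payload carries additional fields such as tag IDs and, for detection, bounding boxes; check the API reference for the full schema). A minimal sketch of extracting the top prediction from such a response:

```python
import json

# A response in roughly the shape the prediction endpoint returns,
# trimmed to the fields this sketch uses.
sample = json.loads("""
{
  "predictions": [
    {"tagName": "Mug", "probability": 0.97},
    {"tagName": "NotMug", "probability": 0.03}
  ]
}
""")

def top_prediction(response: dict) -> tuple:
    """Return (tagName, probability) for the highest-probability tag."""
    best = max(response["predictions"], key=lambda p: p["probability"])
    return best["tagName"], best["probability"]
```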
Request/data/control flow (typical)
- Data plane (images)
- Training: image upload + tags/boxes
- Inference: image sent to prediction endpoint
- Control plane
- Resource management (create training/prediction resources, manage keys)
- Project/iteration management via portal/API
Integrations with related services
Common patterns:
- Blob Storage: source of images; apps fetch from storage and send to the endpoint
- Functions/Container Apps: lightweight API layer, preprocessing (resize/compress), auth, rate limiting
- Event Grid: trigger inference when new blobs arrive
- Key Vault: store training/prediction keys
- App Insights/Azure Monitor: request tracing/metrics for your app tier that calls Custom Vision
Dependency services
- Azure resource group, networking, identity, logging (for your app)
- Custom Vision training + prediction resources
Security/authentication model (practical view)
- Prediction calls commonly use:
- Prediction Key header + endpoint URL (key-based auth)
- Training calls commonly use:
- Training Key header + training endpoint URL
- Access to create/manage resources uses:
- Azure RBAC (portal/ARM)
- Microsoft Entra ID (formerly Azure AD) authentication support varies across Azure AI services and endpoints; verify in official docs for your current requirements.
Networking model
- Hosted endpoints are public by default.
- Some Azure AI services support private networking via Azure Private Link; availability for Custom Vision can vary by region/SKU and resource type. Verify in official docs before designing a private-only architecture.
Monitoring/logging/governance
- Your calling application should log:
- request IDs, timestamps, image source, model version (published iteration name), latency, top predictions, confidence
- Use governance practices:
- resource naming conventions
- tags for cost center/environment/owner
- key rotation
- periodic access review
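A sketch of the per-call log record suggested above (field names are illustrative; adapt them to your logging pipeline):

```python
import datetime
import uuid

def make_inference_log(image_uri: str, published_name: str,
                       predictions: list, latency_ms: float) -> dict:
    """Build one structured log record per prediction call, capturing the
    fields that keep results auditable and reproducible."""
    return {
        "request_id": str(uuid.uuid4()),
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "image_source": image_uri,
        "model_version": published_name,  # the published iteration name
        "latency_ms": latency_ms,
        "top_predictions": sorted(predictions,
                                  key=lambda p: p["probability"],
                                  reverse=True)[:3],
    }
```

Recording the published iteration name with every prediction is what makes later questions like "which model produced this result?" answerable.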
Simple architecture diagram (Mermaid)
flowchart LR
U[User / Camera / App] --> A[App Service / Function]
A -->|HTTPS + Prediction-Key| CVP["Azure AI Custom Vision<br/>Prediction Endpoint"]
A --> B["Blob Storage<br/>(Optional: store images)"]
CVP --> A
Production-style architecture diagram (Mermaid)
flowchart TB
subgraph EdgeOrClient["Edge/Client"]
C[Camera / Mobile App]
end
subgraph Azure["Azure Subscription"]
EG["Event Grid (optional)"]
ST[(Azure Blob Storage)]
F["Azure Functions / Container Apps<br/>Preprocess + Auth + Rate Limit"]
KV[Azure Key Vault]
MON[Application Insights / Azure Monitor]
CVT["Azure AI Custom Vision<br/>Training Resource"]
CVP["Azure AI Custom Vision<br/>Prediction Resource"]
DEV["DevOps Pipeline<br/>(GitHub Actions/Azure DevOps)"]
end
C -->|Upload| ST
ST -->|Blob Created Event| EG
EG --> F
F -->|Fetch image| ST
F -->|Get secrets| KV
F -->|"Predict (HTTPS)"| CVP
F --> MON
DEV -->|Automate training via API| CVT
DEV -->|Publish iteration| CVT
CVT -->|Publishes to| CVP
8. Prerequisites
Account/subscription requirements
- An active Azure subscription with billing enabled.
- Ability to create resources in a resource group.
Permissions / IAM roles
You typically need:
- Contributor on the target resource group (to create Custom Vision resources)
- Or more restrictive:
  - Cognitive Services Contributor (or equivalent) for resource creation
  - Reader for auditors/observers
- For secret storage:
  - Key Vault access roles (for example, Key Vault Secrets Officer/Secrets User depending on your access model)
Billing requirements
- Pay-as-you-go or enterprise agreement subscription.
- If using free tier, understand its limits (projects, training time, transactions)—confirm on the pricing page.
Tools needed
For the hands-on lab, you can complete everything in the portal. Optional tools:
- Azure CLI: https://learn.microsoft.com/cli/azure/install-azure-cli
- curl (for quick endpoint tests)
- Python 3.10+ (for sample code)
- A way to gather images (a phone camera is fine)
Region availability
- Custom Vision resources are regional.
- Choose a region close to your users/cameras for latency and data residency.
- Always verify supported regions in official docs and the Azure portal during resource creation.
Quotas/limits
Limits commonly exist around:
- Number of projects
- Images per project
- Training time/iterations
- Transactions per second / rate limits
- Image size constraints
These change over time; verify in official docs and in your subscription quota views where applicable.
Prerequisite services (recommended)
- Azure Key Vault for storing keys
- Azure Blob Storage for storing training/inference images (optional but recommended for traceability)
- A minimal compute tier (Functions/Container Apps/App Service) to call prediction endpoints from a controlled backend rather than directly from clients
9. Pricing / Cost
Azure AI Custom Vision pricing is usage-based and depends on SKU/region. Do not lock a solution design until you confirm prices for your region and billing agreement.
Official pricing page (start here):
https://azure.microsoft.com/pricing/details/cognitive-services/custom-vision-service/
(If the URL redirects under Azure AI services branding, follow the official redirect.)
Azure Pricing Calculator:
https://azure.microsoft.com/pricing/calculator/
Pricing dimensions (typical)
Common cost meters include:
- Training: charged based on training compute/units (often time-based or per training unit)
- Prediction (hosted inference): charged per number of prediction transactions (often per 1,000 transactions), with different rates for classification vs detection in some pricing models
- Resource SKUs: free tier vs Standard tiers (naming varies; confirm current SKUs in the portal)
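Because hosted prediction is typically metered per 1,000 transactions, a back-of-the-envelope estimate is simple arithmetic. A sketch with placeholder rates (take real prices for your region and SKU from the pricing page):

```python
def monthly_prediction_cost(transactions_per_month: int,
                            price_per_1000: float,
                            free_transactions: int = 0) -> float:
    """Rough monthly estimate for hosted prediction. The rate and free
    allowance are placeholders, not actual Azure prices."""
    billable = max(0, transactions_per_month - free_transactions)
    return (billable / 1000) * price_per_1000

# Example: 250,000 calls/month at a hypothetical $2 per 1,000 transactions
# is 250 * 2 = $500 before any free allowance.
cost = monthly_prediction_cost(250_000, 2.0)
```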
Free tier (if applicable)
Azure AI services often offer a free tier with:
- A limited number of transactions per month
- Limited training capacity/projects
This changes; verify the current free-tier limits on the official pricing page.
Primary cost drivers
- Number of prediction calls (and whether you’re doing detection vs classification)
- Size/complexity of images (affects bandwidth and sometimes latency; cost is usually per transaction, but upstream costs can rise)
- Frequency of retraining (especially if training is charged)
- Environment split: dev/test/prod resources
Hidden or indirect costs
- Storage (Blob Storage for images and results)
- Data transfer
- Upload bandwidth from edge to Azure
- Egress costs if you move results across regions or out of Azure
- Compute hosting for your app layer (Functions/Container Apps/App Service)
- Key Vault operations (usually small, but measurable at scale)
- Human labeling time (often the biggest real cost)
Network/data transfer implications
- If cameras upload large images frequently, bandwidth can dominate. Consider:
- resizing/compressing before upload
- sending only necessary frames (sampling)
- moving inference closer to the edge (export-supported scenarios)
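Frame sampling is the simplest of these levers. A sketch of selecting every Nth frame from a feed rather than running inference on all of them (the rate is illustrative):

```python
def sample_frames(total_frames: int, every_nth: int) -> list:
    """Select every Nth frame index; with every_nth=30 on a 30 fps feed,
    you run inference once per second instead of 30 times."""
    return list(range(0, total_frames, every_nth))

# 10 frames sampled at every 3rd frame -> indices 0, 3, 6, 9
indices = sample_frames(10, 3)
```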
How to optimize cost
- Start with classification if detection is not necessary.
- Use sampling (don’t run inference on every frame in a video feed).
- Implement confidence thresholds and only escalate uncertain cases to humans.
- Cache results for repeated images (where appropriate).
- Separate dev/test from prod and apply budgets/alerts.
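The "escalate uncertain cases to humans" rule above can be a small triage function. The thresholds here are illustrative and should be tuned per scenario against your own precision/recall requirements:

```python
def triage(prediction: dict, accept_threshold: float = 0.85,
           reject_threshold: float = 0.30) -> str:
    """Auto-accept confident predictions, auto-reject very low ones, and
    escalate only the uncertain middle band to human review."""
    p = prediction["probability"]
    if p >= accept_threshold:
        return "auto-accept"
    if p < reject_threshold:
        return "auto-reject"
    return "human-review"
```

Narrowing the human-review band as the model improves is a direct way to convert retraining effort into lower labeling cost.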
Example low-cost starter estimate (conceptual)
A typical low-cost learning setup:
- 1 training resource + 1 prediction resource (free tier if available)
- A few hundred prediction calls for testing
- Minimal storage (a few hundred images)
Because exact numbers vary, compute an estimate by entering expected monthly prediction transactions and any training units into the Pricing Calculator.
Example production cost considerations (conceptual)
For production, quantify:
- peak and average predictions per second
- expected monthly transactions
- retraining cadence (weekly/monthly)
- image sizes and upload patterns
Then:
- set Azure budgets
- add alerts
- run a short load test to confirm latency and any throttling behavior
10. Step-by-Step Hands-On Tutorial
This lab builds and deploys a basic image classification model using the Azure AI Custom Vision portal, then calls it via HTTP.
Objective
Create an Azure AI Custom Vision classification project that distinguishes between two categories (example: “Mug” vs “NotMug”), train an iteration, publish it, and call the prediction endpoint with curl.
Lab Overview
You will:
1. Create Azure resources (Training + Prediction).
2. Create a Custom Vision project.
3. Upload and tag images.
4. Train and evaluate a model iteration.
5. Publish the model to an endpoint.
6. Call the endpoint and interpret results.
7. Clean up resources to avoid ongoing charges.
Data note: Use your own photos (recommended). For best results, collect images under realistic conditions (lighting/background/angles) similar to production.
Step 1: Create Azure AI Custom Vision resources (Training and Prediction)
You can do this in the Azure portal.
- Go to the Azure portal: https://portal.azure.com
- Create a Resource group (if you don't have one):
  - Search Resource groups → Create
  - Name: rg-customvision-lab
  - Region: choose a region that supports Custom Vision (the portal will show availability)
- Create the Training resource:
  - Search Custom Vision (or "Azure AI services" → find Custom Vision)
  - Name: cvtrain<unique>
  - Select an appropriate SKU (free tier if available; otherwise a standard tier)
- Create the Prediction resource:
  - Name: cvpred<unique>
  - Choose the same region if possible (simplifies latency and compatibility)
Expected outcome: You have two Azure resources created: one for training, one for prediction.
Verification:
- In the portal, open each resource and confirm:
  - Status is healthy/ready
  - You can see Keys and Endpoint (exact blade naming can vary)
Step 2: Open the Custom Vision portal and create a project
- Open the Custom Vision portal: https://www.customvision.ai/
- Sign in with the same identity you use for Azure.
- Select New Project.
- Configure the project:
  - Name: cv-mug-classifier
  - Resource: select your training resource (cvtrain...)
  - Project Types: choose Classification
  - Classification Types: choose Multiclass (one label per image)
  - Domain: start with General (or a similar general-purpose domain shown in the portal)
Domain options can change. If you plan to export to edge, you may need a “compact” domain. Pick the domain based on your deployment requirements and verify export support early.
Expected outcome: A new, empty classification project is created.
Verification: You should see the project dashboard with tabs like Training Images, Tags, Train, and Performance (names may vary slightly).
Step 3: Collect and upload training images
Custom Vision needs only a small minimum number of images per tag to start training (historically about 5 per tag for classification; verify current minimums in the docs), but useful models require substantially more.
- Create two tags:
  - Tag 1: Mug
  - Tag 2: NotMug
- Gather images:
  - Mug: 15–30 photos of mugs (different mugs, angles, backgrounds)
  - NotMug: 15–30 photos of other objects (bottles, bowls, plates, phone, laptop, etc.)
- Upload images:
  - In Training Images → Add images
  - Upload your mug images and assign the Mug tag
  - Upload your non-mug images and assign the NotMug tag
Expected outcome: Images appear in the project and each image is tagged.
Verification:
- Use filtering by tag to confirm you have images under both tags.
- Make sure tags are not imbalanced (for the lab, keep counts roughly similar).
Step 4: Train your first iteration
- Click Train.
- Choose the training option offered (for example “Quick training” if present).
- Start training and wait for completion.
Expected outcome: Training finishes and you get an iteration with performance metrics.
Verification:
- Open the Performance view.
- Confirm you see metrics like precision/recall per tag and overall.
If your metrics are poor, don’t panic. For a lab model, you are validating the workflow. You can improve accuracy by adding more representative images and reducing label noise.
Step 5: Quick test the model in the portal
- Use the portal’s Quick Test feature.
- Upload a new image (not part of training):
  - A mug image → should predict Mug with higher probability
  - A non-mug image → should predict NotMug with higher probability
Expected outcome: The model returns a predicted tag and probability.
Verification: Confirm the top prediction matches the image content at least most of the time.
Step 6: Publish the iteration to the prediction resource
- In the iteration view, click Publish.
- Choose:
  - Prediction resource: select your prediction resource (cvpred...)
  - Published name: mugclassifier-v1
Expected outcome: The iteration is now published and available via a hosted prediction endpoint.
Verification:
- Look for a "Published" indicator on the iteration.
- In the portal, open Prediction URL (or similar) to see example request formats.
Step 7: Call the prediction endpoint with curl
This step uses the Prediction URL info provided by the portal so you don’t have to guess endpoints.
- In the Custom Vision portal, open Prediction URL for your published iteration.
- Copy:
  - The endpoint URL
  - The required header name and value (usually Prediction-Key: <your key>)
  - The correct path for classification (commonly includes /classify/iterations/<publishedName>/image)
- Run a curl call using a local test image.
Example (pattern only—use your exact URL and key from the portal):
curl -X POST "<PREDICTION_URL_FROM_PORTAL>" \
-H "Prediction-Key: <YOUR_PREDICTION_KEY>" \
-H "Content-Type: application/octet-stream" \
--data-binary "@./test-mug.jpg"
Expected outcome: A JSON response with predictions and probabilities.
Verification:
- Confirm the top prediction is Mug for a mug image.
- Save the response JSON for troubleshooting and auditing.
Step 8: (Optional) Improve the model with new images and retrain
- Add 10–20 more images for whichever class is underperforming.
- Retrain to create a new iteration.
- Compare metrics against the published iteration.
- Publish the new iteration as mugclassifier-v2 (or overwrite the same published name if your release process allows; be deliberate).
Expected outcome: A better-performing iteration is available.
Verification:
– Run the same curl tests and compare probabilities.
Validation
You have successfully completed this lab if:
- Two Azure resources exist (training + prediction).
- A Custom Vision project exists with at least two tags and tagged images.
- At least one iteration is trained and published.
- A curl request to the published endpoint returns predictions and the results are directionally correct.
Troubleshooting
Common issues and fixes:
- 401 Unauthorized / 403 Forbidden – Cause: wrong key, wrong header name, or using a training key on the prediction endpoint. – Fix: use the Prediction URL panel and copy the exact header and endpoint; ensure you are using the prediction resource key.
- 404 Not Found – Cause: incorrect project/iteration path or wrong region endpoint. – Fix: don’t hand-construct the URL; use the portal’s generated prediction URL.
- Model predicts the same label for everything – Cause: too few training images, unbalanced tags, or labels too similar/noisy. – Fix: add more diverse images, balance classes, remove mislabeled images, retrain.
- Poor performance in real photos – Cause: training images not representative (lighting, background, camera). – Fix: collect training images in real deployment conditions; include “hard negatives” (objects similar to mugs).
- Throttling / rate limits – Cause: high request rate. – Fix: add retry with backoff in your app; consider batching/sampling; check SKU limits.
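The retry-with-backoff fix can be sketched in Python. The `flaky_endpoint` function and the `RuntimeError` are stand-ins for your real prediction call and whatever exception your HTTP client raises on a 429:

```python
import random
import time

def call_with_backoff(fn, max_retries=5, base_delay=0.5):
    """Call fn(); on a throttling-style error, retry with exponential backoff plus jitter."""
    for attempt in range(max_retries):
        try:
            return fn()
        except RuntimeError:  # stand-in for a 429/5xx raised by your HTTP client
            if attempt == max_retries - 1:
                raise
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))

# Demo: an endpoint that throttles twice, then succeeds
calls = {"n": 0}
def flaky_endpoint():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("429 Too Many Requests")
    return {"predictions": []}

result = call_with_backoff(flaky_endpoint, base_delay=0.01)
```

In production, catch the specific throttling exception your client raises rather than a broad `RuntimeError`, and honor any `Retry-After` header the service returns.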
Cleanup
To avoid ongoing costs:
1. In Azure portal, delete the resource group:
– rg-customvision-lab → Delete resource group
2. Confirm deletion removes:
– Training resource
– Prediction resource
– Any associated resources you created for the lab
If you want to keep the resource group, delete only the Custom Vision resources you created.
11. Best Practices
Architecture best practices
- Put an API/service layer between clients and Custom Vision:
  - centralizes auth, throttling, logging, and routing
- Store images in Blob Storage and pass references internally:
  - keep audit trails and enable reprocessing
- Use event-driven designs for batch/async inference:
  - Event Grid → Functions → Prediction
- Plan model lifecycle:
  - naming conventions for published iterations (e.g., `productdetector-prod`, `productdetector-canary`)
  - rollback strategy (keep prior iteration published)
IAM/security best practices
- Store keys in Azure Key Vault, not in code or client apps.
- Rotate keys regularly and on staff changes.
- Restrict who can train/publish vs who can only call prediction endpoints.
- Use least-privilege RBAC for resource management.
Cost best practices
- Use free tier for learning if available and within limits.
- Compress/resize images before inference when acceptable.
- Reduce inference volume with sampling and business rules.
- Separate dev/test and prod resources; use budgets and alerts.
Performance best practices
- Keep inference calls close to the endpoint region.
- Use timeouts and retries with exponential backoff.
- Benchmark with production-like payload sizes.
- Consider edge export (if supported) for low-latency/high-volume camera environments.
Reliability best practices
- Implement circuit breakers in your app tier.
- Cache results for idempotent operations when appropriate.
- Have a “degraded mode” if prediction endpoint is temporarily unavailable.
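The circuit-breaker and degraded-mode ideas can be combined in a minimal sketch. The class name and the `RuntimeError` trigger are illustrative, not from any Azure SDK:

```python
import time

class CircuitBreaker:
    """Open after `threshold` consecutive failures; route to a fallback until `reset_after` elapses."""
    def __init__(self, threshold=3, reset_after=30.0):
        self.threshold = threshold
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None

    def call(self, fn, fallback):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                return fallback()          # degraded mode: skip the remote call entirely
            self.opened_at = None          # half-open: allow one trial call
        try:
            result = fn()
            self.failures = 0
            return result
        except RuntimeError:               # stand-in for your client's endpoint error
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.monotonic()
            return fallback()

# Demo: a dead endpoint trips the breaker after two failures
breaker = CircuitBreaker(threshold=2, reset_after=60)
def failing():
    raise RuntimeError("endpoint down")
def fallback():
    return "degraded"

results = [breaker.call(failing, fallback) for _ in range(3)]
```

After the second failure the breaker opens, so the third call never touches the endpoint—that gap is where your degraded-mode logic (queue for later, return cached result, flag for human review) lives.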
Operations best practices
- Log model version (published iteration name) on every prediction.
- Capture a sample of inputs/outputs for monitoring drift (ensure privacy compliance).
- Use dashboards for:
- request volume
- latency
- error rate
- confidence distribution (signals drift)
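The confidence-distribution drift signal can be tracked with a simple sliding-window sketch; the baseline, window size, and tolerance values here are illustrative and should be tuned against your own gold test set:

```python
from collections import deque
from statistics import mean

class ConfidenceDriftMonitor:
    """Flag drift when mean top-1 confidence over a sliding window drops below baseline - tolerance."""
    def __init__(self, baseline, window=100, tolerance=0.10):
        self.baseline = baseline
        self.tolerance = tolerance
        self.window = deque(maxlen=window)

    def record(self, confidence):
        self.window.append(confidence)

    def drifted(self):
        if len(self.window) < self.window.maxlen:
            return False  # wait for a full window before judging
        return mean(self.window) < self.baseline - self.tolerance

# Demo: baseline 0.90; a run of low-confidence predictions signals drift
monitor = ConfidenceDriftMonitor(baseline=0.90, window=5)
for c in [0.62, 0.60, 0.58, 0.61, 0.59]:
    monitor.record(c)
```

Wire `drifted()` into your dashboard/alerting so a sustained confidence drop prompts a review of recent inputs and, if needed, a retraining cycle.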
Governance/tagging/naming best practices
- Resource naming: include env/region/app
  - Example: `cvpred-prod-weu-inventory`
- Tag resources:
  - `env=dev|test|prod`
  - `owner=team`
  - `costCenter=...`
  - `dataClass=public|internal|confidential`
12. Security Considerations
Identity and access model
- Resource management: controlled by Azure RBAC (who can create/delete resources, access keys).
- Service access: commonly via API keys (training and prediction keys).
- Some Azure AI services support Azure AD token auth in certain scenarios; for Custom Vision, verify current auth support in official docs for your API version.
Encryption
- Data in transit: HTTPS endpoints.
- Data at rest: managed by Azure AI services platform. For strict requirements (CMK, customer-managed keys), verify support for your exact resource type/SKU/region in official docs.
Network exposure
- By default, prediction endpoints are public.
- If you require private-only connectivity:
- Check whether Custom Vision supports Private Link for your resource types and region (verify in official docs).
- If not supported, mitigate by restricting access through your own backend and network controls.
Secrets handling
- Put keys in Key Vault.
- Avoid embedding prediction keys in mobile apps or browser code.
- Use managed identity from your app tier to retrieve keys from Key Vault.
Audit/logging
- Log:
- who published a model (change management)
- when keys were rotated
- prediction requests (at least metadata and outcomes)
- For Azure resource changes, use Azure Activity Log and consider exporting to Log Analytics/SIEM.
Compliance considerations
- Review data residency and privacy requirements for images.
- Implement retention policies and access controls for stored images.
- If images include people or sensitive content, perform privacy impact assessment and legal review.
Common security mistakes
- Sending prediction keys directly to clients.
- No key rotation plan.
- Storing training images without access controls.
- No audit trail for model iteration promotion to production.
Secure deployment recommendations
- Front prediction calls with a backend service.
- Use Key Vault + managed identity for secrets.
- Add request validation (size/type limits) to prevent abuse.
- Maintain separate resources for dev/test/prod.
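Request validation in your backend might look like the sketch below. `MAX_BYTES` is a hypothetical app-level cap, not the service's documented limit—check current payload constraints in the official docs:

```python
MAX_BYTES = 4 * 1024 * 1024  # hypothetical app-level cap; confirm service limits in official docs

# Magic-byte prefixes for the formats this backend chooses to accept
MAGIC_PREFIXES = {
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG\r\n\x1a\n": "png",
}

def validate_image(data: bytes):
    """Reject empty, oversized, or non-JPEG/PNG payloads before forwarding to the endpoint."""
    if not data:
        return False, "empty payload"
    if len(data) > MAX_BYTES:
        return False, "payload too large"
    for prefix, fmt in MAGIC_PREFIXES.items():
        if data.startswith(prefix):
            return True, fmt
    return False, "unsupported format"
```

Checking magic bytes rather than trusting the client's `Content-Type` header blocks a common abuse path (arbitrary payloads relayed to your paid endpoint).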
13. Limitations and Gotchas
Because capabilities evolve, treat these as planning prompts and confirm current constraints in official docs.
- Export support varies by project type/domain and may not be available for every model.
- Labeling quality dominates results:
- mislabeled images can ruin accuracy
- inconsistent bounding boxes reduce detection performance
- Data drift is common:
- new packaging designs, lighting changes, camera upgrades can degrade accuracy
- Rate limits/throttling can occur under load:
- plan retries and backoff
- Class imbalance leads to biased predictions:
- keep training sets balanced where feasible
- Regional availability:
- not every Azure region supports every Azure AI service resource
- Privacy/security requirements:
- images may be sensitive; plan storage, retention, and access carefully
- Testing on “easy” images misleads:
- always test with production-like images
- Operational coupling:
- if you overwrite a published name without change control, you can break downstream assumptions
- API version differences:
- the prediction URL format and endpoints depend on API versions; use portal-generated URLs and official SDKs
14. Comparison with Alternatives
Within Azure
- Azure AI Vision (prebuilt Image Analysis): best for generic tagging, captions, and OCR; less ideal for your proprietary categories.
- Azure Machine Learning: best for full control and advanced ML workflows (custom architectures, MLOps pipelines), but higher complexity.
Other clouds
- AWS Rekognition Custom Labels: similar concept—custom image models with managed training/inference.
- Google Cloud Vertex AI (AutoML Vision): similar managed custom vision training.
Open-source/self-managed
- YOLO (e.g., Ultralytics YOLOv8 or later) for detection; ResNet/EfficientNet/ViT for classification, trained in PyTorch/TensorFlow.
- Strength: full control and potentially lower per-inference cost at scale.
- Weakness: you manage training infra, deployment, monitoring, and security.
Comparison table
| Option | Best For | Strengths | Weaknesses | When to Choose |
|---|---|---|---|---|
| Azure AI Custom Vision | Quick custom classification/detection with managed workflow | Fast setup, portal labeling, iterations, hosted prediction API, some export options | Less control than full ML platforms; export/domain constraints; needs representative images | You want managed custom vision without building full ML infrastructure |
| Azure AI Vision (prebuilt) | General image understanding | No training needed, quick integration | Not tailored to your proprietary classes | Your problem matches generic concepts (objects/scenes/text) |
| Azure Machine Learning | Advanced/custom ML at scale | Full control, MLOps, custom training code, GPU options | Higher complexity and operational burden | You need custom architectures, pipelines, governance, or multi-modal ML |
| AWS Rekognition Custom Labels | Similar managed custom vision on AWS | Integrated with AWS ecosystem | Cloud lock-in; different pricing and workflows | Your workloads are primarily on AWS |
| Google Vertex AI AutoML Vision | Similar managed custom vision on GCP | AutoML + strong ML platform integration | Different tooling/quotas; cloud lock-in | You’re standardized on GCP |
| Self-managed (YOLO/PyTorch/TensorFlow) | Maximum control and offline-first | Full flexibility, can optimize for hardware, avoid per-call fees | Highest engineering/ops burden | You need strict offline, custom training, or very high volume cost optimization |
15. Real-World Example
Enterprise example: Manufacturing quality inspection
- Problem: A manufacturer needs to detect missing components and visible defects on an assembly line. They need auditability and controlled rollouts.
- Proposed architecture:
- Cameras upload images to Blob Storage.
- Event Grid triggers Functions to preprocess images and call Azure AI Custom Vision prediction.
- Results stored in a database; failures create tickets in an ITSM system.
- A gated process publishes new iterations after validation on a gold test set.
- Why Azure AI Custom Vision was chosen:
- Faster prototyping than building an Azure ML training pipeline.
- Clear iteration/publishing workflow.
- Hosted endpoint simplifies integration with the line-control software.
- Expected outcomes:
- Reduced manual inspection workload.
- Faster defect detection and improved consistency.
- A repeatable retraining cycle as new defect types appear.
Startup/small-team example: E-commerce product photo categorization
- Problem: A small e-commerce team wants to auto-route product photos into categories (e.g., “shoes,” “bags,” “accessories”) using their own catalog style and backgrounds.
- Proposed architecture:
- Admin tool uploads images and calls a small backend API.
- Backend calls Azure AI Custom Vision prediction endpoint.
- The tool accepts model suggestions and stores the final label for feedback.
- Why Azure AI Custom Vision was chosen:
- Small team can manage labeling and training in the portal.
- Minimal infrastructure; only a lightweight backend is required.
- Expected outcomes:
- Faster product onboarding.
- More consistent categorization.
- Ability to improve model quality with incremental labeled data.
16. FAQ
- What is Azure AI Custom Vision used for?
  Training custom image classification and object detection models using your own labeled images, then deploying them via hosted endpoints or supported export formats.
- Do I need ML expertise to use it?
  You need basic understanding of labeling, evaluation, and data quality, but you typically don’t need to write training code for the managed workflow.
- What’s the difference between classification and object detection?
  Classification labels the entire image; detection finds objects and returns bounding boxes and probabilities.
- Do I need separate resources for training and prediction?
  Commonly yes (Training resource and Prediction resource). The portal and docs reflect the supported setup; verify the latest creation flow in official docs.
- How many images do I need?
  It depends on variability and difficulty. Start with dozens per class for a prototype, but real production models often require hundreds or thousands per class and ongoing updates.
- Can I run it offline on edge devices?
  Sometimes, via model export for supported project types/domains. Confirm export support early for your chosen configuration.
- How do I version models?
  Use iterations and published iteration names. Maintain a release process (dev → staging → prod) and keep a rollback iteration available.
- How do I secure the prediction endpoint?
  Don’t expose keys to clients. Call prediction from a backend service, store keys in Key Vault, rotate keys, and consider private networking options if supported (verify).
- Can I use Azure AD instead of keys?
  Azure RBAC governs resource management, but prediction/training APIs frequently use keys. Check official Custom Vision authentication documentation for current Azure AD support.
- Why is my model accurate in the portal but fails in production?
  Data drift and non-representative training images are the usual cause. Collect training images from production-like conditions and retrain.
- How do I reduce false positives?
  Increase the probability threshold, add “hard negative” examples, and refine labels. Evaluate impact on recall.
- Is Custom Vision suitable for video analytics?
  It’s image-based; for video you typically extract frames and run inference on selected frames. Consider sampling to control cost and latency.
- What image formats are supported?
  Common web image formats are typically supported, but verify current constraints (size, format, max payload) in official docs.
- How do I automate retraining?
  Use the training API/SDK to upload images, create iterations, evaluate metrics, and publish if metrics meet thresholds—integrated into GitHub Actions/Azure DevOps.
- How do I estimate cost?
  Use the official pricing page and Azure Pricing Calculator. Main drivers are prediction transactions and training usage.
- Can I restrict who can publish models?
  Yes, through Azure RBAC for resource management and by controlling who has training keys and portal access. Implement change management around publishing.
- What’s the best way to manage datasets?
  Keep a “source of truth” dataset in storage with metadata and labeling records, and use Custom Vision as the training/deployment tool. Consider a labeling workflow and a gold test set.
17. Top Online Resources to Learn Azure AI Custom Vision
| Resource Type | Name | Why It Is Useful |
|---|---|---|
| Official documentation | Azure AI Custom Vision docs — https://learn.microsoft.com/azure/ai-services/custom-vision-service/ | Primary reference for concepts, how-tos, API versions, and current capabilities |
| Official pricing | Azure AI Custom Vision pricing — https://azure.microsoft.com/pricing/details/cognitive-services/custom-vision-service/ | Authoritative pricing model and SKU details (confirm region) |
| Pricing calculator | Azure Pricing Calculator — https://azure.microsoft.com/pricing/calculator/ | Build estimates for training/prediction usage and related services |
| Portal | Custom Vision portal — https://www.customvision.ai/ | Main UI for projects, labeling, training iterations, publishing, and prediction URLs |
| Quickstarts/tutorials | Custom Vision quickstarts (from docs hub) — https://learn.microsoft.com/azure/ai-services/custom-vision-service/ | Step-by-step onboarding aligned with current SDKs/APIs |
| SDK reference | Azure SDK documentation — https://learn.microsoft.com/azure/developer/python/sdk/ (and language selectors) | Guidance on using official SDKs safely and correctly |
| Samples (GitHub) | Azure Samples on GitHub — https://github.com/Azure-Samples | Search for “custom vision” samples and end-to-end code patterns |
| Architecture guidance | Azure Architecture Center — https://learn.microsoft.com/azure/architecture/ | Reference architectures for event-driven, secure, and scalable Azure solutions (apply patterns to Custom Vision workloads) |
| Security baseline (general) | Azure security documentation — https://learn.microsoft.com/security/ | Broader Azure security practices for identity, networking, and governance |
| Video learning | Microsoft Azure YouTube — https://www.youtube.com/@MicrosoftAzure | Walkthroughs and demos; validate content recency against current docs |
18. Training and Certification Providers
The following training providers may offer Azure and AI + Machine Learning learning paths. Verify current course outlines and delivery modes on their sites.
| Institute | Suitable Audience | Likely Learning Focus | Mode | Website URL |
|---|---|---|---|---|
| DevOpsSchool.com | DevOps engineers, cloud engineers, developers | Azure fundamentals, DevOps, CI/CD, cloud operations; may include AI service integrations | Check website | https://www.devopsschool.com/ |
| ScmGalaxy.com | Beginners to intermediate DevOps learners | DevOps, automation, tooling; potential Azure integration content | Check website | https://www.scmgalaxy.com/ |
| CLoudOpsNow.in | Cloud ops teams, SREs, platform engineers | Cloud operations practices, monitoring, reliability; may include Azure operational patterns | Check website | https://www.cloudopsnow.in/ |
| SreSchool.com | SREs, ops leads, platform teams | Reliability engineering, observability, incident response applied to cloud workloads | Check website | https://www.sreschool.com/ |
| AiOpsSchool.com | Ops + AI practitioners | AIOps concepts, monitoring automation; may complement AI service operations | Check website | https://www.aiopsschool.com/ |
19. Top Trainers
Presented as training resources/platforms (verify the specific trainer profiles, schedules, and course coverage on each website).
| Platform/Site | Likely Specialization | Suitable Audience | Website URL |
|---|---|---|---|
| RajeshKumar.xyz | DevOps/cloud training content (verify scope) | Beginners to intermediate engineers | https://rajeshkumar.xyz/ |
| devopstrainer.in | DevOps training and coaching (verify offerings) | DevOps engineers, developers | https://www.devopstrainer.in/ |
| devopsfreelancer.com | Freelance DevOps/consulting and training (verify services) | Teams seeking short-term help or mentoring | https://www.devopsfreelancer.com/ |
| devopssupport.in | DevOps support/training (verify scope) | Ops/DevOps teams needing practical support | https://www.devopssupport.in/ |
20. Top Consulting Companies
Descriptions are neutral and based on likely service positioning inferred from public presence—confirm exact service catalogs and references directly with the companies.
| Company | Likely Service Area | Where They May Help | Consulting Use Case Examples | Website URL |
|---|---|---|---|---|
| cotocus.com | Cloud/DevOps consulting (verify specialties) | Cloud architecture, delivery execution, operations | Designing an Azure integration layer for Custom Vision inference; setting up monitoring and secure secret management | https://cotocus.com/ |
| DevOpsSchool.com | DevOps and cloud consulting/training | DevOps transformation, CI/CD, platform engineering | Building CI/CD for apps that consume Custom Vision; governance, IaC, and operational readiness | https://www.devopsschool.com/ |
| DEVOPSCONSULTING.IN | DevOps consulting (verify offerings) | Automation, reliability, operations | Implementing secure deployment patterns for AI-backed services; alerting, incident response setup | https://www.devopsconsulting.in/ |
21. Career and Learning Roadmap
What to learn before Azure AI Custom Vision
- Azure fundamentals
- Resource groups, regions, IAM/RBAC, networking basics
- HTTP and REST
- headers, auth, status codes, payload formats
- Computer vision basics
- classification vs detection, overfitting, train/test split
- Data handling
- basic image preprocessing, dataset organization, labeling discipline
What to learn after Azure AI Custom Vision
- MLOps practices
- dataset versioning, automated evaluation gates, staged releases
- Azure Machine Learning
- when you need full control, pipelines, model registry, advanced monitoring
- Production architecture
- event-driven patterns, retries/circuit breakers, secure secret management
- Model monitoring and drift
- confidence tracking, sampling, human-in-the-loop feedback loops
Job roles that use it
- Cloud solutions engineer / cloud architect
- DevOps engineer / SRE supporting AI-enabled apps
- Application developer integrating AI services
- Data analyst / data steward coordinating labeling and evaluation
- ML engineer (often as a complementary tool or baseline approach)
Certification path (Azure)
There is no single certification dedicated only to Custom Vision, but relevant Azure certifications often include:
– Azure Fundamentals (AZ-900)
– Azure AI Fundamentals (AI-900)
– Azure AI Engineer Associate (AI-102), which covers Azure AI services including vision workloads
– Azure role-based certifications (developer/architect) depending on your responsibilities
Always verify current certification offerings and exam objectives on Microsoft Learn: https://learn.microsoft.com/credentials/
Project ideas for practice
- Build a “defect detector” prototype with 3–5 defect classes.
- Create a home-inventory classifier (electronics, tools, kitchen items).
- Implement an event-driven inference pipeline (Blob Storage → Event Grid → Function → results table).
- Create an internal model release checklist with iteration naming and rollback steps.
- Build a drift dashboard: store top-1 confidence over time and alert on distribution shifts.
22. Glossary
- Azure AI Custom Vision: Azure service for training custom image classification and object detection models and deploying them for inference.
- Classification: Predicting label(s) for an entire image.
- Multiclass classification: Exactly one label per image.
- Multilabel classification: Multiple labels can apply to one image.
- Object detection: Predicting labels plus bounding boxes locating objects in the image.
- Tag: A label/category in Custom Vision used for classification or object labeling.
- Bounding box: Rectangle drawn around an object instance for detection training.
- Iteration: A trained model version produced by a training run.
- Published iteration: An iteration deployed to the prediction endpoint with a published name.
- Precision: Of predicted positives, how many were correct (low false positives).
- Recall: Of actual positives, how many were found (low false negatives).
- Threshold: Confidence cutoff used to decide whether to accept a prediction.
- Training resource: Azure resource used to run training operations.
- Prediction resource: Azure resource that hosts the prediction endpoint for inference.
- Prediction key / Training key: API keys used to authenticate prediction/training calls.
- Data drift: Changes in real-world inputs over time that reduce model performance.
- Hard negatives: Non-target examples that are visually similar to the target class; useful for improving robustness.
23. Summary
Azure AI Custom Vision is Azure’s managed service for building custom image classification and object detection models using your own labeled images. It matters because many real AI + Machine Learning scenarios require domain-specific visual recognition that prebuilt vision APIs cannot reliably deliver.
In Azure architectures, it typically sits behind a controlled application layer that handles authentication, logging, throttling, and image handling, while Custom Vision provides the training workflow (iterations) and an easy deployment path (hosted prediction endpoint and, where supported, export for edge use). Cost is primarily driven by prediction volume and training usage, plus indirect costs like storage and bandwidth. Security hinges on correct key management (Key Vault, rotation) and restricting endpoint exposure.
Use Azure AI Custom Vision when you want a practical, managed path to custom vision without building a full ML platform—then expand into stronger MLOps and governance practices as your solution matures. Next step: read the official docs hub and run a second lab that automates image upload, training, and publishing through the REST API/SDK, aligned with your team’s CI/CD workflow.