{"id":17,"date":"2026-04-12T13:11:46","date_gmt":"2026-04-12T13:11:46","guid":{"rendered":"https:\/\/www.devopsschool.com\/tutorials\/alibaba-cloud-auto-scaling-tutorial-architecture-pricing-use-cases-and-hands-on-guide-for-computing\/"},"modified":"2026-04-12T13:11:46","modified_gmt":"2026-04-12T13:11:46","slug":"alibaba-cloud-auto-scaling-tutorial-architecture-pricing-use-cases-and-hands-on-guide-for-computing","status":"publish","type":"post","link":"https:\/\/www.devopsschool.com\/tutorials\/alibaba-cloud-auto-scaling-tutorial-architecture-pricing-use-cases-and-hands-on-guide-for-computing\/","title":{"rendered":"Alibaba Cloud Auto Scaling Tutorial: Architecture, Pricing, Use Cases, and Hands-On Guide for Computing"},"content":{"rendered":"\n<h2 class=\"wp-block-heading\">Category<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Computing<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">1. Introduction<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Alibaba Cloud <strong>Auto Scaling<\/strong> is a Computing service that automatically adjusts the amount of compute capacity you run\u2014most commonly by adding or removing <strong>ECS instances<\/strong>\u2014based on demand, schedules, or events. Instead of manually provisioning servers to handle traffic spikes (and then forgetting to scale back down), you define scaling rules and boundaries (minimum\/maximum capacity), and Auto Scaling keeps your environment aligned with those rules.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In simple terms: <strong>you tell Auto Scaling what \u201chealthy capacity\u201d looks like<\/strong>, and it tries to maintain that capacity by creating or terminating instances as your workload changes. 
This helps you keep applications responsive during peak demand while reducing waste during low demand.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Technically, Auto Scaling works by managing <strong>scaling groups<\/strong> that contain one or more scaling \u201ctargets\u201d (usually ECS instances) and a set of policies (scaling rules, scheduled tasks, event-triggered tasks). It integrates tightly with core Alibaba Cloud services such as <strong>Elastic Compute Service (ECS)<\/strong>, <strong>Virtual Private Cloud (VPC)<\/strong> networking, and monitoring\/alerting (typically via <strong>CloudMonitor<\/strong>). You can optionally integrate it with load balancers so newly created instances are automatically registered to receive traffic.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The core problem Auto Scaling solves is the operational and financial burden of capacity management:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Without Auto Scaling, you either <strong>overprovision<\/strong> (paying for idle compute) or <strong>underprovision<\/strong> (risking downtime and performance issues).<\/li>\n<li>With Auto Scaling, you implement <strong>elasticity<\/strong>: capacity grows and shrinks with demand under controlled, auditable policies.<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">As of the latest generally available documentation, <strong>\u201cAuto Scaling\u201d<\/strong> remains the official service name in Alibaba Cloud, and its APIs and SDKs commonly appear under the ESS (Elastic Scaling Service) namespace, so \u201cESS\u201d in API\/SDK references generally refers to the same service. If Alibaba Cloud renames the product or consolidates features, check the official documentation for the current status.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">2. 
What is Auto Scaling?<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Official purpose<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Alibaba Cloud <strong>Auto Scaling<\/strong> is designed to <strong>automatically scale compute capacity<\/strong> for your applications by creating and releasing instances according to policies you define. Its goal is to help you maintain application availability and performance while controlling cost.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Core capabilities (what it can do)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Auto Scaling typically enables you to:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Define <strong>scaling groups<\/strong> with minimum, maximum, and desired capacity.<\/li>\n<li>Define how to create instances via <strong>scaling configurations<\/strong> and\/or <strong>launch templates<\/strong> (terminology depends on the workflow you choose; verify in official docs for your region\/account).<\/li>\n<li>Trigger scaling by:\n<ul class=\"wp-block-list\">\n<li><strong>Scheduled tasks<\/strong> (time-based)<\/li>\n<li><strong>Event-triggered tasks<\/strong> (often based on monitoring alarms, such as CPU utilization)<\/li>\n<li><strong>Manual execution<\/strong> of scaling rules<\/li>\n<\/ul>\n<\/li>\n<li>Attach instances to related infrastructure (commonly load balancers) so capacity changes are transparent to users.<\/li>\n<li>Perform controlled scaling with <strong>cooldowns<\/strong> and <strong>lifecycle hooks<\/strong> to reduce risk during scale-in\/scale-out events.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Major components (mental model)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">While exact UI labels can evolve, the core building blocks are typically:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Scaling Group<\/strong>: Defines where instances run (VPC, vSwitch(es), security group), and capacity boundaries (min\/max\/desired).<\/li>\n<li><strong>Scaling Configuration \/ Launch Template<\/strong>: Defines how to create instances: image, instance type, key 
pair\/password, disks, network, and optional bootstrap\/user data.<\/li>\n<li><strong>Scaling Rule<\/strong>: The action: add N instances, remove N instances, or set capacity to a specific number.<\/li>\n<li><strong>Scheduled Task<\/strong>: Triggers a scaling rule at a specific time or recurring schedule.<\/li>\n<li><strong>Event-triggered Task \/ Alarm-triggered Scaling<\/strong>: Triggers a scaling rule when an alarm is fired (commonly via CloudMonitor metrics).<\/li>\n<li><strong>Scaling Activity<\/strong>: A record of a scaling execution: what happened, when, success\/failure, and error messages.<\/li>\n<li><strong>Lifecycle Hook<\/strong> (optional but important): Pauses an instance during scale-in or scale-out so you can run scripts\/automation (e.g., register with config management, drain connections).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Service type<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Managed control-plane service<\/strong>: Auto Scaling manages scaling decisions and calls into other services (like ECS) to create\/terminate resources.<\/li>\n<li>It is not a compute runtime itself; it orchestrates <strong>compute provisioning<\/strong>.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Scope: regional\/zonal\/account<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Auto Scaling is typically <strong>regional<\/strong>: a scaling group exists in one region, and can use one or more <strong>zones<\/strong> within that region through vSwitch selection.<\/li>\n<li>Resources are owned within your <strong>Alibaba Cloud account<\/strong> (or resource directory member account) and governed by <strong>RAM<\/strong> permissions.<\/li>\n<li>It does <strong>not<\/strong> scale across regions automatically; multi-region strategies require additional design (traffic management + separate scaling groups per region).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">How it fits into the 
Alibaba Cloud ecosystem<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Auto Scaling sits in the Computing layer and commonly connects:\n&#8211; <strong>ECS<\/strong> for VM compute capacity\n&#8211; <strong>VPC<\/strong> (vSwitches, routing, security groups) for networking\n&#8211; <strong>CloudMonitor<\/strong> for metric\/alarm-driven scaling\n&#8211; <strong>Server Load Balancer<\/strong> \/ related load balancing services (product names vary; verify the exact LB type supported by your region and account)\n&#8211; <strong>ActionTrail<\/strong> for auditing API actions\n&#8211; <strong>Resource Orchestration Service (ROS)<\/strong> \/ Terraform for infrastructure-as-code<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">3. Why use Auto Scaling?<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Business reasons<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Lower cost through elasticity<\/strong>: scale down when demand drops instead of paying for idle servers.<\/li>\n<li><strong>Better customer experience<\/strong>: maintain performance and availability during traffic spikes.<\/li>\n<li><strong>Faster time-to-market<\/strong>: engineers spend less time doing manual capacity management.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Technical reasons<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Policy-driven capacity<\/strong>: define desired behavior (min\/max\/desired, rules) once and let the system execute repeatedly and consistently.<\/li>\n<li><strong>Integration with monitoring<\/strong>: scale based on CPU, memory (if available via agents\/metrics), QPS, queue depth, or custom metrics (verify what\u2019s supported in your environment).<\/li>\n<li><strong>Multi-zone resilience (within a region)<\/strong>: distributing vSwitches across zones can reduce zonal dependency (subject to instance type capacity availability).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Operational reasons<\/h3>\n\n\n\n<ul 
class=\"wp-block-list\">\n<li><strong>Repeatable, auditable scaling actions<\/strong>: scaling activities and logs provide a trail of what changed.<\/li>\n<li><strong>Reduced on-call load<\/strong>: fewer emergencies where humans need to add capacity quickly.<\/li>\n<li><strong>Standardization<\/strong>: scaling group templates become reusable building blocks across environments.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Security\/compliance reasons<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Least privilege via RAM<\/strong>: restrict who can change scaling policies.<\/li>\n<li><strong>Auditability<\/strong>: use ActionTrail to track scaling-related API calls and changes.<\/li>\n<li><strong>Controlled rollout<\/strong>: lifecycle hooks allow controlled onboarding\/offboarding, reducing the chance of misconfigured instances receiving traffic.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Scalability\/performance reasons<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Better peak handling<\/strong>: add instances in response to real signals, not guesses.<\/li>\n<li><strong>Avoid bottlenecks<\/strong>: scaling groups can be designed around stateless tiers that scale horizontally.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">When teams should choose Auto Scaling<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Auto Scaling is a strong fit when:\n&#8211; Your workload can run on <strong>multiple interchangeable instances<\/strong> (stateless web\/app tier).\n&#8211; You need <strong>predictable elasticity<\/strong> (e-commerce, campaigns, variable usage patterns).\n&#8211; You want to standardize VM fleet management with guardrails.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">When teams should not choose it<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Avoid or reconsider Auto Scaling if:\n&#8211; The workload is <strong>stateful<\/strong> and not designed for horizontal scaling (single-node databases, tightly coupled state).\n&#8211; 
Scaling is constrained by non-compute bottlenecks (e.g., database limits, licensing, external API rate limits).\n&#8211; You need <strong>sub-second scaling<\/strong> (VM boot times are typically minutes; consider container\/serverless approaches).\n&#8211; You cannot tolerate instance replacement (some legacy applications require manual configuration per host).<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">4. Where is Auto Scaling used?<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Industries<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Common across industries that experience variable load:\n&#8211; E-commerce and retail (flash sales)\n&#8211; Media\/streaming and content delivery origins\n&#8211; Gaming backends and matchmaking services\n&#8211; SaaS platforms with daily\/weekly demand curves\n&#8211; Education and online exams\n&#8211; Financial services frontends (with strict change control and auditing)<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Team types<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Platform engineering teams creating reusable compute patterns<\/li>\n<li>DevOps\/SRE teams managing reliability and cost<\/li>\n<li>Application teams operating stateless services<\/li>\n<li>Security\/operations teams enforcing governance and audit controls<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Workloads<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Web frontends and APIs<\/li>\n<li>Background workers (queue consumers)<\/li>\n<li>Batch processing fleets (time-based scaling)<\/li>\n<li>CI\/CD build agents (burst capacity)<\/li>\n<li>Multi-tenant SaaS app tiers<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Architectures and deployment contexts<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Three-tier apps<\/strong>: scaling app tier independently from DB tier<\/li>\n<li><strong>Microservices on VMs<\/strong>: each service scales based on its own metrics<\/li>\n<li><strong>Blue\/green or 
canary<\/strong>: controlled onboarding via lifecycle hooks (careful design required)<\/li>\n<li><strong>Production<\/strong>: most value comes from elasticity + resilience + cost control<\/li>\n<li><strong>Dev\/test<\/strong>: scheduled scaling down outside office hours reduces cost significantly<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">5. Top Use Cases and Scenarios<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Below are realistic scenarios where Alibaba Cloud Auto Scaling is often used.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">1) Elastic web tier behind a load balancer<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem<\/strong>: Traffic fluctuates; fixed VM count is either too expensive or too slow during spikes.<\/li>\n<li><strong>Why Auto Scaling fits<\/strong>: Scales ECS instances horizontally based on CPU\/QPS alarms or schedules.<\/li>\n<li><strong>Example<\/strong>: An online store scales from 2 to 12 ECS instances during weekend promotions.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">2) API service scaling by latency or CPU threshold<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem<\/strong>: API latency rises during peak; manual scaling lags behind incidents.<\/li>\n<li><strong>Why Auto Scaling fits<\/strong>: Alarm-triggered scaling can respond when metrics breach thresholds.<\/li>\n<li><strong>Example<\/strong>: A fintech API scales out when average CPU &gt; 60% for 5 minutes.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">3) Scheduled scaling for predictable business hours<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem<\/strong>: Usage is predictable (9am\u20136pm); running peak capacity 24\/7 wastes money.<\/li>\n<li><strong>Why Auto Scaling fits<\/strong>: Scheduled tasks increase capacity before business hours and reduce after.<\/li>\n<li><strong>Example<\/strong>: Internal portals scale to 6 instances at 08:30 and down to 2 at 
19:00.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">4) CI\/CD build agents that burst during deployments<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem<\/strong>: Build pipelines queue up during releases; static runners cause delays.<\/li>\n<li><strong>Why Auto Scaling fits<\/strong>: Scale out during job surges; scale in after.<\/li>\n<li><strong>Example<\/strong>: A platform team runs ephemeral ECS build agents that scale with queue depth (metric design required).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">5) Queue-based worker pools (async processing)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem<\/strong>: Background jobs (image processing, notifications) surge unpredictably.<\/li>\n<li><strong>Why Auto Scaling fits<\/strong>: Scale workers when backlog increases (via monitoring metric strategy).<\/li>\n<li><strong>Example<\/strong>: A media app scales workers from 1 to 20 when processing backlog grows.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">6) Batch compute fleet for nightly jobs<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem<\/strong>: Heavy ETL runs nightly; leaving the fleet running all day is wasteful.<\/li>\n<li><strong>Why Auto Scaling fits<\/strong>: Scheduled scale-out before the batch window, then scale-in after completion.<\/li>\n<li><strong>Example<\/strong>: Scale to 50 instances at 01:00 for processing, then back to 0\u20132 after 05:00 (design carefully; ensure minimum capacity rules allow it).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">7) Preemptible\/spot-style capacity for cost optimization (where supported)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem<\/strong>: Need cheap compute for non-critical workloads; can tolerate interruption.<\/li>\n<li><strong>Why Auto Scaling fits<\/strong>: Scaling configurations can be designed to use lower-cost purchasing options (verify current support in your region).<\/li>\n<li><strong>Example<\/strong>: A 
rendering farm uses cost-optimized instances and replenishes capacity when instances are reclaimed.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">8) Multi-zone resilience within one region<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem<\/strong>: A single zone outage or capacity shortage affects availability.<\/li>\n<li><strong>Why Auto Scaling fits<\/strong>: Use multiple vSwitches across zones; scaling can place instances in available zones.<\/li>\n<li><strong>Example<\/strong>: A SaaS web tier spans two zones; Auto Scaling maintains capacity if one zone runs low.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">9) Temporary event environments (marketing campaigns)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem<\/strong>: Campaign traffic spikes for a few days; manual provisioning is risky.<\/li>\n<li><strong>Why Auto Scaling fits<\/strong>: Create a scaling group for the event, then delete it.<\/li>\n<li><strong>Example<\/strong>: A product launch scales out aggressively for 72 hours, then cleans up.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">10) Cost-controlled dev\/test environments<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem<\/strong>: Dev\/test clusters left running overnight consume budget.<\/li>\n<li><strong>Why Auto Scaling fits<\/strong>: Scheduled tasks enforce scale-in outside working hours.<\/li>\n<li><strong>Example<\/strong>: A test environment scales to 0\u20131 instances after 20:00 and back up at 08:00 (verify minimum\/desired capacity behavior for your setup).<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">6. Core Features<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">This section focuses on commonly documented, current Auto Scaling features. 
Exact UI names can vary slightly; always validate against the official documentation for your region.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Scaling groups (capacity boundaries and placement)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does<\/strong>: Defines min\/max\/desired capacity, VPC\/vSwitch placement, and instance health handling.<\/li>\n<li><strong>Why it matters<\/strong>: Capacity boundaries prevent runaway scaling; placement determines resiliency and network reachability.<\/li>\n<li><strong>Practical benefit<\/strong>: Consistent fleet behavior with guardrails.<\/li>\n<li><strong>Caveats<\/strong>: Scaling is typically limited to a region; multi-region requires multiple groups and traffic management.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Scaling configurations and\/or launch templates (instance creation blueprint)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does<\/strong>: Defines how new instances are launched (image, instance type, disk, security group, credentials, and often user data).<\/li>\n<li><strong>Why it matters<\/strong>: New capacity must be consistent and secure.<\/li>\n<li><strong>Practical benefit<\/strong>: Immutable infrastructure pattern\u2014replace instances rather than patch in place.<\/li>\n<li><strong>Caveats<\/strong>: Misconfigured images\/user data cause cascading failures during scale-out; test changes before production.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Scaling rules (scale-out\/scale-in actions)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does<\/strong>: Specifies actions like \u201cadd 1 instance,\u201d \u201cremove 2 instances,\u201d or \u201cset desired capacity to N.\u201d<\/li>\n<li><strong>Why it matters<\/strong>: Rules are the building blocks used by schedules and alarms.<\/li>\n<li><strong>Practical benefit<\/strong>: Clear, reusable actions that can be invoked automatically or 
manually.<\/li>\n<li><strong>Caveats<\/strong>: Aggressive rules can cause oscillation; pair with cooldowns and sensible thresholds.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Scheduled tasks (time-based scaling)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does<\/strong>: Runs scaling rules on a schedule (one-time or recurring).<\/li>\n<li><strong>Why it matters<\/strong>: Many workloads have predictable patterns.<\/li>\n<li><strong>Practical benefit<\/strong>: Easy cost savings by scaling down outside business hours.<\/li>\n<li><strong>Caveats<\/strong>: Time zones and daylight saving time can cause surprises; verify which time zone a schedule is evaluated in.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Event-triggered scaling (alarm-based scaling)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does<\/strong>: Triggers scaling rules when monitoring alarms fire (commonly CPU utilization).<\/li>\n<li><strong>Why it matters<\/strong>: Responds to real conditions, not just time.<\/li>\n<li><strong>Practical benefit<\/strong>: Better performance under unpredictable load.<\/li>\n<li><strong>Caveats<\/strong>: Metrics are delayed and noisy; choose evaluation periods and thresholds carefully.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Cooldown periods (stability control)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does<\/strong>: Temporarily prevents repeated scaling actions immediately after a scaling event.<\/li>\n<li><strong>Why it matters<\/strong>: Avoids rapid scale-in\/scale-out loops.<\/li>\n<li><strong>Practical benefit<\/strong>: More stable capacity changes.<\/li>\n<li><strong>Caveats<\/strong>: A cooldown that is too long slows the response to load; one that is too short allows oscillation.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Lifecycle hooks (safe provisioning and safe termination)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does<\/strong>: Pauses scaling actions at defined points (e.g., when an instance 
launches or terminates) so automation can run.<\/li>\n<li><strong>Why it matters<\/strong>: New instances often need bootstrapping; terminating instances may need draining.<\/li>\n<li><strong>Practical benefit<\/strong>: Safer deployments and fewer user-facing errors.<\/li>\n<li><strong>Caveats<\/strong>: Hooks require external automation to \u201ccomplete\u201d the lifecycle; timeouts and failures must be handled.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Health checks and instance removal policies (operational correctness)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does<\/strong>: Helps ensure unhealthy instances are replaced and defines which instances are removed first during scale-in.<\/li>\n<li><strong>Why it matters<\/strong>: Scale-in choices impact availability.<\/li>\n<li><strong>Practical benefit<\/strong>: Reduced risk of terminating the \u201cwrong\u201d instance.<\/li>\n<li><strong>Caveats<\/strong>: Health depends on correct signals (LB health check, monitoring); verify supported health sources.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Integration with load balancing (traffic distribution)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does<\/strong>: Automatically adds\/removes instances to\/from a load balancer backend pool during scaling.<\/li>\n<li><strong>Why it matters<\/strong>: Without this, new instances may not receive traffic; removed instances may drop connections.<\/li>\n<li><strong>Practical benefit<\/strong>: Elastic web tier without manual registration.<\/li>\n<li><strong>Caveats<\/strong>: Ensure health check readiness gates traffic; bootstrap time matters.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Tags and governance integration<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does<\/strong>: Applies tags to instances created by Auto Scaling (depending on configuration and supported features).<\/li>\n<li><strong>Why it matters<\/strong>: Cost 
allocation, policy enforcement, and operational tracking.<\/li>\n<li><strong>Practical benefit<\/strong>: Better FinOps and compliance reporting.<\/li>\n<li><strong>Caveats<\/strong>: Tag propagation rules can differ; verify exact behavior.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Auditability and activity history<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>What it does<\/strong>: Records scaling activities, errors, and events; can be audited via logs and ActionTrail.<\/li>\n<li><strong>Why it matters<\/strong>: Scaling affects cost and availability\u2014must be traceable.<\/li>\n<li><strong>Practical benefit<\/strong>: Faster troubleshooting and compliance evidence.<\/li>\n<li><strong>Caveats<\/strong>: Retention and access depend on your logging\/audit setup.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">7. Architecture and How It Works<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">High-level architecture<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Auto Scaling is a control-plane service. 
It evaluates triggers (schedules, alarms, manual actions), decides whether capacity must change, and then calls underlying services to execute actions (launch instances, terminate instances, attach\/detach to load balancers, etc.).<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Control flow (typical)<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Trigger occurs<\/strong>\n   &#8211; Scheduled time reached, or monitoring alarm fires, or operator executes a scaling rule.<\/li>\n<li><strong>Auto Scaling evaluates constraints<\/strong>\n   &#8211; Checks min\/max capacity, cooldown status, and scaling group state.<\/li>\n<li><strong>Auto Scaling initiates a scaling activity<\/strong>\n   &#8211; Requests ECS instance creation\/termination according to scaling configuration\/launch template.<\/li>\n<li><strong>Networking and security applied<\/strong>\n   &#8211; Instances are placed into configured VPC vSwitch(es) and security groups.<\/li>\n<li><strong>(Optional) Load balancer registration<\/strong>\n   &#8211; Instances are attached to backend servers, health checks begin.<\/li>\n<li><strong>Activity completes<\/strong>\n   &#8211; Success\/failure is recorded; you validate via console\/CLI and monitoring.<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\">Integrations with related services (common)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>ECS<\/strong>: provisioning\/termination of instances<\/li>\n<li><strong>VPC<\/strong>: vSwitch placement, routing, security group rules<\/li>\n<li><strong>CloudMonitor<\/strong>: alarms\/metrics used for event-triggered scaling<\/li>\n<li><strong>Server Load Balancer \/ ALB \/ NLB<\/strong>: traffic distribution (verify which LB products are supported for your scenario)<\/li>\n<li><strong>ActionTrail<\/strong>: auditing changes and API calls<\/li>\n<li><strong>ROS\/Terraform<\/strong>: infrastructure-as-code for repeatable deployments<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Dependency 
services<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Auto Scaling depends on:\n&#8211; At least one compute target (commonly ECS)\n&#8211; Networking (VPC + vSwitch)\n&#8211; Credentials and permissions (RAM + service-linked roles)\n&#8211; Optional monitoring\/alerting (CloudMonitor)<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Security\/authentication model<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Human and automation access is governed by <strong>RAM users\/roles\/policies<\/strong>.<\/li>\n<li>Auto Scaling itself typically uses a <strong>service-linked role<\/strong> to call other services on your behalf. The exact role name can vary; commonly it follows the \u201cAliyunServiceRoleFor\u2026\u201d pattern. Verify the current role name and required permissions in the Auto Scaling documentation.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Networking model<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Instances launched are attached to your specified <strong>VPC<\/strong> and <strong>vSwitch<\/strong>.<\/li>\n<li>Outbound internet access typically requires <strong>EIP<\/strong>, <strong>NAT Gateway<\/strong>, or other egress design (depending on your architecture).<\/li>\n<li>Inbound access should be via a <strong>load balancer<\/strong> or controlled security group rules rather than exposing each instance directly.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Monitoring\/logging\/governance<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Use <strong>CloudMonitor<\/strong> to track instance CPU, network, and (if configured) application metrics.<\/li>\n<li>Use <strong>ActionTrail<\/strong> for auditing scaling configuration changes and scaling actions.<\/li>\n<li>Consider centralizing logs via <strong>Log Service (SLS)<\/strong> (verify exact integration approach for your application).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Simple architecture diagram (conceptual)<\/h3>\n\n\n\n<pre><code class=\"language-mermaid\">flowchart LR\n  
User((Users)) --&gt; LB[Load Balancer]\n  LB --&gt; SG[Auto Scaling Group]\n  SG --&gt; ECS1[ECS Instance A]\n  SG --&gt; ECS2[ECS Instance B]\n  CM[CloudMonitor Alarm] --&gt; AS[Auto Scaling]\n  AS --&gt; SG\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Production-style architecture diagram (multi-zone, operational controls)<\/h3>\n\n\n\n<pre><code class=\"language-mermaid\">flowchart TB\n  subgraph Region[Alibaba Cloud Region]\n    subgraph VPC[VPC]\n      subgraph ZoneA[Zone A]\n        vswA[vSwitch A]\n        ecsA1[&quot;ECS (ASG)&quot;]\n        ecsA2[&quot;ECS (ASG)&quot;]\n      end\n      subgraph ZoneB[Zone B]\n        vswB[vSwitch B]\n        ecsB1[&quot;ECS (ASG)&quot;]\n      end\n\n      LB[Load Balancer] --&gt; ecsA1\n      LB --&gt; ecsA2\n      LB --&gt; ecsB1\n\n      NAT[NAT Gateway \/ Egress] --&gt; Internet[(Internet)]\n      ecsA1 --&gt; NAT\n      ecsA2 --&gt; NAT\n      ecsB1 --&gt; NAT\n    end\n\n    CM[CloudMonitor Alarms] --&gt; AS[Auto Scaling Control Plane]\n    AT[ActionTrail] --&gt; SIEM[(Audit\/Analytics)]\n    SLS[Log Service] --&gt; SIEM\n  end\n\n  Users((Users)) --&gt; LB\n<\/code><\/pre>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">8. 
Prerequisites<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Before starting, ensure the following are in place.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Account and billing<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>An active <strong>Alibaba Cloud account<\/strong> with billing enabled.<\/li>\n<li>For low-cost testing, use <strong>Pay-as-you-go<\/strong> ECS (recommended for labs to avoid longer commitments).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Permissions (RAM)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>A <strong>RAM user\/role<\/strong> with permissions to manage:\n<ul class=\"wp-block-list\">\n<li>Auto Scaling (ESS)<\/li>\n<li>ECS (instances, images, security groups, key pairs)<\/li>\n<li>VPC (vSwitch selection)<\/li>\n<li>CloudMonitor alarms (if doing alarm-triggered scaling)<\/li>\n<li>Load balancer resources (if integrating)<\/li>\n<\/ul>\n<\/li>\n<li>Ability to create or use the Auto Scaling <strong>service-linked role<\/strong> (if required by the console).<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">If you\u2019re in an enterprise with least-privilege policies, request a scoped policy for:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Create\/Update\/Delete scaling groups\/configurations\/rules<\/li>\n<li>Read instance health and describe instances<\/li>\n<li>Attach\/detach backend servers (if using a load balancer)<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Tools (optional but useful)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Alibaba Cloud Console<\/strong> (sufficient for this tutorial)<\/li>\n<li><strong>Alibaba Cloud CLI<\/strong> (<code>aliyun<\/code>) for verification and automation:\n<ul class=\"wp-block-list\">\n<li>Installation: https:\/\/www.alibabacloud.com\/help\/en\/alibaba-cloud-cli\/latest\/install-alibaba-cloud-cli<\/li>\n<li>Configure credentials: https:\/\/www.alibabacloud.com\/help\/en\/alibaba-cloud-cli\/latest\/configure-alibaba-cloud-cli<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Region availability<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Auto Scaling is available in many 
regions, but integrations (specific load balancer type, instance families) can vary.<\/li>\n<li>Choose one region for the entire lab.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Quotas\/limits<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Ensure you have enough quota for:<\/li>\n<li>ECS instance count<\/li>\n<li>vCPU capacity<\/li>\n<li>Security groups and vSwitch usage<\/li>\n<li>Quotas are region-specific; check <strong>Quotas<\/strong> in the console for the latest limits. Do not assume defaults.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Prerequisite services\/resources<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>A <strong>VPC<\/strong> and at least one <strong>vSwitch<\/strong> in a zone<\/li>\n<li>A <strong>security group<\/strong><\/li>\n<li>(Recommended) An <strong>SSH key pair<\/strong> for Linux access (or a secure password policy)<\/li>\n<li>(Optional but recommended) A <strong>load balancer<\/strong> to front the scaling group if you want a web endpoint that remains stable while instances change<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">9. Pricing \/ Cost<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Pricing model (what you pay for)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">In many Alibaba Cloud deployments, <strong>Auto Scaling as a control-plane service is offered at no additional charge<\/strong>, and you pay for the <strong>resources it creates and manages<\/strong> (ECS instances, disks, bandwidth, load balancers, NAT, monitoring, etc.). 
However, pricing and billing rules can change and can be region-dependent.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Official product page (overview): https:\/\/www.alibabacloud.com\/product\/auto-scaling  <\/li>\n<li>Official pricing entry points:<\/li>\n<li>Alibaba Cloud pricing hub: https:\/\/www.alibabacloud.com\/pricing<\/li>\n<li>Pricing Calculator: https:\/\/calculator.alibabacloud.com\/<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Verify in official docs\/pricing<\/strong> whether Auto Scaling currently has any billable dimensions in your region\/account (for example, certain advanced features, cross-service integrations, or monitoring usage patterns).<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Pricing dimensions to understand<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Even if Auto Scaling is \u201cfree,\u201d your total cost is driven by:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>\n<p><strong>ECS compute<\/strong>\n   &#8211; Instance type (vCPU\/RAM)\n   &#8211; Pricing method (pay-as-you-go vs subscription)\n   &#8211; Running hours and scale-out frequency<\/p>\n<\/li>\n<li>\n<p><strong>System and data disks<\/strong>\n   &#8211; Disk type (ESSD\/SSD\/HDD), size, and performance tier\n   &#8211; Snapshots (if enabled elsewhere)<\/p>\n<\/li>\n<li>\n<p><strong>Network egress<\/strong>\n   &#8211; Public bandwidth (EIP) or NAT Gateway egress\n   &#8211; Cross-zone traffic inside a region may be priced differently depending on product; verify for your architecture.<\/p>\n<\/li>\n<li>\n<p><strong>Load balancer<\/strong>\n   &#8211; Load balancer instance\/service charges\n   &#8211; LCU\/capacity units (model depends on LB type\u2014verify current billing for your chosen LB product)<\/p>\n<\/li>\n<li>\n<p><strong>Monitoring and logging<\/strong>\n   &#8211; CloudMonitor basic metrics are typically included for ECS, but advanced monitoring, custom metrics, or high-resolution metrics may cost extra (verify).\n   &#8211; Log Service 
(SLS) ingestion, storage, and queries can become significant.<\/p>\n<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\">Cost drivers (what increases cost quickly)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>High max capacity with frequent scaling events<\/li>\n<li>Using large instance types as scaling targets<\/li>\n<li>Aggressive scale-out triggers due to noisy metrics<\/li>\n<li>NAT Gateway and public egress costs for fleets that download packages at boot (common hidden cost)<\/li>\n<li>Storing logs centrally without retention controls<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Hidden\/indirect costs<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Bootstrapping traffic<\/strong>: instances pulling container images, OS updates, or packages during scale-out can generate egress and slow readiness.<\/li>\n<li><strong>Overly strict health checks<\/strong>: health checks with tight thresholds or short timeouts can mark working instances as unhealthy, causing churn (replacement loops) and extra cost.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Cost optimization tips<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Prefer <strong>golden images<\/strong> (custom images) to reduce bootstrap time and outbound downloads.<\/li>\n<li>Use <strong>scheduled scale-in<\/strong> for predictable idle periods.<\/li>\n<li>Set a realistic <strong>max capacity<\/strong> and configure alarms that notify you when the group approaches it.<\/li>\n<li>Use <strong>cooldowns<\/strong> and more stable metrics (e.g., average over 5\u201310 minutes) to reduce oscillation.<\/li>\n<li>Use <strong>smaller instance types<\/strong> with horizontal scaling where appropriate.<\/li>\n<li>Tag instances for cost allocation (project\/environment\/owner).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Example low-cost starter estimate (model, not numbers)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">A minimal lab often includes:\n&#8211; 1\u20132 small ECS instances (pay-as-you-go)\n&#8211; 1 VPC + 1\u20132 vSwitches\n&#8211; 1 security group\n&#8211; Optional 
1 load balancer (can be the largest incremental cost depending on type)\n&#8211; Minimal logs and basic monitoring<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Because pricing varies by region and instance family, use the <strong>Alibaba Cloud Pricing Calculator<\/strong> to estimate:\n&#8211; Baseline (min capacity) monthly cost\n&#8211; Peak (max capacity) cost during peak hours\n&#8211; Egress assumptions (package downloads, updates, user traffic)<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Example production cost considerations<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">In production, cost planning should include:\n&#8211; Peak scaling capacity and how often it happens\n&#8211; Reserved\/subscription capacity for baseline + pay-as-you-go burst for spikes\n&#8211; Load balancer capacity model\n&#8211; NAT and egress costs\n&#8211; Observability (logs, metrics, alerting) at real traffic volumes<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">10. Step-by-Step Hands-On Tutorial<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Objective<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Deploy a small, low-cost <strong>Auto Scaling<\/strong> setup on Alibaba Cloud that:\n&#8211; Launches ECS instances automatically from a defined configuration\n&#8211; Uses an <strong>alarm-triggered<\/strong> scale-out\/scale-in policy based on CPU\n&#8211; Optionally registers instances behind a load balancer (recommended if you want a stable entry point)<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Lab Overview<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">You will:\n1. Create networking prerequisites (VPC\/vSwitch\/security group).\n2. Create a scaling group and a scaling configuration (or launch template-based config, depending on your console options).\n3. Attach scaling rules and CloudMonitor alarm triggers.\n4. Validate scaling behavior.\n5. 
Clean up all resources.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Target cost<\/strong>: low, but not zero. You will pay for ECS runtime, disks, and possibly a load balancer and egress.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 1: Prepare networking (VPC, vSwitch, security group)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">1) In the Alibaba Cloud Console, select a <strong>Region<\/strong> (use one region consistently).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">2) Create or choose a <strong>VPC<\/strong>:\n&#8211; VPC CIDR example: <code>10.0.0.0\/16<\/code> (choose one that doesn\u2019t conflict with your network plan)<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">3) Create at least one <strong>vSwitch<\/strong> in a zone within the region:\n&#8211; vSwitch CIDR example: <code>10.0.1.0\/24<\/code>\n&#8211; Optional: create a second vSwitch in another zone for multi-zone placement.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">4) Create a <strong>security group<\/strong> for the scaling instances:\n&#8211; Inbound rules (minimum):\n  &#8211; If using a load balancer, allow inbound from the load balancer only (best practice).\n  &#8211; For a lab without LB, allow inbound TCP 80 from your IP for web testing (temporary).\n  &#8211; Allow inbound SSH (TCP 22) only from your IP for admin access (temporary).\n&#8211; Outbound: allow required outbound (default outbound allow is common in many setups; verify your policy).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcome<\/strong>\n&#8211; You have a VPC, at least one vSwitch, and a security group ready for ECS instances.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Verification<\/strong>\n&#8211; In VPC console, confirm the vSwitch is in the expected zone.\n&#8211; In security group, confirm inbound rules match your intended exposure.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 2: (Optional 
but recommended) Create a load balancer<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">A load balancer gives you a stable endpoint while instances scale in\/out.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">1) Create a <strong>load balancer<\/strong> in the same VPC.\n2) Create a listener (HTTP 80).\n3) Configure a <strong>health check<\/strong>:\n&#8211; Path: <code>\/<\/code>\n&#8211; Healthy\/unhealthy thresholds: use defaults unless you know your app behavior.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcome<\/strong>\n&#8211; You have a load balancer with an HTTP listener ready to receive backend instances.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Verification<\/strong>\n&#8211; The listener shows \u201crunning\u201d (or equivalent).\n&#8211; No backends yet (that\u2019s expected).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Notes<\/strong>\n&#8211; Alibaba Cloud has multiple load balancing products (and billing models). Choose the product recommended in your region for new deployments. 
Verify which load balancer types Auto Scaling can attach to in your specific configuration.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 3: Create an Auto Scaling scaling group<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">1) Open <strong>Auto Scaling<\/strong> in the Alibaba Cloud Console.\n2) Create a <strong>Scaling Group<\/strong>:\n&#8211; <strong>VPC<\/strong>: select the VPC from Step 1\n&#8211; <strong>vSwitch(es)<\/strong>: select one or multiple vSwitches\n&#8211; <strong>Min size<\/strong>: <code>1<\/code>\n&#8211; <strong>Max size<\/strong>: <code>3<\/code> (keep small for lab safety)\n&#8211; <strong>Desired capacity<\/strong>: <code>1<\/code>\n&#8211; <strong>Cooldown<\/strong>: start with 300 seconds (5 minutes) unless you have a reason to change it<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">3) Attach the load balancer (if you created one):\n&#8211; Select the LB and listener\/backend server group as required by the UI.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">4) Set the instance removal policy (if configurable):\n&#8211; Choose a policy that is predictable (e.g., remove the oldest or the newest instances first). If uncertain, keep the default and verify its behavior in the docs.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcome<\/strong>\n&#8211; A scaling group exists with min\/max\/desired capacity set.\n&#8211; No instances are launched until you attach an active configuration and enable the group (depending on workflow).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Verification<\/strong>\n&#8211; In the scaling group details, confirm:\n  &#8211; VPC and vSwitch settings\n  &#8211; Capacity bounds\n  &#8211; The status indicates the group is ready for configuration\/enablement<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 4: Create a scaling configuration (instance blueprint)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">This defines what Auto Scaling launches. 
Console workflows vary:\n&#8211; Some accounts use <strong>Scaling Configuration<\/strong>\n&#8211; Others may use <strong>Launch Template<\/strong> integration<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Use the option available in your console and verify the official doc for the latest recommended method.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">1) Create a <strong>Scaling Configuration<\/strong> (or select a <strong>Launch Template<\/strong>):\n&#8211; <strong>Image<\/strong>: choose a stable Linux image (e.g., Alibaba Cloud Linux). Verify available images in your region.\n&#8211; <strong>Instance type<\/strong>: choose a small, low-cost type available in your region (burstable types are common for labs). Verify availability.\n&#8211; <strong>Security group<\/strong>: select the security group from Step 1\n&#8211; <strong>VPC\/vSwitch<\/strong>: ensure it matches the scaling group\n&#8211; <strong>Login<\/strong>: choose SSH key pair (recommended) or password (follow strong password policy)<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">2) Add <strong>User Data<\/strong> (cloud-init style) to install and start a simple web server.\nUse a minimal bootstrap that is likely to work on common RPM-based distributions. 
The example below also includes an apt-get branch for Debian\/Ubuntu-based images; adapt it if your image differs.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Example (verify package manager for your image):<\/p>\n\n\n\n<pre><code class=\"language-bash\">#!\/bin\/bash\n# Exit on the first failed command so a broken bootstrap is not silently ignored\nset -e\n\nPAGE=\"Hello from Auto Scaling on Alibaba Cloud - $(hostname)\"\n\n# Try common package managers (works for many images; adapt if needed)\nif command -v yum &gt;\/dev\/null 2&gt;&amp;1; then\n  # RPM-based images (e.g., Alibaba Cloud Linux)\n  yum -y install nginx\n  echo \"$PAGE\" &gt; \/usr\/share\/nginx\/html\/index.html\nelif command -v apt-get &gt;\/dev\/null 2&gt;&amp;1; then\n  # Debian\/Ubuntu-based images; avoid interactive prompts during boot\n  export DEBIAN_FRONTEND=noninteractive\n  apt-get update\n  apt-get -y install nginx\n  echo \"$PAGE\" &gt; \/var\/www\/html\/index.html\nelse\n  echo \"No supported package manager found\" &gt; \/tmp\/bootstrap_error.txt\n  exit 1\nfi\n\n# Start nginx on every boot, and start it now\nsystemctl enable nginx\nsystemctl start nginx\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">3) Save and set this configuration as the <strong>active<\/strong> configuration for the scaling group.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcome<\/strong>\n&#8211; Auto Scaling has a valid blueprint to create instances.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Verification<\/strong>\n&#8211; In the scaling group, confirm an active scaling configuration\/template is attached.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 5: Enable the scaling group and launch the first instance<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">1) Enable the scaling group (if it is not already enabled).\n2) Set desired capacity to <code>1<\/code> (or keep it at <code>1<\/code>).\n3) Wait for the first scaling activity to complete.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcome<\/strong>\n&#8211; One ECS instance is created and appears as a scaling instance in the group.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Verification<\/strong>\n&#8211; In Auto 
Scaling:\n  &#8211; Check <strong>Scaling Activities<\/strong>: should show a successful scale-out to 1.\n  &#8211; Check <strong>Instances<\/strong> tab: the instance should be listed.\n&#8211; In ECS console:\n  &#8211; Confirm the instance is running in the correct VPC\/vSwitch\/security group.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">If you used a load balancer:\n&#8211; Confirm the instance is registered as a backend and becomes <strong>healthy<\/strong>.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 6: Create scaling rules (scale out and scale in)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Create two scaling rules:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">1) <strong>Scale-out rule<\/strong>\n&#8211; Action: add <code>1<\/code> instance\n&#8211; Cooldown: keep default or 300 seconds<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">2) <strong>Scale-in rule<\/strong>\n&#8211; Action: remove <code>1<\/code> instance\n&#8211; The group\u2019s min size (1) still applies: confirm the rule cannot reduce capacity below it<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcome<\/strong>\n&#8211; Two scaling rules exist and can be executed manually.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Verification<\/strong>\n&#8211; Use \u201cExecute\u201d (manual run) for the scale-out rule once to confirm it works.\n&#8211; Confirm desired capacity rises to 2 and a new instance is created and, if you use a load balancer, registered and healthy behind it.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Then manually execute scale-in to return to 1.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 7: Configure alarm-triggered scaling (CloudMonitor)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">You will create CPU-based alarms:\n&#8211; High CPU =&gt; scale out\n&#8211; Low CPU =&gt; scale in<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">1) Open <strong>CloudMonitor<\/strong>.\n2) Create an alarm for ECS CPU utilization:\n&#8211; 
Scope: instances in the scaling group (you may need to select the instances or a group\/tag-based scope depending on your monitoring options)\n&#8211; Condition:\n  &#8211; CPUUtilization &gt;= 60% for 5 minutes (example; tune later)\n&#8211; Alarm action:\n  &#8211; Trigger Auto Scaling <strong>scale-out rule<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">3) Create a second alarm:\n&#8211; CPUUtilization &lt;= 20% for 10 minutes (example; tune later)\n&#8211; Alarm action:\n  &#8211; Trigger Auto Scaling <strong>scale-in rule<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcome<\/strong>\n&#8211; Two alarms exist and are linked to Auto Scaling actions.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Verification<\/strong>\n&#8211; In CloudMonitor, confirm alarm status is \u201cEnabled.\u201d\n&#8211; In Auto Scaling, confirm event-triggered tasks (or equivalent) reference the alarms\/rules.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Important caveat<\/strong>\n&#8211; CPU alarms require meaningful CPU load. 
An idle nginx server typically won\u2019t trigger the high-CPU alarm unless you generate load.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Step 8: Generate load to trigger scale-out (controlled test)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">If your instance serves HTTP, you can generate load in two ways:\n&#8211; Use a simple load tool from a client machine\n&#8211; Or SSH into the instance and run a CPU stress tool (may require installation)<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Option A: HTTP load from your machine (simpler)<\/strong>\nIf you have a public endpoint (LB public address or a temporary public IP), run:<\/p>\n\n\n\n<pre><code class=\"language-bash\"># Replace URL with your load balancer DNS name or IP\nURL=\"http:\/\/YOUR_LB_ADDRESS\/\"\n\n# Light HTTP load: 2000 requests, at most 20 in flight at a time.\n# Serving a static page is cheap, so this may not raise CPU enough to fire the alarm.\nseq 1 2000 | xargs -P 20 -I{} curl -s -o \/dev\/null \"$URL\"\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Option B: CPU stress (more direct, use carefully)<\/strong>\nSSH into one instance and run a short stress test. 
For example (package names vary; verify):<\/p>\n\n\n\n<pre><code class=\"language-bash\"># RPM-based; the stress package may require an extra repository (such as EPEL) on some images\nsudo yum -y install stress\n# Load every vCPU for 10 minutes so average CPU stays above the 60% threshold\n# for longer than the alarm\u2019s 5-minute evaluation window\nstress --cpu \"$(nproc)\" --timeout 600\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Expected outcome<\/strong>\n&#8211; The high-CPU alarm transitions to \u201cALARM\u201d and triggers scale-out.\n&#8211; Auto Scaling increases desired capacity by 1 (up to max size).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Verification<\/strong>\n&#8211; CloudMonitor shows the alarm fired.\n&#8211; Auto Scaling scaling activities show a scale-out event.\n&#8211; ECS shows an additional instance running.\n&#8211; Load balancer backend shows the new instance added and healthy.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Validation<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Use this checklist:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">1) <strong>Scaling group state<\/strong>\n&#8211; Desired capacity changes match your rules.\n&#8211; Min\/max boundaries are respected.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">2) <strong>Instance provisioning<\/strong>\n&#8211; New instances are in the correct VPC\/vSwitch and have the correct security group.\n&#8211; User data successfully installs nginx and serves the expected page.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">3) <strong>Traffic and health<\/strong>\n&#8211; Load balancer health checks mark instances healthy.\n&#8211; Requests return content: <code>Hello from Auto Scaling on Alibaba Cloud - &lt;hostname&gt;<\/code><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">4) <strong>Audit<\/strong>\n&#8211; Scaling activities show timestamps and outcomes.\n&#8211; ActionTrail (if enabled) records scaling-related API calls.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Troubleshooting<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Common issues and practical fixes:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">1) 
<strong>Instances fail to launch<\/strong>\n&#8211; Check scaling activity error details.\n&#8211; Common causes:\n  &#8211; No quota for ECS instances\/vCPUs\n  &#8211; Selected instance type unavailable in the chosen zone\n  &#8211; Invalid image or configuration\n&#8211; Fix:\n  &#8211; Use a different instance type or add another vSwitch in another zone.\n  &#8211; Verify quotas in the console.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">2) <strong>Instances launch but do not become healthy behind LB<\/strong>\n&#8211; Check security group rules: allow LB health check traffic.\n&#8211; Check nginx is running:\n  &#8211; SSH and run <code>systemctl status nginx<\/code>\n&#8211; Fix:\n  &#8211; Adjust inbound rules (prefer allowing from LB security group\/source).\n  &#8211; Ensure user data uses correct package manager and paths.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">3) <strong>Alarms never trigger<\/strong>\n&#8211; Ensure alarm scope includes the scaling instances.\n&#8211; Ensure sufficient load and correct metric period.\n&#8211; Fix:\n  &#8211; Temporarily lower threshold or increase load.\n  &#8211; Verify metric availability and delay in CloudMonitor.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">4) <strong>Scaling oscillation (rapid scale in\/out)<\/strong>\n&#8211; Cooldown too short, thresholds too tight, or metric too noisy.\n&#8211; Fix:\n  &#8211; Increase cooldown, widen thresholds, increase evaluation periods.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">5) <strong>Scale-in terminates an instance that is still serving traffic<\/strong>\n&#8211; Missing connection draining \/ lifecycle hook.\n&#8211; Fix:\n  &#8211; Use lifecycle hooks and ensure LB deregistration\/drain completes before termination.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Cleanup<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">To avoid ongoing charges, delete resources in a safe order:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">1) In 
Auto Scaling:\n&#8211; Scale the group to zero while it is still enabled: set min size and desired capacity to 0 (this lab used min=1) and wait for the scaling activity to finish. A disabled group typically does not run scaling activities, so disable it only afterward if the console requires it; verify the behavior in the docs.\n&#8211; Delete scaling rules, scheduled tasks, and alarm associations\n&#8211; Delete the scaling configuration\/launch template association\n&#8211; Delete the scaling group<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">2) In ECS:\n&#8211; Verify no instances remain (terminate any leftover instances created by scaling)<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">3) Delete the load balancer (if created)<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">4) In CloudMonitor:\n&#8211; Delete the alarms used for scaling (optional but recommended)<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">5) In VPC:\n&#8211; Delete test vSwitches\/security group\/VPC if they were created solely for this lab and are not used elsewhere<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">11. Best Practices<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Architecture best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Design for statelessness<\/strong> in tiers that Auto Scaling manages. 
Store session\/state in external services (cache, DB) rather than on-instance.<\/li>\n<li>Use <strong>multi-zone vSwitches<\/strong> within a region when possible for resilience and capacity flexibility.<\/li>\n<li>Keep dependencies (DB, cache, queues) sized appropriately; scaling compute alone won\u2019t fix a saturated database.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">IAM\/security best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Use <strong>RAM roles and least privilege<\/strong>:<\/li>\n<li>Separate roles for operators vs automation.<\/li>\n<li>Restrict who can change min\/max and scaling rules.<\/li>\n<li>Prefer <strong>SSH key pairs<\/strong> over passwords for Linux instances.<\/li>\n<li>Avoid embedding secrets in user data; use a secrets strategy (see Security Considerations).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Cost best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Keep <strong>max size<\/strong> realistic and use alerts when near max.<\/li>\n<li>Combine <strong>scheduled scaling<\/strong> (predictable) with <strong>alarm scaling<\/strong> (unpredictable bursts).<\/li>\n<li>Use <strong>custom images<\/strong> to reduce boot time and egress from package downloads.<\/li>\n<li>Tag everything: <code>env<\/code>, <code>app<\/code>, <code>owner<\/code>, <code>cost-center<\/code>.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Performance best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Choose <strong>scaling metrics<\/strong> that reflect user experience:<\/li>\n<li>CPU is a starting point, but for many services QPS, latency, or queue depth is better.<\/li>\n<li>Calibrate scaling step size:<\/li>\n<li>Small increments reduce risk; larger increments reduce time-to-capacity.<\/li>\n<li>Maintain a <strong>warm baseline<\/strong> (min capacity) to reduce cold-start impact.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Reliability best practices<\/h3>\n\n\n\n<ul 
class=\"wp-block-list\">\n<li>Use <strong>health checks<\/strong> that reflect real readiness (not just \u201cport open\u201d).<\/li>\n<li>Implement <strong>lifecycle hooks<\/strong>:<\/li>\n<li>Scale-out: wait until app is ready and registered<\/li>\n<li>Scale-in: drain connections and finish in-flight work<\/li>\n<li>Test failure modes: unavailable instance types, zone capacity shortage, and image bootstrap failures.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Operations best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Monitor scaling activity success\/failure rate.<\/li>\n<li>Track instance launch time and readiness time.<\/li>\n<li>Keep an <strong>emergency manual override<\/strong> procedure (set desired capacity directly during incidents).<\/li>\n<li>Use infrastructure-as-code (ROS\/Terraform) for repeatability.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Governance\/tagging\/naming best practices<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Naming conventions:<\/li>\n<li><code>asg-&lt;app&gt;-&lt;env&gt;-&lt;region&gt;<\/code><\/li>\n<li><code>sc-&lt;app&gt;-&lt;env&gt;-v1<\/code><\/li>\n<li>Tag propagation strategy:<\/li>\n<li>Ensure scaled instances inherit tags used for billing and CMDB (verify tag propagation behavior).<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">12. Security Considerations<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Identity and access model<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>RAM users\/roles<\/strong> govern administrative access to Auto Scaling and dependent services.<\/li>\n<li>Auto Scaling commonly relies on a <strong>service-linked role<\/strong> to call ECS and related services. 
Ensure:<\/li>\n<li>The role exists<\/li>\n<li>It has only the required permissions<\/li>\n<li>Changes are audited<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Encryption<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>At rest<\/strong>: use encrypted disks where required by policy (ECS disk encryption options depend on region and disk type\u2014verify).<\/li>\n<li><strong>In transit<\/strong>:<\/li>\n<li>Use HTTPS on load balancers where applicable.<\/li>\n<li>Encrypt service-to-service calls where supported.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Network exposure<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Prefer <strong>private subnets<\/strong> (no public IP per instance) and expose only the load balancer.<\/li>\n<li>Control egress via NAT Gateway and route tables when you need consistent outbound IPs.<\/li>\n<li>Restrict security group inbound rules to:<\/li>\n<li>Load balancer sources<\/li>\n<li>Admin IP ranges for SSH (temporary and tightly scoped)<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Secrets handling<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Avoid putting secrets in:\n&#8211; User data\n&#8211; Images baked with plaintext secrets\n&#8211; Git repositories<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Preferred patterns (verify what your org supports):\n&#8211; Pull secrets at runtime from a secure store (Alibaba Cloud secret product choices may vary; verify current recommended service in official docs).\n&#8211; Use instance RAM roles to fetch secrets without long-lived access keys.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Audit\/logging<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Enable <strong>ActionTrail<\/strong> for audit logs and integrate with your log analytics\/SIEM.<\/li>\n<li>Log scaling events:<\/li>\n<li>Configuration changes (who changed min\/max\/rules)<\/li>\n<li>Scaling activities and failures<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Compliance considerations<\/h3>\n\n\n\n<ul 
class=\"wp-block-list\">\n<li>Ensure resource placement (region) matches data residency rules.<\/li>\n<li>Ensure encryption and access controls meet standards (ISO, SOC, PCI) required by your industry.<\/li>\n<li>Keep retention policies for logs and audits aligned with policy.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Common security mistakes<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Allowing SSH from <code>0.0.0.0\/0<\/code><\/li>\n<li>Exposing every scaled instance with a public IP<\/li>\n<li>Using overly permissive RAM policies for scaling operations<\/li>\n<li>Lack of audit trails for scaling configuration changes<\/li>\n<li>Storing secrets in user data<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Secure deployment recommendations<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Use a load balancer and private instances.<\/li>\n<li>Use lifecycle hooks to ensure instances are hardened before receiving traffic.<\/li>\n<li>Implement continuous vulnerability scanning on base images and patch pipelines.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">13. 
Limitations and Gotchas<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Always confirm current limits in the official docs and the Quotas console.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Regional scope<\/strong>: scaling groups are regional; cross-region scaling requires separate groups and traffic steering.<\/li>\n<li><strong>Instance boot time<\/strong>: scaling is not instantaneous; plan for minutes, not seconds.<\/li>\n<li><strong>Metric delay<\/strong>: CloudMonitor alarms evaluate over time windows; scaling reacts after thresholds are breached for a period.<\/li>\n<li><strong>Zone capacity constraints<\/strong>: some instance types may be unavailable in a zone during peak demand; multi-zone vSwitch selection helps but doesn\u2019t guarantee capacity.<\/li>\n<li><strong>Scale-in risk<\/strong>: terminating instances without draining can drop connections or lose in-flight work.<\/li>\n<li><strong>User data variability<\/strong>: bootstrap scripts differ by OS image and package manager; test before production.<\/li>\n<li><strong>Cost surprises<\/strong>:<\/li>\n<li>NAT Gateway and egress during frequent scale-out events<\/li>\n<li>Load balancer billing model (capacity units\/LCUs depending on product)<\/li>\n<li>Log volume growth<\/li>\n<li><strong>Quota bottlenecks<\/strong>: ECS quota or vCPU quota can prevent scale-out.<\/li>\n<li><strong>Configuration drift<\/strong>: if you update app code manually on instances, new instances won\u2019t match; use images and automation.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">14. 
Comparison with Alternatives<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Nearest services in Alibaba Cloud<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>ACK (Alibaba Cloud Container Service for Kubernetes)<\/strong> autoscaling:<\/li>\n<li>Horizontal Pod Autoscaler (HPA) and cluster autoscaler for containerized workloads<\/li>\n<li><strong>Serverless options<\/strong> (depending on workload):<\/li>\n<li>Function Compute for event-driven functions<\/li>\n<li>Serverless App Engine (SAE) for application-level autoscaling (verify suitability and current positioning)<\/li>\n<li><strong>Manual or scripted scaling<\/strong>:<\/li>\n<li>OOS\/ROS\/Terraform + scheduled workflows (less dynamic; more maintenance)<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Nearest services in other clouds<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>AWS: EC2 Auto Scaling<\/li>\n<li>Azure: Virtual Machine Scale Sets<\/li>\n<li>Google Cloud: Managed Instance Groups (MIG)<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Open-source\/self-managed alternatives<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Kubernetes Cluster Autoscaler + HPA (requires Kubernetes operations)<\/li>\n<li>Custom scripts\/cron jobs calling ECS APIs (high risk; requires careful engineering)<\/li>\n<\/ul>\n\n\n\n<h4 class=\"wp-block-heading\">Comparison table<\/h4>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<thead>\n<tr>\n<th>Option<\/th>\n<th>Best For<\/th>\n<th>Strengths<\/th>\n<th>Weaknesses<\/th>\n<th>When to Choose<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Alibaba Cloud Auto Scaling<\/td>\n<td>VM-based elastic fleets (ECS)<\/td>\n<td>Native ECS integration, policy-based scaling, alarms\/schedules<\/td>\n<td>VM boot time, requires careful health\/lifecycle design<\/td>\n<td>Stateless VM tiers, predictable + bursty workloads<\/td>\n<\/tr>\n<tr>\n<td>ACK autoscaling (Kubernetes)<\/td>\n<td>Containerized microservices<\/td>\n<td>Faster scaling at pod layer, rich autoscaling 
ecosystem<\/td>\n<td>Kubernetes operational overhead<\/td>\n<td>You already run Kubernetes and want pod + node elasticity<\/td>\n<\/tr>\n<tr>\n<td>Function Compute<\/td>\n<td>Event-driven workloads<\/td>\n<td>Very fine-grained scaling, no server management<\/td>\n<td>Not suitable for long-lived stateful apps; execution model constraints<\/td>\n<td>Spiky event processing, lightweight APIs, automation tasks<\/td>\n<\/tr>\n<tr>\n<td>SAE (verify current scope)<\/td>\n<td>App-level managed runtime<\/td>\n<td>Simplified ops, platform autoscaling<\/td>\n<td>Platform constraints and portability considerations<\/td>\n<td>You want PaaS-style scaling rather than managing VMs<\/td>\n<\/tr>\n<tr>\n<td>AWS EC2 Auto Scaling \/ Azure VMSS \/ GCP MIG<\/td>\n<td>Multi-cloud parity<\/td>\n<td>Mature ecosystems and deep integrations<\/td>\n<td>Different APIs and operational models<\/td>\n<td>You\u2019re standardizing across providers or migrating<\/td>\n<\/tr>\n<tr>\n<td>Self-managed scripts<\/td>\n<td>Highly custom environments<\/td>\n<td>Full control<\/td>\n<td>Highest maintenance and risk<\/td>\n<td>Only if managed services can\u2019t meet requirements<\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">15. 
Real-World Example<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Enterprise example (regulated SaaS with predictable peaks)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem<\/strong>: A regulated SaaS platform sees heavy weekday usage (9\u20136), with strict audit requirements and change control.<\/li>\n<li><strong>Proposed architecture<\/strong>:<\/li>\n<li>Two-zone VPC deployment in one region<\/li>\n<li>Load balancer in front of an Auto Scaling group for the web\/app tier<\/li>\n<li>CloudMonitor alarms for burst traffic<\/li>\n<li>Scheduled scaling to reduce capacity at night<\/li>\n<li>ActionTrail enabled; logs shipped to centralized logging<\/li>\n<li>Lifecycle hooks to ensure hardening + app readiness before traffic<\/li>\n<li><strong>Why Auto Scaling was chosen<\/strong>:<\/li>\n<li>Strong fit for stateless VM tier<\/li>\n<li>Auditability via scaling activities + ActionTrail<\/li>\n<li>Cost reduction using scheduled scale-in<\/li>\n<li><strong>Expected outcomes<\/strong>:<\/li>\n<li>Improved resilience to peak demand<\/li>\n<li>Lower compute cost overnight<\/li>\n<li>Fewer manual scaling incidents and better audit readiness<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Startup\/small-team example (consumer app with viral spikes)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem<\/strong>: A startup mobile app experiences unpredictable spikes from social media referrals; small team can\u2019t scale manually.<\/li>\n<li><strong>Proposed architecture<\/strong>:<\/li>\n<li>Single-region deployment initially (cost and simplicity)<\/li>\n<li>Auto Scaling group for API servers<\/li>\n<li>Basic CloudMonitor CPU alarms to scale<\/li>\n<li>Simple golden image with app baked in to reduce boot time<\/li>\n<li><strong>Why Auto Scaling was chosen<\/strong>:<\/li>\n<li>Minimal operational overhead compared to building custom scaling automation<\/li>\n<li>Pay-as-you-go elasticity aligns with uncertain demand<\/li>\n<li><strong>Expected 
outcomes<\/strong>:<\/li>\n<li>Better user experience during spikes<\/li>\n<li>Controlled cost during low usage<\/li>\n<li>A clear path to multi-zone and multi-region later<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">16. FAQ<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">1) <strong>Is Alibaba Cloud Auto Scaling the same as ECS?<\/strong><br\/>\nNo. ECS provides compute instances. Auto Scaling orchestrates the creation and release of ECS instances based on rules and triggers.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">2) <strong>Does Auto Scaling cost extra?<\/strong><br\/>\nOften the control-plane service is not billed separately, but you pay for the resources created (ECS, disks, bandwidth, load balancers, NAT, logs). Always verify current pricing in official sources.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">3) <strong>Is Auto Scaling regional or global?<\/strong><br\/>\nTypically regional. A scaling group runs in one region and can span zones within that region via multiple vSwitches.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">4) <strong>Can Auto Scaling scale across regions automatically?<\/strong><br\/>\nNot automatically as a single group. Multi-region elasticity requires separate scaling groups per region plus traffic management.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">5) <strong>What is the difference between desired, min, and max capacity?<\/strong><br\/>\n&#8211; <strong>Min<\/strong>: the lowest number of instances allowed<br\/>\n&#8211; <strong>Max<\/strong>: the highest number allowed<br\/>\n&#8211; <strong>Desired<\/strong>: the target number Auto Scaling tries to maintain (within min\/max)<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">6) <strong>What metrics should I use for scaling?<\/strong><br\/>\nCPU is a common starting point, but better signals include request rate, latency, queue depth, or custom app metrics. 
Choose metrics that correlate with user experience.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">7) <strong>How fast does scaling happen?<\/strong><br\/>\nVM provisioning and bootstrapping typically take minutes. Design with a warm baseline and avoid expecting instant response.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">8) <strong>How do I prevent scaling oscillation?<\/strong><br\/>\nUse cooldowns, longer evaluation windows, and thresholds with hysteresis (e.g., scale out at 60%, scale in at 20%).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">9) <strong>Can I run a scale-to-zero pattern?<\/strong><br\/>\nSometimes, but it depends on scaling group settings and your application requirements. Many production services keep min capacity &gt; 0 to avoid cold starts. Verify how min=0 behaves for your configuration.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">10) <strong>How do new instances get application code?<\/strong><br\/>\nCommon approaches:<br\/>\n&#8211; Bake code into a custom image<br\/>\n&#8211; Pull code\/artifacts at boot (slower, can increase egress)<br\/>\n&#8211; Use configuration management during a lifecycle hook<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">11) <strong>What happens if an instance fails health checks?<\/strong><br\/>\nAuto Scaling can replace unhealthy instances depending on health check configuration and integration. Verify which health signals are supported in your setup.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">12) <strong>Can Auto Scaling attach instances to a load balancer automatically?<\/strong><br\/>\nYes, in common architectures, but the exact supported LB products and configuration steps vary. 
Verify your load balancer type and supported integration.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">13) <strong>How do I audit who changed scaling settings?<\/strong><br\/>\nUse RAM for access control and <strong>ActionTrail<\/strong> for auditing changes and API calls.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">14) <strong>How do I troubleshoot failed scale-out events?<\/strong><br\/>\nCheck:<br\/>\n&#8211; Scaling activity error messages<br\/>\n&#8211; ECS quotas and instance type availability<br\/>\n&#8211; VPC\/vSwitch and security group configuration<br\/>\n&#8211; Image validity and user data scripts<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">15) <strong>Should I use Auto Scaling or Kubernetes autoscaling?<\/strong><br\/>\nIf your workload is VM-based and you want a simpler VM fleet model, Auto Scaling is often best. If you run containers and want pod-level scaling, Kubernetes autoscaling is usually a better fit (with added operational overhead).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">16) <strong>How do I roll out changes to scaling configuration safely?<\/strong><br\/>\nCreate a new scaling configuration (or update the launch template version), test it with a small scale-out, then gradually replace instances. Avoid mutating running instances manually.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">17) <strong>Can Auto Scaling use preemptible\/spot instances?<\/strong><br\/>\nIn some regions and configurations, yes. Support details and behaviors differ by product and region; verify in official docs.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">17. 
Top Online Resources to Learn Auto Scaling<\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<thead>\n<tr>\n<th>Resource Type<\/th>\n<th>Name<\/th>\n<th>Why It Is Useful<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Official documentation<\/td>\n<td>Alibaba Cloud Auto Scaling documentation: https:\/\/www.alibabacloud.com\/help\/en\/auto-scaling<\/td>\n<td>Primary, up-to-date reference for features, concepts, and limits<\/td>\n<\/tr>\n<tr>\n<td>Product overview<\/td>\n<td>Auto Scaling product page: https:\/\/www.alibabacloud.com\/product\/auto-scaling<\/td>\n<td>High-level capabilities and positioning<\/td>\n<\/tr>\n<tr>\n<td>Pricing hub<\/td>\n<td>Alibaba Cloud Pricing: https:\/\/www.alibabacloud.com\/pricing<\/td>\n<td>Starting point for current billing rules<\/td>\n<\/tr>\n<tr>\n<td>Cost estimation<\/td>\n<td>Alibaba Cloud Pricing Calculator: https:\/\/calculator.alibabacloud.com\/<\/td>\n<td>Build region-specific estimates without guessing<\/td>\n<\/tr>\n<tr>\n<td>CLI documentation<\/td>\n<td>Alibaba Cloud CLI: https:\/\/www.alibabacloud.com\/help\/en\/alibaba-cloud-cli\/latest<\/td>\n<td>Automate and verify scaling operations<\/td>\n<\/tr>\n<tr>\n<td>Audit<\/td>\n<td>ActionTrail documentation: https:\/\/www.alibabacloud.com\/help\/en\/actiontrail<\/td>\n<td>Track API calls and configuration changes for compliance<\/td>\n<\/tr>\n<tr>\n<td>Monitoring<\/td>\n<td>CloudMonitor documentation: https:\/\/www.alibabacloud.com\/help\/en\/cloudmonitor<\/td>\n<td>Build alarms and event-triggered scaling triggers<\/td>\n<\/tr>\n<tr>\n<td>IaC<\/td>\n<td>ROS documentation: https:\/\/www.alibabacloud.com\/help\/en\/resource-orchestration-service<\/td>\n<td>Repeatable infrastructure deployments<\/td>\n<\/tr>\n<tr>\n<td>IaC (community but widely used)<\/td>\n<td>Terraform Alibaba Cloud Provider (verify official): https:\/\/registry.terraform.io\/browse\/providers<\/td>\n<td>Practical automation patterns (verify resource support for Auto 
Scaling)<\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">18. Training and Certification Providers<\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<thead>\n<tr>\n<th>Institute<\/th>\n<th>Suitable Audience<\/th>\n<th>Likely Learning Focus<\/th>\n<th>Mode<\/th>\n<th>Website<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>DevOpsSchool.com<\/td>\n<td>DevOps engineers, SREs, cloud engineers<\/td>\n<td>DevOps tooling, cloud operations, automation fundamentals<\/td>\n<td>check website<\/td>\n<td>https:\/\/www.devopsschool.com\/<\/td>\n<\/tr>\n<tr>\n<td>ScmGalaxy.com<\/td>\n<td>Beginners to intermediate engineers<\/td>\n<td>SCM, CI\/CD practices, DevOps foundations<\/td>\n<td>check website<\/td>\n<td>https:\/\/www.scmgalaxy.com\/<\/td>\n<\/tr>\n<tr>\n<td>CLoudOpsNow.in<\/td>\n<td>Cloud operations practitioners<\/td>\n<td>CloudOps practices, operations and reliability<\/td>\n<td>check website<\/td>\n<td>https:\/\/www.cloudopsnow.in\/<\/td>\n<\/tr>\n<tr>\n<td>SreSchool.com<\/td>\n<td>SREs and platform teams<\/td>\n<td>Reliability engineering, monitoring, incident response<\/td>\n<td>check website<\/td>\n<td>https:\/\/www.sreschool.com\/<\/td>\n<\/tr>\n<tr>\n<td>AiOpsSchool.com<\/td>\n<td>Ops teams exploring AIOps<\/td>\n<td>Observability and AIOps concepts<\/td>\n<td>check website<\/td>\n<td>https:\/\/www.aiopsschool.com\/<\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">19. 
Top Trainers<\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<thead>\n<tr>\n<th>Platform\/Site<\/th>\n<th>Likely Specialization<\/th>\n<th>Suitable Audience<\/th>\n<th>Website<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>RajeshKumar.xyz<\/td>\n<td>DevOps\/cloud training content (verify offerings)<\/td>\n<td>Engineers seeking guided learning<\/td>\n<td>https:\/\/rajeshkumar.xyz\/<\/td>\n<\/tr>\n<tr>\n<td>devopstrainer.in<\/td>\n<td>DevOps training (verify course catalog)<\/td>\n<td>Beginners to working professionals<\/td>\n<td>https:\/\/www.devopstrainer.in\/<\/td>\n<\/tr>\n<tr>\n<td>devopsfreelancer.com<\/td>\n<td>Freelance DevOps services\/training (verify scope)<\/td>\n<td>Teams needing practical implementation help<\/td>\n<td>https:\/\/www.devopsfreelancer.com\/<\/td>\n<\/tr>\n<tr>\n<td>devopssupport.in<\/td>\n<td>DevOps support and enablement (verify scope)<\/td>\n<td>Ops\/DevOps teams needing hands-on guidance<\/td>\n<td>https:\/\/www.devopssupport.in\/<\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">20. 
Top Consulting Companies<\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<thead>\n<tr>\n<th>Company<\/th>\n<th>Likely Service Area<\/th>\n<th>Where They May Help<\/th>\n<th>Consulting Use Case Examples<\/th>\n<th>Website<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>cotocus.com<\/td>\n<td>Cloud\/DevOps consulting (verify offerings)<\/td>\n<td>Architecture, DevOps enablement, cloud operations<\/td>\n<td>Designing autoscaling architectures; implementing IaC and observability<\/td>\n<td>https:\/\/cotocus.com\/<\/td>\n<\/tr>\n<tr>\n<td>DevOpsSchool.com<\/td>\n<td>DevOps consulting and training (verify services)<\/td>\n<td>DevOps transformation, CI\/CD, operations practices<\/td>\n<td>Standardizing scaling patterns; operational playbooks; governance and tagging<\/td>\n<td>https:\/\/www.devopsschool.com\/<\/td>\n<\/tr>\n<tr>\n<td>DEVOPSCONSULTING.IN<\/td>\n<td>DevOps consulting (verify offerings)<\/td>\n<td>DevOps assessments, automation and operations<\/td>\n<td>Implementing Auto Scaling with monitoring and incident response workflows<\/td>\n<td>https:\/\/www.devopsconsulting.in\/<\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">21. 
Career and Learning Roadmap<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">What to learn before Auto Scaling<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Alibaba Cloud fundamentals: regions, zones, billing, quotas<\/li>\n<li><strong>ECS basics<\/strong>: images, instance types, security groups, disks<\/li>\n<li><strong>VPC basics<\/strong>: CIDR, vSwitches, routing, NAT, EIP<\/li>\n<li>Basic Linux administration and SSH<\/li>\n<li>Monitoring fundamentals: metrics, alerts, dashboards<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">What to learn after Auto Scaling<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Advanced observability:<\/li>\n<li>CloudMonitor advanced usage (custom metrics, alert tuning)<\/li>\n<li>Centralized logging (Log Service) and alerting pipelines<\/li>\n<li>Infrastructure as Code:<\/li>\n<li>ROS or Terraform patterns for scaling groups and networking<\/li>\n<li>Deployment strategies:<\/li>\n<li>Immutable images, canary\/blue-green on VM fleets<\/li>\n<li>Lifecycle hooks automation<\/li>\n<li>Reliability engineering:<\/li>\n<li>Capacity planning, SLOs, incident response<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Job roles that use it<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Cloud engineer<\/li>\n<li>DevOps engineer<\/li>\n<li>Site Reliability Engineer (SRE)<\/li>\n<li>Platform engineer<\/li>\n<li>Solutions architect<\/li>\n<li>Operations engineer<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Certification path (if available)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Alibaba Cloud certification programs change over time and vary by region. Check the official Alibaba Cloud certification portal for current options and whether Auto Scaling is explicitly covered in a track. 
Verify in official sources.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Project ideas for practice<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Build a two-tier app with Auto Scaling for the web tier and a managed database backend.<\/li>\n<li>Implement scheduled scaling (work hours) + alarm scaling (bursts).<\/li>\n<li>Add lifecycle hooks to run configuration management, register\/deregister from service discovery, and drain connections before termination.<\/li>\n<li>Implement cost dashboards by tags and produce a monthly elasticity report.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">22. Glossary<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Auto Scaling<\/strong>: A managed service that automatically adjusts compute capacity according to policies.<\/li>\n<li><strong>Scaling Group<\/strong>: A logical group defining where instances run and how many should run (min\/max\/desired).<\/li>\n<li><strong>Scaling Configuration<\/strong>: A definition of how to launch instances (image, instance type, security group, bootstrap).<\/li>\n<li><strong>Launch Template<\/strong>: A reusable ECS launch definition that may be used by Auto Scaling (verify exact integration in your account).<\/li>\n<li><strong>Scaling Rule<\/strong>: A defined action that changes capacity (add\/remove\/set).<\/li>\n<li><strong>Scheduled Task<\/strong>: A time-based trigger that runs a scaling rule.<\/li>\n<li><strong>Event-triggered Task<\/strong>: A trigger based on alarms\/metrics (often via CloudMonitor).<\/li>\n<li><strong>Cooldown<\/strong>: A waiting period to prevent rapid successive scaling actions.<\/li>\n<li><strong>Lifecycle Hook<\/strong>: A mechanism to pause scaling so you can run automation before completing launch\/termination.<\/li>\n<li><strong>Health Check<\/strong>: A mechanism to determine whether an instance should receive traffic or be replaced.<\/li>\n<li><strong>ECS<\/strong>: Elastic 
Compute Service (VM compute) on Alibaba Cloud.<\/li>\n<li><strong>VPC \/ vSwitch<\/strong>: Network isolation and subnet constructs used for instance placement.<\/li>\n<li><strong>RAM<\/strong>: Resource Access Management, Alibaba Cloud IAM service for identities and permissions.<\/li>\n<li><strong>ActionTrail<\/strong>: Alibaba Cloud audit logging for API calls.<\/li>\n<li><strong>CloudMonitor<\/strong>: Monitoring\/alarms service used for metrics and triggers.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">23. Summary<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Alibaba Cloud <strong>Auto Scaling<\/strong> is a Computing control-plane service that automatically adds and removes capacity\u2014most commonly <strong>ECS instances<\/strong>\u2014based on schedules and monitoring-driven events. It matters because it turns capacity management into a repeatable, policy-driven process that improves availability under load while reducing cost during idle periods.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In the Alibaba Cloud ecosystem, Auto Scaling fits best for <strong>stateless VM tiers<\/strong> behind a load balancer, integrated with <strong>VPC<\/strong> networking, <strong>CloudMonitor<\/strong> alarms, and governance\/audit tools like <strong>RAM<\/strong> and <strong>ActionTrail<\/strong>. Cost optimization comes from setting realistic min\/max bounds, combining scheduled and alarm-based scaling, and controlling indirect costs like NAT egress and log volume. 
Security depends on least-privilege RAM policies, private networking, controlled inbound exposure, and avoiding secrets in user data.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Use Auto Scaling when you need elastic VM fleets with operational guardrails; consider Kubernetes or serverless alternatives when you need faster scaling, platform-managed runtimes, or event-driven execution.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Next step: build a production-grade version of the lab using <strong>infrastructure-as-code (ROS\/Terraform)<\/strong>, add lifecycle hooks for safe rollout\/termination, and tune alarms to match real user-facing SLOs.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Computing<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2,5],"tags":[],"class_list":["post-17","post","type-post","status-publish","format-standard","hentry","category-alibaba-cloud","category-computing"],"_links":{"self":[{"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/posts\/17","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/comments?post=17"}],"version-history":[{"count":0,"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/posts\/17\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/media?parent=17"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/categories?post=17"},{"taxonomy":"post_tag","embeddable":true,
"href":"https:\/\/www.devopsschool.com\/tutorials\/wp-json\/wp\/v2\/tags?post=17"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}