Our cluster architecture uses Kubernetes to orchestrate containerized applications across a set of nodes. Multiple worker nodes run in a distributed setup for scalability and high availability, each handling a share of the workload. The control plane manages the cluster's overall state, while the worker nodes run the applications and services.

For fault tolerance, we replicate critical services across multiple nodes: if one node fails, its workload is automatically rescheduled onto another available node.

The main scalability challenges are balancing resource allocation across nodes, keeping performance consistent under varying loads, and managing stateful applications. Ensuring smooth failover and minimizing downtime during node failures has also been a continuous effort, requiring robust monitoring and automated recovery strategies. Despite these challenges, the architecture gives us high availability and lets us scale efficiently to meet increasing demand.
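As one illustration of the replication strategy described above, a Deployment along these lines (all names and the image are hypothetical, not our actual manifests) runs three replicas of a critical service and uses a topology spread constraint so the replicas land on different nodes; if a node fails, the control plane reschedules its pod onto a healthy node:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: critical-api          # hypothetical service name
spec:
  replicas: 3                 # multiple copies for fault tolerance
  selector:
    matchLabels:
      app: critical-api
  template:
    metadata:
      labels:
        app: critical-api
    spec:
      # Spread replicas across nodes so a single node failure
      # still leaves replicas running elsewhere.
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              app: critical-api
      containers:
        - name: api
          image: example.com/critical-api:1.0   # hypothetical image
```

With this in place, the automatic failover described above is handled by the scheduler: when a node becomes unavailable, its pods are recreated on the remaining nodes, subject to available capacity.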