This article demonstrates how to use FasterTransformer to accelerate inference on the ACK container service, using the Bloom7B1 model as an example.
This article uses the Bloom7B1 model as an example to demonstrate the distributed inference method for large language models in ACK.
This article explores how to implement KServe large model inference in Alibaba Cloud Container Service for Kubernetes (ACK).
This article describes how to deploy enterprise-level AI applications based on Alibaba Cloud Serverless Container Service.
This article discusses the need for traffic isolation in scenarios where abnormal Pod behavior affects service quality.
This article explores challenges faced by enterprises running AI and big data applications on Kubernetes, focusing on the decoupling of computing and storage architecture.
This article focuses on how Koordinator helps facilitate the sharing of CPU resources between different types of workloads.
This article describes how to build and run DeepSpeed distributed training tasks based on the cloud-native AI suite of ACK.
This article describes how to use OpenKruise to build automated O&M.
This article offers a quick start guide to help users get started with Seata Saga and delves into best practices for its usage.
This article discusses Java application challenges in the cloud era, GraalVM Native Image solutions, and the principles of GraalVM.
This article describes how the author solved the "Address not available" issue in a container environment.
This article introduces the observability features and practices of the latest version of Apache Dubbo3.
This article presents the combined capabilities of Higress and Nacos as a microservice gateway and discusses two emerging trends in microservice gateway development.
The Dubbo Triple Protocol has undergone a major upgrade, allowing for seamless connectivity between web and backend microservices over HTTP.
This article discusses the advantages, deficiencies, and broad market prospects of Dubbo and Proxyless Service Mesh.
This article will focus on the high-performance secrets behind the Triple protocol, including valuable performance tuning tools, techniques, and code implementations.
This article introduces the flexible multi-protocol design principles of Apache Dubbo.
This article introduces the design idea, exception handling, and practical use of TCC in Seata-go.
This article discusses the OpenYurt community's efforts to improve the user-friendliness of its cloud-native intelligent edge computing platform.