ACK Gateway

ACK Gateway with Inference Extension: A Practice for Optimizing Large Model Inference Service Deployed across Multiple Nodes

This article explains how to use ACK Gateway with Inference Extension to optimize the performance of large model inference services deployed across multiple nodes.

ACK Gateway with AI Extension: Intelligent Routing Practice for Kubernetes Large Model Inference

This article describes how to use the ACK Gateway with AI Extension plug-in to provide production-grade load balancing and intelligent routing for QwQ-32B models deployed in ACK clusters.

ACK Gateway with AI Extension: Model Canary Release Practice for Large Model Inference

This article focuses on canary releases of models once a large model inference service is deployed in the cloud, and on how to carry them out with ACK Gateway with AI Extension.