Knative

Building a Large Language Model Inference Service Optimized by TensorRT-LLM Based on KServe on ASM

This article introduces how to deploy optimized LLM inference services in a cloud-native environment, using the TensorRT-LLM-optimized Llama-2-hf model as an example.

Best Practices for AI Model Inference Configuration in Knative

This article introduces best practices for deploying and configuring AI model inference in Knative, focusing on improving GPU utilization and enabling rapid scaling.

Hands-on Labs | Deploy an Enterprise-Class Elastic Stable Diffusion Service in ASK

This step-by-step tutorial shows how to deploy an enterprise-class elastic Stable Diffusion service in ASK.

Serverless Cost Optimization: Knative Supports Preemptible Instances

This article describes how to use preemptible instances in Knative.

Deploying Serverless Applications with ACK One and Knative for On-Premises Data Centers

This article introduces how to use ACK One and Knative to deploy serverless applications in on-premises data centers.

Deciphering Knative's Open-Source Serverless Framework: A Traffic Perspective

This article describes Knative's traffic management, traffic access, traffic-based elasticity, and monitoring.

Deciphering the Elastic Technology of Knative's Open-Source Serverless Framework

This article provides an in-depth look at how elasticity is implemented in Knative.

Higress: A Best Practice for Knative Ingress Gateway

This article describes how the network-layer capabilities of Knative are implemented, with Higress serving as the ingress gateway.

Deploy Enterprise-level AI Applications Based on Alibaba Cloud Serverless Container Service

This article describes how to deploy enterprise-level AI applications on Alibaba Cloud Serverless Container Service.

How to Quickly Deploy AI Inference Services Based on ACK Serverless

This article describes how to quickly deploy AI inference services on ACK Serverless.

Use ASM to Manage Knative Services (6): Auto Scaling Based on the Number of Traffic Requests

Part 6 of this 6-part series describes how to enable auto scaling of pods based on the number of requests.

Use ASM to Manage Knative Services (5): Canary Deployment of Services Based on Traffic in Knative on ASM

Part 5 of this 6-part series describes how to implement canary deployment of services based on traffic in Knative on ASM.

Use ASM to Manage Knative Services (4): Use the ASM Gateway to Access Knative Services over HTTPS

Part 4 of this 6-part series demonstrates how to use the ASM gateway to access Knative services over HTTPS.

Use ASM to Manage Knative Services (3): Use Custom Domain in Knative on ASM

Part 3 of this 6-part series describes how to set a custom domain name for Knative Serving.

Use ASM to Manage Knative Services (2): Use Knative on ASM to Deploy Serverless Applications

Part 2 of this 6-part series describes how to use Knative on ASM to create Knative services.

Use ASM to Manage Knative Services (1): An Overview of Knative on ASM

Part 1 of this 6-part series provides an overview of Knative on ASM.

How to Provide Production-level Stable Diffusion Services Based on Knative

This article discusses the challenges that the AI-Generated Content (AIGC) project Stable Diffusion faces in terms of limited processing capacity and scarce GPU resources.

Event-driven Practice of Knative: Trigger Events Through EventBridge

This article describes how to use an EventBridge event to trigger a Knative service, using a file upload to Object Storage Service (OSS) as the example event.

When Knative Meets WebAssembly

This article introduces the WAGI project and explains how it can combine WASM and WASI applications with Serverless frameworks.

Serverless Gateway Enhancement: Integrating Alibaba Cloud Knative with ALB

This article describes how to integrate ALB (Application Load Balancer) with Knative in Alibaba Cloud Container Service.