Large Language Model

Caching is Efficiency: Achieving Precise LLM Cache Hits with Alibaba Cloud ACK GIE

This article introduces ACK GIE's precision-mode prefix cache-aware routing that maximizes KV-Cache hit rates for distributed LLM inference.

AliSQL Vector Technology Analysis (1): Storage Format and Algorithm Implementation

This article details the storage format and HNSW algorithm implementation behind AliSQL’s native vector indexing capability for high-dimensional AI workloads.

Alibaba Group Debuts “Wonder on Ice,” an Immersive AI Experience at Milan’s Sforza Castle for Milano Cortina 2026

Alibaba Group launched "Alibaba Wonder on Ice" (AWI) at Milano Cortina 2026, using AI and cloud computing to demonstrate next-gen virtual retail experiences.

Hybrid Model Support | SGLang's Support Scheme for Hybrid Architecture Models like Mamba-Transformer

This article introduces a dual memory-pool inference framework enabling efficient hybrid Transformer-Mamba model execution by resolving conflicting caching mechanisms.

Alibaba Cloud Tair KVCache Implementation Based on 3FS Enterprise-Grade Deployment, High-Availability Operations & Performance Optimization

This article introduces engineering optimizations to 3FS—KVCache's foundation layer—across performance, productization, and cloud-native management for scalable AI inference.

Alibaba Cloud to Debut AI-powered Pin Trading Experience in Olympic Village at Milano Cortina 2026

The Intelligent Pin Trading Station blends one of the Games’ best-loved traditions with voice- and gesture-enabled interaction.

Alibaba Cloud Tair Partners with SGLang to Build HiCache: Constructing a New Cache Paradigm for "Agentic Inference"

This article introduces HiCache, a hierarchical KVCache infrastructure developed by Alibaba Cloud Tair and SGLang to optimize performance and memory capacity for long-context "agentic" LLM inference.

Choosing the Right Qwen Model for Your Needs

The Qwen ecosystem is currently growing very rapidly, from Large Language Models (LLMs) to multimodal models that can understand text, images, video,...

Alibaba's Quark Launches AI Chat Assistant, Powered by Qwen Model

Quark AI Glasses joins 11.11 shopping festival to kick off online pre-sale in China starting October 24

Alibaba Cloud Launches Second Data Center in Dubai to Accelerate AI-powered Digitalization in the Middle East

Global cloud leader supports customers and partners including Wio Bank, ACCUMED, Byond Asia, The Game Company, and Atos to achieve business growth in .

Alibaba Cloud and Wio Bank Ink MoU to Accelerate AI-Powered Innovation across Middle East’s Finance Industry

Alibaba Cloud has signed an MoU with Wio Bank, the Middle East’s leading digital financial platform, to accelerate innovation across cloud computing, A...

Alibaba Cloud and CapitaLand Strengthen Collaboration to Drive Digital Transformation

Alibaba Cloud has further strengthened its longstanding collaboration with CapitaLand Group (CapitaLand), one of Asia’s largest diversified real estate groups.

Next Gen Applications: Qwen Agentic Deep Dive and Workflow Innovation Lab

The article explores Alibaba Cloud's Qwen LLM and Dify platform, showcasing their roles in developing intelligent AI systems for business automation.

The Best Practice of Moonshot AI in Massive Data Preprocessing for the Kimi Large Model

This article introduces how Moonshot AI uses Alibaba Cloud's solutions to enhance data preprocessing for its large model, Kimi, focusing on stability, resource elasticity, and efficient management.

BMW and Alibaba Deepen Strategic Partnership in China, Harnessing Qwen's AI Power to Redefine Intelligent In-Car Experiences

The BMW Group and Alibaba Group announced an expanded strategic partnership in China, accelerating the integration of Alibaba’s Qwen large language mo.

Alibaba Cloud Releases Qwen2.5-Omni-7B: An End-to-end Multimodal AI Model

Alibaba Cloud has launched Qwen2.5-Omni-7B, a unified end-to-end multimodal model in the Qwen series.

Alibaba Cloud's AI Revolution: Advancing the Frontier with Mixture of Experts (MoE), Advanced Reasoning Model, and End-to-end Multimodal Model

This article showcases Alibaba Cloud's innovative AI models that boost efficiency and integration across modalities, setting new standards in industri...

The Consumption of Tokens by Large Models Can Be Quite Ambiguous

This article discusses the challenges and strategies involved in managing resource consumption in large model applications.

Higress.ai Officially Launches: Effortlessly Unlock New AI Capabilities and Start Global Services

This article introduces Higress.ai, highlighting its official launch and the seamless integration of new AI capabilities.

Alibaba Cloud's Industry Leadership Recognized by Top Global Research Firms

Alibaba Cloud continues to solidify its standing as a global leader in cloud computing and artificial intelligence (AI).