LLMs

Qwen3.5-Max-Preview Unveiled!

Recently, Qwen3.5-Max-Preview, the preview of our next-generation flagship model, has made its debut on LM Arena.

Entering the AI-Native Era: How DAS Leverages LLMs to Revolutionize Database Autonomy

This article explores how DAS is democratizing expert-level database management for every enterprise.

AI: Friend or Foe? The New Cybersecurity Threat and Solutions

This article explains how generative AI is expanding the cybersecurity attack surface and outlines AI-driven strategies to defend AI systems and enterprise workflows.

Alibaba Reports Solid Progress in AI + Cloud on the Strength of Its Full-Stack Capabilities

Alibaba Group reported strong progress in AI for the December quarter, with accelerating revenue growth in the Cloud Intelligence Group and significant...

Deep Dive: How Kimi's AI Agent Runs on Alibaba Cloud

This article explains how Kimi leverages Alibaba Cloud's ACK and ACS to build a secure, instantly elastic infrastructure capable of supporting hundreds of thousands of concurrent AI Agent sandboxes.

Building a Production-Grade Cloud-Native Large Model Inference Platform with SGLang RBG + Mooncake

This article shows how SGLang RBG + Mooncake enable production-grade, cloud-native LLM inference with PD-disaggregation.

Self-Hosted GPU or Model-as-a-Service? A Strategic Guide for AI Leaders

This article offers a framework for choosing between self-hosted GPUs and MaaS for LLM inference by weighing cost, data, engineering, and scalability tradeoffs.

SysOM MCP: Open-Source Intelligent O&M Assistant for AI-Powered System Diagnostics

This article introduces SysOM MCP, an open-source O&M assistant that enables AI Agents to perform automated system diagnostics via natural language using MCP.

Alibaba Cloud Drives a More Sustainable, Efficient and Intelligent Olympic Experience at Milano Cortina 2026

Alibaba Group has supported the Olympic and Paralympic Winter Games Milano Cortina 2026 (Milano Cortina 2026) in becoming the most intelligent Games in Olympic history.

Caching is Efficiency: Achieving Precise LLM Cache Hits with Alibaba Cloud ACK GIE

This article introduces ACK GIE's precision-mode prefix cache-aware routing that maximizes KV-Cache hit rates for distributed LLM inference.

ACK One Fleet Multi-Cluster Canary Release: A "Safety Valve" for AI Inference Services

This article introduces ACK One Fleet's multi-cluster canary release solution, integrated with Kruise Rollout, for safe AI inference deployments across hybrid and geo-distributed clouds.

When Agents Meet Workflows—Can Intelligence Become More Controllable?

This article introduces how combining LLM Agents with deterministic Workflows like Argo enables controllable, production-ready AI systems.

Qwen3.5: Towards Native Multimodal Agents

We are delighted to announce the official release of Qwen3.5, introducing the first open-weight model in the Qwen3.5 series.

Qwen App's CNY Campaign Attracts Over 120 Million Orders

Qwen App, Alibaba’s consumer-facing AI application, has spurred a behavioral shift toward AI-powered shopping during its Chinese New Year (CNY) campaign.

UModel Data Governance: Practice of Building an O&M World Model

This article introduces UModel, Alibaba Cloud's ontology that transforms observability into a unified model-driven digital twin of IT systems.

Joe Tsai on the Future of Open-Source AI: Why Full-Stack Companies Will Excel

Alibaba Chairman Joe Tsai shares his perspective at the World Government Summit 2026 on why full-stack companies maintain an advantage as open-source AI providers.

Alibaba Brings Cloud-Based AI Innovation to the Olympic Winter Games Milano Cortina 2026

Alibaba Cloud is partnering with OBS and IOC to deploy advanced cloud and AI technologies for the Olympic and Paralympic Winter Games Milano Cortina 2026.

Hybrid Model Support | SGLang's Support Scheme for Hybrid Architecture Models like Mamba-Transformer

This article introduces a dual memory-pool inference framework enabling efficient hybrid Transformer-Mamba model execution by resolving conflicting caching mechanisms.

Alibaba Cloud Tair KVCache Implementation Based on 3FS: Enterprise-Grade Deployment, High-Availability Operations & Performance Optimization

This article introduces engineering optimizations to 3FS—KVCache's foundation layer—across performance, productization, and cloud-native management for scalable AI inference.

Dify Officially Launched the Nacos A2A Plugin, Completing Its Bidirectional Multi-agent Collaboration Capabilities

This article introduces Dify's Nacos A2A plugin, which enables bidirectional agent collaboration: discovering external A2A agents and exposing Dify apps as discoverable agents via the Nacos Registry.