LLMs

AgentScope Java 2.0: Building a Distributed, Enterprise-Grade Foundation for AI Agents

This article introduces AgentScope Java 2.0, an open-source framework for building distributed, enterprise-grade AI agents with production-ready features.

Alibaba Launches Qwen3.7-Plus, AI Swine Diagnosis Assistant and Model Studio CLI

This article introduces the launch of the Qwen3.7-Plus multimodal model, an AI swine diagnosis assistant with Muyuan Group, and Model Studio's open-source CLI for AI agents.

Qwen-VLA: From Understanding the World to Acting in It

This article introduces Qwen-VLA, a general-purpose Vision-Language-Action model that extends multimodal perception and reasoning into continuous action generation for embodied intelligence.

Beyond 'Demo-Grade' Architecture: Building a Highly Available Production Foundation for Dify with SAE × SLS

This article introduces Alibaba Cloud SAE, a serverless platform that simplifies application modernization and accelerates AI deployment with zero node management.

DeepSeek V4-Flash at Scale: A Benchmark-Driven Deployment Guide

Choosing how to deploy a large language model in production is one of the most consequential — and confusing — decisions an AI team can make.

LoongCollector + ACS Agent Sandbox: Build a Production-grade AI Agent Runtime Platform

This article introduces a production-grade AI Agent runtime platform combining ACS Agent Sandbox for security and LoongCollector for observability.

Alibaba Cloud Tair KVCache Simulation Analysis: High-Precision Computational and Caching Simulation Design and Implementation

This article introduces Tair-KVCache-HiSim, a high-fidelity CPU-based simulator for optimizing multi-tier KV Cache configurations in LLM inference.

SGLang Hierarchical Sparse Attention

This article introduces hierarchical sparse attention: the full KV Cache is stored on the CPU, while the GPU keeps only a Top-k LRU Buffer.

Hybrid Model Support | SGLang's Support Scheme for Hybrid Architecture Models like Mamba-Transformer

This article introduces a dual memory-pool inference framework enabling efficient hybrid Transformer-Mamba model execution by resolving conflicting caching mechanisms.

Alibaba Cloud Tair KVCache Implementation Based on 3FS Enterprise-Grade Deployment, High-Availability Operations & Performance Optimization

This article introduces engineering optimizations to 3FS—KVCache's foundation layer—across performance, productization, and cloud-native management for scalable AI inference.

Apache RocketMQ for AI: Strategic Upgrade Ushers in the Era of AI MQ

This article introduces Apache RocketMQ's strategic evolution into an AI-native message engine for long-running sessions, intelligent compute scheduling, and agent collaboration.

Alibaba Cloud Launches New AI Model Subscription Service for Enterprises and Developers

Alibaba Cloud unveiled a new AI model subscription service specifically for enterprises and developers.

How to Make Agent-based Speech Interaction Stabler and Faster? A Practice of Optimizing High-Concurrency Message Links

This article explains how to build a stable, reliable, and efficient real-time speech message link architecture using the LiteTopic feature of ApsaraMQ for RocketMQ.

Qwen Conference 2026: A First Look at the Exhibition Highlights!

Qwen Conference 2026 invites you to step inside the innovation.

PolarDB One-Stop Memory Management Launched: Empowering AI Agents with Persistent, Cross-Session Memory

PolarDB for PostgreSQL introduces a one-stop memory management system that combines vector and graph databases to give AI agents persistent, cross-session memory.

Multi-Turn Agents, Single-Turn Traces? OpenClaw CMS Plugin 0.1.2 Released

This article introduces openclaw-cms-plugin 0.1.2, which enables accurate multi-turn tracing for AI agents by reconstructing ReAct execution flows and stabilizing concurrent observability.

Qwen-Scope: Decoding Intelligence, Unleashing Potential

We are excited to introduce Qwen-Scope, an interpretability toolkit trained on the Qwen3 and Qwen3.5 series models.

Building Cross-Cloud Observability: One Architecture, Unified Analytics

This article introduces a unified observability architecture for cross-cloud log analysis and AIOps, designed to streamline multicloud O&M and reduce costs for global enterprises.

Accepted by Top Conferences! Multiple Alibaba Cloud Achievements Improve O&M Intelligence Accuracy and Efficiency

This article introduces three top-conference-accepted research achievements by Alibaba Cloud that solve core AIOps challenges in data augmentation, se...

One Command Equips Your OpenClaw with an X-ray Machine - Alibaba Cloud Observability Makes Farming Lobsters Cheaper and Safer

One-command observability integration makes OpenClaw AI agent operations transparent via Alibaba Cloud monitoring plugins.