Large Language Model

One-Click Deployment of DeepSeek-V3 and DeepSeek-R1 Models

The Model Gallery offers vLLM or BladeLLM accelerated deployment features, enabling you to deploy the DeepSeek-V3 and DeepSeek-R1 series models with a single click.
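Once deployed, such a service can usually be called through an OpenAI-compatible API. The snippet below is a minimal sketch under that assumption; the endpoint URL, service token, and model name are placeholders, not values taken from the Model Gallery.

```python
# Minimal sketch: calling a deployed DeepSeek-R1 service through an
# OpenAI-compatible endpoint. URL, token, and model name are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://<your-eas-endpoint>/v1",  # placeholder service URL
    api_key="<your-service-token>",             # placeholder access token
)

response = client.chat.completions.create(
    model="DeepSeek-R1",  # placeholder model name
    messages=[{"role": "user",
               "content": "Explain retrieval-augmented generation in one paragraph."}],
    max_tokens=256,
)
print(response.choices[0].message.content)
```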

Accelerate Your Transformation in The GenAI-Era: Community Gathering Session Recap

This article delves into how the Qwen LLM stands out in performance, scalability, and versatility, making it an ideal choice for organizations looking to harness the power of generative AI.

Best Practices for Generating a Unit Test by Using Tongyi Lingma to Simplify Unit Testing

This article discusses what unit testing is, the value of unit testing, the principles of adequate unit testing, and how to write an adequate unit test.

Use EAS and Elasticsearch to Deploy a RAG-Based LLM Chatbot

This article describes the basic features provided by a RAG-based LLM chatbot and the special features provided by Elasticsearch.
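The retrieval step of such a chatbot can be sketched with the Elasticsearch Python client. The snippet below is only an illustration of the idea; the host, index name, field names, and the query vector are hypothetical placeholders, not the configuration used by the EAS deployment.

```python
# Minimal retrieval sketch for a RAG pipeline backed by Elasticsearch.
# Host, API key, index name, and field names are hypothetical placeholders.
from elasticsearch import Elasticsearch

es = Elasticsearch("https://<your-es-endpoint>:9200", api_key="<your-api-key>")

def retrieve(query_vector, k=3):
    """Return the text of the top-k chunks most similar to the query vector."""
    resp = es.search(
        index="kb_chunks",            # hypothetical index of document chunks
        knn={
            "field": "embedding",     # hypothetical dense-vector field
            "query_vector": query_vector,
            "k": k,
            "num_candidates": 50,
        },
        source=["text"],
    )
    return [hit["_source"]["text"] for hit in resp["hits"]["hits"]]

# The retrieved chunks are then prepended to the user question as context
# before the prompt is sent to the LLM service deployed in EAS.
```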

Quickly Build Your Own Chatbot System Using Alibaba Cloud Model Studio

In this article, we walk through how to build a chatbot using Alibaba Cloud Model Studio and Tongyi Qwen.

Strengthening Security in the AI Era: Alibaba Cloud Showcases Security Solutions for Diverse Cloud Environments

This article looks at how Alibaba Cloud is responding to the growing trend of organizations adopting multi-cloud and hybrid cloud environments, where data is distributed across various platforms.

Alibaba Cloud Drives AI Enhancements Across Industries in Asia

Alibaba Cloud continues to pioneer technology innovation across a diverse range of industries, from technology development, imaging, and travel to beauty and healthcare.

Use NVIDIA NIM to Accelerate LLM Inference in Alibaba Cloud ACK

This article introduces how to use the cloud-native AI suite to integrate the open-source inference service framework KServe and quickly deploy NVIDIA NIM in an ACK cluster.

Use PAI-Blade and TensorRT Plug-Ins to Optimize a RetinaNet Model

This article describes how to use PAI-Blade to optimize a detection model whose post-processing network is built by using TensorRT plug-ins.

Quickly Deploy a Multimodal Large Language Model in EAS

This article describes how to deploy and call MLLM inference services by using PAI-EAS.
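A multimodal service of this kind is typically called with both text and image inputs. The snippet below is a minimal sketch assuming the deployed service exposes an OpenAI-compatible chat API; the endpoint URL, token, model name, and image URL are placeholders rather than actual PAI-EAS values.

```python
# Minimal sketch: sending an image-plus-text request to an MLLM service,
# assuming an OpenAI-compatible chat API. All identifiers are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://<your-eas-endpoint>/v1",  # placeholder endpoint
    api_key="<your-service-token>",             # placeholder token
)

response = client.chat.completions.create(
    model="<deployed-mllm>",  # placeholder model name
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image in one sentence."},
            {"type": "image_url",
             "image_url": {"url": "https://example.com/sample.jpg"}},  # placeholder image
        ],
    }],
)
print(response.choices[0].message.content)
```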

Qwen2.5-LLM: Extending the Boundary of LLMs

In this blog, we delve into the details of our latest Qwen2.5 series language models.

Qwen2.5: A Party of Foundation Models!

This article introduces the latest addition to the Qwen family, Qwen2.5, along with specialized models for coding and mathematics.

Qwen2.5-Coder Series: Powerful, Diverse, Practical

The Alibaba Cloud Qwen Large Model team has officially open-sourced the full series of Tongyi Qianwen code models, consisting of six Qwen2.5-Coder models of different sizes.

Introducing Qwen2.5 Coder 32B Instruct | Qwen

This article introduces Qwen2.5 Coder 32B Instruct, the latest version of Qwen2.5 Coder from the Qwen team.

LLM Inference Acceleration: GPU Optimization for Attention in the Decode Phase (2)

This article briefly discusses how to further improve the computation performance of MMHA in the decode phase.

Unleashing the Power of Qwen, Alibaba Cloud Kicks Off Hong Kong Inter-School AIGC Competition

Alibaba Cloud’s first AI competition promotes tech inclusion and talent development in the city.

Alibaba Cloud’s AI Technology Sparks Breakthrough in RNA Virus Discovery

Alibaba Cloud researchers unveiled a deep-learning algorithm that uses artificial intelligence to detect RNA viruses.

LLM Inference Acceleration: GPU Optimization for Attention in the Decode Phase

This article introduces how the Attention in the decode phase is optimized on GPU based on RTP-LLM practices.

Observability of LLM Applications: Exploration and Practice from the Perspective of Trace

This article clarifies the technical challenges of observability by analyzing LLM application patterns and different concerns.

Fine-Tune a Language Model Using Alibaba Cloud's Platform for AI (PAI)

This step-by-step tutorial introduces how to fine-tune a language model using Alibaba Cloud's Platform for AI (PAI).
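As a rough illustration of what such fine-tuning involves, the sketch below uses the Hugging Face Transformers Trainer; it is a generic example, not the PAI-specific workflow from the tutorial, and the base model, dataset file, and hyperparameters are illustrative placeholders.

```python
# Generic causal-LM fine-tuning sketch with Hugging Face Transformers.
# NOT the PAI workflow; model name, data file, and settings are placeholders.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "Qwen/Qwen2.5-0.5B"  # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # ensure padding is defined
model = AutoModelForCausalLM.from_pretrained(model_name)

# Placeholder training corpus: one example per line in a plain-text file.
dataset = load_dataset("text", data_files={"train": "train.txt"})
tokenized = dataset["train"].map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```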