LLMs

Best Practices for Large Model Inference in ACK: TensorRT-LLM

This article uses the Llama-2-7b-hf model as an example to demonstrate how to deploy the Triton framework using KServe in Alibaba Cloud ACK.

Analyzing the Distributed Inference Process of vLLM and Ray from a Source Code Perspective

This article explores how to implement distributed inference with vLLM and Ray from a source code perspective.

Alibaba Progresses Towards Carbon Neutrality Goals and Digital Inclusion: 2024 ESG Report

Alibaba Group reduced carbon emissions from its own operations by 5.0% during the year ended March 31, 2024, according to its 2024 ESG report.

Alibaba AI Tool Creates Picture Books for Children with Autism

Over 50,000 people have used Alibaba's AI-powered tool to create picture books for children with autism since it launched in June 2024.

ACK Cloud Native AI Suite | Scaling Distributed Elastic Training for Large Models

The fifth episode of the ACK Cloud Native AI Suite series introduces how to perform large-scale distributed elastic training based on the ACK Cloud Native AI Suite.

Quick Start with the AI Model on Alibaba Cloud Model Studio

This article compares the capabilities and performance of Model Studio and the original Qwen model for image generation and text-based chatting.

Comprehensive Reviews of Alibaba Cloud Model Studio: Insights from Alibaba Cloud MVPs Worldwide

This article shares comprehensive reviews from Alibaba Cloud MVPs who tested and reviewed Alibaba Cloud Model Studio.

Introducing Alibaba Cloud Model Studio

This article provides an introduction to Alibaba Cloud Model Studio, along with its features and functionality.

Deploying Alibaba Cloud's Large Language Model (Tongyi Qianwen) with Command-Line and Graphical Interfaces

This article explores two ways to interact with the Tongyi Qianwen-7B model: one through a graphical user interface (GUI) and the other through a command-line interface (CLI).

Deploy a RAG-Based LLM Chatbot in EAS

This article describes how to deploy a RAG-based LLM chatbot and how to perform model inference.

E2E Development and Usage of LLM Data Processing + Model Training + Model Inference

This article describes how to use the large language model (LLM) data processing, model training, and model inference components provided by PAI to complete end-to-end LLM development and use.

Fine-Tuning a Llama3-8B Model in PAI DSW

This article describes how to fine-tune the parameters of a Llama 3 model in DSW to enable the model to better align with and adapt to specific scenarios.

Use QuickStart to Fine-Tune and Deploy Llama 2 Models

This article uses llama-2-7b-chat as an example to describe how to use QuickStart to deploy a model as a service in Elastic Algorithm Service (EAS) and call the service.

Quickly Deploy a Llama 3 Model in EAS

This article describes how to quickly deploy a Llama 3 model and use the deployed web application in Elastic Algorithm Service (EAS) of Platform for AI (PAI).

Quickly Deploy Open Source LLMs in EAS

This article describes how to deploy an LLM in EAS and call the model.

Optimizing GenAI Models: Approaches to Fine-Tuning and Quantization

This article explains how to fine-tune and quantize pre-trained language models.

Joe Tsai on Why Alibaba is All In on AI

Alibaba Chairman Joe Tsai spoke on the value and opportunities unleashed by artificial intelligence during J.P. Morgan's Global China Summit.

Optimizing Generative AI Models: Fine-Tuning and Quantization

This article explains how to fine-tune and quantize pre-trained language models.

Starter Guide | Build a RAG Service on Compute Nest with LLM on PAI-EAS and AnalyticDB for PostgreSQL in One Click

This tutorial describes how to build a RAG service using Compute Nest with LLM on Alibaba Cloud's PAI-EAS and AnalyticDB for PostgreSQL.

Alibaba Cloud's Model Studio Supports Llama3 Models

Alibaba Cloud's generative AI development platform Model Studio is now compatible with Llama3, the latest open-source LLM from Meta.