Data Processing

Use Cases for EMR Serverless Spark | Use EMR Serverless Spark to Submit a PySpark Streaming Job

This article introduces the usability and maintainability of EMR Serverless Spark in stream processing.

Comprehensive Upgrade of SLS Data Transformation Feature: Integrate SPL Syntax

The SLS data transformation feature is designed to handle unstructured log data. It has been fully upgraded to integrate SPL, enhancing data processing capabilities and optimizing cost.

Seven Common Errors in ClickHouse Materialized View

This article discusses the principles and best practices of Materialized View, as well as common problems that may arise during use and their solutions.

Big Data Cloud Fighter Bootcamp

This article introduces the Big Data Cloud Fighters bootcamp, which provides an intensive, hands-on experience in mastering big data principles and technologies.

Distributed Pandas Processing with MaxCompute MaxFrame

This article introduces how to use common Pandas operators with MaxFrame.

Themepica Technology Enhances Data Processing Efficiency with Serverless

This article introduces how Beijing Themepica Technology Co., Ltd. leverages advanced cloud-native technologies to address challenges related to complex data processing procedures.

Understanding Batch Processing vs Stream Processing: Key Differences and Applications

Explore the differences between batch processing and stream processing, and their applications in data management for better decision-making.

Introduction to MaxCompute's Unified Near Real-time Data Processing Architecture

This article introduces how the new integrated offline and near real-time architecture based on MaxCompute supports comprehensive business scenarios.

E2E Development and Usage of LLM Data Processing + Model Training + Model Inference

This article describes how to use the Large Language Model (LLM) data processing, model training, and model inference components provided by PAI to develop and use an LLM end to end.

What is Change Data Capture (CDC)?

Change Data Capture (CDC) detects and captures data changes as they occur in source systems, such as databases or applications.

The Next Step of Flink CDC

This article is based on a keynote speech given by Jark Wu, head of Flink SQL and Flink CDC at Alibaba Cloud, during Flink Forward Asia 2023.

Understanding Stream Processing: Real-Time Data Analysis and Use Cases

Learn about stream processing, its applications, challenges, and Alibaba Cloud's Realtime Compute for Apache Flink solution for real-time data analysis.

What is Batch Processing?

Batch processing is a method of handling data where transactions are collected over a period and processed together as a group, or batch.

Kube Queue: A Powerful Tool for Kubernetes Task Queuing

This article discusses the importance and necessity of the task queue system, and details how Kube Queue defines its role and contribution in the current Kubernetes ecosystem.

MaxCompute Unleashed - Part 13: Global Zorder

This article introduces the support of Global Z-Order in MaxCompute.

MaxCompute Unleashed - Part 10: IF ELSE Branch Statement

Part 10 of the "Unleash the Power of MaxCompute" series introduces the script mode and parameterized views of MaxCompute.

MaxCompute Unleashed - Part 7: Grouping Set, Cube and Rollup

Part 7 of the "Unleash the Power of MaxCompute" series introduces MaxCompute's support for GROUPING SETS.

MaxCompute Unleashed - Part 3: Complex Type Functions

Part 3 of the "Unleash the Power of MaxCompute" series describes the complex type functions of MaxCompute.

MaxCompute Unleashed - Part 2: Basic Data Types and Built-in Functions

Part 2 of the "Unleash the Power of MaxCompute" series describes the basic data types and built-in functions of MaxCompute.

MaxCompute Unleashed - Part 5: SELECT TRANSFORM

Part 5 of the "Unleash the Power of MaxCompute" series introduces the support of MaxCompute for other scripting languages - SELECT TRANSFORM.