Data Processing

Apache Flink Broadcast Variable Optimization: FLIP-5's Approach to Reducing Network Overhead

This article is part of the Technical Insights Series by Perry Ma | Product Lead, Real-time Compute for Apache Flink at Alibaba Cloud.

Apache Flink FLIP-4: Enhanced Window Evictor for Flexible Data Eviction Before/After Processing

This article is part of the Technical Insights Series by Perry Ma | Product Lead, Real-time Compute for Apache Flink at Alibaba Cloud.

FlinkSQL Temporary Join Development

The article introduces the use of temporary joins in real-time development for matching traffic logs with product attributes.

Use iLogtail SPL to Process Logs: A Comprehensive Guide

This article describes the iLogtail plug-ins for data processing and how to write Simple Log Service (SLS) Processing Language (SPL) statements.

Alibaba Cloud Shares New Features of Apache Flink 2.0 at Flink Forward Asia

Alibaba Cloud highlighted the innovative features of the forthcoming Apache Flink 2.0 at Flink Forward Asia in Jakarta.

Use Cases for EMR Serverless Spark | Use EMR Serverless Spark to Submit a PySpark Streaming Job

This article introduces the usability and maintainability of EMR Serverless Spark in stream processing.

Comprehensive Upgrade of SLS Data Transformation Feature: Integrate SPL Syntax

The SLS data transformation feature is designed to handle unstructured log data. It has been fully upgraded to integrate SPL, enhancing its data processing capabilities and reducing its cost.

Seven Common Errors in ClickHouse Materialized View

This article discusses the principles and best practices of Materialized View, as well as common problems and their solutions that may arise during use.

Big Data Cloud Fighter Bootcamp

This article introduces the Big Data Cloud Fighters bootcamp, which provides an intensive, hands-on experience in mastering big data principles and technologies.

Distributed Pandas Processing with MaxCompute MaxFrame

This article introduces how to use common Pandas operators with MaxFrame.

Themepica Technology Enhances Data Processing Efficiency with Serverless

This article introduces how Beijing Themepica Technology Co., Ltd. leverages advanced cloud-native technologies to address challenges related to complex data processing procedures.

Understanding Batch Processing vs Stream Processing: Key Differences and Applications

Explore the differences between Batch Processing vs Stream Processing and their applications in data management for better decision-making.

Introduction to MaxCompute's Unified Near Real-time Data Processing Architecture

This article introduces how the new offline near real-time integrated architecture based on MaxCompute supports comprehensive business scenarios.

E2E Development and Usage of LLM Data Processing + Model Training + Model Inference

This article describes how to use the Large Language Model (LLM) data processing, model training, and model inference components provided by PAI to develop and use an LLM end to end.

What is Change Data Capture (CDC)?

Change Data Capture (CDC) detects and captures data changes as they occur in source systems, such as databases or applications.

The Next Step of Flink CDC

This article is based on a keynote speech given by Jark Wu, head of Flink SQL and Flink CDC at Alibaba Cloud, during Flink Forward Asia 2023.

Understanding Stream Processing: Real-Time Data Analysis and Use Cases

Learn about stream processing, its applications, challenges, and Alibaba Cloud's Realtime Compute for Apache Flink solution for real-time data analysis.

What is Batch Processing?

Batch processing is a method of handling data where transactions are collected over a period and processed together as a group, or batch.

Kube Queue: A Powerful Tool for Kubernetes Task Queuing

This article discusses the importance and necessity of the task queue system, and details how Kube Queue defines its role and contribution in the current Kubernetes ecosystem.

MaxCompute Unleashed - Part 13: Global Zorder

This article introduces the support of Global Z-Order in MaxCompute.