Data Processing

Fluss: Redefining Streaming Storage for Real-time Data Analytics and AI

Explore Apache Fluss, the revolutionary streaming storage solution bridging traditional systems and lakehouse architectures for real-time data analytics and AI.

FLIP-9: Trigger Language - Apache Flink Rule Definition Guide

Learn about the FLIP-9 proposal for an Apache Flink trigger language. Discover why this rule language for Flink triggers was shelved an...

Apache Flink Broadcast Variable Optimization: FLIP-5's Approach to Reducing Network Overhead

Learn Apache Flink FLIP-5 broadcast variable optimization strategies for reducing network overhead. Discover performance lessons and modern scaling approaches for production stream processing.

Apache Flink FLIP-4: Enhanced Window Evictor for Flexible Data Eviction Before/After Processing

Master Apache Flink FLIP-4 enhanced window evictor for flexible data eviction. Learn real-time quality control, anomaly detection, and production window processing strategies.

Mastering Flink State Scaling: FLIP-8 Non-Partitioned State Management for Distributed Systems

Master Apache Flink FLIP-8 scalable non-partitioned state management. Learn dynamic scaling solutions, OperatorStateStore implementation, and state re...

Starting from o11y 2.0: The "More, Faster, Better, Cheaper" Approach to Big Data Pipelines

This article explains how Alibaba Cloud's Simple Log Service (SLS) provides a "more, faster, better, and cheaper" approach to big data pipelines to meet the demands of modern Observability 2.

Unlocking Data Value Without Compromise: Privacy-Enhancing Computation on Alibaba Cloud

The article introduces how Alibaba Cloud employs PEC to securely process data while maintaining privacy compliance and enhancing business collaboration.

Understanding Fluss Partial Update

Traditional streaming data pipelines often need to join many tables or streams on a primary key to create a wide view.

Apache Flink FLIP-7: Visualizing Monitoring Metrics in Web UI

Follow the Apache Flink® community's effort to make Flink metrics more accessible through Web UI visualization.

FlinkSQL Temporary Join Development

The article introduces the use of temporary joins in real-time development for matching traffic logs with product attributes.

Use iLogtail SPL to Process Logs: A Comprehensive Guide

This article describes the iLogtail plug-ins for data processing and how to write Simple Log Service (SLS) Processing Language (SPL) statements.

Alibaba Cloud Shares New Features of Apache Flink 2.0 at Flink Forward Asia

Alibaba Cloud highlighted the innovative features of the forthcoming Apache Flink 2.0 at Flink Forward Asia in Jakarta.

Use Cases for EMR Serverless Spark | Use EMR Serverless Spark to Submit a PySpark Streaming Job

This article introduces the usability and maintainability of EMR Serverless Spark in stream processing.

Comprehensive Upgrade of SLS Data Transformation Feature: Integrate SPL Syntax

The SLS data transformation feature is designed to handle unstructured log data. It has been fully upgraded to integrate SPL, enhance data processing capabilities, and reduce cost.

Seven Common Errors in ClickHouse Materialized View

This article discusses the principles and best practices of Materialized View, as well as common problems and their solutions that may arise during use.

Big Data Cloud Fighter Bootcamp

This article introduces the Big Data Cloud Fighters bootcamp, which provides an intensive, hands-on experience in mastering big data principles and technologies.

Distributed Pandas Processing with MaxCompute MaxFrame

This article introduces how to use common Pandas operators with MaxFrame.

Themepica Technology Enhances Data Processing Efficiency with Serverless

This article introduces how Beijing Themepica Technology Co., Ltd. leverages advanced cloud-native technologies to address challenges related to complex data processing procedures.

Understanding Batch Processing vs Stream Processing: Key Differences and Applications

Explore the differences between batch processing and stream processing, and their applications in data management for better decision-making.

Introduction to MaxCompute's Unified Near Real-time Data Processing Architecture

This article introduces how the new integrated offline and near real-time architecture based on MaxCompute supports comprehensive business scenarios.