Batch Processing

Build an All-in-one Real-time Data Warehouse (Code-level) Based on AnalyticDB for PostgreSQL

This article introduces the process of building an all-in-one real-time data warehouse using AnalyticDB for PostgreSQL at the code level.

Introduction to Unified Batch and Stream Processing of Apache Flink

Unified batch and stream processing of Flink is a well-established concept in the stream computing field.

Understanding Batch Processing vs Stream Processing: Key Differences and Applications

Explore the differences between batch processing and stream processing, and how each is applied in data management to support better decision-making.

Data Lake for Stream Computing: The Evolution of Apache Paimon

Uncover the advancements from Apache Hive to Hudi and Iceberg in stream computing, as our expert navigates the transformative landscape of real-time data lakes.

Apache Flink Has Become the De Facto Standard for Stream Computing

This article is based on a keynote speech given by WANG Feng, initiator of Apache Flink Community China and head of Open-Source Big Data Platform at Alibaba Cloud, at Flink Forward Asia 2023.

Apache Flink Tutorial: Master Real-time Data Processing

Ready to dive into real-time data processing? Learn Apache Flink basics & set up with Alibaba Cloud's Realtime Compute for Apache Flink.

Understanding Stream Processing: Real-Time Data Analysis and Use Cases

Learn about stream processing, its applications, challenges, and Alibaba Cloud's Realtime Compute for Apache Flink solution for real-time data analysis.

What is Batch Processing?

Batch processing is a method of handling data where transactions are collected over a period and processed together as a group, or batch.

Announcement of the Release of Apache Flink 1.18

The Apache Flink PMC is pleased to announce the release of Apache Flink 1.18.0. As usual, we are looking at a packed release with a wide variety of improvements and new features.

Learning about Distributed Systems - Part 27: From Batch Processing to Stream Computing

Part 27 of this series discusses distributed systems in terms of throughput and latency.

Announcement of the Release of Apache Flink 1.17

Apache Flink, a leading stream processing standard, has released version 1.17.0, which includes new features and improvements.

Apache Flink Table Store 0.3.0 Release Announcement

The Apache Flink community has released version 0.3.0 of the Flink Table Store, which includes many new features and improvements.

More Than Computing: A New Era Led by the Warehouse Architecture of Apache Flink

Mowen discusses the future of Apache Flink regarding its core capabilities of stream computing and improving the processing standards of the entire industry.

Flink Remote Shuffle Open-Source: Shuffle Service for Cloud-Native and Unified Batch and Stream Processing

This article introduces the research and development background and the design and use of Flink Remote Shuffle.

How Idle Fish Uses RxJava to Improve the Asynchronous Programming Capability - Part1

Part 1 of this 2-part article introduces RxJava and explores its usage.

How Idle Fish Uses RxJava to Improve the Asynchronous Programming Capability - Part2

Part 2 of this 2-part article explores the basic principles and precautions of RxJava.

MaxCompute2.0 Performance Metrics: Faster, Stronger Computing

MaxCompute (originally ODPS) is a big data processing platform for batch storage and processing of structured data, providing large-scale data warehouse solutions and data modeling.

Evolution of the Real-time Data Warehouses of the Alibaba Search and Recommendation Data Platform

This article shares the results of explorations into real-time data warehouses focusing on the evolution and best practices for data warehouses based on Apache Flink and Hologres.

The New Major Features of Flink 1.11.0

One of the release managers of Flink 1.11.0 shares his deep insights into the long-awaited features and explains them from different perspectives.

Flink Is Attempting to Build a Data Warehouse Simply by Using a Set of SQL Statements

A discussion of how unifying batch and real-time processing in data warehouses can promote integrated computing.