Shuffle

Learning about Distributed Systems - Part 17: Shuffle

Part 17 of this series introduces several possible Shuffle methods and their adoption in MapReduce and Spark.

Sort-Based Blocking Shuffle Implementation in Flink – Part 2

Part 2 of this 2-part series will give you insight into some core design considerations and implementation details of the sort-based blocking shuffle in Flink.

Sort-Based Blocking Shuffle Implementation in Flink – Part 1

Part 1 of this 2-part series will introduce the sort-based blocking shuffle, present benchmark results, and provide guidelines on how to use this new feature.

Revealing DAG – MaxCompute Execution Engine Core Technology

This article explains the core ideas and design of DAG.

Jingdong: Flink SQL Optimization Practice

This article describes Jingdong's optimization measures for Flink SQL tasks, covering shuffle, join mode selection, object reuse, and UDF reuse.