Distributed Computing

Data Lake for Stream Computing: The Evolution of Apache Paimon

Uncover the advancements from Apache Hive to Hudi and Iceberg in stream computing, as our expert navigates the transformative landscape of real-time data lakes.

Apache Flink Has Become the De Facto Standard for Stream Computing

This article is based on a keynote speech given by WANG Feng, initiator of Apache Flink Community China and head of Open-Source Big Data Platform at Alibaba Cloud, at Flink Forward Asia 2023.

Apache Flink Tutorial: Master Real-time Data Processing

Ready to dive into real-time data processing? Learn Apache Flink basics & set up with Alibaba Cloud's Realtime Compute for Apache Flink.

Learning about Distributed Systems – Part 10: An Exploration of Distributed Transactions

Part 10 of this series introduces several implementations of distributed transactions, a second way to prevent data inconsistency.

Learning about Distributed Systems – Part 9: An Exploration of Data Consistency

Part 9 of this series introduces the replica mechanism for high availability and discusses data consistency.

Learning about Distributed Systems - Part 8: Improve Availability with Replications

Part 8 of this series discusses one of the core problems of distributed systems: availability.

Storage Policies and Read/Write Optimization in JindoFS

This article describes common data read/write problems and optimization methods in compute-storage separation scenarios, and introduces data cache acceleration with JindoFS.

Data Lake: Concepts, Characteristics, Architecture, and Case Studies

This article provides deep insights into the data lake concept and compares some common solutions available in the market.

An Interpretation of PolarDB-X Source Codes (Extra): How to Implement a Paxos

This is an extra article from the 10-part series, discussing the engineering implementation of Paxos.

An Interpretation of PolarDB-X Source Codes (6): Distributed Deadlock Detection

Part 6 of this 10-part series focuses on the source code of the distributed deadlock detection function in PolarDB-X.

Store Huge Amounts of Structured and Semi-Structured Data with Alibaba Cloud Tablestore

This short article explains the benefits of Alibaba Cloud Tablestore.

How Does SchedulerX Help Users Solve Distributed Task Scheduling Problems?

This article describes the resource definition, visualized control capability, and distributed batch processing capability of the task scheduling platform.

Cloud-Native Operation and Maintenance Technology: Enhance Application Security in ASM with the “Zero-Trust Concept” and OPA

This article explains the zero-trust concept and how to use it to enhance application security in ASM.

Alibaba Cloud Launches Enterprise-Level Cloud-Native Data Lake during 2020 Double 11

This article reviews Alibaba Cloud's enterprise-level cloud-native data lake solution launched during the Double 11 festival and discusses its key benefits.

Basic Concepts and Intuitive Understanding of EPaxos

This article introduces the EPaxos algorithm in a simple, easy-to-understand way, suitable for readers with only basic knowledge of the Paxos or Raft algorithms.

Core Protocol Process of EPaxos: A Trilogy of EPaxos (Part Two)

This article introduces the core protocol process of EPaxos by comparing it with Paxos.

Alibaba Big Data Practices on Cloud-Native – EMR Spark on ACK

This article discusses the practices and challenges of EMR Spark on Alibaba Cloud Kubernetes.

Flink Course Series (4): Fault Tolerance in Flink

This article introduces the principles of Flink's fault tolerance mechanism, along with stateful stream computing, globally consistent snapshots, and Flink state management.

Flink Course Series (1): A General Introduction to Apache Flink

This article describes the basic concepts, importance, development, and current applications of Apache Flink.

The Value of MaxCompute: SaaS Cloud-based Data Warehouse

This post describes the core capabilities of MaxCompute and discusses its advantages through several use cases.