E-MapReduce

JindoFS: Computing and Storage Separation for Cloud-native Big Data

In this blog, we'll introduce the origins of JindoFS and discuss the problems its

In-depth Review of Apache Spark: Spark + AI Summit 2020

Matei Zaharia, founder of the Spark project, gave an in-depth review of Spark at the Spark + AI Summit 2020 in conjunction with its 10-year anniversary.

Big Data Made Simpler with E-MapReduce – Part 2

Part 2 of this 2-part series discusses E-MapReduce cluster management and how it works in real-world usage scenarios.

Big Data Made Simpler with E-MapReduce – Part 1

Part 1 of this 2-part series discusses how E-MapReduce provides a simple and highly effective big data practice.

EMR: An Efficient Cloud-native Data Analytics Engine

This blog explores the architecture and design goals of Alibaba Cloud E-MapReduce (EMR) and introduces two key components of EMR: JindoFS and .

Empowering Open-source Cloud Ecosystems: Development of Alibaba Cloud's Open-source Big Data Platform

This article discusses how Alibaba Cloud EMR empowers open-source cloud ecosystems from multiple perspectives.

What is Alibaba Cloud EMR?

This article illustrates the definition of EMR, its advantages, architecture, and benefits.

Storage Policies and Read/Write Optimization in JindoFS

This article describes common problems and optimization methods of data read/write in computing-storage separation scenarios, and introduces data cache acceleration with JindoFS.

A Comprehensive List of Big Data Processing Tools

This blog discusses the popular tools used in a big data system and shares some basic tips on building a distributed product roadmap.

Data Lake Acceleration in Data Lake Architecture

This article introduces the reasons for choosing data lake acceleration, and shares Alibaba Cloud's practical experience and technical solutions.

Build a Cloud Data Lake Using E-MapReduce

This article is based on the enterprise data lake construction solution using E-MapReduce and customer best practices shared by Ziguan.

So How Did Flink Double Its GitHub Stars in Just One Year?

Read on to see exactly what happened to Flink in 2019, in particular how Alibaba has contributed to Flink.

Architecture Evolution and Application Scenarios of Real-time Warehouses in the Cainiao Supply Chain

In this blog, we'll discuss the evolution of Cainiao's Flink implementation solution and supply chain data in terms of real-time data technology architecture.

OPPO's Use of Flink-based Real-time Data Warehouses

This article covers the evolution of the OPPO real-time data warehouse and the development of Flink SQL.

Netflix: Evolving Keystone to an Open Collaborative Real-time ETL Platform

This article briefly introduces Netflix's data platform team and its key product, Keystone.

Architecture Evolution and Practices of the Xiaomi Streaming Platform

This article discusses how Xiaomi leverages Apache Flink to build its streaming platform.

Meituan-Dianping's Use of Flink-based Real-time Data Warehouse Platforms

In this article, Lu Hao of Meituan-Dianping shares the company's practices using the Flink-based real-time data warehouse platform.

Architecture and Practices of Bilibili's Real-time Platform

This article introduces the architecture and practices of Bilibili's Saber real-time computing platform by considering the pain points of real-time computing.

Trillions of Bytes of Data Per Day! Application and Evolution of Apache Flink in Kuaishou

This article introduces the technical evolution of Apache Flink during its application in Kuaishou and Kuaishou's future plans regarding Apache Flink.

Lyft's Large-scale Flink-based Near Real-time Data Analytics Platform

This blog shares how Lyft built a large-scale near real-time data analytics platform based on Apache Flink.