MapReduce

Alibaba Cloud Open Source Big Data Platform | E-MapReduce

In this episode, we introduce E-MapReduce (Elastic MapReduce), the Alibaba Cloud open source big data platform.

Learning about Distributed Systems – Part 18: Run AND Write Fast

Part 18 of this series explains how to improve application development efficiency on distributed systems.

Learning about Distributed Systems – Part 16: Solve the Performance Problem of Worker

Part 16 of this series discusses performance problems with worker (slave) nodes in MapReduce and whether there is room for improvement.

Best Practices for Big Data Processing in Spark

This article is an overview of the best practices for big data processing in Spark taken from a lecture.

Bitmap-Based Data Processing in MaxCompute

This article includes a code example that shows how to encode and compute bitmaps of active user IDs from different dates using the MapReduce module of MaxCompute.

My Thoughts on Distributed Computing Frameworks

This article provides a fully verified solution (with code) to run LR and GBDT on a LibSVM-formatted dataset efficiently using TensorFlow.

How to Setup Hadoop Cluster Ubuntu 16.04

In this tutorial, we will learn how to set up Apache Hadoop as a single-node cluster on an Alibaba Cloud ECS instance running Ubuntu 16.04.