Best Practices for Big Data Processing in Spark

This article gives an overview of best practices for big data processing in Spark, drawn from a lecture.

Bitmap-Based Data Processing in MaxCompute

This article includes a code example that shows how to encode and compute bitmaps of active user IDs from different dates using the MapReduce module of MaxCompute.

My Thoughts on Distributed Computing Frameworks

This article provides a fully verified solution (with code) to run LR and GBDT on a LibSVM-formatted dataset efficiently using TensorFlow.

How to Setup Hadoop Cluster Ubuntu 16.04

In this tutorial, we will learn how to set up Apache Hadoop as a single-node cluster on an Alibaba Cloud ECS instance running Ubuntu 16.04.