Distributed Computing

The Secrets Behind the Optimized SQL Performance of EMR Spark

Lin Xuewei, a technical expert, gives an overview of the latest performance and efficiency optimizations that were made to TPC-DS Perf after its third submission.

Alibaba Cloud Summit - Making a Data Warehouse that Integrates Distributed, Elastic Computing, and Cloud Computing

Think about this…by 2020, there will be 40 ZB of data in the world. Can AnalyticDB transform into the ultimate form of a data warehouse?

Alibaba Cloud E-MapReduce Sets World Record Again on TPC-DS Benchmark

This year, EMR increased its computing speed to 2.2 times that of last year, breaking the world record again in the big data sector.

The Journey of an SQL Query in the MaxCompute Distributed System

This article introduces the MaxCompute computing platform and describes how to build an enterprise-grade distributed intelligent scheduling execution framework.

Training Facial Recognition Algorithms with Alibaba's Mars

This article shows you how to use Alibaba's open source Mars to implement facial recognition algorithms.

Mars – Matrix-based Universal Distributed Computing Framework

In this article, we discuss how Mars can help researchers in the scientific computing field solve large-scale multidimensional matrix operations.

PyCon China 2018: In-Depth Analysis of Mars

This article shares the "What, Why, and How" of Mars, presented at the PyCon China 2018 conference in Beijing, Chengdu, and Hangzhou.

Mars – Alibaba's Open Source Distributed Scientific Computing Engine

Mars is Alibaba's first open source and independently developed computing engine for large-scale scientific computing.

Using Hive in Apache Flink 1.9

This article describes the integration of Hive with Apache Flink 1.9.0 and discusses this feature from the perspectives of design architecture, the latest progress, and usage instructions.

Why Apache Flink 1.9.0 Support for Python API is a Game Changer

In this blog, we'll take a closer look at Apache Flink 1.9.0, including its new machine learning interfaces and Flink-Python modules.

Eight Things You Should Know about Big Data

As a senior technical expert at Alibaba Group, I will share my thoughts on big data: its past, present, and future.

My Thoughts on Distributed Computing Frameworks

This article provides a fully verified solution (with code) to run LR and GBDT on a LibSVM-formatted dataset efficiently using TensorFlow.

Interview with the Creator of Redisson – Building an Open Source Enterprise Redis Client

At the recent RedisConf 2018, Alibaba Cloud spoke with Redisson creator Rui Gu about his journey building an open source enterprise Redis client for the community.

Optimizing Complex Data Distribution in MaxCompute

In this article, we introduce data distribution and explain some new optimization measures in Alibaba Cloud MaxCompute.

How to Execute Mars in a Distributed Manner

This article introduces the distributed execution architecture implemented by Alibaba's open source Mars.

Combining Redis with Hadoop and ELK for Big Data

Redis is now a major component used in many Big Data applications. Redis is a favorable alternative to traditional relational database services becaus.

DRDS Read-only Instance for Complex SQL Queries

Do you face difficulties when running complex SQL queries? Solve your problems with Alibaba Cloud DRDS read-only instances.

MongoShake – A MongoDB-based Cross-Data Center Data Replication Platform

In this article, we will introduce MongoShake, a general platform service written in Golang.