Apache Spark

In-depth Review of Apache Spark: Spark + AI Summit 2020

Matei Zaharia, founder of the Spark project, gave an in-depth review of Spark at the Spark + AI Summit 2020 in conjunction with its 10-year anniversary.

The Discovery of a Promising Technology

In this article, Zhang Jianfeng, a veteran of the open-source community, explains how to evaluate whether a technology is worth learning along three key dimensions.

Using Apache Spark for Data Processing and Analysis

In this article, you will learn how to accelerate your data processing and analysis with Apache Spark Relational Cache, Mesos, Akka, Cassandra, and Kafka.

Spark-TFRecord: Toward Full Support of TFRecord in Spark

In this post, we introduce Spark-TFRecord, a new solution that enables support for the native TensorFlow data format in Spark.

Eight Things You Should Know about Big Data

As a senior technical expert at Alibaba Group, I will share my thoughts on big data: its past, present, and future.

Alibaba Cloud Security Team Discovers Apache Spark Rest API Remote Code Execution (RCE) Exploit

This article describes how Fengwei Zhang and the team at Alibaba Cloud Security discovered the first "in-the-wild" Spark REST API Remote Code Execution (RCE) vulnerability on July 7, 2018.

MaxCompute2.0 Performance Metrics: Faster, Stronger Computing

MaxCompute (originally ODPS) is a big data processing platform for batch storage and processing of structured data, providing large-scale data warehouse solutions and data modeling.

A Quick Guide to Analyzing Apache Logs on Alibaba Cloud Log Service

Alibaba Cloud Log Service offers several methods for collecting upstream data.

How to Create Virtual Cloud Desktop using Apache Guacamole

Apache Guacamole is a free, open-source web application that lets you access your dashboard using a modern web browser.