Hadoop

Zeppelin Notebook: An Important Tool for PyFlink Development Environment

This article introduces Zeppelin Notebook, a tool for the PyFlink development environment that can help users solve various problems.

Deploy and Run Azkaban on Alibaba Cloud

This article is a tutorial on how to run the open-source project Azkaban on Alibaba Cloud with ApsaraDB (Alibaba Cloud Database).

How Can We Defend against Multiple Intrusion Methods on Multiple Platforms When Lemon-Duck Is Continuously Active?

This article offers some insight into protection against botnets and other Internet threats.

An Overview of Alibaba Cloud's Comprehensive Cloud-Native Data Lake System

This article introduces the establishment of a cloud-native data lake system based on Alibaba Cloud OSS, Data Lake Formation (DLF), and various computing engines present in Alibaba Cloud.

How to Use JindoDistCp for Offline Data Migration to a Data Lake

This article discusses offline data migration to a data lake using JindoDistCp and explains how it improves migration performance in different scenarios.

JindoTable for Data Optimization and Query Acceleration in a Data Lake

The article briefly discusses Alibaba Cloud's JindoTable and explains how it solves the data management problems in a data lake.

EB-level Data Lake Based on OSS

This article briefly discusses data lake systems and their features, and describes the process of building data lake storage based on Alibaba Cloud OSS.

Efficient Data Lake Formation Based on JindoFS and OSS

This article explains the process of data lake formation based on Alibaba Cloud OSS and the JindoFS big data cache acceleration service.

The Discovery of a Promising Technology

In this article, Zhang Jianfeng, a veteran in the open-source community, explains how to use three key dimensions to evaluate whether a technology is worth learning.

Alluxio Deep Learning Practices - 1: Running PyTorch Framework on HDFS

This article demonstrates how Alluxio simplifies running the PyTorch framework on HDFS using the Kubernetes platform to drastically improve development efficiency.

How to Migrate Data From Hadoop to The Cloud?

This blog takes a deep dive into securely migrating data from Apache Hadoop to a cloud platform.

The Big Data Platform Behind Alibaba's E-Commerce Systems

This article looks at the big data platform that helped power last year's Double 11.

Setting up Spark on MaxCompute

This post provides a walkthrough on how to set up Spark on MaxCompute on Alibaba Cloud.

There's No Need for Hadoop: Analyze Server Logs with AnalyticDB

This article outlines how you can use Alibaba Cloud AnalyticDB to analyze server logs without needing to set up Hadoop.

Set up a Hadoop Cluster with Apache Ambari

In this tutorial, you will learn how to set up Hadoop and its components on a multinode cluster using Apache Ambari.

One-click Deployment of a Hadoop Distributed Cluster on Alibaba Cloud

Hadoop is an open-source distributed computing framework that processes data efficiently and scalably.

Eight Things You Should Know about Big Data

As a senior technical expert at Alibaba Group, I will share my thoughts on big data: its past, present, and future.

What's All Involved with Blink Merging with Apache Flink?

In January, Alibaba announced that Blink would become open source and contribute to Apache Flink's code. This has now come to fruition.

Diving into Big Data: Hadoop User Experience (Continued)

In this article, we continue with HUE, or Hadoop User Experience, an open-source web interface that makes many operations simpler and easier to complete.

Diving into Big Data: Hadoop User Experience

In this article, we explore HUE, or Hadoop User Experience, an open-source web interface that makes many operations simpler and easier to complete.