Stream Computing

Who Comes Next After the Rise of Data Warehouses and Data Lakes?

In this blog, two experts from Alibaba Cloud discuss the advantages that data warehouses and data lakes offer businesses that handle large, complex architectures.

Demo: How to Build Streaming Applications Based on Flink SQL

This article demonstrates how to use Flink SQL to integrate Kafka, MySQL, Elasticsearch, and Kibana to quickly build a real-time analysis application.

The Flink Ecosystem: A Quick Start to PyFlink

This article introduces PyFlink's architecture and provides a quick demo in which PyFlink is used to analyze CDN logs.

Flink 1.10 vs. Hive 3.0 - A Performance Comparison

This blog compares the performance of Flink 1.10 and Hive 3.0 on the 10 TB TPC-DS benchmark dataset, using 20 hosts to test three engines.

How to Optimize Duplicate Data Cleansing in PostgreSQL

This article evaluates different duplicate data cleansing techniques while considering several technical database issues in PostgreSQL.

Alibaba Risk Control Brain: Exploration and Practices in Big Data Applications

Read on to learn how Alibaba's Risk Control Brain works in big data applications.

The Power of AI: Why Taobao Knows Online Shoppers Better Than They Know Themselves

This post takes a deep dive into how Taobao Mobile's recommendation system was developed from the ground up.

How to Design a Storage Layer for Structured Data Storage Requirements

This article discusses some structured data storage requirements and presents the design that Alibaba Cloud Tablestore uses to meet them.

Apache Flink Fundamentals: State Management and Fault Tolerance

This article describes the basics of Flink state management, the different state types, and their use cases.

Apache Flink Fundamentals: Using Table API for Programming

This article covers three main topics: what the Table API is, how to use the Table API in code, and the latest information about the Table API.

Apache Flink Fundamentals: Five Modes of Client Operations

This article shares five ways to submit tasks in Flink, helping you improve your development skills and your operations and maintenance efficiency.

Apache Flink Fundamentals: Building a Development Environment and Configure, Deploy and Run Applications

This article is for new Flink users and those with a basic understanding of Flink. It focuses on the configuration steps and on guidelines for development and debugging.

How Flink's Application Powers Up eBay's Monitoring System

This post describes how Flink is currently used in eBay's monitoring system, Sherlock.IO, focusing on the impact it has had on that system.

Use Python API in Apache Flink

This article introduces the history of the Apache Flink Python API and discusses its architecture, development environment, and key operators.

Advanced Apache Flink Tutorial 1: Analysis of Runtime Core Mechanism

This article describes the core mechanism for running jobs in the Flink Runtime. It provides an overview of the Flink Runtime architecture and the basic job execution process.

Basic Apache Flink Tutorial: DataStream API Programming

This article reviews the basics of distributed stream processing and explores Flink development with the DataStream API through an example.

Basic Apache Flink Tutorial: SQL Programming Practice

This article is part of the Basic Apache Flink Tutorial series, focusing on Flink SQL programming practices using five examples.

Flink 1.9: Using SQL Statements to Read Data from Kafka and Write to MySQL

This article includes the code I demonstrated in my talk, "Flink SQL 1.9.0 Technologies and Best Practices," which sparked a lot of interest from the audience.

Implementing a Real-Time Data Warehouse with Flink

This article describes the development path, construction methods, and architecture of a data warehouse, and compares real-time and offline data warehouses.

Using Flink Connectors Correctly

This article describes Flink connectors, focusing on the basic working mechanism and usage of the Kafka connectors commonly used in production.