LakeHouse

Beyond Silos: How Unified Multimodal Analytics Is Redefining Data Infrastructure for the AI Era

Hologres 4.0 introduces HSAP 2.0—a unified multimodal analytics platform that consolidates OLAP, vector search, full-text retrieval, and AI processing into a single engine.

Apache Fluss vs. Apache Paimon: Two Engines for the Real-Time Lakehouse

Apache Fluss and Apache Paimon compared: Fluss delivers sub-second real-time data for Flink (reducing state bloat), while Paimon is a streaming lakehouse format with ACID guarantees and minute-level latency.

Introducing Fluss: Streaming Storage for Real-Time Analytics

Today, we are excited to introduce Fluss, a cutting-edge streaming storage system designed to power real-time analytics.

Real-Time Lakehouse Solutions: Apache Flink & Apache Paimon Integration

Alibaba Cloud presents key optimizations in Flink-Paimon real-time lakehouse architecture, including the Variant data type for efficient semi-structur...

Building a Unified Lakehouse for Large-Scale Recommendation Systems with Apache Paimon at TikTok

TikTok transitioned to a unified Lakehouse architecture, powered by Apache Paimon, to optimize large-scale recommendation models (LRMs) that utilize user behavior sequences.

Flink Materialized Table: Building Unified Stream and Batch ETL

Explore Apache Flink's Materialized Table for unified stream-batch ETL. Learn declarative data processing and overcome Lambda architecture challenges.

vivo's Lakehouse Integration Practice Based on Paimon

Discover vivo's real-world Lakehouse integration using Apache Paimon. Learn architecture design, performance optimization, and unified stream-batch processing.

Best Practices for Flink CDC YAML in Realtime Compute for Apache Flink

This article is authored by the data pipeline team of Alibaba Cloud's open-source big data platforms.

High-speed and Unified New Data Lakehouse Paradigm: Alibaba Cloud E-MapReduce Serverless StarRocks 3.x

This article is compiled from the first session of the EMR StarRocks online open class - EMR Serverless StarRocks3.

Apache Paimon: Streaming Lakehouse is Coming

This article is based on the keynote speeches given by LI Jinsong, WU Xiangping, DI Xingxing, and WANG Yunpeng during Flink Forward Asia 2023.

Using Apache Paimon + StarRocks High-speed Batch and Streaming Lakehouse Analysis

The article introduces the development history, main scenarios, technical principles, performance tests, and future plans of the StarRocks + Apache Paimon lakehouse analysis.

Integration of Paimon and Spark - Part 2: Query Optimization

This article introduces the integration of Paimon and Spark, specifically focusing on query optimization.

Building a Streaming Lakehouse: Performance Comparison Between Paimon and Hudi

This article compares the performance of Paimon and Hudi on Alibaba Cloud EMR and explores their respective roles in building quasi-real-time data warehouses.

Lakehouse: AnalyticDB for MySQL Ingests Data from Multiple Tables to Data Lakes with Flink CDC + Hudi

This article explores how AnalyticDB for MySQL uses Apache Hudi to ingest complete and incremental data from multiple CDC tables into data lakes.

Alibaba Cloud Open Data Platform and Service | Lakehouse of MaxCompute

In this episode, we introduce the idea of a lakehouse and the Alibaba Cloud Lakehouse of MaxCompute.

Analysis on the Serverless Elasticity of Cloud-Native AnalyticDB for MySQL

This article discusses AnalyticDB for MySQL Data Lakehouse Edition, with a focus on cost reduction and efficiency enhancement.

AnalyticDB for MySQL Data Lakehouse Edition: Build a Cloud-Native Comprehensive Data Analysis Platform from Lake to Warehouse

This article introduces AnalyticDB for MySQL Data Lakehouse Edition, its architecture, and its advantages.

Alibaba Cloud Cloud-Native Integrated Data Warehouse – An Interpretation of the New Capabilities of Lakehouse

This article discusses the overall updates to Lakehouse architecture.

Databricks Data Insight Open Course - An Introduction to Delta Lake (Commercial Edition)

This part of the Databricks Data Insight Open Course article series introduces Delta Lake Basics (Commercial Edition).

Alibaba Cloud LakeHouse: An Industry-Leading Next-Generation Big Data Platform of Alibaba Cloud to Integrate Data Warehouses and Data Lakes

This article gives an overview of the release of Alibaba Cloud LakeHouse, covering its benefits and how it accelerates the digital restructuring of enterprises.