Spark SQL

Integration of Paimon and Spark - Part I

This article introduces the main features in the new version of Paimon that are supported by the Spark-based computing engine.

Alibaba EMR April 15, 2024 3,321

Learning about Distributed Systems - Part 20: Rule-Based Optimization (RBO)

Part 20 of this series discusses another important SQL optimization method: rule-based optimization (RBO).

Alibaba Cloud_Academy July 25, 2023 2,313

Learning about Distributed Systems - Part 19: Performance-Impacting Operations in SQL

Part 19 of this series discusses SQL performance optimization.

Alibaba Cloud_Academy July 24, 2023 4,349

Best Practices for Big Data Processing in Spark

This article summarizes best practices for big data processing in Spark, drawn from a lecture.

Alibaba EMR October 12, 2021 3,801

In-depth Review of Apache Spark: Spark + AI Summit 2020

Matei Zaharia, founder of the Spark project, gave an in-depth review of Spark at the Spark + AI Summit 2020, marking the project's 10-year anniversary.

Alibaba EMR April 2, 2021 2,793

Rewriting the Execution Plan in the EMR Spark Relational Cache

This article goes through the process of rewriting execution plans in the Spark Relational Cache on EMR.

AdrianW August 22, 2019 21,721
