Distributed System

Interview Questions We've Learned Over the Years: The Distributed System

This article is part of a series on technical interview questions, with a specific emphasis on distributed systems.

Implementation Principles and Best Practices of Distributed Lock

This article explains the principles and best practices for distributed locks.

Learning about Distributed Systems - Part 27: From Batch Processing to Stream Computing

Part 27 of this series discusses distributed systems in terms of throughput and latency.

Learning about Distributed Systems - Part 26: HBase

Part 26 of this series introduces HBase and explains how it applies to random or range queries of massive data and how it can maintain multiple versions of data.

Learning about Distributed Systems – Part 25: Kylin in a New Way

Part 25 of this series introduces Apache Kylin and its concept of “space for time.”

Learning about Distributed Systems – Part 24: Massive Parallel Processing (MPP)

Part 24 of this series introduces massive parallel processing (MPP) and how it relates to the exploration of system extensibility.

Learning about Distributed Systems - Part 23: Distributed Data Warehouse

Part 23 of this series explains why offline data warehouses based on Hive and real-time data warehouses based on Kafka + Flink make it easy to distribute data warehouses.

Learning about Distributed Systems - Part 22: Adaptive Optimization

Part 22 of this series discusses whether there is a more flexible method to optimize SQL query performance than CBO.

Learning about Distributed Systems - Part 21: Cost-based Optimization

Part 21 of this series focuses on why there is a CBO and how it is implemented.

Learning about Distributed Systems - Part 20: Rule-Based Optimization (RBO)

Part 20 of this series discusses another important SQL optimization method: rule-based optimization (RBO).

Learning about Distributed Systems – Part 19: Performance-Impacting Operations in SQL

Part 19 of this series discusses SQL performance optimization.

Learning about Distributed Systems – Part 18: Run AND Write Fast

Part 18 of this series explains how to improve application development efficiency on distributed systems.

Learning about Distributed Systems - Part 17: Shuffle

Part 17 of this series introduces several possible Shuffle methods and their adoption in MapReduce and Spark.

Learning about Distributed Systems – Part 16: Solve the Performance Problem of Worker

Part 16 of this series discusses the performance problems of workers (slaves) in MapReduce and whether there is room for improvement.

Learning about Distributed Systems - Part 3: Solving Short Storage

As data grows exponentially, cloud servers often run out of space to store it. Luckily, distributed file systems like HDFS help solve the problem of insufficient storage.

Learning about Distributed Systems - Part 14: Causes of Inconsistency

Inconsistency is a prominent problem, and we have tried every means to solve it. We want high availability along with scalability.

Learning about Distributed Systems - Part 15: The Last Obstacle to Scalability

Part 15 of this series shows that distributed systems are not completely distributed, and discusses typical solutions to centralization problems and the performance problems of masters.

Learning about Distributed Systems - Part 2: The Interaction Between Open Source and Business

This is the second post in the distributed systems series. Today we look at the intriguing history of how academia and industry, and open source and business, get along with each other.

Learning about Distributed Systems - Part 4: Smart Ways to Store Data

Last time we talked about WHERE to store massive data, and this time, HOW. Massive data brings massive costs.

Learning about Distributed Systems – Part 10: An Exploration of Distributed Transactions

Part 10 of this series introduces several implementations of distributed transactions as a second preventive solution to data inconsistency.