· Glossary  · 2 min read

What Is Sharding?

Learn what sharding is in database architecture. Sharding is the act of splitting a massive database into smaller pieces across multiple servers to handle scale.

When a database gets too big for a single server, you have two choices. You can buy a bigger server. Or you can split the database into pieces.

Sharding is the act of splitting.

Simple Definition

Sharding is a database architecture pattern built on horizontal partitioning. It splits the rows of one massive table into multiple smaller tables called “shards,” which are spread across multiple servers.

Imagine a phone book for the entire world. It is too heavy to lift. So you rip it in half. Volume 1 has A-M. Volume 2 has N-Z. You have just sharded the phone book.

Each shard holds a subset of the data. Collectively they hold the entire dataset.
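The phone book split can be sketched in a few lines of Python. This is only an illustration, assuming a two-shard layout keyed on the first letter of a surname. The names `shard_for_name`, `add_entry`, `lookup`, and `phone_book_shards` are made up for this example, and two in-memory dicts stand in for two servers.

```python
def shard_for_name(surname: str) -> int:
    """Return 0 for surnames starting with A-M, 1 for N-Z (hypothetical layout)."""
    first = surname.strip().upper()[:1]
    if not first or not ("A" <= first <= "Z"):
        raise ValueError(f"cannot route surname: {surname!r}")
    return 0 if first <= "M" else 1


# Two in-memory dicts stand in for two separate database servers.
phone_book_shards = [{}, {}]


def add_entry(surname: str, number: str) -> None:
    """Store the entry in the one volume that owns this surname."""
    phone_book_shards[shard_for_name(surname)][surname] = number


def lookup(surname: str):
    """Consult only the volume that can contain the name."""
    return phone_book_shards[shard_for_name(surname)].get(surname)


add_entry("Alvarez", "555-0101")  # lands in volume 1 (A-M)
add_entry("Nguyen", "555-0102")   # lands in volume 2 (N-Z)
```

Neither dict holds the whole phone book, but together they hold every entry.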

Why Shard?

We shard to handle massive scale.

Handling massive data

If you have 10 billion rows, queries become slow. Indexes get too big to fit in RAM.

By sharding, you distribute the load. Instead of one server searching 10 billion rows, you have 10 servers searching 1 billion rows each. This allows parallel processing.
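Here is a rough sketch of that parallel search, often called scatter-gather. It assumes toy in-memory lists in place of real shard servers, and the names `shards`, `count_on_shard`, and `count_everywhere` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Each list stands in for the rows held by one shard server (toy data).
shards = [
    [{"id": 1, "country": "DE"}, {"id": 2, "country": "US"}],
    [{"id": 3, "country": "US"}, {"id": 4, "country": "FR"}],
    [{"id": 5, "country": "US"}],
]


def count_on_shard(rows, country):
    """Work done locally by one shard: scan only its own rows."""
    return sum(1 for row in rows if row["country"] == country)


def count_everywhere(country):
    """Scatter the query to every shard in parallel, then add up the partial counts."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = pool.map(lambda rows: count_on_shard(rows, country), shards)
        return sum(partials)


total_us = count_everywhere("US")
```

Each shard scans only its own slice, and the caller merges the partial results. A query that filters on the shard key can skip the scatter step and go straight to one shard.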

It also increases write throughput. A single server can only write so fast. Ten servers can write up to ten times faster, as long as the writes are spread evenly across the shards.

Visualizing Shards

How do you draw this?

Multiple DB nodes in parallel

In a System Architecture Diagram you don’t draw one database cylinder. You draw a cluster.

You might label them “Shard 0 (Users 0-1M),” “Shard 1 (Users 1M-2M),” etc.

You also need to draw the Routing Layer. The application needs to know which shard to talk to. This logic often lives in the application code or a dedicated proxy. Visualizing that routing decision helps developers understand the complexity cost of sharding.
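To make the routing decision concrete, here is a minimal sketch of range-based routing by user ID. It assumes three shards of one million users each, with each range excluding its upper bound. The connection strings and the names `UPPER_BOUNDS`, `SHARD_DSNS`, `shard_index`, and `route` are invented for illustration.

```python
import bisect

# Exclusive upper bound of each shard's user ID range (hypothetical layout):
# shard 0 holds IDs 0..999_999, shard 1 holds 1_000_000..1_999_999, and so on.
UPPER_BOUNDS = [1_000_000, 2_000_000, 3_000_000]

# Made-up connection strings; a real deployment would load these from config.
SHARD_DSNS = [
    "postgresql://shard0.example.internal/users",
    "postgresql://shard1.example.internal/users",
    "postgresql://shard2.example.internal/users",
]


def shard_index(user_id: int) -> int:
    """Return the index of the shard whose range contains user_id."""
    if user_id < 0 or user_id >= UPPER_BOUNDS[-1]:
        raise ValueError(f"no shard covers user_id {user_id}")
    # bisect_right counts how many upper bounds are <= user_id,
    # which is exactly the index of the owning shard.
    return bisect.bisect_right(UPPER_BOUNDS, user_id)


def route(user_id: int) -> str:
    """Return the connection string the application should use for this user."""
    return SHARD_DSNS[shard_index(user_id)]
```

Whether this lookup lives in application code or in a proxy, every query has to pass through it first. That extra hop is the complexity cost worth showing on the diagram.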

To understand scaling, you need these terms:

  • Partitioning: The general concept of splitting data. Sharding is a specific type of horizontal partitioning.
  • Replication: Copying the same data to multiple servers for redundancy and availability. Sharding, by contrast, splits different data across multiple servers for scale.
  • Shard Key: The specific value (like UserID or Region) used to decide which shard a row belongs to.
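One common way to turn a shard key into a shard number is to hash it and take the remainder. The sketch below assumes a fixed cluster of four shards and string keys. `NUM_SHARDS` and `shard_for_key` are hypothetical names, and SHA-256 is simply one stable hash that ships with Python.

```python
import hashlib

NUM_SHARDS = 4  # assumed cluster size for this example


def shard_for_key(shard_key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a shard key (e.g. a UserID) to a shard number in [0, num_shards)."""
    # Python's built-in hash() is randomized per process for strings,
    # so a stable hash is used to get the same answer on every server.
    digest = hashlib.sha256(shard_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


shard = shard_for_key("user-42")  # the same key always maps to the same shard
```

Hashing spreads keys more evenly than ranges do, but changing `num_shards` remaps most keys. That is one reason some systems use consistent hashing or a lookup table instead.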

For more on visualizing database architectures check out our System Design Guide.
