JDBC Support for Database Sharding (2024)

Modern web applications face new scalability challenges with huge volumes of data. A commonly accepted solution to this problem is sharding. Sharding is a data tier architecture, where data is horizontally partitioned across independent databases. Each database in such a configuration is called a shard. All shards together make up a single logical database, which is referred to as a sharded database (SDB). Sharding is a shared-nothing database architecture because shards do not share physical resources such as CPU, memory, or storage devices.

Sharding uses Global Data Services (GDS), where GDS routes a client request to an appropriate database based on parameters such as availability, load, network latency, and replication lag. A GDS pool is a set of replicated databases that offer the same global service. The databases in a GDS pool can be located in multiple data centers across different regions. A sharded GDS pool contains all shards of a sharded database and their replicas, and appears as a single sharded database to database clients.

Starting from Oracle Database 12c Release 2 (12.2.0.1), Oracle JDBC supports database sharding. The JDBC driver recognizes the specified sharding key and super sharding key and connects to the relevant shard that contains the data. Once a connection to a shard is established, database operations such as DML statements and SQL queries are supported and executed in the usual way. The following section describes the sharding terminology used in this guide:

A sharding key is a partitioning key used in single-level sharding by range, list, or consistent hash. All sharding keys together are referred to as the composite sharding keys. A super sharding key is the partitioning key used in composite sharding for the top-level sharding by range or list. Both the sharding key and the super sharding key can contain one or more columns that determine the shard where each row is stored. A sharding key can be of type VARCHAR2, CHAR, DATE, NUMBER, TIMESTAMP, and so on.
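The two-level lookup described above can be sketched in plain Java. This is only an illustration of the idea (a super sharding key chosen by list, then a sharding key placed by consistent hash), not how Oracle Sharding is implemented. The class name, the shard names, the CRC32 hash, and the number of virtual nodes are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

/** Illustrative router: list-based top level, consistent hash below it. */
class CompositeShardRouter {
    private static final int VIRTUAL_NODES = 16; // arbitrary choice for this sketch

    // Super sharding key value -> hash ring of the shards registered under it.
    private final Map<String, TreeMap<Long, String>> rings = new HashMap<>();

    /** Registers the shards that hold data for one super sharding key value. */
    void addShardspace(String superShardingKey, List<String> shards) {
        TreeMap<Long, String> ring = new TreeMap<>();
        for (String shard : shards) {
            // Several ring positions per shard smooth out the distribution.
            for (int v = 0; v < VIRTUAL_NODES; v++) {
                ring.put(hash(shard + "#" + v), shard);
            }
        }
        rings.put(superShardingKey, ring);
    }

    /** Picks the ring by super sharding key, then the shard by sharding key hash. */
    String route(String superShardingKey, String shardingKey) {
        TreeMap<Long, String> ring = rings.get(superShardingKey);
        if (ring == null || ring.isEmpty()) {
            throw new IllegalArgumentException("unknown super sharding key: " + superShardingKey);
        }
        // First ring position at or after the key's hash; wrap around if none.
        Map.Entry<Long, String> owner = ring.ceilingEntry(hash(shardingKey));
        return (owner != null ? owner : ring.firstEntry()).getValue();
    }

    private static long hash(String s) {
        CRC32 crc = new CRC32();
        crc.update(s.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    public static void main(String[] args) {
        CompositeShardRouter router = new CompositeShardRouter();
        router.addShardspace("EU", List.of("eu_shard1", "eu_shard2"));
        router.addShardspace("US", List.of("us_shard1", "us_shard2", "us_shard3"));
        System.out.println(router.route("EU", "alice@example.com"));
        System.out.println(router.route("US", "alice@example.com"));
    }
}
```

The same sharding key can land on different shards depending on the super sharding key, which is why both values are needed to locate a row in a composite configuration.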

JDBC users should pass sharding keys and super sharding keys when obtaining connections from the database. Sharding keys can also be provided in the connection string as a separate attribute under CONNECT_DATA, but doing so restricts the connections to a single shard, so this approach is not recommended. The following snippet shows how to provide sharding keys as a separate attribute under CONNECT_DATA in the connection string:

(DESCRIPTION=(…)(CONNECT_DATA=(SERVICE_NAME=ORCL)(SHARDING_KEY=…)(SUPER_SHARDING_KEY=...)))

Note:

You must provide the sharding key in a form that complies with the NLS formatting specified in the database.
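As a rough illustration of passing the keys when the connection is obtained, the sketch below uses the standard sharding interfaces that JDBC 4.3 (Java 9 and later) added to java.sql and javax.sql. Whether a particular driver or data source implements these methods is driver-specific (the default implementations throw SQLFeatureNotSupportedException), and the Oracle driver also ships its own key classes, which are not shown here. The class name and the choice of an e-mail address and a region as key columns are hypothetical.

```java
import java.sql.Connection;
import java.sql.JDBCType;
import java.sql.SQLException;
import java.sql.ShardingKey;
import javax.sql.DataSource;

class ShardedConnect {
    /**
     * Opens a connection routed by the given keys.
     * The key columns (e-mail as sharding key, region as super sharding key)
     * are hypothetical and must match the sharded table definition in practice.
     */
    static Connection connect(DataSource ds, String email, String region) throws SQLException {
        // Each key is built from one or more typed subkeys.
        ShardingKey shardingKey = ds.createShardingKeyBuilder()
                .subkey(email, JDBCType.VARCHAR)
                .build();
        ShardingKey superShardingKey = ds.createShardingKeyBuilder()
                .subkey(region, JDBCType.VARCHAR)
                .build();

        // The keys travel with the connection request instead of the URL.
        return ds.createConnectionBuilder()
                .shardingKey(shardingKey)
                .superShardingKey(superShardingKey)
                .build();
    }
}
```

Because the keys are supplied per request, the same data source can serve connections to different shards, which the connection-string approach cannot do.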

Multi Shard Queries

Multi Shard Queries enable routing and processing of queries and transactions that access data stored on multiple shards. Multi Shard Queries are executed without a sharding key. Multi Shard Operations are used for simple aggregation of data and reporting across shards.
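The aggregation pattern can be pictured as a scatter-gather loop: the same query runs on every shard and the partial results are merged. The sketch below simulates this with in-memory lists standing in for shards. It is a conceptual illustration only, with hypothetical names and data, and does not reflect how the multi shard query coordinator works internally.

```java
import java.util.List;
import java.util.Map;

class MultiShardAggregate {
    /**
     * Counts the orders with an amount of at least minAmount across all shards.
     * Each map entry stands in for one shard and its rows (order amounts).
     */
    static long countOrders(Map<String, List<Integer>> shards, int minAmount) {
        long total = 0;
        for (List<Integer> rows : shards.values()) {
            // Per-shard partial result: rows on this shard that match the predicate.
            long partial = rows.stream().filter(amount -> amount >= minAmount).count();
            // Merge step: a COUNT is combined by summing the partial counts.
            total += partial;
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> shards = Map.of(
                "shard1", List.of(10, 250, 40),
                "shard2", List.of(300, 5),
                "shard3", List.of(120));
        System.out.println(countOrders(shards, 100));
    }
}
```

No sharding key appears anywhere in the call, which matches the statement above that multi shard queries are executed without one.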

Shard Catalog

Shard Catalog is a special database that stores the configuration data of a sharded database and supports multi shard queries. It also helps in centralized management of a sharded database.

Shard Director

A shard director is a specific implementation of a global service manager (GSM) that acts as a regional listener for clients that connect to an SDB and maintains a current topology map of the SDB. Based on the sharding key passed during a connection request, it routes the connections to the appropriate shard.

Shard Topology

Shard Topology is the set of sharding key range mappings stored in a particular shard. Universal Connection Pool (UCP) can cache the shard topology, which enables it to bypass the shard director while establishing connections to shards. As a result, applications built with UCP get a fast path to the shards.
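To make the caching idea concrete, here is a minimal sketch of a client-side cache of key range mappings. A hit returns the shard directly, and a miss signals that the caller would have to ask the shard director. This is not UCP's implementation; the class, its methods, and the numeric ranges are assumptions made for the example.

```java
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

class ShardTopologyCache {
    /** One cached mapping: keys below upperExclusive belong to shard. */
    private static final class Range {
        final long upperExclusive;
        final String shard;

        Range(long upperExclusive, String shard) {
            this.upperExclusive = upperExclusive;
            this.shard = shard;
        }
    }

    // Lower bound of each range (inclusive) -> the rest of the mapping.
    private final TreeMap<Long, Range> ranges = new TreeMap<>();

    /** Records a mapping learned from an earlier connection. */
    void learn(long lowInclusive, long highExclusive, String shard) {
        ranges.put(lowInclusive, new Range(highExclusive, shard));
    }

    /** Returns the cached shard for a key hash, or empty on a cache miss. */
    Optional<String> lookup(long keyHash) {
        // Closest range whose lower bound is at or below the key hash.
        Map.Entry<Long, Range> candidate = ranges.floorEntry(keyHash);
        if (candidate != null && keyHash < candidate.getValue().upperExclusive) {
            return Optional.of(candidate.getValue().shard);
        }
        return Optional.empty(); // the caller would fall back to the shard director
    }

    public static void main(String[] args) {
        ShardTopologyCache cache = new ShardTopologyCache();
        cache.learn(0, 100, "shard1");
        cache.learn(100, 200, "shard2");
        System.out.println(cache.lookup(150)); // cached range
        System.out.println(cache.lookup(500)); // not cached yet
    }
}
```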

FAQs

Which database supports sharding?

Cassandra, HBase, HDFS, MongoDB, and Redis are databases that support sharding. SQLite, Memcached, ZooKeeper, MySQL, and PostgreSQL are databases that don't natively support sharding at the database layer. For databases that don't offer built-in support, sharding logic has to reside in the application.

What is the sharding key in JDBC?

A sharding key is the partitioning key that determines the shard where each row is stored. It can be of type VARCHAR2, CHAR, DATE, NUMBER, TIMESTAMP, and so on, and it must be provided in a form that complies with the NLS formatting specified in the database.

Does MySQL support sharding?

MySQL NDB Cluster automatically shards (partitions) tables across nodes, enabling databases to scale horizontally on low cost, commodity hardware to serve read and write-intensive workloads, accessed both from SQL and directly via NoSQL APIs.

Does PostgreSQL support sharding?

PostgreSQL has implemented sharding on top of partitioning by allowing any given partition of a partitioned table to be hosted by a remote server. The basis for this is PostgreSQL's Foreign Data Wrapper (FDW) support, which has been part of the PostgreSQL core for a long time.

What is the difference between sharding JDBC and sharding proxy?

Sharding-JDBC and Sharding-Proxy are components of Apache ShardingSphere. Sharding-JDBC adopts a decentralized architecture and suits high-performance, lightweight OLTP applications developed in Java. Sharding-Proxy provides a static entry point with support for all languages and suits OLAP applications and the management and operation of sharded databases.

Is sharding only for SQL?

No. Sharding is a core feature of many NoSQL databases, which are designed from the ground up to support horizontal scalability and distributed data storage.

Does MongoDB use sharding?

Sharding is a method for distributing data across multiple machines. MongoDB uses sharding to support deployments with very large data sets and high throughput operations. Database systems with large data sets or high throughput applications can challenge the capacity of a single server.

What are the best practices for sharding in MySQL?

Choose the right sharding key: The sharding key determines how data is distributed across shards. It should be carefully chosen to evenly distribute the data and avoid hotspots. Common sharding keys include user IDs, timestamps, or geographical locations.
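A quick way to reason about hotspots is to count how a candidate key would spread rows over the shards. The sketch below assumes simple modulo hashing on String.hashCode and a hypothetical skew measure (largest shard divided by the average shard). Real systems use their own placement functions, so treat it as a way to compare candidate keys, not as a model of MySQL behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ShardSkew {
    /** Counts how many of the given key values fall on each shard. */
    static int[] distribution(List<String> keys, int shardCount) {
        int[] counts = new int[shardCount];
        for (String key : keys) {
            // floorMod keeps the index non-negative for negative hash codes.
            counts[Math.floorMod(key.hashCode(), shardCount)]++;
        }
        return counts;
    }

    /** Largest shard relative to the average shard; 1.0 means perfectly even. */
    static double skew(int[] counts) {
        int max = 0;
        long sum = 0;
        for (int c : counts) {
            max = Math.max(max, c);
            sum += c;
        }
        if (sum == 0) {
            return 0.0;
        }
        return max / ((double) sum / counts.length);
    }

    public static void main(String[] args) {
        // Candidate 1: a distinct user ID per row.
        List<String> userIds = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            userIds.add("user-" + i);
        }
        // Candidate 2: the same date on every row, as with "today's" inserts.
        List<String> sameDay = Collections.nCopies(1000, "2024-01-01");

        System.out.println(Arrays.toString(distribution(userIds, 4)));
        System.out.println(Arrays.toString(distribution(sameDay, 4)));
    }
}
```

A key whose values repeat heavily, such as a date shared by all current writes, sends every row to one shard, which is the hotspot the advice above warns about.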

Can NoSQL databases be sharded?

Sharding can offer several benefits for NoSQL databases, such as scalability, performance, and availability. It can help scale the database horizontally by adding more servers or nodes as the data grows, thus reducing load and bottlenecks on a single server and increasing throughput and storage capacity.

Is PostgreSQL obsolete?

No. PostgreSQL itself is actively developed; only individual major versions reach end of life. According to the official PostgreSQL versioning policy page, the final PostgreSQL 11 release was scheduled for November 9, 2023. Since no new releases are planned after that date, PostgreSQL 11 has effectively reached its end of life.

Does neo4j support sharding?

An existing database can be sharded with the help of the neo4j-admin database copy command. For an example, see Sharding data with the copy command.

What is the difference between sharding and partitioning?

Sharding and partitioning are techniques to divide and scale large databases. Sharding distributes data across multiple servers, while partitioning splits tables within one server.

Does Oracle support sharding?

Oracle Sharding supports system-managed, user-defined, and composite sharding methods. System-managed sharding does not require you to map data to shards. The data is automatically distributed across shards using partitioning by consistent hash.

Does Cassandra support sharding?

Yes. Both Cassandra and MongoDB allow sharding, a technique to horizontally partition data across multiple nodes in a cluster.
