Architecture

Design

SPARKSEEHA provides a horizontally scaling architecture that allows SPARKSEE-based applications to handle larger read-mostly workloads.

SPARKSEEHA has been designed to minimize the work developers need to move from a single-node installation to a multi-node, HA-enabled installation. It does not require any change to the user application; enabling it is purely a matter of configuration.

To achieve this, several SPARKSEE slave databases work as replicas of a single SPARKSEE master database, as seen in the figure below. Thus, read operations can be performed locally on each node and write operations are replicated and synchronized through the master.

Figure 1.1: SPARKSEEHA Architecture

Figure 1.1 shows all components in a basic SPARKSEEHA installation:

How it works

Now that the pieces of the architecture are clear, the following sections explain how SPARKSEEHA uses these components in typical operations and scenarios.

Master election

The first time a SPARKSEE instance goes up, it registers itself with the coordinator service. The first instance to register becomes the master. If a master already exists, the new instance becomes a slave.
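The election rule can be illustrated with a minimal sketch. The `Coordinator` class and the node names below are hypothetical and exist only for this example; they are not part of the SPARKSEE API, and a real coordinator service would also have to deal with concurrent registrations and network failures.

```python
class Coordinator:
    """Hypothetical in-memory stand-in for the coordinator service."""

    def __init__(self):
        self.members = []  # instance names, in registration order

    def register(self, name):
        """Register an instance and return the role it is assigned."""
        self.members.append(name)
        # The first instance to register becomes the master;
        # every later instance becomes a slave.
        return "master" if len(self.members) == 1 else "slave"


coordinator = Coordinator()
roles = {name: coordinator.register(name) for name in ["node-a", "node-b", "node-c"]}
print(roles)
```

In this sketch only the order of registration decides the role, which matches the rule described above.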

Reads

As all SPARKSEE slave databases are replicas of the SPARKSEE master database, slaves can answer read operations by performing the operation locally. They do not need to synchronize with the master.

Writes

In order to preserve data consistency, write operations require slaves to be synchronized with the master. A write operation is as follows:

  1. A slave wishes to perform a write operation and sends it to the master.
  2. The master serializes the operation in the history log, performs the write, and replies to the slave once the write has completed successfully.
  3. The slave receives from the master an up-to-date list of the write operations it has missed, extracted from the history log, and applies them in addition to its original write. This step preserves the consistency of the database.
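The three steps above can be sketched as follows. This is a simplified model under stated assumptions, not the SPARKSEEHA implementation: the `Master` and `Slave` classes, the key-value data, and the `version` counter are all hypothetical, and the slave's own write is delivered as part of the list returned by the master rather than applied separately.

```python
class Master:
    """Hypothetical master: owns the data and the history log."""

    def __init__(self):
        self.data = {}
        self.history = []  # history log: ordered list of (key, value) writes

    def write(self, key, value, slave_version):
        # Step 2: serialize the operation in the history log, then apply it.
        self.history.append((key, value))
        self.data[key] = value
        # Reply with every logged write the slave has not applied yet,
        # which includes the write it has just requested.
        return self.history[slave_version:]


class Slave:
    """Hypothetical slave: keeps a local replica and forwards writes."""

    def __init__(self, master):
        self.master = master
        self.data = {}
        self.version = 0  # number of history-log entries already applied

    def write(self, key, value):
        # Step 1: send the write to the master.
        pending = self.master.write(key, value, self.version)
        # Step 3: apply the returned writes locally, in log order.
        for k, v in pending:
            self.data[k] = v
        self.version += len(pending)

    def read(self, key):
        # Reads are answered from the local replica only.
        return self.data.get(key)


master = Master()
s1, s2 = Slave(master), Slave(master)
s1.write("x", 1)
s2.write("y", 2)  # s2 also receives the earlier write to "x"
print(s2.read("x"), s2.read("y"))
print(s1.read("y"))  # None: s1 has not synchronized since its own write
```

The last line shows that, in this model, a slave only catches up when it synchronizes with the master, so a local read can return data that is older than the master's.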

If two slaves write to the same object at the same time, the result may be a lost update, just as can happen in a single-instance SPARKSEE installation when two different sessions write to the same object at the same time.
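A lost update can be shown with a small, purely illustrative example. The `store` dictionary and the `counter` object are hypothetical; the example only models the interleaving of two read-modify-write sequences, not any SPARKSEE call.

```python
store = {"counter": 10}  # hypothetical object held by the master

# Both slaves read the current value before either one writes.
seen_by_slave_a = store["counter"]
seen_by_slave_b = store["counter"]

# Each slave computes a new value from what it read and sends its write.
store["counter"] = seen_by_slave_a + 1  # slave A's write arrives first
store["counter"] = seen_by_slave_b + 5  # slave B's write overwrites it

print(store["counter"])  # 15, not 16: slave A's increment is lost
```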

Slave goes down

A failure in a slave during normal operation does not affect the rest of the system. However, if the slave goes down in the middle of a write operation, the behavior of the rest of the system depends on the use of transactions:

Slave goes up

When a SPARKSEE instance goes up, it registers itself with the coordinator. The instance will become a slave if there is already a master in the cluster.

If polling is enabled for the slave, it will immediately synchronize with the master to receive all pending writes. On the other hand, if polling is disabled, the slave will synchronize when a write is requested (as explained previously).
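The difference between the two settings can be sketched as follows. The classes and the `polling` flag are hypothetical names chosen for this example, and the sketch models only the synchronization performed when the slave starts, not any later polling.

```python
class Master:
    """Hypothetical master holding the history log of writes."""

    def __init__(self):
        self.history = []

    def pending_since(self, version):
        # Return the writes logged after the given position.
        return self.history[version:]


class Slave:
    """Hypothetical slave that may or may not synchronize at start-up."""

    def __init__(self, master, polling):
        self.master = master
        self.applied = []  # writes already applied locally
        if polling:
            # Polling enabled: synchronize immediately on start-up.
            self.synchronize()

    def synchronize(self):
        self.applied.extend(self.master.pending_since(len(self.applied)))

    def write(self, op):
        # A write always goes through the master and triggers a synchronization.
        self.master.history.append(op)
        self.synchronize()


master = Master()
master.history.extend(["w1", "w2"])  # writes made before the slaves start

eager = Slave(master, polling=True)
lazy = Slave(master, polling=False)
print(len(eager.applied), len(lazy.applied))  # 2 0

lazy.write("w3")
print(lazy.applied)  # ['w1', 'w2', 'w3']
```

With polling disabled, the slave in this model stays behind until its first write, at which point it receives all pending writes together with its own.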

Future work

This is the first version of SPARKSEEHA. Although it is fully operational, some important functionality needed to ensure complete high availability of the system is not yet available. Subsequent versions will focus on the following features:

Master goes down

A failure in the master leaves the system non-operational. Future versions will handle this scenario by automatically converting one of the slaves into a master.

Fault tolerance

A failure during the synchronization of a write operation between a master and a slave leaves the system non-operational. For instance, a slave could fail while performing a write operation enclosed in a transaction, or there could be a general network error.

Handling this scenario requires the master to be able to abort (roll back) a transaction. As SPARKSEE does not offer that functionality, these scenarios cannot currently be resolved. SPARKSEEHA will be able to handle them once SPARKSEE implements the required functionality.
