Partitioned graph

Wendy Chen
CPSC 426 Distributed Systems, Fall 2016

The following service description is adapted from the lab specification:

Service description

In this lab, we partition the graph to attain scalability. We use a simple modulo function to determine which partition serves which node ids. In our case, we support three partitions. We use the function (node_id mod 3) + 1 = partition_number. Thus, nodes 0, 3, 6 belong on partition 1; nodes 1, 4, 7 on partition 2; notes 2, 5, 8 on partition 3. For requests involving two node ids, the request will be sent to the lower numbered partition.

For any client request, the partition involved must be locked. If multiple partitions are involved, the partitions must be locked in order of increasing partition number. RPC is used for inter-node communication. After a partition has mutated its graph, it may be unlocked.

Unlike the previous lab, shortest_path and remove_node to not need to be supported.

To start the server, the user should run the following:
./cs426_graph_server <graph_server_port> -p <partnum> -l <partlist>

The -p flag is used to specify which address in the <partlist> belongs to the partition. Each entry in the partition list is in the form IP:rpc_server_port.

Compilation details

Like the previous lab, in this lab we continue to run three instances on Google Cloud. We continue to use Mongoose and Google RPC.

Implementation details

In this lab, locking and RPC communication for requests involving multiple node ids are introduced. Locking is implemented through use of std::mutex. Each partition has a mutex, which is acquired by the event handler before executing any graph mutation, and released by the event handler when the partition is no longer needed.

RPC is used for inter-node communication when the graph mutator functions add_edge and remove_edge are called. First, the client sends the http requests to the lower numbered partition, as required by the specification, then the partition is locked. Because the two node_ids may not both be stored on the lower numbered partition, the event handler initiates two rpc clients to query the partition for each node involved. The RPC call returns information about the node's existence and if it's neighbors contain the other node id.

After receiving this information, the partition forwards an rpc request to the higher numbered partition, sending the information about each node that was queried. The rpc server on the higher partition will lock the partition, perform the requested graph mutation, and unlock the partition. The higher numbered partition then sends back an acknowledgement to the lower numbered partition.

After receiving the acknowledgement, the lower numbered partition proceeds to perform the requested graph mutation. Lastly, the lower numbered partition releases the lock.

Because node ids are now partitioned, we must alter the Graph class methods in graph.cpp in addition to having added rpc calls in the event handler. Graph class methods can no longer query information about any arbitrary node_id because each partition holds only a subset of the entire graph. Instead, any graph method that involves multiple node ids must also receive queryable information about the node ids as inputs to the method (this information is collected via rpc calls, as noted above). Thus, the graph mutator functions add_edge and remove_edge are modified to account for this change in arguments.

Test script

The test script (mytest.sh) assumes that the graph is partitioned over three servers. The addresses of these servers are specified in the test script, but it is assumed that the user has already started the servers. The test script builds a simple graph and queries it, sending each of its requests to the appropriately numbered partition. The user should see that http responses send back correct information about the graph, indicating that each partition sees the correct subset of the entire graph.

Links:

Home: Demo video and Abstract

First iteration: Local in-memory graph

Second iteration: Durable on-disk graph

Third iteration: Chain-replicated graph

Fourth iteration: Partitioned graph

Source code: Each graph is stored under its appropriate branch (lab1, lab2, lab3, lab4)