CSE 40771 - Distributed Systems - Spring 2023
In this assignment, you will build a simple RPC server that supports access to an in-memory hash table running on a single machine. This will provide the basic capability that we will build upon over several assignments to work up to a fully distributed, scalable hash table. The focus of this assignment will be on the remote procedure call (RPC) interface to the system.
For this first assignment, the server will simply keep a hash table in memory and need only support a single client at a time. You will be adding more complexity in later assignments, one at a time.
Your hash table server should support the following five operations:
insert( key, value ) -> Inserts the given key and value into the hash table.
lookup( key ) -> Returns the value associated with a given key.
remove( key ) -> Removes the key and corresponding value from the hash table. (It should not return the old value, because that would make it non-idempotent.)
size() -> Returns the number of items currently in the table.
query( subkey, subvalue ) -> Returns a list of (key, value) pairs in which the value is a dictionary that contains `subkey` with the value `subvalue`.
Across the calls, keys are plain strings with arbitrary contents and length, and values can be any Python values convertible to JSON (strings, integers, booleans, dictionaries, lists). The subvalue in query is limited to atomic values (i.e., no dictionaries or lists).
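To make the semantics of query concrete, here is a minimal sketch that runs over a plain Python dictionary. The function name `query_table` and the sample data are assumptions made for illustration; they are not part of the required interface.

```python
def query_table(table, subkey, subvalue):
    """Return (key, value) pairs whose value is a dict with value[subkey] == subvalue."""
    matches = []
    for key, value in table.items():
        # Values that are not dictionaries can never match a query.
        if isinstance(value, dict) and subkey in value and value[subkey] == subvalue:
            matches.append((key, value))
    return matches


table = {
    "zebra": {"weight": 100, "name": "fred"},
    "lion": {"weight": 200, "name": "fred"},
    "note": "fred",        # a string value, skipped by query
    "sizes": [1, 2, 3],    # a list value, skipped by query
}
fred_matches = query_table(table, "name", "fred")
heavy_matches = query_table(table, "weight", 200)
```

Note that Python compares `True == 1` as equal, so a stricter comparison may be needed if booleans and integers must be told apart.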
You should design an appropriate set of JSON messages to represent a request and response for each operation. You can define this however you like, as long as it is consistent and works correctly. For example, a request to insert an item might look like this:
{
"method" : "insert",
"key" : "zebra",
"value" : { "weight":100, "name":"fred" }
}
In a similar way, the server should return a brief JSON message to indicate the result of the operation. Keep in mind that both the client and server should be able to distinguish between a successful operation, a failed operation, and a completely invalid request. You should have return messages corresponding to each of these cases.
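One possible way to represent the three cases is sketched below. The field names `status`, `result`, and `error`, the helper names, and the choice of exceptions on the client side are all assumptions; any consistent scheme works.

```python
import json


def success(result=None):
    """The operation ran; `result` is None for calls that return nothing."""
    return json.dumps({"status": "success", "result": result})


def failure(reason):
    """The request was well formed, but the operation could not be carried out."""
    return json.dumps({"status": "failure", "error": reason})


def invalid(reason):
    """The request itself could not be understood."""
    return json.dumps({"status": "invalid", "error": reason})


def unpack(message):
    """Client side: turn a response message into a return value or an exception."""
    response = json.loads(message)
    if response["status"] == "success":
        return response["result"]
    if response["status"] == "failure":
        raise KeyError(response["error"])   # assumed mapping for a failed operation
    raise ValueError(response["error"])     # assumed mapping for an invalid request
```

With this layout, a client stub for lookup could simply return `unpack(reply)` and let the exception reach its caller.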
Take care: Each of the RPC operations may fail under normal use. For example, the user might attempt to remove a key that is not present in the table. If that happens, the server should not crash with an exception, but rather should return a suitable message to the client, so that the client can return an appropriate value or raise an exception to its caller. There should be no way in which the client can cause the server to crash.
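The sketch below shows one defensive way to process a request so that bad input produces a reply instead of an exception. It handles only lookup and remove to stay short, and the function name `handle_request` and the reply fields are assumptions, not a required design.

```python
import json


def handle_request(table, raw):
    """Decode one JSON request and apply it to `table` (a plain dict)."""
    try:
        request = json.loads(raw)
    except ValueError:
        # Covers malformed JSON and undecodable bytes.
        return {"status": "invalid", "error": "request is not valid JSON"}
    if not isinstance(request, dict):
        return {"status": "invalid", "error": "request must be a JSON object"}

    method = request.get("method")
    key = request.get("key")
    if method not in ("lookup", "remove"):
        return {"status": "invalid", "error": "unknown method"}
    if not isinstance(key, str):
        return {"status": "invalid", "error": "key must be a string"}

    # A well-formed request can still fail: the key may be absent.
    if key not in table:
        return {"status": "failure", "error": "key not found"}
    if method == "lookup":
        return {"status": "success", "result": table[key]}
    del table[key]
    return {"status": "success"}
```

Checking each field before using it keeps the failure cases explicit; a broad `try`/`except` around the whole dispatch is a reasonable extra safety net.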
To ensure some degree of consistency across the projects, please break your project down into several files with the following names. (It’s ok if you have additional files as well.)
HashTable.py should contain the basic implementation of each of the five operations on a plain hash table in memory, and forms part of the server.
HashTableServer.py should contain the server-side RPC main program. It should accept a port number on the command line, create a listening TCP socket, accept incoming connections, decode messages coming from the client, invoke the proper operation, and then return the result to the client. This program should never exit of its own accord: use Control-C to kill it when you no longer need it.
HashTableClient.py should contain the client-side RPC operations. It should provide a method to connect to a server host and port, and a client stub for each of the five hash table operations that sends a message and waits for the response.
TestBasics.py and TestPerf.py will accept a server name and port number on the command line, and then make use of HashTableClient.py to connect to the indicated server and perform various tests.
HashTableServer.py should be started by giving a single argument (a port number) on the command line. The server should attempt to listen on that port and fail if it is unavailable. If the port number zero is given, then the server should listen on any available port. Either way, it should display the port it is listening on:
python HashTableServer.py 9328
Listening on port 9328
python HashTableServer.py 0
Listening on port 10475
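As a rough sketch of the port handling, binding a TCP socket to port zero lets the operating system pick a free port, and `getsockname()` reports which one was chosen. The helper name `open_listener` and the backlog of 5 are assumptions.

```python
import socket


def open_listener(port):
    """Create a listening TCP socket; port 0 asks the OS for any free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(("", port))    # raises OSError if the port is unavailable
    sock.listen(5)
    actual_port = sock.getsockname()[1]
    return sock, actual_port


listener, actual_port = open_listener(0)
print(f"Listening on port {actual_port}")
listener.close()
```

If the requested port is already in use, `bind` raises `OSError`, which the main program can report before exiting.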
Then, each of the test programs should accept a host name and port number that they will connect to. For example:
python TestBasics.py student10.cse.nd.edu 10475
Write two test programs to exercise your service:
TestBasics.py should exercise all of the RPC operations in various ways and evaluate them for correctness. For example, if you insert a value into the hash table, you should be able to use lookup to get the same value back. If a value is removed from the table, then further lookups should fail, and so forth. Make sure that you test each of the ways each call can succeed or fail. Don’t proceed to the next step until you are sure all the details are right here.
TestPerf.py should time a sequence of operations in four stages: insert a large number of items in the table, lookup a large number of items, query the table for matches, and finally remove all the items. Measure the throughput (ops/sec) obtained for each operation, and invert it to get the average latency of each operation.
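The timing of one stage can be sketched as follows. The helper name `measure` is an assumption, and a local dictionary stands in for the real client here so that the example runs on its own; in TestPerf.py the operation would be a call through HashTableClient.py.

```python
import time


def measure(operation, count):
    """Run operation(i) for i in range(count); return (ops/sec, seconds/op)."""
    start = time.perf_counter()
    for i in range(count):
        operation(i)
    elapsed = time.perf_counter() - start
    throughput = count / elapsed    # operations per second
    latency = 1.0 / throughput      # average seconds per operation
    return throughput, latency


table = {}   # stand-in for the remote hash table
throughput, latency = measure(lambda i: table.__setitem__(str(i), {"n": i}), 1000)
print(f"insert: {throughput:.0f} ops/sec, {latency * 1e6:.2f} us/op")
```

Because the client waits for each response before sending the next request, the inverse of the throughput equals the average time per operation.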
Please review the general instructions for submitting assignments.
Turn in all of your source code, along with a lab report titled REPORT that describes the following in detail: