Skip to main content

Deadlock Analysis in Stronghold

Stronghold uses RwLock to enable concurrency. This approach has the pros of being simple in theory and fast performance when done right. The downside is that, in practice, it is usually really difficult to use locks in concurrent programming, and you often encounter data races or deadlocks when using them. One of the worst cases would be to have a deadlock in Stronghold, which could block Stronghold and render it unusable. To avoid this situation, Stronghold uses deadlock analysis.

Requirements for a Multithreaded Stronghold

  • Be able to clone the element type parts of the API and use them concurrently. The following types ar reachable through the API and may be cloned and shared between threads:
  • Use Mutex or RWLock

Deadlocks Analysis

note

These are thoughts stemming from experience, but they do not have a peer-reviewed paper backing them.

Model

Locks are used for controlling access to shared memory. To simplify the problem, we only model mutex so that only a single thread can access shared memory.

Model of functions and locks

How functions and locks are represented in Stronghold

  • 11, 22, 33 represent locks that protect 3 different and distinct shared memory locations.
  • f1f_1 is a function that sequentially takes hold of lock 11 then 22 then 33.
  • f1f_1 releases the locks 11, 22, 33 at the end of its computation.

Note that lock 22 depends on lock 11, because in f1f_1 you can't take lock 22 before taking lock 11. In the same way, lock 33 depends on locks 11 and 22.

Example: Deadlock with 2 functions

The following is an example of the most simple possible deadlock.

Simple deadlock

  • The function f1f_1 wants to take the locks for shared memory 11, then 22.
  • The function f2f_2 wants to take the locks for shared memory 22, then 11.
  • On the right, you can see a circular lock dependency between f1f_1 and f2f_2. This means that you can do a circle by following the arrows of f1f_1 and f2f_2.
Analysis

The graph at the right represents the merged dependency graph of functions f1f_1 and f2f_2. It shows all the dependencies of the locks in f1f_1 and f2f_2. A circular dependency on the merged graph indicates the possibility of a deadlock when executing f1f_1 and f2f_2 concurrently.

Example run resulting in a deadlock
  1. f1f_1 starts and takes the lock 11.
  2. f2f_2 starts and takes the lock 22.
  3. f1f_1 is stuck, because f2f_2 is holding the lock 22 that it requires.
  4. f2f_2 is stuck, because f2f_2 is holding the lock 11 that it requires.

Example: Deadlock with 3 functions

Three function deadlock

In the previous example, we showed that there is a circular dependency of locks between f1f_1, f2f_2 and f3f_3.

Example run resulting in a deadlock
  1. f1f_1 starts and takes the lock 11.
  2. f2f_2 starts and takes the lock 33.
  3. f3f_3 starts and takes the lock 22.
  4. f1f_1 is stuck, because f3f_3 is holding the lock 22 that it requires.
  5. f2f_2 is stuck, because f1f_1 is holding the lock 11 that it requires.
  6. f3f_3 is stuck, because f2f_2 is holding the lock 33 that it requires.

Proof backing "no circular dependency equals no deadlock"

This is a proof on an example for simplicity. You can apply the reasoning in the next part to any dependency graph of locks to prove the absence of a deadlock when there are no circular dependencies.

Example: no deadlock

No deadlock

There is no circular dependency in the right-side representation. We will try to argue that there is indeed no possible deadlock in this configuration. The main idea linking absence of deadlock with absence of circular dependency is that any thread will always eventually get the lock it's waiting for. So, it will eventually terminate its computation thus releasing the locks it acquired.

Definition: deadlock-free lock

A lock is deadlock-free when we can prove that it will never be part of a deadlock and will always eventually be released.

Prove the absence of deadlock using the merged graph

  1. Lock 33 is deadlock-free. No locks depend on lock 33. When a thread acquires lock 33 it will manage to terminate its computation and will always release it.
  2. Recursively lock 22 is deadlock-free. Lock 33 is the only lock that depends on lock 22. That means:
    1. If a thread acquires lock 22, it will eventually get lock 33 because lock 33 is deadlock-free (lock 33 will always be available at some point).
    2. If a thread acquires lock 22, it will manage to terminate its computation and eventually be released because lock 33 is deadlock-free.
    3. Lock 22 is deadlock-free by recursivity.
  3. Recursively, lock 11 is deadlock-free. All the locks that depend on lock 11 are deadlock-free. You can apply the same reasoning for lock 22 to lock 11.

Stronghold Deadlock Analysis

The goal is to list Stronghold's functions that access locks and build a merge graph. Then, we show that the merge graph does not contain circular dependencies.

Stronghold Share Memories

This list shows the different shared memory locations that are protected behind locks. Each shared memory location/lock is associated with a number that we will use for the dependency graph that follows.

Fields of the Stronghold type which are shared memories

  1. snapshot: Snapshot
  2. clients: HashMap<ClientId, Client>
  3. store: Store
  4. key_location: Option<Location>

Fields of the Client type which are shared memories

  1. keystore: KeyStore
  2. db: DbView
  3. store: Store

Stronghold Functions

Creating one dependency graph per function would be too complicated. Instead, we list every function that accesses the shared memories. After the function name, we describe the locks that are acquired or the functions that they call.

Functions of the Stronghold type

  • reset:
  • store:
  • load_client_from_snapshot: 1 \rightarrow 2 \rightarrow client.restore()
  • load_client: 1 \rightarrow 2 \rightarrow client.restore()
  • get_client: 2
  • unload_client: 2
  • purge_client: 1 \rightarrow 2
  • load_snapshot: 1
  • store_snapshot_key_at_location: 1 \rightarrow 4
  • create_client: 2
  • commit_with_keyprovider: 1 \rightarrow 2 \rightarrow 5 \rightarrow 6 \rightarrow 7
  • commit: 1 \rightarrow 2 \rightarrow 5 \rightarrow 6 \rightarrow 7
  • write_client: 1 \rightarrow 2 \rightarrow 5 \rightarrow 6 \rightarrow 7
  • clear: 1 \rightarrow 2 \rightarrow 3 \rightarrow 4 \rightarrow client.clear()

Functions of the Client type

  • store:
  • vault:
  • vault_exists: 5
  • record_exists: 6
  • sync_vaults: get_hierarchy() \rightarrow get_diff() \rightarrow export_entries() \rightarrow 5 \rightarrow 6
  • sync_with: other_client.get_hierarchy() \rightarrow get_diff() \rightarrow other_client.export_entries() \rightarrow other_client.5 \rightarrow 5 \rightarrow 6
  • id:
  • restore: 5 \rightarrow 6 \rightarrow 7
  • clear: 5 \rightarrow 6 \rightarrow 7
  • execute_procedure: execute_procedure_chained()
  • execute_procedure_chained: exec_proc()
  • get_db: 6
  • get_key_provider: 5
  • Runner trait
    • get_guards: 5 \rightarrow 6
    • exec_proc: 5 \rightarrow 6
    • write_to_vault: 5 \rightarrow 6
    • revoke_data: 5 \rightarrow 6
    • garbage_collect: 5 \rightarrow 6
    • resolve_locations: 5
    • get_guard: 5 \rightarrow 6
  • SyncClients trait
    • get_hierarchy: get_key_provider() \rightarrow get_db()
    • get_diff: get_key_provider() \rightarrow get_db()
    • export_entries: get_db()

Stronghold Merged Graph

This was created by merging all the lock dependencies from the Stronghold functions.

Merged Stronghold Graph

You can easily see that there are no circular dependencies in this graph.

Conclusion

This document reassures us of the absence of deadlocks in our multithreaded implementation of Stronghold.

However, the theory presented has not been proven correct to our knowledge. Moreover, creating the merged graph is not exempt from human error either. This is not proof of the absence of deadlocks, but an argument and a view on how we designed Stronghold to avoid such a situation.