Goal: To be able to reason about the SecureDrop Client as a single application-level state machine, in which each global state is composed of (in order from left to right):
- an authentication state or state machine;
- a synchronization state or state machine; and
- a job-queue state or state machine.
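As a rough illustration of this composition (not code from the Client), a global state can be modeled as an immutable triple of sub-states. The authentication names below are the ones mentioned in this discussion, though grouping them into a single enum is an assumption; the `SyncState` and `QueueState` members, and the `GlobalState` class itself, are hypothetical names chosen only for this sketch.

```python
from dataclasses import dataclass
from enum import Enum, auto


class AuthState(Enum):
    # Names mentioned in this discussion; grouping them here is an assumption.
    UserHasNoSession = auto()
    UserHasLoginAttemptInProgress = auto()
    UserHasSession = auto()
    UserHasLogoutRequestInProgress = auto()


class SyncState(Enum):
    # Hypothetical names, for illustration only.
    SyncIdle = auto()
    SyncInProgress = auto()


class QueueState(Enum):
    # Hypothetical names, for illustration only.
    QueueEmpty = auto()
    QueueRunning = auto()
    QueuePaused = auto()


@dataclass(frozen=True)
class GlobalState:
    """One application-level state: (auth, sync, queue), read left to right."""

    auth: AuthState
    sync: SyncState
    queue: QueueState

    def __str__(self) -> str:
        return f"{self.auth.name} | {self.sync.name} | {self.queue.name}"


logged_out = GlobalState(
    AuthState.UserHasNoSession, SyncState.SyncIdle, QueueState.QueueEmpty
)
print(logged_out)
```

Because the dataclass is frozen, every transition has to produce a new `GlobalState` rather than mutate one in place, which keeps the set of reachable global states explicit.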
Notes:
- This diagram is both descriptive and prescriptive. It describes what I read the SecureDrop Client codebase and supporting specifications as wanting to do. It prescribes (or implies) some changes that might help realize those design intentions. (And, inevitably, it makes some assumptions that may no longer hold or serve us.)
- Refer to the Mermaid state-diagram syntax if you have questions about how to read this diagram.
- If you find GitHub's inline rendering of this diagram difficult to read, you can copy-paste the source of the `.mmd` file directly into the Mermaid live editor.
The following points are not true of the SecureDrop Client as implemented. Instead, they follow logically from the explicit state machine depicted here.
1. No thread-level state! If the application is running, then the sync/queue threads are running.
2. No thread-blocking processing loops: keep the `QObject` event loop available! The queue worker processes the next job at the later of (a) the current job finishing or (b) a new job being enqueued.
3. As a consequence of (1) and (2), a `PauseQueueJob` actually sleeps the queue for `PAUSE_INTERVAL` (or `RETRY_INTERVAL`).
4. `UserHasLoginAttemptInProgress` and `UserHasLogoutRequestInProgress` are both blocking and modal, because their transitions within the authentication state machine supervene on all other state machines, even application-level quit.
    - In particular, the transition `UserHasLogoutRequestInProgress` → `UserIsLoggedOut` clears the session state (authentication token) whether or not the logout request succeeds. Or, if we want to be really strict about cleaning up server-side state before it expires anyway: If a session's logout request fails, the Client proceeds to `UserIsLoggedOut`, but the token is saved in memory. If and when a subsequent login attempt succeeds, a new logout request is made for the saved token.
        - Why is this conclusion necessary? We can't interrupt a job in process. Therefore, if we allow `UserHasLogoutRequestInProgress` → `UserIsLoggedOut` while a job is still in progress, we can't guarantee that a job will be processed only by the session to which it belongs. (See also: freedomofpress/securedrop-client#420.)
        - This may have opsec implications we should threat-model, because (e.g.) a journalist will not be able to log out of the application immediately in an emergency.
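The stricter logout variant described above could look roughly like the following. This is a minimal sketch under stated assumptions, not the Client's implementation: the `Session` class, its `stale_tokens` list, and the `send_logout` callable (assumed to return `True` when the server accepts the logout request) are all hypothetical.

```python
from typing import Callable, List, Optional


class Session:
    """Hypothetical holder of authentication state (not the Client's class)."""

    def __init__(self, send_logout: Callable[[str], bool]) -> None:
        # Assumption: send_logout(token) returns True if the request succeeded.
        self._send_logout = send_logout
        self.token: Optional[str] = None
        self.stale_tokens: List[str] = []  # tokens whose logout request failed

    def login_succeeded(self, token: str) -> None:
        self.token = token
        # Retry logout for tokens left over from earlier sessions;
        # keep only those whose logout request fails again.
        self.stale_tokens = [
            t for t in self.stale_tokens if not self._send_logout(t)
        ]

    def logout(self) -> None:
        # The local session state is cleared whether or not the request succeeds.
        token, self.token = self.token, None
        if token is not None and not self._send_logout(token):
            # Held in memory only, so that a later session can retry.
            self.stale_tokens.append(token)


# Usage: the first logout request fails, the retry after the next login succeeds.
outcomes = [False, True]
calls: List[str] = []


def fake_send_logout(token: str) -> bool:
    calls.append(token)
    return outcomes.pop(0)


session = Session(fake_send_logout)
session.login_succeeded("token-1")
session.logout()
session.login_succeeded("token-2")
```

Note that this sketch only covers the token bookkeeping. It does not model the blocking, modal behavior of the logout state or the wait for a job that is still in progress.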
There may be others! These are just the ones I've parsed out in my own thinking so far.
- No jobs are persistent across SecureDrop Client invocations. (freedomofpress/securedrop-client#410)
- If you'll forgive the networking terms (alternatives welcome): The Qt event loop is the application's control plane. Our job queues constitute the application's data plane. Precedential considerations:
Things not addressed in this draft from March 28 team feedback:
- On successful sync, the queues resume. This makes sync an implicit heartbeat.
- Need to check against all network QA cases.
    - When are retries silent because routine (because Tor)? When do they bubble up to an application-level error state? (`BaseApiJob` retries thrice silently.)
    - "Read jobs" (sync, download) versus "write jobs" (upload, star, delete): differences in retry/failure logic?
- Need to distinguish (a) reauthenticated failed session from (b) newly authenticated subsequent session.
- Need to explicitly document offline mode. @eloquence:
    > I think it may be worth explicitly distinguishing the OfflineMode state (user-initiated or as a result of session expiry) and the SignedInWithNetworkError state (impacts interactivity of the app in certain ways, impacts queue jobs being tried)
- Need to clarify `PauseQueueJob` as blocking operation versus queue-level operation.
    - Why do I (@cfm) dislike manipulating queue via jobs?
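One way to picture the "queue-level operation" alternative to a blocking `PauseQueueJob` is sketched below. It is a plain-Python illustration, not the Client's code: `QueueWorker`, `start_job`, `job_finished`, `pause`, and `resume` are hypothetical names, and jobs are assumed to run asynchronously and report completion through `job_finished`. In a Qt application, `resume` would presumably be scheduled by something like a single-shot `QTimer` after `PAUSE_INTERVAL`, so that the event loop is never blocked.

```python
from collections import deque
from typing import Any, Callable, Deque, List, Optional


class QueueWorker:
    """Event-driven queue: never blocks, only reacts to events."""

    def __init__(self, start_job: Callable[[Any], None]) -> None:
        # Assumption: start_job hands the job to whatever runs it asynchronously.
        self._start_job = start_job
        self._pending: Deque[Any] = deque()
        self._current: Optional[Any] = None
        self._paused = False

    def enqueue(self, job: Any) -> None:
        self._pending.append(job)
        self._maybe_start_next()

    def job_finished(self) -> None:
        self._current = None
        self._maybe_start_next()

    def pause(self) -> None:
        # Queue-level pause: a flag on the queue, not a job inside it.
        self._paused = True

    def resume(self) -> None:
        self._paused = False
        self._maybe_start_next()

    def _maybe_start_next(self) -> None:
        # Start a job only when unpaused, idle, and something is waiting, i.e.
        # at the later of "current job finished" and "new job enqueued".
        if self._paused or self._current is not None or not self._pending:
            return
        self._current = self._pending.popleft()
        self._start_job(self._current)


started: List[str] = []
worker = QueueWorker(started.append)
worker.enqueue("a")    # idle, so "a" starts immediately
worker.enqueue("b")    # "a" is still running, so "b" waits
worker.job_finished()  # "a" is done, so "b" starts
worker.pause()
worker.job_finished()  # "b" is done, but the queue is paused
worker.enqueue("c")    # still paused, so "c" waits
worker.resume()        # "c" starts
print(started)
```

In this shape, pausing is a property of the queue rather than an item in it, so a pause can never be reordered, retried, or dropped the way a job could.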
- "Sync Architecture"
- "Queue Architecture"
- "SecureDrop Client API Queue Specification"
- "Network Error Handling in the Client"
- "SecureDrop Client Failure Scenarios"
- freedomofpress/securedrop-client#397 → freedomofpress/securedrop-client#1420
- freedomofpress/securedrop-client#423
- freedomofpress/securedrop-client#531
Right now, you can start the application in offline mode, so it looks like that's reflected in the state machine by showing the `UserIsLoggedOut` state where `UserHasNoSession`? Just double-checking if I'm reading that right.

Edit: Ah, yup! I see, I see... the `UserHasNoSession` is the auth state, and I see that after authentication succeeds it becomes `UserHasSession`. I just needed to take a closer look and realize that you were very organized in keeping the auth state to the left, sync state in the middle, and queue state on the right.

Might make sense to add something that reflects offline mode separately, e.g.:

Where `UserIsLoggedInAndOffline` is a new state where the queue is running but the user is not authenticated.