Create a distance pallet that handles rounds and the validation queue
Create custom RPC methods to be used by the distance computation micro-service
Create runtime APIs for these custom RPC methods
Develop the distance computation micro-service
Vocabulary point: for simplicity, I use the term "identity" instead of "pending membership application".
Algo:
Use the fact that the set of authorities for session N+1 is determined in the 1st block of session N.
Every 4 sessions, the distance pallet asks the runtime pallet for the list of identities to be evaluated. We call this session N.
The runtime provides only the identities that will expire after the specified deadline (4 sessions) and forces the immediate expiration of the others (because they have no chance of passing).
The distance pallet writes in its storage the number and hash of the parent block and the set of identities to evaluate. We call this parent block B.
At the 1st block of session N + 2, the distance pallet checks the hash of block B; if it has changed (fork), go back to step 1.
If the hash is still the same (no fork), store the ComputationMetadata so that the runtime API can expose it.
At each block of session N + 3, the distance pallet records the result published by the author of the block (from an inherent).
At the 1st block of session N + 4, the distance pallet computes the median of the results for each evaluated identity, transmits these medians to the runtime, and waits for an immediate response from the latter.
The runtime validates the Ok identities, and immediately sends back to the distance pallet the refused identities that have not expired and will not expire within the specified deadline (next 4 sessions).
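The schedule above can be sketched as a small mapping from session index to pallet action, together with the expiry filter the runtime applies when a round starts. This is only an illustrative sketch, not the pallet's code: the names Phase, phase and split_by_expiry are hypothetical, identities are reduced to plain u32 ids, and expiry is assumed to be expressed as a session index. The text lists no action for session N + 1, so the sketch labels it Wait.

```rust
/// Number of sessions in one evaluation round (4 in the text).
const ROUND_LENGTH: u32 = 4;

/// What the distance pallet does in a session, relative to the round start N.
/// Hypothetical type, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Phase {
    /// Session N: fetch the identities to evaluate and record parent block B.
    Collect,
    /// Session N + 1: the text lists no action for this session.
    Wait,
    /// Session N + 2: check the hash of B at the 1st block; the off-chain
    /// computation runs during this session.
    Compute,
    /// Session N + 3: each block author publishes a result via an inherent.
    Publish,
    /// 1st block of session N + 4: medians are computed and sent to the runtime.
    Finalize,
}

/// Returns the phase for `session` in a round that started at `round_start`,
/// or None if the session is outside that round.
fn phase(round_start: u32, session: u32) -> Option<Phase> {
    match session.checked_sub(round_start)? {
        0 => Some(Phase::Collect),
        1 => Some(Phase::Wait),
        2 => Some(Phase::Compute),
        3 => Some(Phase::Publish),
        4 => Some(Phase::Finalize),
        _ => None,
    }
}

/// Splits `(identity_id, expires_at_session)` pairs into the identities to
/// evaluate and those to expire immediately.
/// Assumption: an identity is kept only if it expires strictly after the end
/// of the round, i.e. later than `current_session + ROUND_LENGTH`.
fn split_by_expiry(identities: &[(u32, u32)], current_session: u32) -> (Vec<u32>, Vec<u32>) {
    let deadline = current_session.saturating_add(ROUND_LENGTH);
    let mut to_evaluate = Vec::new();
    let mut to_expire = Vec::new();
    for &(id, expires_at) in identities {
        if expires_at > deadline {
            to_evaluate.push(id);
        } else {
            to_expire.push(id);
        }
    }
    (to_evaluate, to_expire)
}
```

Taking the round start as an explicit parameter, rather than deriving it from the session index modulo 4, keeps the sketch valid when a fork forces the round to restart.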
Note that:
This implies that an identity will be re-evaluated every 4 sessions until it eventually passes or expires.
An identity can expire prematurely (at most 4 sessions earlier than the scheduled deadline) if the next distance calculation is too far away for this identity.
We need 2 types of inherents provided by offchain workers: 1 to declare the finalization and 1 to publish the results of the distance calculation.
Since the computation is done offchain, with floats, parallelization and SIMD optimizations, its result is not perfectly reproducible: 2 nodes may arrive at slightly different results. This is why we take the median of the results.
We need to determine a deviation threshold from the median above which the author should be sanctioned.
The distance computation itself is performed during session N + 2 by all the authorities registered for session N + 3.
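The median step and the deviation check mentioned in the notes can be sketched as follows. This is a sketch under stated assumptions, not the pallet's implementation: results are assumed to be encoded as u32 integers before being published on chain, authors are reduced to u32 ids, and the function names and the threshold value are hypothetical (the text leaves the threshold to be determined).

```rust
/// Median of the results published for one identity.
/// Assumption: results are already encoded as integers (u32).
/// Returns None when no result was published.
fn median(results: &mut [u32]) -> Option<u32> {
    if results.is_empty() {
        return None;
    }
    results.sort_unstable();
    let mid = results.len() / 2;
    if results.len() % 2 == 1 {
        Some(results[mid])
    } else {
        // Even count: midpoint of the two middle values, rounded down,
        // computed without risking overflow.
        let (low, high) = (results[mid - 1], results[mid]);
        Some(low + (high - low) / 2)
    }
}

/// Returns the authors whose published result deviates from `median`
/// by more than `threshold` (a hypothetical parameter, still to be chosen).
fn authors_to_sanction(published: &[(u32, u32)], median: u32, threshold: u32) -> Vec<u32> {
    published
        .iter()
        .filter(|&&(_, result)| result.abs_diff(median) > threshold)
        .map(|&(author, _)| author)
        .collect()
}
```

The median tolerates a minority of divergent results, which is why small floating-point differences between nodes do not affect the outcome, while a threshold on the absolute deviation singles out authors whose result is far from the consensus value.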