This is a simplified version that doesn't describe [containment mode](audit-containment.md). Selecting the same stripe twice is unlikely, and it wouldn't cause any significant harm if it happened.
To estimate appropriate settings for reservoir sampling, we need to run a simulation.
Unvetted nodes should get 25,000 pieces per month. On a good day, 1,000 pieces will be added to an unvetted node, which should quickly fill the reservoir sample.
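
As a rough back-of-the-envelope check of the figures above, assume a 30-day month and a reservoir with $k$ slots. Both are assumptions for illustration: $k$ stands in for whatever size the simulation recommends, and $n$ is the number of pieces the observer has seen for the node.

$$
\frac{25{,}000\ \text{pieces}}{30\ \text{days}} \approx 833\ \text{pieces/day},
\qquad
P(\text{a given piece is in the reservoir}) = \frac{k}{n} \quad (n \ge k)
$$

Under these assumptions the reservoir is full as soon as the node holds at least $k$ pieces, and from then on every piece on the node is equally likely to be sampled.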
While we initially considered integrating the audit system's random node selection process with the existing garbage collection observer,
we decided against it: the two observers need different intervals, so sharing one would mean either creating bloom filters more often than necessary or running audits too slowly.
An initial idea for implementation was to sort the nodes table to find the nodes with the fewest audits, then select one node at random from that group.
Random selection is easier to implement overall and gives better statistical balance across the nodes.
Another approach that we decided not to pursue was a reverse lookup of segments or pieces by node ID, e.g. a table where each row is a node ID and an encrypted metainfo path.
Every time a segment is committed or deleted, that table (and every node) gets updated.
This could simplify the garbage collection process, but it would complicate upload and download.
We decided that this would increase database size too significantly to be viable.
If we need fewer audits, then we could use power of two choices, in which we randomly select two nodes, then choose the one with fewer audits.
This would require tracking the number of audits per node, but it would avoid having to sort and query all nodes by audit count, which could cause undesirable behavior.
For example, when new nodes join the network, the audit system could become stuck auditing only new nodes while ignoring more established nodes.
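
The power of two choices idea can be sketched in a few lines of Go. This is a minimal illustration, not the satellite's implementation: the `Node` type, `pickPowerOfTwo`, and `simulateAudits` are hypothetical names, and audit counts are kept in memory rather than in the nodes table.

```go
package main

import "math/rand"

// Node is a hypothetical, trimmed-down node record (assumption: the real
// nodes table has many more fields).
type Node struct {
	ID         string
	AuditCount int
}

// pickPowerOfTwo draws two nodes uniformly at random (with replacement) and
// returns the index of the one with fewer audits. nodes must be non-empty.
func pickPowerOfTwo(rng *rand.Rand, nodes []Node) int {
	a := rng.Intn(len(nodes))
	b := rng.Intn(len(nodes))
	if nodes[b].AuditCount < nodes[a].AuditCount {
		return b
	}
	return a
}

// simulateAudits runs the given number of audit rounds and increments the
// audit count of whichever node is picked in each round.
func simulateAudits(seed int64, nodes []Node, rounds int) {
	rng := rand.New(rand.NewSource(seed))
	for i := 0; i < rounds; i++ {
		nodes[pickPowerOfTwo(rng, nodes)].AuditCount++
	}
}
```

Because both candidates are drawn at random from all nodes, a newly joined node with zero audits is favored only when it happens to be one of the two candidates, so established nodes keep being audited.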
1. [Create a simulation](https://storjlabs.atlassian.net/browse/V3-2359) for random audit selection with reservoirs to figure out how many audits for vetted and unvetted nodes would be issued per day, configurations for reservoir sizes for vetted and unvetted nodes, and other insights (probably work with Jens and/or DS team for this).
2. [Add reservoir sampling struct for node auditing.](https://storjlabs.atlassian.net/browse/V3-2360)
3. [Create an audit observer that connects to metainfo loop.](https://storjlabs.atlassian.net/browse/V3-2361)
4. [Have the audit observer update the reservoir sampling structs.](https://storjlabs.atlassian.net/browse/V3-2362)
5. [Implement random selection of a node's reservoir, then of a random segment to audit.](https://storjlabs.atlassian.net/browse/V3-2363)
6. [Audit the segment selected from part 5 in the same way as verification happens in the existing system.](https://storjlabs.atlassian.net/browse/V3-2364)
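
Steps 2 through 5 above could be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the planned implementation: `Reservoir`, `Observer`, `NewObserver`, `Segment`, and `Select` are hypothetical names, paths are plain strings, and the real metainfo loop observer interface may look different. The reservoir uses the classic Algorithm R replacement rule.

```go
package main

import (
	"math/rand"
	"sort"
)

// Reservoir keeps a uniform random sample of at most size paths for one node.
type Reservoir struct {
	Paths []string
	size  int
	seen  int64 // number of paths offered so far
}

// Sample offers one path to the reservoir (Algorithm R): the first size paths
// are always kept; the i-th path afterwards replaces a random slot with
// probability size/i.
func (r *Reservoir) Sample(rng *rand.Rand, path string) {
	r.seen++
	if len(r.Paths) < r.size {
		r.Paths = append(r.Paths, path)
		return
	}
	if j := rng.Int63n(r.seen); j < int64(r.size) {
		r.Paths[j] = path
	}
}

// Observer builds one reservoir per node while a loop iterates over segments.
type Observer struct {
	rng        *rand.Rand
	size       int
	Reservoirs map[string]*Reservoir
}

// NewObserver creates an observer whose reservoirs hold up to size paths.
func NewObserver(seed int64, size int) *Observer {
	return &Observer{
		rng:        rand.New(rand.NewSource(seed)),
		size:       size,
		Reservoirs: make(map[string]*Reservoir),
	}
}

// Segment is called once per segment with the IDs of the nodes holding its
// pieces, and offers the segment path to each of those nodes' reservoirs.
func (o *Observer) Segment(path string, nodeIDs []string) {
	for _, id := range nodeIDs {
		res, ok := o.Reservoirs[id]
		if !ok {
			res = &Reservoir{size: o.size}
			o.Reservoirs[id] = res
		}
		res.Sample(o.rng, path)
	}
}

// Select picks a random node that has a non-empty reservoir, then a random
// path from that reservoir. ok is false when nothing has been sampled yet.
func (o *Observer) Select() (nodeID, path string, ok bool) {
	ids := make([]string, 0, len(o.Reservoirs))
	for id, res := range o.Reservoirs {
		if len(res.Paths) > 0 {
			ids = append(ids, id)
		}
	}
	if len(ids) == 0 {
		return "", "", false
	}
	sort.Strings(ids) // map iteration order is random; sort for reproducibility
	nodeID = ids[o.rng.Intn(len(ids))]
	paths := o.Reservoirs[nodeID].Paths
	return nodeID, paths[o.rng.Intn(len(paths))], true
}
```

In this sketch the segment returned by `Select` would then be handed to the existing verification code (step 6). Separate reservoir sizes for vetted and unvetted nodes, which the simulation in step 1 is meant to determine, are left out for brevity.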
## Open issues
1. Location of reservoir sampling. (resolved)
    - Where do we have this happen? In a garbage collection observer or a new observer?

    From Moby: "The main issue with integrating it into the gc observer is it means we will always be forcing the gc interval and the node audit reservoir sampling interval to be exactly the same. I don't think the performance gain of combining the two is necessarily worth the limitations created. Plus, this is the entire reason we created the metainfo loop/observer architecture."