Commit Graph

3 Commits

Author SHA1 Message Date
Michal Niewrzal
0eaf43120b satellite/repair/checker: optimize processing, part 3
ClassifySegmentPieces uses custom set implementation instead map.

Side note, for custom set implementation I also checked int8 bit set but
it didn't give better performance so I used simpler implementation.

Benchmark results (compared against part 2 optimization change):
name                                       old time/op    new time/op    delta
RemoteSegment/healthy_segment-8    21.7µs ± 8%    15.4µs ±16%  -29.38%  (p=0.008 n=5+5)

name                                       old alloc/op   new alloc/op   delta
RemoteSegment/healthy_segment-8    7.41kB ± 0%    1.87kB ± 0%  -74.83%  (p=0.000 n=5+4)

name                                       old allocs/op  new allocs/op  delta
RemoteSegment/healthy_segment-8       150 ± 0%       130 ± 0%  -13.33%  (p=0.008 n=5+5)

Change-Id: I21feca9ec6ac0a2558ac5ce8894451c54f69e52d
2023-10-16 12:06:16 +00:00
Michal Niewrzal
de4559d862 satellite/repair/checker: optimize processing, part 1
Optimization by reusing more slices.

Benchmark result:
name                                       old time/op    new time/op    delta
RemoteSegment/healthy_segment-8    33.2µs ± 1%    31.4µs ± 6%   -5.49%  (p=0.032 n=4+5)

name                                       old alloc/op   new alloc/op   delta
RemoteSegment/healthy_segment-8    15.9kB ± 0%    10.2kB ± 0%  -35.92%  (p=0.008 n=5+5)

name                                       old allocs/op  new allocs/op  delta
RemoteSegment/healthy_segment-8       280 ± 0%       250 ± 0%  -10.71%  (p=0.008 n=5+5)

Change-Id: I60462169285462dee6cd16d4f4ce1f30fb6cdfdf
2023-10-11 15:50:29 +00:00
paul cannon
1b8bd6c082 satellite/repair: unify repair logic
The repair checker and repair worker both need to determine which pieces
are healthy, which are retrievable, and which should be replaced, but
they have been doing it in different ways in different code, which has
been the cause of bugs. The same term could have very similar but subtly
different meanings between the two, causing much confusion.

With this change, the piece- and node-classification logic is
consolidated into one place within the satellite/repair package, so that
both subsystems can use it. This ought to make decision-making code more
concise and more readable.

The consolidated classification logic has been expanded to create more
sets, so that the decision-making code does not need to do as much
precalculation. It should now be clearer in comments and code that a
piece can belong to multiple sets arbitrarily (except where the
definition of the sets makes this logically impossible), and what the
precise meaning of each set is. These sets include Missing, Suspended,
Clumped, OutOfPlacement, InExcludedCountry, ForcingRepair,
UnhealthyRetrievable, Unhealthy, Retrievable, and Healthy.

Some other side effects of this change:

* CreatePutRepairOrderLimits no longer needs to special-case excluded
  countries; it can just create as many order limits as requested (by
  way of len(newNodes)).
* The repair checker will now queue a segment for repair when there are
  any pieces out of placement. The code calls this "forcing a repair".
* The checker.ReliabilityCache is now accessed by way of a GetNodes()
  function similar to the one on the overlay. The classification methods
  like MissingPieces(), OutOfPlacementPieces(), and
  PiecesNodesLastNetsInOrder() are removed in favor of the
  classification logic in satellite/repair/classification.go. This
  means the reliability cache no longer needs access to the placement
  rules or excluded countries list.

Change-Id: I105109fb94ee126952f07d747c6e11131164fadb
2023-09-25 09:42:08 -05:00