Commit Graph

208 Commits

Author SHA1 Message Date
paul cannon
355ea2133b satellite/audit: remove pieces when audits fail
When pieces fail an audit (hard fail, meaning the node acknowledged it
did not have the piece or the piece was corrupted), we will now remove
those pieces from the segment.

Previously, we did not do this, and some node operators were seeing the
same missing piece audited over and over again and losing reputation
every time.

This change will include both verification and reverification audits. It
will also apply to pieces found to be bad during repair, if
repair-to-reputation reporting is enabled.

Change-Id: I0ca7af7e3fecdc0aebbd34fee4be3a0eab53f4f7
2023-06-22 14:19:00 +00:00
JT Olio
36038af3d1 go.mod: bump storj.io/common
Change-Id: I18d4d491cdca99f6694a20f3ead5fba3f75bb658
2023-06-02 11:23:02 -04:00
Michal Niewrzal
4bdbb25d83 satellite/metabase/rangedloop: move Segment definition
We will remove segments loop soon so we need first to move
Segment definition to rangedloop package.

https://github.com/storj/storj/issues/5237

Change-Id: Ibe6aad316ffb7073cc4de166f1f17b87aac07363
2023-05-16 12:37:17 +00:00
paul cannon
c2710cc78d satellite/audit: improve error handling
* Don't use rpcstatus.Unknown as an indicator of dial failure; instead,
  GetShare now indicates with a per-share field where a failure happened
  (DialFailure, RequestFailure, NoFailure). Use that information in
  Verify() to determine how to treat the source node.

* Add a test that replaces a storage node with a black hole, so that
  connections there will always time out. Make sure we handle that case
  correctly.

Refs: https://github.com/storj/storj/issues/5632
Change-Id: I513a53520fb48b7187d4c4d7e14e01e2cfc0a721
2023-05-11 22:55:26 +00:00
Michal Niewrzal
36e046375c satellite/repair/checker: remove segments loop parts
We are switching completely to ranged loop.

https://github.com/storj/storj/issues/5368

Change-Id: I8583549973cd36aa0e0c482c20d7a75cb7568ab3
2023-05-08 12:19:13 +00:00
Michal Niewrzal
1aa24b9f0d satellite/audit: remove segments loop parts
We are switching completely to ranged loop.

https://github.com/storj/storj/issues/5368

Change-Id: I9cec0ac454f40f19d52c078a8b1870c4d192bd7a
2023-04-24 15:52:11 +00:00
Egon Elbre
8fba740332 storagenode/blobstore/testblobs: don't error checks in BadDB
We automatically start a chore to check whether the blobstore is
writeable and readable, however, we don't want to fail the tests due to
that reason. Usually we want to test some other failure.

There probably should be some nicer way to achieve this, but this is an
easier fix.

Change-Id: I77ada75329f88d3ea52edd2022e811e337c5255a
2023-04-11 16:14:31 +03:00
Egon Elbre
f5020de57c storagenode/blobstore: move blob store logic
The blobstore implementation is entirely related to storagenode, so the
rightful place is together with the storagenode implementation.

Fixes https://github.com/storj/storj/issues/5754

Change-Id: Ie6637b0262cf37af6c3e558556c7604d9dc3613d
2023-04-05 18:06:20 +00:00
JT Olio
4362761fc7 satellite/audit: fix go1.19 dial timeouts and log more
Change-Id: Ide17c1b8e0ca8c86f305bea1b4ae553cc4cb60d0
2023-02-28 17:09:47 +00:00
Michal Niewrzal
16b7901fde satellite/metabase: add piece size calculation to segment
This code is essentially replacement for eestream.CalcPieceSize. To call
eestream.CalcPieceSize we need eestream.RedundancyStrategy which is not
trivial to get as it requires infectious.FEC. For example infectious.FEC
creation is visible on GE loop observer CPU profile because we were
doing this for each segment in DB.

New method was added to storj.Redundancy and here we are just wiring it
with metabase Segment.

BenchmarkSegmentPieceSize
BenchmarkSegmentPieceSize/eestream.CalcPieceSize
BenchmarkSegmentPieceSize/eestream.CalcPieceSize-8         	    5822	    189189 ns/op	    9776 B/op	       8 allocs/op
BenchmarkSegmentPieceSize/segment.PieceSize
BenchmarkSegmentPieceSize/segment.PieceSize-8              	94721329	        11.49 ns/op	       0 B/op	       0 allocs/op

Change-Id: I5a8b4237aedd1424c54ed0af448061a236b00295
2023-02-22 11:04:02 +00:00
Michal Niewrzal
06f4aede9b satellite/audit: use NodeAlias instead pure NodeID
This is first attempt to use AliasPieces inastead Pieces with
segments/range loop. So far we were always using Pieces
which are always converted from AliasPieces for easy use.
Side effect is that using NodeID with loop observers is heavy
e.g. we are using maps which behaves slower with NodeIDs.

We are starting with audit observer because it's easy to change
it as in feact it doesn't need access to real NodeID at all. We just
need to reference node in some way and this way is NodeAlias.

Results of BenchmarkRemoteSegment:
name                                         old time/op    new time/op    delta
RemoteSegment/Cockroach/multiple_segments-8    1.79µs ± 6%    0.03µs ± 4%  -98.29%  (p=0.008 n=5+5)

name                                         old alloc/op   new alloc/op   delta
RemoteSegment/Cockroach/multiple_segments-8     0.00B          0.00B          ~     (all equal)

name                                         old allocs/op  new allocs/op  delta
RemoteSegment/Cockroach/multiple_segments-8      0.00           0.00          ~     (all equal)

Change-Id: Ib7fc87e568a4d3a9af27b5e3b644ea68ab6db7aa
2023-02-21 15:31:59 +00:00
Michal Niewrzal
aba2f14595 satellite/metabase/rangedloop: few additions for monitoring
Additional elements added:
* monkit metric for observers methods like Start/Fork/Join/Finish to
be able to check how much time those methods are taking
* few more logs e.g. entries with processed range
* segmentsProcessed metric to be able to check loop progress

Change-Id: I65dd51f7f5c4bdbb4014fbf04e5b6b10bdb035ec
2023-02-17 08:46:00 +00:00
paul cannon
ab2e793555 satellite/audit: test delay before Reverify
We are supposed to wait for some amount of time after a timed-out audit
before retrying the audit on the contained node. We are also supposed to
wait for some amount of time before subsequent retries, if they are
necessary. The test added here tries to assure that those delays happen,
as far as it is possible to assure that a delay will happen in computer
code.

The previous behavior of the system was, in fact, to carry out
Reverifies as soon as a worker could retrieve the job from the
reverification queue. That's not a very major problem, as subsequent
retries do have a delay and the node does get several retries. Still, it
was not ideal, and this test exposed that mismatch with expectations, so
this commit includes a minor change to effect that pause between verify
and the first reverify.

Refs: https://github.com/storj/storj/issues/5499
Change-Id: I83bb79c166a458ba59a2db2d17c85eca43ca90f0
2023-02-15 23:16:23 +00:00
Márton Elek
8f8e97de23 satellite/metainfo: support desired node number for download object/segment
This modification introduce support of the new "desired node" field of download segment/object.

This can be used to request more nodes than the suggested minimum. It can be used to achieve better performance in exchange of using more bandwidth. (more parallel downloads).

Change-Id: Ia167d6979e6d70a597c85070a4ccd1c3a573e406
2023-02-13 13:57:48 +00:00
paul cannon
362d73f783 satellite/audit: add tests for concurrent audits
This commit introduces tests that perform multiple concurrent audits
against the same storage node, to make sure that doing so does not
create incorrect outcomes.

Refs: https://github.com/storj/storj/issues/5495
Change-Id: Iaae49e042306bfa59bdf04c1a1540667488e51e5
2023-02-09 19:58:50 +00:00
paul cannon
d6f8be1ec6 satellite/audit: add ContainmentSyncChore
We will be needing an infrequent chore to check which nodes are in the
reverify queue and synchronize that set with the 'contained' field in
the nodes db, since it is easily possible for them to get out of sync.
(We can't require that the reverification queue table be in the same
database as the nodes table, so maintaining consistency with SQL
transactions is out. Plus, even if they were in the same database, using
such SQL transactions to maintain consistency would be slow and
unwieldy.)

This commit adds the actual chore.

Refs: https://github.com/storj/storj/issues/5431
Change-Id: Id78b40bf69fae1ac39010e3b553315db8a1472bd
2023-02-07 01:18:49 +00:00
paul cannon
7652a598d8 satellite/audit: add GetAllContainedNodes method to ReverifyQueue
We will be needing an infrequent chore to check which nodes are in the
reverify queue and synchronize that set with the 'contained' field in
the nodes db, since it is easily possible for them to get out of sync.
(We can't require that the reverification queue table be in the same
database as the nodes table, so maintaining consistency with SQL
transactions is out. Plus, even if they were in the same database, using
such SQL transactions to maintain consistency would be slow and
unwieldy.)

This commit adds a method to the class representing the reverify queue
in the database, allowing us to get the list of every node that has at
least one record in the reverification queue.

Refs: https://github.com/storj/storj/issues/5431
Change-Id: Idce2633b3d63f2645170365e5cdeb2ea749fa9cb
2023-02-02 00:39:29 +00:00
Andrew Harding
590d44301c satellite/audit: implement rangedloop observer
This change implements the ranged loop observer to replace the audit
chore that builds the audit queue.

The strategy employed by this change is to use a collector for each
segment range to  build separate per-node segment reservoirs that are
then merge them during the join step.

In previous observer migrations, there were only a handful of tests so
the strategy was to duplicate them. In this package, there are dozens
of tests that utilize the chore. To reduce code churn and maintenance
burden until the chore is removed, this change introduces a helper that
runs tests under both the chore and observer, providing a pair of
functions that can be used to pause or run the queueing function.

https://github.com/storj/storj/issues/5232

Change-Id: I8bb4b4e55cf98b1aac9f26307e3a9a355cb3f506
2023-01-03 08:52:01 -07:00
paul cannon
7b851b42f7 satellite/audit: split out auditor process
This change creates a new independent process, the 'auditor', comparable
to the repairer, gc, and api processes. This will allow auditors to be
scaled independently of the core.

Refs: https://github.com/storj/storj/issues/5251
Change-Id: I8a29eeb0a6e35753dfa0eab5c1246048065d1e91
2022-12-16 12:44:32 -06:00
paul cannon
fc905a15f7 satellite/audit: newContainment->containment
Now that all the reverification changes have been made and the old code
is out of the way, this commit renames the new things back to the old
names. Mostly, this involves renaming "newContainment" to "containment"
or "NewContainment" to "Containment", but there are a few other renames
that have been promised and are carried out here.

Refs: https://github.com/storj/storj/issues/5230
Change-Id: I34e2b857ea338acbb8421cdac18b17f2974f233c
2022-12-16 17:59:52 +00:00
Andrew Harding
73d5c6944a satellite/audit: merge support for reservoirs
Change-Id: Ibbedd2a0043412210159fa2523f9e63d987276c3
2022-12-16 15:27:55 +00:00
paul cannon
0342ca1aa6 satellite/audit: delete now-unused code
Now that we are doing scalable piecewise reverifications, the code for
handling the old way of doing things (containment, pending audits,
reporting, testing) can now be removed.

Refs: https://github.com/storj/storj/issues/5230
Change-Id: Ief1a75f423eff682e8f3d57804e343b3409a6631
2022-12-16 14:53:39 +00:00
paul cannon
a66503b444 satellite/audit: Begin using piecewise reverifications
This commit pulls the big switch! We have been setting up piecewise
reverifications (the workers for which can be scaled independently of
the core) for several commits now, and this commit actually begins
making use of them.

The core of this commit is fairly small, but it requires changing the
semantics in all the tests that relate to reverifications, so it ends up
being a large change. The changes to the tests are mostly mechanical and
repetitive, though, so reviewers needn't worry much.

Refs: https://github.com/storj/storj/issues/5230
Change-Id: Ibb421cc021664fd6e0096ffdf5b402a69b2d6f18
2022-12-16 14:21:13 +00:00
Andrew Harding
93fad70e4b satellite/audit: prevent accessing unset reservoir segments
This change fixes the access of unset segments and keys on the reservoir
when the reservoir size is less than the max OR the number of sampled
segments is smaller than the reservoir size. It does so by tucking away
the segments and keys behind methods that return properly sized slices
into the segments/keys arrays.

It also fixes a bug in the housekeeping for the internal index variable
that holds onto how many items in the array have been populated. As part
of this fix, it changes the type of index to int8, which reduces the
size of the reservoir struct by 8 bytes.

The tests have been updated to provide better coverage for this case.

Change-Id: I3ceb17b692fe456fc4c1ca5d67d35c96aeb0a169
2022-12-14 17:43:17 -07:00
paul cannon
47b9134f76 satellite/audit: add IdentifyContainedNodes
This method on the Verifier allows the caller to find, out of the nodes
holding pieces in a given segment, which ones are contained.

This method is not yet being used. It will be in a future commit.

Refs: https://github.com/storj/storj/issues/5230
Change-Id: I242cd999913ca4dabbe8a62767ed4869b31fca04
2022-12-13 20:46:43 +00:00
paul cannon
c575a21509 satellite/audit: add audit.ReverifyWorker
Here we add a worker class comparable to audit.Worker, which will be
responsible for pulling items off of the reverification queue and
calling reverifier.ReverifyPiece on them.

Note that piecewise reverification audits (which this will control) are
not yet being done. That is, nothing is being added to the
reverification queue at this point.

Refs: https://github.com/storj/storj/issues/5251
Change-Id: I94e28830e27caa49f2c8bd4a2336533e187ab69c
2022-12-12 11:57:08 +00:00
paul cannon
1854351da6 satellite/audit: teach Reporter about piecewise audits
The Reporter is responsible for processing results from auditing
operations, logging the results, disqualifying nodes that reached
the maximum reverification count, and passing the results on to
the reputation system.

In this commit, we extend the Reporter so that it knows how to process
the results of piecewise reverification audits.

We also change most reporter-related tests so that reverifications
happen as piecewise reverification audits, exercising the new code.

Note that piecewise reverification audits are not yet being done outside
of tests. In a later commit, we will switch from doing segmentwise
reverifications to piecewise reverifications, as part of the
audit-scaling effort.

Refs: https://github.com/storj/storj/issues/5230
Change-Id: I9438164ce1ea4d9a1790d18d0e1046a8eb04d8e9
2022-12-12 11:28:02 +00:00
paul cannon
231c783698 satellite/audit: fix reservoir sampling bias
While researching logs from a large set of audits, I noticed that nearly
all of them had streamIDs starting with 0 or 1. This seemed very odd,
because streamIDs are supposed to be pretty much entirely random, and
every hex digit from 0-f should have been represented with roughly equal
frequency.

It turned out that our A-Chao implementation of reservoir sampling is
flawed. As far as we can tell, so is the Wikipedia implementation. No
one has yet reviewed the original 1982 paper by Dr. Chao in enough
detail to know where the error originated, but we do know that we have
been auditing segments near the beginning of the segment loop (low
streamIDs) far more often than segments near the end of the segment loop
(high streamIDs).

This change uses an algorithm Wikipedia calls "A-Res" instead, and adds
a test to check for that sort of bias creeping back in somehow. A-Res
will be slightly slower than A-Chao, because of a few extra steps that
need to be done, but it does appear to be selecting items uniformly.

Change-Id: I45eba4c522bafc729cebe2aab6f3fe65cd6336be
2022-12-09 18:16:58 -06:00
paul cannon
35f60fd5eb satellite/audit: change signature of ReverifyPiece
First, adding a logger argument allows the caller to have a logger
already set up with whatever extra fields they want here.

Secondly, we need to return the Outcome instead of a simple boolean so
that it can be passed on to the Reporter later (need to make the right
decision on increasing reputation vs decreasing it).

Thirdly, we collect the cached reputation information from the overlay
when creating the Orders, and return it from ReverifyPiece. This will
allow the Reporter to determine what reputation-status fields need to be
updated, similarly to how we include a map of ReputationStatus objects
in an audit.Report.

Refs: https://github.com/storj/storj/issues/5251
Change-Id: I5700b9ce543d18b857b81e684323b2d21c498cd8
2022-12-07 18:32:42 +00:00
paul cannon
378b8915c4 satellite/{satellitedb,audit}: add NewContainment
NewContainment will replace Containment later in this commit chain, but
for now it is not yet being used.

NewContainment will allow a node to be contained for multiple pending
reverify jobs at a time. It is implemented by way of the reverify queue.

Refs: https://github.com/storj/storj/issues/5231
Change-Id: I126eda0b3dfc4710a88fe4a5f41780618ec19101
2022-12-07 18:03:37 +00:00
paul cannon
2617925f8d satellite/audit: Split Reverifier from Verifier
Adding a new worker comparable to Verifier, called Reverifier; as the
name suggests, it will be used for reverifications, whereas Verifier
will be used for verifications.

This allows distinct logging from the two classes, plus we can add some
configuration that is specific to the Reverifier.

There is a slight modification to GetNextJob that goes along with this.

This should have no impact on operational concerns.

Refs: https://github.com/storj/storj/issues/5251
Change-Id: Ie60d2d833bc5db8660bb463dd93c764bb40fc49c
2022-12-05 09:55:49 -06:00
paul cannon
601874f79a satellite/audit: retire audit.Queues
audit.Queues was the previous method of passing stacks of segments for
audit to the verifier. As of commit 68f9ce4a, this is now happening
by way of the auditor queue (database-backed, so that communication can
happen between multiple peers). audit.Queues is no longer needed.

Refs: https://github.com/storj/storj/issues/5228
Change-Id: I46f2d48d655fb66366c92146cdb6b85aef200552
2022-12-01 13:12:43 +00:00
paul cannon
fba39b72b8 satellite/audit: add GetByNodeID to ReverifyQueue
GetByNodeID will allow querying the reverification queue to see if there
are any pending jobs for a given node ID. And thus, to see if that node
ID should be contained or not.

Some parameters on the other methods of the ReverifyQueue interface have
been changed to accept pointers; this was done ahead of the rest of the
changes for the reverification queue to better match the signatures of
the methods that these will replace once ReverifyQueue is actually being
used (meaning fewer changes to tests).

Refs: https://github.com/storj/storj/issues/5251
Change-Id: Ic38ce6d2c650702b69f1c7244a224f00a34893a1
2022-12-01 12:14:49 +00:00
paul cannon
b612ded9af satellite/audit: help performance of pushing to audit queue
The audit chore will be pushing a large number of segments to be
audited, and the db might choke on that large insert when under load.

This change divides the insert up into batches, which can be sized
however is optimal for the backing database. It also arranges for
segments to be inserted in the order of the primary key, which helps
performance on some systems.

Refs: https://github.com/storj/storj/issues/5228

Change-Id: I941f580f690d681b80c86faf4abca2995e37135d
2022-11-29 15:37:49 +00:00
paul cannon
8b494f3740 satellite/audit: use db for auditor queue
As part of the effort of splitting out the auditor workers to their own
process, we are transitioning the communication between the auditor
chore and the verification workers to a queue implemented in the
database, rather than the sequence of in-memory queues we used to use.

This logical database is safely partitionable from the rest of
satelliteDB.

Refs: https://github.com/storj/storj/issues/5251

Change-Id: I6cd31ac5265423271fbafe6127a86172c5cb53dc
2022-11-22 14:04:00 +00:00
paul cannon
1b5d953fd9 satellite/satellitedb: add queue for verifications
Since the auditor will be moving to a different process from the
metainfo loop, we need a way of communicating which segments have
been chosen for audit. This queue will be that communication, for now.

Contrast this with the queue for _re_verifications in commit 9c67f62f.

Refs: https://github.com/storj/storj/issues/5251
Change-Id: I9a269c7ef21e6c5e9c6e5e1f3db298fe159a8a79
2022-11-09 15:38:56 +00:00
paul cannon
c54c45c9c7 satellite/audit: new ReverifyPiece implementation
ReverifyPiece() is not currently hooked up to anything, but is planned
to take the place of audit.(*Verifier).Reverify().

ReverifyPiece() works by downloading one piece in its entirety, rather
than pulling an entire stripe across many nodes.

Change-Id: Ie2c680f4d3c3b65273a72466a3f9f55c115b0311
2022-10-27 16:06:21 +00:00
paul cannon
9c67f62fe3 satellite/satellitedb: add table for reverify queue
This table will be used as a queue for pieces that need to be reverified
(a regular audit timed out on the owning node, so now that node is
contained and we need to validate the piece before un-containing it).

Refs: https://github.com/storj/storj/issues/5228

Change-Id: I5dcd26b6adced8674cbd81884c1543a61ea9d4c8
2022-10-27 15:28:47 +00:00
JT Olio
58a9c55f36 mod: bump dependencies
- storj.io/common

Change-Id: Ib78154acc253a13683495abfdd96d702625fdce8
2022-10-19 17:01:53 +00:00
Egon Elbre
8b70f969b6 all: fix nolint directives
Change-Id: I261c8b12e4961e6401cc4024fa5abc35b1a5efa6
2022-10-11 18:31:20 +00:00
Michal Niewrzal
e37435602f satellite/audit: optimize loop observer
Two things were done to optimize audit observer:
* monik call was removed as we have different way to track it
* no new allocation for audit.Segment struct inside observer

Benchmark against 'main':
name                                         old time/op    new time/op    delta
RemoteSegment/Cockroach/multiple_segments-8    5.85µs ± 1%    0.74µs ± 4%   -87.28%  (p=0.008 n=5+5)

name                                         old alloc/op   new alloc/op   delta
RemoteSegment/Cockroach/multiple_segments-8    2.72kB ± 0%    0.00kB           ~     (p=0.079 n=4+5)

name                                         old allocs/op  new allocs/op  delta
RemoteSegment/Cockroach/multiple_segments-8      50.0 ± 0%       0.0       -100.00%  (p=0.008 n=5+5)

Change-Id: Ib973e48782bad4346eee1cd5aee77f0a50f69258
2022-10-02 22:24:37 +00:00
paul cannon
802ff18bd8 satellite/audit: better handling of piece fetch errors
We have an alert on `not_enough_shares_for_audit` which fires too
frequently. Every time so far, it has been because of a network blip of
some nature on the satellite side.

Satellite operators are expected to have other means in place for
alerting on network problems and fixing them, so it's not necessary for
the audit framework to act in that way.

Instead, in this change, we add three new metrics,
`audit_not_enough_nodes_online`, `audit_not_enough_shares_acquired`, and
`audit_suspected_network_problem`. When an audit fails, and emits
`not_enough_shares_for_audit`, we will now determine whether it looks
like we are having network problems (most errors are connection
failures, possibly also some successful connections which subsequently
time out) or whether something else has happened.

After this is deployed, we can remove the alert on
`not_enough_shares_for_audit` and add new alerts on
`audit_not_enough_nodes_online` and `audit_not_enough_shares_acquired`.
`audit_suspected_network_problem` does not need an alert.

Refs: https://github.com/storj/storj/issues/4669

Change-Id: Ibb256bc19d2578904f71f5229111ac98e5212fcb
2022-09-28 17:02:06 +00:00
paul cannon
7d0885bbaa satellite/repair: move over audit.Pieces
This structure is entirely unused within the audit module, and is only
used by repair code. Accordingly, this change moves the structure from
audit code to repair code.

Also, we take the opportunity here to rename the structure to something
less generic.

Refs: https://github.com/storj/storj/issues/4669

Change-Id: If85b37e08620cda1fde2afe98206293e02b5c36e
2022-09-22 16:43:03 +00:00
paul cannon
0dcc0a9ee0 satellite/reputation: reconfigure lambda and alpha
This is in response to community feedback that our existing reputation
calculation is too likely to disqualify storage nodes unfairly with
extreme swings up and down.

For details and analysis, please see the data_loss_vs_dq_chance_sim.py
tool, the "tuning reputation further.ipynb" Jupyter notebook in the
storj/datascience repository, and the discussion at

    https://forum.storj.io/t/tuning-audit-scoring/14084

In brief: changing the lambda and initial-alpha parameters in this way
causes the swings in reputation to be smaller and less likely to put a
node past the disqualification threshold unfairly.

Note: this change will cause a one-time reset of all (non-disqualified)
node reputations, because the new initial alpha value of 1000 is
dramatically different, and the disqualification threshold is going to
be much higher.

Change-Id: Id6dc4ba8fde1be3db4255b72282207bab5491ca3
2022-08-17 18:52:53 +00:00
paul cannon
37a4edbaff all: reformat comments as required by gofmt 1.19
I don't know why the go people thought this was a good idea, because
this automatic reformatting is bound to do the wrong thing sometimes,
which is very annoying. But I don't see a way to turn it off, so best to
get this change out of the way.

Change-Id: Ib5dbbca6a6f6fc944d76c9b511b8c904f796e4f3
2022-08-10 18:24:55 +00:00
Michal Niewrzal
6cc2052f47 satellite: fix segment loop observers metrics
We made optimization for segment loop observers to avoid
heavy monkit initialization on each call. It was applied to very
often executed methods. Unfortunately we used wrong monkit
method to track function times. Instead mon.Task we used
mon.Func().

https://github.com/spacemonkeygo/monkit#how-it-works

Change-Id: I9ca454dbd828c6b43ba09ca75c341991d2fd73a8
2022-08-10 14:13:16 +00:00
Egon Elbre
bc9ab8ee5e satellite/audit,storagenode/gracefulexit: fixes to limiter
Ensure we don't rely on limiter to wait multiple times.

Change-Id: I75d48420236216d4c2fc6fa99293f51f80cd9c33
2022-08-03 10:24:16 +03:00
paul cannon
2f20bbf4d8 satellite/reputation: add a reputation write cache
This should lower the amount of database load coming from
reputation updates.

Change-Id: Iaacfb81480075261da77c5cc93e08b24f69f8949
2022-07-14 21:40:16 +00:00
Egon Elbre
b8006e192b satellite/audit: use a larger delay in the test
The MinDownloadTimeout 950ms and delay of 1s were quiet close, possibly
causing flaky behavior in TestVerifierSlowDownload.

Change-Id: I4f6c1554a118b21427357642abe39986fd0af38d
2022-06-28 17:03:23 +03:00
paul cannon
737d7c7dfc satellite/reputation: new ApplyUpdates() method
The ApplyUpdates() method on the reputation.DB interface acts like the
similar Update() method, but can allow for applying the changes from
multiple audit events, instead of only one.

This will be necessary for the reputation write cache, which will batch
up changes to each node's reputation in order to flush them
periodically.

Refs: https://github.com/storj/storj/issues/4601

Change-Id: I44cc47767ea2d9423166bb8fed080c8a11182041
2022-06-07 15:22:25 +00:00