Here we add a worker class comparable to audit.Worker, which will be
responsible for pulling items off of the reverification queue and
calling reverifier.ReverifyPiece on them.
Note that piecewise reverification audits (which this will control) are
not yet being done. That is, nothing is being added to the
reverification queue at this point.
Refs: https://github.com/storj/storj/issues/5251
Change-Id: I94e28830e27caa49f2c8bd4a2336533e187ab69c
The Reporter is responsible for processing results from auditing
operations, logging the results, disqualifying nodes that reached
the maximum reverification count, and passing the results on to
the reputation system.
In this commit, we extend the Reporter so that it knows how to process
the results of piecewise reverification audits.
We also change most reporter-related tests so that reverifications
happen as piecewise reverification audits, exercising the new code.
Note that piecewise reverification audits are not yet being done outside
of tests. In a later commit, we will switch from doing segmentwise
reverifications to piecewise reverifications, as part of the
audit-scaling effort.
Refs: https://github.com/storj/storj/issues/5230
Change-Id: I9438164ce1ea4d9a1790d18d0e1046a8eb04d8e9
While researching logs from a large set of audits, I noticed that nearly
all of them had streamIDs starting with 0 or 1. This seemed very odd,
because streamIDs are supposed to be pretty much entirely random, and
every hex digit from 0-f should have been represented with roughly equal
frequency.
It turned out that our A-Chao implementation of reservoir sampling is
flawed. As far as we can tell, so is the Wikipedia implementation. No
one has yet reviewed the original 1982 paper by Dr. Chao in enough
detail to know where the error originated, but we do know that we have
been auditing segments near the beginning of the segment loop (low
streamIDs) far more often than segments near the end of the segment loop
(high streamIDs).
This change uses an algorithm Wikipedia calls "A-Res" instead, and adds
a test to check for that sort of bias creeping back in somehow. A-Res
will be slightly slower than A-Chao, because of a few extra steps that
need to be done, but it does appear to be selecting items uniformly.
Change-Id: I45eba4c522bafc729cebe2aab6f3fe65cd6336be
Some observers assume that they will observe all the segments for a
given stream, and that they will observe those segments in a sequential
stream over one or more iterations.
This change updates the range provider from rangedlooptest to provide
these guarantees.
The change also removes the Mock suffix from the provider/splitter types
since the package name (rangedlooptest) implies that the type is a test
double.
Change-Id: I927c409807e305787abcde57427baac22f663eaa
We have a bug in our behavior while doing API pods deployment. At this
time its possible to have pods with multiple versions flag set true only
partially for some of pods. Because of that it's possible to start new
object without removing existing/older version on BeginObject
(new behavior) and also don't remove that existing/older object on
CommitObject. That can cause to have two committed objects with
different versions and that's a state we want to avoid.
To fix it we are removing multiple versions flag from CommitObject to
always try delete existing objects. This way even if we don't remove
existing object on BeginObject it will be always removed while
committing.
Fixes https://github.com/storj/storj/issues/5373
Change-Id: Idc334bf5cc785d2f559af96e92c3de6d82ca58ba
Add an abstraction rangedloop.SegmentProvider to fetch chunks of
segments from the metainfo database in parallel.
Part of https://github.com/storj/storj/issues/5223
Change-Id: Ife26467ea0c3be550bde0b05464ef1db62dd4d2a
Adds DeleteAllSessionsByUserIDExcept which removes all sessions except the specified session from the database and applies this function to enableMFA and disableMFA
addresses https://github.com/storj/storj-private/issues/15
Change-Id: I5d8c620dadbbda4a1b430ccf8a6121e167dd0761
Minimal implementation of the ranged (=threaded) segment loop
service, to improve performance over the existing loop.
Has tests with a an inmemory segment database
and example observer.
Does not have yet: database link, observer duration tracking,
suspicious processed ratio guard, rate limiting, minimum execution
interval per observer, etc.
Part of https://github.com/storj/storj/issues/5223
Change-Id: I08ffb392c3539e380f4e7b4f1afd56c4c394668d
This change shows STORJ token balance on the billing overview page instead of the Stripe balance it shows currently.
It changes the text on the "Available balance" card to reflect the new balance being displayed. Finally, it adds shortcuts to navigate straight to token history or add tokens modal when call to action on "Balance card"
Issue: https://github.com/storj/storj/issues/5204
Change-Id: Ic88e43c602e4949b6c6be4c7644c04f3c7d38585
To be able to verify segments in a list of buckets, this change:
- adds method ListBucketsStreamIDs to list all stream ids belonging to a list of buckets provided using a ListVerifyBucketList on which Add(projectID, bucketName) is defined.
- allows to specify a list of streamIDs to check in ListVerifySegments
Fixes https://github.com/storj/storj-private/issues/101
Change-Id: I72a48a0873a3056ac54ad56c0e9242364b2ae918
First, adding a logger argument allows the caller to have a logger
already set up with whatever extra fields they want here.
Secondly, we need to return the Outcome instead of a simple boolean so
that it can be passed on to the Reporter later (need to make the right
decision on increasing reputation vs decreasing it).
Thirdly, we collect the cached reputation information from the overlay
when creating the Orders, and return it from ReverifyPiece. This will
allow the Reporter to determine what reputation-status fields need to be
updated, similarly to how we include a map of ReputationStatus objects
in an audit.Report.
Refs: https://github.com/storj/storj/issues/5251
Change-Id: I5700b9ce543d18b857b81e684323b2d21c498cd8
NewContainment will replace Containment later in this commit chain, but
for now it is not yet being used.
NewContainment will allow a node to be contained for multiple pending
reverify jobs at a time. It is implemented by way of the reverify queue.
Refs: https://github.com/storj/storj/issues/5231
Change-Id: I126eda0b3dfc4710a88fe4a5f41780618ec19101
We have a bug where if number of buckets in the system will be
multiplication of batch size (2500) then loop that is going over
all buckets can run indefinitely.
Fixes https://github.com/storj/storj/issues/5374
Change-Id: Idd4d97c638db83e46528acb9abf223c98ad46223
Simple email validation before attempting to send notifications. If the
email is not valid, skip sending notifications and go to update
email_sent so we don't try it again. Also, move ValidateEmail function
into new package so it can be used in nodeevents without import cycle.
Change-Id: I63ce0fc84f7b1d964f7cc6da61206f54baaf1a21
It helps for the (*reverifyQueue).Insert() method to be idempotent (it
does not make sense for the same node to be under containment for the
same piece multiple times). This change allows for that, by adding an
`ON CONFLICT DO NOTHING` clause to the database query.
Refs: https://github.com/storj/storj/issues/5231
Change-Id: Id2839ee185d5396c0bc2f84ffad610df9786f6c7
Adding a new worker comparable to Verifier, called Reverifier; as the
name suggests, it will be used for reverifications, whereas Verifier
will be used for verifications.
This allows distinct logging from the two classes, plus we can add some
configuration that is specific to the Reverifier.
There is a slight modification to GetNextJob that goes along with this.
This should have no impact on operational concerns.
Refs: https://github.com/storj/storj/issues/5251
Change-Id: Ie60d2d833bc5db8660bb463dd93c764bb40fc49c
Previously, the node events chore would select based on the earliest
created_at. However, if for some reason this batch fails, it would still
be the next item to select. If there is a consistent error, the chore
would be stuck retrying the same batch over and over. Now instead
GetNextBatch orders by `last_attempted NULLS FIRST ASC, created_at ASC`.
If a batch fails during Notify, last_attempted is updated so we can move
on to a new batch if one exists.
Change-Id: Ia8458e05ac358d85b2f2c6d690f3d607d631be61
audit.Queues was the previous method of passing stacks of segments for
audit to the verifier. As of commit 68f9ce4a, this is now happening
by way of the auditor queue (database-backed, so that communication can
happen between multiple peers). audit.Queues is no longer needed.
Refs: https://github.com/storj/storj/issues/5228
Change-Id: I46f2d48d655fb66366c92146cdb6b85aef200552
SetNodeContained() will change the contained flag in the nodes table,
which will affect whether nodes are selected for new uploads. This flag
_should_ correlate with whether or not a given node has any entries in
the reverification queue. However, the reverification queue is intended
to be 'safely partitionable' from the nodes table, so we can't enforce
that characteristic transactionally. But this is ok; there are no dire
consequences if they are out of sync.
We will be adding a chore that updates the contained flag based on the
contents of the reverification queue periodically, if something fails
to set it directly when appropriate.
Refs: https://github.com/storj/storj/issues/5231
Change-Id: I26460d8718dee63fd55d00a44568b2065fc8fe30
GetByNodeID will allow querying the reverification queue to see if there
are any pending jobs for a given node ID. And thus, to see if that node
ID should be contained or not.
Some parameters on the other methods of the ReverifyQueue interface have
been changed to accept pointers; this was done ahead of the rest of the
changes for the reverification queue to better match the signatures of
the methods that these will replace once ReverifyQueue is actually being
used (meaning fewer changes to tests).
Refs: https://github.com/storj/storj/issues/5251
Change-Id: Ic38ce6d2c650702b69f1c7244a224f00a34893a1
This change removes the error type that is returned when a token
request contains an incorrect password. Instead, the generic error
type for invalid login credentials is used.
Change-Id: Ia7dbc38f4a08aeaeeac7ff5b5a801233e349b8b3
This change reduces the token links expiry time from 24h to 30m and improves the UI to promt users of the expiration.
see: https://github.com/storj/storj-private/issues/17
Change-Id: Iac00f5740fa84069937fdf9bd30a739b6db2a9e0
The audit chore will be pushing a large number of segments to be
audited, and the db might choke on that large insert when under load.
This change divides the insert up into batches, which can be sized
however is optimal for the backing database. It also arranges for
segments to be inserted in the order of the primary key, which helps
performance on some systems.
Refs: https://github.com/storj/storj/issues/5228
Change-Id: I941f580f690d681b80c86faf4abca2995e37135d
* storj/common
* storj/private
Latests common version requires small refactoring for names and types
used by metainfo code.
Change-Id: I224fe93b4751c996ba6e846be0e5677252cf830f
Reputation updates during repair currently consumes a lot of database
resources. Sometimes increasing the rate of repair is more important
than auditing a node based on whether they have or don't have the
correct piece during repair. This is the job of the audit service.
This commit is to implement an intermediate solution from this issue: https://github.com/storj/storj/issues/5089
This commit does not address the more in-depth fix discussed here: https://github.com/storj/storj/issues/4939
Change-Id: I4163b18d78a96fadf5265789fd73c8aa8def0e9f
This change causes rate limiting errors to be returned to the client
as JSON objects rather than plain text to prevent the satellite UI from
encountering issues when trying to parse them.
Resolvesstorj/customer-issues#88
Change-Id: I11abd19068927a22f1c28d18fc99e7dad8461834
We tested new upload flow (with multiple versions) to fix inconsistency
while uploading object on QA/EUN1/SLC. Now we would like to enable it
for all satellites by default. Tests required small adjustments.
Fixes https://github.com/storj/storj/issues/5283
Change-Id: I0d53c041abebc0d182ba5a88bb1dac906c29caf0
As part of the effort of splitting out the auditor workers to their own
process, we are transitioning the communication between the auditor
chore and the verification workers to a queue implemented in the
database, rather than the sequence of in-memory queues we used to use.
This logical database is safely partitionable from the rest of
satelliteDB.
Refs: https://github.com/storj/storj/issues/5251
Change-Id: I6cd31ac5265423271fbafe6127a86172c5cb53dc
We added alternative way to calculate bucket tallies for accounting and
now it's tested and we will enable it by default.
CollectBucketTallies was extended to support overriding current time
to be able to test handling expired objects.
Change-Id: I738b99a33fd2e086245f92d874c1cbb806e834c0
Add a new chore to periodically insert nodes who are offline and
have not gotten an offline email in a certain amount of time into node
events
Change-Id: I658b385bb777b0240c98092946a93d65bee94abc
Create NodeEvents Chore on satellite core to read nodeevents DB and
notify node operators on node events. The chore sends notifications
grouped by email and event type: it selects the oldest entry in
nodeevents.DB and also any other event with the same email and event
type no matter how old it is. The oldest entry of a group must exist for
a minimum amount of time before that group can be selected, however.
This minimum amount of time is a configurable value:
--node-events.selection-wait-period. This wait period allows us to
combine events of the same time and same email address into a singular
email.
Change-Id: I8b444aa324d2dae265cc27d9e9e85faef79195d8