It helps for the (*reverifyQueue).Insert() method to be idempotent (it
does not make sense for the same node to be under containment for the
same piece multiple times). This change allows for that, by adding an
`ON CONFLICT DO NOTHING` clause to the database query.
Refs: https://github.com/storj/storj/issues/5231
Change-Id: Id2839ee185d5396c0bc2f84ffad610df9786f6c7
Adding a new worker comparable to Verifier, called Reverifier; as the
name suggests, it will be used for reverifications, whereas Verifier
will be used for verifications.
This allows distinct logging from the two classes, plus we can add some
configuration that is specific to the Reverifier.
There is a slight modification to GetNextJob that goes along with this.
This should have no impact on operational concerns.
Refs: https://github.com/storj/storj/issues/5251
Change-Id: Ie60d2d833bc5db8660bb463dd93c764bb40fc49c
Previously, the node events chore would select based on the earliest
created_at. However, if for some reason this batch fails, it would still
be the next item to select. If there is a consistent error, the chore
would be stuck retrying the same batch over and over. Now instead
GetNextBatch orders by `last_attempted NULLS FIRST ASC, created_at ASC`.
If a batch fails during Notify, last_attempted is updated so we can move
on to a new batch if one exists.
Change-Id: Ia8458e05ac358d85b2f2c6d690f3d607d631be61
audit.Queues was the previous method of passing stacks of segments for
audit to the verifier. As of commit 68f9ce4a, this is now happening
by way of the auditor queue (database-backed, so that communication can
happen between multiple peers). audit.Queues is no longer needed.
Refs: https://github.com/storj/storj/issues/5228
Change-Id: I46f2d48d655fb66366c92146cdb6b85aef200552
SetNodeContained() will change the contained flag in the nodes table,
which will affect whether nodes are selected for new uploads. This flag
_should_ correlate with whether or not a given node has any entries in
the reverification queue. However, the reverification queue is intended
to be 'safely partitionable' from the nodes table, so we can't enforce
that characteristic transactionally. But this is ok; there are no dire
consequences if they are out of sync.
We will be adding a chore that updates the contained flag based on the
contents of the reverification queue periodically, if something fails
to set it directly when appropriate.
Refs: https://github.com/storj/storj/issues/5231
Change-Id: I26460d8718dee63fd55d00a44568b2065fc8fe30
GetByNodeID will allow querying the reverification queue to see if there
are any pending jobs for a given node ID. And thus, to see if that node
ID should be contained or not.
Some parameters on the other methods of the ReverifyQueue interface have
been changed to accept pointers; this was done ahead of the rest of the
changes for the reverification queue to better match the signatures of
the methods that these will replace once ReverifyQueue is actually being
used (meaning fewer changes to tests).
Refs: https://github.com/storj/storj/issues/5251
Change-Id: Ic38ce6d2c650702b69f1c7244a224f00a34893a1
This change removes the error type that is returned when a token
request contains an incorrect password. Instead, the generic error
type for invalid login credentials is used.
Change-Id: Ia7dbc38f4a08aeaeeac7ff5b5a801233e349b8b3
This change reduces the token links expiry time from 24h to 30m and improves the UI to promt users of the expiration.
see: https://github.com/storj/storj-private/issues/17
Change-Id: Iac00f5740fa84069937fdf9bd30a739b6db2a9e0
The audit chore will be pushing a large number of segments to be
audited, and the db might choke on that large insert when under load.
This change divides the insert up into batches, which can be sized
however is optimal for the backing database. It also arranges for
segments to be inserted in the order of the primary key, which helps
performance on some systems.
Refs: https://github.com/storj/storj/issues/5228
Change-Id: I941f580f690d681b80c86faf4abca2995e37135d
* storj/common
* storj/private
Latests common version requires small refactoring for names and types
used by metainfo code.
Change-Id: I224fe93b4751c996ba6e846be0e5677252cf830f
Reputation updates during repair currently consumes a lot of database
resources. Sometimes increasing the rate of repair is more important
than auditing a node based on whether they have or don't have the
correct piece during repair. This is the job of the audit service.
This commit is to implement an intermediate solution from this issue: https://github.com/storj/storj/issues/5089
This commit does not address the more in-depth fix discussed here: https://github.com/storj/storj/issues/4939
Change-Id: I4163b18d78a96fadf5265789fd73c8aa8def0e9f
This change causes rate limiting errors to be returned to the client
as JSON objects rather than plain text to prevent the satellite UI from
encountering issues when trying to parse them.
Resolvesstorj/customer-issues#88
Change-Id: I11abd19068927a22f1c28d18fc99e7dad8461834
We tested new upload flow (with multiple versions) to fix inconsistency
while uploading object on QA/EUN1/SLC. Now we would like to enable it
for all satellites by default. Tests required small adjustments.
Fixes https://github.com/storj/storj/issues/5283
Change-Id: I0d53c041abebc0d182ba5a88bb1dac906c29caf0
As part of the effort of splitting out the auditor workers to their own
process, we are transitioning the communication between the auditor
chore and the verification workers to a queue implemented in the
database, rather than the sequence of in-memory queues we used to use.
This logical database is safely partitionable from the rest of
satelliteDB.
Refs: https://github.com/storj/storj/issues/5251
Change-Id: I6cd31ac5265423271fbafe6127a86172c5cb53dc
We added alternative way to calculate bucket tallies for accounting and
now it's tested and we will enable it by default.
CollectBucketTallies was extended to support overriding current time
to be able to test handling expired objects.
Change-Id: I738b99a33fd2e086245f92d874c1cbb806e834c0
Add a new chore to periodically insert nodes who are offline and
have not gotten an offline email in a certain amount of time into node
events
Change-Id: I658b385bb777b0240c98092946a93d65bee94abc
Create NodeEvents Chore on satellite core to read nodeevents DB and
notify node operators on node events. The chore sends notifications
grouped by email and event type: it selects the oldest entry in
nodeevents.DB and also any other event with the same email and event
type no matter how old it is. The oldest entry of a group must exist for
a minimum amount of time before that group can be selected, however.
This minimum amount of time is a configurable value:
--node-events.selection-wait-period. This wait period allows us to
combine events of the same time and same email address into a singular
email.
Change-Id: I8b444aa324d2dae265cc27d9e9e85faef79195d8
read one is the wrong method when trying to select one row when there
are multiple. It returns TooManyRows error. Read first is the correct
method.
Change-Id: Ic6c92795486892ac041befd118b6945314bffeaa
Add LastOfflineEmail to overlay.NodeDossier. This is the last time a
node got an offline email. Add two new overlay db methods,
GetOfflineNodesForEmail and UpdateLastOfflineEmail. Edit db method
UpdateCheckIn to nullify last_offline_email if node is up.
Change-Id: I1ee60e7d98dd1b68348a57f9a4fb77c6c9895d6d
We have code that is used only by old uplinks and can fail at some point
but we don't interrupt anything and only log message about failure.
Until now it was logged as error but it's nothing critial so we can
reduce it to warning.
As an addition log entry was extended with more information about client
that is using this backward compatibility code.
Change-Id: Ie21c673ee59eb10de065cc371132f8f9505e2220
This change causes the session inactivity timer to be enabled unless
expressly specified otherwise.
Change-Id: I85b4014394afac2feb21f383cac414cddb09ca8f
Added new feature flag.
Reworked vuex logic to work properly with project level passphrase.
Implemented new simple set project level passphrase modal.
Issue:
https://github.com/storj/storj/issues/5280
Change-Id: I6a15e90ee9fa7aa8a09c67022466787090120f9c
Currently the primary key of the underlying rollup table has the
primary key being the bucket name, but we used to sort by projectID.
This caused dead locks due to the contention during updates/inserts.
We should reevalute if bucket name being the primary key is the right
way for this table, this should stop the long running and failing attempts tho.
Change-Id: Ie7d0f86944da48ad9cbd92eb162226882a2fb954
This change modifies the method responsible for returning project
usage summaries such that the end date of the given time period
is excluded to prevent overlap.
Change-Id: If06155efff5c6fce3865f5f6e4344873abe3e432
When a node checks in and its version is below the minimum, insert
BelowMinVersion event into node events
Change-Id: I0e437ac34496778369515cbc40c15676da8b27ae
Multipart upload requires to have the same UploadID returned from
different requests (BeginUpload, ListUploads). Otherwise client won't
be able to find existing uploads. Main issue was that data needed to
construct UploadID is in System metadata which can be filtered out
by listing option.
This change is fixing how we are setting Status for listed objects and
it's forcing reading System metadata if we are reading pending objects.
Fixes https://github.com/storj/storj/issues/5298
Change-Id: I8dd5fbab4421a64dc3ed95556408ead4c829f276
Upon adding members to a project using the Add Team Member modal,
users are now notified that only email addresses belonging to an
account will receive a project invitation. This notification appears
regardless of whether every submitted email corresponds to an account.
Previously, users received an error message if any email address not
attached to an account was submitted.
Change-Id: Ia014c8311c1347e001b1c6c33de73ea61f20b0cb
We want to be able to exclude contained nodes from nodes selection. For this, we add a 'contained' column to the nodes table to track the containment status.
Fixes https://github.com/storj/storj/issues/5231
Change-Id: Id78e645f172145adcb8664646e8ebf14e218b57b
Since the auditor will be moving to a different process from the
metainfo loop, we need a way of communicating which segments have
been chosen for audit. This queue will be that communication, for now.
Contrast this with the queue for _re_verifications in commit 9c67f62f.
Refs: https://github.com/storj/storj/issues/5251
Change-Id: I9a269c7ef21e6c5e9c6e5e1f3db298fe159a8a79
Hubspot is migrating from using API keys for authentication to OAuth.
This change migrates our Hubspot integration to use OAuth tokens.
It modifies the EnqueueCreateUser code to not send empty HubspotUTK to hubspot, and to return error for failed requests.
see: https://developers.hubspot.com/changelog/upcoming-api-key-sunset
Change-Id: I422f00e3e3caeff3ff3d08ddec059502b9addaee
* Mark node events table as "safely partitionable", meaning that it
is/will not be queried relationally along with other tables. This way,
we can safely use this table in Postgres rather than CockroachDB,
where most of our other satellite tables are running.
* Add a dbx-generated delete function to the node events table, to allow
us to easily delete entries created before a provided time. This
allows us to keep the table clean, since there is no need to persist
entries after emails have been sent.
Change-Id: I25e8a5c4092fe49dcfa6c8bb73f2043646bb611f
Libuplink is using some aliases to storj package which we will
move directly to libuplink and remove from common/storj.
To make code compilable we need to fix places where we
are using aliased types directly to be able to update libuplink.
Change-Id: I7222a927af3b41e214d1c9204917f3ebce4727ce
GetObject and GetObjectIPs are invoked by the Linksharing service to
display the shared object and its map. These two endpoint currently
require read permission.
There is a use case where an object can be shared with an access grant
that has only list permission. In such a case, the expectation is that
the linksharing service would still display the metadata of the shared
object (name, size, map), but the content would be still inaccessible.
See https://github.com/storj/gateway-mt/issues/209 for details.
This change allows GetObject and GetObjectIPs to require either read or
list permission to support the described use case.
Change-Id: I3477edc7bf8990e9848482890da047094c875d09
When a customer has no pending line items, an invoice will not be
generated for them and the Stripe client method responsible for
creating new invoices returns nil. This change adds a nil check to
account for this possibility to ensure that no panics are caused
by attempted processing of the invoice.
Change-Id: Id184d027d7447f0ef876db58601ab6cf63927fc5
Some changes to make code cleaner and easier to adopt to new ranged
loop.
* removed unneeded mutex
* reorganize constructor args
* avoid creating the same redundancy scheme for each segment piece
Change-Id: I81f3f6597147fc515516949db3ce796a60b1c8a0
Pair uuid's to create ranges. Will be used to parallelize the segment
loop.
Part of https://github.com/storj/storj/issues/5223
Change-Id: I73e2fb8a2cd379b840864449b6251b48feeb7b66
Instead of sending emails at the time the node is seen to be back
online, we have decided to send the event to the node events table,
which will initiate the email sending process at some point.
Change-Id: Id756209498112579de8e78ee20ad2df54571a617
Add nodeevents.DB to satellite overlay service so we can insert node
events into the nodeevents DB.
Change-Id: I642c0ccc9941ecdb08cb22d5c8cf701959a55156
New flag 'MultipleVersions' was not correctly passed from metainfo
configuration to metabase configuration. Because configuration was
set correctly for unit tests we didn't catch it and issue was found
while testing on QA satellite.
This change reduce number of places where new metabase flags needs
to be propagated from metainfo configuration to avoid problems with
setting new flags in the future.
Fixes https://github.com/storj/storj/issues/5274
Change-Id: I74bc122649febefd87f665be2fba628f6bfd9044
since amount of objects is growing and looping through all of them
starts taking lot of time, we are switching for SQL query to do it
in chunks of tallies per bucket. 2nd part of issue fix.
Closes https://github.com/storj/team-metainfo/issues/125
Change-Id: Ia26bcac0a7e2c6503df9ebbf4817a636841d3284
Change he bloomfilter generation process to prefix the objects with a date and update the LATEST object with the prefix. The sender will read the LATEST file to get the prefix to process.
Change-Id: Iae0d3c49015d57f391d87789fb799a7d774066bf
The current deployment strategy requires that the GC bloomfilter generation process executes only once and exits.
Change-Id: I952991f126596aa165d1f2e9fce6f8548c21bdba
Implement node events DB with Insert and GetLatestByEmailAndEvent. Get
was changed to GetLatestByEmailAndEvent so we can verify items are being
inserted into the table without needing the ID, which is not available
to us in the tests.
Change-Id: I4abe63631c44774cd7e795fbab0cbab4d801db4c
Change node_events schema to use an id column as primary key rather than
node id because there can be multiple events per node id.
Change-Id: I518d8ef9ea658764876483e282a4058d3c4910f4
Add new table for node events. We can use this to notify node
operators of certain node events. Further, we can squash events for
multiple nodes with the same email into a single notification.
Change-Id: Icea6dd939df8fe4a98806bd79c014e21d239c43e
ReverifyPiece() is not currently hooked up to anything, but is planned
to take the place of audit.(*Verifier).Reverify().
ReverifyPiece() works by downloading one piece in its entirety, rather
than pulling an entire stripe across many nodes.
Change-Id: Ie2c680f4d3c3b65273a72466a3f9f55c115b0311
This table will be used as a queue for pieces that need to be reverified
(a regular audit timed out on the owning node, so now that node is
contained and we need to validate the piece before un-containing it).
Refs: https://github.com/storj/storj/issues/5228
Change-Id: I5dcd26b6adced8674cbd81884c1543a61ea9d4c8
BeginCopyObject checks twice for write permission in the destination
bucket. One check should be enough.
Change-Id: I3d5935d34f69cd48eaaf00d0117683edfdcefc05
The procedure responsible for node reputation status comparison could
return an invalid result due to comparing a status timestamp against
itself rather than comparing it against another. This results in
unnecessary database updates that could be avoided otherwise.
This change modifies the procedure to resolve this issue.
Change-Id: Id147e1942e994e8bca4ced2a9358f2474927d6ec
We had multiple experiment so far to collect high cardinality data (mainly in aggregated form).
1. we have a `/top` endpoint which aggregates events with upper bound
2. we use same api (eventstat) to publish S3 gateway-mt agents to influxdb
This patch starts to replace theses api with jtolio/eventkit. Instead of aggregation all events can be sent to a collector host where we can do aggregation and/or persisting data.
Change-Id: Id6df4882b51d2dbd2be9401ee4199d14f3ff7186
The threshold of piece deletions from the nodes during CommitObject
when overriding an existing object seemed to cause a race condition in
tests.
This change makes the threshold configurable so we can set it to maximum
so CommitObject waits until all pieces are removed from the nodes in the
test.
Change-Id: Idf6b52e71d0082a1cd87ad99a2edded6892d02a8
We removed monkit call from "object" method because it was using
too much cpu and was visible on cpu profile but we should first try
optimized version of this call.
Change-Id: Ib76d8a2968a704ce47235c6dac6edad4e40bde48
One of two parts to stop using objects loop for bucket accounting,
this method collects bucket tallies from list of bucket locations
part1 of: https://github.com/storj/team-metainfo/issues/125
Change-Id: Id2d492582453e28463cddf1245622fb7f191050c
We want to send emails to SNOs. Node status changes go through the
overlay service, so it's a good place to add the mail service.
Add the mailservice.Service, satellite address, and satellite name to
overlay service. Also add feature flag --overlay.send-node-emails
Change-Id: I3bd2cb3bf22f9724954ce2374f8b651b902b3a24
Add getSalt to projects api. Add action, GET_SALT, on Store
Projects module to make the api request and return the salt
string everywhere in the web app that generates an access grant.
The Wasm code which is used to create the access grant has been
changed to decode the salt as a base64 encoded string. The names
of the function calls in the changed Wasm code have also been
changed to ensure that access grant creation fails if JS access
grant worker code and Wasm code are not the same version.
https://github.com/storj/storj-private/issues/64
Change-Id: Ia2bc4cbadad84b066ca1882b042a3f0bb13c783a
We have new flow where existing object is deleted not on begin
object but on commit object. Deletion on commit object is still
missing deletion from storage nodes. This change adds this part
to the code.
Fixes https://github.com/storj/storj/issues/5222
Change-Id: Ibfd34665b2a055ec6c0d6e260c1a57e8a4c62b0e
We have a code to limit segments loop in case it will hit DB to hard
but so far we didn't use this loop feature in production. This is a
simple change to avoid logic responsible for rate limiting and its
monitoring if limiting is disabled (RateLimit = 0)
Change-Id: I43e07b407c6e65cf252303159d052eef250d1bea
Until this change we were stripping prefix from object key on satellite side. Because of that we were transferring over network unnecessary data
from DB. This change adjusts iterator SQL queries to use SUBSTRING to
remove prefix on DB side and avoid sending it to satellite.
Benchmark against 'main':
unfortunately "time/op" is very unstable while doing local bench in this
case and sometimes there is no difference in time and sometimes its up to 18%. I never saw results when old solution is faster then new one. Results for "alloc/op" and "allocs/op" are rather consistent.
name old time/op new time/op delta
NonRecursiveListing/Cockroach/listing_no_prefix-8 1.98ms ± 6% 2.05ms ±23% ~ (p=1.000 n=9+10)
NonRecursiveListing/Cockroach/listing_with_prefix-8 3.97ms ± 8% 3.42ms ±20% -13.86% (p=0.005 n=10+10)
NonRecursiveListing/Cockroach/listing_only_prefix-8 8.42ms ±16% 7.58ms ± 5% -9.91% (p=0.002 n=10+10)
name old alloc/op new alloc/op delta
NonRecursiveListing/Cockroach/listing_no_prefix-8 16.7kB ± 0% 16.9kB ± 0% +1.16% (p=0.000 n=10+10)
NonRecursiveListing/Cockroach/listing_with_prefix-8 27.3kB ± 0% 28.2kB ± 0% +3.31% (p=0.000 n=10+10)
NonRecursiveListing/Cockroach/listing_only_prefix-8 60.0kB ± 0% 62.4kB ± 0% +3.93% (p=0.000 n=10+8)
name old allocs/op new allocs/op delta
NonRecursiveListing/Cockroach/listing_no_prefix-8 312 ± 0% 315 ± 0% +0.96% (p=0.000 n=10+10)
NonRecursiveListing/Cockroach/listing_with_prefix-8 526 ± 0% 541 ± 0% +2.85% (p=0.000 n=10+10)
NonRecursiveListing/Cockroach/listing_only_prefix-8 1.16k ± 0% 1.23k ± 0% +5.24% (p=0.000 n=10+10)
Change-Id: I23e501494ededafb2dd5ea903e8e4e313b42e956
This change increments users' failed_login_count in the database layer to avoid potential data race.
It also updates the login_lockout_expiration as well in one operation.
see: https://github.com/storj/storj/issues/4986
Change-Id: I74624f1bee31667b269cb205d74d16e79daabcb6