We removed monkit call from "object" method because it was using
too much cpu and was visible on cpu profile but we should first try
optimized version of this call.
Change-Id: Ib76d8a2968a704ce47235c6dac6edad4e40bde48
One of two parts to stop using objects loop for bucket accounting,
this method collects bucket tallies from list of bucket locations
part1 of: https://github.com/storj/team-metainfo/issues/125
Change-Id: Id2d492582453e28463cddf1245622fb7f191050c
We want to send emails to SNOs. Node status changes go through the
overlay service, so it's a good place to add the mail service.
Add the mailservice.Service, satellite address, and satellite name to
overlay service. Also add feature flag --overlay.send-node-emails
Change-Id: I3bd2cb3bf22f9724954ce2374f8b651b902b3a24
Add getSalt to projects api. Add action, GET_SALT, on Store
Projects module to make the api request and return the salt
string everywhere in the web app that generates an access grant.
The Wasm code which is used to create the access grant has been
changed to decode the salt as a base64 encoded string. The names
of the function calls in the changed Wasm code have also been
changed to ensure that access grant creation fails if JS access
grant worker code and Wasm code are not the same version.
https://github.com/storj/storj-private/issues/64
Change-Id: Ia2bc4cbadad84b066ca1882b042a3f0bb13c783a
We have new flow where existing object is deleted not on begin
object but on commit object. Deletion on commit object is still
missing deletion from storage nodes. This change adds this part
to the code.
Fixes https://github.com/storj/storj/issues/5222
Change-Id: Ibfd34665b2a055ec6c0d6e260c1a57e8a4c62b0e
We have a code to limit segments loop in case it will hit DB to hard
but so far we didn't use this loop feature in production. This is a
simple change to avoid logic responsible for rate limiting and its
monitoring if limiting is disabled (RateLimit = 0)
Change-Id: I43e07b407c6e65cf252303159d052eef250d1bea
Until this change we were stripping prefix from object key on satellite side. Because of that we were transferring over network unnecessary data
from DB. This change adjusts iterator SQL queries to use SUBSTRING to
remove prefix on DB side and avoid sending it to satellite.
Benchmark against 'main':
unfortunately "time/op" is very unstable while doing local bench in this
case and sometimes there is no difference in time and sometimes its up to 18%. I never saw results when old solution is faster then new one. Results for "alloc/op" and "allocs/op" are rather consistent.
name old time/op new time/op delta
NonRecursiveListing/Cockroach/listing_no_prefix-8 1.98ms ± 6% 2.05ms ±23% ~ (p=1.000 n=9+10)
NonRecursiveListing/Cockroach/listing_with_prefix-8 3.97ms ± 8% 3.42ms ±20% -13.86% (p=0.005 n=10+10)
NonRecursiveListing/Cockroach/listing_only_prefix-8 8.42ms ±16% 7.58ms ± 5% -9.91% (p=0.002 n=10+10)
name old alloc/op new alloc/op delta
NonRecursiveListing/Cockroach/listing_no_prefix-8 16.7kB ± 0% 16.9kB ± 0% +1.16% (p=0.000 n=10+10)
NonRecursiveListing/Cockroach/listing_with_prefix-8 27.3kB ± 0% 28.2kB ± 0% +3.31% (p=0.000 n=10+10)
NonRecursiveListing/Cockroach/listing_only_prefix-8 60.0kB ± 0% 62.4kB ± 0% +3.93% (p=0.000 n=10+8)
name old allocs/op new allocs/op delta
NonRecursiveListing/Cockroach/listing_no_prefix-8 312 ± 0% 315 ± 0% +0.96% (p=0.000 n=10+10)
NonRecursiveListing/Cockroach/listing_with_prefix-8 526 ± 0% 541 ± 0% +2.85% (p=0.000 n=10+10)
NonRecursiveListing/Cockroach/listing_only_prefix-8 1.16k ± 0% 1.23k ± 0% +5.24% (p=0.000 n=10+10)
Change-Id: I23e501494ededafb2dd5ea903e8e4e313b42e956
This change increments users' failed_login_count in the database layer to avoid potential data race.
It also updates the login_lockout_expiration as well in one operation.
see: https://github.com/storj/storj/issues/4986
Change-Id: I74624f1bee31667b269cb205d74d16e79daabcb6
With this change we are switching methods to begin object, from
BeginObjectExactVersion to BeginObjectNextVersion. Main implication
is that from now it will be possible to have object with version
different than 1. New object will always get first available version.
Main reason to do this it to avoid deleting existing object during
reuploading object. Now we can create multiple pending objects but
only last committed will be available to the user. Any previous
committed object will be deleted.Because of that we moved logic to
delete existing object from BeginObject to CommitoObject request.
New logic is behind feature flat to be able to test it well first
before enablng on production.
Fixes https://github.com/storj/storj/issues/4871
Change-Id: I2dd9c7364fd93796a05ef607bda9c39a741e6a89
The flags weren't properly loading from config.
The code assumed that every node that's online for downloading also have
data uploaded to them -- which is not true.
Change-Id: Ifd65a47b9eca5b4841231928244fab17acbde6fb
This is another change to remove monkit calls from fast methods. Those
calls are visible in CPU profiles.
Change-Id: Ib3beba0dca6a6d93c3342b0994c580f78bbdd50b
After you create a brand new cluster (with storj-up, for example) the project usage fails during the first 5 minutes.
The problem is the usage of `AS OF SYSTEM TIME` which points to a time where the master database didn't exist.
In this specific case the database not found error can be ignored to avoid such messages. (if the database is really missing, we will have problems way more earlier, eg. at the login)
Change-Id: I51ee78994d91fc2a14b56646402faaaa8154c934
This patch addresses the following issues:
1. Running full migration in cockroachdb is quite slow. We already have an approach for unit tests to start from the latest snapshot. This patch makes it possible to use it for integrations tests.
2. Migration requires executing a separated command which makes it hard to run application in containerized test environments (like storj-up) or from IDE. This patch introduces a hidden flag to run migration.
3. Test user creation is painful. We do it with calling GraphQL + admin API. Providing an option with testuser makes the integration tests significant more simple (especially as the projectID -> access grant can be predictable)
Change-Id: I61010728727b490ea6aac32620f2da0484966727
Add an extra parameter to the pay-invoices command that can be used to restrict which invoices will have a payment attempted in stripe. The parameter should be of the form MM/DD/YYY and any invoices created on or after the date will have token balances applied and be processed for payment according to stripe subscriptions settings.
Change-Id: I5da5070d3ac97f45c05c02f2849254bdc44413c3
This change causes new invoices to be scheduled for automatic
advancement through Stripe if their amount due is zero. Invoices
marked for automatic advancement are exempt from the manual invoice
finalization procedure.
Change-Id: Ic583db4c86ec5243d7506d380ca3faee5e9a58d3
This change introduces the generate-invoices satellite billing
command whose functionality is equivalent to running
apply-free-coupons, prepare-invoice-records,
create-project-invoice-items, and create-invoices in order.
Invoice finalization must still be performed separately.
Change-Id: Ia3d80b95eef1f2776c38bd730ed731e42ec4c35e
Monkit calls for fast methods which are executed very frequently can
slowdown whole process. This change removes monkit calls which are not
used.
See https://review.dev.storj.io/c/storj/storj/+/8498 as an example of
speed improvement after removing monkit calls.
Change-Id: If6567d80e05b748e6393b58a5142e43013107c61
Benchmark against 'main':
name old time/op new time/op delta
RemoteSegment/Cockroach/multiple_segments-8 5.56µs ± 5% 0.69µs ±12% -87.57% (p=0.008 n=5+5)
name old alloc/op new alloc/op delta
RemoteSegment/Cockroach/multiple_segments-8 2.72kB ± 0% 0.00kB ~ (p=0.079 n=4+5)
name old allocs/op new allocs/op delta
RemoteSegment/Cockroach/multiple_segments-8 50.0 ± 0% 0.0 -100.00% (p=0.008 n=5+5)
Change-Id: I20527fb576cd81db667a81929fa95b810ee11b14
It looks that monikt monitoring can give high CPU overhead for
segments loop observer. With this code we are changing how monitoring
is initialized for observer methods. This optimization affects mainly
path where segment is healthy and doesn't require repair. Benchmark
is also added to show difference between old and new approach.
Benchmark against 'main':
name old time/op new time/op delta
RemoteSegment/Cockroach/healthy_segment-8 8.55µs ± 4% 1.37µs ± 6% -84.03% (p=0.008 n=5+5)
name old alloc/op new alloc/op delta
RemoteSegment/Cockroach/healthy_segment-8 2.63kB ± 0% 0.17kB ± 0% -93.62% (p=0.008 n=5+5)
name old allocs/op new allocs/op delta
RemoteSegment/Cockroach/healthy_segment-8 54.0 ± 0% 8.0 ± 0% -85.19% (p=0.008 n=5+5)
Change-Id: Ie138eab0d59e436395b13f57bdfb11f9871d4c18
Two things were done to optimize audit observer:
* monik call was removed as we have different way to track it
* no new allocation for audit.Segment struct inside observer
Benchmark against 'main':
name old time/op new time/op delta
RemoteSegment/Cockroach/multiple_segments-8 5.85µs ± 1% 0.74µs ± 4% -87.28% (p=0.008 n=5+5)
name old alloc/op new alloc/op delta
RemoteSegment/Cockroach/multiple_segments-8 2.72kB ± 0% 0.00kB ~ (p=0.079 n=4+5)
name old allocs/op new allocs/op delta
RemoteSegment/Cockroach/multiple_segments-8 50.0 ± 0% 0.0 -100.00% (p=0.008 n=5+5)
Change-Id: Ib973e48782bad4346eee1cd5aee77f0a50f69258
If we need to restore the satelliteDB from a backup, we must preserve the user - storj token wallet address association. This commit adds a log statement of this information after a user successfully claims a wallet. We can perform a SQL update to reassign the wallet address to the user if needed.
Change-Id: Ia5c25d7ac57e59b35865d74068196e42bc4ffe87
We have an alert on `not_enough_shares_for_audit` which fires too
frequently. Every time so far, it has been because of a network blip of
some nature on the satellite side.
Satellite operators are expected to have other means in place for
alerting on network problems and fixing them, so it's not necessary for
the audit framework to act in that way.
Instead, in this change, we add three new metrics,
`audit_not_enough_nodes_online`, `audit_not_enough_shares_acquired`, and
`audit_suspected_network_problem`. When an audit fails, and emits
`not_enough_shares_for_audit`, we will now determine whether it looks
like we are having network problems (most errors are connection
failures, possibly also some successful connections which subsequently
time out) or whether something else has happened.
After this is deployed, we can remove the alert on
`not_enough_shares_for_audit` and add new alerts on
`audit_not_enough_nodes_online` and `audit_not_enough_shares_acquired`.
`audit_suspected_network_problem` does not need an alert.
Refs: https://github.com/storj/storj/issues/4669
Change-Id: Ibb256bc19d2578904f71f5229111ac98e5212fcb
While investigating high memory consumption on a Redis instance, it was found that much of the memory consumed was from cached redis scripts.
Redis caches scripts based on the hash of the script itself. This change uses ARGV to reduce the number of cached scripts.
Change-Id: Ia878fe81552a3067f09e60c44bb4ace25c6b5f9a
When doing server-side copy, deletes the committed version of the target location if it already exists. It does not touch pending versions. The version of the copy is set to the highest already existing version + 1.
Fixes: https://github.com/storj/storj/issues/5071
Change-Id: I1d91ac17054834b1f4f0970a9fa5d58198c58a37
Change the default loop interval for querying for new payments and adding them into the billing table from 1 minute to 15 seconds.
Change-Id: I26cf4a764cbe1de4c9b839ad60352374d8231522
Introduces a new endpoint on the satellite web server to get the
project's salt. The endpoint utilizes a new console service method
GetSalt which in turn calls the project DB GetSalt method if the
user is authorized. It returns the project salt bytes as a base64
encoded string in the response.
Change-Id: Ia13b5a4b8580e7bdad0dbb98014a276b1c74b46d
Restore functionality where retain filters can be sent out to multiple
storage nodes simultaneously.
Fixes https://github.com/storj/team-metainfo/issues/121
Change-Id: I2bf86a166b09c6a277c1cb455cdca0165ce6b8af
Add new project db method, GetSalt, to get project salt. If salt
column is empty, return the sha-256 hash of the project ID. This
new method is used in metainfo endpoint ProjectInfo to return the
project salt to the client. This is backwards compatible because
the salt column is not populated yet. The updated endpoint will
do the same thing as the current endpoint.
Change-Id: I7eba376c865e10995a5a916302feca7cd7c7efa2
Assert that listing works with our new object consistency approach.
Half of https://github.com/storj/storj/issues/4868
Change-Id: I5e92f86122b50103cec7bf6d3b2c8ed103caceec
Restore previously existing end-to-end garbage collection test using
the new separate services for bloom filter generation and storage node
communication.
Original tests can be found under:
https://github.com/storj/storj/blob/v1.63.1/satellite/gc/gc_test.go
Change-Id: I42d1ab0f9981dfe183140da4d08087f4a6cd9296
Return the balance as currency object with a value and currency. The values are returned in USDollarsMicro (6 digits after the decimal).
Change-Id: I88c87faf3311b72dedd293d4e754c2fd5c03c128
Change the default number of required block confirmations for a payment to be confirmed from 12 to 15.
Change-Id: I44c258134c293e7691623bc00c504130aa69a96a
We have an alert on `repair_too_many_nodes_failed` which fires too
frequently. Every time so far, it has been because of a network blip of
some nature on the satellite side.
Satellite operators are expected to have other means in place for
alerting on network problems and fixing them, so it's not necessary for
the repair framework to act in that way.
Instead, in this change, we change the way that
`repair_too_many_nodes_failed` works. When a repair fails, we collect
piece fetch errors by type and determine from them whether it looks like
we are having network problems (most errors are connection failures,
possibly also some successful connections which subsequently time out)
or whether something else has happened.
We will now only emit `repair_too_many_nodes_failed` when the outcome
does not look like a network failure. In the network failure case, we
will instead emit `repair_suspected_network_problem`.
Refs: https://github.com/storj/storj/issues/4669
Change-Id: I49df98da5df9c606b95ad08a2bdfec8092fba926
This structure is entirely unused within the audit module, and is only
used by repair code. Accordingly, this change moves the structure from
audit code to repair code.
Also, we take the opportunity here to rename the structure to something
less generic.
Refs: https://github.com/storj/storj/issues/4669
Change-Id: If85b37e08620cda1fde2afe98206293e02b5c36e