Commit Graph

1462 Commits

Author SHA1 Message Date
Ethan
9a29ec5b3e Add index to graceful_exit_transfer_queue table
This fixes a slow query that was taking up to 4 seconds in production

SELECT node_id, path, piece_num, root_piece_id, durability_ratio, queued_at, requested_at, last_failed_at, last_failed_code, failed_count, finished_at, order_limit_send_count
	FROM graceful_exit_transfer_queue
	WHERE node_id = '[redacted]'
	AND finished_at is NULL
	AND last_failed_at is NULL
	ORDER BY durability_ratio asc, queued_at asc LIMIT 300 OFFSET 0;

Change-Id: Ib89743ca35f1d8d0a1456b20fa08c683ebdc1549
2020-10-26 14:47:48 +00:00
Ivan Fraixedes
4b61ca638b
satellite/console/consoleweb/consoleapi: Fix & add test DeleteAccount
Fix the DeleteAccount handler to return 501 HTTP status code because
it's what corresponds for a "Not Implemented" status.

Add a black box test for the DeleteAccount to ensure that always return
an error response because, at this time, we don't allow to delete
accounts through the API.

This test was not added to the corresponding commit
https://review.dev.storj.io/c/storj/storj/+/2712 due to the rush to
fix it.

Change-Id: Ibcf09e2ec52f182a8a580d606c457328d94c8b60
2020-10-23 09:14:50 +02:00
paul cannon
76d4977b6a storagenode/gracefulexit: logic moved from worker to service
Change-Id: I8b12606a96b712050bf40d587664fb1b2c578fbc
2020-10-22 23:19:30 +00:00
Ivan Fraixedes
9abdcc05e5 satellite/console/consoleweb/consoleapi: report err to monkit
Report the "Not Implemented" error response returned by DeleteAccount
API handler to monkit.

Change-Id: I17e319639c458cbe803b65b5a34111b8f74daece
2020-10-22 17:07:13 +00:00
Ivan Fraixedes
46b12c96bd satellite/console/consoleweb/consoleql: Fix typo
Fix a typo in the GraphQL mutation testing function.

Change-Id: I1c474795bfbaa3151b04cb768dfc506e654557ab
2020-10-22 13:30:20 +00:00
Kaloyan Raev
1f386db566
cmd/satellite: remove metainfo commands (#3955) 2020-10-22 13:33:09 +03:00
Kaloyan Raev
1aeb14e65e satellite/audit: do not delete expired segments
A year ago we made the audit service deleting expired segments.
Meanwhile, we introduced an expired deletetion sub-service in the
metainfo service which sole purpose is deleting expired segments.

Therefore, now we are removing this responsibility from the audit
service. It will continue to avoid reporting failures on expired
segments, but it would not delete them anymore.

We do this to cleanup responsibilities in advance of the metainfo
refactoring.

Change-Id: Id7aab2126f9289dbb5b0bdf7331ba7a3328730e4
2020-10-22 08:24:16 +00:00
Jessica Grebenschikov
89bdb20a62 storagenodedb/orders: select unsent satellite with expiration
In production we are seeing ~115 storage nodes (out of ~6,500) are not using the new SettlementWithWindow endpoint (but they are upgraded to > v1.12).

We analyzed data being reported by monkit for the nodes who were above version 1.11 but were not successfully submitting orders to the new endpoint.
The nodes fell into a few categories:
1. Always fail to list orders from the db; never get to try sending orders from the filestore
2. Successfully list/send orders from the db; never get to calling satellite endpoint for submitting filestore orders
3. Successfully list/send orders from the db; successfully list filestore orders, but satellite endpoint fails (with "unauthenticated" drpc error)

The code change here add the following to address these issues:
- modify the query for ordersDB.listUnsentBySatellite so that we no longer select expired orders from the unsent_orders table
- always process any orders that are in the ordersDB and also any orders stored in the filestore
- add monkit monitoring to filestore.ListUnsentBySatellite so that we can see the failures/successes

Change-Id: I0b473e5d75252e7ab5fa6b5c204ed260ab5094ec
2020-10-21 15:02:23 +00:00
paul cannon
360ab17869 satellite/audit: use LastIPAndPort preferentially
This preserves the last_ip_and_port field from node lookups through
CreateAuditOrderLimits() and CreateAuditOrderLimit(), so that later
calls to (*Verifier).GetShare() can try to use that IP and port. If a
connection to the given IP and port cannot be made, or the connection
cannot be verified and secured with the target node identity, an
attempt is made to connect to the original node address instead.

A similar change is not necessary to the other Create*OrderLimits
functions, because they already replace node addresses with the cached
IP and port as appropriate. We might want to consider making a similar
change to CreateGetRepairOrderLimits(), though.

The audit situation is unique because the ramifications are especially
powerful when we get the address wrong. Failing a single audit can have
a heavy cost to a storage node. We need to make extra effort in order
to avoid imposing that cost unfairly.

Situation 1: If an audit fails because the repair worker failed to make
a DNS query (which might well be the fault on the satellite side), and
we have last_ip_and_port information available for the target node, it
would be unfair not to try connecting to that last_ip_and_port address.

Situation 2: If a node has changed addresses recently and the operator
correctly changed its DNS entry, but we don't bother querying DNS, it
would be unfair to penalize the node for our failure to connect to it.

So the audit worker must try both last_ip_and_port _and_ the node
address as supplied by the SNO.

We elect here to try last_ip_and_port first, on the grounds that (a) it
is expected to work in the large majority of cases, and (b) there
should not be any security concerns with connecting to an out-or-date
address, and (c) avoiding DNS queries on the satellite side helps
alleviate satellite operational load.

Change-Id: I9bf6c6c79866d879adecac6144a6c346f4f61200
2020-10-21 13:34:40 +00:00
Ivan Fraixedes
979ee762ba
satellite/console/consoleweb: Fix typo in method name
Fix a typo in the graphQL handler method name.

Change-Id: I038c7783073f7bed95353f56a8a24520c724a5b6
2020-10-21 11:58:37 +02:00
Stefan Benten
334ae5b164 satellite/admin: add apikey endpoints
This change allows the creation and deletion of api keys via the admin API.
It adds two methods for deletion, one via the name and projectID and the
second one via the serialized apikey directly.

Change-Id: Ida8aa729e716db58c671a901e5f7e39253e89a0d
2020-10-20 11:26:56 +00:00
Moby von Briesen
7c3afe164b satellite/overlay: uncomment dq for offline and disable with feature flag
Change-Id: Ib39e2be32e880b822a94eddfb81af99a38843a27
2020-10-16 12:55:16 +00:00
Jessica Grebenschikov
205c39d404 satellite/orders: upgrade to phase 2 rollout ordersWithWindow
We are moving an error into rejectErr since its preventing storage nodes from being able to settle other orders.

Change-Id: I3ac97c340e491b127f5e0024c5e8bd9f4df8d5c3
2020-10-15 21:20:19 +00:00
Yaroslav Vorobiov
139a7ee959 private/migrate: add ablity to create dbs during migration
Use tagsql.DB pointer as step database, to propagate changes
back and forth between actual database and migration.
Adds CreateDB operation to the migration step to be able to
create new dbs before executing migration action.
Adjusts storagenode database migration to use inner tagsql.DB
pointer of each database as step.DB.
Adjusts satellite dabase migration, adds proxy migrationDB field
to satellite db that wraps itself as tagsql.DB, pointer of which
is used as step.DB.

Change-Id: Ifed4de5b01a356cf7b37db64d2eaeb7b61982c5c
2020-10-15 15:28:04 +03:00
Stefan Benten
0b43b93259 satellite/satellitedb: make limits per default NULL
This change completes the column migration of
5f6fccc6e8 and
2f648fd981.
It resets every users project limits who are below or equal to our
current production defaults.

Change-Id: Ie041d08bb67b62844f6023190fc00bc2dad5b1cb
2020-10-14 20:28:16 +00:00
VitaliiShpital
59d85aab5b web/satellite: take project amount limit from db instead of config
WHAT:
Now project amount limit is taken from users db instead of config. But if db value is 0 then default config value will be used instead.

WHY:
this will allow us to change user's project limit by changing db value.

Change-Id: I9edcd0bf9eaae5fe40e90a44cac82d9ce8519274
2020-10-14 14:17:45 +00:00
Kaloyan Raev
830817ec0d cmd/storj-sim: run gateway without --access flag
This makes it possible to remove of this obsolete flag from the
multi-tenant gateway.

As a consequence, displaying the GATEWAY_0_ACCESS env var will always
require a running storj-sim. Until now, it was required only the first
time. Then the value was stored in the 'access' config. But this is now
not possible anymore.

The changes in StripeMock are required to fix failures in integration
tests. StripeMock is in-memory and its data does not survive restarts of
storj-sim. The second and following starts of storj-sim had invalid
state of StripeMock, which failed requests that were required to
populate the GATEWAY_0_ACCESS env var. The changes in StripeMock makes
it repopulate the Stripe customers from the database.

Change-Id: I981a208172b76577f12ecdaae485f5ae4ea269bc
2020-10-13 14:45:04 +00:00
Egon Elbre
2268cc1df3 all: fix linter complaints
Change-Id: Ia01404dbb6bdd19a146fa10ff7302e08f87a8c95
2020-10-13 15:59:01 +03:00
Egon Elbre
0bdb952269 all: use keyed special comment
Change-Id: I57f6af053382c638026b64c5ff77b169bd3c6c8b
2020-10-13 15:13:41 +03:00
Stefan Benten
1d3b728766 satellite/{console/payments/satellitedb}: add validation for deletion of account and project
The same was that our Admin API handles project and account deletions currently, we would like
to have the same checks on the user-facing API. This PR adds the same checks to the console service.
General more applicable checks have been moved directly into the payments service.

In addition it adds the BucketsDB to the console DB, to have easier access and avoiding import cycles with
the metainfo package.

A small cleanup around our unnecessary monkit imports made it in as well.

Change-Id: I8769b01c2271c1687fbd2269a738a41764216e51
2020-10-13 07:55:26 +00:00
Jeff Wendling
4cbd4d52a9 satellite/orders: only hold the orders semaphore during database calls
holding it during node i/o means slow nodes can hold up order
processing for everyone else. this dramatically increases
the amount of tiem spent handling orders.

Change-Id: Iec999b7ed0817c921a0fd039097a75bdd3c70ea2
2020-10-10 15:40:50 -04:00
Jeff Wendling
0f0faf0a9f satellite/orders: do a better job limiting concurrent requests
Doing it at the ProcessOrders level was insufficient: the endpoints
make multiple database calls. It was a misguided attempt to only
have one spot enter the semaphore. By putting it in the endpoint
we can not only be sure that the concurrency is correctly limited
but it can be configurable easily.

Change-Id: I937149dd077adf9eb87fce52a1a17dc0afe96f64
2020-10-09 16:27:15 -04:00
Jeff Wendling
7c303208ff satellite/satellitedb: emergency temporary order processing semaphore
we have thundering herds of order submissions that take all of the
database connections causing temporary periodic outages. limit
the amount of concurrent order processing to 2.

Change-Id: If3f86cdbd21085a4414c2ff17d9ef6d8839a6c2b
2020-10-08 19:16:47 +00:00
Stefan Benten
b3cf12f567 satellite/console: Add more validation for console requests
Adds membership checks for the following calls:
- GetProject

Add ownership checks for the following calls:
- DeleteProject

It also disables the API endpoint to delete a project.

Furthermore it adds tests for the console service.

Change-Id: I1ffc8dcb44746a74ad06a7dbd064a29c57c25272
2020-10-07 15:33:28 +00:00
Kaloyan Raev
e7f2ec7ddf satellite/audit: fix sanity check for verify-piece-hashes command
The VerifyPieceHashes method has a sanity check for the number pieces to
be removed from the pointer after the audit for verifying the piece
hashes.

This sanity check failed when we executed the command on the production
satellites because the Verify command removes Fails and PendingAudits
nodes from the audit report if piece_hashes_verified = false.

A new temporary UsedToVerifyPieceHashes flag is added to
audits.Verifier. It is set to true only by the verify-piece-hashes
command. If the flag is true then the Verify method will always include
Fails and PendingAudits nodes in the report.

Test case is added to cover this use case.

Change-Id: I2c7cb6b12029d52b2fc565365eee0826c3de6ee8
2020-10-07 17:17:48 +03:00
Kaloyan Raev
4280142b24 satellite/console: remove unnecessary Error.Wrap
Change-Id: If851ccce7932cbf72c2fff3b51f4f9f2ea07c124
2020-10-07 09:22:41 +00:00
Stefan Benten
14a2050b8d pkg/auth: move package to consoleauth
To avoid further name collisions, the very broad named package gets moved into
the consoleauth package where its also mainly being used.

Change-Id: Ie563c9700adbf0553baca2b7b8ba4a1d9c29d144
2020-10-06 14:15:07 +02:00
Stefan Benten
44bd65795b satellite/console: ensure only project members can remove other project members
Change-Id: I815eb85f37631aaa65b5dc4cafa6851f241ca0f0
2020-10-06 11:03:12 +00:00
Stefan Benten
9deea2ffe2 satellite/console: disable account deletion via API
Change-Id: Ia8e43284c90fb2b833eb601e2c8f701cb5a4d9c0
2020-10-06 13:01:46 +02:00
Stefan Benten
0aaad88a44 satellite/{admin, console}: add test for projectLimit increase and update README
This change adds the capabilities to adjust the users project limit via the Admin API.
Adds a test for the new added function of the API and updates the existing tests.
It renames the json field on the user struct to be more consistent.

Change-Id: I9018acd80dae0af68d1d50526f20987132c654f3
2020-10-05 11:54:37 +00:00
Cameron Ayer
b39a99bae6 satellite/{overlay,satellitedb}: always show node's real online score
Previously if a node did not have audit history data for each of the
windows over the tracking period, we would give them the benefit of
the doubt and set their score to 1. This was to prevent nodes from
being suspended right out the gate. We need a minimum amount of data
to evaluate them.

However, a node who is actually failing at being online will have no
idea until they have received enough audits and we suspend them.

Instead, we will always use their real score, but use a flag to determine
whether they are eligible for suspension/dq.

Change-Id: I382218f12e8770f95d4bcddcf101ef348940cadf
2020-10-02 12:28:11 -04:00
Yingrong Zhao
c085a17a52 bump common and uplink to latest
Change-Id: I717f0214dd9973acd51b7732c5d64587f610c805
2020-10-01 15:38:58 +00:00
Moby von Briesen
000b1e6011 satellite/admin/project_test.go: Update TestCheckUsageLastMonthUnappliedInvoice
Test is failing because the mock date used in it is September 2020.
Update to 2030.

Change-Id: I4733487d0589dbeb0d2f3dd2abd88f767d59d0ca
2020-09-30 00:10:02 -04:00
Cameron Ayer
c2525ba2b5 satellite/{repair,satellitedb}: clean up healthy segments from repair queue at end of checker iteration
Repair workers prioritize the most unhealthy segments. This has the consequence that when we
finally begin to reach the end of the queue, a good portion of the remaining segments are
healthy again as their nodes have come back online. This makes it appear that there are more
injured segments than there actually are.

solution:
Any time the checker observes an injured segment it inserts it into the repair queue or
updates it if it already exists. Therefore, we can determine which segments are no longer
injured if they were not inserted or updated by the last checker iteration. To do this we
add a new column to the injured segments table, updated_at, which is set to the current time
when a segment is inserted or updated. At the end of the checker iteration, we can delete any
items where updated_at < checker start.

Change-Id: I76a98487a4a845fab2fbc677638a732a95057a94
2020-09-29 20:38:22 +00:00
Egon Elbre
c23a8e3b81 go.mod: update pgx to v4.9.0
Fix query to use TextArray instead of VarcharArray.
Fix queries to use the correct type.

Change-Id: Ibb7e55adba277d05778118d81ca697470e72c374
2020-09-29 19:03:08 +00:00
Brandon Iglesias
8783c2024a
satellite/rewards: add partner videocoin (#3946) 2020-09-29 14:30:46 +02:00
Kaloyan Raev
b409b53f7f cmd/satellite: command for verifying piece hashes
Jira: https://storjlabs.atlassian.net/browse/PG-69

There are a number of segments with piece_hashes_verified = false in
their metadata) on US-Central-1, Europe-West-1, and Asia-East-1
satellites. Most probably, this happened due to a bug we had in the
past. We want to verify them before executing the main migration to
metabase. This would simplify the main migration to metabase with one
less issue to think about.

Change-Id: I8831af1a254c560d45bb87d7104e49abd8242236
2020-09-29 10:58:24 +00:00
Egon Elbre
d508c4c985 scripts: remove duplicate check config lock
We have configlock_test for checking changes so the separate script and
makefile target are not needed.

Change-Id: I2bbc1c21ad849c9b7ec8bba43c0e11e94e04f6a6
2020-09-29 10:03:34 +00:00
Egon Elbre
2d27bc8787 satellite/satellitedb: separate cockroach for migration tests
Currently Cockroach migration test is the most heavy with regards to
schema changes. This causes other tests to time out. This adds an
alternate cockroach instance that is used for migration tests.

Change-Id: I01fe9313527ff002f0bb0914dd52c3645b8eaf6d
2020-09-29 09:31:33 +00:00
Stefan Benten
79eb682f9c satellite/console: allow coupons to be a valid payment option
Currently a user is only able to create a project if either
a STORJ deposit or CC was added to his account. With this change, an existing
coupon is also valid to let the user proceed.

Change-Id: I7be8d2d9ec58a15c50755b3fe33af04d2fd64ea2
2020-09-28 21:24:04 +00:00
Jessica Grebenschikov
4a2c66fa06 satellite/accounting: add cache for getting project storage and bw limits
This PR adds the following items:
1) an in-memory read-only cache thats stores project limit info for projectIDs

This cache is stored in-memory since this is expected to be a small amount of data. In this implementation we are only storing in the cache projects that have been accessed. Currently for the largest Satellite (eu-west) there is about 4500 total projects. So storing the storage limit (int64) and the bandwidth limit (int64), this would end up being about 200kb (including the 32 byte project ID) if all 4500 projectIDs were in the cache. So this all fits in memory for the time being. At some point it may not as usage grows, but that seems years out.

The cache is a read only cache. When requests come in to upload/download a file, we will read from the cache what the current limits are for that project. If the cache does not contain the projectID, it will get the info from the database (satellitedb project table), then add it to the cache.

The only time the values in the cache are modified is when either a) the project ID is not in the cache, or b) the item in the cache has expired (default 10mins), then the data gets refreshed out of the database. This occurs by default every 10 mins. This means that if we update the usage limits in the database, that change might not show up in the cache for 10 mins which mean it will not be reflected to limit end users uploading/downloading files for that time period..

Change-Id: I3fd7056cf963676009834fcbcf9c4a0922ca4a8f
2020-09-25 16:28:49 +00:00
Stefan Benten
9d0d0ad728 satellite/console: enable multiple projects all users
Change-Id: I42cc9f48cac387e1a67d21c1dd394f28cc5ff399
2020-09-23 16:18:28 +00:00
Stefan Benten
38108828ac satellite/satellitedb: enable multiple projects existing users
Change-Id: I2ef77182d5464d72574698c8abfbbfdbda3f5a9e
2020-09-23 18:17:38 +02:00
Stefan Benten
5f6fccc6e8 satellite/satellitedb: makes limits nullable change backwards compatible
Our current endpoints bail on us, if the column data is null. Thus we need
to take the intermediate step and set the default to a fixed value and
reset those with the following release.

It sets the default column value to our current config values of 50GB
for storage and bandwidth and 100 buckets, while still enabling the field to be nullable.
All 0 values are migrated to be the default as well to ensure they can
keep using their projects, as with the original change, 0 actually means 0.

Change-Id: I797be80ce2d2105091599dc1b3fc76f74336b66b
2020-09-23 17:54:42 +02:00
Stefan Benten
2f648fd981 satellite: make limits be nullable
Currently we have no way to actually set one
of the following limits to 0 (meaning not usable):

- maxBuckets
- usageLimit
- bandwidthLimit

With having the field nullable,
NULL corresponds to the global default,
0 now actually 0 and
a set value determines a custom limit.

Change-Id: I92bb77529dcbd0881ae8368921be9d246eb0919e
2020-09-21 19:34:19 +00:00
Kaloyan Raev
34613e4588 cmd/satellite: command for fixing old-style objects
Jira: https://storjlabs.atlassian.net/browse/PG-67

There are a number of old-style objects (without the number of segments
in their metadata) on US-Central-1, Europe-West-1, and Asia-East-1
satellites. We want to migrate their metadata to contain the number of
segments before executing the main migration to metabase. This would
simplify the main migration to metabase with one less issue to think
about.

Change-Id: I42497ae0375b5eb972aab08c700048b9a93bb18f
2020-09-21 14:46:19 +00:00
JT Olio
f46161cf20 consoleweb: log index template failures
Change-Id: I286ded309fed6198f1c450a2a31df36b6a015551
2020-09-16 16:26:14 +00:00
Qweder93
8182fdad0b storagenode: heldamount renamed to payouts, renamed some methods and structs to more meaningful names. grouped estimated payout with pathouts
satellite: heldamount renamed to SNOpayouts.

Change-Id: I244b4d2454e0621f4b8e22d3c0d3e602c0bbcb02
2020-09-16 14:57:35 +00:00
Michal Niewrzal
d199daa907 satellite/metainfo: replace local constants with metabase values
Change-Id: Ie855c10dd259464658952186251b7d3210eb49ce
2020-09-16 13:25:14 +02:00
VitaliiShpital
7d5e0259f6 satellite/projects: initial update project name functionality implemented
WHAT:
added functionality for user to update project name. Logic only, without actual GUI updates.

WHY:
better user experience

Change-Id: I1e38e33ba827b0bdf2c89e29de24e4e87edb474a
2020-09-15 12:21:56 +03:00