Commit Graph

5138 Commits

Author SHA1 Message Date
Ivan Fraixedes
9abdcc05e5 satellite/console/consoleweb/consoleapi: report err to monkit
Report the "Not Implemented" error response returned by DeleteAccount
API handler to monkit.

Change-Id: I17e319639c458cbe803b65b5a34111b8f74daece
2020-10-22 17:07:13 +00:00
Yingrong Zhao
746cbfc659 scripts/tests/rollingupgrade: test current release version on master
branch

Currently, we test upgrading from the previous release version to the latest
master on each master build.
However, this behavior is only desired when the test is running on a
release branch.

Change-Id: Iaeb66f44951c9e4934ca3c8316d1e490d7958239
2020-10-22 11:45:54 -04:00
NickolaiYurchenko
d6b9563e56 web/satellite: disposed removed from historical gross total, total+surge calculation changed
Change-Id: If69c251bd12e0a2141ea0061353ddcc7ee618aaf
2020-10-22 17:06:11 +03:00
Ivan Fraixedes
46b12c96bd satellite/console/consoleweb/consoleql: Fix typo
Fix a typo in the GraphQL mutation testing function.

Change-Id: I1c474795bfbaa3151b04cb768dfc506e654557ab
2020-10-22 13:30:20 +00:00
Kaloyan Raev
1f386db566 cmd/satellite: remove metainfo commands (#3955)
2020-10-22 13:33:09 +03:00
Kaloyan Raev
1aeb14e65e satellite/audit: do not delete expired segments
A year ago we made the audit service delete expired segments.
Meanwhile, we introduced an expired deletion sub-service in the
metainfo service whose sole purpose is deleting expired segments.

Therefore, now we are removing this responsibility from the audit
service. It will continue to avoid reporting failures on expired
segments, but it will not delete them anymore.

We do this to clean up responsibilities in advance of the metainfo
refactoring.

Change-Id: Id7aab2126f9289dbb5b0bdf7331ba7a3328730e4
2020-10-22 08:24:16 +00:00
Jessica Grebenschikov
89bdb20a62 storagenodedb/orders: select unsent satellite with expiration
In production we are seeing that ~115 storage nodes (out of ~6,500) are not using the new SettlementWithWindow endpoint (even though they are upgraded to > v1.12).

We analyzed data being reported by monkit for the nodes who were above version 1.11 but were not successfully submitting orders to the new endpoint.
The nodes fell into a few categories:
1. Always fail to list orders from the db; never get to try sending orders from the filestore
2. Successfully list/send orders from the db; never get to calling satellite endpoint for submitting filestore orders
3. Successfully list/send orders from the db; successfully list filestore orders, but satellite endpoint fails (with "unauthenticated" drpc error)

The code change here adds the following to address these issues:
- modify the query for ordersDB.listUnsentBySatellite so that we no longer select expired orders from the unsent_orders table
- always process any orders that are in the ordersDB and also any orders stored in the filestore
- add monkit monitoring to filestore.ListUnsentBySatellite so that we can see the failures/successes

Change-Id: I0b473e5d75252e7ab5fa6b5c204ed260ab5094ec
2020-10-21 15:02:23 +00:00
paul cannon
360ab17869 satellite/audit: use LastIPAndPort preferentially
This preserves the last_ip_and_port field from node lookups through
CreateAuditOrderLimits() and CreateAuditOrderLimit(), so that later
calls to (*Verifier).GetShare() can try to use that IP and port. If a
connection to the given IP and port cannot be made, or the connection
cannot be verified and secured with the target node identity, an
attempt is made to connect to the original node address instead.

A similar change is not necessary to the other Create*OrderLimits
functions, because they already replace node addresses with the cached
IP and port as appropriate. We might want to consider making a similar
change to CreateGetRepairOrderLimits(), though.

The audit situation is unique because the ramifications are especially
powerful when we get the address wrong. Failing a single audit can have
a heavy cost to a storage node. We need to make extra effort in order
to avoid imposing that cost unfairly.

Situation 1: If an audit fails because the repair worker failed to make
a DNS query (which might well be the fault on the satellite side), and
we have last_ip_and_port information available for the target node, it
would be unfair not to try connecting to that last_ip_and_port address.

Situation 2: If a node has changed addresses recently and the operator
correctly changed its DNS entry, but we don't bother querying DNS, it
would be unfair to penalize the node for our failure to connect to it.

So the audit worker must try both last_ip_and_port _and_ the node
address as supplied by the SNO.

We elect here to try last_ip_and_port first, on the grounds that (a) it
is expected to work in the large majority of cases, and (b) there
should not be any security concerns with connecting to an out-of-date
address, and (c) avoiding DNS queries on the satellite side helps
alleviate satellite operational load.

Change-Id: I9bf6c6c79866d879adecac6144a6c346f4f61200
2020-10-21 13:34:40 +00:00
Yaroslav Vorobiov
25df79a6bf storagenode-updater: check binary version on self-update
Check binary version on self-update instead of current process
version to prevent updating already updated binary.
Add info logs to report current version of service being
updated.

Change-Id: Id22dee188a99d6d45db925104786f49f5d3a61ae
2020-10-21 10:54:26 +00:00
Ivan Fraixedes
979ee762ba satellite/console/consoleweb: Fix typo in method name
Fix a typo in the GraphQL handler method name.

Change-Id: I038c7783073f7bed95353f56a8a24520c724a5b6
2020-10-21 11:58:37 +02:00
littleskunk
77d54ff0ac storagenode/bandwidthdb: Use existing indexes (#3949)
* storagenode/bandwidthdb: Use existing indexes
2020-10-20 22:48:40 +02:00
Stefan Benten
334ae5b164 satellite/admin: add apikey endpoints
This change allows the creation and deletion of api keys via the admin API.
It adds two methods for deletion, one via the name and projectID and the
second one via the serialized apikey directly.

Change-Id: Ida8aa729e716db58c671a901e5f7e39253e89a0d
2020-10-20 11:26:56 +00:00
Yehor Butko
c6415406a1 docs/blueprints: graceful exit initial refactoring (#3938)
* docs: update graceful exit refactoring doc

Co-authored-by: paul cannon <thepaul@storj.io>
Co-authored-by: Jennifer Li Johnson <jennifer@storj.io>
Co-authored-by: Maximillian von Briesen <mobyvb@gmail.com>
2020-10-19 23:34:48 -05:00
NickolaiYurchenko
d3805761a2 web/storagenode: accessible functional elements
Change-Id: I1e49f612ae967c770be5329f0ee41498866700ee
2020-10-19 13:39:05 +03:00
Qweder93
9df74338a8 storagenode: secret db and service added
Change-Id: I91257e5adc4fc6711653f30c118e476ed1c95b6b
2020-10-16 13:24:33 +00:00
Moby von Briesen
7c3afe164b satellite/overlay: uncomment dq for offline and disable with feature flag
Change-Id: Ib39e2be32e880b822a94eddfb81af99a38843a27
2020-10-16 12:55:16 +00:00
NickolaiYurchenko
7c275830a1 web/storagenode: gross total added to historical data, with surge moved
WHAT:
changed estimation table row order.

WHY:
to show gross total for selected period to avoid misunderstanding
when held amount is bigger than paid multiple times.

Change-Id: I03881c8af682372139a378030acf04f199d3260b
2020-10-16 13:26:28 +03:00
Jessica Grebenschikov
205c39d404 satellite/orders: upgrade to phase 2 rollout ordersWithWindow
We are moving an error into rejectErr since it's preventing storage nodes from being able to settle other orders.

Change-Id: I3ac97c340e491b127f5e0024c5e8bd9f4df8d5c3
2020-10-15 21:20:19 +00:00
Yaroslav Vorobiov
139a7ee959 private/migrate: add ability to create dbs during migration
Use tagsql.DB pointer as step database, to propagate changes
back and forth between actual database and migration.
Adds CreateDB operation to the migration step to be able to
create new dbs before executing migration action.
Adjusts storagenode database migration to use inner tagsql.DB
pointer of each database as step.DB.
Adjusts satellite database migration, adds proxy migrationDB field
to satellite db that wraps itself as tagsql.DB, pointer of which
is used as step.DB.

Change-Id: Ifed4de5b01a356cf7b37db64d2eaeb7b61982c5c
2020-10-15 15:28:04 +03:00
Moby von Briesen
aa86c0889c storagenode/console: Add current storage used per satellite to storagenode api
Right now, the best way for a storage node operator to get the current
space used for each satellite is to run the `storagenode exit-satellite`
command for graceful exit, and cancel at the second confirmation prompt.
This is convoluted and the data is readily available from the Blobs
Usage Cache.

This change adds the current space used by each satellite to the
endpoints `/api/sno` and `/api/sno/satellite/<Satellite ID>`

Change-Id: I2173005bb016fc76db96fd598d26b485e5b2aa0b
2020-10-14 21:30:28 +00:00
Stefan Benten
0b43b93259 satellite/satellitedb: make limits per default NULL
This change completes the column migration of
5f6fccc6e8 and
2f648fd981.
It resets the project limits of every user whose limits are below or
equal to our current production defaults.

Change-Id: Ie041d08bb67b62844f6023190fc00bc2dad5b1cb
2020-10-14 20:28:16 +00:00
Egon Elbre
20a50f0906 cmd/metric-receiver: restore minimal metrics server
Change-Id: I33ac9d7ccf21f41ef3077c64506df63607ed6b15
2020-10-14 20:01:29 +03:00
Moby von Briesen
02cbf1e72a storagenode/orders: Add V1 orders file
V1 allows the storagenode to continue reading orders from an
unsent/archived orders file, even if orders in the middle are corrupted.

Change-Id: Iea4117d55c05ceeb77f47d5c973e5ba95da46c66
2020-10-14 15:04:33 +00:00
VitaliiShpital
59d85aab5b web/satellite: take project amount limit from db instead of config
WHAT:
Now project amount limit is taken from users db instead of config. But if db value is 0 then default config value will be used instead.

WHY:
this will allow us to change user's project limit by changing db value.

Change-Id: I9edcd0bf9eaae5fe40e90a44cac82d9ce8519274
2020-10-14 14:17:45 +00:00
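The fallback rule stated in 59d85aab5b (use the value from the database, unless it is 0, in which case use the configured default) can be sketched as a small function. The name `effectiveProjectLimit` and its parameters are hypothetical; the commit does not show how the satellite implements the check.

```go
package main

import "fmt"

// effectiveProjectLimit returns the per-user limit stored in the database,
// falling back to the configured default when the stored value is 0.
// The function name and parameters are assumptions for illustration.
func effectiveProjectLimit(dbLimit, configDefault int) int {
	if dbLimit == 0 {
		return configDefault
	}
	return dbLimit
}

func main() {
	fmt.Println(effectiveProjectLimit(0, 10)) // 10: no override stored
	fmt.Println(effectiveProjectLimit(3, 10)) // 3: database value wins
}
```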
Caleb Case
be84616e69 Update to uplink v1.3.1
Change-Id: Iae1e6a63c00d07fee9047638ddc9a414780ce0e9
2020-10-14 09:39:29 -04:00
crawter
126450f7dd multinode/database: nodes repository
Change-Id: I327afa1b4bde44f7c2f0d4df10a0792dd5c588a3
2020-10-14 12:46:17 +00:00
Egon Elbre
e2d589f3cc certificate/rpcerrs: move logging sanitizer into certificate
Currently that is the only place using it and it's tied to zap
implementation. We don't want to have zap in common to reduce common
dependencies.

Change-Id: I72c064008f83ad3a8a3aa21944753208d4844c85
2020-10-14 14:11:36 +03:00
Egon Elbre
c54dd45755 examples: remove rarely used code
Currently the examples are rarely used and either should belong to a
different repository, be part of documentation or deleted entirely.

Change-Id: I125a0860f9a1d475d384882c0e7edf64ee0f371f
2020-10-14 13:28:09 +03:00
Egon Elbre
2f4bb114d4 go.mod: bump common to remove sha256-simd
Change-Id: I77aeb3b4bea87e8d5bb83b05e86b61a80a695f2a
2020-10-14 12:47:37 +03:00
Egon Elbre
f06ce1ef01 release: remove binary stripping
We've had bizarre crashes for satellite and it's difficult to debug
stripped binaries. The only binary where stripping is useful, is uplink
cli. This change will increase uplink by 5MB.

Change-Id: I4d1dfd36452063c22e8471d99eec97f6de6167b8
2020-10-14 10:06:20 +03:00
Kaloyan Raev
830817ec0d cmd/storj-sim: run gateway without --access flag
This makes it possible to remove this obsolete flag from the
multi-tenant gateway.

As a consequence, displaying the GATEWAY_0_ACCESS env var will always
require a running storj-sim. Until now, it was required only the first
time. Then the value was stored in the 'access' config. But this is now
not possible anymore.

The changes in StripeMock are required to fix failures in integration
tests. StripeMock is in-memory and its data does not survive restarts of
storj-sim. The second and following starts of storj-sim had invalid
state of StripeMock, which failed requests that were required to
populate the GATEWAY_0_ACCESS env var. The changes in StripeMock make
it repopulate the Stripe customers from the database.

Change-Id: I981a208172b76577f12ecdaae485f5ae4ea269bc
2020-10-13 14:45:04 +00:00
Egon Elbre
cf2dd76db7 cmd/satellite: proper log usage
log.Fatal immediately terminates the program without running any defers.
We should properly close all the services and databases.

Change-Id: I5e959cef3eafedeacb3a2062e3da47e8d04e8e75
2020-10-13 16:56:35 +03:00
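The problem named in cf2dd76db7 is that a fatal log call exits the process immediately, so deferred cleanup never runs. One common way to avoid this in Go, shown here as a sketch rather than the change actually made in cmd/satellite, is to do the work in a function that returns an error and to exit only from `main` after that function has returned. The `resource` type and the `FAIL_STARTUP` environment variable are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// resource is a hypothetical stand-in for a database or service handle.
type resource struct{ closed bool }

// Close marks the resource as closed.
func (r *resource) Close() error {
	r.closed = true
	return nil
}

// run does the real work and returns an error instead of exiting, so the
// deferred Close always executes, even on failure.
func run(db *resource, fail bool) (err error) {
	defer func() { err = errors.Join(err, db.Close()) }()
	if fail {
		return errors.New("startup failed")
	}
	return nil
}

func main() {
	db := &resource{}
	// FAIL_STARTUP is an illustrative switch to exercise the error path.
	fail := os.Getenv("FAIL_STARTUP") != ""
	if err := run(db, fail); err != nil {
		// By the time we exit, run has returned and its defers have run.
		fmt.Fprintln(os.Stderr, "error:", err, "closed:", db.closed)
		os.Exit(1)
	}
	fmt.Println("closed:", db.closed)
}
```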
Egon Elbre
2268cc1df3 all: fix linter complaints
Change-Id: Ia01404dbb6bdd19a146fa10ff7302e08f87a8c95
2020-10-13 15:59:01 +03:00
Egon Elbre
0bdb952269 all: use keyed special comment
Change-Id: I57f6af053382c638026b64c5ff77b169bd3c6c8b
2020-10-13 15:13:41 +03:00
littleskunk
3ff8467878 satellite/projectlimit: Update limit increase link (#3950)
Co-authored-by: Stefan Benten <mail@stefan-benten.de>
2020-10-13 12:46:22 +02:00
Stefan Benten
1d3b728766 satellite/{console/payments/satellitedb}: add validation for deletion of account and project
In the same way that our Admin API currently handles project and account deletions, we would like
to have the same checks on the user-facing API. This PR adds the same checks to the console service.
More generally applicable checks have been moved directly into the payments service.

In addition it adds the BucketsDB to the console DB, to have easier access and avoiding import cycles with
the metainfo package.

A small cleanup around our unnecessary monkit imports made it in as well.

Change-Id: I8769b01c2271c1687fbd2269a738a41764216e51
2020-10-13 07:55:26 +00:00
Brandon Iglesias
50756cb434 Adding dominickmarino to the CLA bot list (#3952)
2020-10-12 23:41:59 +02:00
Jeff Wendling
4cbd4d52a9 satellite/orders: only hold the orders semaphore during database calls
holding it during node i/o means slow nodes can hold up order
processing for everyone else. this dramatically increases
the amount of time spent handling orders.

Change-Id: Iec999b7ed0817c921a0fd039097a75bdd3c70ea2
2020-10-10 15:40:50 -04:00
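The idea in 4cbd4d52a9, holding the semaphore only around database calls and not around node I/O, can be sketched with a buffered channel used as a counting semaphore. All names here (`limiter`, `settle`, `peakConcurrency`) are hypothetical and the database call is simulated with a short sleep; the satellite's actual implementation is not shown in the commit message.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// limiter is a counting semaphore built on a buffered channel.
type limiter chan struct{}

// do runs dbCall while holding one slot, or gives up if ctx is cancelled
// before a slot becomes free.
func (l limiter) do(ctx context.Context, dbCall func() error) error {
	select {
	case l <- struct{}{}:
	case <-ctx.Done():
		return ctx.Err()
	}
	defer func() { <-l }()
	return dbCall()
}

// settle is a hypothetical handler: node I/O happens outside the limiter,
// so a slow node cannot occupy a slot while the satellite waits on it.
func settle(ctx context.Context, l limiter, receive, store func() error) error {
	if err := receive(); err != nil { // slow network I/O, no slot held
		return err
	}
	return l.do(ctx, store) // slot held only for the database call
}

// peakConcurrency runs the given number of workers through settle and
// returns the highest number of simultaneous store calls observed.
func peakConcurrency(workers, limit int) int {
	l := make(limiter, limit)
	var mu sync.Mutex
	var cur, peak int
	store := func() error {
		mu.Lock()
		cur++
		if cur > peak {
			peak = cur
		}
		mu.Unlock()
		time.Sleep(time.Millisecond) // simulated database work
		mu.Lock()
		cur--
		mu.Unlock()
		return nil
	}
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = settle(context.Background(), l, func() error { return nil }, store)
		}()
	}
	wg.Wait()
	return peak
}

func main() {
	// With a limit of 2, the observed peak never exceeds 2.
	fmt.Println("peak concurrent store calls:", peakConcurrency(8, 2))
}
```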
Jeff Wendling
0f0faf0a9f satellite/orders: do a better job limiting concurrent requests
Doing it at the ProcessOrders level was insufficient: the endpoints
make multiple database calls. It was a misguided attempt to only
have one spot enter the semaphore. By putting it in the endpoint
we can not only be sure that the concurrency is correctly limited
but it can be configurable easily.

Change-Id: I937149dd077adf9eb87fce52a1a17dc0afe96f64
2020-10-09 16:27:15 -04:00
Caleb Case
cf1748158a Bump Dependencies
Change-Id: I4c8a4438e74379a490a19f1f88ea9dac7715dbbd
2020-10-09 09:33:49 -04:00
Stefan Benten
7161506b68 Makefile: handle msi packages correctly
With the current Makefile, both the msi and exe files get combined into one zip file.
This is not the expected behavior, which this change fixes.
It strips off only .exe from filenames going forward and leaves every other extension intact.

Change-Id: If8132b1427eec7a9e5ebd7ac6b8b3e9d12524080
2020-10-09 13:15:35 +02:00
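The naming rule described in 7161506b68 (strip only `.exe`, leave every other extension intact) is implemented in the Makefile, which is not reproduced here. The same rule expressed in Go, with hypothetical file names, shows why the two artifacts stop colliding in one zip file:

```go
package main

import (
	"fmt"
	"strings"
)

// archiveBase removes a trailing ".exe" and leaves any other extension
// untouched, so an .msi and an .exe no longer map to the same archive name.
// The function and the file names below are assumptions for illustration.
func archiveBase(filename string) string {
	return strings.TrimSuffix(filename, ".exe")
}

func main() {
	for _, name := range []string{
		"storagenode_windows_amd64.exe",
		"storagenode_windows_amd64.msi",
	} {
		fmt.Printf("%s -> %s.zip\n", name, archiveBase(name))
	}
}
```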
Stefan Benten
c1ca470e7e storagenode/orders: fix import and cleanup go.mod and go.sum
Accidentally we imported the wrong monkit package with a previous
commit and made our go.mod and go.sum file unclean.
This should fix it.

Change-Id: I4c3c8b696f59cfd06dc2d5436bb7aea2805936ce
2020-10-09 00:04:57 +02:00
Jeff Wendling
7c303208ff satellite/satellitedb: emergency temporary order processing semaphore
we have thundering herds of order submissions that take all of the
database connections causing temporary periodic outages. limit
the amount of concurrent order processing to 2.

Change-Id: If3f86cdbd21085a4414c2ff17d9ef6d8839a6c2b
2020-10-08 19:16:47 +00:00
Stefan Benten
ad8da61dac cmd/satellite: Remove curl from Dockerfile
Sadly the build process with this command is very, very flaky and often fails pulling down curl via apk.
As we currently do not need it anyway, it is safe to remove.

Change-Id: I8a396c560d61a7fe6324560152a68c07c6b31638
2020-10-08 20:59:05 +02:00
Moby von Briesen
3209effeb6 storagenode/orders: Increase order sending interval from 5m to 1h
Since storage nodes check to see if any order files can be sent every 5
minutes, every storage node attempts to send orders to the satellite
within 5 minutes of each hour since this is when the files become
"available" to send. It is placing a lot of load on our satellite and
storage nodes are not being paid out properly due to timeouts during
order sending due to the increased satellite load.

Change-Id: I44d991b5884b8c11e8a3856d39aee8323f086b51
2020-10-08 12:51:21 -04:00
Yaroslav Vorobiov
e598876d79 cmd/storagenode-updater: trim \n suffix on receiving service pid from systemctl
Change-Id: I92aac195522e46b712f05beb47d7472c2a1b4d6c
2020-10-08 15:20:20 +00:00
NickolaiYurchenko
a1488e53a0 web/storagenode: online score added
Change-Id: I8c0fd1332354063941d62891ba79ca895074b6c8
2020-10-08 13:02:59 +00:00
Stefan Benten
b3cf12f567 satellite/console: Add more validation for console requests
Adds membership checks for the following calls:
- GetProject

Add ownership checks for the following calls:
- DeleteProject

It also disables the API endpoint to delete a project.

Furthermore it adds tests for the console service.

Change-Id: I1ffc8dcb44746a74ad06a7dbd064a29c57c25272
2020-10-07 15:33:28 +00:00
Kaloyan Raev
e7f2ec7ddf satellite/audit: fix sanity check for verify-piece-hashes command
The VerifyPieceHashes method has a sanity check for the number of pieces to
be removed from the pointer after the audit for verifying the piece
hashes.

This sanity check failed when we executed the command on the production
satellites because the Verify command removes Fails and PendingAudits
nodes from the audit report if piece_hashes_verified = false.

A new temporary UsedToVerifyPieceHashes flag is added to
audits.Verifier. It is set to true only by the verify-piece-hashes
command. If the flag is true then the Verify method will always include
Fails and PendingAudits nodes in the report.

Test case is added to cover this use case.

Change-Id: I2c7cb6b12029d52b2fc565365eee0826c3de6ee8
2020-10-07 17:17:48 +03:00
Kaloyan Raev
4280142b24 satellite/console: remove unnecessary Error.Wrap
Change-Id: If851ccce7932cbf72c2fff3b51f4f9f2ea07c124
2020-10-07 09:22:41 +00:00