629 Commits

Author SHA1 Message Date
Michal Niewrzal
b3aa28cc02 satellite/gracefulexit: migrate to metabase
Change-Id: I8be9cc68894124427e4a30d7631126b3afb1f281
2020-12-18 10:57:39 +00:00
Qweder93
2fd7809e54 storagenode/payout: stefanbenten satellite name added to payout history, satellites with no held history removed from list
Change-Id: I96861058ccb9c8ce52698796c91b999eaec1f6e6
2020-12-17 11:01:28 +00:00
Egon Elbre
12055e7864 all: minor cleanups
Change-Id: I4248dbe36a62a223b06135254b32851485a2eec1
2020-12-16 10:47:46 +00:00
Qweder93
12144a600b storagenode/console: payout tests and heldhistory joined_at rounding added
Change-Id: I1d43620fbafbf7ed92588b84cb9c6b8ced8832ef
2020-12-14 19:35:04 +02:00
Qweder93
2f62cdf491 storagenode/console: diskSpaceInfo extended with overused diskspace, getDashboardData updated.
Change-Id: I44db26661a8dfb45b5d8e9fcb7511f63deb88cad
2020-12-08 14:55:55 +00:00
Stefan Benten
494bd5db81 all: golangci-lint v1.33.0 fixes (#3985)
2020-12-05 17:01:42 +01:00
Jessica Grebenschikov
b261110352 satellite/orders: get bucketID from encrypted metadata in order instead of serial_numbers table
We want to stop using the serial_numbers table in satelliteDB. One of the last places using the serial_numbers table is order settlement: when storagenodes settle orders, we look up the bucket name and project ID by serial number in the serial_numbers table.

Now that we have support for adding encrypted metadata to the OrderLimit, this PR makes use of that and attempts to read the project ID and bucket name from the encrypted orderLimit metadata instead of from the serial_numbers table. For backwards compatibility and to ensure no errors, we will still fall back to the old way of getting that info from the serial_numbers table, but this will be removed in the next release as long as there are no errors.

All processes that create orderLimits must have an orders.encryption-keys set. The services that create orderLimits (and thus need to encrypt the order metadata) are the satellite apiProcess, the repair process, audit service (core process), and graceful exit (core process). Only the satellite api process decrypts the order metadata when storagenodes settle orders. This means that the same encryption key needs to be provided in the config for the satellite api process, repair process, and the core process like so:
orders.include-encrypted-metadata=true
orders.encryption-keys="<encryptionKeyID>=<encryptionKey>"

Change-Id: Ie2c037971713d6fbf69d697bfad7f8b672eedd66
2020-12-01 15:29:32 +00:00
Egon Elbre
aeb801604e {satellite,storagenode}/orders: fix flaky tests
Before manipulating order information on storagenodes we need to wait
for the orders to propagate to the database. Some of that happens
async with uplink.

Change-Id: Iaacfd7db0909ab5d2831d06388e5fb27b6d4778f
2020-11-18 13:44:02 +00:00
Moby von Briesen
41d86c0985 storagenode/orders/ordersfile: Add reasonable size caps for orders/limits when detecting file corruption.
Define constants of 32 KiB as the upper limit of the marshalled order
and limit protobuf sizes. This value gives lots of buffer in case the
protobufs ever change, but is not as extreme as what we were doing
before in V0 files, which was to use the Uint32 max value.

Change-Id: I0914d17dde3b044b2611af33f931d46d55f81e98
2020-11-18 12:33:26 +00:00
Qweder93
a17cd9aa3e storagenode/apikey: added service, CLI issue api key
Change-Id: I840cd0fdbd8dca884eefbd111f21fd3990c11e68
2020-11-18 10:40:17 +00:00
Ivan Fraixedes
fa95c6bbb9 storagenode/orders/ordersfile: Fix error message wrong var
Fix the error message reported for a wrong order size; the wrong
variable was being passed to the format string.

Change-Id: Ic0059615c60cfa33a26d4aeb0ebda5e586f0df05
2020-11-17 15:22:27 +01:00
Ivan Fraixedes
9740da6508 storagenode/orders: Don't panic if size is over MaxInt32
The built-in `make` function panics when asked to build a slice with
a negative length.
The length parameter of `make` is of type `int`.

These changes prevent `make` from panicking on 32-bit architectures,
where `int` is 32 bits wide: a uint32 value can exceed the maximum
`int32`, and when that happens the converted length becomes negative
and `make` panics.

Change-Id: Ife9ab5993916d6dcf5584b37c208272269cb2b45
2020-11-17 10:35:21 +00:00
Qweder93
c409194d43 storagenode/payouts: estimation payout heldamount rounding removed
Change-Id: I9fdc7cda15de0df8875436b0b376f0e6479d3aeb
2020-11-17 10:06:11 +00:00
Cameron Ayer
48d8114b3f satellite/contact: treat pingback failure as error
If the satellite fails to pingback the storage node during CheckIn
an error message is returned to the node in the response, but the actual
error value returned is nil. We are only checking the error. This means
the node has no feedback about the failure, and the node also does not
attempt to retry the connection.

Change-Id: Iaed00e422ba91af573e72255cc6671ea97928eae
2020-11-16 18:26:37 +00:00
Moby von Briesen
db480e6e1b storagenode/orders: Improve performance of handling corrupt orders.
This change fixes two things which can make reading from a corrupted
orders file inefficient.
* When a corrupted order is detected, but the underlying error is an
UnexpectedEOF (as opposed to a pb.Unmarshal error, for instance), there
is no point in attempting to read from the file another time to find an
additional uncorrupted order - we will continue to get UnexpectedEOF
errors until we seek to the very end of the file and get a normal EOF.
Instead, when UnexpectedEOF occurs, log and send metrics as with other
types of corruption, but do not attempt to read again.
* When a corrupted order is detected, instead of seeking forward only
one byte for the next attempt, seek forward by the size of entryHeader.
This cuts down on the number of iterations needed to find an uncorrupted
order after detecting a corrupted one.
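
A rough sketch of the two rules above, using an invented in-memory entry format (two magic bytes plus a one-byte payload length). `headerSize`, `parseAt`, and `readAll` are hypothetical names and the real orders file layout differs.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// headerSize is a hypothetical fixed header: 2 magic bytes plus 1 length byte.
const headerSize = 3

var errCorrupt = errors.New("entry corrupt")

// parseAt decodes one entry starting at off and returns its payload and the
// offset of the next entry.
func parseAt(data []byte, off int) (payload []byte, next int, err error) {
	if off == len(data) {
		return nil, off, io.EOF
	}
	if len(data)-off < headerSize {
		return nil, off, io.ErrUnexpectedEOF
	}
	if data[off] != 'O' || data[off+1] != 'K' {
		return nil, off, errCorrupt
	}
	end := off + headerSize + int(data[off+2])
	if end > len(data) {
		return nil, off, io.ErrUnexpectedEOF
	}
	return data[off+headerSize : end], end, nil
}

// readAll collects every readable payload and counts corruption events.
func readAll(data []byte) (payloads []string, corrupt int) {
	off := 0
	for {
		payload, next, err := parseAt(data, off)
		switch {
		case err == nil:
			payloads = append(payloads, string(payload))
			off = next
		case errors.Is(err, io.EOF):
			return payloads, corrupt
		case errors.Is(err, io.ErrUnexpectedEOF):
			// The data is truncated, so record the corruption and stop
			// instead of attempting another read.
			corrupt++
			return payloads, corrupt
		default:
			// Skip a whole header instead of a single byte before retrying.
			corrupt++
			off += headerSize
		}
	}
}

func main() {
	data := []byte("OK\x02hi" + "XXX" + "OK\x01z")
	fmt.Println(readAll(data))
}
```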

Change-Id: Ie1a613127e29d29318584ec7f60e8f7554f73487
2020-11-16 14:08:36 +00:00
Cameron Ayer
5a337c48ec {cmd,private,storagenode}: create storage dir verification during setup
Previously, we created a new file to use for directory verification
every time the storage node started. This is not helpful if the storage node
points to the wrong directory when restarting. Now the file is created only
once, during setup, and is verified at runtime.

Change-Id: Id529f681469138d368e5ea3c63159befe62b1a5b
2020-11-11 11:01:36 -05:00
Egon Elbre
b892a00143 mod: bump dependencies and reenable test
We shouldn't have any EOF issues with recent drpc fix, let's reenable
and see whether it's still flaky.

Change-Id: I0de312bcb087c7f70ec9d3281d73d86f971845d5
2020-11-10 10:32:21 +00:00
Moby von Briesen
db6bc6503d satellite/metainfo: Update metainfo RS config to more easily support multiple RS schemes.
Make metainfo.RSConfig a valid pflag config value. This allows us to
configure the RSConfig as a string like k/m/o/n-shareSize, which makes
having multiple supported RS schemes easier in the future.
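
To illustrate the string form, here is a hedged sketch of a type with the `String`, `Set`, and `Type` methods that pflag expects of a flag value. It is not the real metainfo.RSConfig: the field names `K`, `M`, `O`, `N` and the plain-integer share size are assumptions made for this example.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// RSConfig is a hypothetical stand-in that parses "k/m/o/n-shareSize".
type RSConfig struct {
	K, M, O, N int
	ShareSize  int
}

// Type names the flag type in help output.
func (c *RSConfig) Type() string { return "rsconfig" }

// String renders the config in the same form that Set accepts.
func (c *RSConfig) String() string {
	return fmt.Sprintf("%d/%d/%d/%d-%d", c.K, c.M, c.O, c.N, c.ShareSize)
}

// Set parses a value such as "29/35/80/110-256".
func (c *RSConfig) Set(s string) error {
	parts := strings.SplitN(s, "-", 2)
	if len(parts) != 2 {
		return fmt.Errorf("invalid RS config %q: want k/m/o/n-shareSize", s)
	}
	fields := strings.Split(parts[0], "/")
	if len(fields) != 4 {
		return fmt.Errorf("invalid RS config %q: want four thresholds", s)
	}
	var nums [4]int
	for i, f := range fields {
		n, err := strconv.Atoi(f)
		if err != nil {
			return fmt.Errorf("invalid threshold %q: %w", f, err)
		}
		nums[i] = n
	}
	size, err := strconv.Atoi(parts[1])
	if err != nil {
		return fmt.Errorf("invalid share size %q: %w", parts[1], err)
	}
	*c = RSConfig{K: nums[0], M: nums[1], O: nums[2], N: nums[3], ShareSize: size}
	return nil
}

func main() {
	var cfg RSConfig
	if err := cfg.Set("29/35/80/110-256"); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(cfg.String())
}
```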

RS-related config values that are no longer needed have been removed
(MinTotalThreshold, MaxTotalThreshold, MaxBufferMem, Verify).

Change-Id: I0178ae467dcf4375c504e7202f31443d627c15e1
2020-11-09 22:16:13 +00:00
Qweder93
8dc10e32ad stefan benten satellite added to historical payout data
Change-Id: I1177b2d2ef10d514f7d401e29891fa7dd964e9ac
2020-11-09 15:43:41 +00:00
Egon Elbre
7183dca6cb all: fix defers in loop
defer should not be called in a loop.
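
One common way to avoid the problem is to wrap the loop body in a function so each iteration has its own defer scope. The sketch below uses a hypothetical `resource` type that records events; it is an illustration, not code from the commit.

```go
package main

import "fmt"

// resource is a hypothetical closable value that records its lifecycle.
type resource struct {
	name   string
	events *[]string
}

func (r *resource) Close() { *r.events = append(*r.events, "close "+r.name) }

// process opens, uses, and closes one resource per name.
func process(names []string) []string {
	var events []string
	for _, name := range names {
		// The closure gives each iteration its own defer scope, so the
		// resource is closed before the next one is opened.
		func() {
			r := &resource{name: name, events: &events}
			events = append(events, "open "+name)
			defer r.Close()
			events = append(events, "use "+name)
		}()
	}
	return events
}

func main() {
	for _, e := range process([]string{"a", "b"}) {
		fmt.Println(e)
	}
}
```

With a bare `defer r.Close()` directly in the loop body, both closes would run only when `process` returns.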

Change-Id: Ifa5a25a56402814b974bcdfb0c2fce56df8e7e59
2020-11-02 15:06:38 +02:00
Egon Elbre
fd8e697ab2 {satellite,storagenode}/internalpb: use specific package name
Ensure we don't register types with the same name into protobuf.

Change-Id: I53d025863fff8c91a067ca5819befa87eb5e35bb
2020-10-30 17:31:08 +02:00
Egon Elbre
1903b15474 storagenode/internalpb: move gracefulexit.proto
Change-Id: Ia3614846ed49a39c8f39331516d16d45a695240b
2020-10-30 15:24:56 +02:00
Egon Elbre
cda67a659a storagenode/internalpb: move inspector.proto
Change-Id: I951379c3b2ff00d1bc09d6a49c026a7e723432d6
2020-10-30 14:51:26 +02:00
Qweder93
f5ba8b8009 storagenode/suspensions: added offline-suspension notification chore + tests
Change-Id: I2521cd2e7d08a1dd379e717a554a026c7508c18f
2020-10-29 19:44:22 +02:00
Egon Elbre
e0dca4042d all: add pprof labels for debugger
By using pprof.Labels debugger is able to show service/peer names in
goroutine names.

Change-Id: I5f55253470f7cc7e556f8e8b87f746394e41675f
2020-10-29 15:10:07 +00:00
Qweder93
624255e8ba storagenode/secret: db tests added, small renaming fixes added
Change-Id: I7eae1a9a64c20a39c97e81fa741cfc9b9e1e615a
2020-10-29 14:23:04 +02:00
Egon Elbre
caefde6b32 private/{dbutil,tagsql}: pass ctx to database opening
Opening a database usually dials, and hence we should pass ctx to it.

Change-Id: Iaa2875981570d83e65be3710f841cf30349f807b
2020-10-29 10:51:29 +00:00
Egon Elbre
89ce1fe626 storagenode/storagenodedb: add ctx to OpenNew and OpenExisting
Opening a database usually dials, and hence we should pass ctx to it.

Change-Id: I9160ae95829f22f347bd525904898a47279a7427
2020-10-29 09:52:37 +02:00
Egon Elbre
76f4619a9c {satellite,storagenode}/gracefulexit: ensure client is closed
Change-Id: I576a955a5578caf7fcbee832beca28cef2b0c83e
2020-10-27 23:27:07 +02:00
Moby von Briesen
2fbb4095b2 storagenode/orders/ordersfile: Handle remaining pb.Unmarshal errors
Missed one case of Unmarshal in the previous commit for V0 files (0f4e4969b7)
In V1, unmarshalling was being attempted before the checksum was
verified, so this commit moves those calls to the end of the V1 ReadOne
function.

Change-Id: Ic0b49f0bbc91fb61fb28af6003060994d0af22ed
2020-10-26 20:27:05 +00:00
Moby von Briesen
53ba01b1f1 storagenode/orders/ordersfile/v0.go: Return ErrEntryCorrupt on pb.Unmarshal failure
In V0 orders files, unexpected EOF is correctly treated as a file
corruption, but pb.Unmarshal can also fail, and this is not treated as a
file corruption. This commit fixes that.

Change-Id: I6b446a10f4b1a5a44e832cbcc9bf8b2548cfcfeb
2020-10-26 17:38:22 +00:00
Jessica Grebenschikov
f5880f6833 satellite/orders: rollout phase3 of SettlementWithWindow endpoint
Change-Id: Id19fae4f444c83157ce58c933a18be1898430ad0
2020-10-26 14:56:28 +00:00
paul cannon
76d4977b6a storagenode/gracefulexit: logic moved from worker to service
Change-Id: I8b12606a96b712050bf40d587664fb1b2c578fbc
2020-10-22 23:19:30 +00:00
Jessica Grebenschikov
89bdb20a62 storagenodedb/orders: select unsent satellite with expiration
In production we are seeing that ~115 storage nodes (out of ~6,500) are not using the new SettlementWithWindow endpoint (even though they are upgraded to > v1.12).

We analyzed data being reported by monkit for the nodes who were above version 1.11 but were not successfully submitting orders to the new endpoint.
The nodes fell into a few categories:
1. Always fail to list orders from the db; never get to try sending orders from the filestore
2. Successfully list/send orders from the db; never get to calling satellite endpoint for submitting filestore orders
3. Successfully list/send orders from the db; successfully list filestore orders, but satellite endpoint fails (with "unauthenticated" drpc error)

The code change here adds the following to address these issues:
- modify the query for ordersDB.listUnsentBySatellite so that we no longer select expired orders from the unsent_orders table
- always process any orders that are in the ordersDB and also any orders stored in the filestore
- add monkit monitoring to filestore.ListUnsentBySatellite so that we can see the failures/successes

Change-Id: I0b473e5d75252e7ab5fa6b5c204ed260ab5094ec
2020-10-21 15:02:23 +00:00
littleskunk
77d54ff0ac storagenode/bandwidthdb: Use existing indexes (#3949)
* storagenode/bandwidthdb: Use existing indexes
2020-10-20 22:48:40 +02:00
Qweder93
9df74338a8 storagenode: secret db and service added
Change-Id: I91257e5adc4fc6711653f30c118e476ed1c95b6b
2020-10-16 13:24:33 +00:00
NickolaiYurchenko
7c275830a1 web/storagenode: gross total added to historical data, with surge moved
WHAT:
changed estimation table row order.

WHY:
to show gross total for selected period to avoid misunderstanding
when held amount is bigger than paid multiple times.

Change-Id: I03881c8af682372139a378030acf04f199d3260b
2020-10-16 13:26:28 +03:00
Yaroslav Vorobiov
139a7ee959 private/migrate: add ability to create dbs during migration
Use tagsql.DB pointer as step database, to propagate changes
back and forth between actual database and migration.
Adds CreateDB operation to the migration step to be able to
create new dbs before executing migration action.
Adjusts storagenode database migration to use inner tagsql.DB
pointer of each database as step.DB.
Adjusts satellite database migration, adds proxy migrationDB field
to satellite db that wraps itself as tagsql.DB, pointer of which
is used as step.DB.

Change-Id: Ifed4de5b01a356cf7b37db64d2eaeb7b61982c5c
2020-10-15 15:28:04 +03:00
Moby von Briesen
aa86c0889c storagenode/console: Add current storage used per satellite to storagenode api
Right now, the best way for a storage node operator to get the current
space used for each satellite is to run the `storagenode exit-satellite`
command for graceful exit, and cancel at the second confirmation prompt.
This is convoluted and the data is readily available from the Blobs
Usage Cache.

This change adds the current space used by each satellite to the
endpoints `/api/sno` and `/api/sno/satellite/<Satellite ID>`

Change-Id: I2173005bb016fc76db96fd598d26b485e5b2aa0b
2020-10-14 21:30:28 +00:00
Moby von Briesen
02cbf1e72a storagenode/orders: Add V1 orders file
V1 allows the storagenode to continue reading orders from an
unsent/archived orders file, even if orders in the middle are corrupted.

Change-Id: Iea4117d55c05ceeb77f47d5c973e5ba95da46c66
2020-10-14 15:04:33 +00:00
Egon Elbre
cf2dd76db7 cmd/satellite: proper log usage
log.Fatal immediately terminates the program without running any defers.
We should properly close all the services and databases.
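
A common shape for this fix is to move the work into a function that returns an error, so deferred cleanup runs before the process decides how to exit. The sketch below is illustrative only; `run` and the `closed` flag are hypothetical and stand in for real services and databases.

```go
package main

import (
	"errors"
	"fmt"
)

// run does the work and returns an error instead of calling log.Fatal, so
// its deferred cleanup always executes.
func run(closed *bool) error {
	// Stands in for closing services and databases.
	defer func() { *closed = true }()
	return errors.New("startup failed")
}

func main() {
	var closed bool
	err := run(&closed)
	// By the time the error is reported, cleanup has already happened.
	// A real main would typically log the error and exit non-zero here.
	fmt.Println("error:", err, "cleanup ran:", closed)
}
```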

Change-Id: I5e959cef3eafedeacb3a2062e3da47e8d04e8e75
2020-10-13 16:56:35 +03:00
Egon Elbre
2268cc1df3 all: fix linter complaints
Change-Id: Ia01404dbb6bdd19a146fa10ff7302e08f87a8c95
2020-10-13 15:59:01 +03:00
Egon Elbre
0bdb952269 all: use keyed special comment
Change-Id: I57f6af053382c638026b64c5ff77b169bd3c6c8b
2020-10-13 15:13:41 +03:00
Stefan Benten
c1ca470e7e storagenode/orders: fix import and cleanup go.mod and go.sum
Accidentally we imported the wrong monkit package with a previous
commit and made our go.mod and go.sum file unclean.
This should fix it.

Change-Id: I4c3c8b696f59cfd06dc2d5436bb7aea2805936ce
2020-10-09 00:04:57 +02:00
Moby von Briesen
3209effeb6 storagenode/orders: Increase order sending interval from 5m to 1h
Storage nodes check every 5 minutes whether any order files can be
sent, so every storage node attempts to send orders to the satellite
within 5 minutes of the start of each hour, which is when the files
become "available" to send. This places a lot of load on our satellite,
and storage nodes are not being paid out properly because order sending
times out under the increased load.

Change-Id: I44d991b5884b8c11e8a3856d39aee8323f086b51
2020-10-08 12:51:21 -04:00
Moby von Briesen
fbf2c0b242 storagenode/orders: Refactor orders store
Abstract details of writing and reading data to/from orders files so
that adding V1 and future maintenance are easier.

Change-Id: I85f4a91761293de1a782e197bc9e09db228933c9
2020-10-06 15:28:07 -04:00
Qweder93
664b8f6821 storagenode/payout: estimation payout values switched from int64 to float64 to avoid incorrect rounding.
float64 values are rounded to two decimal places.
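
For illustration, rounding a float64 to two decimal places can be done with `math.Round`. The helper name `roundToCents` is hypothetical and the payout code may round differently.

```go
package main

import (
	"fmt"
	"math"
)

// roundToCents rounds v to two digits after the decimal point.
func roundToCents(v float64) float64 {
	return math.Round(v*100) / 100
}

func main() {
	estimate := 12.754

	// Converting to int64 drops the fractional part entirely.
	fmt.Println(int64(estimate))

	// Rounding the float64 keeps two decimal places.
	fmt.Println(roundToCents(estimate))
}
```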

Change-Id: Ice49f6a0944231ea6adb3343545bf1a62ff6dbc1
2020-10-02 11:33:43 +00:00
Qweder93
245986d528 negative space calculations fix removed
Change-Id: I342c61856fce6d02dc99fd27fd3d563540f22b64
2020-09-30 14:08:24 +00:00
Yaroslav Vorobiov
a840cb71e7 storagenode: check db version before run
Change-Id: I912f63fd62f2bff10341346c28dfb92fcd683806
2020-09-30 10:58:09 +00:00
Michal Niewrzal
cd2a5484f3 storagenode/console: ignore untrusted satellite while returning dashboard data and calculating satellites data

Change-Id: I71d596891477e0839863e007689b6e2e6e420a22
2020-09-29 18:27:49 +00:00