Commit Graph

1665 Commits

Author SHA1 Message Date
Cameron Ayer
a17934cb51 satellite/satellitedb: remove reference to uptime counts
Change-Id: I26ac540b720a8ba5d6ca44526900228352dcaf4e
2021-02-02 14:51:27 -05:00
Jeff Wendling
759bdd6794 satellite/compensation: add total-paid and total-distributed to invoices
Change-Id: Id4414867917cbf8aad77795f764d6381e88d9a34
2021-02-02 18:14:31 +00:00
Kaloyan Raev
339d1212cd satellite/repair: don't remove expired segments from repair queue
It's impossible to time correctly this check. The segment may expire
just at the time we upload the repaired pieces to new storage nodes.
They will reject this as expired and the repair will fail.

Also, we penalize storage nodes with audit failure only if they fail
piece hash verification, i.e. return incorrect data, but only if they
have already deleted the piece.

So, it would be best if the repair service does not care about object
expiration at all. This is a responsibility of another service.

Removing this check will also simplify how we migrate this code
correctly to the metabase.

Change-Id: I09f7b372ae2602daee919a8a73cd0475fb263cd2
2021-02-02 16:13:01 +00:00
Egon Elbre
0bbe2f1d61 satellite/metainfo: add unimplemented ListPendingObjectStreams
Change-Id: I6c0fd240ce5be82c1f464470a6f147289b1cdf9d
2021-02-02 16:51:36 +02:00
Ivan Fraixedes
cc0d88f9c3
satellite/satellitedb: Fix GE flaky test
Fix an issue due to copy-paste problem that made that the Graceful Exit
test to be flaky.

The test uses a time created at the beginning of the test for avoiding
to get undeterministic time differences due to the fact of the response
time variation by the DB queries, however some part of the test were
using a current time rather than this base time, so they have been
addressed.

Change-Id: I4786f06209e041269875c07798a44c2850478438
2021-02-02 13:24:42 +01:00
Ivan Fraixedes
d93944c57b satellite/orders: Delete unused methods & DB tables
Delete satellite order methods and DB tables which aren't used anymore
after we have done a refactoring on the orders to stuck bucket
information in the orders' encrypted metadata.

There are also configuration parameters and a satellite chore that
aren't needed anymore after the orders refactoring.

Change-Id: Ida3682b95921df70792284b42c96d2508bf8ca9c
2021-02-01 18:01:29 +00:00
Ivan Fraixedes
076804eac9 cmd/satellite: Add command for GE data cleanup
Add a command to the satellite for cleaning up the Graceful Exit (a.k.a
GE) transfer queue items of nodes that have exited.

The commit adds to the GE satellite DB a couple of new methods, and its
corresponding test, for performing the operations of the new command.

Change-Id: I29a572a59689d63b24990ac13c52e76d65aaa917
2021-02-01 17:30:58 +00:00
Jeff Wendling
1cf3d89a56 satellite/satellitedb: add distributed column and migration
using redash i manually checked that the only times the sum of
the payments does not match the paid column is for 2020-12 and
if it does not match then there are no payments.

Change-Id: I71ce0571de7e38e21548d7d6757b25abc3bfa781
2021-02-01 16:33:14 +00:00
Natalie Villasana
91bd4191dd satellite/accounting: add rollup archiver chore
The rollup archiver chore moves bucket bandwidth rollups and
storagenode rollups that are older than a given duration
to two new archive tables.

Change-Id: I1626a3742ad4271bc744fbcefa6355a29d49c6a5
2021-02-01 09:29:54 -05:00
Egon Elbre
b7a0739219 satellite/overlay: use DownloadSelectionCache for getting node IPs
Change-Id: Ib8f4eedb2bf465767050693a1e961b37a294ca06
2021-01-29 16:47:10 +02:00
Egon Elbre
54e01d37f9 satellite/overlay: add DownloadSelectionCache
Change-Id: Ic0779280172325f8d03f55a2e9673722f72bdd44
2021-01-29 16:47:06 +02:00
Cameron Ayer
89e682b4d7 satellite/repair/checker: add 29/80/130-52 to default repair overrides
Change-Id: I2e5a7538fdf33f3869fcb65fc88f7abb10faad79
2021-01-28 16:55:16 -05:00
Egon Elbre
19e3dc4ec0 satellite/overlay: rename NodeSelectionCache to UploadSelectionCache
It wasn't obvious that NodeSelectionCache was only for uploads.

Change-Id: Ifeeaa6fdb50a4b7916245b48d8634d70ac54459c
2021-01-28 14:56:53 +02:00
Kaloyan Raev
4d32bdaefb satellite/satellitedb: drop bucket_metainfos_name_project_id_key index
This index is obsolete and duplicates a similiar (project_id, name)
index on the same table.

Moreover, it might confuse CockroachDB which of the two index to use,
which may might affect DB performance.

Change-Id: If8d1df8347714942cea9dca82864ba5f4973bed3
2021-01-28 09:06:22 +02:00
Jeff Wendling
ca86820b8b satellite/snopayouts: use dbx + some refactorings
Change-Id: I8f3973d2377f071bcea2f61e0fc21d913ffa7ea8
2021-01-27 17:59:16 +00:00
Jeff Wendling
66e15fb7f1 satellite/compensation: remove ytd paid amounts
they aren't right and we aren't using them.

Change-Id: I5ca024e38d055696696886278863e941b5bc51bf
2021-01-27 17:31:01 +00:00
Malcolm Bouzi
24d60384c5
satellite/satellitedb: add columns for professional users (#4028)
Co-authored-by: Egon Elbre <egonelbre@gmail.com>
2021-01-26 11:38:53 -05:00
Isaac Hess
c92bda7e75 tally monkit: change location to monitor piecesize
When we observed the value for total piecesizes stored in the network,
we were doing it after converting them to byte-hours, rather than using
the actual piece sizes. This fixes that issue.

Change-Id: I1564d21b519f70eb59f298d97dbd777baf127723
2021-01-26 15:37:02 +00:00
littleskunk
0b2568d712
satellite/overlay/straynodes: increase development duration without contact
Stopping storj-sim for over 5 minutes caused nodes to be disqualified. Set development max duration without contact to 300 days.
2021-01-26 12:24:39 +02:00
Michał Niewrzał
3fc0d2a83e satellite/metainfo: add testing method from multipart-upload branch
We wanto have single uplink branch for standard and multipart-upload satellite but some tests are using helper methods from multipart. This change adds methods used by uplink test.

Change-Id: I82352ed56674ff7e8743b58061ba594018e78e3b
2021-01-26 09:13:12 +00:00
Michał Niewrzał
50bea3ab1a satellite/metainfo: adjust tests to changes in uplink
One of uplink method changed its signature and we need to fix test on satellite side.

Change-Id: Ib89ea6aa25c57bac11bc3e0e9c2c89a4b9debd7c
2021-01-26 08:54:35 +00:00
Moby von Briesen
8263f18321 satellite/console: Add graphql query for owned projects
Change-Id: If47183d46cb7552ecdddbb3e536c36d958fad6d0
2021-01-25 17:43:04 +00:00
Jessica Grebenschikov
b7d8dee5e9 satellite/console/wasm: add js tests
Change-Id: I8b1b0e81500836e0408e0517edb6c696698ab5f7
2021-01-21 20:18:03 +00:00
Moby von Briesen
0a48071854 satellite/console: Add pagination fields for ListProjectsByOwnerID
Add ProjectsCursor type for pagination
Add PageCount, CurrentPage, and TotalCount ProjectsPage
This allows us to mimic the logic of GetBucketTotals and the
implementation of BucketUsages in graphql for the new ProjectsByOwnerID
functionality.

Change-Id: I4e1613859085db65971b44fcacd9813d9ddad8eb
2021-01-20 16:15:29 +00:00
Brandon Iglesias
e6c0383477
adding hypernet to the list (#4025) 2021-01-20 10:52:29 -05:00
Cameron Ayer
d14607a5f7 satellite/{contact,nodestats,overlay,satellitedb}: remove references to total_uptime_count and uptime_success_count columns
Change-Id: I1f92022909bc564e9b1e31bf937fdfe7c16554de
2021-01-19 15:43:02 -05:00
Cameron Ayer
75d828200c private,satellite: add chore to dq stray nodes
Full scope:
private/testplanet,satellite/{overlay,satellitedb}

Description:
In most cases, downtime tracking with audits will eventually lead
to DQ for nodes who are unresponsive. However, if a stray node has no
pieces, it will not be audited and will thus never be disqualified.
This chore will check for nodes who have not successfully been contacted
in some set time and DQ them.

There are some new flags for toggling DQ of stray nodes and the timeframes
for running the chore and how long nodes can go without contact.

Change-Id: Ic9d41fdbf214736798925e728245180fb3c55615
2021-01-19 14:21:56 -05:00
Ivan Fraixedes
0f70574e3c
satellite/accounting/tally: Don't abort tally when cache is down
Tally shouldn't abort its cycle if the accounting cache return an error
because it isn't an essential requirement to update it.

Change-Id: I78fd2bd9cf253ddedfb9ada80c0fa2ddf438f647
2021-01-19 15:13:01 +01:00
Qweder93
6ba8f6c8a9 storanode, satellite: payout renamed to payouts, expected estimation payouts added, console api for audits reworked
Change-Id: I4aa5e99bffaa87d0a800a429a4c83aa498ad4b7b
2021-01-18 10:56:03 +00:00
Ivan Fraixedes
678b07b314
satellite: Fix typos & code formatting
Fix some typos in the doc comments and readdress some code formatting
applied automatically.

Change-Id: I605b4eff2e7c6c58227ecf16be4c1d26f5322eb6
2021-01-15 16:40:26 +01:00
Moby von Briesen
c24f84914c satellite/console: Add ability to list projects by owner ID
Listing projects by owner ID also includes the number of members in each
project.

Change-Id: I53a09674b60c199ef378943851bb0f164e92e4e2
2021-01-15 14:22:22 +00:00
Egon Elbre
85fb964afe satellite/{metainfo,overlay}: improvements to GetObjectIPs
* Deduplicate NodeID list prior to fetching IPs.
* Use NodeSelectionCache for fetching reliable IPs.
* Return number of segements, reliable pieces and all pieces.

Change-Id: I13e679caab275488b4037624b840a4068dad9589
2021-01-14 09:12:45 +00:00
Egon Elbre
d11c2b709e go.mod: bump storj.io/common
* Add missing endpoints.
* Fix deprecated packages and funcs.

Change-Id: I756090c46a4d15eabf6d413a593cdc64c5809bc7
2021-01-13 14:51:08 +00:00
Ivan Fraixedes
a4d06b9b1e satellite/metainfo: Don't response errors when Redis down
For being able to have resilient multi-region satellites we cannot stop
processing uploads/download client request when Redis isn't responding
properly.

These changes avoid to stop the processing of the client requests when
we cannot check if the client exceeds its storage or bandwidth limits
and we cannot update its used storage/bandwidth limits because Redis is
not responding successfully or the satellite database returns an error.

Change-Id: Ia7f12c07fc9ffdfad0e7ff052ff3fd81eca0f0e3
2021-01-13 14:30:44 +00:00
Ivan Fraixedes
a73c59bbdd
satellite/console/consoleweb: Change status codes usage limits
Respond to the HTTP clients which request the project usage limits with
different status codes depending of the error class returned by the
satellite/accounting Service.

Change-Id: I6f486ea55517f616c7cec81dbbe77e997484180f
2021-01-13 15:00:12 +01:00
Cameron Ayer
0184d33e96 satellite/satellitedb: set default 0 on uptime columns
This is the first step in the removal of uptime columns on the
nodes table. These columns are no longer used:

uptime_success_count
total_uptime_count
uptime_reputation_alpha
uptime_reputation_beta

In order to avoid breaking backwards compatibility, we need to
remove all references to these columns before removing the columns
themselves from the database. However, since uptime_success_count
and total_uptime_count are NOT NULLABLE, we can't remove them from
the insert statements in the overlay. So we can't remove the columns
because of the references, and we can't remove the references because
the columns can't be null. What a pickle. To remedy this, we will set a
default on the columns. Then we should be able to remove them from the
insert statements

Change-Id: I75f6c56fb7897835bbf29869f86f39de1d9dd345
2021-01-12 17:44:37 +00:00
Ivan Fraixedes
ce26616647
satellite/accounting/live: Use Redis client directly
We have to adapt the live accounting to allow the packages that use it
to differentiate about errors for being able to ignore them and make our
satellite resilient to Redis downtime.

For differentiating errors we should make changes in the live accounting
but also in the storage/redis.Client, however, we may need to do some
dirty workarounds or break other parts of the implementation that
depends on it.

On the other hand we want to get rid of the storage/redis.Client because
it has more functionality that the one that we are using and some
process has been started to remove it.

Hence, we have refactored the live accounting to directly use the Redis
client library for later on (in a future commit) adapt the satellite for
being resilient to Redis downtime.

Last but not least, a test for expired bandwidth keys have been added
and with it a bug was spotted and fix it.

Change-Id: Ibd191522cd20f6a9a15e5ccb7beb83a678e530ff
2021-01-12 15:33:29 +01:00
JT Olio
1ad69b9f96 satellite: make external address calc more robust
Change-Id: Icbdc6fb4e3fc85076dcb7bcc4b3ec36baad308d2
2021-01-11 16:47:25 +00:00
Cameron Ayer
0403e99a5b satellite/{overlay,satellitedb}: remove unused methods for old downtime tracking
GetSuccessfulNodeNotCheckedInSince and GetOfflineNodesLimited are overlay methods
which were only used by the previous downtime tracking system which has been removed.
These methods should also be removed.

Change-Id: Idb829d742e1f987e095604423fff656fe581183e
2021-01-11 15:21:28 +00:00
Egon Elbre
24833465e6 satellite/metainfo/metabase: avoid magic constant
Change-Id: I4f01e38f67e18ae9cb9845a8e75a987acba66427
2021-01-11 10:22:21 +00:00
Jessica Grebenschikov
1709117b0d satellite/console/wasm: add more unit tests
Change-Id: Ie134f8a08d690ce013039ed1a4e484f8b6a1a6d5
2021-01-08 18:50:29 +00:00
Jeff Wendling
2d2359667d satellite/orders: remove unused satelliteAddress field
Change-Id: I58091769472688433c48becc8dfc9029bddd87aa
2021-01-08 12:25:39 -05:00
Egon Elbre
ba5461562d satellite/orders: remove satellite address
SatelliteAddress in OrderLimit is not being used anymore and some
satellite addresses may consume too much bytes.

Change-Id: Ic7a0efe5b6211c2f3b91af67b293cde98b29d074
2021-01-08 16:57:36 +00:00
Egon Elbre
51731db121 satellite/orders: use smaller encrypted metadata
Avoid using project uuid string representation, because
it uses more bandwidth.

This reduces the encrypted metadata size from 118 -> 97 bytes.

Change-Id: Ic53a81b83acc065f24f28cd404f9c0b1fe592594
2021-01-08 16:40:31 +00:00
JT Olio
8907180e81 satellite: pass contact.external-address config to web ui
Change-Id: I54978aa34aa9eb98876fab6460a5737d718d6135
2021-01-06 10:11:20 -07:00
Egon Elbre
9cb4466eb0 cmd/storj-sim: use dev setup by default for consistency
Fixes bug when using release binaries together with storj-sim.

Change-Id: I077bedc1486ac85aa1f04fcc0ed4098cd313f2fc
2021-01-05 13:47:30 +02:00
Moby von Briesen
a90d6fcad8 satellite/repair/checker: Use segment health on checker insert
Do not insert the number of healthy pieces for segment health anymore.
Rather, insert the segment health calculated by our new priority
function.

Change-Id: Ieee7fb2deee89f4d79ae85bac7f577befa2a0c7f
2021-01-04 11:48:17 -05:00
Moby von Briesen
6e2ef3b9ee Revert "satellite/satellitedb: Do not consider nodes with offline_suspended as reputable."
This reverts commit e24262c2c9.

Change-Id: I287deb2e52d03bbd698ed055f0f216b0b5bf2798
2021-01-04 14:28:37 +00:00
Michał Niewrzał
d4ebdba48c satellite/payments/stripecoinpayments: fix tests failing in 2021
We had some tests with hardcoded year 2020.

Change-Id: I0184c3ece819cb764eb305751a1d8d4056b6af17
2021-01-04 10:47:31 +01:00
Moby von Briesen
edbee53888 satellite,storagenode: Pass audit history over GetStats endpoint
Full prefix: satellite/{overlay,nodestats},storagenode/{reputation,nodestats}

Allow the storagenode to receive its audit history data from the
satellite via the satellite's GetStats endpoint.

The storagenode does not save this data for use in the API yet.

Change-Id: I9488f4d7a4ccb4ccf8336b8e4aeb3e5beee54979
2020-12-30 19:13:26 +00:00
Rafael Gomes
8b2e4bfa7e satellite/metainfo/piecedeletion Remove spaces from metrics.
Change-Id: Iaf1d8a96a43087f2fcc579347f581e8a78a0fb58
2020-12-30 14:27:39 -03:00
paul cannon
7246368ca1 satellite/repair: clamp totalNodes to 100 or higher
Change-Id: I239418ed3671b1cee30b0b1797dc434244e72448
2020-12-30 10:39:14 -06:00
Moby von Briesen
825dc71227 satellite/{overlay, satellitedb}: Refactor audit history
* Separate audit history interface into its own file in the overlay
package
* Add overlay.AuditHistory struct so that internalpb.AuditHistory is
only used from within the database layer
* Add overlay.GetAuditHistory function for features that will require
access to detailed audit history information
* Do not return full audit history from UpdateAuditHistory - callers to
that function only need to know the online score and whether a full
tracking period has been completed
* Move audit history tests out of satellite/satellitedb, since they are
independent of database implementation

Change-Id: I35b0c4ac23bbaabd80624f8a9631c3cb1a1f33bd
2020-12-29 18:50:22 +00:00
Moby von Briesen
85ae13f11d satellite/satellitedb: Drop nodes_offline_times table.
Now that the deprecated downtime tracking service is removed
(3fc76f4ffe), we can safely remove
the nodes_offline_times table.

Change-Id: Ia7c6efe32ba104dff5a830af5f2beee3337eefe5
2020-12-29 18:17:50 +00:00
Moby von Briesen
e24262c2c9 satellite/satellitedb: Do not consider nodes with offline_suspended as reputable.
Nodes which are offline_suspended will no longer be considered for new
uploads. The current threshold that enters a node into offline
suspension is 0.6. Disqualification for offline suspension is still
disabled.

Change-Id: I0da9abf47167dd5bf6bb21e0bc2186e003e38d1a
2020-12-29 17:59:09 +00:00
Stefan Benten
ad58459198 satellite/admin: allow more than just "paid" invoice status during user deletion
Currently we do not allow anything other than the "paid" status for invoices when
trying to delete a user. However there can be a couple of other states that are
still fine to accept during deletion of a user. This change reverses the order to
check for the status that we do not want to allow.

Change-Id: I78d85af6438015c55100fa201ccffc731c91de1c
2020-12-23 16:40:44 +01:00
JT Olio
7faaeed2bf satellite/access grant wizard: don't hardcode the satellites
Change-Id: Id9fbf68882cdb2fce846b7a2604cf965cc53ab1a
2020-12-22 21:24:45 -07:00
JT Olio
efde103dba accounting: rollup test is broken for the hour before midnight UTC
this change isn't the real fix. it's just ignoring the problem.

i don't know what the real fix is. is the problem with the test, or
is there actually a problem with the rollup code?

Change-Id: I552bdd947deadc212cc56efc5f818942b9827126
2020-12-22 14:14:52 -07:00
Ethan Adams
6070018021
satellite/overlay: use AS OF SYSTEM TIME with Cockroach
Query nodes table using AS OF SYSTEM TIME '-10s' (by default) when on CRDB to alleviate contention on the nodes table and minimize CRDB retries. Queries for standard uploads are already cached, and node lookups for graceful exit uploads has retry logic so it isn't necessary for the nodes returned to be current.
2020-12-22 21:07:07 +02:00
Ethan Adams
563197c628
satellite/overlay: Add index on nodes table (#4012)
satellite/accounting: Add index for project_id on bucket_storage_tallies
2020-12-21 12:48:48 -05:00
Ethan Adams
9b52283570
satellite/accounting: Add index for project_id on bucket_storage_tallies (#4010)
Change-Id: I47ab2d1e24f94307c3383c497cffe2a150fa8ab7
2020-12-21 11:42:00 -05:00
Ethan Adams
6e501898c3
satellite/accounting: Performance improvements to getNodeIds used by GetBandwidthSince (#4009) 2020-12-21 16:37:01 +01:00
Jessica Grebenschikov
d961437889 satellite/orders: remove the config IncludeEncryptedMetadata
Since the Satellite now requires the order encryption functionality (since serial_number table is deprecated) to properly function, we can remove the config flag to turn on/off the feature.

Change-Id: Ie973f72a9a05a81cef9e53dc9c99d22c940c2488
2020-12-18 10:39:29 -08:00
Jessica Grebenschikov
da0327c9b7 satellite/dbcleanup: remove expired serial chore
Change-Id: Ib71d41eb6679d6435e5bc10b6244dac66380a74e
2020-12-18 09:36:28 -08:00
Jessica Grebenschikov
97a5e6c814 satellite/orders: stop inserting/reading from serial_numbers table
This PR contains the minimum changes needed to stop inserting into the serial_numbers table. This is the first step in completely deprecating that table.
The next step is to create another PR to remove the expiredSerial chore, fix more tests, and remove any other methods on the serial_number table.

Change-Id: I5f12a56ebf3fa4d1a1976141d2911f25a98d2cc3
2020-12-18 08:35:13 -08:00
littleskunk
2437d5b171
satellite/access-grants: default auth service url (#4002)
* satellite/access-grants: default auth service url
2020-12-17 23:38:16 +01:00
paul cannon
d3604a5e90 satellite/repair: use survivability model for segment health
The chief segment health models we've come up with are the "immediate
danger" model and the "survivability" model. The former calculates the
chance of losing a segment becoming lost in the next time period (using
the CDF of the binomial distribution to estimate the chance of x nodes
failing in that period), while the latter estimates the number of
iterations for which a segment can be expected to survive (using the
mean of the negative binomial distribution). The immediate danger model
was a promising one for comparing segment health across segments with
different RS parameters, as it is more precisely what we want to
prevent, but it turns out that practically all segments in production
have infinite health, as the chance of losing segments with any
reasonable estimate of node failure rate is smaller than DBL_EPSILON,
the smallest possible difference from 1.0 representable in a float64
(about 1e-16).

Leaving aside the wisdom of worrying about the repair of segments that
have less than a 1e-16 chance of being lost, we want to be extremely
conservative and proactive in our repair efforts, and the health of the
segments we have been repairing thus far also evaluates to infinity
under the immediate danger model. Thus, we find ourselves reaching for
an alternative.

Dr. Ben saves the day: the survivability model is a reasonably close
approximation of the immediate danger model, and even better, it is
far simpler to calculate and yields manageable values for real-world
segments. The downside to it is that it requires as input an estimate
of the total number of active nodes.

This change replaces the segment health calculation to use the
survivability model, and reinstates the call to SegmentHealth() where it
was reverted. It gets estimates for the total number of active nodes by
leveraging the reliability cache.

Change-Id: Ia5d9b9031b9f6cf0fa7b9005a7011609415527dc
2020-12-17 21:30:17 +00:00
littleskunk
3feee9f4f8
satellite/accounting: default project limits (#4001) 2020-12-17 22:27:05 +01:00
Cameron Ayer
28eaae66af satellite/satellitedb: drop num_healthy_pieces column from injuredsegments
This column is no longer used as it has been replaced by the segment_health
column.

Change-Id: I6b4df89cd4f994d8418976f88e8c5f57615f8115
2020-12-17 20:17:08 +00:00
VitaliiShpital
f4bbd0f5df web/satellite: use brotli instead of gzip
WHAT:
we'll use brotli instead of gzip from now on

WHY:
better compression

Change-Id: Ibeadd6bfc783e9c15cf3f62f719af692071a7721
2020-12-17 19:23:44 +00:00
Egon Elbre
12055e7864 all: minor cleanups
Change-Id: I4248dbe36a62a223b06135254b32851485a2eec1
2020-12-16 10:47:46 +00:00
Cameron Ayer
8c52bb3a18 satellite/checker: use numHealthy as segment health in repair queue
A few weeks ago it was discovered that the segment health function
was not working as expected with production values. As a bandaid,
we decided to insert the number of healthy pieces into the segment
health column. This should have effectively reverted our means of
prioritizing repair to the previous implementation.

However, it turns out that the bandaid was placed into the code which
removes items from the irreparable db and inserts them into the repair
queue.

This change: insert number of healthy pieces into the repair queue in the
method, RemoteSegment

Change-Id: Iabfc7984df0a928066b69e9aecb6f615253f1ad2
2020-12-15 17:16:59 -05:00
Cameron Ayer
2ac72eaf16 satellite/repair/checker: add new monkit stats tagged with rs scheme
There is a new checker field called statsCollector. This contains
a map of stats pointers where the key is a stringified redundancy
scheme. stats contains all tagged monkit metrics. These metrics exist
under the key name, "tagged_repair_stats", which is tagged with the
name of each metric and a corresponding rs scheme.

As the metainfo observer works on a segment, it checks statsCollector
for a stats corresponding to the segment's redundancy scheme. If one
doesn't exist, it is created and chained to the monkit scope. Now we can call
Observe, Inc, etc on the fields just like before, and they have tags!

durabilityStats has also been renamed to aggregateStats.

At the end of the metainfo loop, we insert the aggregateStats totals into the
corresponding stats fields for metric reporting.

Change-Id: I8aa1918351d246a8ef818b9712ed4cb39d1ea9c6
2020-12-15 14:08:01 +00:00
Stefan Benten
9fe477899b satellite/satellitedb: add lint ignore rule to support staticcheck 2020.2
staticcheck 2020.2 is not liking our dbx files, so we need to ignore them.

Change-Id: I6becc3619bb088473f9776d0878ce240d4935936
2020-12-14 21:16:31 +00:00
Jessica Grebenschikov
3cc98de3ee satellite/console/wasm: reduce size to <9MB
Make changes so that we only import the necessary files from the console package so that the generated wasm code is as small as possible.

This change gets the compiled wasm code down to 8.6MB uncompressed and 2MB when compressed with `gzip --best`.

https://review.dev.storj.io/c/storj/storj/+/3396

Change-Id: Ifdd4be285810757b46bbbe43327c0d0139e5f8f7
2020-12-14 16:41:39 +00:00
Ivan Fraixedes
2dddcffe43 satellite/accounting/rollout: Remove unused variable
Remove a declared variable that's set by never read nor passed to any
function so it's unused code.

Change-Id: I8daf9d1f71d29ab39d7a80011d1b4813ada1c67d
2020-12-14 14:11:41 +00:00
Brandon Iglesias
ca1e6b9756
Adding Fastly (#3994) 2020-12-11 15:53:05 +02:00
Stefan Benten
8fe829d5fd
build: add wasm bits to Dockerfile and bump to go v1.15.6 (#3992) 2020-12-11 02:23:39 +01:00
Jessica Grebenschikov
0649d2b930 satellite/repair: improve contention for injuredsegments table on CRDB
We migrated satelliteDB off of Postgres and over to CockroachDB (crdb), but there was way too high contention for the injuredsegments table so we had to rollback to Postgres for the repair queue. A couple things contributed to this problem:
1) crdb doesn't support `FOR UPDATE SKIP LOCKED`
2) the original crdb Select query was doing 2 full table scans and not using any indexes
3) the SLC Satellite (where we were doing the migration) was running 48 repair worker processes, each of which run up to 5 goroutines which all are trying to select out of the repair queue and this was causing a ton of contention.

The changes in this PR should help to reduce that contention and improve performance on CRDB.
The changes include:
1) Use an update/set query instead of select/update to capitalize on the new `UPDATE` implicit row locking ability in CRDB.
- Details: As of CRDB v20.2.2, there is implicit row locking with update/set queries (contention reduction and performance gains are described in this blog post: https://www.cockroachlabs.com/blog/when-and-why-to-use-select-for-update-in-cockroachdb/).

2) Remove the `ORDER BY` clause since this was causing a full table scan and also prevented the use of the row locking capability.
- While long term it is very important to `ORDER BY segment_health`, the change here is only suppose to be a temporary bandaid to get us migrated over to CRDB quickly. Since segment_health has been set to infinity for some time now (re: https://review.dev.storj.io/c/storj/storj/+/3224), it seems like it might be ok to continue not making use of this for the short term. However, long term this needs to be fixed with a redesign of the repair workers, possible in the trusted delegated repair design (https://review.dev.storj.io/c/storj/storj/+/2602) or something similar to what is recommended here on how to implement a queue on CRDB https://dev.to/ajwerner/quick-and-easy-exactly-once-distributed-work-queues-using-serializable-transactions-jdp, or migrate to rabbit MQ priority queue or something similar..

This PRs improved query uses the index to avoid full scans and also locks the row its going to update and CRDB retries for us if there are any lock errors.

Change-Id: Id29faad2186627872fbeb0f31536c4f55f860f23
2020-12-10 09:51:26 -08:00
Michal Niewrzal
c2a97aeb14 satellite/satellitedb: add ListAllBuckets method
We need to be able to list all buckets in DB without knowing project ID.
This method will be used to list buckets for metainfo loop
implementation based on metabase.

Change-Id: Iac75af0eee4f31e80a15577575a8249cbca787b2
2020-12-10 14:19:27 +00:00
Stefan Benten
494bd5db81
all: golangci-lint v1.33.0 fixes (#3985) 2020-12-05 17:01:42 +01:00
Ethan Adams
f90ea10a4a
Allow for DB application names per process. (#3983) 2020-12-04 11:24:39 +01:00
Moby von Briesen
d75e4be11f satellite/{accounting, contact}: Remove periods and spaces from metrics.
Change-Id: I84179c2931293e3a1eb0ff8050416d25e481ce07
2020-12-03 15:33:01 +00:00
Moby von Briesen
3fc76f4ffe satellite/downtime: Remove deprecated downtime tracking service.
We are no longer planning on implementing downtime penalization using
the method described in
docs/blueprints/archive/storage-node-downtime-tracking-deprecated.md.
Now, we are implementing the design described in
docs/blueprints/storage-node-downtime-tracking-with-audits.md.

This change removes the downtime estimation chores from the satellite
core as well as the package satellite/downtime. A future change will
remove the database table.

Change-Id: I1a1d3cf9dceeba36255d25243294865b89925518
2020-12-02 15:16:13 -05:00
JT Olio
1728c3a992 satellite/dbx: standardize on assignment
Change-Id: I8f87bc8391e765e4480b0590d92d3601248e1f93
2020-12-01 16:10:18 +00:00
Jessica Grebenschikov
b261110352 satellite/orders: get bucketID from encrypted metadata in order instead of serial_numbers table
We want to stop using the serial_numbers table in satelliteDB. One of the last places using the serial_numbers table is when storagenodes settle orders, we look up the bucket name and project ID from the serial number from the serial_numbers table.

Now that we have support to add encrypted metadata into the OrderLimit, this PR makes use of that and now attempts to read the project ID and bucket name from the encrypted orderLimit metadata instead of from the serial_numbers table. For backwards compatibility and to ensure no errors, we will still fallback to the old way of getting that info from the serial_numbers table, but this will be removed in the next release as long as there are no errors.

All processes that create orderLimits must have an orders.encryption-keys set. The services that create orderLimits (and thus need to encrypt the order metadata) are the satellite apiProcess, the repair process, audit service (core process), and graceful exit (core process). Only the satellite api process decrypts the order metadata when storagenodes settle orders. This means that the same encryption key needs to be provided in the config for the satellite api process, repair process, and the core process like so:
orders.include-encrypted-metadata=true
orders.encryption-keys="<"encryptionKeyID>=<encryptionKey>"

Change-Id: Ie2c037971713d6fbf69d697bfad7f8b672eedd66
2020-12-01 15:29:32 +00:00
JT Olio
70b91aac54 satellitedb: remove cruft caused by https://review.dev.storj.io/c/storj/storj/+/3223
Change-Id: I198bb2f869cc7177b9ecafdd8932bbf2b58be5b8
2020-12-01 00:16:26 +00:00
Yingrong Zhao
d8ba7b3057 satellite/console: only allow project member to get all bucket names
Change-Id: I8ceb0b7eb19e221072b4ff3411a4ec1a7817d16f
2020-11-30 15:41:35 -05:00
Egon Elbre
f456d7ce03 satellite: remove implementation detail from DB interface
Which database access and how it internally does migrations is an
implementation detail and does not belong in the requirements interface.

Change-Id: Ia4a6994f39470063a96a8e5f3a1bd27aa79fe5cd
2020-11-30 13:29:20 +02:00
Egon Elbre
28ea63be92 satellite/repair: avoid TestDBAccess
Change-Id: I34adb58cd67fba5917032f2f328d75b1c4afdbbf
2020-11-30 13:29:08 +02:00
VitaliiShpital
0771cdb0b1 web/satellite: create access grant: generate gateway credentials step
WHAT:
generate gateway credentials step for create access grant flow

WHY:
part of the flow

Change-Id: I6496712b43f78a818ba0582b586cfae3a44683e6
2020-11-30 10:36:29 +00:00
VitaliiShpital
bb7677a85f web/satellite: get gateway credentials request using url from config
WHAT:
POST request to get gateway credentials using access grant.
Put request url to config and use it for request.

WHY:
to show gateway credentials on UI

Change-Id: I15ef43ecdeed69b0961d5796aacb47f36d560b1b
2020-11-30 10:36:23 +00:00
Michal Niewrzal
21602e0494 satellite/metainfo: enable commented test
Test was commented to make uplink refactoring possible. Now we can bring
back this test.

Change-Id: I0511b76073efaafed8aac97f8e845dcec93dd059
2020-11-30 10:49:23 +01:00
JT Olio
71e11b27f3 satellite/dbx: only retry with cockroach
Change-Id: Id3630c26dbfda36dcbece2849e2353d5ab2882af
2020-11-29 18:10:07 -07:00
JT Olio
bd23d12bb9 satellite/dbx: add cockroach retries for other QueryContext operations
Change-Id: Ia30fbba55c926892702fa96fb9dd01b75347d351
2020-11-29 18:09:56 -07:00
JT Olio
ea2f39ca7f satellite/dbx: add retries for QueryRowContext-based operations
Change-Id: Ie2527b673dd4ce5250cf5c0cbf8f14921262f665
2020-11-29 18:09:46 -07:00
JT Olio
d3b0691bbd satellite/dbx: import dbx templates
these are unchanged from storj.io/dbx.

we're importing them because in a later commit we
will change them, and it'd be nice to see that
diff as a separate commit.

Change-Id: I8315130ed6bab397bd65b9a1a90c29d130b8c02d
2020-11-29 18:09:33 -07:00
JT Olio
5d8a67a4f7 satellitedb: retry GetBandwidthSince on cockroach
Change-Id: I2bf20f3a19e7f3af97630d8a679410feba70661e
2020-11-29 16:36:15 -07:00
Ethan
5dc013d3bd satellite/overlay: Add retry to all selects in overlaycache
Change-Id: I0356d71a35701f8e0ca04a34b2bb2aea666c1394
2020-11-29 16:46:57 -05:00
JT Olio
6bce907cb0 satellite: try to stream rollups to aggregation function to use less memory
this change tries really hard to never have all of the storage node
rollups in memory at the same time, up until the rollups are actually
getting summed together.

Change-Id: If67f49e7d71106798d996a6850b3e48671bd9e18
2020-11-29 10:26:32 -07:00