Commit Graph

218 Commits

Author SHA1 Message Date
Michał Niewrzał
fa083a7f05 Merge remote-tracking branch 'origin/main' into multipart-upload
Change-Id: Ib5ce5965b77b81c254d08c27ab30c7eccefbd4c6
2021-03-17 15:37:17 +01:00
Vitalii Shpital
6a553ec9c5 web/satellite: change banner for beta satellites with URLs
WHAT:
beta satellite top banner's copy is changed to include support/feedback URLs

WHY:
so users using our beta satellite will be able to report feedback somewhere

Change-Id: Ibc349c8b3354b577275fcf1d2b75bfdd267729d9
2021-03-15 17:12:07 +00:00
Michał Niewrzał
67e26aafcd Merge remote-tracking branch 'origin/main' into multipart-upload
Change-Id: I9b183323cb470185be22f7c648bb76917d2e6fca
2021-03-10 08:53:38 +01:00
Natalie Villasana
c290e5ac9a satellite/orders: decrease FlushBatchSize default to 1000
The previous default FlushBatchSize of 10000 was causing major
slow down in select and insert statements on bucket_bandwidth_rollups.
We saw on the saltlake satellite that a FlushBatchSize of 1000 helped
reduce contention and query latency.

Change-Id: Ib95e73482219bc5aedc11925b1849fa5999774ba
2021-03-02 14:00:48 +00:00
Vitalii Shpital
300e88f9a7 web/satellite: config flag for satellites in beta
WHAT:
config flag to indicate if satellite is in beta

WHY:
to avoid using hardcoded satellite names which may cause issues

Change-Id: If92eb7417c340bf343a9a91e2f6b11f0349020c5
2021-02-24 12:29:07 +02:00
Michał Niewrzał
908a96ae30 Merge remote-tracking branch 'origin/main' into multipart-upload
Change-Id: I075aaff42ca3f5dc538356cedfccd5939c75e791
2021-02-11 11:48:23 +01:00
Yingrong Zhao
3b49d3cddf satellite: remove referral program related code
This PR removes all back-end related referral program code including the
marketing portal.

We will have a separate PR for front-end code and database migration to
drop `offers` and `usercredits` table

Change-Id: If59f952cddfe0558a7dc03a0eac7cc1081517f88
2021-02-08 13:52:50 +00:00
Kaloyan Raev
6f3d0c4ad5 Merge remote-tracking branch 'origin/main' into multipart-upload
Conflicts:
	go.mod
	go.sum
	satellite/repair/repair_test.go
	satellite/repair/repairer/segments.go

Change-Id: Ie51a56878bee84ad9f2d31135f984881a882e906
2021-02-02 19:19:04 +02:00
Ivan Fraixedes
d93944c57b satellite/orders: Delete unused methods & DB tables
Delete satellite order methods and DB tables which aren't used anymore
after we have done a refactoring on the orders to stuck bucket
information in the orders' encrypted metadata.

There are also configuration parameters and a satellite chore that
aren't needed anymore after the orders refactoring.

Change-Id: Ida3682b95921df70792284b42c96d2508bf8ca9c
2021-02-01 18:01:29 +00:00
Natalie Villasana
91bd4191dd satellite/accounting: add rollup archiver chore
The rollup archiver chore moves bucket bandwidth rollups and
storagenode rollups that are older than a given duration
to two new archive tables.

Change-Id: I1626a3742ad4271bc744fbcefa6355a29d49c6a5
2021-02-01 09:29:54 -05:00
Kaloyan Raev
d0612199f0 Merge remote-tracking branch 'origin/main' into multipart-upload
Conflicts:
	go.mod
	go.sum
	satellite/metainfo/config.go
	satellite/metainfo/metainfo_test.go

Change-Id: I95cf3c1d020a7918795b5eec63f36112fdb86749
2021-02-01 14:32:12 +02:00
Cameron Ayer
89e682b4d7 satellite/repair/checker: add 29/80/130-52 to default repair overrides
Change-Id: I2e5a7538fdf33f3869fcb65fc88f7abb10faad79
2021-01-28 16:55:16 -05:00
Kaloyan Raev
c24ada7114 Merge remote-tracking branch 'origin/main' into multipart-upload
Conflicts:
	go.mod
	go.sum

Change-Id: Icf7c029e9d800e5f6a9fdd208c36f28e05468690
2021-01-20 17:35:57 +02:00
Cameron Ayer
d14607a5f7 satellite/{contact,nodestats,overlay,satellitedb}: remove references to total_uptime_count and uptime_success_count columns
Change-Id: I1f92022909bc564e9b1e31bf937fdfe7c16554de
2021-01-19 15:43:02 -05:00
Cameron Ayer
75d828200c private,satellite: add chore to dq stray nodes
Full scope:
private/testplanet,satellite/{overlay,satellitedb}

Description:
In most cases, downtime tracking with audits will eventually lead
to DQ for nodes who are unresponsive. However, if a stray node has no
pieces, it will not be audited and will thus never be disqualified.
This chore will check for nodes who have not successfully been contacted
in some set time and DQ them.

There are some new flags for toggling DQ of stray nodes and the timeframes
for running the chore and how long nodes can go without contact.

Change-Id: Ic9d41fdbf214736798925e728245180fb3c55615
2021-01-19 14:21:56 -05:00
Michał Niewrzał
ad3e3a38c5 Merge 'main' branch
Change-Id: Ia0db1b1f9ef3e0671d3f2208881b0abc3064e200
2021-01-04 12:13:45 +01:00
Ethan Adams
6070018021
satellite/overlay: use AS OF SYSTEM TIME with Cockroach
Query nodes table using AS OF SYSTEM TIME '-10s' (by default) when on CRDB to alleviate contention on the nodes table and minimize CRDB retries. Queries for standard uploads are already cached, and node lookups for graceful exit uploads has retry logic so it isn't necessary for the nodes returned to be current.
2020-12-22 21:07:07 +02:00
Michal Niewrzal
9a8959d429 Merge 'master' branch
Change-Id: Iba69ea73ca4d3f1cd4ae94243eaaae033c5324e8
2020-12-22 14:55:57 +01:00
Jessica Grebenschikov
d961437889 satellite/orders: remove the config IncludeEncryptedMetadata
Since the Satellite now requires the order encryption functionality (since serial_number table is deprecated) to properly function, we can remove the config flag to turn on/off the feature.

Change-Id: Ie973f72a9a05a81cef9e53dc9c99d22c940c2488
2020-12-18 10:39:29 -08:00
Jessica Grebenschikov
da0327c9b7 satellite/dbcleanup: remove expired serial chore
Change-Id: Ib71d41eb6679d6435e5bc10b6244dac66380a74e
2020-12-18 09:36:28 -08:00
Jessica Grebenschikov
97a5e6c814 satellite/orders: stop inserting/reading from serial_numbers table
This PR contains the minimum changes needed to stop inserting into the serial_numbers table. This is the first step in completely deprecating that table.
The next step is to create another PR to remove the expiredSerial chore, fix more tests, and remove any other methods on the serial_number table.

Change-Id: I5f12a56ebf3fa4d1a1976141d2911f25a98d2cc3
2020-12-18 08:35:13 -08:00
Kaloyan Raev
ce20db9f68 scripts/testdata: update satellite-config.yaml.lock
Change-Id: I6545a75b1de9834ec35ee172cf5db3daa7243295
2020-12-18 12:00:48 +02:00
littleskunk
2437d5b171
satellite/access-grants: default auth service url (#4002)
* satellite/access-grants: default auth service url
2020-12-17 23:38:16 +01:00
littleskunk
3feee9f4f8
satellite/accounting: default project limits (#4001) 2020-12-17 22:27:05 +01:00
Moby von Briesen
3fc76f4ffe satellite/downtime: Remove deprecated downtime tracking service.
We are no longer planning on implementing downtime penalization using
the method described in
docs/blueprints/archive/storage-node-downtime-tracking-deprecated.md.
Now, we are implementing the design described in
docs/blueprints/storage-node-downtime-tracking-with-audits.md.

This change removes the downtime estimation chores from the satellite
core as well as the package satellite/downtime. A future change will
remove the database table.

Change-Id: I1a1d3cf9dceeba36255d25243294865b89925518
2020-12-02 15:16:13 -05:00
VitaliiShpital
bb7677a85f web/satellite: get gateway credentials request using url from config
WHAT:
POST request to get gateway credentials using access grant.
Put request url to config and use it for request.

WHY:
to show gateway credentials on UI

Change-Id: I15ef43ecdeed69b0961d5796aacb47f36d560b1b
2020-11-30 10:36:23 +00:00
JT Olio
6bce907cb0 satellite: try to stream rollups to aggregation function to use less memory
this change tries really hard to never have all of the storage node
rollups in memory at the same time, up until the rollups are actually
getting summed together.

Change-Id: If67f49e7d71106798d996a6850b3e48671bd9e18
2020-11-29 10:26:32 -07:00
JT Olio
6aae21541f satellitedb: do saverollup in batches
Change-Id: I78278a192cba60541eee2986f54a88d5a479bd3e
2020-11-28 19:26:46 -07:00
Moby von Briesen
575f50df84 satellite/repair: Update repair override config to support multiple RS schemes.
Rather than having a single repair override value, we will now support
repair override values based on a particular segment's RS scheme.

The new format for RS override values is
"k/o/n-override,k/o/n-override..."

Change-Id: Ieb422638446ef3a9357d59b2d279ee941367604d
2020-11-23 18:01:15 +00:00
Ethan
2b92bba563 satellite/satellitedb/orders: Handle serial_numbers deletes in smaller increments on CRDB
CRDB doesn't like large deletes. While testing in the POC environment we found that deletes on the serial_numbers table could take hours.  This change limits deletes to 1000 at a time (configurable) to avoid blocking other queries.

Change-Id: I08455e25db1574579dd4d7b7125a08e9c913dff1
2020-11-20 13:44:52 +00:00
Moby von Briesen
0ec685b173 satellite/{satellitedb, repair/{queue, checker}}: Use new column "segmentHealth" instead of "numHealthy" in injured segments queue
We plan to add support for a new Reed-Solomon scheme soon, but our
repair queue orders segments by least number of healthy pieces first.
With a second RS scheme, fewer healthy pieces will not necessarily
correlate to lower health.

This change just adds the new column in a migration. A separate change
will add the new health function.

Right now, since we only support one RS scheme, behavior will not
change. Number of healthy pieces is being inserted as "segment health"
until the new health function is merged.

Segment health is calculated with a new priority function created in
commit 3e5640359. In order to use the function, a new config value is
added, called NodeFailureRate, representing the approximate probability
of any individual node going down in the duration of one checker run.

Change-Id: I51c4202203faf52528d923befbe886dbf86d02f2
2020-11-16 21:18:09 +00:00
Moby von Briesen
db6bc6503d satellite/metainfo: Update metainfo RS config to more easily support multiple RS schemes.
Make metainfo.RSConfig a valid pflag config value. This allows us to
configure the RSConfig as a string like k/m/o/n-shareSize, which makes
having multiple supported RS schemes easier in the future.

RS-related config values that are no longer needed have been removed
(MinTotalThreshold, MaxTotalThreshold, MaxBufferMem, Verify).

Change-Id: I0178ae467dcf4375c504e7202f31443d627c15e1
2020-11-09 22:16:13 +00:00
littleskunk
ed1f6d7973
satellite/config: move repair override from config to default (#3958)
Co-authored-by: Igor <38665104+ihaid@users.noreply.github.com>
2020-10-28 17:24:39 +02:00
Jessica Grebenschikov
f5880f6833 satellite/orders: rollout phase3 of SettlementWithWindow endpoint
Change-Id: Id19fae4f444c83157ce58c933a18be1898430ad0
2020-10-26 14:56:28 +00:00
Moby von Briesen
7c3afe164b satellite/overlay: uncomment dq for offline and disable with feature flag
Change-Id: Ib39e2be32e880b822a94eddfb81af99a38843a27
2020-10-16 12:55:16 +00:00
Jessica Grebenschikov
205c39d404 satellite/orders: upgrade to phase 2 rollout ordersWithWindow
We are moving an error into rejectErr since its preventing storage nodes from being able to settle other orders.

Change-Id: I3ac97c340e491b127f5e0024c5e8bd9f4df8d5c3
2020-10-15 21:20:19 +00:00
Jeff Wendling
0f0faf0a9f satellite/orders: do a better job limiting concurrent requests
Doing it at the ProcessOrders level was insufficient: the endpoints
make multiple database calls. It was a misguided attempt to only
have one spot enter the semaphore. By putting it in the endpoint
we can not only be sure that the concurrency is correctly limited
but it can be configurable easily.

Change-Id: I937149dd077adf9eb87fce52a1a17dc0afe96f64
2020-10-09 16:27:15 -04:00
Jessica Grebenschikov
4a2c66fa06 satellite/accounting: add cache for getting project storage and bw limits
This PR adds the following items:
1) an in-memory read-only cache thats stores project limit info for projectIDs

This cache is stored in-memory since this is expected to be a small amount of data. In this implementation we are only storing in the cache projects that have been accessed. Currently for the largest Satellite (eu-west) there is about 4500 total projects. So storing the storage limit (int64) and the bandwidth limit (int64), this would end up being about 200kb (including the 32 byte project ID) if all 4500 projectIDs were in the cache. So this all fits in memory for the time being. At some point it may not as usage grows, but that seems years out.

The cache is a read only cache. When requests come in to upload/download a file, we will read from the cache what the current limits are for that project. If the cache does not contain the projectID, it will get the info from the database (satellitedb project table), then add it to the cache.

The only time the values in the cache are modified is when either a) the project ID is not in the cache, or b) the item in the cache has expired (default 10mins), then the data gets refreshed out of the database. This occurs by default every 10 mins. This means that if we update the usage limits in the database, that change might not show up in the cache for 10 mins which mean it will not be reflected to limit end users uploading/downloading files for that time period..

Change-Id: I3fd7056cf963676009834fcbcf9c4a0922ca4a8f
2020-09-25 16:28:49 +00:00
Egon Elbre
888bfaae4b cmd/satellite: only add google profiler to satellite
Previously uplink, storagenode etc. included google cloud profiler,
however they don't need it.

Change-Id: Ibc95cb03d667a3844672eecd49fa455a6acc3866
2020-09-25 18:56:59 +03:00
Stefan Benten
9d0d0ad728 satellite/console: enable multiple projects all users
Change-Id: I42cc9f48cac387e1a67d21c1dd394f28cc5ff399
2020-09-23 16:18:28 +00:00
VitaliiShpital
c4d6f472fc web/satellite: notification bar for reaching projects count limit
WHAT:
notification bar added to project dashboard page. It is shown when projects count limit is reached.
Create project button is removed after creating last available project

WHY:
inform user that their projects count limit was reached

Change-Id: If0d67148003be40cc9eb4d8b25cc17f8204008d4
2020-09-08 15:48:27 +00:00
Egon Elbre
dc48197bd8 satellite/orders: add bucket id to order limit
Change-Id: I9019ec77d692e62ac17b67a1da71dc3535cde50c
2020-09-03 10:50:11 +03:00
Egon Elbre
61b17f1214 satellite/orders: add encryption keys flag to Service
Change-Id: Ie96e75bc96241b799d04654ef5e05b82e6a899bb
2020-09-02 05:02:14 +00:00
Natalie Villasana
95ff29cce1 satellite/metainfo: reduce lookupLimit default to 2500
Change-Id: I6569c6d1f145b127a9e8e1a65e4344dd62c989bb
2020-09-01 12:04:48 -04:00
stefanbenten
4645805b18 private/dbutil: set connMaxLifetime to 30 minutes
To prevent longlived unused connections, set the maximum time to 30 minutes to
prevent proxies and loadbalancers forcefully cutting the connection.
This helps in scenarios with low load/requests to a DB.

Change-Id: I7dba15ef97f6f6541e872a6fb1d3a9bbbfe5bb50
2020-08-28 18:00:41 +00:00
Qweder93
88ff8829a1 satellite/gracefulexit: RecvTimeout increased to 2h, so slow nodes stop receiving lot of fails and as a result DQ
Change-Id: Id4c8a394162ba368aeb573a927f825bf7250aa52
2020-08-24 18:59:24 +03:00
Yingrong Zhao
14ad7a4f1c satellite/metainfo: add limiter for objectdeletion and piecedeletion
services

This PR adds a limiter on the amount of concurrent objects deletion can be handled so
we don't run out of memory.

Change-Id: Id2ce368af6f86845fcdfd34cb2f5e460efe9b272
2020-08-19 16:08:29 +00:00
Qweder93
4ee1b2d45a storagenode/console: added list of all audits per satellite to sno dashboard/satellites
Change-Id: I52e58748d6467f372d9a308347fc77e400d137e2
2020-08-10 12:55:07 +00:00
Moby von Briesen
e02adfe5e9 satellite/overlay/config.go: Add AuditHistoryConfig to overlay
Adds AuditHistory{WindowSize, TrackingPeriod, GracePeriod,
OfflineThreshold}. These values will be used to track offline audits over
time, and to suspend/disqualify nodes for being offline for too long.

Change-Id: I05f7dbc3c034bdc53c4fbd7719c71a44f37ec6a5
2020-08-04 18:18:56 +00:00
Jeff Wendling
85a74b47e7 satellite/orders: 3-phase rollout
This adds a config flag orders.window-endpoint-rollout-phase
that can take on the values phase1, phase2 or phase3.

In phase1, the current orders endpoint continues to work as
usual, and the windowed orders endpoint uses the same backend
as the current one (but also does a bit extra).

In phase2, the current orders endpoint is disabled and the
windowed orders endpoint continues to use the same backend.

In phase3, the current orders endpoint is still disabled and
the windowed orders endpoint uses the new backend that requires
much less database traffic and state.

The intention is to deploy in phase1, roll out code to nodes
to have them use the windowed endpoint, switch to phase2, wait
a couple days for all existing orders to expire, then switch
to phase3.

Additionally, it fixes a bug where a node could submit a bunch
of orders and rack up charges for a bucket.

Change-Id: Ifdc10e09ae1645159cbec7ace687dcb2d594c76d
2020-08-03 17:01:42 +00:00