Commit Graph

462 Commits

Author SHA1 Message Date
Kaloyan Raev
d0612199f0 Merge remote-tracking branch 'origin/main' into multipart-upload
Conflicts:
	go.mod
	go.sum
	satellite/metainfo/config.go
	satellite/metainfo/metainfo_test.go

Change-Id: I95cf3c1d020a7918795b5eec63f36112fdb86749
2021-02-01 14:32:12 +02:00
Cameron Ayer
89e682b4d7 satellite/repair/checker: add 29/80/130-52 to default repair overrides
Change-Id: I2e5a7538fdf33f3869fcb65fc88f7abb10faad79
2021-01-28 16:55:16 -05:00
Ivan Fraixedes
49c8e94480 scripts: Add test Satellite working w/o Redis
Create a storj-sim test that checks that uplink operations work both when
the satellite can connect to Redis and when it cannot, which simulates
Redis downtime. It also verifies that the satellite can start while Redis
is down.

This test currently doesn't pass. It will be used to verify the work
needed to make sure that the satellite allows clients to perform their
operations while Redis is unavailable. We require these changes before we
deploy any customer-facing satellite on a multi-region architecture.

NOTE: this test will be added to Jenkins later so that it runs every time
we apply changes. At that point we'll see whether it has to be adjusted to
run on Jenkins; as it is now it may not work there, because the scripts
start and stop a Redis Docker container.

Change-Id: I22acb22f0ca594583e36b45c88f8c03bac73b329
2021-01-25 16:02:59 +01:00
Jessica Grebenschikov
1f1d9fce58 satellite/console/wasm: add test to confirm wasm size isn't growing
Change-Id: I975a9f8ac3f6b98cc213140fdd7a99557efe14c8
2021-01-21 15:48:49 +00:00
Kaloyan Raev
c24ada7114 Merge remote-tracking branch 'origin/main' into multipart-upload
Conflicts:
	go.mod
	go.sum

Change-Id: Icf7c029e9d800e5f6a9fdd208c36f28e05468690
2021-01-20 17:35:57 +02:00
Cameron Ayer
d14607a5f7 satellite/{contact,nodestats,overlay,satellitedb}: remove references to total_uptime_count and uptime_success_count columns
Change-Id: I1f92022909bc564e9b1e31bf937fdfe7c16554de
2021-01-19 15:43:02 -05:00
Cameron Ayer
75d828200c private,satellite: add chore to dq stray nodes
Full scope:
private/testplanet,satellite/{overlay,satellitedb}

Description:
In most cases, downtime tracking with audits will eventually lead
to DQ for nodes who are unresponsive. However, if a stray node has no
pieces, it will not be audited and will thus never be disqualified.
This chore will check for nodes who have not successfully been contacted
in some set time and DQ them.

There are some new flags for toggling DQ of stray nodes and the timeframes
for running the chore and how long nodes can go without contact.
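
The selection step described above can be sketched in Go as follows. This is a minimal illustration, not the satellite's implementation: the `Node` type, its field names, `strayNodes`, the sample data, and the 30-day threshold are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"time"
)

// Node holds only the fields this sketch needs; the field names are
// hypothetical and not taken from the satellite schema.
type Node struct {
	ID                 string
	LastContactSuccess time.Time
	Disqualified       bool
}

// strayNodes returns the IDs of nodes that are not yet disqualified and have
// not been contacted successfully within maxSilence before now.
func strayNodes(nodes []Node, now time.Time, maxSilence time.Duration) []string {
	cutoff := now.Add(-maxSilence)
	var stray []string
	for _, n := range nodes {
		if !n.Disqualified && n.LastContactSuccess.Before(cutoff) {
			stray = append(stray, n.ID)
		}
	}
	return stray
}

// sampleNow and sampleNodes provide fixed demo data for the example.
var sampleNow = time.Date(2021, 1, 19, 0, 0, 0, 0, time.UTC)

func sampleNodes() []Node {
	return []Node{
		{ID: "fresh", LastContactSuccess: sampleNow.Add(-time.Hour)},
		{ID: "stray", LastContactSuccess: sampleNow.Add(-60 * 24 * time.Hour)},
		{ID: "already-dq", LastContactSuccess: sampleNow.Add(-90 * 24 * time.Hour), Disqualified: true},
	}
}

func main() {
	// 30 days is an arbitrary example threshold, not the project's default.
	fmt.Println(strayNodes(sampleNodes(), sampleNow, 30*24*time.Hour))
}
```

A periodic chore would run a check like this on a timer and disqualify the returned nodes; the flags mentioned above would control whether it runs, how often, and the threshold.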

Change-Id: Ic9d41fdbf214736798925e728245180fb3c55615
2021-01-19 14:21:56 -05:00
Michał Niewrzał
ac058e5ecc metainfo-migration: basic pointerdb->metabase migrator
Change-Id: If183b9c99ce816521fcae6e7189a6f85ddd4eb48
2021-01-12 12:59:53 +00:00
Michał Niewrzał
ad3e3a38c5 Merge 'main' branch
Change-Id: Ia0db1b1f9ef3e0671d3f2208881b0abc3064e200
2021-01-04 12:13:45 +01:00
Stefan Benten
7f1871b8f1 all: switch from master to main
2020-12-28 22:59:06 +01:00
Ethan Adams
6070018021 satellite/overlay: use AS OF SYSTEM TIME with Cockroach
Query nodes table using AS OF SYSTEM TIME '-10s' (by default) when on CRDB to alleviate contention on the nodes table and minimize CRDB retries. Queries for standard uploads are already cached, and node lookups for graceful exit uploads have retry logic, so the nodes returned don't need to be current.
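
A minimal Go sketch of how such a clause might be added only on CockroachDB. The helper names (`asOfSystemTime`, `selectNodesQuery`) and the column names in the query are assumptions for illustration and are not taken from the commit.

```go
package main

import (
	"fmt"
	"time"
)

// asOfSystemTime returns an AS OF SYSTEM TIME clause for CockroachDB.
// It returns an empty string for other databases, or when the interval is
// not negative (a stale read has to point into the past).
// The function name and signature are hypothetical.
func asOfSystemTime(isCockroach bool, interval time.Duration) string {
	if !isCockroach || interval >= 0 {
		return ""
	}
	// time.Duration formats -10 * time.Second as "-10s".
	return fmt.Sprintf(" AS OF SYSTEM TIME '%s'", interval)
}

// selectNodesQuery builds an illustrative query; the column names are
// assumptions. In CockroachDB the clause follows the FROM table list.
func selectNodesQuery(isCockroach bool, interval time.Duration) string {
	return "SELECT id, address FROM nodes" +
		asOfSystemTime(isCockroach, interval) +
		" WHERE disqualified IS NULL"
}

func main() {
	fmt.Println(selectNodesQuery(true, -10*time.Second))
	fmt.Println(selectNodesQuery(false, -10*time.Second))
}
```

Reading slightly stale data this way trades freshness for fewer transaction conflicts, which matches the reasoning in the message that the returned nodes need not be current.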
2020-12-22 21:07:07 +02:00
Michal Niewrzal
9a8959d429 Merge 'master' branch
Change-Id: Iba69ea73ca4d3f1cd4ae94243eaaae033c5324e8
2020-12-22 14:55:57 +01:00
Stefan Benten
d14f2e4164 Makefile,scripts: move from aarch64 to arm64v8 (#4008)
It turns out that alpine dropped support/updates for the aarch64 image.
Instead they have been using the arm64v8 notation for quite a while,
which resulted in breaking our recent aarch64 builds due to missing
dependencies/updates.

Both arches are exactly the same: aarch64 is the name that originated with
GNU and arm64 the one that originated with Apple. The backends have since
been merged, and arm64 became the de facto standard.
2020-12-21 14:50:57 +01:00
Jessica Grebenschikov
d961437889 satellite/orders: remove the config IncludeEncryptedMetadata
Because the Satellite now requires the order encryption functionality to function properly (the serial_number table is deprecated), we can remove the config flag that turns the feature on and off.

Change-Id: Ie973f72a9a05a81cef9e53dc9c99d22c940c2488
2020-12-18 10:39:29 -08:00
Jessica Grebenschikov
da0327c9b7 satellite/dbcleanup: remove expired serial chore
Change-Id: Ib71d41eb6679d6435e5bc10b6244dac66380a74e
2020-12-18 09:36:28 -08:00
Jessica Grebenschikov
97a5e6c814 satellite/orders: stop inserting/reading from serial_numbers table
This PR contains the minimum changes needed to stop inserting into the serial_numbers table. This is the first step in completely deprecating that table.
The next step is to create another PR to remove the expiredSerial chore, fix more tests, and remove any other methods on the serial_number table.

Change-Id: I5f12a56ebf3fa4d1a1976141d2911f25a98d2cc3
2020-12-18 08:35:13 -08:00
Kaloyan Raev
ce20db9f68 scripts/testdata: update satellite-config.yaml.lock
Change-Id: I6545a75b1de9834ec35ee172cf5db3daa7243295
2020-12-18 12:00:48 +02:00
Michal Niewrzal
2111740236 Merge 'master' branch
Change-Id: Ib73af0ff3ce0e9a1547b0b9fc55bf88704f6f394
2020-12-18 09:13:24 +01:00
littleskunk
2437d5b171 satellite/access-grants: default auth service url (#4002)
2020-12-17 23:38:16 +01:00
littleskunk
3feee9f4f8 satellite/accounting: default project limits (#4001)
2020-12-17 22:27:05 +01:00
VitaliiShpital
79a3a47805 build: added brotli compression for wasm bits
WHAT:
Added brotli compression for wasm files and added copying of those files to the static/wasm folder in the Dockerfile.

WHY:
Those files are part of the web worker webpack bundle, and I didn't find a way to compress them separately using webpack.
I'm open to other ideas if any come up.

Change-Id: I105cc1582e9816fd9b63052ba48358525c85a164
2020-12-17 19:23:53 +00:00
Michal Niewrzal
2381ca2810 Merge 'master' branch
Change-Id: I4a3e45a2a2cdacfd87d16b148cfb4c6671c20b15
2020-12-17 13:17:17 +01:00
Ivan Fraixedes
187680b0c1 scripts: Fix typo in a comment
Change-Id: If79e778e786db06d2263bcd5393f639a4bb92542
2020-12-15 18:58:54 +01:00
Michal Niewrzal
57f374af24 Merge 'master' branch
Change-Id: Idf6b10ea7ca94e4d232e6a3b6a38ef2e646ba197
2020-12-15 08:26:53 +01:00
Stefan Benten
8fe829d5fd build: add wasm bits to Dockerfile and bump to go v1.15.6 (#3992)
2020-12-11 02:23:39 +01:00
Michal Niewrzal
218bbeaffa Merge 'master' branch
Change-Id: Ica5c25607a951076dd9f77e35e308062f71ce3f0
2020-12-07 15:05:52 +01:00
Yingrong Zhao
746315672f scripts/tests/testversions: fix indentation
Change-Id: Iaa5aec27f0ad78e1d8bf1a68aa5a62762c8ab537
2020-12-04 21:54:55 +00:00
Yingrong Zhao
0faf7d5293 scripts/tests/testversions: fix race in install_sim
Change-Id: I0792686d99a222d5977fd913425e2b94d100c40e
2020-12-04 18:18:14 +00:00
Kaloyan Raev
cc9e9ee1f5 storj-sim: use gateway from multipart-upload branch
Change-Id: I0886d277b3b757c8b00975a3e95c2d0d1228488b
2020-12-04 15:15:47 +02:00
Yingrong Zhao
13555f3983 scripts/tests/testversions: enable concurrent installation for each version

The test-versions test currently takes 1h 40min to run each time.
Running the installations concurrently should reduce the execution
time of the whole test.
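
The test itself is a shell script; the following Go sketch only illustrates the general pattern of running installations concurrently while still reporting a failure to the caller. `installAll`, the `install` callback, `errInstallFailed`, and the version strings are hypothetical names chosen for the example.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errInstallFailed is a placeholder error used by the demo installer.
var errInstallFailed = errors.New("install failed")

// installAll runs install for every version concurrently, waits for all of
// them to finish, and returns the first error in input order, if any.
func installAll(versions []string, install func(version string) error) error {
	var wg sync.WaitGroup
	errs := make([]error, len(versions))
	for i, v := range versions {
		wg.Add(1)
		go func(i int, v string) {
			defer wg.Done()
			// Each goroutine writes to its own slot, so no mutex is needed.
			errs[i] = install(v)
		}(i, v)
	}
	wg.Wait()
	for i, err := range errs {
		if err != nil {
			return fmt.Errorf("install %s: %w", versions[i], err)
		}
	}
	return nil
}

func main() {
	versions := []string{"v1.1.0", "v1.2.0", "v1.3.0"} // example values only
	err := installAll(versions, func(v string) error {
		if v == "v1.3.0" {
			return errInstallFailed
		}
		return nil
	})
	fmt.Println(err)
}
```

Collecting one error per job and checking them after the wait is what lets the caller exit with a non-zero status when any installation fails.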

Change-Id: I680c7d9945e982894b11825c9075c167f754e087
2020-12-03 15:01:37 +00:00
Moby von Briesen
3fc76f4ffe satellite/downtime: Remove deprecated downtime tracking service.
We are no longer planning on implementing downtime penalization using
the method described in
docs/blueprints/archive/storage-node-downtime-tracking-deprecated.md.
Now, we are implementing the design described in
docs/blueprints/storage-node-downtime-tracking-with-audits.md.

This change removes the downtime estimation chores from the satellite
core as well as the package satellite/downtime. A future change will
remove the database table.

Change-Id: I1a1d3cf9dceeba36255d25243294865b89925518
2020-12-02 15:16:13 -05:00
VitaliiShpital
bb7677a85f web/satellite: get gateway credentials request using url from config
WHAT:
Send a POST request to get gateway credentials using an access grant.
Put the request URL in the config and use it for the request.

WHY:
to show gateway credentials in the UI

Change-Id: I15ef43ecdeed69b0961d5796aacb47f36d560b1b
2020-11-30 10:36:23 +00:00
JT Olio
6bce907cb0 satellite: try to stream rollups to aggregation function to use less memory
This change tries hard never to hold all of the storage node
rollups in memory at the same time, up to the point where the rollups
are actually summed together.
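
One way to picture the streaming approach is a callback-based iterator that hands rollups to the aggregation one at a time. The Go sketch below assumes a hypothetical `Rollup` type and `sumByNode` function; it is not the code from this commit.

```go
package main

import "fmt"

// Rollup is a simplified, hypothetical stand-in for a storage node rollup.
type Rollup struct {
	NodeID string
	Bytes  int64
}

// sumByNode consumes rollups one at a time through the iterate callback and
// keeps only the running totals in memory, never the full list of rollups.
func sumByNode(iterate func(fn func(Rollup) error) error) (map[string]int64, error) {
	totals := make(map[string]int64)
	err := iterate(func(r Rollup) error {
		totals[r.NodeID] += r.Bytes
		return nil
	})
	return totals, err
}

func main() {
	// The source here is a small slice; in practice it could be database
	// rows read in pages.
	source := func(fn func(Rollup) error) error {
		for _, r := range []Rollup{{"node-a", 10}, {"node-b", 5}, {"node-a", 7}} {
			if err := fn(r); err != nil {
				return err
			}
		}
		return nil
	}
	totals, err := sumByNode(source)
	fmt.Println(totals, err)
}
```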

Change-Id: If67f49e7d71106798d996a6850b3e48671bd9e18
2020-11-29 10:26:32 -07:00
JT Olio
6aae21541f satellitedb: do saverollup in batches
Change-Id: I78278a192cba60541eee2986f54a88d5a479bd3e
2020-11-28 19:26:46 -07:00
Moby von Briesen
575f50df84 satellite/repair: Update repair override config to support multiple RS schemes.
Rather than having a single repair override value, we will now support
repair override values based on a particular segment's RS scheme.

The new format for RS override values is
"k/o/n-override,k/o/n-override..."

Change-Id: Ieb422638446ef3a9357d59b2d279ee941367604d
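
The override format described in this message, "k/o/n-override,k/o/n-override...", could be parsed along these lines. This is a sketch under assumptions: `RSScheme` and `ParseRepairOverrides` are hypothetical names, and the sample value is the one that appears in the later "add 29/80/130-52" commit.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// RSScheme identifies a Reed-Solomon scheme by the k/o/n values in the key.
type RSScheme struct{ K, O, N int }

// ParseRepairOverrides parses "k/o/n-override,k/o/n-override..." into a map
// from scheme to override value.
func ParseRepairOverrides(s string) (map[RSScheme]int, error) {
	out := make(map[RSScheme]int)
	for _, entry := range strings.Split(s, ",") {
		entry = strings.TrimSpace(entry)
		if entry == "" {
			continue
		}
		parts := strings.SplitN(entry, "-", 2)
		if len(parts) != 2 {
			return nil, fmt.Errorf("invalid entry %q: missing override", entry)
		}
		fields := strings.Split(parts[0], "/")
		if len(fields) != 3 {
			return nil, fmt.Errorf("invalid entry %q: want k/o/n", entry)
		}
		var nums [3]int
		for i, f := range fields {
			n, err := strconv.Atoi(f)
			if err != nil {
				return nil, fmt.Errorf("invalid entry %q: %w", entry, err)
			}
			nums[i] = n
		}
		override, err := strconv.Atoi(parts[1])
		if err != nil {
			return nil, fmt.Errorf("invalid entry %q: %w", entry, err)
		}
		out[RSScheme{nums[0], nums[1], nums[2]}] = override
	}
	return out, nil
}

func main() {
	overrides, err := ParseRepairOverrides("29/80/130-52")
	fmt.Println(overrides, err)
}
```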
2020-11-23 18:01:15 +00:00
Ethan
2b92bba563 satellite/satellitedb/orders: Handle serial_numbers deletes in smaller increments on CRDB
CRDB doesn't like large deletes. While testing in the POC environment we found that deletes on the serial_numbers table could take hours. This change limits deletes to 1000 at a time (configurable) to avoid blocking other queries.
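
The batching loop can be sketched in Go as below. The `deleteInBatches` helper and its callback are hypothetical; a real callback would issue a size-limited DELETE against the database, whereas this one simulates a table with a counter.

```go
package main

import (
	"errors"
	"fmt"
)

// deleteInBatches repeatedly calls del with the batch size until a call
// deletes fewer rows than requested, which means nothing is left to delete.
// del is a hypothetical callback that deletes at most limit rows and
// reports how many it removed.
func deleteInBatches(batchSize int, del func(limit int) (int, error)) (int, error) {
	if batchSize <= 0 {
		return 0, errors.New("batch size must be positive")
	}
	total := 0
	for {
		n, err := del(batchSize)
		if err != nil {
			return total, err
		}
		total += n
		if n < batchSize {
			return total, nil
		}
	}
}

func main() {
	remaining, calls := 2500, 0 // simulated table with 2500 expired rows
	total, err := deleteInBatches(1000, func(limit int) (int, error) {
		calls++
		n := limit
		if remaining < n {
			n = remaining
		}
		remaining -= n
		return n, nil
	})
	fmt.Println(total, calls, err) // prints "2500 3 <nil>"
}
```

Keeping each statement small means no single delete holds locks for long, at the cost of more round trips.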

Change-Id: I08455e25db1574579dd4d7b7125a08e9c913dff1
2020-11-20 13:44:52 +00:00
Egon Elbre
e19fabc880 scripts/tests/rollingupgrade: fix typo in flag
Change-Id: Ia3e8d076741a30fb2a42af9b2621796a814c75ae
2020-11-18 19:15:42 +00:00
Egon Elbre
8da5e6a554 scripts/tests/rollingupgrade: use wait-for instead of sleep
Change-Id: Ie879e061d3b312705726375953767d420e922073
2020-11-18 12:00:16 +00:00
Moby von Briesen
0ec685b173 satellite/{satellitedb, repair/{queue, checker}}: Use new column "segmentHealth" instead of "numHealthy" in injured segments queue
We plan to add support for a new Reed-Solomon scheme soon, but our
repair queue orders segments by least number of healthy pieces first.
With a second RS scheme, fewer healthy pieces will not necessarily
correlate to lower health.

This change just adds the new column in a migration. A separate change
will add the new health function.

Right now, since we only support one RS scheme, behavior will not
change. Number of healthy pieces is being inserted as "segment health"
until the new health function is merged.

Segment health is calculated with a new priority function created in
commit 3e5640359. In order to use the function, a new config value is
added, called NodeFailureRate, representing the approximate probability
of any individual node going down in the duration of one checker run.

Change-Id: I51c4202203faf52528d923befbe886dbf86d02f2
2020-11-16 21:18:09 +00:00
littleskunk
9ab824d3e6 jenkins/rollingupgrade: sleep 5 seconds between old api startup and database migration (#3971)
2020-11-16 21:25:11 +01:00
Egon Elbre
1726b39ed2 scripts: remove thrift mod replace
Currently our code only uses the GitHub version of the code, so there
should be no need for the exception.

Change-Id: I0c6e8a8465ab7b525d4b5d1b29e4e5298384286d
2020-11-16 12:00:09 +02:00
Yingrong Zhao
54c5d564a1 scripts/tests/testversions: fix older uplink setup
This PR makes the following changes:
    1. Change oldest uplink version in the test to v0.35.3
        When the test was first created, we decided to support uplink
        versions starting from v0.17.1. However, after many API changes,
        older uplinks are no longer usable with the latest version of the
        network. One reason is that older uplinks use deprecated
        endpoints. Therefore, we change the oldest uplink version to
        one that uses only new endpoints.
    2. Disable TLS certificate verification in uplink
    3. Use the storj-sim version control server instead of the production one
    4. Skip uplink version v1.3.x due to a bug in that release

Change-Id: I926a6bb9829cb7181ee752437cdcb67e59197fe0
2020-11-11 17:00:01 -05:00
Yingrong Zhao
8fd841b910 scripts/tests/testversions: fix installation during setup
This PR fixes the issues below:
1. remove concurrent installation for various versions
    We were doing this to decrease the execution time of the versions test.
    However, it returns an incorrect exit code when there's an
    installation failure.
    Right now, we are only installing two versions of `storj-sim` and
    the rest are only doing uplink CLI installation. The performance of
    this test should not be hugely impacted by the setup step now.
2. only remove release settings instead of deleting the entire file
    uplink CLI is referencing the `private/version` package. Therefore, we
    cannot delete it
3. add back `GATEWAY_0_API_KEY` in storj-sim
    In order to set up older versions of uplink CLI, we need access to
    the gateway API key.

Change-Id: Ia3c37c197bd007b6e1f7c2bd71adde42181d46f0
2020-11-10 20:38:49 +00:00
Moby von Briesen
db6bc6503d satellite/metainfo: Update metainfo RS config to more easily support multiple RS schemes.
Make metainfo.RSConfig a valid pflag config value. This allows us to
configure the RSConfig as a string like k/m/o/n-shareSize, which makes
having multiple supported RS schemes easier in the future.

RS-related config values that are no longer needed have been removed
(MinTotalThreshold, MaxTotalThreshold, MaxBufferMem, Verify).
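
A flag value in the "k/m/o/n-shareSize" form can be supported by a type with `String`, `Set`, and `Type` methods, which is the method set pflag expects of a flag value. The Go sketch below is an assumption-laden illustration: the field names are placeholders, the share size is treated as a plain integer, and the sample numbers are arbitrary.

```go
package main

import "fmt"

// RSConfig is a hypothetical stand-in for the config value described above.
// The share size is modeled as a plain integer for simplicity.
type RSConfig struct {
	K, M, O, N int
	ShareSize  int
}

// String renders the config in the k/m/o/n-shareSize form.
func (c *RSConfig) String() string {
	return fmt.Sprintf("%d/%d/%d/%d-%d", c.K, c.M, c.O, c.N, c.ShareSize)
}

// Set parses a k/m/o/n-shareSize string. Sscanf stops after the last number,
// so trailing characters are not rejected in this sketch.
func (c *RSConfig) Set(s string) error {
	var parsed RSConfig
	_, err := fmt.Sscanf(s, "%d/%d/%d/%d-%d",
		&parsed.K, &parsed.M, &parsed.O, &parsed.N, &parsed.ShareSize)
	if err != nil {
		return fmt.Errorf("invalid RS config %q: %w", s, err)
	}
	*c = parsed
	return nil
}

// Type names the value kind shown in flag help output; the name is made up.
func (c *RSConfig) Type() string { return "rs-config" }

func main() {
	var cfg RSConfig
	if err := cfg.Set("29/35/80/110-256"); err != nil { // example values only
		fmt.Println("error:", err)
		return
	}
	fmt.Println(cfg.String(), cfg.N)
}
```

Keeping the whole scheme in one string is what makes a future list of supported schemes easy to express as a single config entry.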

Change-Id: I0178ae467dcf4375c504e7202f31443d627c15e1
2020-11-09 22:16:13 +00:00
littleskunk
ed1f6d7973 satellite/config: move repair override from config to default (#3958)
Co-authored-by: Igor <38665104+ihaid@users.noreply.github.com>
2020-10-28 17:24:39 +02:00
Jessica Grebenschikov
99c88efbbf scripts/tests: fix gateway tests
Change-Id: I9a23ef08794043ad615066ae5929df9ff3a02d69
2020-10-27 08:21:28 -07:00
Jessica Grebenschikov
f5880f6833 satellite/orders: rollout phase3 of SettlementWithWindow endpoint
Change-Id: Id19fae4f444c83157ce58c933a18be1898430ad0
2020-10-26 14:56:28 +00:00
Yingrong Zhao
746cbfc659 scripts/tests/rollingupgrade: test current release version on master branch

Currently, we test upgrading from the previous release version to the
latest master on each master build.
However, this behavior is only desired when the test runs on a
release branch.

Change-Id: Iaeb66f44951c9e4934ca3c8316d1e490d7958239
2020-10-22 11:45:54 -04:00
Moby von Briesen
7c3afe164b satellite/overlay: uncomment dq for offline and disable with feature flag
Change-Id: Ib39e2be32e880b822a94eddfb81af99a38843a27
2020-10-16 12:55:16 +00:00
Jessica Grebenschikov
205c39d404 satellite/orders: upgrade to phase 2 rollout ordersWithWindow
We are moving an error into rejectErr since it's preventing storage nodes from being able to settle other orders.

Change-Id: I3ac97c340e491b127f5e0024c5e8bd9f4df8d5c3
2020-10-15 21:20:19 +00:00