Commit Graph

437 Commits

Author SHA1 Message Date
Michal Niewrzal
218bbeaffa Merge 'master' branch
Change-Id: Ica5c25607a951076dd9f77e35e308062f71ce3f0
2020-12-07 15:05:52 +01:00
Yingrong Zhao
746315672f scripts/tests/testversions: fix indentation
Change-Id: Iaa5aec27f0ad78e1d8bf1a68aa5a62762c8ab537
2020-12-04 21:54:55 +00:00
Yingrong Zhao
0faf7d5293 scripts/tests/testversions: fix race in install_sim
Change-Id: I0792686d99a222d5977fd913425e2b94d100c40e
2020-12-04 18:18:14 +00:00
Kaloyan Raev
cc9e9ee1f5 storj-sim: use gateway from multipart-upload branch
Change-Id: I0886d277b3b757c8b00975a3e95c2d0d1228488b
2020-12-04 15:15:47 +02:00
Yingrong Zhao
13555f3983 scripts/tests/testversions: enable concurrent installation for each
version

The test-versions test currently takes 1h 40min to run each time. By
running each installation concurrently, hopefully, it will reduce the execution
time for the whole test.

Change-Id: I680c7d9945e982894b11825c9075c167f754e087
2020-12-03 15:01:37 +00:00
Moby von Briesen
3fc76f4ffe satellite/downtime: Remove deprecated downtime tracking service.
We are no longer planning on implementing downtime penalization using
the method described in
docs/blueprints/archive/storage-node-downtime-tracking-deprecated.md.
Now, we are implementing the design described in
docs/blueprints/storage-node-downtime-tracking-with-audits.md.

This change removes the downtime estimation chores from the satellite
core as well as the package satellite/downtime. A future change will
remove the database table.

Change-Id: I1a1d3cf9dceeba36255d25243294865b89925518
2020-12-02 15:16:13 -05:00
VitaliiShpital
bb7677a85f web/satellite: get gateway credentials request using url from config
WHAT:
POST request to get gateway credentials using access grant.
Put request url to config and use it for request.

WHY:
to show gateway credentials on UI

Change-Id: I15ef43ecdeed69b0961d5796aacb47f36d560b1b
2020-11-30 10:36:23 +00:00
JT Olio
6bce907cb0 satellite: try to stream rollups to aggregation function to use less memory
this change tries really hard to never have all of the storage node
rollups in memory at the same time, up until the rollups are actually
getting summed together.

Change-Id: If67f49e7d71106798d996a6850b3e48671bd9e18
2020-11-29 10:26:32 -07:00
JT Olio
6aae21541f satellitedb: do saverollup in batches
Change-Id: I78278a192cba60541eee2986f54a88d5a479bd3e
2020-11-28 19:26:46 -07:00
Moby von Briesen
575f50df84 satellite/repair: Update repair override config to support multiple RS schemes.
Rather than having a single repair override value, we will now support
repair override values based on a particular segment's RS scheme.

The new format for RS override values is
"k/o/n-override,k/o/n-override..."

Change-Id: Ieb422638446ef3a9357d59b2d279ee941367604d
2020-11-23 18:01:15 +00:00
Ethan
2b92bba563 satellite/satellitedb/orders: Handle serial_numbers deletes in smaller increments on CRDB
CRDB doesn't like large deletes. While testing in the POC environment we found that deletes on the serial_numbers table could take hours.  This change limits deletes to 1000 at a time (configurable) to avoid blocking other queries.

Change-Id: I08455e25db1574579dd4d7b7125a08e9c913dff1
2020-11-20 13:44:52 +00:00
Egon Elbre
e19fabc880 scripts/tests/rollingupgrade: fix typo in flag
Change-Id: Ia3e8d076741a30fb2a42af9b2621796a814c75ae
2020-11-18 19:15:42 +00:00
Egon Elbre
8da5e6a554 scripts/tests/rollingupgrade: use wait-for instead of sleep
Change-Id: Ie879e061d3b312705726375953767d420e922073
2020-11-18 12:00:16 +00:00
Moby von Briesen
0ec685b173 satellite/{satellitedb, repair/{queue, checker}}: Use new column "segmentHealth" instead of "numHealthy" in injured segments queue
We plan to add support for a new Reed-Solomon scheme soon, but our
repair queue orders segments by least number of healthy pieces first.
With a second RS scheme, fewer healthy pieces will not necessarily
correlate to lower health.

This change just adds the new column in a migration. A separate change
will add the new health function.

Right now, since we only support one RS scheme, behavior will not
change. Number of healthy pieces is being inserted as "segment health"
until the new health function is merged.

Segment health is calculated with a new priority function created in
commit 3e5640359. In order to use the function, a new config value is
added, called NodeFailureRate, representing the approximate probability
of any individual node going down in the duration of one checker run.

Change-Id: I51c4202203faf52528d923befbe886dbf86d02f2
2020-11-16 21:18:09 +00:00
littleskunk
9ab824d3e6
jenkins/rollingupgrade: sleep 5 seconds between old api startup and database migration (#3971) 2020-11-16 21:25:11 +01:00
Egon Elbre
1726b39ed2 scripts: remove thrift mod replace
Currently our code is only using github version of the code, so there
shouldn't be need for the exception.

Change-Id: I0c6e8a8465ab7b525d4b5d1b29e4e5298384286d
2020-11-16 12:00:09 +02:00
Yingrong Zhao
54c5d564a1 scripts/tests/testversions: fix older uplink setup
This PR does follwing changes:
    1. Change oldest uplink version in the test to v0.35.3
        When the test is first created, we decided to support uplink
        version starting from v0.17.1, however with many API changes,
        older uplinks are not usable with latest version of the network
        anymore. One of the reasons being older uplinks uses deprecated
        endpoint. Therefore, we will change the oldest uplink version to
        the one that's using only new endpoints.
    2. Disable tls certificate verification in uplink
    3. Use storj-sim version control server instead of production one
    4. Skip uplink version v1.3.x due to bug in that release

Change-Id: I926a6bb9829cb7181ee752437cdcb67e59197fe0
2020-11-11 17:00:01 -05:00
Yingrong Zhao
8fd841b910 scripts/tests/testversions: fix installation during setup
This PR fixes below issues:
1. remove concurrent installation for various versions
    We were doing this to decrease the amount of execution time the versions test.
    However, it's returning incorrect exit code when there's an
    installation failure.
    Right now, we are only installing two versions of `storj-sim` and
    the rest are only doing uplink cli installation. The performance of
    this test should be hugely impacted by the setup step now.
2. only remove release settings instead of deleting the entire file
    uplink CLI is referrencing `private/version` package. Therefore, we
    cannot delete it
3. add back `GATEWAY_0_API_KEY` in storj-sim
    In order to set up older version of uplink cli, we need access to
    the gate way api key.

Change-Id: Ia3c37c197bd007b6e1f7c2bd71adde42181d46f0
2020-11-10 20:38:49 +00:00
Moby von Briesen
db6bc6503d satellite/metainfo: Update metainfo RS config to more easily support multiple RS schemes.
Make metainfo.RSConfig a valid pflag config value. This allows us to
configure the RSConfig as a string like k/m/o/n-shareSize, which makes
having multiple supported RS schemes easier in the future.

RS-related config values that are no longer needed have been removed
(MinTotalThreshold, MaxTotalThreshold, MaxBufferMem, Verify).

Change-Id: I0178ae467dcf4375c504e7202f31443d627c15e1
2020-11-09 22:16:13 +00:00
littleskunk
ed1f6d7973
satellite/config: move repair override from config to default (#3958)
Co-authored-by: Igor <38665104+ihaid@users.noreply.github.com>
2020-10-28 17:24:39 +02:00
Jessica Grebenschikov
99c88efbbf scripts/tests: fix gateway tests
Change-Id: I9a23ef08794043ad615066ae5929df9ff3a02d69
2020-10-27 08:21:28 -07:00
Jessica Grebenschikov
f5880f6833 satellite/orders: rollout phase3 of SettlementWithWindow endpoint
Change-Id: Id19fae4f444c83157ce58c933a18be1898430ad0
2020-10-26 14:56:28 +00:00
Yingrong Zhao
746cbfc659 scripts/tests/rollingupgrade: test current release version on master
branch

Currently, we are testing previous release version upgrading to latest
master on each master build
However, this behavior is only desired when the test is running on a
release branch.

Change-Id: Iaeb66f44951c9e4934ca3c8316d1e490d7958239
2020-10-22 11:45:54 -04:00
Moby von Briesen
7c3afe164b satellite/overlay: uncomment dq for offline and disable with feature flag
Change-Id: Ib39e2be32e880b822a94eddfb81af99a38843a27
2020-10-16 12:55:16 +00:00
Jessica Grebenschikov
205c39d404 satellite/orders: upgrade to phase 2 rollout ordersWithWindow
We are moving an error into rejectErr since its preventing storage nodes from being able to settle other orders.

Change-Id: I3ac97c340e491b127f5e0024c5e8bd9f4df8d5c3
2020-10-15 21:20:19 +00:00
Egon Elbre
f06ce1ef01 release: remove binary stripping
We've had bizarre crashes for satellite and it's difficult to debug
stripped binaries. The only binary where stripping is useful, is uplink
cli. This change will increase uplink by 5MB.

Change-Id: I4d1dfd36452063c22e8471d99eec97f6de6167b8
2020-10-14 10:06:20 +03:00
Jeff Wendling
0f0faf0a9f satellite/orders: do a better job limiting concurrent requests
Doing it at the ProcessOrders level was insufficient: the endpoints
make multiple database calls. It was a misguided attempt to only
have one spot enter the semaphore. By putting it in the endpoint
we can not only be sure that the concurrency is correctly limited
but it can be configurable easily.

Change-Id: I937149dd077adf9eb87fce52a1a17dc0afe96f64
2020-10-09 16:27:15 -04:00
Kaloyan Raev
bd177bff03 cmd/storj-sim: cleanup gateway setup
Remove usage of --non-interactive flag. It is not provided (and
necessary) by the multitenant S3 gateway anymore.

ACCESS_KEY and SECRET_KEY env vars are not provided anymore as they are
not generated by the multitenant S3 gateway.

Change-Id: I3ecfb92110e31ae13977de3899dad273daae6c1e
2020-10-06 14:22:47 +00:00
Egon Elbre
d508c4c985 scripts: remove duplicate check config lock
We have configlock_test for checking changes so the separate script and
makefile target are not needed.

Change-Id: I2bbc1c21ad849c9b7ec8bba43c0e11e94e04f6a6
2020-09-29 10:03:34 +00:00
Jessica Grebenschikov
4a2c66fa06 satellite/accounting: add cache for getting project storage and bw limits
This PR adds the following items:
1) an in-memory read-only cache thats stores project limit info for projectIDs

This cache is stored in-memory since this is expected to be a small amount of data. In this implementation we are only storing in the cache projects that have been accessed. Currently for the largest Satellite (eu-west) there is about 4500 total projects. So storing the storage limit (int64) and the bandwidth limit (int64), this would end up being about 200kb (including the 32 byte project ID) if all 4500 projectIDs were in the cache. So this all fits in memory for the time being. At some point it may not as usage grows, but that seems years out.

The cache is a read only cache. When requests come in to upload/download a file, we will read from the cache what the current limits are for that project. If the cache does not contain the projectID, it will get the info from the database (satellitedb project table), then add it to the cache.

The only time the values in the cache are modified is when either a) the project ID is not in the cache, or b) the item in the cache has expired (default 10mins), then the data gets refreshed out of the database. This occurs by default every 10 mins. This means that if we update the usage limits in the database, that change might not show up in the cache for 10 mins which mean it will not be reflected to limit end users uploading/downloading files for that time period..

Change-Id: I3fd7056cf963676009834fcbcf9c4a0922ca4a8f
2020-09-25 16:28:49 +00:00
Egon Elbre
888bfaae4b cmd/satellite: only add google profiler to satellite
Previously uplink, storagenode etc. included google cloud profiler,
however they don't need it.

Change-Id: Ibc95cb03d667a3844672eecd49fa455a6acc3866
2020-09-25 18:56:59 +03:00
Stefan Benten
9d0d0ad728 satellite/console: enable multiple projects all users
Change-Id: I42cc9f48cac387e1a67d21c1dd394f28cc5ff399
2020-09-23 16:18:28 +00:00
VitaliiShpital
c4d6f472fc web/satellite: notification bar for reaching projects count limit
WHAT:
notification bar added to project dashboard page. It is shown when projects count limit is reached.
Create project button is removed after creating last available project

WHY:
inform user that their projects count limit was reached

Change-Id: If0d67148003be40cc9eb4d8b25cc17f8204008d4
2020-09-08 15:48:27 +00:00
Egon Elbre
dc48197bd8 satellite/orders: add bucket id to order limit
Change-Id: I9019ec77d692e62ac17b67a1da71dc3535cde50c
2020-09-03 10:50:11 +03:00
Yingrong Zhao
af773ec8a6 cmd/uplink: use DeleteBucketWithObjects for bucket force deletion
This PR updates `uplink rb --force` command to use the new libuplink API
`DeleteBucketWithObjects`.
It also updates `DeleteBucket` endpoint to return a specific error
message when a given bucket has concurrent writes while being deleted.

Change-Id: Ic9593d55b0c27b26cd8966dd1bc8cd1e02a6666e
2020-09-02 16:39:20 +00:00
Egon Elbre
61b17f1214 satellite/orders: add encryption keys flag to Service
Change-Id: Ie96e75bc96241b799d04654ef5e05b82e6a899bb
2020-09-02 05:02:14 +00:00
Natalie Villasana
95ff29cce1 satellite/metainfo: reduce lookupLimit default to 2500
Change-Id: I6569c6d1f145b127a9e8e1a65e4344dd62c989bb
2020-09-01 12:04:48 -04:00
stefanbenten
4645805b18 private/dbutil: set connMaxLifetime to 30 minutes
To prevent longlived unused connections, set the maximum time to 30 minutes to
prevent proxies and loadbalancers forcefully cutting the connection.
This helps in scenarios with low load/requests to a DB.

Change-Id: I7dba15ef97f6f6541e872a6fb1d3a9bbbfe5bb50
2020-08-28 18:00:41 +00:00
Qweder93
88ff8829a1 satellite/gracefulexit: RecvTimeout increased to 2h, so slow nodes stop receiving lot of fails and as a result DQ
Change-Id: Id4c8a394162ba368aeb573a927f825bf7250aa52
2020-08-24 18:59:24 +03:00
Yingrong Zhao
14ad7a4f1c satellite/metainfo: add limiter for objectdeletion and piecedeletion
services

This PR adds a limiter on the amount of concurrent objects deletion can be handled so
we don't run out of memory.

Change-Id: Id2ce368af6f86845fcdfd34cb2f5e460efe9b272
2020-08-19 16:08:29 +00:00
Qweder93
4ee1b2d45a storagenode/console: added list of all audits per satellite to sno dashboard/satellites
Change-Id: I52e58748d6467f372d9a308347fc77e400d137e2
2020-08-10 12:55:07 +00:00
Moby von Briesen
e02adfe5e9 satellite/overlay/config.go: Add AuditHistoryConfig to overlay
Adds AuditHistory{WindowSize, TrackingPeriod, GracePeriod,
OfflineThreshold}. These values will be used to track offline audits over
time, and to suspend/disqualify nodes for being offline for too long.

Change-Id: I05f7dbc3c034bdc53c4fbd7719c71a44f37ec6a5
2020-08-04 18:18:56 +00:00
Jeff Wendling
85a74b47e7 satellite/orders: 3-phase rollout
This adds a config flag orders.window-endpoint-rollout-phase
that can take on the values phase1, phase2 or phase3.

In phase1, the current orders endpoint continues to work as
usual, and the windowed orders endpoint uses the same backend
as the current one (but also does a bit extra).

In phase2, the current orders endpoint is disabled and the
windowed orders endpoint continues to use the same backend.

In phase3, the current orders endpoint is still disabled and
the windowed orders endpoint uses the new backend that requires
much less database traffic and state.

The intention is to deploy in phase1, roll out code to nodes
to have them use the windowed endpoint, switch to phase2, wait
a couple days for all existing orders to expire, then switch
to phase3.

Additionally, it fixes a bug where a node could submit a bunch
of orders and rack up charges for a bucket.

Change-Id: Ifdc10e09ae1645159cbec7ace687dcb2d594c76d
2020-08-03 17:01:42 +00:00
Rafael Gomes
935f44ddb7 satellite/metainfo: Add Delete Service config
Change-Id: I0a6e3ce1adfe1488eb23da9dda92877af1834599
2020-08-03 14:28:02 +00:00
Kaloyan Raev
edfd3d7661 satellite/payments: delete credits and credits_spendings db tables
Jira: https://storjlabs.atlassian.net/browse/USR-822

This the last step of dropping these 2 db tables. It also deletes all
code associate with them.

Change-Id: I8be840dc2a7be255cf6308c9434b729fe4d9391e
2020-07-30 12:19:57 +03:00
Bill Thorp
b265b7f555 satellite/console: make paywall optional
Add a config so that some percent of users require credit cards /
account balances
in order to create a project or have a promotional coupon applied

UI was updated to match needed paywall status

At this point we decided not to use a field to store if a user is in an
A/B
test, and instead just use math to see if they're in a test.  We decided
to use MD5 (because its in Postgres too) and User UUID for that math.

Change-Id: I0fcd80707dc29afc668632d078e1b5a7a24f3bb3
2020-07-28 10:57:49 +00:00
Egon Elbre
ba4c3d9986 satellite/orders: remove unused node status logging flag
Change-Id: I24da78a11cc5d3d88cdf6aca85c4238e4086e59c
2020-07-24 16:35:59 +03:00
Ethan
cfca021839 satellite/accounting: Add chore to cleanup old project bandwidth rollups data
Removes old project_bandwidth_rollups records that are no longer used.

Uses a retain months configuration to determine how many months to save.  Current month cannot be removed.
Tests retainMonths=-1, 0, 2

Change-Id: Ia4be2546cdb28802427acf41ecd85ad66df3e62c
2020-07-22 18:56:49 +00:00
Egon Elbre
e70da5cd4e all: fix comments
Change-Id: I2d2307e3fab87de47a72b3595d051e2c95ff4f8a
2020-07-16 19:13:14 +03:00
Egon Elbre
080ba47a06 all: fix dots
Change-Id: I6a419c62700c568254ff67ae5b73efed2fc98aa2
2020-07-16 14:58:28 +00:00