Commit Graph

4767 Commits

Author SHA1 Message Date
JT Olio
efde103dba accounting: rollup test is broken for the hour before midnight UTC
this change isn't the real fix. it's just ignoring the problem.

i don't know what the real fix is. is the problem with the test, or
is there actually a problem with the rollup code?

Change-Id: I552bdd947deadc212cc56efc5f818942b9827126
2020-12-22 14:14:52 -07:00
Ethan Adams
6070018021
satellite/overlay: use AS OF SYSTEM TIME with Cockroach
Query the nodes table using AS OF SYSTEM TIME '-10s' (by default) when on CRDB to alleviate contention on the nodes table and minimize CRDB retries. Queries for standard uploads are already cached, and node lookups for graceful exit uploads have retry logic, so it isn't necessary for the nodes returned to be current.
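
A minimal Go sketch of how such a clause might be appended only on CockroachDB. The helper asOfSystemTime, the constant defaultStaleness, and the selected column are assumptions for illustration; the commit message does not show the actual query or configuration.

```go
package main

import (
	"fmt"
	"time"
)

// defaultStaleness mirrors the '-10s' default mentioned in the commit message.
const defaultStaleness = 10 * time.Second

// asOfSystemTime is a hypothetical helper. It returns a CockroachDB
// "AS OF SYSTEM TIME" clause for the given staleness, or an empty string
// when the backend is not CockroachDB or the staleness is under one second.
func asOfSystemTime(isCockroach bool, staleness time.Duration) string {
	if !isCockroach || staleness < time.Second {
		return ""
	}
	// Whole seconds only; any sub-second remainder is truncated.
	return fmt.Sprintf(" AS OF SYSTEM TIME '-%ds'", int64(staleness/time.Second))
}

func main() {
	// Illustrative query only; the column name is an assumption.
	query := "SELECT id FROM nodes" + asOfSystemTime(true, defaultStaleness)
	fmt.Println(query)
}
```

On Postgres the helper returns an empty string, so the same query text can be shared between both backends.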
2020-12-22 21:07:07 +02:00
Jennifer Johnson
7cccbdb766 cmd/uplink/share: add -- to flags referenced in help statements
Change-Id: I86cd08d51c5306effec14f338f37c53c2743d6b2
2020-12-22 17:03:00 +00:00
Bill Thorp
1b0424cad6 uplink/cmd: Export a RegisterAccess method.
Gateway-MT requires integration tests, which would be aided by having an
exported RegisterAccess() method in uplink/cmd.

To support this change, a little of the Uplink cmd logic was shifted around
and a method was made public.  I also normalized finding the access
between accessInspect and accessRegister.

Change-Id: I29369296521c2cc179e27233f5451b95f46109d8
2020-12-22 14:52:40 +00:00
Jeff Wendling
876e1be3b5 Blueprint: Sparse Order Storage
Change-Id: I1cf6f3bda84c9b9d5ccfbbaf51813d6ea9a65679
2020-12-21 18:04:54 +00:00
Ethan Adams
563197c628
satellite/overlay: Add index on nodes table (#4012)
satellite/accounting: Add index for project_id on bucket_storage_tallies
2020-12-21 12:48:48 -05:00
crawter
c78fdf4727 web/multinode: base router implemented
Change-Id: I831d087a87e05f6055646419b14d15e34d5de6b1
2020-12-21 16:42:36 +00:00
Ethan Adams
9b52283570
satellite/accounting: Add index for project_id on bucket_storage_tallies (#4010)
Change-Id: I47ab2d1e24f94307c3383c497cffe2a150fa8ab7
2020-12-21 11:42:00 -05:00
Ethan Adams
6e501898c3
satellite/accounting: Performance improvements to getNodeIds used by GetBandwidthSince (#4009)
2020-12-21 16:37:01 +01:00
Stefan Benten
d14f2e4164
Makefile,scripts: move from aarch64 to arm64v8 (#4008)
It turns out that alpine dropped support/updates for the aarch64 image.
Instead they have been using the arm64v8 notation for quite a while,
which resulted in breaking our recent aarch64 builds due to missing
dependencies/updates.

Both arches are exactly the same: the aarch64 name originated with the GNU
toolchain and arm64 with Apple. The backends have been merged by now and arm64
became the de facto standard.
2020-12-21 14:50:57 +01:00
Stefan Benten
866ce478bf
build: update node to v14.15.3 (#4007)
2020-12-20 17:20:19 +01:00
Stefan Benten
7eab859030
cmd: ensure proper arch is used for docker container
2020-12-20 09:26:23 +02:00
Jessica Grebenschikov
d961437889 satellite/orders: remove the config IncludeEncryptedMetadata
Since the Satellite now requires the order encryption functionality (since serial_number table is deprecated) to properly function, we can remove the config flag to turn on/off the feature.

Change-Id: Ie973f72a9a05a81cef9e53dc9c99d22c940c2488
2020-12-18 10:39:29 -08:00
Jessica Grebenschikov
da0327c9b7 satellite/dbcleanup: remove expired serial chore
Change-Id: Ib71d41eb6679d6435e5bc10b6244dac66380a74e
2020-12-18 09:36:28 -08:00
Jessica Grebenschikov
97a5e6c814 satellite/orders: stop inserting/reading from serial_numbers table
This PR contains the minimum changes needed to stop inserting into the serial_numbers table. This is the first step in completely deprecating that table.
The next step is to create another PR to remove the expiredSerial chore, fix more tests, and remove any other methods on the serial_number table.

Change-Id: I5f12a56ebf3fa4d1a1976141d2911f25a98d2cc3
2020-12-18 08:35:13 -08:00
NickolaiYurchenko
aaa4a9f31b web/storagenode: overused disk space added to chart
What: diskspace utilization extended with overused value.

Why: to show overused space on chart.

Change-Id: I84adf0d0cc94b19026d97655cb3060aad4560860
2020-12-18 14:58:55 +00:00
VitaliiShpital
f645654d2e web/satellite: access grant CLI step copy token fix
WHAT:
copy button now copies restricted key instead of regular

WHY:
bug fix

Change-Id: I6696dfa4b5d804a64a6d7b49aa443ba16043e466
2020-12-18 16:22:11 +02:00
Stefan Benten
e1456bc53f
Makefile: build wasm files only once (#4005)
2020-12-18 01:59:27 +01:00
littleskunk
2437d5b171
satellite/access-grants: default auth service url (#4002)
2020-12-17 23:38:16 +01:00
paul cannon
d3604a5e90 satellite/repair: use survivability model for segment health
The chief segment health models we've come up with are the "immediate
danger" model and the "survivability" model. The former calculates the
chance of a segment becoming lost in the next time period (using
the CDF of the binomial distribution to estimate the chance of x nodes
failing in that period), while the latter estimates the number of
iterations for which a segment can be expected to survive (using the
mean of the negative binomial distribution). The immediate danger model
was a promising one for comparing segment health across segments with
different RS parameters, as it is more precisely what we want to
prevent, but it turns out that practically all segments in production
have infinite health, as the chance of losing segments with any
reasonable estimate of node failure rate is smaller than DBL_EPSILON,
the smallest possible difference from 1.0 representable in a float64
(about 1e-16).
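
The precision problem can be reproduced with a small Go sketch. The function immediateDangerHealth below is a hypothetical scoring function chosen only to show the float64 behaviour; it is not the formula the satellite used.

```go
package main

import (
	"fmt"
	"math"
)

// immediateDangerHealth is a hypothetical scoring function: it turns a
// per-period loss probability into a score that grows as the loss
// probability shrinks.
func immediateDangerHealth(pLoss float64) float64 {
	// For a tiny pLoss the subtraction rounds to exactly 1.0 in float64.
	survival := 1.0 - pLoss
	// Division by a zero float64 yields +Inf at run time, not a panic.
	return 1.0 / (1.0 - survival)
}

func main() {
	for _, p := range []float64{1e-6, 1e-12, 1e-17, 1e-30} {
		h := immediateDangerHealth(p)
		fmt.Printf("pLoss=%g health=%g infinite=%v\n", p, h, math.IsInf(h, 1))
	}
}
```

Once the loss probability drops well below 1e-16, every segment gets the same infinite score, so the scores can no longer be used to order a repair queue.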

Leaving aside the wisdom of worrying about the repair of segments that
have less than a 1e-16 chance of being lost, we want to be extremely
conservative and proactive in our repair efforts, and the health of the
segments we have been repairing thus far also evaluates to infinity
under the immediate danger model. Thus, we find ourselves reaching for
an alternative.

Dr. Ben saves the day: the survivability model is a reasonably close
approximation of the immediate danger model, and even better, it is
far simpler to calculate and yields manageable values for real-world
segments. The downside to it is that it requires as input an estimate
of the total number of active nodes.
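
One way to read the survivability idea, as a rough Go sketch: treat each node failure as hitting one of the segment's healthy pieces with probability numHealthy/totalNodes, and take the mean of the negative binomial distribution for the number of misses before the segment has lost enough pieces to be unrecoverable. The function name, its parameters, and this exact parameterization are assumptions, not the code from this change.

```go
package main

import "fmt"

// survivabilityHealth is a hypothetical sketch. It returns the expected
// number of node failures elsewhere in the network ("misses") before the
// segment suffers r piece losses ("hits"), where r is the number of losses
// that would leave fewer than minPieces healthy pieces.
// It ignores that the hit probability shrinks as pieces are lost.
func survivabilityHealth(numHealthy, minPieces, totalNodes int) float64 {
	if totalNodes <= 0 || numHealthy <= 0 || numHealthy > totalNodes {
		return 0
	}
	if numHealthy < minPieces {
		return 0 // already unrecoverable
	}
	r := float64(numHealthy - minPieces + 1)
	hit := float64(numHealthy) / float64(totalNodes)
	// Mean of the negative binomial: misses before the r-th hit.
	return r * (1 - hit) / hit
}

func main() {
	// Illustrative inputs only.
	fmt.Println(survivabilityHealth(50, 29, 10000))
	fmt.Println(survivabilityHealth(35, 29, 10000))
}
```

Dividing the result by an estimated number of node failures per iteration would turn it into a count of iterations. Unlike the immediate danger score, the values stay finite and comparable for realistic inputs.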

This change replaces the segment health calculation to use the
survivability model, and reinstates the call to SegmentHealth() where it
was reverted. It gets estimates for the total number of active nodes by
leveraging the reliability cache.

Change-Id: Ia5d9b9031b9f6cf0fa7b9005a7011609415527dc
2020-12-17 21:30:17 +00:00
littleskunk
3feee9f4f8
satellite/accounting: default project limits (#4001)
2020-12-17 22:27:05 +01:00
Cameron Ayer
28eaae66af satellite/satellitedb: drop num_healthy_pieces column from injuredsegments
This column is no longer used as it has been replaced by the segment_health
column.

Change-Id: I6b4df89cd4f994d8418976f88e8c5f57615f8115
2020-12-17 20:17:08 +00:00
VitaliiShpital
79a3a47805 build: added brotli compression for wasm bits
WHAT:
added brotli compression for wasm files and added copying of those files to static/wasm folder in Dockerfile

WHY:
those files are a part of web worker webpack bundle and I didn't find a way to compress them separately using webpack.
I'm open to any other ideas if they come up

Change-Id: I105cc1582e9816fd9b63052ba48358525c85a164
2020-12-17 19:23:53 +00:00
VitaliiShpital
f4bbd0f5df web/satellite: use brotli instead of gzip
WHAT:
we'll use brotli instead of gzip from now on

WHY:
better compression

Change-Id: Ibeadd6bfc783e9c15cf3f62f719af692071a7721
2020-12-17 19:23:44 +00:00
VitaliiShpital
50dd9fb11a web/satellite: move access grant web worker initialization to onlogin loading state
WHAT:
web worker is initialized during onlogin loading screen now

WHY:
removed unnecessary initializations and improved UX

Change-Id: I734f194f862c15b3fb08e436a161da32d8d4a8ac
2020-12-17 19:23:36 +00:00
Michal Niewrzal
b1712cc93b cmd/storj-sim: update default storj-sim access with real node id
Currently the node id in the access grant is '1', which cannot be parsed
as a valid node id. This change updates the access grant satellite address
with a randomly generated node id.

Change-Id: Id1684ac71509bc5a8177b069a914355be3c72d43
2020-12-17 18:31:43 +00:00
crawter
9ea147d234 web/multinode: initial app and configs
Change-Id: I96f8ebcedf982139ff0d263266f25dd63746091c
2020-12-17 19:46:56 +02:00
crawter
a3c2711b2f mnd/nodes: db interface methods updated
Change-Id: I78643f5bdefa7e2f2cbeea06a5203627dbfa92ee
2020-12-17 17:05:18 +02:00
Yaroslav Vorobiov
26e65eeefc multinode/console: node list and get api
Change-Id: Icc3cc9e6997b715c865fee9f96c8a848b694f41f
2020-12-17 16:19:57 +02:00
VitaliiShpital
cd2cdb616a web/satellite: create access grant: restricted key for CLI step, CLI step added to onboarding tour
Change-Id: I2cd0308a61ca724144720eb7f90ba83a1052aee1

WHAT:
CLI step now has restricted key.
CLI step added to onboarding tour

WHY:
bug fixing, extending onboarding flow

Change-Id: I496a23437d602e5dc9d5fdc64bf0e8b97b656e50
2020-12-17 13:11:26 +00:00
Qweder93
2fd7809e54 storagenode/payout: stefanbenten satellite name added to payout history, satellites with no held history removed from list
Change-Id: I96861058ccb9c8ce52698796c91b999eaec1f6e6
2020-12-17 11:01:28 +00:00
Jeff Wendling
0e83233700 storj-sim: add node id to default access
Change-Id: I59874fe8d73a832d04a5597c98d05971a74d2164
2020-12-17 09:38:23 +00:00
Egon Elbre
12055e7864 all: minor cleanups
Change-Id: I4248dbe36a62a223b06135254b32851485a2eec1
2020-12-16 10:47:46 +00:00
Cameron Ayer
8c52bb3a18 satellite/checker: use numHealthy as segment health in repair queue
A few weeks ago it was discovered that the segment health function
was not working as expected with production values. As a bandaid,
we decided to insert the number of healthy pieces into the segment
health column. This should have effectively reverted our means of
prioritizing repair to the previous implementation.

However, it turns out that the bandaid was placed into the code which
removes items from the irreparable db and inserts them into the repair
queue.

This change inserts the number of healthy pieces into the repair queue in
the RemoteSegment method.

Change-Id: Iabfc7984df0a928066b69e9aecb6f615253f1ad2
2020-12-15 17:16:59 -05:00
Ivan Fraixedes
187680b0c1
scripts: Fix typo in a comment
Change-Id: If79e778e786db06d2263bcd5393f639a4bb92542
2020-12-15 18:58:54 +01:00
Cameron Ayer
2ac72eaf16 satellite/repair/checker: add new monkit stats tagged with rs scheme
There is a new checker field called statsCollector. This contains
a map of stats pointers where the key is a stringified redundancy
scheme. stats contains all tagged monkit metrics. These metrics exist
under the key name, "tagged_repair_stats", which is tagged with the
name of each metric and a corresponding rs scheme.

As the metainfo observer works on a segment, it checks statsCollector
for a stats corresponding to the segment's redundancy scheme. If one
doesn't exist, it is created and chained to the monkit scope. Now we can call
Observe, Inc, etc on the fields just like before, and they have tags!
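
The lookup-or-create pattern described above can be sketched in Go as follows. The type rsStats, its field, and the scheme string format are stand-ins chosen for illustration; the monkit metrics and scope chaining are omitted.

```go
package main

import "fmt"

// rsStats is a stand-in for the per-scheme tagged metrics.
type rsStats struct {
	scheme          string
	segmentsChecked int64
}

// statsCollector lazily creates one rsStats per stringified redundancy
// scheme. This sketch is not safe for concurrent use.
type statsCollector struct {
	stats map[string]*rsStats
}

func newStatsCollector() *statsCollector {
	return &statsCollector{stats: make(map[string]*rsStats)}
}

// getStats returns the stats for the scheme, creating them on first use.
func (c *statsCollector) getStats(rs string) *rsStats {
	s, ok := c.stats[rs]
	if !ok {
		s = &rsStats{scheme: rs}
		c.stats[rs] = s
	}
	return s
}

func main() {
	c := newStatsCollector()
	// The scheme string format here is an assumption.
	c.getStats("29/35/80/110").segmentsChecked++
	c.getStats("29/35/80/110").segmentsChecked++
	c.getStats("4/6/8/10").segmentsChecked++
	fmt.Println(len(c.stats), c.getStats("29/35/80/110").segmentsChecked)
}
```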

durabilityStats has also been renamed to aggregateStats.

At the end of the metainfo loop, we insert the aggregateStats totals into the
corresponding stats fields for metric reporting.

Change-Id: I8aa1918351d246a8ef818b9712ed4cb39d1ea9c6
2020-12-15 14:08:01 +00:00
Jennifer Johnson
adb2c83e09 cmd/uplink: adds register, url, and dns flags to uplink share
and replaces access grant with access

uplink share <path> --> creates access grant

uplink share --register <path> --> registers access grant

uplink share --url <path> --> creates URL, implies register and public

uplink share --dns <hostname> <path> --> creates dns info, implies register and public
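
The flag implications listed above can be expressed as a small Go sketch. The struct shareOptions and its resolve method are hypothetical and only model the stated rules; they are not the uplink CLI's code.

```go
package main

import "fmt"

// shareOptions models the flags named in the commit message.
type shareOptions struct {
	Register bool
	URL      bool
	DNS      string // hostname; empty means --dns was not given
	Public   bool
}

// resolve applies the stated implications: --url and --dns each imply
// register and public.
func (o shareOptions) resolve() shareOptions {
	if o.URL || o.DNS != "" {
		o.Register = true
		o.Public = true
	}
	return o
}

func main() {
	fmt.Printf("%+v\n", shareOptions{}.resolve())
	fmt.Printf("%+v\n", shareOptions{URL: true}.resolve())
	fmt.Printf("%+v\n", shareOptions{DNS: "example.com"}.resolve())
}
```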

Change-Id: I7930c4973a602d3d721ec6f77170f90957dad8c0
2020-12-14 20:51:44 -05:00
Stefan Benten
9fe477899b satellite/satellitedb: add lint ignore rule to support staticcheck 2020.2
staticcheck 2020.2 is not liking our dbx files, so we need to ignore them.

Change-Id: I6becc3619bb088473f9776d0878ce240d4935936
2020-12-14 21:16:31 +00:00
Qweder93
12144a600b storagenode/console: payout tests and heldhistory joined_at rounding added
Change-Id: I1d43620fbafbf7ed92588b84cb9c6b8ced8832ef
2020-12-14 19:35:04 +02:00
Qweder93
c630037f34 multinodepb: diskspace, reputation and status added
Change-Id: I470fa8b59ce7f00f2fbedbd0c0878fb5fff0590c
2020-12-14 17:01:16 +00:00
Jessica Grebenschikov
3cc98de3ee satellite/console/wasm: reduce size to <9MB
Make changes so that we only import the necessary files from the console package so that the generated wasm code is as small as possible.

This change gets the compiled wasm code down to 8.6MB uncompressed and 2MB when compressed with `gzip --best`.

https://review.dev.storj.io/c/storj/storj/+/3396

Change-Id: Ifdd4be285810757b46bbbe43327c0d0139e5f8f7
2020-12-14 16:41:39 +00:00
JT Olio
d955946f15 satellite/compensation: don't abort entirely if a node isn't found
Change-Id: I1066fb6a281eece892ad179a24b01b2ff6615fe7
2020-12-14 15:56:59 +00:00
Ivan Fraixedes
2dddcffe43 satellite/accounting/rollout: Remove unused variable
Remove a declared variable that's set but never read nor passed to any
function, so it's unused code.

Change-Id: I8daf9d1f71d29ab39d7a80011d1b4813ada1c67d
2020-12-14 14:11:41 +00:00
crawter
4a11ec2826 multinode/nodes: package created, api tests added, small restructuring
Change-Id: I9f8146760a2676a204eb1bd3410079c5fa017d70
2020-12-14 14:16:45 +02:00
Brandon Iglesias
ca1e6b9756
Adding Fastly (#3994)
2020-12-11 15:53:05 +02:00
Stefan Benten
d559405ae0
Jenkinsfile: update to go v1.15.6 (#3993)
2020-12-11 11:32:16 +01:00
Stefan Benten
8fe829d5fd
build: add wasm bits to Dockerfile and bump to go v1.15.6 (#3992)
2020-12-11 02:23:39 +01:00
Aaron Covrig
46e26fa47d
Changed ChecksArea.vue audits/checks precision (#3987)
Fixes #3965
2020-12-10 20:03:11 +01:00
Stefan Benten
982c729132
.clabot: adding Doom4535 to approved list (#3989)
2020-12-10 19:42:22 +01:00
Jessica Grebenschikov
0649d2b930 satellite/repair: improve contention for injuredsegments table on CRDB
We migrated satelliteDB off of Postgres and over to CockroachDB (crdb), but contention on the injuredsegments table was far too high, so we had to roll back to Postgres for the repair queue. A couple of things contributed to this problem:
1) crdb doesn't support `FOR UPDATE SKIP LOCKED`
2) the original crdb Select query was doing 2 full table scans and not using any indexes
3) the SLC Satellite (where we were doing the migration) was running 48 repair worker processes, each of which runs up to 5 goroutines that all try to select out of the repair queue, and this was causing a ton of contention.

The changes in this PR should help to reduce that contention and improve performance on CRDB.
The changes include:
1) Use an update/set query instead of select/update to capitalize on the new `UPDATE` implicit row locking ability in CRDB.
- Details: As of CRDB v20.2.2, there is implicit row locking with update/set queries (contention reduction and performance gains are described in this blog post: https://www.cockroachlabs.com/blog/when-and-why-to-use-select-for-update-in-cockroachdb/).

2) Remove the `ORDER BY` clause since this was causing a full table scan and also prevented the use of the row locking capability.
- While long term it is very important to `ORDER BY segment_health`, the change here is only supposed to be a temporary bandaid to get us migrated over to CRDB quickly. Since segment_health has been set to infinity for some time now (re: https://review.dev.storj.io/c/storj/storj/+/3224), it seems like it might be ok to continue not making use of this for the short term. However, long term this needs to be fixed with a redesign of the repair workers, possibly in the trusted delegated repair design (https://review.dev.storj.io/c/storj/storj/+/2602), or something similar to what is recommended here on how to implement a queue on CRDB (https://dev.to/ajwerner/quick-and-easy-exactly-once-distributed-work-queues-using-serializable-transactions-jdp), or a migration to a RabbitMQ priority queue or something similar.

This PR's improved query uses the index to avoid full scans and also locks the row it's going to update, and CRDB retries for us if there are any lock errors.
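
A rough Go sketch of the single-statement claim described above, using database/sql. The column names attempted and data, the six-hour retry window, and the function claimSegment are assumptions for illustration; the commit message does not show the actual query.

```go
package main

import (
	"context"
	"database/sql"
	"errors"
	"fmt"
)

// claimQuery claims one segment in a single UPDATE ... RETURNING statement,
// with no ORDER BY, so the update itself takes the row lock.
// Column names and the retry window are assumptions.
const claimQuery = `UPDATE injuredsegments
SET attempted = now()
WHERE attempted IS NULL OR attempted < now() - interval '6 hours'
LIMIT 1
RETURNING data`

// claimSegment returns the claimed segment payload, or nil when no row
// currently qualifies.
func claimSegment(ctx context.Context, db *sql.DB) ([]byte, error) {
	var data []byte
	err := db.QueryRowContext(ctx, claimQuery).Scan(&data)
	if errors.Is(err, sql.ErrNoRows) {
		return nil, nil
	}
	return data, err
}

func main() {
	// Running claimSegment needs a live CockroachDB connection,
	// so this sketch only prints the statement.
	fmt.Println(claimQuery)
}
```

Compared with a separate select followed by an update, there is no window between reading a row and marking it attempted, which is where concurrent workers would otherwise collide.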

Change-Id: Id29faad2186627872fbeb0f31536c4f55f860f23
2020-12-10 09:51:26 -08:00