Commit Graph

4651 Commits

Author SHA1 Message Date
Moby von Briesen
a8b66dce17 satellite/accounting: account for old orders that can be submitted in satellite rollup
With the new phase 3 order submission, orders can be added to the
storage and bandwidth rollup tables at timestamps before the most recent
rollup was run. This change shifts the start time of each new rollup
window to account for any unexpired orders that might have been added
since the previous rollup.

A satellitedb migration is necessary to allow upserts in the
accounting_rollups table when entries with identical node_ids and
start_times are inserted.
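
The window shift described above can be sketched as follows. This is a minimal illustration, not the satellite's actual code: the function name `rollupWindowStart`, the example date, and the 48-hour expiration are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"time"
)

// Hypothetical example values; neither comes from the commit.
var (
	lastRollup      = time.Date(2020, 11, 18, 0, 0, 0, 0, time.UTC)
	orderExpiration = 48 * time.Hour
)

// rollupWindowStart returns where the next rollup window should begin.
// Instead of starting at the previous rollup, it starts one expiration
// period earlier, so that unexpired orders submitted late for older
// timestamps are still picked up.
func rollupWindowStart(lastRollup time.Time, orderExpiration time.Duration) time.Time {
	return lastRollup.Add(-orderExpiration)
}

func main() {
	start := rollupWindowStart(lastRollup, orderExpiration)
	fmt.Println("previous rollup:   ", lastRollup.Format(time.RFC3339))
	fmt.Println("next window starts:", start.Format(time.RFC3339))
}
```

Because consecutive windows now overlap, the same node_id and start_time pair can be produced more than once, which is why the migration has to allow upserts.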

Change-Id: Ib3022081f4d6be60cfec8430b45867ad3c01da63
2020-11-18 14:46:00 -05:00
Egon Elbre
e19fabc880 scripts/tests/rollingupgrade: fix typo in flag
Change-Id: Ia3e8d076741a30fb2a42af9b2621796a814c75ae
2020-11-18 19:15:42 +00:00
Bill Thorp
5fe3d2dea7 cmd/uplink: Allow use of named accesses in uplink register
Previously uplink register only accepted a fully serialized access grant.
This was inconvenient, so I changed it to also accept access names.

Change-Id: If6d4d1baa8d4fb3d87fdedb895d459fa12743f1a
2020-11-18 12:23:57 -05:00
Egon Elbre
aeb801604e {satellite,storagenode}/orders: fix flaky tests
Before manipulating order information on storagenodes we need to wait
for the orders to propagate to the database. Some of that happens
async with uplink.

Change-Id: Iaacfd7db0909ab5d2831d06388e5fb27b6d4778f
2020-11-18 13:44:02 +00:00
Moby von Briesen
41d86c0985 storagenode/orders/ordersfile: Add reasonable size caps for orders/limits when detecting file corruption.
Define constants of 32 KiB as the upper limit of the marshalled order
and limit protobuf sizes. This value gives lots of buffer in case the
protobufs ever change, but is not as extreme as what we were doing
before in V0 files, which was to use the Uint32 max value.
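
A size cap of this kind might look like the sketch below. Only the 32 KiB value comes from the commit message; the names `maxEntrySize`, `errCorrupt`, and `checkEntrySize` are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// maxEntrySize is the 32 KiB upper limit mentioned in the commit message.
// The constant name is hypothetical.
const maxEntrySize = 32 * 1024

// errCorrupt is a hypothetical error used to flag an implausible size.
var errCorrupt = errors.New("ordersfile: declared size exceeds cap, treating entry as corrupt")

// checkEntrySize treats any declared length above the cap as corruption
// rather than trusting the value read from the file.
func checkEntrySize(declared uint32) error {
	if declared > maxEntrySize {
		return errCorrupt
	}
	return nil
}

func main() {
	fmt.Println(checkEntrySize(512))        // plausible size, no error
	fmt.Println(checkEntrySize(4294967295)) // uint32 max, rejected
}
```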

Change-Id: I0914d17dde3b044b2611af33f931d46d55f81e98
2020-11-18 12:33:26 +00:00
Egon Elbre
8da5e6a554 scripts/tests/rollingupgrade: use wait-for instead of sleep
Change-Id: Ie879e061d3b312705726375953767d420e922073
2020-11-18 12:00:16 +00:00
Qweder93
a17cd9aa3e storagenode/apikey: added service, CLI issue api key
Change-Id: I840cd0fdbd8dca884eefbd111f21fd3990c11e68
2020-11-18 10:40:17 +00:00
paul cannon
2b59640f18 cmd/satellite: ignore Canceled in exit from repair worker
Firstly, this changes the repair functionality to return Canceled errors
when a repair is canceled during the Get phase. Previously, because we
do not track individual errors per piece, this would just show up as a
failure to download enough pieces to repair the segment, which would
cause the segment to be added to the IrreparableDB, which is entirely
unhelpful.

Then, ignore Canceled errors in the return value of the repair worker.
Apparently, when the worker returns an error, that makes Cobra exit the
program with a nonzero exit code, which causes some piece of our
deployment automation to freak out and page people. And when we ask the
repair worker to shut down, "canceled" errors are what we _expect_, not
an error case.
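
The second half of this change can be illustrated with the standard library alone. This is a sketch under assumptions: the helper names are hypothetical, and the real code may use a different helper than `errors.Is`.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// errRepairFailed stands in for a genuine repair failure (hypothetical).
var errRepairFailed = errors.New("repair: not enough pieces")

// ignoreCanceled drops cancellation errors, which are expected during a
// requested shutdown, and passes every other error through unchanged.
func ignoreCanceled(err error) error {
	if errors.Is(err, context.Canceled) {
		return nil
	}
	return err
}

// wrappedCanceled simulates a repair interrupted during the Get phase,
// with the cancellation wrapped in additional context.
func wrappedCanceled() error {
	return fmt.Errorf("repair get phase: %w", context.Canceled)
}

func main() {
	fmt.Println(ignoreCanceled(wrappedCanceled())) // <nil>
	fmt.Println(ignoreCanceled(errRepairFailed))   // repair: not enough pieces
}
```

With the cancellation filtered out, the command returns nil on shutdown and the process exits with code zero.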

Change-Id: Ia3eb1c60a8d6ec5d09e7cef55dea523be28e8435
2020-11-17 21:37:59 +00:00
VitaliiShpital
f5a5308cc0 web/satellite: upload data step
WHAT:
access grant flow: upload data page

WHY:
info page telling what could be done next

Change-Id: I9bd5d558c97d5cf643f9746169952b8424c3294f
2020-11-17 18:44:22 +00:00
VitaliiShpital
1c13065b0b web/satellite: create access grant: continue in CLI step
WHAT:
continue in CLI step which returns regular API key

WHY:
in case user wants to create access grants in CLI

Change-Id: I8a0fa15f07e553628bda3a3e871506295230f0a2
2020-11-17 18:44:12 +00:00
VitaliiShpital
6f35ee98e6 web/satellite: create access grant: bucket names selection logic
WHAT:
bucket names selection logic for create access grant flow

WHY:
bucket based access grant restriction

Change-Id: I922811ce43afbc0bf0c2c9bcaea755657257f26f
2020-11-17 18:44:04 +00:00
VitaliiShpital
4e49b00c6c web/satellite: create access grant permission step, regular permissions, buckets dropdown
WHAT:
permissions step page for create access grant flow. Regular permissions and buckets dropdown

WHY:
to configure access grant permissions

Change-Id: Ia5571556a7fb83fd9a508e6aabfcdf5b57e9bc96
2020-11-17 18:43:39 +00:00
VitaliiShpital
a332b3d811 web/satellite: create access grant name step
WHAT:
name step for create access grant flow

WHY:
give access grant a name

Change-Id: Ic1819dcc6565b2ca20008459f0a33ece61930165
2020-11-17 18:43:11 +00:00
VitaliiShpital
278e29c1c7 web/satellite: create access grant progress bar
WHAT:
progress bar for create access grant flow

WHY:
progress bar to show user current step of the flow

Change-Id: Ia3665fee91ac9b3c27eed5d5190e69d7ea5b3e8a
2020-11-17 18:43:03 +00:00
VitaliiShpital
6517315ff8 web/satellite: create access grant base container
WHAT:
create access grant base container

WHY:
base container - first step of access grant flow

Change-Id: Id31e25333eadbe6a40cdce972de5cb87413a299e
2020-11-17 18:42:54 +00:00
VitaliiShpital
6664a129b0 web/satellite: add all needed methods to access grant webworker
WHAT:
all needed methods added to webworker

WHY:
to generate correct access grant

Change-Id: I700f24840d5bbe1515dbafa7f4e71e505205f903
2020-11-17 18:42:04 +00:00
VitaliiShpital
b60939e483 web/satellite: delete access grant flow
WHAT:
delete process for access grant flow. Including popup

WHY:
ability to remove access grant

Change-Id: Idf9f4659863a2004ce8b74976525b05103329b9a
2020-11-17 20:18:58 +02:00
VitaliiShpital
e16f02b70d web/satellite: access grant list page
WHAT:
access grants list page where all the created access grants will be visible/deletable

WHY:
initial page of new access grant flow

Change-Id: I0b99f15e47295bd0d307dd3aebd9f6dea3ffbb25
2020-11-17 17:50:00 +00:00
Ivan Fraixedes
fa95c6bbb9 storagenode/orders/ordersfile: Fix error message wrong var
Fix the error message reported for a wrong order size, which was caused
by passing the wrong variable to the format string.

Change-Id: Ic0059615c60cfa33a26d4aeb0ebda5e586f0df05
2020-11-17 15:22:27 +01:00
NickolaiYurchenko
ac94333422 web/storagenode: notifications components unit tests
Change-Id: I2b0249eff73572b3bf401c5f920440f095bc3978
2020-11-17 12:00:00 +00:00
Ivan Fraixedes
9740da6508 storagenode/orders: Don't panic if size is over MaxInt32
The `make` built-in function panics when asked to build a new slice
with a negative length.
The length parameter of `make` is of type `int`.

These changes prevent `make` from panicking on 32-bit architectures,
where `int` is an `int32`. A uint32 value can exceed the maximum
`int32`, and when that happens the converted length becomes negative
and causes `make` to panic.
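
A guard of this kind can be sketched as below. The function name `allocate` and the error value are hypothetical; the point is only that the uint32 is checked against the int32 maximum before it is converted to `int`.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// errSizeTooLarge is a hypothetical error for sizes that cannot be
// represented as a non-negative int on 32-bit platforms.
var errSizeTooLarge = errors.New("size exceeds MaxInt32")

// allocate converts size to int only after checking that it fits in an
// int32, so the conversion can never produce a negative length.
func allocate(size uint32) ([]byte, error) {
	if size > math.MaxInt32 {
		return nil, errSizeTooLarge
	}
	return make([]byte, int(size)), nil
}

func main() {
	buf, err := allocate(1024)
	fmt.Println(len(buf), err)

	_, err = allocate(4294967295) // uint32 max
	fmt.Println(err)
}
```

On 64-bit platforms the check is stricter than necessary, but it keeps the behavior identical across architectures.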

Change-Id: Ife9ab5993916d6dcf5584b37c208272269cb2b45
2020-11-17 10:35:21 +00:00
Qweder93
c409194d43 storagenode/payouts: estimation payout heldamount rounding removed
Change-Id: I9fdc7cda15de0df8875436b0b376f0e6479d3aeb
2020-11-17 10:06:11 +00:00
Moby von Briesen
0ec685b173 satellite/{satellitedb, repair/{queue, checker}}: Use new column "segmentHealth" instead of "numHealthy" in injured segments queue
We plan to add support for a new Reed-Solomon scheme soon, but our
repair queue orders segments by least number of healthy pieces first.
With a second RS scheme, fewer healthy pieces will not necessarily
correlate to lower health.

This change just adds the new column in a migration. A separate change
will add the new health function.

Right now, since we only support one RS scheme, behavior will not
change. Number of healthy pieces is being inserted as "segment health"
until the new health function is merged.

Segment health is calculated with a new priority function created in
commit 3e5640359. In order to use the function, a new config value is
added, called NodeFailureRate, representing the approximate probability
of any individual node going down in the duration of one checker run.

Change-Id: I51c4202203faf52528d923befbe886dbf86d02f2
2020-11-16 21:18:09 +00:00
Egon Elbre
afc9545ff1 cmd/storj-sim: add "tool wait-for <address>"
For coordinating with other processes it can be useful to wait until
another process is accepting requests on an address.
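
The idea can be sketched as a dial-and-retry loop. This is an illustration only, not the storj-sim implementation; `waitFor`, `demo`, and the 50 ms retry interval are assumptions.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// waitFor blocks until a TCP connection to address succeeds or ctx ends.
func waitFor(ctx context.Context, address string) error {
	var dialer net.Dialer
	for {
		conn, err := dialer.DialContext(ctx, "tcp", address)
		if err == nil {
			return conn.Close()
		}
		// Not accepting yet: pause briefly, then retry unless ctx is done.
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(50 * time.Millisecond):
		}
	}
}

// demo starts a local listener on a free port and waits for it.
func demo() error {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return err
	}
	defer ln.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	return waitFor(ctx, ln.Addr().String())
}

func main() {
	fmt.Println("wait-for result:", demo())
}
```

Compared with a fixed sleep, this returns as soon as the address is reachable and fails explicitly when the deadline passes.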

Change-Id: Id623ed815149f14f9f0344e2f396ab70fc4dec6a
2020-11-16 20:38:56 +00:00
littleskunk
9ab824d3e6 jenkins/rollingupgrade: sleep 5 seconds between old api startup and database migration (#3971)
2020-11-16 21:25:11 +01:00
Cameron Ayer
48d8114b3f satellite/contact: treat pingback failure as error
If the satellite fails to pingback the storage node during CheckIn
an error message is returned to the node in the response, but the actual
error value returned is nil. We are only checking the error. This means
the node has no feedback about the failure, and the node also does not
attempt to retry the connection.

Change-Id: Iaed00e422ba91af573e72255cc6671ea97928eae
2020-11-16 18:26:37 +00:00
crawter
f311722854 multinode/db: nodes repository tests added
Change-Id: Ia5172f249c18540683f66ef244c2c6d39aa3da0a
2020-11-16 20:03:10 +02:00
VitaliiShpital
51fa52e636 web/satellite: access grant type, api, store module, mock
Change-Id: I4c27ca8ac0df2d348e945d3266a56bd26f7d444a
2020-11-16 16:10:58 +00:00
VitaliiShpital
51a712f9e8 satellite/console: get all bucket names endpoint and service method
WHAT:
new endpoint for fetching all bucket names

WHY:
used by new access grant flow

Change-Id: I356a3381359665fd2726120139b34b1e611fe3c4
2020-11-16 17:51:40 +02:00
Moby von Briesen
db480e6e1b storagenode/orders: Improve performance of handling corrupt orders.
This change fixes two things which can make reading from a corrupted
orders file inefficient.
* When a corrupted order is detected, but the underlying error is an
UnexpectedEOF (as opposed to a pb.Unmarshal error, for instance), there
is no point in attempting to read from the file another time to find an
additional uncorrupted order - we will continue to get UnexpectedEOF
errors until we seek to the very end of the file and get a normal EOF.
Instead, when UnexpectedEOF occurs, log and send metrics as with other
types of corruption, but do not attempt to read again.
* When a corrupted order is detected, instead of seeking forward only
one byte for the next attempt, seek forward by the size of entryHeader.
This cuts down on the number of iterations needed to find an uncorrupted
order after detecting a corrupted one.
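
The two rules above can be condensed into a small decision function. This is a sketch, not the ordersfile code: `afterCorruption`, the example errors, and the value of `entryHeaderSize` are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// entryHeaderSize is a placeholder; the real header size is not stated
// in the commit message.
const entryHeaderSize = 2

// Example errors used to exercise the function (hypothetical).
var (
	errUnmarshal = errors.New("unmarshal failed")
	errTruncated = fmt.Errorf("read order: %w", io.ErrUnexpectedEOF)
)

// afterCorruption decides how to continue once an entry failed to parse.
// A truncated file cannot contain further valid entries, so reading
// stops. For any other corruption, the reader skips a whole header's
// worth of bytes instead of a single byte before trying again.
func afterCorruption(err error) (keepReading bool, seekForward int64) {
	if errors.Is(err, io.ErrUnexpectedEOF) {
		return false, 0
	}
	return true, entryHeaderSize
}

func main() {
	fmt.Println(afterCorruption(errTruncated)) // false 0
	fmt.Println(afterCorruption(errUnmarshal)) // true 2
}
```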

Change-Id: Ie1a613127e29d29318584ec7f60e8f7554f73487
2020-11-16 14:08:36 +00:00
Egon Elbre
1726b39ed2 scripts: remove thrift mod replace
Our code currently uses only the GitHub version of thrift, so the
exception should no longer be needed.

Change-Id: I0c6e8a8465ab7b525d4b5d1b29e4e5298384286d
2020-11-16 12:00:09 +02:00
Ivan Fraixedes
c2ba5a9905 Makefile: Update Go version security patch
Update the Go version to the latest patch release, 1.15.5, due to a
security fix.
https://groups.google.com/u/1/g/golang-nuts/c/c-ssaaS7RMI/m/5iS6JRtOAwAJ

Change-Id: I748df29e2309408ba1567ebf72652803ee4ec5bc
2020-11-15 00:36:54 +01:00
Jessica Grebenschikov
f558cc825e satellite/orders: add storagenode_bw_phase2 table and don't delete tallies for longer
It turns out we need to make 2 more changes in order for the new order submission phase 3 to get deployed.

This PR makes 2 changes:
1) when the rollup service deletes tallies, we now keep tallies around until orders expire (vs 1 day like before).
2) the reported rollup chore will now write the storagenode_bandwidth_rollups to a new table _phase2 as an intermediary step so it doesn't conflict with phase 3 order settlement.

These changes need to be deployed for 2 days before we can turn on phase 3 of the new orders settlement workflow.

Change-Id: Iafbff577ba7d55f8f17b7db857311b2ce799de60
2020-11-13 17:15:24 +00:00
Egon Elbre
c0b5e7ce3e ci: ensure cockroach doesn't pollute repo
Cockroach 20.2 started profiling automatically. This is a workaround to
disable it and ensure it doesn't create any folders and files in the
repository.

Change-Id: Ib65de01ea1fc619160d710c01602ced3a3a3492e
2020-11-13 16:07:01 +00:00
Malcolm Bouzi
2e6ffd9af6 web/satellite: access grant empty state (#3970)
2020-11-13 18:06:34 +02:00
Yaroslav Vorobiov
1b4bfbb9d2 multinode/console: nodes addition and removal
Change-Id: I60c685953a8d0e24f78b1414c34a28d4b87863b0
2020-11-12 20:26:08 +02:00
crawter
e6967720cd cmd/multinode: create schema command added, run command bug fixed
Change-Id: Ief76fc4a878441e5f112bd79810c66e8d85d7acb
2020-11-12 18:00:18 +00:00
NickolaiYurchenko
f8d3a977fa web/storagenode: PayoutPeriodCalendar.vue unit tests
Change-Id: I6a41611e28993577eb72426b941cf272ae8da46f
2020-11-12 18:50:16 +02:00
Jessica Grebenschikov
226e13e616 satellite/console: add tests for wasm access code
Change-Id: I78f71b2f0bef03b6e87cd7d79ccaef5f45393b55
2020-11-12 08:03:36 -08:00
NickolaiYurchenko
259f4ebcf1 web/storagenode: EstimationArea.vue unit tests
Change-Id: I5b7606f3deb0b3b8cccf6ec1026d06f3558fd808
2020-11-12 13:21:10 +00:00
Yingrong Zhao
54c5d564a1 scripts/tests/testversions: fix older uplink setup
This PR makes the following changes:
    1. Change oldest uplink version in the test to v0.35.3
        When the test was first created, we decided to support uplink
        versions starting from v0.17.1. However, after many API changes,
        older uplinks are no longer usable with the latest version of the
        network. One of the reasons is that older uplinks use deprecated
        endpoints. Therefore, we will change the oldest uplink version to
        the one that's using only new endpoints.
    2. Disable tls certificate verification in uplink
    3. Use storj-sim version control server instead of production one
    4. Skip uplink version v1.3.x due to bug in that release

Change-Id: I926a6bb9829cb7181ee752437cdcb67e59197fe0
2020-11-11 17:00:01 -05:00
paul cannon
3e56403599 satellite/repair: add a repair health function
This will be used to rank segments in need of repair for attention by
the repair workers.

Change-Id: I5b70650cec933696b4c6d73bb7efb97e3efdf24a
2020-11-11 18:48:51 +00:00
Jeff Wendling
31533ed1a1 satellite/console/wasm: remove storj.io/uplink dependency
Change-Id: Iee95389e4ba24618e31aff7be44d05377b2e2419
2020-11-11 16:51:14 +00:00
Malcolm Bouzi
592d0bd6bc web/satellite: access grant routing (#3966)
2020-11-11 18:41:46 +02:00
Cameron Ayer
5a337c48ec {cmd,private,storagenode}: create storage dir verification during setup
Previously, we created a new file to use for directory verification
every time the storage node started. This is not helpful if the storage node
points to the wrong directory when restarting. Now the file is created only
once, during setup, and is verified at runtime.
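
The create-once, verify-later pattern might be sketched like this. The file name, the stored node ID, and all function names are assumptions for illustration; the commit message does not say what the verification file contains.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// verificationFile is a hypothetical name for the marker file.
const verificationFile = "storage-dir-verification"

// createVerification writes the marker during setup. O_EXCL makes the
// call fail if the marker already exists, so it is created only once.
func createVerification(dir, nodeID string) error {
	path := filepath.Join(dir, verificationFile)
	f, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_EXCL, 0o600)
	if err != nil {
		return err
	}
	if _, err := f.WriteString(nodeID); err != nil {
		_ = f.Close()
		return err
	}
	return f.Close()
}

// verify runs at startup and never creates the marker itself.
func verify(dir, nodeID string) error {
	data, err := os.ReadFile(filepath.Join(dir, verificationFile))
	if err != nil {
		return fmt.Errorf("storage directory not verified: %w", err)
	}
	if string(data) != nodeID {
		return errors.New("storage directory belongs to a different node")
	}
	return nil
}

// demo exercises setup and verification in a temporary directory.
func demo() error {
	dir, err := os.MkdirTemp("", "storage")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)

	if verify(dir, "node-a") == nil {
		return errors.New("verification passed before setup")
	}
	if err := createVerification(dir, "node-a"); err != nil {
		return err
	}
	if err := verify(dir, "node-a"); err != nil {
		return err
	}
	if verify(dir, "node-b") == nil {
		return errors.New("verification passed for the wrong node")
	}
	if createVerification(dir, "node-a") == nil {
		return errors.New("marker was created twice")
	}
	return nil
}

func main() {
	fmt.Println("demo result:", demo())
}
```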

Change-Id: Id529f681469138d368e5ea3c63159befe62b1a5b
2020-11-11 11:01:36 -05:00
crawter
4ce00c7caa cmd/multinode: run and setup commands added
Change-Id: If7b39c392a9a5617315cefaeafffddab845cf071
2020-11-11 14:48:16 +00:00
VitaliiShpital
5e0106f1fe web/satellite: web worker for wasm
WHAT:
web worker for compiling and instantiation of web assembly module

WHY:
Currently webassembly requires unsafe-eval, however we don't want to
add it to main site due to CSP. The workaround for this is to instantiate
wasm code inside a web worker.

Change-Id: I0c3c9cafa3a0c344761cf6dd86bf96248f1103ca
2020-11-11 16:24:06 +02:00
Cameron Ayer
07acf0e574 cmd/storagenode: add docker env variable to toggle running setup
Previously, we ran setup if no config file was found in the expected dir.
However, there may be situations where a previously set up node's files
may be unreachable. In this case, we would prefer to exit with an error
rather than assume this node needs to be initialized.

The solution here is to add a new env variable to call the setup command.
If SETUP == true, the node will setup, but not run. If SETUP != true,
the node will run and not setup.

If a previously set up node runs with SETUP, it will return an error.
If a node runs without an initial SETUP, it will return an error.
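
The four cases above form a small decision table, sketched here in Go. The function and error names are hypothetical, and using the presence of a config file to detect a previously set up node is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical errors for the two failure cases.
var (
	errAlreadySetUp = errors.New("node is already set up; refusing to run setup again")
	errNotSetUp     = errors.New("node is not set up; run once with SETUP=true")
)

// entrypointAction maps the SETUP variable and the node's current state
// to the action the container should take.
func entrypointAction(setupEnv string, configExists bool) (string, error) {
	if setupEnv == "true" {
		if configExists {
			return "", errAlreadySetUp
		}
		return "setup", nil // set up, but do not run
	}
	if !configExists {
		return "", errNotSetUp
	}
	return "run", nil // run, but do not set up
}

func main() {
	for _, setup := range []string{"true", ""} {
		for _, exists := range []bool{false, true} {
			action, err := entrypointAction(setup, exists)
			fmt.Printf("SETUP=%q configExists=%v -> %q, err=%v\n", setup, exists, action, err)
		}
	}
}
```

Failing in the two mismatched cases is what keeps a node whose files are temporarily unreachable from being initialized a second time.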

Change-Id: Id2c796ec3d43f2add5e5f34fb777a563eae59f2f
2020-11-11 13:11:19 +00:00
Cameron Ayer
da9f1f0611 satellite/repair: add monkit counter for segments below minimum required
The current monkit reporting for "remote_segments_lost" is not usable for
triggering alerts, as it has reported no data. To allow alerting, two new
metrics "checker_segments_below_min_req" and "repairer_segments_below_min_req"
will increment by zero on each segment unless it is below the minimum
required piece count. The two metrics report what is found by the checker
and the repairer respectively.
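
The increment-by-zero pattern is sketched below with a stand-in counter type rather than the monkit API. The `counter` type, the increment of one for an unhealthy segment, and the example piece counts are assumptions.

```go
package main

import "fmt"

// counter is a stand-in for a metrics counter (hypothetical type).
type counter struct {
	total        int64 // sum of all increments
	observations int   // number of times the counter was touched
}

// Inc records an observation, even when n is zero.
func (c *counter) Inc(n int64) {
	c.total += n
	c.observations++
}

// observeSegment always touches the counter, so the metric reports data
// (a value of zero) even when no segment is below the minimum.
func observeSegment(c *counter, healthyPieces, minRequired int) {
	if healthyPieces < minRequired {
		c.Inc(1)
		return
	}
	c.Inc(0)
}

func main() {
	c := &counter{}
	observeSegment(c, 40, 29) // healthy segment: increments by zero
	observeSegment(c, 20, 29) // below minimum: increments by one
	fmt.Println(c.total, c.observations)
}
```

An alert on a metric that never emits cannot distinguish "nothing lost" from "not reporting", which is the problem the zero increments address.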

Change-Id: I98a68bb189eaf68a833d25cf5db9e68df535b9d7
2020-11-11 12:48:23 +00:00
Egon Elbre
2ff7925e65 ci: set GOTRACEBACK=all
Currently when there's a timeout or panic, the culprit goroutines might
not be printed. Set traceback to all, which prints all user created
goroutines.

Change-Id: I29f87812d2a60f671b3eb172499e24cf70d990b5
2020-11-11 13:57:45 +02:00