Commit Graph

1218 Commits

Author SHA1 Message Date
Egon Elbre
9200efc61f satellite/satellitedb: fix selecting a nullable string
Change-Id: I59e645966e09da586512c69101691b47055c1e5a
2020-04-03 21:30:20 +03:00
paul cannon
0c8c11b251 satellite/audit: add not_enough_shares_for_audit counter
We have been using the SQL expression `name='(*Verifier).Verify' AND
error_name='not enough shares for successful audit'` thus far to detect
cases of this problem and alert on them. Unfortunately, since this
rarely (hopefully never) happens, influxdb has no data for most of the
auditor instances, and when it has no data for a time series, it returns
no columns either. This makes Redash upset when it tries to perform a
query for an alert and can't find the column whose value it expects to
check.

This change should make it so zero values are reported when the problem
has not happened, and higher values when it has.

Change-Id: I79e5e000f879678b661dac88caae1e2915b39ab1
2020-04-03 17:00:50 +00:00
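The idea in 0c8c11b251 (always report a value, so the time series exists even when the problem never happens) can be sketched roughly as follows. This is a minimal illustration, not the satellite's actual monitoring code: the `counter` type and its `report` method are hypothetical names that do not come from the commit or from any metrics library.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// counter is a hypothetical metric that always has a value, even when
// the event it counts has never happened.
type counter struct {
	name  string
	value int64
}

// Inc records one occurrence of the event.
func (c *counter) Inc() { atomic.AddInt64(&c.value, 1) }

// report returns the series name and the current value. A zero count is
// still reported, so a dashboard query always finds the column.
func (c *counter) report() (string, int64) {
	return c.name, atomic.LoadInt64(&c.value)
}

func main() {
	notEnoughShares := &counter{name: "not_enough_shares_for_audit"}

	// Nothing has gone wrong yet, but a zero value is still emitted.
	name, v := notEnoughShares.report()
	fmt.Println(name, v)

	// After the problem occurs once, the reported value rises.
	notEnoughShares.Inc()
	name, v = notEnoughShares.report()
	fmt.Println(name, v)
}
```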
Matt Robinson
4b01a8dd18
Deescalate Usage Error to Debug (#3780) 2020-04-03 13:20:20 +02:00
Jeff Wendling
a409bd5dec satellite/orders: check for expired orders first
there is a subset of storagenodes hammering the satellite with
expired orders. if we check for expiration first, we don't have
to do a bunch of pointless signature verification. since a && b
is equal to b && a, we can order these checks in any way we want
and have it still be correct.

Change-Id: I6ffc8025c8b0d54949a1daf5f5ea1fed9e213372
2020-04-02 12:35:11 -06:00
Egon Elbre
6492b13d81 all: remove old uuid
Change-Id: I3a137f73456f010c37d3933dbe12cbbb840b809f
2020-04-02 19:30:36 +03:00
Egon Elbre
1024bf9ce1 all: simplify uuid usage
Instead of uuid.Parse, use uuid.FromString.
This removes a bunch of pointer management logic.

Change-Id: Id25bd174eb43c71d00b450158a198abafd8958f2
2020-04-02 13:45:19 +00:00
Michal Niewrzal
c178a08cb8 satellite/metainfo: add max segment size and max inline size to
BeginObject response

We want to control inline segment size and segment size on the satellite
side. We need to return this information to the uplink, as we already do
with the redundancy scheme.

Change-Id: If04b0a45a2757a01c0cc046432c115f475e9323c
2020-04-02 12:41:28 +00:00
Michal Niewrzal
4a79b609e9 satellite/metainfo: fix panic when we batch BeginObjectDelete without
all permissions

Without read and list permissions, BeginObjectDelete does not return an
error even if one occurs. This broke Batch processing, which assumed
that the response is always non-nil when no error is returned.

https://storjlabs.atlassian.net/browse/SM-590

Change-Id: I0fc9539e429110a660eb28725b266d5e4771d198
2020-04-02 12:20:19 +00:00
Egon Elbre
8f73fb7a32 all: simplify uuid usage
uuid.UUID implements driver.Value so it can be directly used as a
scannable result.

Replace uses of dbutil.BytesToUUID with uuid.FromBytes.

Change-Id: I51a670185ceb3cc2199d5aa2b76bc3fc191ca8fe
2020-04-02 05:48:58 +00:00
Jeff Wendling
ffe7a3c211 satellite/compensation: make surge percent an int64 for strictcsv
Change-Id: I1783bf73ee68ca9beb8a03f5928873fab0bbe95d
2020-04-01 14:06:33 +00:00
Egon Elbre
a416b03941 satellite/accounting: fix TestProjectBandwidthTotal
Test was inserting for past 4 days, however the test was summing up for
the current month.

Change-Id: I509afdc6a76b314a6bb90652ab70cd2c2bab1288
2020-04-01 11:50:18 +03:00
Egon Elbre
0a69da4ff1 all: switch to storj.io/common/uuid
Change-Id: I178a0a8dac691e57bce317b91411292fb3c40c9f
2020-03-31 19:16:41 +03:00
Egon Elbre
1d79228ed0 satellite/metainfo: support uplink useragent
Adds support for parsing user agent and specifying uplink version and
its library dependencies.

Change-Id: Ibaddde4deb93e153ac05c91b676c5b5f1ae1aa37
2020-03-31 15:11:31 +00:00
Qweder93
dc32f1da55 storagenode/cache/heldamount added, errNoRows ignored
Change-Id: If6b675e622d6c1324c0893c43cca93dc5323cd78
2020-03-31 11:35:58 +00:00
Jeff Wendling
e2ff2ce672 satellite: compensation package and commands
Change-Id: I7fd6399837e45ff48e5f3d47a95192a01d58e125
2020-03-30 14:08:14 -06:00
littleskunk
23e5a0471f
satellite/audit: clean up logging (#3832)
Co-authored-by: Ivan Fraixedes <ivan@fraixed.es>
2020-03-30 12:09:50 -06:00
Jennifer Johnson
d77f3b8786 satellitedb/migrate: set vetted_at backfill to now.day
Change-Id: Ib2b12be43dbd3f3705b1891bc703ae15abb75e09
2020-03-30 16:50:23 +00:00
Egon Elbre
439aba922a satellite/overlay: reduce overhead of GetNodes
Instead of filtering on the client side it's better to filter on the
database side.

Change-Id: I845fbbe5ed28c2ffdb0b8a3f789b59c094fd1069
2020-03-30 18:36:23 +03:00
Egon Elbre
cb781d66c7 satellite/overlay: optimize FindStorageNodes
Reduce the number of fields returned from the query.

Benchmark results in `satellite/overlay`:

benchstat before.txt after2.txt
name                               old time/op  new time/op  delta
SelectStorageNodes-32              7.85ms ± 1%  6.27ms ± 1%  -20.18%  (p=0.002 n=10+4)
SelectNewStorageNodes-32           8.21ms ± 1%  6.61ms ± 0%  -19.53%  (p=0.002 n=10+4)
SelectStorageNodesExclusion-32     17.2ms ± 1%  15.9ms ± 1%   -7.55%  (p=0.002 n=10+4)
SelectNewStorageNodesExclusion-32  17.8ms ± 2%  16.1ms ± 0%   -9.38%  (p=0.002 n=10+4)
FindStorageNodes-32                48.4ms ± 1%  45.1ms ± 0%   -6.69%  (p=0.002 n=10+4)
FindStorageNodesExclusion-32       79.2ms ± 1%  76.1ms ± 1%   -3.89%  (p=0.002 n=10+4)

Benchmark results from `satellite/overlay` after making them parallel:

benchstat before-parallel.txt after2-parallel.txt
name                               old time/op  new time/op  delta
SelectStorageNodes-32               548µs ± 1%   353µs ± 1%  -35.60%  (p=0.029 n=4+4)
SelectNewStorageNodes-32            562µs ± 0%   368µs ± 0%  -34.51%  (p=0.029 n=4+4)
SelectStorageNodesExclusion-32     1.02ms ± 1%  0.84ms ± 0%  -18.08%  (p=0.029 n=4+4)
SelectNewStorageNodesExclusion-32  1.03ms ± 1%  0.86ms ± 2%  -16.22%  (p=0.029 n=4+4)
FindStorageNodes-32                3.11ms ± 0%  2.79ms ± 1%  -10.27%  (p=0.029 n=4+4)
FindStorageNodesExclusion-32       4.75ms ± 0%  4.43ms ± 1%   -6.56%  (p=0.029 n=4+4)

Change-Id: I1d85e2764eb270f4c2b1998303ccfc1179d65b26
2020-03-30 18:36:23 +03:00
Egon Elbre
a6540dc3ef satellite/overlay: remove unused KeyLock
Change-Id: Ie99f97772824ceafb3d97453545fc6e96be2fb6f
2020-03-30 16:47:48 +03:00
Ivan Fraixedes
de903a652d satellite/accounting: Test no repair traffic in billing
Add a test for checking that the billing doesn't include traffic due to
audits and repairs.

Change-Id: I1ce1199ec87c4440082c42b100124aeb14200f41
2020-03-30 10:48:35 +00:00
littleskunk
048ca4558f
satellite/repair: clean up logging (#3833)
Co-authored-by: Michal Niewrzal <michal@storj.io>
2020-03-30 11:59:56 +02:00
Egon Elbre
c970969503 satellite/overlay: add benchmark for node selection
Change-Id: I15b767a78b662f8276e656b3fb73a15ec59e76c8
2020-03-27 23:09:29 +02:00
Egon Elbre
480ea1e4b5 satellite/repair/repairer: fix temporary file handling
Change-Id: Ice1a467510737b3375c018ae37b16431c7dffe9e
2020-03-27 21:36:23 +02:00
Egon Elbre
e1a443b04a private/testplanet: allow modifying created database
Instead of providing the database from outside to testplanet create it
inside and then allow wrapping and modifying it. This is more convenient
to use.

Change-Id: I9b8f69e6e0a19ff984b4e2bfe927c9100c77bc6c
2020-03-27 19:14:48 +00:00
Moby von Briesen
a933bcc99a satellite/repair/repairer/ec.go: add option for downloading pieces onto disk instead of in memory during repair
Add flag to satellite repairer, "InMemoryRepair" that allows the
satellite to decide whether to download the entire segment being
repaired into memory (this is what the satellite already does), or to
download it into temporary files on disk that will be read from in the
upload phase of repair.

This should help with handling high repair traffic on satellites that
cannot afford to spend 64mb of memory per repair worker.

Updates tests to test repair for both in memory and to disk.

Change-Id: Iddf591e165621497c98533d45bfea3c28b08a194
2020-03-27 16:41:00 +00:00
Egon Elbre
e8f18a2cfe private/testplanet: expose storagenode and satellite Config
Change-Id: I80fe7ed8ef7356948879afcc6ecb984c5d1a6b9d
2020-03-27 17:01:25 +02:00
igor gaidaienko
9d3d411ca1 satellite/accounting: Add test billing inline segments
New test added to be sure that we bill customers for inline segments

Change-Id: I5eb0ee7b475a7b57ecebaad214ece8c5f1cf8c4d
2020-03-27 10:36:53 +00:00
JT Olio
5511827662 satellite/orders: don't log expired order limits
we still need to come up with a better plan to get storage nodes
to stop doing this, but in the meantime, we know this is happening,
just stop logging it and keep some stats instead.

Change-Id: Icb6bcba275e0e955c54b1a90da2b37219fff2349
2020-03-26 22:31:10 -06:00
Ethan
df462d7265 satellite/accounting: Add index on bucket_bandwidth_rollups to minimize full table scans
https://storjlabs.atlassian.net/browse/SM-545

Change-Id: I5599a72a991d70236f17beca027e9bc032777177
2020-03-26 19:53:50 +00:00
Natalie Villasana
8e0ca0e6f5
satellite/gc: update release default for gc to run separately (#3830) 2020-03-26 14:44:18 -04:00
VitaliiShpital
23da9228b3 satellite/console: email used error handling for registration
Change-Id: Ifd3f2ce065ebd3c5e538c5c1eeaa76137b243b78
2020-03-26 17:42:33 +00:00
Jeff Wendling
97e980cd8a private/dbutil: add database name to configure as a tag
storagenodes have like 10 or more databases. without this
tag they all get sent as the same value, stomping on each
other.

Change-Id: Ib12019684d6ea8f2a5b83df584056dfa79e3c4b3
2020-03-26 16:50:15 +00:00
Jennifer Johnson
b75cbc8e24 satellite,storagenode: remove references to free bandwidth
Change-Id: I42a6597544804fa9235e89ec656ebc365eb522e5
2020-03-25 22:28:34 +00:00
Egon Elbre
c715c75fea pkg/server: add counters for grpc calls
This will help to determine how many grpc calls are made to the
satellite.

Also remove the grpc funcs that have been added to upstream.

Change-Id: I91878f4fd10f9bfe601c94222c102eaaf4d35963
2020-03-25 21:38:13 +02:00
Yingrong Zhao
b7b19289d1 bump storj.io/common to latest
Change-Id: I16e337660ce8e1ef332cc842dbf4cfa067b9b98b
2020-03-25 09:08:40 -04:00
Yingrong Zhao
a731472496 bump storj.io/common to latest and storj.io/drpc to v0.0.11
Change-Id: I7a6e823b441eeff4621dfdf2d6577be76c9761c8
2020-03-24 15:17:10 -04:00
Michal Niewrzal
fdf40a7526 storj: remove storj/private/version package which was moved to
`storj/private` repo

Change-Id: I81c3f5b9d5e4fe7bca760999eb045ee9734e5e2e
2020-03-24 14:31:33 +00:00
Stefan Benten
bdec51658e
cmd/storj-sim: Increase storj-sim max-alpha-usage (#3824) 2020-03-24 14:48:25 +01:00
Michal Niewrzal
aed8dea625 satellite/accounting: Add test not billing after deletion
New test added to be sure that we don't bill for the data after deleting it.

Change-Id: Ifb5931ce28f6b0294aeb16311164675a17f11917
2020-03-24 12:55:49 +00:00
Stefan Benten
173cb1e484
Changing LogLevel to Warn (#3822)
This is not a process error and can cause false alarm for monitoring systems
2020-03-24 13:46:28 +01:00
Michal Niewrzal
f0aeda3091 storj: remove from storj/pkg packages moved to storj/private repo
* debug
* traces
* cfgstruct
* process

Package `storj/private/version` will be removed as a separate change.

Change-Id: Iadc40faa782e6225513b28218952f02d9c240a9f
2020-03-24 09:56:29 +01:00
Kaloyan Raev
6c44512c15 satellite/metainfo: increase stream ID expiration to 48h
Otherwise, it is not possible to upload very large files like 1 TB.

Change-Id: Iadda1ae91174125736684850b906d5bb6d19f0a9
2020-03-23 15:47:39 +02:00
Jessica Grebenschikov
aeab599d21 satellitedb: removed unused id on storagenode_storage_tallies table, add index on node_id
The goal of this change is to improve the storagenode_storage_tallies table by removing the unneeded id column that is not being used but only taking up space, and also to add an index on a different column that needs it. Removing and adding a column seems simple, but ended up being more complicated because of some cockroachdb limitations.

The cockroachdb limitations when trying to remove a column from a table and create a new primary key are:
1. only allows primary key creation at table creation time (docs: https://www.cockroachlabs.com/docs/stable/primary-key.html)
2. table drop or rename is performed async and cannot be done in a transaction (issue: https://github.com/cockroachdb/cockroach/issues/12123, https://github.com/cockroachdb/cockroach/issues/22868)

To address these differences between cockroachdb and Postgres, this PR performs different migrations for the two databases. The Postgres migration is straightforward and what you would expect, but the cockroach migration has two main changes:
1. To change a primary key, use the recommended process from the cockroachdb docs to create a new table with the new primary key you want and then migrate the data.
2. In order to do 1, we needed to do the new table renaming in a separate transaction from the data migration.

Ref: SM-65

Change-Id: Idc9aee3ab57aa4d5570e3d2980afea853cd966bf
2020-03-20 14:39:44 -07:00
Egon Elbre
1b6ab173a8 private/context2: moved to storj.io/common/context2
Change-Id: Ic1dd1ed645ff3e1057c9b2b143e2c3ddf29d678e
2020-03-20 14:39:46 +00:00
Jennifer Johnson
9b78473c0c satellitedb: adds vetted_at nullable timestamp to nodes table
Change-Id: I42d5a396b4eecbad26b683c6aee51e043d2ff034
2020-03-20 01:37:28 +00:00
JT Olio
b2590cf283
bump uplink to 1.0.0 (#3816)
What: bumps uplink to 1.0.0
Why: we just released it!
2020-03-19 15:38:27 -06:00
Jennifer Johnson
699b635e5d satellite/overlay: rename newNodePercentage to newNodeFraction
Change-Id: Ie66de91f88183b44de0773589e83e4ade9aa997a
2020-03-19 20:09:32 +00:00
Egon Elbre
eb1d8aab96 satellite/metainfo/pointerverification: service for verifying pointers
This implements a service for pointer verification. This makes the
code slightly clearer, because it's not part of metainfo.

It also adds a peer identity cache which reduces database calls and peer
identity decoding.

Change-Id: I45da40460d579c6f5fd74c69bccea215157aafda
2020-03-19 16:27:38 +00:00
Qweder93
0df586c3a8 satellitedb/heldamount updated, tests added + storagenode console updated
Change-Id: I10f568a426d0fc42069d025de2accbef5b26dc0c
2020-03-19 15:37:45 +02:00
Kaloyan Raev
78b253c774 libuplink: return deleted bucket/object (step 2)
step 1 in https://review.dev.storj.io/c/storj/uplink/+/1236

Now the old libuplink uses the temporary DeleteBucketReturnDeleted and
DeleteObjectReturnDeleted methods. This way, in the next step, we will
be able to change the DeleteBucket and DeleteObject methods to return
the deleted bucket/object.

Change-Id: I2e638be1960bca6ce1456c92849fcdd6d93e5252
2020-03-18 17:26:23 +00:00
Jessica Grebenschikov
5142874144 satellite/gc: move garbage collection to its own process
Change-Id: I7235aa83f7c641e31c62ba9d42192b2232dca4a5
2020-03-18 16:44:01 +00:00
Egon Elbre
09e0f3de63 satellite/metainfo/piecedeletion: add Service
Change-Id: Id7e32ed569701fa0be66f9527c43a67052994570
2020-03-18 14:50:08 +00:00
Jeff Wendling
115f4559e5 satellite/orders: more efficient processing of orders
by doing an indexed anti-join we're able to reduce the time to
select the pending orders by over 10x on postgres. this should
help us process pending orders much more quickly.

it probably won't do as good a job on cockroach because it does
not do an indexed anti-join and instead does a hash join after
scanning the entire consumed serials table. we should either
remove orders entirely or try to make that more efficient
when necessary.

Change-Id: I8ca0535acd21c51e74955b24c9b86d20e4f2ff9c
2020-03-18 09:03:30 +00:00
paul cannon
ba5991dc86 satellite/repair: add monitoring for remote_segments_healthy_percentage
Change-Id: I6ad29fe1a947ac19d15e40ea33164a510eb33d4f
2020-03-17 17:45:59 +00:00
Moby von Briesen
2f991b6c56 satellite/{overlay, satellitedb}: account for suspended field in overlay cache
Make sure that suspended nodes are treated appropriately by the overlay
cache. This means we should expect the following behavior:
* suspended nodes (vetted or not) should not be selected for uploading
new segments
* suspended nodes should be treated by the checker and repairer as
"unhealthy", and should be removed upon successful repair

This commit also removes unused overlay functionality.

Fixes a bug with commit 8b72181a1f where
the audit reporter was automatically suspending nodes regardless of
audit outcome (see test added).

Tests:
* updates repair tests to ensure that a suspended node is treated as
unhealthy and will be removed from the pointer on successful repair
* updates overlay tests for KnownUnreliableOrOffline and KnownReliable
to expect suspended nodes to be considered "unreliable"
* adds satellitedb test that ensures overlay.SelectStorageNodes and
overlay.SelectNewStorageNodes do not include suspended nodes
* adds audit reporter test to ensure that different audit outcomes
result in the correct suspended/disqualified states

Change-Id: I40dba67278c8e8d2ce0bcec5e0a5cb6e4ce2f561
2020-03-17 17:14:56 +00:00
Michal Niewrzal
81afbcc12e satellite/metainfo: check bucket existence on upload and listing
Initial change for checking bucket existence on the satellite side for
requests like BeginObject and ListObjects. This is a simple
implementation that just checks the bucket in the DB, but it should be
improved in the future to avoid DB calls as much as possible.

Part of https://storjlabs.atlassian.net/browse/USR-365

Change-Id: I9076acddc44d7dbfa7612a1c24a007de01621583
2020-03-17 15:43:22 +00:00
Jeff Wendling
7baa59753a satellite/orders: add tests for double sending the same order
Change-Id: If2fa7f035257df3b04f506f81aa8b2e0916f5033
2020-03-17 14:18:03 +00:00
Egon Elbre
ad9cac3084 satellite/metainfo: reduce test flakiness
Change-Id: I6be930f6dce2186b1575ca470cb893cc0dc5e4ce
2020-03-17 11:54:34 +02:00
Egon Elbre
22ea0c7c1a satellite/metainfo/piecedeletion: add Dialer
This adds a piece deletion handler that has debounce for failed dialing
and batching multiple jobs into a single request.

Change-Id: If64021bebb2faae7f3e6bdcceef705aed41e7d7b
2020-03-16 23:36:01 +00:00
Ethan
bdbf764b86 satellite/orders;overlay: Consolidate order limit storage node lookups into 1 query.
https://storjlabs.atlassian.net/browse/SM-449
Change-Id: Idc62cc2978fba67cf48f7c98b27b0f996f9c58ac
2020-03-16 23:15:47 +00:00
Stefan Benten
49a30ce4a7
satellite/payments: Set proper defaults for the release (#3806)
* Slight adjustments to the migration

Change-Id: I68ae81c010c3414fde2845df16ab124f8d17834b

* Change Coupon Value

Change-Id: I0f241d09e5f716f1d1b3f0688643ba7f614d83c4

* Change AlphaUsage to 5GB

Change-Id: I5d25c6b5750684510cda8b14a27f38d5b2b07408

* change config lock

Change-Id: Ib7c7a54555ba2387c9aa8dd60a0501b0ee6491dd

* Use Scan properly

Change-Id: Ie39cf4644e3ddd703a254e2f5e616763dd805235

* Fix Config Lock

Change-Id: I558ecc1c1becfaaefc7aea5ad2fe83fd6bf6b561
2020-03-16 22:53:12 +01:00
Moby von Briesen
8b72181a1f satellite/{audit,overlay,satellitedb}: implement unknown audit reputation and suspension
* change overlay.UpdateStats to allow a third audit outcome. Now it can
handle successful, failed, and unknown audits.
* when "unknown audit reputation"
(unknownAuditAlpha/(unknownAuditAlpha+unknownAuditBeta)) falls below the
DQ threshold, put node into suspension.
* when unknown audit reputation goes above the DQ threshold, remove node
from suspension.
* record unknown audits from audit reporter.
* add basic tests around unknown audits and suspension.

Change-Id: I125f06f3af52e8a29ba48dc19361821a9ff1daa1
2020-03-16 20:29:26 +00:00
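The suspension rule in 8b72181a1f can be illustrated with a short sketch. The formula alpha/(alpha+beta) comes from the commit message; the function names, the threshold value 0.6, and the choice to leave the state unchanged when the reputation equals the threshold exactly are assumptions made for this example only.

```go
package main

import "fmt"

// unknownAuditReputation computes alpha/(alpha+beta), the formula given
// in the commit message.
func unknownAuditReputation(alpha, beta float64) float64 {
	return alpha / (alpha + beta)
}

// nextSuspended returns the suspension state after an update: suspended
// when the reputation falls below the DQ threshold, unsuspended when it
// rises above it. At exactly the threshold (or when alpha+beta is zero
// and the result is NaN) the current state is kept; that tie-breaking
// is an assumption of this sketch.
func nextSuspended(current bool, alpha, beta, dqThreshold float64) bool {
	rep := unknownAuditReputation(alpha, beta)
	switch {
	case rep < dqThreshold:
		return true
	case rep > dqThreshold:
		return false
	default:
		return current
	}
}

func main() {
	const dqThreshold = 0.6 // hypothetical value, not from the commit

	fmt.Println(nextSuspended(false, 1, 1, dqThreshold)) // reputation 0.5
	fmt.Println(nextSuspended(true, 9, 1, dqThreshold))  // reputation 0.9
}
```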
Stefan Benten
52590197c2
satellite/payments: More Cleanup and Satellite command to ensure we have stripe customers (#3805) 2020-03-16 20:34:15 +01:00
Egon Elbre
3d6518081a satellite/metainfo/piecedeletion: add Combiner
To handle concurrent deletion requests we need to combine them into a
single request.

To implement this we introduce a few concurrency ideas:

* Combiner, which takes a node id and a Job and handles combining
  multiple requests to a single batch.

* Job, which represents deleting of multiple piece ids with a
  notification mechanism to the caller.

* Queue, which provides communication from Combiner to Handler.
  It can limit the number of requests per work queue.

* Handler, which takes an active Queue and processes it until it has
  consumed all the jobs.
  It can provide limits to handling concurrency.

Change-Id: I3299325534abad4bae66969ffa16c6ed95d5574f
2020-03-16 17:13:26 +00:00
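A heavily simplified sketch of the combining idea from 3d6518081a. It keeps only the Combiner and Job concepts and replaces the Queue and Handler with a synchronous `Flush` call, so it is not the design the commit implements. All type and method names here are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

type nodeID string
type pieceID string

// job holds the piece ids to delete plus a channel that notifies the
// caller once the job has been handled.
type job struct {
	pieces []pieceID
	done   chan struct{}
}

// combiner collects jobs per node so they can be sent as one batch.
type combiner struct {
	mu      sync.Mutex
	pending map[nodeID][]*job
}

func newCombiner() *combiner {
	return &combiner{pending: map[nodeID][]*job{}}
}

// Enqueue adds a job for a node and returns it so callers can wait on done.
func (c *combiner) Enqueue(node nodeID, pieces []pieceID) *job {
	j := &job{pieces: pieces, done: make(chan struct{})}
	c.mu.Lock()
	c.pending[node] = append(c.pending[node], j)
	c.mu.Unlock()
	return j
}

// Flush takes all pending jobs for a node, merges their piece ids into
// one batch, passes the batch to send, and notifies every waiting caller.
func (c *combiner) Flush(node nodeID, send func([]pieceID)) {
	c.mu.Lock()
	jobs := c.pending[node]
	delete(c.pending, node)
	c.mu.Unlock()

	var batch []pieceID
	for _, j := range jobs {
		batch = append(batch, j.pieces...)
	}
	if len(batch) > 0 {
		send(batch)
	}
	for _, j := range jobs {
		close(j.done)
	}
}

func main() {
	c := newCombiner()
	a := c.Enqueue("node-1", []pieceID{"p1", "p2"})
	b := c.Enqueue("node-1", []pieceID{"p3"})

	c.Flush("node-1", func(batch []pieceID) {
		fmt.Println("one request with", len(batch), "pieces")
	})
	<-a.done
	<-b.done
}
```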
Kaloyan Raev
27f811a9e1 metainfo: delete methods return the deleted item
This only happens if Read or List permission is granted together with
the Delete permission

Change-Id: I68b5f04a476bddabe499809ac98097aac75732a8
2020-03-16 16:26:16 +02:00
Qweder93
9f84261c36 storagenode/cache heldamount added
Change-Id: I7fc807789de63e8a9b8ca2018fd73bdb9e01ad0d
2020-03-16 00:28:35 +02:00
Qweder93
94c4d1e737 satellite/satellitedb/heldamount added, endpoint added
Change-Id: Ife8402b89f631f65ebb5cdf5ca02e99aa9b0b3ff
2020-03-13 18:15:52 +00:00
Stefan Benten
bd603c0751
satellite/payments: Improve Invoice Generation (#3800) 2020-03-13 17:07:39 +01:00
Bill Thorp
94c11c5212 satellite: remove some unnecessary UTC() calls
Fixes some easy cases of extraneous UTC() calls

Change-Id: I3f4c287ae622a455b9a492a8892a699e0710ca9a
2020-03-13 13:49:44 +00:00
Jeff Wendling
41887883f3 satellite/satellitedb: check indexes on migration
Change-Id: I5ba7ae2b512d77c70405ce332158f12128e27eed
2020-03-13 10:45:22 +00:00
Jess G
39cb821196
satellite/overlay: rm combinedcache, fix IP naming to be network (#3798)
* rm combinedcache, rm dns node lookup

Change-Id: I239f07211764b097d851230d8c81900a47756e9e

* excludeIPs -> excludedNetworks

Change-Id: Ifa6f44ab17457cdd5aff4cd5694296867c18b179

* use lowercase var name

Change-Id: I825aad2b718c71f455e747be18f8cabd02aabe55

* update Getnetwork name

Change-Id: I002a1b7bc6b4ef40159c0cd2b0ef209f80a9c503

* fix comments

Change-Id: Ibddf5b9ffa9d685af6c392d893db063ef18e45fa

* update comments with ipv6

Change-Id: I31758b7d4979e7c27d014668f4fb532ad838cda2

Co-authored-by: Stefan Benten <mail@stefan-benten.de>
2020-03-12 11:37:57 -07:00
littleskunk
02aee17cd9
accounting/projectlimit: reset at the beginning of the month (#3796)
Co-authored-by: Stefan Benten <mail@stefan-benten.de>
2020-03-11 23:00:58 +01:00
JT Olio
051569c69f
satellite: enable open registration (and add flag that disables it) SM-441
Change-Id: I47bfedb312089f6d2bfbab013bd74ad4b8aa5f5e
2020-03-11 03:53:34 +01:00
Jessica Grebenschikov
803e2930f4 satellite: use IP for all uplink operations, use hostname for audit and repairs
My understanding is that the nodes table has the following fields:
- `address` field which can be a hostname or an IP
- `last_net` field that is the /24 subnet of the IP resolved from the address

This PR does the following:
1) add back the `last_ip` field to the nodes table
2) for uplink operations remove the calls that the satellite makes to `lookupNodeAddress` (which makes the DNS calls to resolve the IP from the hostname) and instead use the data stored in the nodes table `last_ip` field. This means that the IP that the satellite sends to the uplink for the storage nodes could be approx 1 hr stale. In the short term this is fine, next we will be adding changes so that the storage node pushes any IP changes to the satellite in real time.
3) use the address field for repair and audit since we want them to still make DNS calls to confirm the IP is up to date
4) try to reduce confusion about hostname, ip, subnet, and address in the code base

Change-Id: I96ce0d8bb78303f82483d0701bc79544b74057ac
2020-03-11 09:11:40 -07:00
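To make the relationship between a resolved IP and its /24 network (the `last_net` field described in 803e2930f4) concrete, here is a small sketch using Go's standard `net` package. The helper name `lastNet` is hypothetical, and the sketch handles only IPv4; how the satellite actually derives the value is not shown in the commit message.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// lastNet returns the /24 network address for an already resolved IPv4
// address. Hostnames must be resolved by the caller first.
func lastNet(resolvedIP string) (string, error) {
	ip := net.ParseIP(resolvedIP)
	if ip == nil {
		return "", errors.New("not an IP address")
	}
	ip4 := ip.To4()
	if ip4 == nil {
		return "", errors.New("only IPv4 is handled in this sketch")
	}
	// Keep the first 24 bits and zero the last octet.
	return ip4.Mask(net.CIDRMask(24, 32)).String(), nil
}

func main() {
	network, err := lastNet("203.0.113.77")
	fmt.Println(network, err)
}
```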
JT Olio
520b16e824 satellite/console: allow for project limits even with open registration
Change-Id: I4d2528880638882ab8c427bd926e0c4f4b0a5bab
2020-03-11 12:56:55 +00:00
littleskunk
7aa30d2f06
accounting/projectlimit: remove expansion factor (#3795)
Co-authored-by: Ivan Fraixedes <ivan@fraixed.es>
2020-03-11 11:51:22 +01:00
Moby von Briesen
1baf1bd249 satellite/satellitedb: Add index on num_healthy_pieces column in injuredsegments table
We missed this in the migration that added the num_healthy_pieces
column. It exists in dbx, but not on the actual satellite table.

Change-Id: If16b5ec2325d56406250298531b3285215188bf3
2020-03-10 16:59:35 +00:00
VitaliiShpital
56c33f5193 satellite/payments: project charges api extended to show usage and period
Change-Id: I471def779d8b2a896fc43a692029233a2cd839b0
2020-03-10 18:39:05 +02:00
Michal Niewrzal
16878a22ea satellite/metainfo: stops hiding real validateAuth
Metainfo method validateAuth checks things like API key, user permission
and rate limit but at the end all errors were returned as
rpcstatus.Unauthenticated.

Old Metainfo is not touched to avoid backward compatibility issues.

Change-Id: I78eb276210fc50151da58a5c84e13ecd0961da29
2020-03-10 11:53:00 +00:00
paul cannon
79553059cb satellite/repair: put irreparable segments in irreparableDB
Previously, we were simply discarding rows from the repair queue when
they couldn't be repaired (either because the overlay said too many
nodes were down, or because we failed to download enough pieces).

Now, such segments will be put into the irreparableDB for further
and (hopefully) more focused attention.

This change also better differentiates some error cases from Repair()
for monitoring purposes.

Change-Id: I82a52a6da50c948ddd651048e2a39cb4b1e6df5c
2020-03-09 21:45:16 +00:00
Yingrong Zhao
20e96d417a satellite/metainfo: fix data race in test
fix flaky test: TestDeletePiecesService_DeletePieces_Timeout

Change-Id: Ia707b78adf65967f6466b034a0fbf79f7355c397
2020-03-09 14:59:44 +00:00
Michal Niewrzal
d7b5df70d3 cmd/uplink: remove unused flag
The new API has a limited number of options to configure at the moment.
We should remove unused flags from the Uplink CLI and add them back if
needed in the future.

Change-Id: Icf3f3dadd43cb61a3b408b02d0762aef34425dbf
2020-03-09 13:44:46 +00:00
Egon Elbre
0675413f7a satellite/satellitedb: increase migrate test timeout
Change-Id: I789ea22ad463a6c31737e959ec54941b66830188
2020-03-09 14:30:50 +02:00
Moby von Briesen
e4da7bd9cd satellite/repair/checker: use repair override if available in checker and irreparable
In production, the satellite is overriding the default repair threshold
(35) to a higher value (52). In some places in the checker and
irreparable processes, the repair threshold on the redundancy scheme is
used in place of the override value. This fixes those cases.

Change-Id: Ie7387217d9fb3886f050b5e5b67be51f276196de
2020-03-06 15:39:53 -05:00
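The fix in e4da7bd9cd amounts to resolving the threshold in one place before comparing. A minimal sketch follows, assuming that an override of zero means "not set" and that a segment needs repair when its healthy piece count is at or below the threshold; both conventions and both function names are assumptions of this example.

```go
package main

import "fmt"

// effectiveRepairThreshold prefers the configured override and falls
// back to the threshold stored on the redundancy scheme.
func effectiveRepairThreshold(schemeThreshold, override int) int {
	if override > 0 {
		return override
	}
	return schemeThreshold
}

// needsRepair compares the healthy piece count against the effective
// threshold rather than the scheme's value directly.
func needsRepair(healthy, schemeThreshold, override int) bool {
	return healthy <= effectiveRepairThreshold(schemeThreshold, override)
}

func main() {
	// Values from the commit message: default 35, production override 52.
	fmt.Println(needsRepair(40, 35, 0))  // scheme threshold only
	fmt.Println(needsRepair(40, 35, 52)) // override applied
}
```

With 40 healthy pieces, the segment is ignored when only the scheme's threshold of 35 is consulted, but it is picked up for repair once the override of 52 is applied.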
Bill Thorp
e99e675fb1 satellite/satellitedb: use time zones with all timestamps
The migration was broken into one migration per table to reduce table locking and reduce the
chances of failure due to SQL timeouts.

Of the 14 fields that lacked time zones, only the 3 named `interval_start` seemed to have non-UTC data in them.
These fields are fixed in the migration by removing the +00 and adding AT TIME ZONE current_setting('TIMEZONE').
Fields with good data are migrated by adding AT TIME ZONE 'UTC'.

Note that postgres's timezone() is different than cockroach's timezone() so AT TIME ZONE is used.

https://storjlabs.atlassian.net/browse/SM-104

Change-Id: I410f2f1d7c11b143f17844347f37e6f4b1e70fce
2020-03-05 21:11:25 +00:00
Jennifer Johnson
0d60c1a4b2 satellite/audit: fix checkSegmentAltered to detect segments that have changed during an audit
- Previously, checkSegmentAltered only checked for segments that were replaced
  but we want to detect all changes to a segment that occurred while an audit was being conducted.
- Fixed a bug where nodes failing audits during reverify for non-piece-hash-verified
  segments were not being removed from containment mode.
- Filled in gaps in reverify testing to ensure nodes are properly removed from containment.

Change-Id: Icd96d369278987200fd28581395725438972b292
2020-03-05 19:05:39 +00:00
Ivan Fraixedes
e6d452decd
satellite/accounting: Billing tests wait for SNs
The billing tests were flaky because some assertions ran before the
storage nodes finished their work.

A new helper function has been added to testplanet that waits for
storage node endpoints to finish their work. This function is now
used in the billing tests to avoid their flakiness.

This commit closes the ticket:
https://storjlabs.atlassian.net/browse/SM-403

A part of fixing other billing tests flakiness.

Change-Id: Iacb750af435f515c04b1e1d3510a218d184c9abc
2020-03-05 12:37:24 +01:00
Michal Niewrzal
9f390f37da satellite/metainfo: return default ciphers (path and encryption) for old
uplinks

The new libuplink does not store encryption values with the bucket, but
old uplinks use those values for configuration. If a bucket was created
with the new libuplink, we send back the satellite defaults.

Change-Id: Ie1bf3682847e07b302270b4c4bf1a7219f4bf011
2020-03-05 10:04:50 +00:00
Ivan Fraixedes
a7f927df96
satellite/accounting: Disable billing test
Disable a billing test that sometimes fails in the CI.

Change-Id: Ib77ff32060b2303822f36fdd1774d8a29d7d94a6
2020-03-05 10:46:29 +01:00
Jessica Grebenschikov
2af71f3460 satellite/orders: add monkit to looking up node addr
Change-Id: Ia0eb0ffc343879a6ef9827d46e936e1fbc2e198a
2020-03-04 23:15:18 +00:00
Fadila Khadar
5c9becb9be satellite/orders: billing partial download
Submit an order limit with a high amount but the order has a low amount of traffic.
Make sure the order amount is used for billing.

Change-Id: I6b6ae26e9b8896f4a3acf530b2f48510b6df89cc
2020-03-04 17:12:50 +00:00
Jennifer Johnson
1c1750e6be removes bandwidth limiting
On satellite, remove all references to free_bandwidth column in nodes table.
On storage node, remove references to AllocatedBandwidth and MinimumBandwidth and mark as deprecated.

Protobuf message, NodeCapacity, is left intact for backwards compatibility.
Once this is released to all satellites, we can drop the column from the DB.

Change-Id: I2ff6c6537fc9008a0c5588e951afea58ede85838
2020-03-04 14:04:00 +00:00
Egon Elbre
5f2ca0338b satellite/satellitedb: fix err and close order
Change-Id: Ied927275853c4cf4a8ccb500048d50545f6c6efe
2020-03-04 09:05:22 +00:00
Moby von Briesen
f495544c56 satellite/satellitedb/dbx: add fields to node table for placing nodes into suspended mode for too many unknown-error audits
Change-Id: Iac9a619e5c08377de87ffdf4acdd0155027f5eb3
2020-03-03 03:30:59 +00:00
Qweder93
484ec7463a storagenode: notifications on outdated software version
Change-Id: If19b075c78a7b2c441e11b783c3c09fed55060c7
2020-03-02 16:48:02 +00:00
igor gaidaienko
df88f416c9 satellite/accounting: Add test billing download traffic post deletion
Test checking that download traffic gets billed even if the file and bucket were deleted

Change-Id: Ifd67a8cd4b46d75ed48c86698e18c99f60bc39dc
2020-02-28 11:52:04 +00:00
Ivan Fraixedes
d64ef3d898 satellite/accounting: Test billing download/upload traffic
Add a test for checking that the billing:

* it doesn't include upload traffic
* it includes download traffic

Change-Id: I1655c15c1fad642f77dd210f2014b2586ae10104
2020-02-28 09:36:51 +00:00
Michal Niewrzal
4deab5ac6c satellite/metainfo: combine CommitSegment and CommitObject in batch v2
This change is a special case for batch processing. If CommitSegment
and CommitObject follow one another in a batch request, we can execute
them as one. This avoids the current logic, where we save the pointer
for CommitSegment and later delete it and save it again under the last
segment path for CommitObject.

This change should handle the issue we have in older uplinks with the
incorrect order of storing pointers.

Change-Id: I86514c95df169e6fbc91b52e5117472cae70cb8b
2020-02-28 07:40:36 +00:00
Jeff Wendling
1db087cfba satellite/satellitedb: migration to create tables for compensation
these tables are used in future commits with respect to the new
storagenode payments code. if we create them now, it will make
backfilling them with historical data easier.

Change-Id: I3c08c9770ec5b2baa38b4f2fd18c2f07746a61c2
2020-02-27 17:34:50 +00:00
Moby von Briesen
6043d01c90 satellite/audit/verifier: add metric for number of successfully downloaded shares
Change-Id: Ia4f1dc6e088db802e340aaecf80cc7ef6dc237a4
2020-02-27 14:33:59 +00:00
Jeff Wendling
2b9f28b029 satellite/accounting/reportedrollup: remove expiration check
Remove the check around consuming an expired serial so that we
have more time to run the migration. It does open a small race
of double spends for entries already counted and then added to
the pending queue right around when they're going to expire and
the consumed serials have already been removed, but that should
be rare if we keep the pending queue empty.

Change-Id: I000b15979b09c67751281ff675ea6c81fc9d22dc
2020-02-26 15:35:10 -07:00
Moby von Briesen
d5540c89a1 satellite/repair/checker: add monkit metrics for segments immediately above repair threshold
Record counts for segments at health=rt+1 through health=rt+5 for every checker
iteration.

Change-Id: I2a00c0bc34d17beb21cacdeab4dac77f755faefe
2020-02-26 20:27:15 +00:00
Egon Elbre
64330c55b3 all: use pbgrpc
common/pb moved grpc to a separate package common/pb/pbgrpc.
This updates this repository to use it.

Change-Id: I2de2a190688871cf9cb61f7ea511f8a01e264e4e
2020-02-26 21:27:47 +02:00
Egon Elbre
89e5c77d83 satellite/metainfo: track observer timing
Measure total time spent in each observer and distribution of handling
pointers by pointer type.

Change-Id: I2d125dfce8dbbb17225029fa35557bc106491151
2020-02-26 17:42:56 +00:00
Moby von Briesen
4e5a7f13c7 satellite/repair/queue: Prioritize selection of items off repair queue by segment health
Add a column to the repair queue table in the satellite db for healthy
piece count. When an item is selected from the repair queue, the least
durable segment that has not been attempted in the past hour should be
selected first. This prevents our repairer from getting stuck doing work
on segments that are close to the repair threshold while allowing
segments that are more unhealthy to degrade further.

The migration also clears the repair queue so that the migration runs
quickly and we can properly account for segment health in future repair
work.

We do not select items off the repair queue that have been attempted in
the past six hours. This was changed from one hour to allow us time to
try a wider variety of segments when the repair queue is very large.

Change-Id: Iaf183f1e5fd45cd792a52e3563a3e43a2b9f410b
2020-02-26 09:54:16 -05:00
paul cannon
92d86fa044 satellite/repair: fix repair concurrency
This new repair timeout (configured as TotalTimeout) will include both
the time to download pieces and the time to upload pieces, as well as
the time to pop the segment from the repair queue.

This is a move from Github PR #3645.

Change-Id: I47d618f57285845d8473fcd285f7d9be9b4318c8
2020-02-24 19:57:09 +00:00
Cameron Ayer
f22bddf122 {storagenode/contact, private/testplanet}: remove ErrFailureToStart and panic in testplanet.Start
Change-Id: I252e8c9407400af7bda95a7657c8154660c3c801
2020-02-24 18:24:23 +00:00
VitaliiShpital
8ea620b3c4 satellite/console: redirecting to login after activation implemented
Change-Id: Ibcf65f5d4664ac41c795f5ceb0a94bcd42673004
2020-02-24 19:52:28 +02:00
Jeff Wendling
f671eb2beb satellite/satellitedb: use queue for orders to get back fast billing
This change adds two new tables to process orders as fast as we used
to but in an asynchronous manner and with hopefully less storage
usage. This should help scale on cockroach, but limits us to one
worker. It lays the groundwork for the order processing pipeline to
be queue rather than database driven.

For more details, see the added fast billing changes blueprint.

It also fixes the orders db so that all the timestamps that are
passed to columns that do not contain a time zone are converted to
UTC at the last possible opportunity, making it less likely to use
the APIs incorrectly. We really should migrate to include timezones
on all of our timestamp columns.

Change-Id: Ibfda8e7a3d5972b7798fb61b31ff56419c64ea35
2020-02-24 17:07:07 +00:00
Qweder93
dca6fcbe28 satellite/payments/stripecoinpayments: credits added to invoice calculations
Change-Id: I6d3f5244a46f8945d2703af39ced333940db34e9
2020-02-24 16:48:27 +00:00
VitaliiShpital
985c3ef897 satellite/console: handling graphql errors bug fix
Change-Id: Ib20786485b0ea448e388912bb8406030d4fae1f7
2020-02-24 16:22:09 +00:00
Yingrong Zhao
a645e52ed9 satellite/metainfo: remove DeletePieces_node_id metric
Change-Id: I2cb10d411aa2912b256754a24d5c150e9536b4d3
2020-02-21 20:33:33 +00:00
Yaroslav Vorobiov
f185adcf7c satellite/payments: fix projects list pagination
Change-Id: I342e69a17be34a503c1e0cef18ee009f1921fcd4
2020-02-21 19:37:11 +02:00
Michal Niewrzal
54e38b8986 pkg/miniogw: gateway implementation with new libuplink
Change-Id: I170c3a68cfeea33b528eeb27e6aecb126ecb0365
2020-02-21 16:20:38 +01:00
Egon Elbre
5342dd9fe6 go.mod: update uplink
Change-Id: I867a6a1eef8aa5d60bb676e5112b98c4192ce811
2020-02-21 16:08:12 +02:00
Yaroslav Vorobiov
ea970e45ce satellite/payments: remove unused code
Change-Id: I2daaf5089bec000a6e995b8396d55528256aca6c
2020-02-20 16:04:19 +02:00
Yingrong Zhao
77f67a8086 satellite/metainfo: add timeout for delete request
Change-Id: I9cad6d7ea185fc2c0ed4e58b42e4e3a78178a79f
2020-02-20 09:10:16 +00:00
Yingrong Zhao
e6da8d0249 satellite/metainfo: use global limiter for DeletePieces Service
we want to return to the user as quickly as possible but also keep
deleting the remaining pieces on the storagenodes

Change-Id: I04e9e7a80b17a8c474c841cceae02bb21d2e796f
2020-02-19 12:17:36 +00:00
Cameron Ayer
3e70a893dd storagenode/{piecestore, contact}: report capacity to satellites if below specific threshold
Currently, storage nodes only report their capacity to satellites
once per hour. If a node fills up, it will fail all uploads until
the next contact cycle begins. With these changes, at the end of an
upload we check whether the MinimumDiskSpace threshold has been
passed. If so, trigger the monitor chore to update the node's
capacity, then trigger the contact chore to report the new
capacity to the satellites.

Change-Id: Ie6aadaade1e2c12c87e03f8ff9059a50121380a0
2020-02-18 15:42:48 -05:00
Ivan Fraixedes
1a84a00cc9 satellite/orders: Fix doc comments
Enhance the documentation of the UseSerialNumber method (interface and
implementation) and add several missing dots in doc comments of the
methods of the same interface and implementation.

Change-Id: I792cd344f0d2542e060fa2ec288b71231cae69de
2020-02-18 13:03:23 +01:00
Michal Niewrzal
dbe8428f9f satellite/metainfo: return NotFound on deleting a non-existing bucket
Change-Id: I7f466b5f824eab7b5146c2792f40cb2bcd7976a5
2020-02-18 09:05:30 +00:00
Egon Elbre
892b190db6 satellite/admin: add project limit modification and authorization token
Change-Id: If9a7214a940b8544f8023c2cd82da21f19d3f521
2020-02-17 07:56:16 +00:00
Egon Elbre
ef2f101495 satellite/metainfo: don't allow deleting non-empty bucket
Change-Id: I72a8b959e954c7f52e93fc8ea4006a957cc2941a
2020-02-14 14:36:22 +01:00
Yaroslav Vorobiov
827da1ae2b satellite/payments: fail when trying to consume consumed transactions
Change-Id: Ibb2528079ec917b7611b87a02972fb771937a025
2020-02-13 19:52:55 +00:00
Yaroslav Vorobiov
da58dc4a7a satellite/payments: increase batch size for transactions and account balance loops
Change-Id: I44712d26abde6c405ced35f103d1581423092737
2020-02-13 19:37:22 +00:00
Yaroslav Vorobiov
6c6e2eb8b3 satellite/payments: fix potential infinite loop in update account balance cycle
Change-Id: Ia4f9abe50b771ff6406e3a1ae76166e046bf63e5
2020-02-13 19:20:32 +00:00
Cameron Ayer
4e86951163 satellite/accounting: iterate over projects from tally rather than live accounting projects
At the end of tally iteration, in order to set the new live
accounting totals, we were iterating over all live accounting
projects. We found a bug with this when running storj-sim. If
we restarted the satellite, live accounting would be cleared
because storj-sim was running the live accounting redis instance.
Since live accounting was cleared, at the end of tally, even if
it found data in projects, we would not update the live accounting
totals because we were iterating over the projects from live
accounting to do so. We now iterate over projects found from tally
in order to update live accounting.

We also found that if a user deleted everything from their project,
tally would not find it and the live accounting would not be updated.
For this reason, we merge live accounting projects into tally projects.

Change-Id: If0726ba0c7b692d69f42c5806e6c0f47eecccb73
2020-02-13 12:57:46 -05:00
Yingrong Zhao
f9189f8d94 satellite/console: only create user with registration token
we should only allow new users to register with a registration
token

Change-Id: Iea579976f1e7aa98799693a90401b31a7915bb22
2020-02-13 17:23:03 +00:00
JT Olio
2ae9978304 satellite/gc: skip first gc run
rationale: if GC kills the satellite, it would be nice to make
it through a repair checker sweep first

Change-Id: Id56171dc8e13940cfb6481e36a910bad077a01ed
2020-02-13 13:41:15 +02:00
Qweder93
eeaaa8aa98 satellite/payments/stripecoinpayments: added ApplyInvoiceCredits
Change-Id: I7ed9d8397c0aa59d4ce0d40d1e50d13929e0fe5f
2020-02-12 20:06:08 +02:00
Ivan Fraixedes
c4fd84ad3e satellite/metainfo: Add metrics and traces DeletePieces
Trace the calls to the DeletePiecesService.DeletePieces method and add
metrics for the rate at which each storage node is dialed and the time
spent dialing storage nodes.

These statistics will help us find out whether we should implement
connection queues to storage nodes to reduce the deletion time, in case
we see that we're spending too much time dialing frequently used storage
nodes.

Ticket: https://storjlabs.atlassian.net/browse/SM-85
Change-Id: I9601676c3a8ad96c73c93833145929e4817755e2
2020-02-12 15:38:50 +00:00
littleskunk
76849558cb satellite/gracefulexit: increase performance and tolerate higher error
rate

Graceful exit is very slow at the moment. Over the last couple of days we
increased the batch size on Stefan's satellite to 1000, but as a side
effect the error rate increased. With a batch size of 500 the error
rate looks stable.
This PR will increase the default batch size to 300. Graceful exit
will still be painfully slow, but at least it will be a bit faster. At the
same time this PR also increases the number of errors we tolerate. We
don't want to DQ slow storage nodes just because they didn't finish all
300 transfers in time. We want to give them more retries.

Change-Id: I92e3f99e116d4988457d8b902a88e85ed1bcc1a7
2020-02-12 11:40:15 +00:00
Kaloyan Raev
37cf42a9ae satellite/metainfo: overwrite zombie segments
Fixes https://storjlabs.atlassian.net/browse/USER-240

- Adds UnsynchronizedPut method to metainfo service that overwrites any
existing pointer under the same path
- Uses UnsynchronizedPut in the metainfo endpoint for committing the
segments

Change-Id: Icb43f31ea33f14066ca9dfdcf226eb3079b90948
2020-02-12 11:10:38 +00:00
Egon Elbre
dbf46c4aa7 satellite/admin: administrative endpoint
Admin server allows creating basic REST and HTML APIs
for different administrative tasks.

Change-Id: I3dc1786abe1c87350eed60ec90e48130f44e63cf
2020-02-12 12:12:50 +02:00
Jeff Wendling
2d2f5e1a7f satellite/satellitedb/dbx: remove typo in dbx file and format it
Change-Id: I756315d6228ac9edd35cad8b496d36ecf2b5d26f
2020-02-11 14:15:13 -07:00
Cameron Ayer
f10b22eae9 accounting/tally: if delta < 0, delta = 0
if redis crashed in the middle of tally we could have a situation
where we erroneously subtract from a project total. Currently,
`latest` should never be less than `initial`

Change-Id: Ibb5ab724ac0ad4d684f7954fad7a9e061104b7df
2020-02-11 19:48:55 +00:00
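
The clamp named in the title of f10b22eae9 can be sketched in Go as follows. This assumes the delta is computed as `latest - initial`, which the commit message implies but does not state; `tallyDelta` is a hypothetical name, not the function in accounting/tally.

```go
package main

import "fmt"

// tallyDelta returns how much a project's total grew between the value read
// at the start of a tally run (initial) and the value read at the end
// (latest). Assumption: delta = latest - initial. A negative result is
// clamped to zero so it can never be subtracted from a project total.
func tallyDelta(initial, latest int64) int64 {
	delta := latest - initial
	if delta < 0 {
		delta = 0
	}
	return delta
}

func main() {
	fmt.Println(tallyDelta(10, 25)) // normal case: latest >= initial
	fmt.Println(tallyDelta(25, 10)) // cache was reset mid-run: clamped
}
```
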
Cameron Ayer
33d696b096 storage/redis/redisserver: simplify redisserver creation
Change-Id: I881576a7881db671b5abeeca7120a022987cc47f
2020-02-11 19:11:57 +00:00
Cameron Ayer
b22bf16b35 satellite/overlay: add config flag for node selection free disk requirement
Currently SNs report their free disk space once per hour. If a node
becomes full, it has to wait until the next contact cycle begins to
report; all the while receiving and failing upload requests. By increasing
the minimum required disk space, we can give the storage nodes more time
to report their space before they completely fill up. This change goes
hand-in-hand with another change we want to implement: trigger capacity
report on SN immediately upon falling below threshold.

Change-Id: I12f778286c6c3f582438b0e2949765ac43325e27
2020-02-11 18:08:25 +00:00
Simon Guindon
961944f24d satellite/orders: Resolve storage node addresses to IP addresses.
This change resolves all the storage node addresses to their IP addresses
before giving them to the uplink so that the uplink doesn't have to resolve
a hundred hosts and can immediately connect to improve uplink performance.

Change-Id: Idb834351e0fece409d74c8a1c29b0b8c9b09c9ff
2020-02-11 18:44:45 +02:00
Egon Elbre
429f08b4f0 satellite: add Admin peer
This peer will contain our administrative panels.
It's completely separated from our other satellite
processes because it allows better control for restricting
access to it.

Change-Id: Ifca473bee82ff6c680b346918ba32b835a7a6847
2020-02-11 16:15:33 +00:00
Michal Niewrzal
426c8eb31a private/testplanet: add DeleteBucket method for uplink
New method added to make it easy to delete a bucket during tests.

Change-Id: Iaae89618cc676ddbbbd4b0df2eeacd143ea6f3c2
2020-02-11 15:58:13 +00:00
Yaroslav Vorobiov
bd9cebda5b satellite/payments: fix transaction list pagination
Change-Id: I533f637e5cb12b47d7f7248f8bf7de93bd8be000
2020-02-11 16:22:53 +02:00
Ethan
208c05e3db Add metrics to track rate limit.
Add monkit metric for the rate-limit when the rate limit is hit
Logs warning with projectID

https://storjlabs.atlassian.net/browse/SM-165

Change-Id: I352dc40006021990d1bc66a999f62bbf8deb54db
2020-02-11 14:02:12 +00:00
Egon Elbre
ccd8b7f107 satellite/satellitedb: add benchmark for satellitedb setup and close
Change-Id: Ifb561f2eb81e439ea7cfa2ca2dad6b15aa50417e
2020-02-11 13:30:23 +00:00
Yaroslav Vorobiov
984ed26737 satellite/payments: fix invoice project records pagination
Change-Id: I68de69de78256280a6bbf0b744963b9c8c813007
2020-02-11 14:31:55 +02:00
Qweder93
dc075eaa96 satellite/payments: deposit bonuses (credits) added
Change-Id: Ib151bbb9b02d655fa619c53bfbc04ed6f3bb39e0
2020-02-11 11:11:42 +00:00
Yingrong Zhao
3331b443e7 satellite/metainfo: Delete all the pieces of a storage node in one single
request

Change-Id: Ia8758d36f1a113b545e4f746d74d172421f14b24
2020-02-11 00:28:30 +00:00
Natalie Ventura Villasana
3900dadafd satellite/overlay: find new nodes with ExcludedIPs
Adds ExcludedIPs to the NodeCriteria for selecting new storage
nodes. Previously, ExcludedIPs was only added to the NodeCriteria
for selecting reputable storage nodes. Now that both are included
in the FindStorageNodesWithPreferences call, it should no longer
be possible to repair pieces to nodes that are on the same IP as
nodes already storing pieces from that segment.
Adds TestSelectNewStorageNodesExcludedIPs to make sure that
SelectNewStorageNodes returns nodes with different IP addresses.

https://storjlabs.atlassian.net/browse/V3-3011

Change-Id: Ic2d5e607cadeba6e8d5c40f9717149cb30880335
2020-02-10 23:45:17 +00:00
Moby von Briesen
c4a9a5d48b satellite/downtime: update detection and estimation downtime chores for
more trustworthy downtime tracking

Detection chore: Do not update downtime at all from the detection chore.
We only want to include downtime between two explicitly failed ping attempts
(the duration between last contact success and the first failed ping is no longer
included in downtime calculation)

Estimation chore: If the satellite started after the last failed ping for a node,
do not include offline time since the last failed ping time - only
estimate based on two failed pings with no satellite downtime in
between.
This protects us from including satellite downtime in our storagenode downtime calculations.

Change-Id: I1fddc9f7255a7023e02474255d70c64faae75b8a
2020-02-10 22:37:01 +00:00
NikolaiYurchenko
6679036ace web/satellite: unauthorized error handled
Change-Id: I12c6937ed1660af097d6930fe2a90fac5f298311
2020-02-10 11:14:51 +00:00
Cameron Ayer
13903449c7 satellite/accounting: fix flaky TestProjectUsageStorage
Sometimes the upload that is supposed to fail due to excess usage
would pass. This looks to be because it's overwriting another object
uploaded earlier in the test and deleting the old pointer. If tally
happened to run after the pointer is deleted but before the current
upload reaches the live accounting check, it might pass through.
The solution is to upload to a different path each time.

Change-Id: Ie6c825b9c6eab9ed53426ae262e7997bcb6beb7f
2020-02-07 20:58:24 -05:00
Cameron Ayer
75355547c2 satellite/satellitedb: don't include GET_AUDIT and GET_REPAIR with chargeable BW
In the methods we use to retrieve a user's chargeable BW, we were summing GET, GET_AUDIT,
and GET_REPAIR. We only want to charge for GET.

Change-Id: Icead7695494b22c7c835482cf8b1512a980d59f1
2020-02-07 12:02:44 +00:00
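
The rule from 75355547c2 (charge for GET only, not GET_AUDIT or GET_REPAIR) can be sketched in Go as follows. The real change is in SQL queries inside satellitedb; the `action` constants, the `bandwidth` type, and `chargeableBytes` below are hypothetical stand-ins used only to show the filter.

```go
package main

import "fmt"

// action is a hypothetical stand-in for the bandwidth action of a rollup row.
type action int

const (
	actionGet action = iota
	actionGetAudit
	actionGetRepair
)

// bandwidth is a hypothetical rollup row: an action and the bytes settled.
type bandwidth struct {
	Action action
	Bytes  int64
}

// chargeableBytes sums only user-initiated downloads (GET). Audit and
// repair traffic is caused by the satellite, so it is left out.
func chargeableBytes(rows []bandwidth) int64 {
	var total int64
	for _, r := range rows {
		if r.Action == actionGet {
			total += r.Bytes
		}
	}
	return total
}

func main() {
	rows := []bandwidth{
		{actionGet, 100},
		{actionGetAudit, 50},
		{actionGetRepair, 25},
		{actionGet, 1},
	}
	fmt.Println(chargeableBytes(rows))
}
```
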
Jeff Wendling
7999d24f81 all: use monkit v3
this commit updates our monkit dependency to the v3 version where
it outputs in an influx style. this makes discovery much easier
as many tools are built to look at it this way.

graphite and rothko will suffer some due to no longer being a tree
based on dots. hopefully time will exist to update rothko to
index based on the new metric format.

it adds an influx output for the statreceiver so that we can
write to influxdb v1 or v2 directly.

Change-Id: Iae9f9494a6d29cfbd1f932a5e71a891b490415ff
2020-02-05 23:53:17 +00:00
Jeff Wendling
d20db90cff private/dbutil/txutil: create new transactions for retries
it was noticed that if you had a long lived transaction A that
was blocking some other transaction B and A was being aborted
due to retriable errors, then transaction B was never given
priority. this was due to using savepoints to do lightweight
retries.

this behavior was problematic because we had some queries blocked
for over 16 hours, so this commit addresses the issue with two
prongs:

    1. bound the amount of time we will retry a transaction
    2. create new transactions when a retry is needed

the first ensures that we never wait for 16 hours, and the value
chosen is 10 minutes. that should be long enough for an ample
amount of retries for small queries, and huge queries probably
shouldn't be retried, even if possible: it's preferable to
find a way to make them smaller.

the second ensures that even in the case of retries, queries that
are blocked on the aborted transaction gain priority to run.

between those two changes, the maximum stall time due to retries
should be bounded to around 10 minutes.

Change-Id: Icf898501ef505a89738820a3fae2580988f9f5f4
2020-02-01 18:34:28 +00:00
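
The two prongs described in d20db90cff (bound the total retry time, and start from scratch on every retry) can be sketched in Go as follows. This is a simplified illustration without a database: `withRetry`, `errRetriable`, and `flakyThenOK` are hypothetical names, and each call to the attempt function stands in for "begin a new transaction, run the body, commit". It is not the txutil implementation.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// errRetriable is a hypothetical marker for errors that are safe to retry,
// such as a serialization conflict.
var errRetriable = errors.New("retriable")

// withRetry calls attempt until it succeeds, fails with a non-retriable
// error, the context is done, or maxWait has elapsed since the first try.
// Every call to attempt is expected to open its own fresh transaction
// instead of rolling back to a savepoint inside a long-lived one.
// A real implementation would likely also back off between attempts.
func withRetry(ctx context.Context, maxWait time.Duration, attempt func() error) error {
	start := time.Now()
	for {
		err := attempt()
		if err == nil || !errors.Is(err, errRetriable) {
			return err
		}
		if ctxErr := ctx.Err(); ctxErr != nil {
			return ctxErr
		}
		if time.Since(start) >= maxWait {
			return fmt.Errorf("retry budget of %s exhausted: %w", maxWait, err)
		}
	}
}

// flakyThenOK simulates a transaction body that hits two retriable
// conflicts before succeeding, and reports how many attempts were made.
func flakyThenOK() (int, error) {
	calls := 0
	err := withRetry(context.Background(), 10*time.Minute, func() error {
		calls++
		if calls < 3 {
			return errRetriable
		}
		return nil
	})
	return calls, err
}

func main() {
	calls, err := flakyThenOK()
	fmt.Println(calls, err)
}
```

The 10-minute budget matches the value given in the commit message; everything else is illustrative.
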
Egon Elbre
97d360afd1 satellite/satellitedb: use correct type
Array was using a smaller type integer.

Change-Id: I025d61b6cea9869efa0b4ac1d24265356491f6dc
2020-01-31 13:00:14 -05:00
Moby von Briesen
006a2824ba satellite/repair: lock monkit stats in checker and repairer
Change-Id: Ia10fc8da0177389a500359ce51d21a5806f3f7b1
2020-01-30 14:09:56 +00:00
Egon Elbre
8dea4f52db satellite: add control panel
Change-Id: Id48246e9bcd4c6ec643277fe740937b2e42ad85b
2020-01-30 08:06:43 -05:00
Egon Elbre
4e2bf81719 pkg/debug: add better title
Change-Id: Icc6114f4e7523cfe6c7984ef1f6eec664ae4ee65
2020-01-30 07:49:40 -05:00
Egon Elbre
d10d6fd153 storagenode,satellite: ignore error on listening debug port
Change-Id: Id3a6d153535776ce41f8edf2bd6f6dad5e2a60bf
2020-01-29 18:06:02 -05:00
crawter
0b898c48a4 satellite/payments: coupons expiration logic fix
Change-Id: Ic8cc4e117957a75a3eb057075204a5b592e62ff4
2020-01-30 00:22:38 +02:00
Egon Elbre
f237d70098 storagenode,satellite: use pkg/debug
Use debug.Server in storage node and satellite for customizing debug server.

Change-Id: I7979412376d028cadf29656d838ab94f18e2aa99
2020-01-29 16:30:31 -05:00
littleskunk
e0cb8037c1 satellite/projectusage: reduce usage limit from 5GB to 0GB
Change-Id: Ie3d2509613e7a4336e2a8d2b136b32f5f308aafc
2020-01-29 20:38:39 +00:00
crawter
9bb7ceb651 satellite/payments: amount for coupons increased
Change-Id: I1f357b76361e6e3e50bbe4ee66a8edb6ff033f36
2020-01-29 22:08:54 +02:00
crawter
f4667426b5 satellite/payments: project limits for coupons increased
Change-Id: I51eb47eb635fd096348befd39b7efbe3ce8982d6
2020-01-29 19:34:50 +02:00
NikolaiYurchenko
e641ff45a5 web/satellite: logout fix
Change-Id: I1b2b14c098e0959e9c5bd36adc889a425d00963c
2020-01-29 16:53:21 +00:00
Ethan
149273c63f satellite/metainfo: add cache expiration for project level rate limiting
Allow rate limit project cache to expire so we can make project level rate limit changes without restarting the satellite process.

Change-Id: I159ea22edff5de7cbfcd13bfe70898dcef770e42
2020-01-29 16:14:10 +00:00
Stefan Benten
d30d2d920d satellite/metainfo: Adding Monkit Meters to the Request Logs
Change-Id: I33d56510cf72d5f8512c1069ce65856cba7f8957
2020-01-29 15:51:36 +00:00
ccase
e87886696e satellite/metainfo: Too many requests should have RPC status ResourceExhausted
This is necessary for the client to know that it can retry with a
delay.

Change-Id: Ie0ed95f6ae1c072896285d0714f879611ab0cdb3
2020-01-29 15:06:22 +00:00
crawter
e549e32976 satellite/payments: fix promotional coupons
Change-Id: Ib8b7e38f2cb07085655448264f281fd7fc7867dd
2020-01-29 16:40:43 +02:00
Yaroslav Vorobiov
6b72bf92ce satellite/payments: convert egress price to per byte basis
Change-Id: Ia3a07d0afa5d9d55871996a1d2117b4ec290ce8f
2020-01-29 00:06:01 -05:00
Yaroslav Vorobiov
083b396c16 satellite/payments: allow floating point numbers for pricing
Change-Id: I78b60134cf043746efef5371b761939a10f75aaf
2020-01-28 22:52:13 -05:00
littleskunk
a0c9f7f3b0 satellite/projectusage: reduce usage limit from 25GB to 5GB
Change-Id: I2819012b520fd687ab8058000aa38d76b8208158
2020-01-29 04:01:09 +01:00
Egon Elbre
e66b3c9be1 satellite: remove repair worker from core
Core shouldn't be handling any repair load and we have already disabled it in production.
Let's make it official and remove it.

Change-Id: I46e236692a9164421648cfc974dd3246416b2e00
2020-01-28 20:02:30 -05:00
Egon Elbre
e319660f7a private/lifecycle: implement Group
lifecycle.Group implements controlling multiple items such
that their startup and close works.

Change-Id: Idb4f4a6c3a1f07cdcf44d3147a6c959686df0007
2020-01-29 00:37:33 +00:00
Jessica Grebenschikov
a1948ed338 satellite/orders: add old method for CreateGetOrderLimitsOld to maintain compatibility with old versions of the uplink
Change-Id: I7ce1f4fbc6217f1d340cf778c4b010d40961b3f0
2020-01-28 18:54:24 -05:00
Jessica Grebenschikov
54dbaaece2 satellite/orders: create as many orderLimits as needed to download a file
Change-Id: I2a39483d35037d9940913c035a78a93ea692ce9f
2020-01-28 20:04:11 +00:00
paul cannon
8ce9ce7f0f satellite/gracefulexit: wait for errgroup to return
credit to Yingrong

Change-Id: I538371040d4dcdf6e943c61e8454320fd57b7526
2020-01-28 19:26:43 +00:00
Michal Niewrzal
90fc1922d0 satellite/metainfo: override bucket RS values with satellite config
Satellite now is keeping RS values for uplink but old uplinks were using
default bucket settings. Because of that we need to override buckets
settings with satellite settings to avoid breaking older uplinks.

Change-Id: Ia1068db70e4adbf741c5e81d27d9e39799049c22
2020-01-28 15:51:04 +00:00
Jennifer Johnson
2209924d41 satellite/satellitedb: use arrays and batch inserts for SaveTallies query
Cockroachdb is more performant with multi-row inserts

Change-Id: Ie1ce2a9da0be1df4e66e72fc9cae49cbd95023f3
2020-01-27 16:54:20 -05:00
Egon Elbre
227e03dea1 satellite/satellitedb: insert using arrays
Using dynamic query strings is error prone, prefer arrays.

Change-Id: I303fbf21c6a54795bd9f399371943b5c51e6f863
2020-01-27 21:27:28 +00:00
Jeff Wendling
d09bd4a749 satellite/satellitedb/dbx: regenerate with paged composite key fixes
before dbx would generate a complicated blob of conditions that
encoded a row comparison, which only optimized to an index seek
on cockroachdb. this means that sqlite and postgres both had
quadratic behavior on paged queries of this form. instead, use
the implicit row construction feature supported in all of the
databases to do paged support so that they all optimize well.

Change-Id: Iac8703929ba2a59ee3ffa619b916d12663422887
2020-01-27 12:43:16 -07:00
Yingrong Zhao
f3fcbe256c satellite/metainfo: revert combine CommitSegment and CommitObject in batch
This reverts commit 8772867855.

for uplink versions v0.25.0 through v0.30.7, there's a bug with multi-segment uploads
where the last segment is inline caused by this commit.

Change-Id: If375e186b23265586caf08991c25980e99f3cc1a
2020-01-27 13:26:33 -05:00
NikolaiYurchenko
9bcb81108f web/satellite: verification email change
Change-Id: I0293ef4411b55e42bb372b230d797d6798eda515
2020-01-27 15:55:52 +02:00
Michal Niewrzal
ca32ffbfc5 satellite/metainfo: move deletion before upload to satellite
Change is adding object deletion to BeginObject request (before upload).
Now when satellite controls deletion we can move deletion before upload
to satellite. This change improves two things:
* no need for additional request to delete object before upload (need
one more change to storj/uplink)
* fix an issue with lack of permissions to upload if caveat allows only
for writing (e.g. disallow deletes but allows to write)

https://storjlabs.atlassian.net/browse/V3-3362

Change-Id: Ic453146298cdd302df290c532123731a3f99e38e
2020-01-27 12:48:10 +00:00
paul cannon
a0a94a9ac7 satellite/satellitedb: insert into reported_serials w/ arrays
Change-Id: Icb682de09ded3e3159e3590594dcf13f2e7f40f0
2020-01-24 18:36:21 -06:00
littleskunk
90cf78e6f2 satellite/coinpayments: fix migration
The old migration was not working. It was updating pending (status 0)
and failed (status -1) to completed (status 100).

Change-Id: I808ff3cc692fe6c698ce26a8b411b134e67b752b
2020-01-25 00:12:35 +00:00
littleskunk
a6c6440ab7 satellite/order: decrease expire time from 7 days to 2 days
For the last few months we had no issues with order submission. I would
call it stable and now it is time to risk a lower expire time. This will
increase the database performance on the satellite and it will reduce
the delay for billing.

The long term goal is 6h but for that step we need to change graceful
exit first. At the moment storage nodes would get disqualified for not
transferring all pieces in less than 6 hours.

Change-Id: I421a2c2421c5374c4e706e2338f1c2161fedc14c
2020-01-24 23:37:39 +00:00
Jeff Wendling
26e33e7e07 satellite/gracefulexit: make orders with right bucket id and action
paths are organized as follows:

    project_id/segment_index/bucket_name/encrypted_key

so by picking parts[0] and parts[1], we were using the segment
index instead of the bucket name, causing bandwidth to be
accounted for incorrectly. additionally, we were using the
PUT action instead of the PUT_GRACEFUL_EXIT action, causing
the data to be charged incorrectly. we use PUT_REPAIR for
now because nodes won't accept uploads with PUT_GRACEFUL_EXIT
and our tables need migrations to handle rollups with it.

Change-Id: Ife2aff541222bac930c35df8fcf76e8bac5d60b2
2020-01-24 19:27:38 +00:00
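
The indexing mistake described in 26e33e7e07 can be sketched in Go as follows, using the path layout quoted in the commit message. The function name `bucketID`, the sample path, and the choice to join project ID and bucket name with a slash are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// bucketID extracts "project_id/bucket_name" from a path laid out as
//
//	project_id/segment_index/bucket_name/encrypted_key
//
// parts[1] is the segment index, so the bucket name is parts[2].
// SplitN with a limit of 4 keeps any slashes in the encrypted key intact.
func bucketID(path string) (string, error) {
	parts := strings.SplitN(path, "/", 4)
	if len(parts) < 3 {
		return "", errors.New("path has fewer than three components")
	}
	return parts[0] + "/" + parts[2], nil
}

func main() {
	id, err := bucketID("proj/l/photos/enc/key")
	fmt.Println(id, err)
}
```

Taking parts[0] and parts[1] on the sample path would yield "proj/l", which is the project ID plus the segment index rather than the bucket.
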
Cameron Ayer
494fead7af satellitedb/orders: fix comma bug in SQL stmt
Change-Id: Ibc6024eeeb5aa4de3909c0cec2d01ac0a01c809f
2020-01-24 13:58:32 -05:00
Ivan Fraixedes
f5c9597d29 golangci: Enable new linter added to last release
Enable a new golangci-lint linter that has been added to the last
release. It reports very few issues, so they are fixed in
this commit.

Change-Id: I74fef4779c3f592aae19103fd9f70103586fe24e
2020-01-24 18:09:37 +00:00
Ivan Fraixedes
d5a60aec58 satellite/metainfo: Delete segments in reverse order
Change DeleteObjectPieces for deleting the segments' pointers of an
object in a reverse order.

Last segment: L
N: total number of segments

Deleting in reverse order is: L, n-1 to 0

Deleting in reverse order makes BeginDeleteObject usable to delete
partially uploaded objects that were interrupted (e.g. upload
cancellation).

With this change, the uplink upload cancellation can be changed to use
BeginDeleteObject to clean up already uploaded segments without having to
retrieve orders and dial every single node which stored a piece.

Ticket: https://storjlabs.atlassian.net/browse/V3-3525
Change-Id: Ieca6fd3801c4b71671811cb5f08a99d5146928a6
2020-01-24 16:05:12 +02:00
Jeff Wendling
665ed3b6b1 satellite/satellitedb: fix issue with shared memory on range for bucket rollups
A uuid.UUID is an array of bytes, and slicing it refers to the
underlying value, much like taking the address. Because range
in Go reuses the same value for every loop iteration, this means
that later iterations would overwrite earlier stored project
ids. We fix that by making a copy of the value before slicing it
for every loop iteration.

Change-Id: Iae3f11138d11a176ce360bd5af2244307c74fdad
2020-01-23 21:57:02 -07:00
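
The fix described in 665ed3b6b1 (copy the value before slicing it inside a range loop) can be sketched in Go as follows. `ID` is a hypothetical four-byte stand-in for uuid.UUID and `collectCopied` is an illustrative name. Note that the shared loop variable applies to Go versions before 1.22; from Go 1.22 on, each iteration gets its own variable, so the explicit copy is redundant there but still harmless.

```go
package main

import "fmt"

// ID is a hypothetical stand-in for uuid.UUID: a fixed-size byte array.
type ID [4]byte

// collectCopied returns one byte slice per ID. Slicing an array variable
// yields a slice that points at that variable's memory, so each iteration
// first copies the loop value into a fresh variable. Without the copy, on
// Go versions before 1.22 every slice would alias the single loop variable
// and end up showing the last ID.
func collectCopied(ids []ID) [][]byte {
	out := make([][]byte, 0, len(ids))
	for _, id := range ids {
		id := id // per-iteration copy with its own backing array
		out = append(out, id[:])
	}
	return out
}

func main() {
	ids := []ID{{1, 1, 1, 1}, {2, 2, 2, 2}, {3, 3, 3, 3}}
	for _, b := range collectCopied(ids) {
		fmt.Println(b)
	}
}
```
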
ccase
a9e4c6f66d satellite/satellitedb/dbx: Remove bashism from gen.sh
Change-Id: Ia698edae99d7ff0c73fa457b4a3c0a7b5f0bbec5
2020-01-23 17:09:07 -05:00
Isaac Hess
44de90ecc8 storagenode/pieces: Rename vars and update comments
A few variables were not renamed to the new standard piecesTotal and
piecesContentSize, so it was unclear which value was being used. These
have been updated, and some comments made more thorough.

Change-Id: I363bad4dec2a8e5c54d22c3c4cd85fc3d2b3096c
2020-01-23 11:00:24 -07:00
Isaac Hess
14fd6a9ef0 storagenode/pieces: Track total piece size
This change updates the storagenode piecestore apis to expose access to
the full piece size stored on disk. Previously we only had access to
(and only kept a cache of) the content size used for all pieces. This
was inaccurate when reporting the amount of disk space used by nodes.

We now have access to the total content size, as well as the total disk
usage, of all pieces. The pieces cache also keeps a cache of the total
piece size along with the content size.

Change-Id: I4fffe7e1257e04c46021a2e37c5adc6fe69bee55
2020-01-23 11:00:24 -07:00
Isaac Hess
40a890639d satellite/orders: Flush all pending bandwidth rollup writes on shutdown
Currently we risk losing pending bandwidth rollup writes even on a clean
shutdown. This change ensures that all pending writes are actually
written to the db when shutting down the satellite.

Change-Id: Ideab62fa9808937d3dce9585c52405d8c8a0e703
2020-01-23 08:12:41 -07:00
Isaac Hess
960e103082 satellite/orders: Rename orders_write_cache to rollups_write_cache
Change-Id: Icffca37e40bb8b2927b38d97728575321c2ad90c
2020-01-23 08:12:41 -07:00
Isaac Hess
0548c3f6bf satellite/orders: RollupsWriteCache has a single method to reset cache
Change-Id: I3ae18115dccd7ac8369313bd96951b9da6464cf3
2020-01-23 08:12:41 -07:00
Egon Elbre
c6f94ce9e4 satellite/metainfo: remove support for boltdb based pointerDB
By previous changes we can now remove testplanet.New and
also remove metainfo boltdb support.

Change-Id: I5bdfbbbb45967492728e705b34b2fedb4f28c381
2020-01-23 13:54:00 +02:00