Commit Graph

1095 Commits

Author SHA1 Message Date
Jess G
75b9a5971e
satellite: update log levels (#3851)
* satellite: update log levels

Change-Id: I86bc32e042d742af6dbc469a294291a2e667e81f

* log version on start up for every service

Change-Id: Ic128bb9c5ac52d4dc6d6c4cb3059fbad73f5d3de

* Use monkit for tracking failed ip resolutions

Change-Id: Ia5aa71d315515e0c5f62c98d9d115ef984cd50c2

* fix compile errors

Change-Id: Ia33c8b6e34e780bd1115120dc347a439d99e83bf

* add request limit value to storage node rpc err

Change-Id: I1ad6706a60237928e29da300d96a1bafa94156e5

* we cant track storage node ids in monkit metrics so lets use logging to track that for expired orders

Change-Id: I1cc1d240b29019ae2f8c774792765df3cbeac887

* fix build errs

Change-Id: I6d0ffe058e9a38b7ed031c85a29440f3d68e8d47
2020-04-15 12:32:22 -07:00
Egon Elbre
d3ce845f82 satellite: revert log lines used to figure out node id
Currently storj-sim relies on the log lines to be exactly the same,
when they change it cannot find the necessary information from log.

Change-Id: Ia039915ef3375a7cf60f107b2c05c958de15b6d5
2020-04-15 17:07:56 +03:00
Kaloyan Raev
a2ce836761 remove sugar logging
Change-Id: I6b6ca9704837cb3f5f5449ba7f55661487814d9f
2020-04-15 12:37:47 +00:00
VitaliiShpital
158013a866 satellite/console: redirect on account activation
Change-Id: I2506ce0fd3832bf46fbcdcc5a42bb83dc926e99a
2020-04-15 11:49:50 +00:00
Egon Elbre
2c0d61b18e satellite/metainfo: avoid temporary list
Currently ListV2 loaded the whole data into memory, even when all the
data wasn't being used, using up more memory than needed.

Change-Id: I5846d979344729b447c108a6cc9f4227229ec981
2020-04-15 08:01:42 +00:00
Jess G
5ea1602ca5
satellite/overlay: add selected node cache (#3846)
* init implementation cache

Change-Id: Ia54a1943e0707a77189bc5f4a9aaa8339c98d99a

* one query to init cache

Change-Id: I7c04b3ae104b553ae23fca372351a4328f632c66

* add monit tracking of cache

Change-Id: I7d209e12c8f32d43708b23bf2126c5d5098e0a07

* add first test

Change-Id: I0646a9349d457a9eb3920f7cd2d62fb72ffc3ab5

* add staleness to cache

Change-Id: If002329bfdd53a4b200ad14dbd2ffc8b280aedb8

* add init test

Change-Id: I3a3d0aa74cfac1d125fa93cb749316ed2a74d5b1

* fix comment

Change-Id: I73353d00ccf0952b38c0f8ef7d1755c15cbfe9d9

* mv to nodeselection pkg

Change-Id: I62487f768296c7a7b597fa398a4c42daf6e9c5b7

* add state to cache

Change-Id: I081e77ec0e16706faee1a267de9a7fa643d6ac11

* add refresh concurrent test

Change-Id: Idcba72508291099f280edc65355273c0acc3d3ce

* add a few more tests

Change-Id: I9422e9eaa22bf01c11f14bdb892ebcf7b3e5e5fb

* fix tests, add min version to select allnodes

Change-Id: I926f41d568951ad4ff70c6d4ceb87abb1e3e5009

* update comments

Change-Id: I6ffe33e245ca65fb523c880cd72e63ce35776eb9

* fixes and rm Init

Change-Id: Ifbe09b668978b5d9af09ca38cb080d02a2154cf4

* fix format

Change-Id: I03cc217e28dc1839190c5c6dbdbb602c132a5a38
2020-04-14 13:50:02 -07:00
Moby von Briesen
d7794a4851 satellite/overlay: hardcode default values for audit alpha/beta
Alpha=1 and beta=0 are the expected first values for any alpha/beta
reputation system we are using in the codebase. So we are removing the
configurability of these values.

Change-Id: Ic61861b8ea5047fa1438ea6609b1d0048bf0abc3
2020-04-14 19:12:40 +00:00
Cameron Ayer
3ee6c14f54 satellite/downtime: add concurrency to downtime estimation
We want to increase our throughput for downtime estimation. This commit
adds the ability to reach out to multiple nodes concurrently for downtime
estimation. The number of concurrent routines is determined by a new config
flag, EstimationConcurrencyLimit. It also increases the default
EstimationBatchSize to 1000.

Change-Id: I800ce7ec1035885afa194c3c3f64eedd4f6f61eb
2020-04-14 14:39:13 +00:00
Egon Elbre
c97131ae78 satellite/metainfo: organize attribution methods
Change-Id: I4f35599c3f923861b5b05b128bf904480679f5f9
2020-04-14 11:27:43 +03:00
Jeff Wendling
83bbc7a37e satellite/accounting/tally: ignore expired pointers
until we do a good job of cleaning them up, we should at least
not charge or pay people for them. nodes already locally delete
expired segments.

subsumes the tests in 1112.

Change-Id: I5961185764e02f6136b3231b44ecc75a9a8832c9
2020-04-10 11:36:00 -06:00
Cameron Ayer
02613407ae satellite/satellitedb: only suspend node if not already suspended
Whenever the node's reputation is updated, if its unknown audit
reputation is below the suspension threshold, its suspension field
is set to the current time. This could overwrite the previous
"suspendedAt" value resulting a node that never reaches the end of
its suspension.

Also log whenever a node is disqualified or its suspension status
changes

Change-Id: I5e8c8f1c46f66d79cb279b5b16a84fe03f533deb
2020-04-10 09:37:37 +00:00
Egon Elbre
d86cce202c satellite/satellitedb: use arrays for arguments in node selection
This simplifies the code and makes queries faster:

name                               old time/op  new time/op  delta
SelectStorageNodes-32              7.72ms ± 6%  7.22ms ± 3%  -6.44%  (p=0.016 n=5+5)
SelectNewStorageNodes-32           7.75ms ± 2%  7.37ms ± 1%  -4.89%  (p=0.008 n=5+5)
SelectStorageNodesExclusion-32     16.9ms ± 0%  16.6ms ± 0%  -2.15%  (p=0.008 n=5+5)
SelectNewStorageNodesExclusion-32  17.2ms ± 0%  16.6ms ± 2%  -3.69%  (p=0.008 n=5+5)
FindStorageNodes-32                45.5ms ± 0%  45.1ms ± 1%    ~     (p=0.056 n=5+5)
FindStorageNodesExclusion-32       77.4ms ± 0%  75.9ms ± 0%  -1.91%  (p=0.008 n=5+5)

Change-Id: I38f77f6282b9738e8416113d42c6acb46c03da7b
2020-04-09 21:16:10 +03:00
Egon Elbre
ccf4f9ed2d satellite/satellitedb: node selection code cleanup
Reduce the number of non-methods to reduce funcs in the namespace also
combine a func to slightly condense the code more.

Change-Id: Ifbe728eb8c8ca4c981df648decd259c2097b6b40
2020-04-09 20:41:29 +03:00
Natalie Villasana
cf80b3caf3
satellite/overlay: combine SelectStorageNodes and SelectNewStorageNodes (#3831) 2020-04-09 11:19:44 -04:00
Moby von Briesen
14b3704f56 storagenode: add suspended status to storagenode dashboard/api
* Add migration to storagenode reputation table to add suspended
timestamp
* Send suspended info to storagenode from satellite nodestats endpoint
* Add suspended status to storagenode api
* Add an indicator on the storagenode dashboard informing operator of
the satellites the node is suspended on

Change-Id: Ie3669f6069cc0258ba76ec99d17006e1b5fd9c8a
2020-04-09 13:36:23 +00:00
Michal Niewrzal
f36e8548f1 satellite/metainfo: adjust max inline segment size validation to
potential encryption overhead.

This is the same approach we have for validating remote segment size.

https://storjlabs.atlassian.net/browse/USR-619

Change-Id: I2597ee734313a3068fd986001680bbedbf1bed2a
2020-04-09 12:34:10 +00:00
Egon Elbre
11a44cdd88 all: don't depend on gogo/proto directly
Change-Id: I8822dea0d1b7b99e0b828e0373a0308a42dde2be
2020-04-08 17:32:15 +00:00
Natalie Villasana
0a1bbc9824 satellite/overlay: add TestEnsureMinimumRequested
Adds a test to make sure that the correct amount of total nodes,
reputable nodes, and new nodes are returned by SelectStorageNodes
in different cases.

Change-Id: I0939159600afde8a46c35735f1edf0576fcdb4cd
2020-04-08 16:06:25 +00:00
Qweder93
3a9422cc9a satellite/nodestats: add pricing model to endpoint
Change-Id: Iddace8e437216a343458f440b543cee61164f233
2020-04-08 14:29:51 +00:00
Egon Elbre
cf26951a5b satellite/satellitedb/pbold: remove dead code
Change-Id: I7464773c20b8f99a601ca9cc4bee804f1ac14cf9
2020-04-08 15:22:31 +03:00
Fadila Khadar
1eb501b5c8 satellite/accounting: test zombie segments not billed
Check zombie segments are not billed

Change-Id: If998e785dcc82c382613318aee3043eaef7fd9ea
2020-04-08 08:28:49 +00:00
igor gaidaienko
80ee7321cd satellite/accounting: add test billing without expansion factor
Test checking that billing contains the download traffic without the extra pieces

Change-Id: I5c5d6d877116d26a83c98a93da424eecffcb804e
2020-04-07 12:55:43 +00:00
Jeff Wendling
2ded64ba2c satellite/compensation: more fixes to get prod running smoothly
Change-Id: I13a76d9d49222fb10796415a015f224d4084fde3
2020-04-07 10:10:27 +00:00
Egon Elbre
a4c554f2ed satellite/admin: support user query by email
This adds new endpoint /api/user/{user-email} which allows to get the
projects where the user is a member.

It also moves existing endpoint:

    /project/{projectid}/limit -> /api/project/{projectid}/limit

To avoid future conflicts for displaying pages.

Change-Id: I5efe3e1c8f79894c136f92ed815f635a34ba6f98
2020-04-06 18:32:25 +00:00
Michal Niewrzal
2d1a6968b0 satellite/metainfo: revert returning segment size and inline segment
size with BeginObject

Such solution will add one round trip to satellite during upload so for
now we are reverting this until we will have solution for this.

Change-Id: Ic2d826448ab7b0318cd6922df05deee9167cf2f0
2020-04-06 13:36:34 +02:00
Jennifer Johnson
1547e791a3 satellitedb: remove free_bandwidth column from nodes table
Change-Id: I9d1d3de9216c6533c1042ef473631721a011d086
2020-04-06 09:30:28 +00:00
Cameron Ayer
42be4bdc0f satellite/contact: add timeout to PingBack method
Change-Id: I2ec2f82e2e10d8be16f82e9de13ce42358e47c98
2020-04-04 18:26:30 +00:00
Egon Elbre
9200efc61f satellite/satellitedb: fix selecting a nullable string
Change-Id: I59e645966e09da586512c69101691b47055c1e5a
2020-04-03 21:30:20 +03:00
paul cannon
0c8c11b251 satellite/audit: add not_enough_shares_for_audit counter
We have been using the SQL expression `name='(*Verifier).Verify' AND
error_name='not enough shares for successful audit'` thus far to detect
cases of this problem and alert on them. Unfortunately, since this
rarely (hopefully never) happens, influxdb has no data for most of the
auditor instances, and when it has no data for a time series, it returns
no columns either. This makes Redash upset when it tries to perform a
query for an alert and can't find the column whose value it expects to
check.

This change should make it so zero values are reported when the problem
has not happened, and higher values when it has.

Change-Id: I79e5e000f879678b661dac88caae1e2915b39ab1
2020-04-03 17:00:50 +00:00
Matt Robinson
4b01a8dd18
Descellate Usage Error to Debug (#3780) 2020-04-03 13:20:20 +02:00
Jeff Wendling
a409bd5dec satellite/orders: check for expired orders first
there are a subset of storagenodes hammering the satellite with
expired orders. if we check for expiration first, we don't have
to do a bunch of pointless signature verification. since a && b
is equal to b && a, we can order these checks in any way we want
and have it still be correct.

Change-Id: I6ffc8025c8b0d54949a1daf5f5ea1fed9e213372
2020-04-02 12:35:11 -06:00
Egon Elbre
6492b13d81 all: remove old uuid
Change-Id: I3a137f73456f010c37d3933dbe12cbbb840b809f
2020-04-02 19:30:36 +03:00
Egon Elbre
1024bf9ce1 all: simplify uuid usage
Instead of uuid.Parse, use uuid.FromString.
This removes a bunch of pointer management logic.

Change-Id: Id25bd174eb43c71d00b450158a198abafd8958f2
2020-04-02 13:45:19 +00:00
Michal Niewrzal
c178a08cb8 satellite/metainfo: add max segment size and max inline size to
BeginObject response

We want to control inline segment size and segment size on satellite
side. We need to return such information to uplink like with redundancy
scheme.

Change-Id: If04b0a45a2757a01c0cc046432c115f475e9323c
2020-04-02 12:41:28 +00:00
Michal Niewrzal
4a79b609e9 satellite/metainfo: fix panic when we batch BeginObjectDelete without
all permissions

Without read and list permissions BeginObjectDelete won't return error
if occurs. This was breaking Batch processing because there was
assumption that without error response will be always not nil.

https://storjlabs.atlassian.net/browse/SM-590

Change-Id: I0fc9539e429110a660eb28725b266d5e4771d198
2020-04-02 12:20:19 +00:00
Egon Elbre
8f73fb7a32 all: simplify uuid usage
uuid.UUID implements driver.Value so it can be directly used as a
scannable result.

Replace uses of dbutil.BytesToUUID with uuid.FromBytes.

Change-Id: I51a670185ceb3cc2199d5aa2b76bc3fc191ca8fe
2020-04-02 05:48:58 +00:00
Jeff Wendling
ffe7a3c211 satellite/compensation: make surge percent an int64 for strictcsv
Change-Id: I1783bf73ee68ca9beb8a03f5928873fab0bbe95d
2020-04-01 14:06:33 +00:00
Egon Elbre
a416b03941 satellite/accounting: fix TestProjectBandwidthTotal
Test was inserting for past 4 days, however the test was summing up for
the current month.

Change-Id: I509afdc6a76b314a6bb90652ab70cd2c2bab1288
2020-04-01 11:50:18 +03:00
Egon Elbre
0a69da4ff1 all: switch to storj.io/common/uuid
Change-Id: I178a0a8dac691e57bce317b91411292fb3c40c9f
2020-03-31 19:16:41 +03:00
Egon Elbre
1d79228ed0 satellite/metainfo: support uplink useragent
Adds support for parsing user agent and specifying uplink version and
it's library dependencies.

Change-Id: Ibaddde4deb93e153ac05c91b676c5b5f1ae1aa37
2020-03-31 15:11:31 +00:00
Qweder93
dc32f1da55 storagenode/cache/heldamount added, errNoRows ignored
Change-Id: If6b675e622d6c1324c0893c43cca93dc5323cd78
2020-03-31 11:35:58 +00:00
Jeff Wendling
e2ff2ce672 satellite: compensation package and commands
Change-Id: I7fd6399837e45ff48e5f3d47a95192a01d58e125
2020-03-30 14:08:14 -06:00
littleskunk
23e5a0471f
satellite/audit: clean up logging (#3832)
Co-authored-by: Ivan Fraixedes <ivan@fraixed.es>
2020-03-30 12:09:50 -06:00
Jennifer Johnson
d77f3b8786 satellitedb/migrate: set vetted_at backfill to now.day
Change-Id: Ib2b12be43dbd3f3705b1891bc703ae15abb75e09
2020-03-30 16:50:23 +00:00
Egon Elbre
439aba922a satellite/overlay: reduce overhead of GetNodes
Instead of filtering on the client side it's better to filter on the
database side.

Change-Id: I845fbbe5ed28c2ffdb0b8a3f789b59c094fd1069
2020-03-30 18:36:23 +03:00
Egon Elbre
cb781d66c7 satellite/overlay: optimize FindStorageNodes
Reduce the number of fields returned from the query.

Benchmark results in `satellite/overlay`:

benchstat before.txt after2.txt
name                               old time/op  new time/op  delta
SelectStorageNodes-32              7.85ms ± 1%  6.27ms ± 1%  -20.18%  (p=0.002 n=10+4)
SelectNewStorageNodes-32           8.21ms ± 1%  6.61ms ± 0%  -19.53%  (p=0.002 n=10+4)
SelectStorageNodesExclusion-32     17.2ms ± 1%  15.9ms ± 1%   -7.55%  (p=0.002 n=10+4)
SelectNewStorageNodesExclusion-32  17.8ms ± 2%  16.1ms ± 0%   -9.38%  (p=0.002 n=10+4)
FindStorageNodes-32                48.4ms ± 1%  45.1ms ± 0%   -6.69%  (p=0.002 n=10+4)
FindStorageNodesExclusion-32       79.2ms ± 1%  76.1ms ± 1%   -3.89%  (p=0.002 n=10+4)

Benchmark results from `satellite/overlay` after making them parallel:

benchstat before-parallel.txt after2-parallel.txt
name                               old time/op  new time/op  delta
SelectStorageNodes-32               548µs ± 1%   353µs ± 1%  -35.60%  (p=0.029 n=4+4)
SelectNewStorageNodes-32            562µs ± 0%   368µs ± 0%  -34.51%  (p=0.029 n=4+4)
SelectStorageNodesExclusion-32     1.02ms ± 1%  0.84ms ± 0%  -18.08%  (p=0.029 n=4+4)
SelectNewStorageNodesExclusion-32  1.03ms ± 1%  0.86ms ± 2%  -16.22%  (p=0.029 n=4+4)
FindStorageNodes-32                3.11ms ± 0%  2.79ms ± 1%  -10.27%  (p=0.029 n=4+4)
FindStorageNodesExclusion-32       4.75ms ± 0%  4.43ms ± 1%   -6.56%  (p=0.029 n=4+4)

Change-Id: I1d85e2764eb270f4c2b1998303ccfc1179d65b26
2020-03-30 18:36:23 +03:00
Egon Elbre
a6540dc3ef satellite/overlay: remove unused KeyLock
Change-Id: Ie99f97772824ceafb3d97453545fc6e96be2fb6f
2020-03-30 16:47:48 +03:00
Ivan Fraixedes
de903a652d satellite/accounting: Test no repair traffic in billing
Add a test for checking that the billing doesn't include traffic due to
audits and repairs.

Change-Id: I1ce1199ec87c4440082c42b100124aeb14200f41
2020-03-30 10:48:35 +00:00
littleskunk
048ca4558f
satellite/repair: clean up logging (#3833)
Co-authored-by: Michal Niewrzal <michal@storj.io>
2020-03-30 11:59:56 +02:00
Egon Elbre
c970969503 satellite/overlay: add benchmark for node selection
Change-Id: I15b767a78b662f8276e656b3fb73a15ec59e76c8
2020-03-27 23:09:29 +02:00