Commit Graph

177 Commits

Author SHA1 Message Date
Kaloyan Raev
f773bb80c8 mod: bump common to fetch latest placement type changes
Change-Id: I3d0813f05622e706c9be1a578b5e4d4159d16dfc
2021-11-16 12:42:25 +00:00
Mya
bf51c286d9
satellite/geoip: update node check-in to associate a country code
Resolves https://github.com/storj/storj/issues/4247

Change-Id: Idfd71bf1795d48ca3c686066bbdb95b9c6594f00
2021-11-10 16:44:41 +01:00
Egon Elbre
d5628740fd satellite/overlay/straynodes: prevent disqualification in tests
Currently the test threshold for unneeded node checkins are 30s.
The disqualification threshold was at 30s, which means, it was possible
for all the nodes to get disqualified.

Hopefully fixes #4267

Change-Id: I6b0a10c09b7fd90a9729794885c9e7a593781bad
2021-11-09 12:29:21 +00:00
Márton Elek
fb604be460 satellite/gracefulexit: use nodecache with gracefulexit
Change-Id: I700caf6dfd06bd3b3970a8cf7c5a79da4af27b5f
2021-11-05 19:14:14 +00:00
Cameron Ayer
1de8a695e8 satellite/{overlay,satellitedb}: fix stray nodes DQ bug
We had a bug in the stray nodes chore where nodes who had not been seen
in several months were not being DQd. We figured out that this was
happening because we were using two queries: The first to grab
nodes where last_contact_success < some cutoff, the second to DQ them
unless last_contact_success == '0001-01-01 00:00:00+00'. The problem
is that if all of the nodes returned from the first query had
last_contact_success of '0001-01-01 00:00:00+00', we would pass them to
the second query which would not DQ them. This would result in the stray
nodes DQ loop ending since we found a number of nodes to DQ less than the
limit.

The fix: add the "WHERE last_contact_success != '0001-01-01
00:00:00+00'::timestamptz" to the selection query.

Change-Id: I4e60de90b68d8745d641b4467c2b23e0e56f7dff
2021-11-02 17:05:00 +00:00
Egon Elbre
f2d8e97d97 satellite/satellitedb: simplify select nodes query construction
Change-Id: I07009b28762d4485929a2a999e8f4be8179bee51
2021-10-22 07:41:07 +00:00
Egon Elbre
51f488fc4b satellite/satellitedb: fix selection query with AOST
Union query for both reputable and new nodes didn't properly work. The
top level query is required to have an `AS OF SYSTEM TIME` statement as
well.

Change-Id: I8ee6dd5b700c2b1ed2aa562962bfa72be7eec30a
2021-10-19 16:59:40 +03:00
Cameron Ayer
bb21551a9c satellite/satellitedb: remove references to contained column in nodes table
We don't use this column for anything. If you want to know if a node is
contained, you can check the pending_audits table.

Change-Id: I8da1d8e01a2dcaff63c5067a7927b5451424ad04
2021-10-14 19:17:46 +00:00
Yingrong Zhao
e00b573f8f satellite/overlay: fix UpdateCheckIn comment
Change-Id: I8584895249d7e5be6dbec79974fc9f77b7a1930c
2021-08-06 05:54:00 +00:00
Egon Elbre
65804801ec all: fix mon.Task leak
Change-Id: Ifd58c7ac5631b9c3c750b3f4cc50525167e90709
2021-08-05 14:07:45 +03:00
Yingrong Zhao
ae02f6deda satellite/reputation: return default reputation stats when node is not
found

Change-Id: I587d0ab36ffa0efaf345a6a6e221ae5d2068e1c5
2021-08-04 19:34:54 +00:00
Yingrong Zhao
646ce5b8cc satellite/overlay: remove reputation logic from overlay
Change-Id: I3492860e4537c7a8e4e824ec4c9c8d179134a0c0
2021-07-28 15:15:28 -04:00
Yingrong Zhao
f8914ccce0 satellite/{repair, overlay}: use reputation store in repair
Change-Id: I48db9e68f48239d48621ccc77d33618ecb83ce1a
2021-07-28 13:22:05 -04:00
Yingrong Zhao
e91574cee1 satellite/{reputation, gracefulexit}: use reputation store in
gracefulexit

With the effort to move audit related data into reputation store, this
PR updates gracefulexit endpoint to use reputation service to get a
node's audit score

Change-Id: Iad93ea689ad67ff9c57c7be16687e21e715fab7a
2021-07-28 13:21:41 -04:00
Yingrong Zhao
6c7bf357cd satellite/{reputation,audit,overlay}: replace overlay with reputation
package in audit

This PR implements reputation store and replace overlay in audit service
to use such store for storing node's audit stats.

In order to keep the changeset smaller, most of the changes in this PR is for copying audit logic in overlay to
reputation package. In a following PR, the duplicating code will be
removed from overlay.

Change-Id: I16c12494a0970f44c422b26cf603c1dc489e5bc1
2021-07-28 13:10:48 -04:00
Cameron Ayer
8c124c6fa4 satellite/{reputation,overlay,satellitedb}: create reputation service, DB, add overlay method UpdateReputation
Define service and DB interface for storing node reputation data
and updating the overlay cache.
Add overlay service and DB method UpdateReputation.
See https://github.com/storj/storj/pull/4144

Change-Id: Iedd8bd3274457d26c595919303d55327c1464b8c
2021-06-24 16:19:15 +00:00
Egon Elbre
59e3b586e7 satellite/{gracefulexit,overlay}: enable as of system time queries
Change-Id: I2af5eb0e8a51fca7893ce07b78b5633be71dfef8
2021-06-22 11:50:50 +00:00
Jeremy Wharton
8a070e7c25 satellite/overlay: Ignore unnecessary check-ins
This prevents the database from being contacted unnecessarily,
reducing load.

Change-Id: Ib2420f68a20636ec35eb3dd3df8e02bd5341b419
2021-06-22 09:00:41 +00:00
JT Olio
da9ca0c650 testplanet/satellite: reduce the number of places default values need to be configured
Satellites set their configuration values to default values using
cfgstruct, however, it turns out our tests don't test these values
at all! Instead, they have a completely separate definition system
that is easy to forget about.

As is to be expected, these values have drifted, and it appears
in a few cases test planet is testing unreasonable values that we
won't see in production, or perhaps worse, features enabled in
production were missed and weren't enabled in testplanet.

This change makes it so all values are configured the same,
systematic way, so it's easy to see when test values are different
than dev values or release values, and it's less hard to forget
to enable features in testplanet.

In terms of reviewing, this change should be actually fairly
easy to review, considering private/testplanet/satellite.go keeps
the current config system and the new one and confirms that they
result in identical configurations, so you can be certain that
nothing was missed and the config is all correct.
You can also check the config lock to see what actual config
values changed.

Change-Id: I6715d0794887f577e21742afcf56fd2b9d12170e
2021-06-01 22:14:17 +00:00
Egon Elbre
10372afbe4 ci: fix lint errors
Change-Id: Ib5893440807811f77175ccd347aa3f8ca9cccbdf
2021-05-17 13:37:31 +00:00
Egon Elbre
8f15f975a2 satellite/overlay: improve contended update checkin
Improve UpdateCheckIn on a contended row:

  name                             old time/op  new time/op delta
  UpdateCheckInContended-100x-32   2.29s ±55%   0.17s ±61%  -92.45%  (p=0.008 n=5+5)

Change-Id: I053ab9f1cff136c306e5fb57f5e355cdc0269a8c
2021-05-16 20:41:12 +03:00
Egon Elbre
0858c3797a satellite/{metabase,satellitedb}: deduplicate AS OF SYSTEM TIME code
Currently we were duplicating code for AS OF SYSTEM TIME in several
places. This replaces the code with using a method on
dbutil.Implementation.

As a consequence it's more useful to use a shorter name for
implementation - 'impl' should be sufficiently clear in the context.

Similarly, using AsOfSystemInterval and AsOfSystemTime to distinguish
between the two modes is useful and slightly shorter without causing
confusion.

Change-Id: Idefe55528efa758b6176591017b6572a8d443e3d
2021-05-11 12:40:36 +03:00
Egon Elbre
d2033c2f52 satellite/nodeselection/uploadselection: rename package
Currently nodeselection package only contained state for uploads, move
these to a subpackage, such that we can make another "downloadselection"
for downloads. Then move selection logic from overlay to nodeselection.

Change-Id: I0fc42bcae3a29db2728dae9f3863b1e95bf5165b
2021-05-04 15:50:00 +00:00
Egon Elbre
961e841bd7 all: fix error naming
errs.Class should not contain "error" in the name, since that causes a
lot of stutter in the error logs. As an example a log line could end up
looking like:

    ERROR node stats service error: satellitedbs error: node stats database error: no rows

Whereas something like:

    ERROR nodestats service: satellitedbs: nodestatsdb: no rows

Would contain all the necessary information without the stutter.

Change-Id: I7b7cb7e592ebab4bcfadc1eef11122584d2b20e0
2021-04-29 15:38:21 +03:00
Cameron Ayer
a0c5da6643 satellite/satellitedb: in stray nodes DQ, don't DQ nodes where last_contact_success = '0001-01-01 00:00:00+00'
When nodes check in for the very first time, if the satellite can't ping
them back, they are inserted into the nodes table with
last_contact_success of '0001-01-01 00:00:00+00'. If the stray nodes
chore runs before the node can fix their problem, they are DQd.

Solution: when DQing stray nodes, dont DQ where last_contact_success =
'0001-01-01 00:00:00+00'::timestamptz

Change-Id: I477a02d5ef85b2c930ed6b7d99a4d1995169bca8
2021-04-22 10:13:13 -04:00
Egon Elbre
267506bb20 satellite/metabase: move package one level higher
metabase has become a central concept and it's more suitable for it to
be directly nested under satellite rather than being part of metainfo.

metainfo is going to be the "endpoint" logic for handling requests.

Change-Id: I53770d6761ac1e9a1283b5aa68f471b21e784198
2021-04-21 15:54:22 +03:00
Jeff Wendling
a65aecfd98 compensation: always generate invoices for every node
instead of only generating invoices for nodes that had some
activity, we generate it for every node so that we can find
and pay terminal nodes that did not meet thresholds before
we recognized them as terminal.

Change-Id: Ibb3433e1b35f1ddcfbe292c034238c9fa1b66c44
2021-03-29 14:15:45 +00:00
Egon Elbre
d57873fd45 satellite/overlay: remove Inspector
Currently overlay.Inspector had two rpc methods and both of them were
unimplemented.

Change-Id: I1a2ecc7b7113898fa234a1c1fe451c8cc9e2ee81
2021-03-29 12:26:10 +03:00
Egon Elbre
86e698f572 pb: use *UnimplementedServer to avoid breaking API changes
Change-Id: I99a34eeb37ac4453411f273511710562a519f57a
2021-03-29 12:26:10 +03:00
Cameron Ayer
05f8d2d0b1 satellite/satellitedb: filter offline suspended nodes from selection
Change-Id: I5a6f413453332238d579a7bf50eb30e9156f96c2
2021-03-27 23:36:46 +00:00
Cameron Ayer
1a51049ac0 satellite/{overlay,satellitedb}: add flag to toggle suspending nodes for offline audits
This change introduces a new config flag,
--overlay.audit-history.offline-suspension-enabled,
to toggle suspending nodes for offline audits.

If the flag is set to true, nodes will be suspended if they meet the
requirements.

If the flag is false, nodes will not be suspended. If they are already
suspended and/or under review, these will be cleared.

Change-Id: Ibeba759c42d6e504f6b7598120d4fd4dab85ca74
2021-03-27 16:28:27 +00:00
Cameron Ayer
eb44dc21b4 satellite/satellitedb: select stray nodes for DQ in separate tx from update
Previously we would select a limited number of nodes for DQ in a
CTE and run the update on that set in a single transaction. This
could lead to locking on the table, so instead we select and update
in separate transactions.

Change-Id: I1e802c0845e829eeadcee4fa382f58462515fdb1
2021-03-27 00:00:23 +00:00
Michał Niewrzał
237782813b Merge remote-tracking branch 'origin/multipart-upload'
Change-Id: If6c5a450b238adab55d1e0dea67d01e5f5768a9f
2021-03-23 09:44:49 +01:00
Cameron Ayer
864ad70fe2 satellite/overlay/straynodes: set --stray-nodes.enable-dq release default to true
Since we will enable this on all satellites, just set default to true

Change-Id: Ibc86a0afd0b0f57e86bd067abb9cdf06c295a467
2021-03-22 17:25:09 +00:00
Cameron Ayer
2607b16070 satellite/{overlay/straynodes,satellitedb}: rework DQNodesLastSeenBefore to return DQd node IDs and last contact successes
We would like to log Node IDs and last contact successes of nodes DQd
in this manner. We would also like to avoid returning an unbounded list
of items from the db. Therefore we change the query to select a limited
number of nodes that meet the DQ conditions and iterate until 0 rows are
returned. Each column of the query is already indexed.

Change-Id: Iaec2d9b56e7202b7c2028ba21750d40c8dd506ee
2021-03-22 13:01:30 -04:00
Michał Niewrzał
d995fb497f Merge remote-tracking branch 'origin/main' into multipart-upload
Change-Id: I367da03351ab80f7343332420490dde9282aa47a
2021-02-23 12:31:31 +01:00
Cameron Ayer
549033f2e6 satellite/satellitedb: don't include DQd and exited nodes in DQStrayNodes
Don't update DQ time of already DQd nodes. Don't DQ nodes who exited.

Change-Id: I4528a9ba9f8e278987165ad337a9b34dadb9788b
2021-02-19 15:12:30 -05:00
Egon Elbre
1137620baf satellite/satellitedb: move tests to their domains
Testing interfaces is slightly clearer when it's in the package needing
the database rather than each individual implementation.

Change-Id: I10334c214a205f7e510b939b4359a2214c4e060a
2021-02-19 17:29:15 +02:00
JT Olio
b2ed7edd30 cmd/satellite: restore-trash parallel workers
Change-Id: Ic7466b21c20bda334e7ba4268a494e96b6528ac1
2021-02-18 19:11:19 +02:00
JT Olio
3ae3389ddc cmd/satellite: restore-trash command
Change-Id: I80fc932c12147692d49cde277784871ac611fcad
2021-02-18 09:19:22 -07:00
Michał Niewrzał
12402eb729 Merge remote-tracking branch 'origin/main' into multipart-upload
Change-Id: I38adf8218c1415c7ea1910f8bd6bed13544b0f03
2021-02-17 08:50:38 +01:00
Egon Elbre
f7ad86521e satellite/overlay: fix data race in TestAuditHistoryBasic
Change-Id: I196f10973fe10b10b226ac3a63e62bf4fe9c256b
2021-02-16 12:32:19 +02:00
Michał Niewrzał
908a96ae30 Merge remote-tracking branch 'origin/main' into multipart-upload
Change-Id: I075aaff42ca3f5dc538356cedfccd5939c75e791
2021-02-11 11:48:23 +01:00
Yaroslav Vorobiov
966535e9de {storagenode,satellite}/nodeoperator: add wallet features
Change-Id: Iac7eb40a52b8fddcc573aebaad2e3a30a10cded9
2021-02-08 22:09:45 +02:00
Kaloyan Raev
d0612199f0 Merge remote-tracking branch 'origin/main' into multipart-upload
Conflicts:
	go.mod
	go.sum
	satellite/metainfo/config.go
	satellite/metainfo/metainfo_test.go

Change-Id: I95cf3c1d020a7918795b5eec63f36112fdb86749
2021-02-01 14:32:12 +02:00
Egon Elbre
b7a0739219 satellite/overlay: use DownloadSelectionCache for getting node IPs
Change-Id: Ib8f4eedb2bf465767050693a1e961b37a294ca06
2021-01-29 16:47:10 +02:00
Egon Elbre
54e01d37f9 satellite/overlay: add DownloadSelectionCache
Change-Id: Ic0779280172325f8d03f55a2e9673722f72bdd44
2021-01-29 16:47:06 +02:00
Egon Elbre
19e3dc4ec0 satellite/overlay: rename NodeSelectionCache to UploadSelectionCache
It wasn't obvious that NodeSelectionCache was only for uploads.

Change-Id: Ifeeaa6fdb50a4b7916245b48d8634d70ac54459c
2021-01-28 14:56:53 +02:00
littleskunk
0b2568d712
satellite/overlay/straynodes: increase development duration without contact
Stopping storj-sim for over 5 minutes caused nodes to be disqualified. Set development max duration without contact to 300 days.
2021-01-26 12:24:39 +02:00
Kaloyan Raev
c24ada7114 Merge remote-tracking branch 'origin/main' into multipart-upload
Conflicts:
	go.mod
	go.sum

Change-Id: Icf7c029e9d800e5f6a9fdd208c36f28e05468690
2021-01-20 17:35:57 +02:00