Commit Graph

258 Commits

Author SHA1 Message Date
Artur M. Wolff
37d6df23fa satellite: implement metainfo.GetBucketLocation endpoint
Updates storj/storj-private#408
Updates storj/storj-private#409

Change-Id: Idaaca74b4a5c9c7907d095e0a3a5f29e52843ce6
2023-08-28 13:48:07 +02:00
Márton Elek
5c12a3406d satellite/nodeselection: improve annotation composability
We would like to make it easier to accept multiple annotations.

Examples:
```
country("GB") && annotation(...)
annotated(annotated(X,...),...)
```

Change-Id: I92e622e8b985b314dadddf83b17976c245eb2069
2023-08-28 09:27:04 +00:00
Márton Elek
84ea80c1fd satellite/repair/checker: respect autoExcludeSubnet anntation in checker rangedloop
This patch is a oneliner: rangedloop checker should check the subnets only if it's not turned off with placement annotation.
(see in satellite/repair/checker/observer.go).

But I didn't find any unit test to cover that part, so I had to write one, and I prefered to write it as a unit test not an integration test, which requires a mock repair queue (observer_unit_test.go mock.go).

Because it's small change, I also included a small change: creating a elper method to check if AutoExcludeSubnet annotation is defined

Change-Id: I2666b937074ab57f603b356408ef108cd55bd6fd
2023-08-23 13:45:09 +00:00
Márton Elek
4ccce11893
satellite/overlay: improve realistic placement rule test
10 --> node tag inclusion in raw format
11 --> same, but using same subnet is enabled
12 --> same as 11 but with US restrictions

Change-Id: I20792689e0caf5fe190f566a770d70c3b3824793
2023-08-23 13:56:35 +02:00
Márton Elek
5522423871 satellite/nodeselection: remove AutoExcludeSubnet filter
It's statefull, therefore it can hit naive users. (NodeFilters couldn't be reused for more than one iterations).

But looks like we don't need it, as `SelectBySubnet` doest the same job.

Change-Id: Ie85b7f9c2bd9a47293f4e3b359f8b619215c7649
2023-08-18 08:31:00 +00:00
Márton Elek
b218002752 satellite/overlay/placement: improve placement configurability with &&, placement, region and ! support in country
This patch makes it easier to configure existing placement rules only with string.

 1. placement(n) rule can be used to reuse earlier definitions
 2 .&& can be used in addition to all(n1,n2)
 3. country(c) accepts exclusions (like '!RU'), regions ('EU','EEA'), all and none

See the 'full example' unit test, which uses all of these, in a realistic example.

https://github.com/storj/storj/issues/6126

Change-Id: Ica76f016ebd002eb7ea8103d4258bacd6a6d77bf
2023-08-17 16:12:53 +00:00
Márton Elek
da08117fcd satellite/~placement: do not ignore placement check for placement=0
There are cases when we would like to override the default placement=0 rule.

For example when we would like to exclude tagged nodes from the selection (by default).

Therefore we couldn't use a shortcut any more, we should always check the placement rules, even if we use placement=0.

TODO: we need to update common, and rename `EveryCountry` to `DefaultPlacement`, just to avoid confusion.

https://github.com/storj/storj/issues/6126

Change-Id: Iba6c655bd623e04351ea7ff91fd741785dc193e4
2023-08-16 07:06:56 +00:00
Márton Elek
c08792f066 satellite/overlay: implement an exclude filter for placement configuration
https://github.com/storj/storj/issues/6126

Change-Id: I05215b5d46bec958001cc020edf1fa97b00d3299
2023-08-15 17:29:29 +00:00
Márton Elek
0e17b1018c satellite/{nodeselection,overlay}: support annotations on node filters
Change-Id: I844d8a25042750aae189175842113e2f052d5b17
2023-08-15 16:49:57 +00:00
Márton Elek
0b02a48a10
satellite/nodeselection: SelectBySubnet should use placement filters for all nodes
Current node selection logic (in case of using SelectBySubnet):

 1. selects one subnet randomly
 2. selects one node randomly from the subnet
 3. applies the placement NodeFilters to the node and ignore it, if doesn't match

This logic is wrong:

 1. Imagine that we have a subnet with two DE and one GB nodes.
 2. We would like to select DE nodes
 2. In case of GB node is selected (randomly) in step2, step3 will ignore the subnet, even if there are good (DE) nodes in there.

Change-Id: I7673f52c89b46e0cc7b20a9b74137dc689d6c17e
2023-08-04 10:48:15 +02:00
Márton Elek
f7b39aaed4 satellite/nodeselection: remove stats/size from nodeselection state
stats/size/count is not used by any production code, and it's not required, as we can assert the state with other checks.

real motivation: next commits will make the Selector of the State configurable, therefore we won't have one single Stat, it depends on the request parameters.

(we plan to support both network and id based randomization)

Change-Id: I631828fc0046d2fef5b7a674fc0268a0446e9655
2023-08-01 18:29:41 +00:00
Márton Elek
6f002f4220
satellite/overlay: NR placement should exclude nodes without geofencing information
https://github.com/storj/storj-private/issues/378

Change-Id: If2af02083496e5a8eefe27beabb406388ee50644
2023-07-31 09:55:54 +02:00
Egon Elbre
465941b345 satellite/{nodeselection,overlay}: use location.Set
location.Set is faster for comparisons.

Updates #6028

Change-Id: I764eb5cafc507f908e4168b16a7994cc7721ce4d
2023-07-11 17:16:30 +00:00
Egon Elbre
9370bc4580 satellite/{nodeselection,overlay}: bump common and fix some potential issues
* Handle failed country code conversion.
* Avoid potential issues with a data-race due to shared slice.

Updates #6028

Change-Id: If7beef2619abd084e1f4109de2d323f834a6090a
2023-07-11 11:13:41 +00:00
Michal Niewrzal
1d62dc63f5 satellite/repair/repairer: fix NumHealthyInExcludedCountries calculation
Currently, we have issue were while counting unhealthy pieces we are
counting twice piece which is in excluded country and is outside segment
placement. This can cause unnecessary repair.

This change is also doing another step to move RepairExcludedCountryCodes
from overlay config into repair package.

Change-Id: I3692f6e0ddb9982af925db42be23d644aec1963f
2023-07-10 12:01:19 +02:00
Márton Elek
97a89c3476 satellite: switch to use nodefilters instead of old placement.AllowedCountry
placement.AllowedCountry is the old way to specify placement, with the new approach we can use a more generic (dynamic method), which can check full node information instead of just the country code.

The 90% of this patch is just search and replace:

 * we need to use NodeFilters instead of placement.AllowedCountry
 * which means, we need an initialized PlacementRules available everywhere
 * which means we need to configure the placement rules

The remaining 10% is the placement.go, where we introduced a new type of configuration (lightweight expression language) to define any kind of placement without code change.

Change-Id: Ie644b0b1840871b0e6bbcf80c6b50a947503d7df
2023-07-07 16:55:45 +00:00
Márton Elek
70cdca5d3c
satellite: move satellite/nodeselection/uploadselection => satellite/nodeselection
All the files in uploadselection are (in fact) related to generic node selection, and used not only for upload,
but for download, repair, etc...

Change-Id: Ie4098318a6f8f0bbf672d432761e87047d3762ab
2023-07-07 10:32:03 +02:00
Márton Elek
ddf1f1c340 satellite/{nodeselection,overlay}: NodeFilters for dynamic placement implementations
Change-Id: Ica3a7b535fa6736cd8fb12066e615b70e1fa65d6
2023-07-06 12:08:01 +00:00
Michal Niewrzal
21c1e66a85 satellite/overlay: refactor ReliabilityCache to keep more data
ReliabilityCache will be now using refactored overlay Reliable method.
This method will provide more info about nodes (e.g. country code) and
with this we are able to add two dedicated methods to classify pieces:
* OutOfPlacementPieces
* PiecesNodesLastNetsInOrder

With those new method we will fix issue where offline but reliable node
won't be checked for clumped pieces and off placement pieces.

https://github.com/storj/storj/issues/5998

Change-Id: I9ffbed9f07f4881c9db3bd0e5f0412f1a418dd82
2023-07-05 11:19:10 +02:00
Michal Niewrzal
f2cd7b0928 satellite/overlay: refactor Reliable to be used with repair checker
Currently we are using Reliable to get missing pieces for repair
checker. The issue is that now checker is looking at more things than
just missing pieces (clumped/off, placement pieces) and using only node
ID is not enough. We have issue where we are skipping offline nodes from
clumped and off placement pieces check.

Reliable was refactored to get data (e.g. country, lastNet) about all
reliable nodes. List is split into online and offline. This data will be
cached for quick use by repair checker. It will be also possible to
check nodes metadata like country code or lastNet.

We are also slowly moving `RepairExcludedCountryCodes` config from
overlay to repair which makes more sens for it.

This this first part of changes.

https://github.com/storj/storj/issues/5998

Change-Id: If534342488c0e440affc2894a8fbda6507b8959d
2023-07-05 10:56:31 +02:00
Márton Elek
500b6244f8
satellite/satellitedb: create table for node tags
Change-Id: I884bb740974e6b8241aa6b85faf266b85fe892d4
2023-07-05 09:38:53 +02:00
Márton Elek
d38b8fa2c4 satellite/nodeselection: use the same Node object from overlay and nodeselection
We use two different Node types in `overlay` and `uploadnodeselection` and converting back and forth.

Using the same object would allow us to use a unified node selection interface everywhere.

Change-Id: Ie71e29d60184ee0e5b4547eb54325f09c418f73c
2023-07-03 16:59:33 +00:00
Michal Niewrzal
98f4f249b2 satellite/overlay: refactor KnownReliable to be used with repairer
Currently we are using KnownUnreliableOrOffline to get missing pieces
for segment repairer (GetMissingPieces). The issue is that now repairer
is looking at more things than just missing pieces (clumped/off
placement pieces).

KnownReliable was refactored to get data (e.g. country, lastNet) about
all reliable nodes from provided list. List is split into online and
offline. This way we will be able to use results from this method to all
checks: missing pieces, clumped pieces, out of placement pieces.

This this first part of changes to handle different kind of pieces in
segment repairer.

https://github.com/storj/storj/issues/5998

Change-Id: I6cbaf59cff9d6c4346ace75bb814ccd985c0e43e
2023-06-27 13:27:23 +02:00
Michal Niewrzal
eb407b2ae3 satellite/overlay: delete unused KnownOffline method
Change-Id: Ief9288fee83f9c381dd7840f48333babcd3d6bf7
2023-06-23 13:24:30 +00:00
Michal Niewrzal
9e3fd4d514 satellite/overlay: delete unused method
Change-Id: I87828fcac4f4a9fb08c86af188aa6ea28c5c64af
2023-06-22 12:45:59 +00:00
Michal Niewrzal
f7c7851519 satellite/metainfo: filter metainfo.GetObjectIPs by bucket/object placement
For now we will use bucket placement to determine if we should exclude
some node IPs from metainfo.GetObjectIPs results. Bucket placement is
retrieved directly from DB in parallel to metabase
GetStreamPieceCountByNodeID request.

GetObjectIPs is not heavily used so additional request to DB shouldn't
be a problem for now.

https://github.com/storj/storj/issues/5950

Change-Id: Idf58b1cfbcd1afff5f23868ba2f71ce239f42439
2023-06-07 16:52:02 +00:00
Michal Niewrzal
fe21fd42f7 satellite/overlay: add GetNodesOutOfPlacement method
We would like to verify if nodes matches specific placement e.g. to
validate segment pieces are correctly geofenced.

https://github.com/storj/storj/issues/5896

Change-Id: I842767dccc121a3c60224f677ab55e5dc150c76e
2023-05-30 14:57:20 +02:00
Michal Niewrzal
c48bd81e5f satellite/satellitedb: update SelectAllStorageNodes* to set country code
Methods SelectAllStorageNodesUpload and SelectAllStorageNodesDownload
are not returning full info with overlay.SelectedNode because its
missing CountryCode.

Change-Id: Ie3cb396bf28d7ec4c6ab8927e5bb560236036aa6
2023-05-26 11:02:29 +00:00
paul cannon
c856d45cc0 satellite/overlay: fix GetNodesNetworkInOrder
We were using the UploadSelectionCache previously, which does _not_ have
all nodes, or even all online nodes, in it. So all nodes with less than
MinimumVersion, or with less than MinimumDiskSpace, or nodes suspended
for unknown audit errors, or nodes that have started graceful exit, were
all missing, and ended up having empty last_nets. Even with all that,
I'm kind of surprised how many nodes this involved, but using the upload
selection cache was definitely wrong.

This change uses the download selection cache instead, which excludes
nodes only when they are disqualified, gracefully exited (completely),
or offline.

Change-Id: Iaa07c988aa29c1eb05796ac48a6f19d69f5826c1
2023-05-19 08:08:08 +00:00
paul cannon
958d8676d0 satellite/overlay: remove unnecessary test helper
Change-Id: I8439eec4ed440f60353fc620ca906a917a03613c
2023-05-17 17:04:54 +00:00
paul cannon
75d10fe4fa satellite/overlay: use UploadSelectionCache for GetNodesNetworkInOrder
The query for GetNodesNetworkInOrder is causing far too much load on the
database. Since it is not critical that the repair checker have
perfectly up-to-date node network information, we can use a cache
instead.

Change-Id: I07ad45bfdeb46529da093941a06c2da8a00ce878
2023-05-16 17:32:09 +00:00
Michal Niewrzal
36e046375c satellite/repair/checker: remove segments loop parts
We are switching completely to ranged loop.

https://github.com/storj/storj/issues/5368

Change-Id: I8583549973cd36aa0e0c482c20d7a75cb7568ab3
2023-05-08 12:19:13 +00:00
Michal Niewrzal
1aa24b9f0d satellite/audit: remove segments loop parts
We are switching completely to ranged loop.

https://github.com/storj/storj/issues/5368

Change-Id: I9cec0ac454f40f19d52c078a8b1870c4d192bd7a
2023-04-24 15:52:11 +00:00
Egon Elbre
f40a0cb7ba satellite/*: use typed lrucache and ReadCache
Change-Id: Ieee535dd8735a95dd196a77413e4a25a6a72342c
2023-04-21 10:49:08 +00:00
paul cannon
915f3952af satellite/repair: repair pieces on the same last_net
We avoid putting more than one piece of a segment on the same /24
network (or /64 for ipv6). However, it is possible for multiple pieces
of the same segment to move to the same network over time. Nodes can
change addresses, or segments could be uploaded with dev settings, etc.
We will call such pieces "clumped", as they are clumped into the same
net, and are much more likely to be lost or preserved together.

This change teaches the repair checker to recognize segments which have
clumped pieces, and put them in the repair queue. It also teaches the
repair worker to repair such segments (treating clumped pieces as
"retrievable but unhealthy"; i.e., they will be replaced on new nodes if
possible).

Refs: https://github.com/storj/storj/issues/5391
Change-Id: Iaa9e339fee8f80f4ad39895438e9f18606338908
2023-04-06 17:34:25 +00:00
Kaloyan Raev
98b82486bd satellite/overlay: update country code on every node check-in
We have a specific issue that a user uploaded a file to a bucket
geo-fenced to EU and one of the pieces appeared to be on a node in the
US. The country code of this node is set to SE (Sweden) in the satellite
DB. It turns out that some time ago MaxMind changed the country code of
this node's IP from Sweden to US, but this change hasn't been reflected
in the satellite's database.

So far the satellite updates the country code of a node only if its IP
changes. It was assumed that if the IP does not change, its country code
shouldn't change too. This turned to be a wrong assumption.

With this change, the satellite will look up the MaxMindDB on every
check-in to see if the country code of the node's IP has changed.

Change-Id: Icdf659b09be9fc6ad14601902032b35ba5ea78c4
2023-03-22 08:38:51 +00:00
paul cannon
fd6ce6b9a5 scripts/tests: fix test-sim-rolling-upgrade.sh
This test involves a satellite with dev defaults (DistinctIP=no) being
upgraded past commit 2522ff09b6, which
means we need to run the dev-defaults-satellite-upgrade migration SQL
to avoid getting DistinctIP=yes behavior (which breaks the tests).

Change-Id: I29fb596d1ffa568dad635d98cfe9abacd3aaa48f
2023-03-09 23:35:36 +00:00
Márton Elek
ffaf15a3b0 satellite/overlay: remove unused mail service from overlay
It was surprising that `satellite auditor` complained about SMTP mail settings, even if it's not supposed to sending any mail.

Looks like we can remove the mail service dependency, as it's not a hard requirement for overlay.Service.

Change-Id: I29a52eeff3f967ddb2d74a09458dc0ee2f051bd7
2023-03-09 12:17:35 +00:00
paul cannon
2522ff09b6 satellite/overlay: configurable meaning of last_net
Up to now, we have been implementing the DistinctIP preference with code
in two places:

 1. On check-in, the last_net is determined by taking the /24 or /64
    (in ResolveIPAndNetwork()) and we store it with the node record.
 2. On node selection, a preference parameter defines whether to return
    results that are distinct on last_net.

It can be observed that we have never yet had the need to switch from
DistinctIP to !DistinctIP, or from !DistinctIP to DistinctIP, on the
same satellite, and we will probably never need to do so in an automated
way. It can also be observed that this arrangement makes tests more
complicated, because we often have to arrange for test nodes to have IP
addresses in different /24 networks (a particular pain on macOS).

Those two considerations, plus some pending work on the repair framework
that will make repair take last_net into consideration, motivate this
change.

With this change, in the #2 place, we will _always_ return results that
are distinct on last_net. We implement the DistinctIP preference, then,
by making the #1 place (ResolveIPAndNetwork()) more flexible. When
DistinctIP is enabled, last_net will be calculated as it was before. But
when DistinctIP is _off_, last_net can be the same as address (IP and
port). That will effectively implement !DistinctIP because every
record will have a distinct last_net already.

As a side effect, this flexibility will allow us to change the rules
about last_net construction arbitrarily. We can do tests where last_net
is set to the source IP, or to a /30 prefix, or a /16 prefix, etc., and
be able to exercise the production logic without requiring a virtual
network bridge.

This change should be safe to make without any migration code, because
all known production satellite deployments use DistinctIP, and the
associated last_net values will not change for them. They will only
change for satellites with !DistinctIP, which are mostly test
deployments that can be recreated trivially. For those satellites which
are both permanent and !DistinctIP, node selection will suddenly start
acting as though DistinctIP is enabled, until the operator runs a single
SQL update "UPDATE nodes SET last_net = last_ip_port". That can be done
either before or after deploying software with this change.

I also assert that this will not hurt performance for production
deployments. It's true that adding the distinct requirement to node
selection makes things a little slower, but the distinct requirement is
already present for all production deployments, and they will see no
change.

Refs: https://github.com/storj/storj/issues/5391
Change-Id: I0e7e92498c3da768df5b4d5fb213dcd2d4862924
2023-03-09 02:20:12 +00:00
JT Olio
77bf88e916 satellite/overlay: check node difficulty before entering database
closes https://github.com/storj/storj/issues/5568

Change-Id: Id413637c2678e7a7cf8dbf414e082c687c8e8a39
2023-02-15 17:46:25 +00:00
JT Olio
522aed083d private/server,satellite/contact,misc: use new storj/common noise helpers
this change uses the new storj/common noise helpers, which:
 * add a security fix (require an expected node id for validating
   noise key attestations)
 * stops doing an unnecessary order signature validation (it's
   already been done inside of PutPiece)
 * removes some duplicate code

Change-Id: I5e67a08ff216cd9c5b0b82e40b4d9de664b6b0fc
2023-02-07 09:53:45 -05:00
paul cannon
0e2fef977f satellite/overlay: add SetAllContainedNodes method to overlay.DB
We will be needing an infrequent chore to check which nodes are in the
reverify queue and synchronize that set with the 'contained' field in
the nodes db, since it is easily possible for them to get out of sync.
(We can't require that the reverification queue table be in the same
database as the nodes table, so maintaining consistency with SQL
transactions is out. Plus, even if they were in the same database, using
such SQL transactions to maintain consistency would be slow and
unwieldy.)

This commit adds a method to the overlay allowing the caller to set the
contained status of all nodes in the nodes table at once. This is valid
because our definition of "contained" now depends solely on whether a
node appears at least once in the reverification queue. Only rows whose
contained field does not match the expectation will be updated; the
contained timestamp will not be updated for a node which is supposed to
be contained and was already contained.

Change-Id: I8cabe56ad897b6027e11aa5b17175295391aa3ac
2023-02-06 10:18:54 +00:00
JT Olio
686faeedbd satellite/overlay: return noise info with selected nodes
we have two more fields in the database (noise_proto and
noise_public_key) that now need to go into pb.NodeAddress when
returning AddressedOrderLimits.

the only real complication is making sure type conversions between
database types and NodeURLs and so on don't lose this new
pb.NodeAddress field (NoiseInfo). otherwise this is a relatively
straightforward commit

Change-Id: I45b59d7b2d3ae21c2e6eb95497f07cd388d454b3
2023-02-02 15:46:27 +00:00
JT Olio
2753d5a32f satellite/overlay: keep track of noise info per node
Change-Id: Icef04c3e87dbf4bb57d3837274c323bf6dd2c81f
2023-02-01 23:03:35 -05:00
Cameron
0596651580 satellite/satellitedb: fix updating nodes.last_software_update_email
The CASE expression used to determine which value to set
last_software_update_email to did not have an ELSE clause. Therefore,
when the node is both below the minimum version and did not receive a
version update email (no condition is true), the value would be set to
NULL.

Additionally, replace `time.Now()` with `timestamp` in the check to
determine if the email cooldown has passed.

Change-Id: I2e2e93f1a865e123ed8b665be9621cebfb72236f
2023-02-01 17:25:58 -05:00
paul cannon
b6bcb32ecf satellite/reputation: more accurate "reputation changes" list
`overlay.(*Service).UpdateReputation()` takes a "reputationChanges"
parameter, a slice of node events indicating whether we think the node's
disqualification or suspension status is changing. This is necessary so
that the overlay service can notify the nodeevents DB about these
changes.

In several cases, however, this list of events is not constructed
correctly, because of missing information about the previous state.
In most cases, this is because the node was offline, and the order limit
creation functions (which usually obtain and return the prior reputation
status) ignored that node.

This change makes it so that all callers to
`overlay.(*Service).UpdateReputation()` can be expected to provide a
correct list of change events (as correct as feasible, given that we
can't lock the node's information in the database during the entire
operation).

It ended up that there was only one caller we needed to worry about, and
that was reputation.(*Service).ApplyAudit(). So the bulk of this change
is teaching that function how to recognize when the prior reputation
status was not filled in, and fill it in.

Refs: https://github.com/storj/storj/issues/5464
Change-Id: I52ce385fc9c0ce3b283b998d517998e7f4ec8792
2023-01-31 18:39:40 +00:00
JT Olio
e40191afd6 storj: upgrade to use latest storj/common NodeAddress
Change-Id: I5987391bcfe5f6dfd7b525698c337a4cbda9b76e
2023-01-25 01:37:26 +00:00
Fadila Khadar
5c3a148d6e satellite/overlaycache: fix typo in UpdateCheckIn request
- fix a malformed SQL query
- add test to be sure we don't have this problem again.

Change-Id: I3fde8c59ba01335411e51d964bec95bc26cfc961
2022-12-14 22:21:45 +01:00
paul cannon
ed0fa59f23 satellite/overlay: add SetNodeContained() method
SetNodeContained() will change the contained flag in the nodes table,
which will affect whether nodes are selected for new uploads. This flag
_should_ correlate with whether or not a given node has any entries in
the reverification queue. However, the reverification queue is intended
to be 'safely partitionable' from the nodes table, so we can't enforce
that characteristic transactionally. But this is ok; there are no dire
consequences if they are out of sync.

We will be adding a chore that updates the contained flag based on the
contents of the reverification queue periodically, if something fails
to set it directly when appropriate.

Refs: https://github.com/storj/storj/issues/5231
Change-Id: I26460d8718dee63fd55d00a44568b2065fc8fe30
2022-12-01 12:43:40 +00:00
Cameron
87660bd9b3 satellite/overlay/offlinenodes: insert offline nodes into node events
Add a new chore to periodically insert nodes who are offline and
have not gotten an offline email in a certain amount of time into node
events

Change-Id: I658b385bb777b0240c98092946a93d65bee94abc
2022-11-18 12:10:06 -05:00