Storage Node Satellite Selection design (#3353)
This commit is contained in:
parent
f32ff4ec79
commit
0885eaa183
304
docs/blueprints/storage-node-satellite-selection.md
Normal file
304
docs/blueprints/storage-node-satellite-selection.md
Normal file
@ -0,0 +1,304 @@
|
||||
# Storage Node Satellite Selection
|
||||
|
||||
## Abstract
|
||||
|
||||
This document details an enhanced method of Satellite selection and maintenance
|
||||
for Storage Node operators.
|
||||
|
||||
## Background.
|
||||
|
||||
With the removal of Kademlia, Storage Nodes need a way to identify and select
|
||||
Satellite's with whom to interact. The implementation of Satellite selection is
|
||||
currently accomplished via a list of whitelisted Satellite URLs in the
|
||||
configuration file. The list defaults to well-known satellites hard-coded into
|
||||
the storage node binary. This method is simple and easy to configure at first
|
||||
time setup, but unfortunately requires manual maintenance of the list going
|
||||
forward.
|
||||
|
||||
The ideal solution would be just as easy to set up in the common case while
|
||||
removing the burden of future maintenance.
|
||||
|
||||
## Design
|
||||
|
||||
The proposed design is to discover trusted Satellites from externally
|
||||
maintained lists from trusted sources with the ability to manually trust/block
|
||||
Satellites.
|
||||
|
||||
### Concepts
|
||||
|
||||
#### Satellite URL
|
||||
|
||||
A Satellite URL holds all the information needed to contact and identify a
|
||||
Satellite. It is comprised of an optional scheme (i.e. `storj://`), an ID, and
|
||||
an address.
|
||||
|
||||
The address **MUST** contain both a host and port for the purposes of this
|
||||
feature.
|
||||
|
||||
The following are all examples of valid Satellite URLs:
|
||||
|
||||
```
|
||||
12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S@us-central-1.tardigrade.io:7777
|
||||
storj://12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S@us-central-1.tardigrade.io:7777
|
||||
```
|
||||
|
||||
The following are invalid Satellite URLs (missing or partial IDs):
|
||||
|
||||
```
|
||||
us-central-1.tardigrade.io:7777
|
||||
12EayRS2@us-central-1.tardigrade.io:7777
|
||||
storj://us-central-1.tardigrade.io:7777
|
||||
storj://12EayRS2@us-central-1.tardigrade.io:7777
|
||||
```
|
||||
|
||||
#### Trusted Satellite List
|
||||
|
||||
The Trusted Satellite List is a text document where each line represents the
|
||||
Satellite URL of a trusted Satellite.
|
||||
|
||||
```
|
||||
12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S@us-central-1.tardigrade.io:7777
|
||||
121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6@asia-east-1.tardigrade.io:7777
|
||||
```
|
||||
|
||||
### Storage Node Configuration
|
||||
|
||||
The Storage Node configuration for Satellite selection is a list of one or more
|
||||
entries, where each entry is one of the following:
|
||||
|
||||
* Trusted Satellite List URL
|
||||
* Trusted Satellite URL
|
||||
* Untrusted Satellite URL
|
||||
|
||||
#### Trusted Satellite List URL Entry
|
||||
|
||||
This entry contains a URL where a Trusted Satellite List can be downloaded.
|
||||
|
||||
Supported schemes are `file://`, `http://` and `https://`. When using HTTP an
|
||||
`https://` URL should be preferred over an `http://` URL to ensure transport
|
||||
security and prevent a person-in-the-middle from tampering with the list.
|
||||
|
||||
Examples:
|
||||
|
||||
```
|
||||
https://www.tardigrade.io/trusted-satellites
|
||||
file:///some/path/to/trusted-satellites.txt
|
||||
```
|
||||
|
||||
#### Trusted Satellite URL Entry
|
||||
|
||||
This entry contains the URL to an explicitly trusted Satellite. The format of
|
||||
the entry is a Satellite URL.
|
||||
|
||||
#### Untrusted Satellite Entry
|
||||
|
||||
This entry contains the URL to an explicitly untrusted Satellite. The format of
|
||||
the entry is a `!` followed by one of the following:
|
||||
|
||||
* Satellite ID followed by an `@` (to distinguish it from a host)
|
||||
|
||||
```
|
||||
!121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6@
|
||||
```
|
||||
|
||||
* Satellite host. If the host is a domain, then subdomains are also untrusted
|
||||
(i.e. `!tardigrade.io` will block `us-central-1.tardigrade.io`)
|
||||
|
||||
```
|
||||
!tardigrade.io
|
||||
!us-central-1.tardigrade.io
|
||||
!us-east-1.tardigrade.io
|
||||
```
|
||||
|
||||
* Satellite URL
|
||||
|
||||
```
|
||||
!121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6@us-central-1.tardigrade.io:7777
|
||||
```
|
||||
|
||||
### Building the List of Trusted Satellite URLs
|
||||
|
||||
To build the list of trusted Satellite URLs each entry in the configuration is
|
||||
traversed in order, parsed, and processed accordingly:
|
||||
|
||||
1. If the entry begins with a `!`, it represents an untrusted Satellite entry. It is added to the `untrusted` list, which is used later.
|
||||
1. If the entry begins with `file://`, `http://`, or `https://`, it is a Trusted Satellite List URL. The URL is used to fetch a list of Trusted Satellite URLs that are added into the `trusted` list.
|
||||
1. If the entry begins with `storj://`, or otherwise does not use a scheme, it is a trusted Satellite URL entry. It is added into the `trusted` list.
|
||||
1. If an entry does not match any of the above, it is a configuration error.
|
||||
|
||||
If a Trusted Satellite List cannot be fetched a warning should be logged. If
|
||||
available, the last known copy from the Trusted Satellite List URL should be
|
||||
used. Storage Nodes should attempt to persist the downloaded lists. If they
|
||||
cannot, a warning should be logged.
|
||||
|
||||
After all configuration entries have been processed, each URL in the `trusted`
|
||||
list is checked against the `untrusted` list and removed if it matches an entry
|
||||
. An `untrusted` entry matches a URL using the following criteria:
|
||||
|
||||
* When the `untrusted` entry is just a Satellite ID, it matches any URL with
|
||||
that ID.
|
||||
* When the `untrusted` entry is just a host, it matches any URL with the same
|
||||
host. If the host is a domain name, then the entry also matches URLs within a
|
||||
subdomain of that host.
|
||||
* When the `untrusted` entry is a full Satellite URL, it matches any URL that
|
||||
is equal.
|
||||
|
||||
After the `trusted` list has been pruned, the remaining URLs are aggregated
|
||||
according to the following rules:
|
||||
|
||||
* A Satellite URL is considered _authoritative_ if it matches either of the
|
||||
following criteria:
|
||||
* Configured via a Trusted Satellite URL entry
|
||||
* Configured via a `file://` URL
|
||||
* Configured via an `https://` or `http://` Trusted Satellite List URL AND matches the domain or is a subdomain of the domain name in the Trusted Satellite List URL.
|
||||
* Satellite URLs are equivalent if the address portions are equal.
|
||||
* When aggregating equivalent Satellite URLs (i.e. address matches) with
|
||||
differing IDs, the _authoritative_ Satellite URL wins. If neither or both are
|
||||
_authoritative_, the one aggregated first wins.
|
||||
|
||||
|
||||
#### Example
|
||||
|
||||
Consider the following Trusted Satellite List URLs and their contents. For
|
||||
brevity sake, the full ID of each URL is being shortened (real configurations
|
||||
**MUST** specify the full ID).
|
||||
|
||||
* `file:///path/to/some/trusted-satellites.txt`
|
||||
|
||||
```
|
||||
1@bar.test:7777
|
||||
```
|
||||
|
||||
* `https://foo.test/trusted-satellites`
|
||||
|
||||
```
|
||||
2@f.foo.test:7777
|
||||
2@buz.test:7777
|
||||
2@qiz.test:7777
|
||||
5@ohno.test:7777
|
||||
```
|
||||
|
||||
* `https://bar.test/trusted-satellites`
|
||||
|
||||
```
|
||||
3@f.foo.test:7777
|
||||
3@bar.test:7777
|
||||
3@baz.test:7777
|
||||
3@buz.test:7777
|
||||
3@quz.test:7777
|
||||
```
|
||||
|
||||
* `https://baz.test/trusted-satellites`
|
||||
|
||||
```
|
||||
4@baz.test:7777
|
||||
4@qiz.test:7777
|
||||
4@subdomain.quz.test:7777
|
||||
```
|
||||
|
||||
Now consider the following configuration:
|
||||
|
||||
```
|
||||
- !quz.test
|
||||
- file:///path/to/some/trusted-satellites.txt
|
||||
- https://foo.test/trusted-satellites
|
||||
- https://bar.test/trusted-satellites
|
||||
- https://baz.test/trusted-satellites
|
||||
- 0@f.foo.test:7777
|
||||
- !2@qiz.test:7777
|
||||
- !5
|
||||
```
|
||||
|
||||
After expanding each entry, we have the following unaggregated `trusted` list:
|
||||
|
||||
```
|
||||
1@bar.test:7777 (authoritative due to file:// URL)
|
||||
2@f.foo.test:7777 (authoritative due to foo.test domain)
|
||||
2@buz.test:7777
|
||||
2@qiz.test:7777
|
||||
5@ohno.test:7777
|
||||
3@f.foo.test:7777
|
||||
3@bar.test:7777 (authoritative due to bar.test domain)
|
||||
3@baz.test:7777
|
||||
3@buz.test:7777
|
||||
3@quz.test:7777
|
||||
4@baz.test:7777 (authoritative due to baz.test domain)
|
||||
4@qiz.test:7777
|
||||
4@subdomain.quz.test:7777
|
||||
0@f.foo.test:7777 (authoritative due to explicit configuration)
|
||||
```
|
||||
|
||||
And the following `untrusted` list:
|
||||
|
||||
```
|
||||
quz.test
|
||||
2@qiz.test:7777
|
||||
5
|
||||
```
|
||||
|
||||
The `trusted` list is pruned with the `untrusted` list, leaving the following `trusted` list:
|
||||
|
||||
```
|
||||
1@bar.test:7777 (authoritative due to file:// URL)
|
||||
2@f.foo.test:7777 (authoritative due to foo.test domain)
|
||||
2@buz.test:7777
|
||||
3@f.foo.test:7777
|
||||
3@bar.test:7777 (authoritative due to bar.test domain)
|
||||
3@baz.test:7777
|
||||
3@buz.test:7777
|
||||
4@baz.test:7777 (authoritative due to baz.test domain)
|
||||
4@qiz.test:7777
|
||||
0@f.foo.test:7777 (authoritative due to explicit configuration)
|
||||
```
|
||||
|
||||
We aggregate from top to bottom (i.e. in the order they were specified/fetched)
|
||||
and are left with the following:
|
||||
|
||||
```
|
||||
1@bar.test:7777
|
||||
2@f.foo.test:7777
|
||||
2@buz.test:7777
|
||||
4@baz.test:7777
|
||||
4@qiz.test:7777
|
||||
```
|
||||
|
||||
* `1@bar.test:7777` was selected because even though `3@bar.test:7777` was also
|
||||
authoritative, `1@bar.test:7777` came first.
|
||||
* `2@f.foo.test:7777` was selected because it was authoritative over
|
||||
`3@f.foo.test:7777` and came before `0@f.foo.test`.
|
||||
* `2@buz.test:7777` was selected because it came before `3@buz.test:7777` (neither was authoritative)
|
||||
* `4@baz.test:7777` was selected it was authoritative over `3@baz.test:7777`, even though the latter came first.
|
||||
* `4@qiz.test:7777` was selected because it was the only URL for `qiz.test:7777`
|
||||
|
||||
### Rebuilding the List of Trusted Satellite URLs
|
||||
|
||||
The list of trusted Satellite URLs should be recalculated daily (with some jitter).
|
||||
|
||||
### Backwards Compatability
|
||||
|
||||
The old piecestore configuration (i.e. `piecestore.OldConfig`) currently contains a
|
||||
comma separated list of trusted Satellite URLs (`WhitelistedSatellites`). It
|
||||
defaults to the current list of known good satellites. On startup, if the new
|
||||
configuration is unset, then the old configuration should be used to form
|
||||
a fixed set of trusted Satellite URLs.
|
||||
|
||||
## Open Issues
|
||||
|
||||
* How long should storage nodes use cached/persisted lists for? Should lists be persisted at all?
|
||||
* If aggregation yields no URLs (list URL unreachable) should we default to anything? How should this be reported?
|
||||
* If block listing removes all URLs, how should this be reported?
|
||||
* Can we safely auto-migrate storage nodes into this new method of management?
|
||||
* How long does the storage node wait before garbage collecting pieces from Satellites it no longer trusts? Should this be manual operation?
|
||||
|
||||
## To Do
|
||||
|
||||
* Implement an endpoint at `https://www.tardigrade.io/trusted-satellites` to return the default list of trusted Satellites.
|
||||
* Implement a `trust.ListConfig` configuration struct which:
|
||||
* Contains the list of entries (with a release default of a single list containing `https://www.tardigrade.io/trusted-satellites`)
|
||||
* Contains a refresh interval
|
||||
* Maintains backwards compatability with `WhitelistedSatellites` in `piecestore.OldConfig`
|
||||
* Implement `storj.io/storj/storagenode/trust.List` that:
|
||||
* Consumes `trust.ListConfig` for configuration
|
||||
* Performs the initial fetching and building of trusted Satellite URLs
|
||||
* Updates according to the refresh interval (with jitter)
|
||||
* Refactor `storj.io/storj/storagenode/trust.Pool` to use `trust.List`
|
Loading…
Reference in New Issue
Block a user