Commit Graph

195 Commits

Author SHA1 Message Date
Egon Elbre
ee71fbb41d storagenode/piecestore: start restore trash in the background
Starting restore trash in the background allows the satellite to
continue to the next storagenode without needing to wait until
completion.

Of course, this means the satellite doesn't get feedback whether it
succeeds successfully or not. This means that the restore-trash needs to
be executed several times.

Change-Id: I62d43f6f2e4a07854f6d083a65badf897338083b
2022-12-16 18:15:52 +02:00
Andrew Harding
4fdea51d5c storagenode/storagenodedb: faster test db init
Running all of the migrations necessary to initialize a storage node
database takes a significant amount of time during runs.

The package current supports initializing a database from manually coalesced
migration data (i.e. snapshot) which improves the situation somewhat.

This change takes things a bit further by changing the snapshot code to
instead hydrate the database directory from a pre-generated snapshot zip
file.

name                                old time/op  new time/op  delta
Run_StorageNodeCount_4/Postgres-16   2.50s ± 0%   0.16s ± 0%   ~     (p=1.000 n=1+1)

Change-Id: I213bbba5f9199497fbe8ce889b627e853f8b29a0
2022-12-01 20:45:36 +00:00
Clement Sam
59b37db670 storagenode: overhaul QUIC check implementation
The current implementation blocks the the startup until one or none
of the trusted satellites is able to reach the node via QUIC.
This can cause delayed startup. Also, the quic check is done
once during startup, and if there is a misconfiguration later,
snos would have to restart to node.

In this change, we reuse the contact service which pings the satellite
periodically for node checkin. During checkin the satellite tries
pinging the node back via both TCP and QUIC and reports both statuses.
WIth this, we are able to get a periodic update of the QUIC status
without restarting the node.

Also adds the time the node was last pinged via QUIC to the tooltip
on the QUIC status tab.

Resolves https://github.com/storj/storj/issues/4398

Change-Id: I18aa2a8e8d44e8187f8f2eb51f398fa6073882a4
2022-11-09 03:15:57 +00:00
Márton Elek
7e71986493 storagenode: accept HTTP calls on public port, listening for monitoring requests
Today each storagenode should have a port which is opened for the internet, and handles DRPC protocol calls.

When we do a HTTP call on the DRPC endpoint, it hangs until a timeout.

This patch changes the behavior: the main DRPC port of the storagenodes can accept HTTP requests and can be used to monitor the status of the node:

 * if returns with HTTP 200 only if the storagnode is healthy (not suspended / disqualified + online score > 0.9)
 * it CAN include information about the current status (per satellite). It's opt-in, you should configure it so.

In this way it becomes extremely easy to monitor storagenodes with external uptime services.

Note: this patch exposes some information which was not easily available before (especially the node status, and used satellites). I think it should be acceptable:

 * Until having more community satellites, all storagenodes are connected to the main Storj satellites.
 * With community satellites, it's good thing to have more transparency (easy way to check who is connected to which satellites)

The implementation is based on this line:

```
http.Serve(NewPrefixedListener([]byte("GET / HT"), publicMux.Route("GET / HT")), p.public.http)
```

This line answers to the TCP requests with `GET / HT...` (GET HTTP request to the route), but puts back the removed prefix.

Change-Id: I3700c7e24524850825ecdf75a4bcc3b4afcb3a74
2022-08-26 09:38:09 +00:00
Márton Elek
6b27c64833 testplanet: support snapshot based migration for storagenode
Similar to the existing snapshot based tests of satellite/metabase db we make a migration here which is:
 * dedicated to unit tests
 * faster (with less steps)
 * but safe: additional unit test ensures that the snapshot based migration and normal prod migration have the same results.

Change-Id: Ie324b09f64b4553df02247a9461ece305a6cf832
2022-08-22 09:46:27 +00:00
Óscar de Arriba
4fdb81c510
storagenode/pieces: allow to configure initial piece scan (#5024)
* Allow configure initial piece scan

* Update tests to include new flag

* Update default config specification

* Rename configuration flag

* Rename variable and fix formatting

* Fix format

* Fix typo

Co-authored-by: Stefan Benten <mail@stefan-benten.de>
Co-authored-by: Clement Sam <clementsam75@gmail.com>
Co-authored-by: littleskunk <jens.heimbuerge@googlemail.com>
2022-08-07 22:40:59 +00:00
Egon Elbre
e9692c5681 storagenode/gracefulexit: remove unused interface
Change-Id: Ie6c3d69f5177872d8f4308ac476bc87655da9e4b
2022-08-04 11:26:14 +03:00
Egon Elbre
dc0f7b5f77 storagenode,web/storagenode: use go:embed for assets
Go can now directly embed files without relying on external tools.
This makes code use go:embed and avoid the external tooling.

go:embed requires files to be present in the embedded directory,
hence we need to add .keep to "dist" folder. We also add one to
public/.keep, such that it won't be deleted when building storagenode.

Change-Id: I8bef81236be6829ed37ed4c16ef693677b93a631
2022-03-11 16:01:28 +02:00
Clement Sam
8b40a071a0 storagenode: check if QUIC is properly configured
This change adds a check to confirm if UDP port if properly configured for QUIC

Resolves https://github.com/storj/storj/issues/4332
Partly resolves https://github.com/storj/storj/issues/4358

Change-Id: I9a66f26a115e48b4fcd168f50a7d0b4d81712f4e
2022-01-20 12:04:04 +00:00
Clement Sam
589c82f23e storagenode: display quic status on SNO dashboard
This change includes storagenode QUIC status on SNO dashboard.
If disabled, it displays warning for SNOs to foward their
UDP port for quic.

Change-Id: I8d28c9c0f5f1e90d80b7c18b9e1e7b78c5e45609
2021-12-23 12:22:24 +00:00
igor gaidaienko
92deef4f34 Revert "storagenode/payouts: historical payouts use satellitesDB instead of trustPool"
This reverts commit 4a98dd40e2.
2021-08-02 13:55:21 +03:00
Qweder93
4a98dd40e2 storagenode/payouts: historical payouts use satellitesDB instead of trustPool
Change-Id: I39f4215f4ebf91bd1b38fbcb5c58e6ba53ceff1b
2021-07-15 16:19:18 +03:00
Qweder93
4d0fe39235 storagenode/satellites: address added, caching satellite's addresses from trust
Change-Id: Ica3eea5b8d81b176c6a4385fea803730b08ece16
2021-07-08 15:38:23 +00:00
Qweder93
6c02258925 storagenode/peer: register multinode Payouts endpoint
Change-Id: Iaaba0157e605bb51a73244517986d6b49a59d8d5
2021-07-06 10:09:24 +00:00
Qweder93
b9d7874dc2 multinode: old drpc api returned
drpc api with old names added to prevent breaking
backwards compatibility.

Change-Id: I1b8a41f2c1e0bd11ac83c86f0b1bfbfc1f07378f
2021-06-22 23:02:47 +00:00
Yaroslav Vorobiov
157d3d980e multinode/console: storage usage and total storage usage
Change-Id: I4970275daf4a4a9c5d02aea6a205891869dd4eff
2021-06-10 16:01:41 +00:00
crawter
6d9b91d435 private/multinodepb: drpc operators controller added
Change-Id: Ie9bad9d5dba3e508d4cb8165aaca88d98ef1304d
2021-06-01 11:00:31 +00:00
Qweder93
91b7e24d55 multinode/payouts: naming refactoring
Renamed methods, error messages.

Change-Id: I7d7b6b092c05bbc5bf1322855efc5ccb9b312671
2021-05-31 19:47:12 +00:00
Yaroslav Vorobiov
23f9beb635 multinode/console: held amount summary
Change-Id: Ia800748343e363d930ce0a0b9ab286b5abdc96af
2021-05-28 20:05:16 +03:00
Egon Elbre
10372afbe4 ci: fix lint errors
Change-Id: Ib5893440807811f77175ccd347aa3f8ca9cccbdf
2021-05-17 13:37:31 +00:00
Yingrong Zhao
59f443e71a storagenode/contact: add authentication for PingNode endpoint
Currently, if a node has untrusted a satellite, the satellite can still
successfully ping the node. If a node decide to untrust a satellite, the
satellite should also mark it as conact failed

Change-Id: Idf80fa00d9849205533dd3e5b3b775b5b9686705
2021-05-12 18:15:32 +00:00
Qweder93
a11698f370 multinode/payouts: estimated payouts added
estimated payouts for specific/all satellites added.

Change-Id: I2530c9f1775593588e2a8f6c087ce6b4f9e354c4
2021-05-11 11:33:32 +00:00
Egon Elbre
7802ab714f pkg/,private/: merge with private package
Initially there were pkg and private packages, however for all practical
purposes there's no significant difference between them. It's clearer to
have a single private package - and when we do get a specific
abstraction that needs to be reused, we can move it to storj.io/common
or storj.io/private.

Change-Id: Ibc2036e67f312f5d63cb4a97f5a92e38ae413aa5
2021-04-23 16:37:28 +03:00
NickolaiYurchenko
034eb2c214 storagenode: wallet features
WHAT: updated dashboard type to return wallet features from storagenode api

Change-Id: Ifdbdeae8b9ae239eaebe32d0a093557154f777ad
2021-03-24 09:08:36 +00:00
Yaroslav Vorobiov
966535e9de {storagenode,satellite}/nodeoperator: add wallet features
Change-Id: Iac7eb40a52b8fddcc573aebaad2e3a30a10cded9
2021-02-08 22:09:45 +02:00
Yingrong Zhao
7e80badaf9 pkg/server,pkg/quic: accept an existing conn to create quic listener and
allow disabling tcp/quic

In order to have more control of a server so that we can
simulate connection failures in `testplanet`, this PR changes
quic.Listener to accept an existing UDPConn instead of relying on the
quic-go library to create the UDPConn.
This PR also adds two flags on the `server.Config` struct to allow
enabling/disabling tcp/tls listener and quic listener. By default, they
are both set to true.
    - `DisableTCPTLS`: internal flag, disables tcp/tls listener.
    - `DisableQUIC`: hidden flag, disables quic listener
By making the `DisableQUIC` a hidden flag, it allows storagenode operators to
have the ability to disable quic traffic in case their set up can't work
with udp traffic.

Change-Id: I853b12435d988b9c41ad9b873fd57480d792e378
2021-02-03 12:04:29 -05:00
Qweder93
c139cbd76b storagenode/payouts: fix CurrentMonthExpectations timezone handling. Estimations based on node's join date.
On servers with non-UTC it would have calculated a different month boundary.
If node joined in current month calculations will be related on amount of days node've been working.

Change-Id: Ie572b197f50c6cdff5a044a53dfb5b9138f82f24
2021-01-25 19:03:30 +02:00
Qweder93
6ba8f6c8a9 storanode, satellite: payout renamed to payouts, expected estimation payouts added, console api for audits reworked
Change-Id: I4aa5e99bffaa87d0a800a429a4c83aa498ad4b7b
2021-01-18 10:56:03 +00:00
Yaroslav Vorobiov
6507f3ebc6 multinode/console: trusted satellites list api
Change-Id: I97bb9efb1d6cb7d456df0b86e66417c31018b762
2021-01-08 14:50:12 +02:00
Yaroslav Vorobiov
5a43c86b68 multinode/console: list node satellite infos
Change-Id: Ic6cb8d1a6fd7637fdb7bf49e040c43ac30ab1bbf
2021-01-05 14:49:58 +00:00
Yaroslav Vorobiov
fb00d099cf multinode/console: list node infos
Change-Id: I5cac49feff2bac6fbd7ac61dfccffd672da8e8c0
2021-01-05 14:49:11 +00:00
Stefan Benten
494bd5db81
all: golangci-lint v1.33.0 fixes (#3985) 2020-12-05 17:01:42 +01:00
Qweder93
a17cd9aa3e storageode/apikey: added service, CLI issue api key
Change-Id: I840cd0fdbd8dca884eefbd111f21fd3990c11e68
2020-11-18 10:40:17 +00:00
Qweder93
8dc10e32ad stefan benten satellited added to historical payout data
Change-Id: I1177b2d2ef10d514f7d401e29891fa7dd964e9ac
2020-11-09 15:43:41 +00:00
Egon Elbre
1903b15474 storagenode/internalpb: move gracefulexit.proto
Change-Id: Ia3614846ed49a39c8f39331516d16d45a695240b
2020-10-30 15:24:56 +02:00
Egon Elbre
cda67a659a storagenode/internalpb: move inspector.proto
Change-Id: I951379c3b2ff00d1bc09d6a49c026a7e723432d6
2020-10-30 14:51:26 +02:00
Qweder93
f5ba8b8009 storagenode/suspensions: added offline-suspension notificatio chore + tests
Change-Id: I2521cd2e7d08a1dd379e717a554a026c7508c18f
2020-10-29 19:44:22 +02:00
Qweder93
624255e8ba storagenode/secret: db tests added, small renaming fixes added
Change-Id: I7eae1a9a64c20a39c97e81fa741cfc9b9e1e615a
2020-10-29 14:23:04 +02:00
paul cannon
76d4977b6a storagenode/gracefulexit: logic moved from worker to service
Change-Id: I8b12606a96b712050bf40d587664fb1b2c578fbc
2020-10-22 23:19:30 +00:00
Qweder93
9df74338a8 storagenode: secret db and service added
Change-Id: I91257e5adc4fc6711653f30c118e476ed1c95b6b
2020-10-16 13:24:33 +00:00
Moby von Briesen
aa86c0889c storagenode/console: Add current storage used per satellite to storagenode api
Right now, the best way for a storage node operator to get the current
space used for each satellite is to run the `storagenode exit-satellite`
command for graceful exit, and cancel at the second confirmation prompt.
This is convoluted and the data is readily available from the Blobs
Usage Cache.

This change adds the current space used by each satellite to the
endpoints `/api/sno` and `/api/sno/satellite/<Satellite ID>`

Change-Id: I2173005bb016fc76db96fd598d26b485e5b2aa0b
2020-10-14 21:30:28 +00:00
Egon Elbre
cf2dd76db7 cmd/satellite: proper log usage
log.Fatal immediately terminates the program without running any defers.
We should properly close all the services and databases.

Change-Id: I5e959cef3eafedeacb3a2062e3da47e8d04e8e75
2020-10-13 16:56:35 +03:00
Egon Elbre
2268cc1df3 all: fix linter complaints
Change-Id: Ia01404dbb6bdd19a146fa10ff7302e08f87a8c95
2020-10-13 15:59:01 +03:00
Qweder93
8182fdad0b storagenode: heldamount renamed to payouts, renamed some methods and structs to more meaningful names. grouped estimated payout with pathouts
satellite: heldamount renamed to SNOpayouts.

Change-Id: I244b4d2454e0621f4b8e22d3c0d3e602c0bbcb02
2020-09-16 14:57:35 +00:00
Moby von Briesen
789b07e226 storagenode/orders/store.go: Do not return error from ListUnsentBySatellite when order files are corrupted.
If we see an UnexpectedEOF error when attempting to read orders, return
the orders we have been able to read successfully and do not return an
error. This behavior ensures that the storagenode orders service
attempts to archive corrupted files and does not retry them repeatedly
and get stuck.

Change-Id: I0d00d1e174f968af6e99ca861eddad190f1339e2
2020-09-10 23:36:05 +00:00
Qweder93
ac29d80495 storagenode: heldamount GetPaystub refactored, estimationPayouts logic separated form console to separate service, storagenodeapi tests fixed.
Change-Id: I902823ef40a62861ce32799e9fb7a67a1e14710d
2020-09-09 15:31:16 +00:00
Jeff Wendling
91698207cf storagenode: live tracking of order window usage
This change accomplishes multiple things:

1. Instead of having a max in flight time, which means
   we effectively have a minimum bandwidth for uploads
   and downloads, we keep track of what windows have
   active requests happening in them.

2. We don't double check when we save the order to see if it
   is too old: by then, it's too late. A malicious uplink
   could just submit orders outside of the grace window and
   receive all the data, but the node would just not commit
   it, so the uplink gets free traffic. Because the endpoints
   also check for the order being too old, this would be a
   very tight race that depends on knowledge of the node system
   clock, but best to not have the race exist. Instead, we piggy
   back off of the in flight tracking and do the check when
   we start to handle the order, and commit at the end.

3. Change the functions that send orders and list unsent
   orders to accept a time at which that operation is
   happening. This way, in tests, we can pretend we're
   listing or sending far into the future after the windows
   are available to send, rather than exposing test functions
   to modify internal state about the grace period to get
   the desired effect. This brings tests closer to actual
   usage in production.

4. Change the calculation for if an order is allowed to be
   enqueued due to the grace period to just look at the
   order creation time, rather than some computation involving
   the window it will be in. In this way, you can easily
   answer the question of "will this order be accepted?" by
   asking "is it older than X?" where X is the grace period.

5. Increases the frequency we check to send up orders to once
   every 5 minutes instead of once every hour because we already
   have hour-long buffering due to the windows. This decreases
   the maximum latency that an order will be reported back to
   the satellite by 55 minutes.

Change-Id: Ie08b90d139d45ee89b82347e191a2f8db1b88036
2020-08-19 19:42:33 +00:00
Moby von Briesen
708cb48aa6 storagenode/orders: implement orders filestore on storagenode
* Add all new orders to the orders filestore instead of the database.
* Submit orders from the filestore to the new satellite SettleWindow
endpoint.

The orders filestore will eventually replace the orders DB completely.
For now, we will still be checking the orders DB and submitting those
orders if they exist. In a later release, we will completely remove the
orders DB, but we need both the DB and filestore for the transitionary
period.

Change-Id: Iac8780fd5ab770296181bbd313e1d335f072d4dc
2020-08-19 15:00:35 +00:00
Qweder93
7b4a8c4d6d storagenode/heldamount: payoutHistory added
Change-Id: I93dd3d024085d19ecff76075e52bf66796207fd6
2020-07-14 17:35:03 +03:00
Qweder93
f73e92c268 storagenode/gracefulexit: added blobs clean
on node's start checks if any of trusted satellites has GE status "Exited successfully"
if so - trying to delete blobs/satellite folder, so no trash left on SNO.

Change-Id: I566266c84f2a872df54cd01bc2f15a9934f138ed
2020-07-13 11:49:18 +00:00