Commit Graph

764 Commits

Author SHA1 Message Date
Erik van Velzen
e6b5501f9b satellite/gc/sender: new service to send retain filters
Implement a new service to read retain filter from a bucket and
send them out to storagenodes.

This allows the retain filters to be generated by a separate command on
a backup of the database.

Paralellism (setting ConcurrentSends) and end-to-end garbage collection
tests will be restored in a subsequent commit.

Solves https://github.com/storj/team-metainfo/issues/121

Change-Id: Iaf8a33fbf6987676cc3cf74a18a8078916fe673d
2022-09-20 11:49:40 +00:00
Clement Sam
07beef378d storagenode/collector: delete expired piece info if file does not exist
The collector tries deleting a piece over and over again, though
the piece does not exist on the storagenode's filesystem.
We need to delete the piece info from the expired db if the
targeted file does not exist.
This does not resolve the base problem of why the file
is deleted before the collector tries deleting it.
This change deletes the piece info from the expired db
if the file does not exist, since we're already trying
to delete that piece anyway.

Closes https://github.com/storj/storj/issues/4192

Change-Id: If659185ca14f1cb29fd3c4237374df6fcd535df8
2022-09-15 12:29:29 +00:00
Clement Sam
a848c29b9b storagenode/nodestats: add monkit metrics for reputation scores
Closes https://github.com/storj/storj/issues/4835

Change-Id: Ib56e34145b962bede3525066f9bd7ef950d21e9b
2022-09-15 08:43:48 +00:00
Clement Sam
64e5fb7772 storagenode/collector: fix error check when file does not exist
The collector calls the Delete() method on the pieces
which returns an error which is wrapped by many error classes.
Delete() method is using Stat() from
1aec831d98/storage/filestore/dir.go (L328)
under the hood.
os.IsNotExist(errors.Unwrap(err) will always be false unless
errors.Unwrap(err) is called multiple times till it gets to
the core os.ErrNotExist. Here is a test case to explain better:

    func TestABC(t *testing.T) {
	classA := errs.Class("A")
	classB := errs.Class("B")

	wrappedError := classB.Wrap(classA.Wrap(os.ErrNotExist))

	require.True(t, os.IsNotExist(errs.Unwrap(wrappedError)))
	require.True(t, os.IsNotExist(errors.Unwrap(wrappedError)))
    }

Using errs.Is() seems to resolve this even without unwrapping the error:

    func TestABC(t *testing.T) {
	classA := errs.Class("A")
	classB := errs.Class("B")

	wrappedError := classB.Wrap(classA.Wrap(os.ErrNotExist))

	require.True(t, errs.Is(wrappedError, os.ErrNotExist))
	require.False(t, errs.Is(wrappedError, os.ErrExist))
	require.False(t, os.IsNotExist(wrappedError))
    }

Does not resolve the collector issue here but enhances it:
https://github.com/storj/storj/issues/4192

Change-Id: Ifb75dd15b54c1e1a5e23f6eba2d621d64874a5cc
2022-09-02 12:26:33 +00:00
Márton Elek
7e71986493 storagenode: accept HTTP calls on public port, listening for monitoring requests
Today each storagenode should have a port which is opened for the internet, and handles DRPC protocol calls.

When we do a HTTP call on the DRPC endpoint, it hangs until a timeout.

This patch changes the behavior: the main DRPC port of the storagenodes can accept HTTP requests and can be used to monitor the status of the node:

 * if returns with HTTP 200 only if the storagnode is healthy (not suspended / disqualified + online score > 0.9)
 * it CAN include information about the current status (per satellite). It's opt-in, you should configure it so.

In this way it becomes extremely easy to monitor storagenodes with external uptime services.

Note: this patch exposes some information which was not easily available before (especially the node status, and used satellites). I think it should be acceptable:

 * Until having more community satellites, all storagenodes are connected to the main Storj satellites.
 * With community satellites, it's good thing to have more transparency (easy way to check who is connected to which satellites)

The implementation is based on this line:

```
http.Serve(NewPrefixedListener([]byte("GET / HT"), publicMux.Route("GET / HT")), p.public.http)
```

This line answers to the TCP requests with `GET / HT...` (GET HTTP request to the route), but puts back the removed prefix.

Change-Id: I3700c7e24524850825ecdf75a4bcc3b4afcb3a74
2022-08-26 09:38:09 +00:00
Márton Elek
4b1be6bf8e storagenode/satellite: support different piece hash algorithms
Change-Id: I3db321e79f12f3ebaa249e6c32fa37fd9615687e
2022-08-23 18:15:06 +00:00
Clement Sam
bac0155664 storagenode/storagenodedb: fix null at_rest_total values for storage usage
Wrapping a COALESCE around the computed at_rest_total value to fallback
to the original at_rest_total value when the computed value is null.

https://forum.storj.io/t/release-preparation-v1-62/19444/5?u=clement

Change-Id: Ifa268ccbe35a63e3b68f07464194fa034ad261b5
2022-08-23 12:56:28 +00:00
Márton Elek
6b27c64833 testplanet: support snapshot based migration for storagenode
Similar to the existing snapshot based tests of satellite/metabase db we make a migration here which is:
 * dedicated to unit tests
 * faster (with less steps)
 * but safe: additional unit test ensures that the snapshot based migration and normal prod migration have the same results.

Change-Id: Ie324b09f64b4553df02247a9461ece305a6cf832
2022-08-22 09:46:27 +00:00
paul cannon
37a4edbaff all: reformat comments as required by gofmt 1.19
I don't know why the go people thought this was a good idea, because
this automatic reformatting is bound to do the wrong thing sometimes,
which is very annoying. But I don't see a way to turn it off, so best to
get this change out of the way.

Change-Id: Ib5dbbca6a6f6fc944d76c9b511b8c904f796e4f3
2022-08-10 18:24:55 +00:00
Clement Sam
25f8f678ab storagenode/nodestats: retrieve storage usage starting from last day of previous month
For the storagenode usage graph currently,
1. the graph is still likely to contain spikes on the first day of
the month for a newly setup storagenode because there's no previous
interval_end_time to deduct from.

2. if a node goes offline for too long, say the last usage report
in the storageusage cache has an interval_end_time of
2020-07-30 00:00:00+00:00 and later, it comes back online a few
days later, it requests for the storage usage from the satellite
starting from the current month, say 2021-01-01 00:00:00+00:00,
the calculated hours for the first day would be 48 hours and it
could be wrong because the cache is missing one day usage report.

This PR addresses second issue on the storagenode side by requesting
storage usage data, instead of just a month boundary, request for an
interval starting from the last day of the previous month to the
current day of the current month.
The first one will be a tradeoff and wouldn't really matter since
it will just be an issue on the first day the storagenode joined
the satellite.

Updates https://github.com/storj/storj/issues/4178

Change-Id: I041c56c412030ce013dd77dce11b0b5d6550927b
2022-08-10 01:37:02 +00:00
Clement Sam
9a539c4830 storagenode/storageusage: add interval_end_time, rename interval_start to timestamp
The satellite now returns the last interval_end_time for each
daily storage usage.
We need to store the interval_end_time in the storage usage cache.
Also, renamed interval_start to timestamp to avoid ambiguity since
the interval_start only stores just the date/day returned by the
satellite.

Updates https://github.com/storj/storj/issues/4178

Change-Id: I94138ba8a506eeedd6703787ee03ab3e072efa32
2022-08-10 01:03:00 +00:00
Óscar de Arriba
4fdb81c510
storagenode/pieces: allow to configure initial piece scan (#5024)
* Allow configure initial piece scan

* Update tests to include new flag

* Update default config specification

* Rename configuration flag

* Rename variable and fix formatting

* Fix format

* Fix typo

Co-authored-by: Stefan Benten <mail@stefan-benten.de>
Co-authored-by: Clement Sam <clementsam75@gmail.com>
Co-authored-by: littleskunk <jens.heimbuerge@googlemail.com>
2022-08-07 22:40:59 +00:00
Egon Elbre
e9692c5681 storagenode/gracefulexit: remove unused interface
Change-Id: Ie6c3d69f5177872d8f4308ac476bc87655da9e4b
2022-08-04 11:26:14 +03:00
Egon Elbre
cf92220c20 {satellite,storagenode}/gracefulexit: simplify limiter usage
Change-Id: Ied7091fe5355b96d327e3f893c5bdd4946a9e6af
2022-08-04 08:18:15 +00:00
Egon Elbre
bc9ab8ee5e satellite/audit,storagenode/gracefulexit: fixes to limiter
Ensure we don't rely on limiter to wait multiple times.

Change-Id: I75d48420236216d4c2fc6fa99293f51f80cd9c33
2022-08-03 10:24:16 +03:00
Qweder93
98b8c7be06 storagenode/console: consoleAPI EstimatedPayout flacky test fixed
In rare cases time frame between creating time.Now() variable and calling
service method that receives it (after API method call) was big enough to distort
current month expectations and make test fail with last digit difference in 1

Change-Id: Ib811492d62f6598a5c40a09de6a87bffeaa0a78e
2022-08-01 15:13:15 +00:00
Stefan Benten
c3171b4ba4
storagenode/retain: Move summary and start logs to info level (#4954)
We currently do not log the GC information/stats under normal circumstances.
This is not good for monitoring and troubleshooting.
2022-07-08 18:19:08 +02:00
Egon Elbre
05e165283f storagenode/console/consoleapi: use fixed time.Now()
It seems the tests relied on time.Now(), which might cause some
discrepancies in calculations. Use a fixed time.Now() rather than
recalculating.

As a sidefix, remove "Test" prefix from t.Run. These are unnecessary.

Change-Id: I1de903fcf0fcf46fc8e3acf2463e17239b8e3cc6
2022-07-01 12:36:01 +03:00
René Smeekes
fd042a92b9
storagenode/reputation: clarify wording on suspension notification (#4921)
Fixes https://github.com/storj/storj/issues/4920
2022-06-20 22:07:12 +02:00
Egon Elbre
175d1e694c mod: bump storj.io/private
This updates the database drivers and sqlite implementation.

Change-Id: I384d81d7beca7267f7c1d6115b3e5e33629f8dc7
2022-04-26 19:18:52 +00:00
Egon Elbre
5fca07387f storagenode/storagenodedb: make schemagen gofmt compatible
Few adjustments to the output to make it consistent with gofmt.

Change-Id: Icb673a0632c3be4cb0f4ab1c4aeffc0290e38e95
2022-03-31 13:28:10 +03:00
Egon Elbre
0d2d59f884 all: fix linting issues
Change-Id: Idfc93948e59a181321d79b365e638d63e256a16f
2022-03-21 15:26:42 +00:00
Egon Elbre
466832e4bc storagenode/piecestore: check for remote closing
Remote closing during upload or download is entirely expected and
it shouldn't lead to an error in the log.

Bump drpc to get the version that contains correct error code
for it. Also bump errs, which contains a fix for .Has.

Fixes https://github.com/storj/storj/issues/4609

Change-Id: I9297cabcfdc4b3a2c19d478dc729f779a2aef0c3
2022-03-17 19:27:42 +02:00
Egon Elbre
dc0f7b5f77 storagenode,web/storagenode: use go:embed for assets
Go can now directly embed files without relying on external tools.
This makes code use go:embed and avoid the external tooling.

go:embed requires files to be present in the embedded directory,
hence we need to add .keep to "dist" folder. We also add one to
public/.keep, such that it won't be deleted when building storagenode.

Change-Id: I8bef81236be6829ed37ed4c16ef693677b93a631
2022-03-11 16:01:28 +02:00
Egon Elbre
5f7ea1358d private/web: make caching headers reusable
Move storagnode/console caching headers to private/web. Also,
start using them in multinode/console/server.

Change-Id: I1f0f3c9833a183476009737cece515ae7537fb83
2022-03-11 11:19:11 +02:00
Erik van Velzen
a9bd983f04 sql: capitalize keywords
Capitalize some keywords which were overlooked

Change-Id: Ie2ad283669e2ca2650fcddfd8c7395a81bac09a8
2022-03-01 15:19:38 +00:00
Clement Sam
8b40a071a0 storagenode: check if QUIC is properly configured
This change adds a check to confirm if UDP port if properly configured for QUIC

Resolves https://github.com/storj/storj/issues/4332
Partly resolves https://github.com/storj/storj/issues/4358

Change-Id: I9a66f26a115e48b4fcd168f50a7d0b4d81712f4e
2022-01-20 12:04:04 +00:00
Clement Sam
589c82f23e storagenode: display quic status on SNO dashboard
This change includes storagenode QUIC status on SNO dashboard.
If disabled, it displays warning for SNOs to foward their
UDP port for quic.

Change-Id: I8d28c9c0f5f1e90d80b7c18b9e1e7b78c5e45609
2021-12-23 12:22:24 +00:00
Z
4762493e0f
storagenode/reputation: fix missing space in suspension message (#4309)
Fix missing space in the suspended message

Co-authored-by: Stefan Benten <mail@stefan-benten.de>
2021-12-13 19:30:33 +00:00
littleskunk
07fad75912
storagenode/piecestore: upload and download metrics for Grafana alerts (#4280)
* storagenode/piecestore: Add upload and download metrics for Grafana alerts

* storagenode/piecestore: group download metrics by piece action

Change-Id: Ib2a42b60c56c3f581915d512f4907c8db71e4624

Co-authored-by: Clement Sam <clement@storj.io>
2021-11-18 12:50:39 +00:00
TungHoang
f9b630b0f4
Fix monthly earning estimation (#4282) 2021-11-17 18:26:21 -05:00
Egon Elbre
96058be1a7 storagenode/storagenodedb: fix error message
Change-Id: Ibc98aa6539199f8dc8117f21a6ae645cea419403
2021-11-10 17:46:30 +00:00
Egon Elbre
af5b90ed32 storagenode/storagenodedb: include dbname in error
When something happens during opening or closing then it wasn't clear
which database had the issue.

Fixes storj/storj#4271

Change-Id: I34c8bae79a5b41ccd9b40aa8d836805f8c1a573c
2021-11-10 16:25:34 +00:00
Yingrong Zhao
2df41028a3 storagenode/piecestore: add logs for restore trash endpoint
When a satellite operator runs restore trash command, it's hard for node
operator to know whether their node has received the request. Adding
logs so this process is visible from node operators perspective.

Change-Id: I08c670b1b0c5971e6b73e038a0935cb0caaca63d
2021-10-20 15:37:28 +00:00
Clement Sam
29599dd7cd testsuite/ui/multinode: add first node test
Change-Id: Ibeaee92e6d1a41e408659a5045f52e1908f73089
2021-10-18 16:07:01 +00:00
Clement Sam
0d58172c38 storagenode: add doc.go files for sno packages
Change-Id: I23d4b8b462e1b03718d0c4801cc2aaff520e7356
2021-09-29 08:24:56 +00:00
Egon Elbre
71eb184ef3 storagenode/piecestore: simplify TestTooManyRequests
Change-Id: I1452735bef206bc8cf9bde72fc503fec3ae5b067
2021-09-21 17:18:37 +03:00
Fadila Khadar
c00ecae75c satellite/gracefulexit: stop using gracefulexit_transfer_queue
Remove the logic associated to the old transfer queue.
A new transfer queue (gracefulexit_segment_transfer_queue) has been created for migration to segmentLoop.
Transfers from the old queue were not moved to the new queue.
Instead, it was still used for nodes which have initiated graceful exit before migration.
There is no such node left, so we can remove all this logic.
In a next step, we will drop the table.

Change-Id: I3aa9bc29b76065d34b57a73f6e9c9d0297587f54
2021-09-14 11:52:34 +00:00
Egon Elbre
1aec831d98 satellite/audit,storage: increase sleep delay in TestMaxVerifyCount
Currently TextMaxVerifyCount flakes in some tests, try increasing the
sleep time to ensure that things are slow enough to trigger the error
condition.

Also pass ctx to all the funcs so we can handle sleep better.

Change-Id: I605b6ea8b14a0a66d81a605ce3251f57a1669c00
2021-09-10 15:30:37 +00:00
Michał Niewrzał
c258f4bbac private/testplanet: move Metabase outside Metainfo for satellite
At some point we moved metabase package outside Metainfo
but we didn't do that for satellite structure. This change
refactors only tests.
When uplink will be adjusted we can remove old entries in
Metainfo struct.

Change-Id: I2b66ed29f539b0ec0f490cad42c72840e0351bcb
2021-09-09 07:15:51 +00:00
igor gaidaienko
92deef4f34 Revert "storagenode/payouts: historical payouts use satellitesDB instead of trustPool"
This reverts commit 4a98dd40e2.
2021-08-02 13:55:21 +03:00
Fadila Khadar
c4202b9451 satellite/gracefulexit: use graceful_exit_segment_transfer_queue
For being able to use the segment metainfo loop, graceful exit transfers have to include the segment stream_id/position instead of the path. For this, we created a new table graceful_exit_segment_transfer_queue that will replace the graceful_exit_transfer_queue. The table has been created in a previous migration and made accessible through graceful exit db in another one.
This changes makes graceful exit enqueue transfer items for new exiting nodes in the new table.

Change-Id: I7bd00de13e749be521d63ef3b80c168df66b9433
2021-07-21 14:02:20 +00:00
Fadila Khadar
b0d98b1c1a satellite/gracefulexit: allow use of graceful_exit_segment_transfer_queue
For being able to use the segment metainfo loop, graceful exit transfers have to include the segment stream_id/position instead of the path. For this, we created a new table graceful_exit_segment_transfer_queue that will replace the graceful_exit_transfer_queue. The table has been created in a previous migration.
This change gives access to this table.
Graceful Exit doesn't use the table yet, this will be done in a next change.

Change-Id: I6c09cff4cc45f0529813a8898ddb2d14aadb2cb8
2021-07-21 12:34:44 +00:00
Qweder93
4a98dd40e2 storagenode/payouts: historical payouts use satellitesDB instead of trustPool
Change-Id: I39f4215f4ebf91bd1b38fbcb5c58e6ba53ceff1b
2021-07-15 16:19:18 +03:00
Qweder93
4d0fe39235 storagenode/satellites: address added, caching satellite's addresses from trust
Change-Id: Ica3eea5b8d81b176c6a4385fea803730b08ece16
2021-07-08 15:38:23 +00:00
Yaroslav Vorobiov
cbb4cd3fc3 multinode/reputation: add vetted at timestamp
Change-Id: Id35cb6cfdabf4bf2762e4a162cf3157afb0ff170
2021-07-07 18:11:54 +03:00
Yaroslav Vorobiov
a5fd903177 storagenode/reputation: add vetted at timestamp
Change-Id: I02d59414b6b172cf7f7bfc92df222cf4a5574e0e
2021-07-07 18:11:54 +03:00
Yaroslav Vorobiov
818f6c6ea6 multinode/console: add summary to storage usage API
Change-Id: Ia8a1e598d667f25461f73f1626da22113cb7caeb
2021-07-07 15:00:05 +03:00
Qweder93
6c02258925 storagenode/peer: register multinode Payouts endpoint
Change-Id: Iaaba0157e605bb51a73244517986d6b49a59d8d5
2021-07-06 10:09:24 +00:00
Yaroslav Vorobiov
68627e7d80 multinode/console: add reputation satellite api
Change-Id: I7cef6c1c271607f7485f604d5b61587558a31878
2021-07-05 15:32:22 +00:00