Commit Graph

113 Commits

Author SHA1 Message Date
Clement Sam
b64179c82a {storagenode/pieces,cmd/storagenode}: refactor lazyfilewalker commands and tests
With this change we are directly testing how the command
is executed when the args are passed

Change-Id: Ibb33926014c9d71c928e0fd374bf4edc5a8a1232
2023-06-02 00:11:53 +00:00
Clement Sam
c6f67d4799 storagenode: fix lazyfilewalker failing with SIGPIPE
Lazyfilewalker was failing with SIGPIPE, which was quite
misleading. The command was failing because the value of
the --lower-io-priority flag was assumed to be an argument,
since it was passed as "--lower-io-priority true" instead of
"--lower-io-priority=true".

Resolves https://github.com/storj/storj/issues/5900

Change-Id: Icf79fcce76dafee21659d76ee0ce19d8520c8f1d
2023-05-24 15:19:31 +00:00
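
The flag-parsing pitfall described in this commit can be reproduced with a small sketch. It uses Go's standard library flag package purely as an assumption for illustration; the log does not say which flag library the storagenode command uses, and parseWalkerArgs is a hypothetical helper. A boolean flag does not consume the next token, so "true" ends up as a positional argument.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// parseWalkerArgs is a hypothetical helper: it parses args with the standard
// library flag package and returns the flag value plus any leftover
// positional arguments.
func parseWalkerArgs(args []string) (bool, []string, error) {
	var lowerIO bool
	fs := flag.NewFlagSet("lazyfilewalker", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage output quiet in this sketch
	fs.BoolVar(&lowerIO, "lower-io-priority", false, "run with lower IO priority")
	if err := fs.Parse(args); err != nil {
		return false, nil, err
	}
	return lowerIO, fs.Args(), nil
}

func main() {
	// Space-separated form: the flag is switched on by its mere presence,
	// and "true" is left over as a positional argument.
	v, rest, err := parseWalkerArgs([]string{"--lower-io-priority", "true"})
	fmt.Println(v, rest, err)

	// "=" form: the value is bound to the flag and nothing is left over.
	v, rest, err = parseWalkerArgs([]string{"--lower-io-priority=false"})
	fmt.Println(v, rest, err)
}
```

A subcommand that rejects unexpected positional arguments would fail on the first form, which is consistent with the failure described above.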
Clement Sam
018b6eeeaf storagenode: add tests for lazyfilewalker
Updates https://github.com/storj/storj/issues/5349

Change-Id: I9544c14ba2acacd5b304f151ab29c70ff61adc5b
2023-05-08 21:50:40 +00:00
Clement Sam
cf7ce81d09 cmd/storagenode: refactor lazyfilewalker commands to satisfy the execwrapper.Command interface
Follow-up change for https://review.dev.storj.io/c/storj/storj/+/10335

Updates https://github.com/storj/storj/issues/5349

Change-Id: Iadf55bae84ebc0803a0766830e596c396dfb332b
2023-05-08 15:09:53 +00:00
Clement Sam
291e639ac2 storagenode/pieces/lazyfilewalker: add execwrapper package
The execwrapper package wraps the exec.Cmd and has a Command
interface that mimics the behaviour of the exec.Cmd.
This is useful for testing the lazyfilewalker subprocesses
by stubbing instead of spawning a real subprocess.

Updates https://github.com/storj/storj/issues/5349

Change-Id: I14084139c76a531f2b6d7163f9aa35c3f5e192d7
2023-05-06 02:02:23 +00:00
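
A minimal sketch of the idea behind this commit: code that depends on a small interface can be tested with a stub instead of a real subprocess. The Command interface here is a hypothetical one-method reduction, and realCommand, stubCommand, and runWalker are illustrative names, not the actual execwrapper API.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os/exec"
)

// Command is a hypothetical, trimmed-down interface mirroring part of exec.Cmd.
type Command interface {
	Run() error
}

// realCommand delegates to a real *exec.Cmd.
type realCommand struct{ cmd *exec.Cmd }

func (c *realCommand) Run() error { return c.cmd.Run() }

// NewCommand builds a Command that spawns a real subprocess when run.
func NewCommand(ctx context.Context, name string, args ...string) Command {
	return &realCommand{cmd: exec.CommandContext(ctx, name, args...)}
}

// errStub is the canned failure used by the stub in this sketch.
var errStub = errors.New("exit status 1")

// stubCommand records that it ran and returns a canned error,
// without spawning anything.
type stubCommand struct {
	ran bool
	err error
}

func (s *stubCommand) Run() error {
	s.ran = true
	return s.err
}

// runWalker stands in for the code under test; it only sees the interface.
func runWalker(cmd Command) error {
	if err := cmd.Run(); err != nil {
		return fmt.Errorf("lazyfilewalker failed: %w", err)
	}
	return nil
}

func main() {
	stub := &stubCommand{err: errStub}
	fmt.Println(runWalker(stub), stub.ran)
}
```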
Clement Sam
ec8bfe6b94 storagenode/pieces: capture logger time Key in zapwrapper
Updates https://github.com/storj/storj/issues/5349

Change-Id: I426f38c0ae0f93d498317e3f66ba4f5724620758
2023-05-06 02:02:23 +00:00
Egon Elbre
8b82dba602 storagenode/blobstore/filestore: add tracking of blobs
We've had issues with forgetting to close readers and writers.
Add leak tracking to find those pesky issues.

Change-Id: If6b0ad6e9958318a7e0affee9c6d0a1ece412b6d
2023-05-05 15:40:15 +03:00
Clement Sam
e0542c2d24 storagenode: run garbage collection filewalker as a low I/O subprocess
Updates https://github.com/storj/storj/issues/5349

Change-Id: I7d810d737b17f0b74943765f7f7cc30b9fcf1425
2023-05-02 19:43:38 +00:00
Clement Sam
f076238748 storagenode: run used-space filewalker as a low IO subprocess
As part of fixing the IO priority of filewalker related
processes such as the garbage collection and used-space
calculation, this change allows the initial used-space
calculation to run as a separate subprocess with lower
IO priority.

This can be enabled with the `--storage2.enable-lazy-filewalker`
config item. It falls back to the old behaviour when the
subprocess fails.

Updates https://github.com/storj/storj/issues/5349

Change-Id: Ia6ee98ce912de3e89fc5ca670cf4a30be73b36a6
2023-04-14 04:16:14 +00:00
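
The fallback behaviour described in this commit could look roughly like the following sketch. The walkFunc type and usedSpace function are hypothetical names; the real implementation spawns a subprocess, which is replaced here by plain function values.

```go
package main

import (
	"errors"
	"fmt"
)

// walkFunc computes used space in bytes; both walkers share this shape.
type walkFunc func() (int64, error)

// errSubprocess simulates a failing lazy filewalker subprocess.
var errSubprocess = errors.New("subprocess failed")

// usedSpace tries the lazy (subprocess) walker first when it is enabled,
// and falls back to the in-process walker if the lazy one fails.
func usedSpace(enableLazy bool, lazy, inProcess walkFunc) (int64, error) {
	if enableLazy {
		total, err := lazy()
		if err == nil {
			return total, nil
		}
		fmt.Println("lazyfilewalker failed, falling back:", err)
	}
	return inProcess()
}

func main() {
	failing := func() (int64, error) { return 0, errSubprocess }
	inProcess := func() (int64, error) { return 42, nil }
	fmt.Println(usedSpace(true, failing, inProcess))
}
```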
Egon Elbre
f5020de57c storagenode/blobstore: move blob store logic
The blobstore implementation is entirely related to storagenode, so the
rightful place is together with the storagenode implementation.

Fixes https://github.com/storj/storj/issues/5754

Change-Id: Ie6637b0262cf37af6c3e558556c7604d9dc3613d
2023-04-05 18:06:20 +00:00
Clement Sam
e5c43722dc storagenode/pieces: introduce FileWalker
FileWalker implements methods to walk over pieces in
a storage directory.

This is just a refactor to separate filewalker functions
from pieces.Store. This is needed to simplify the work
to create a separate filewalker subprocess and reduce the
number of config flags passed to the subprocess.

You might want to check https://review.dev.storj.io/c/storj/storj/+/9773

Change-Id: I4e9567024e54fc7c0bb21a7c27182ef745839fff
2023-03-30 18:33:52 +00:00
Clement Sam
c3d5965ef2 storagenode/monitor: add timeout to storage dir verification
Resolves https://github.com/storj/storj/issues/4567

Change-Id: Ia071c476bcd1f5c99a9874801c94db86d1e105c6
2023-03-14 13:43:14 +00:00
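
One common way to put a timeout around a potentially blocking filesystem check is sketched below. This is an assumption about the general technique, not the commit's actual code; verifyWithTimeout and timesOut are hypothetical names.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// verifyWithTimeout runs verify in a goroutine and gives up after timeout.
// Note that the goroutine itself keeps running until verify returns.
func verifyWithTimeout(ctx context.Context, timeout time.Duration, verify func() error) error {
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()

	done := make(chan error, 1) // buffered so the goroutine never blocks on send
	go func() { done <- verify() }()

	select {
	case err := <-done:
		return err
	case <-ctx.Done():
		return fmt.Errorf("storage dir verification timed out: %w", ctx.Err())
	}
}

// timesOut reports whether a check taking workMS milliseconds exceeds a
// timeout of timeoutMS milliseconds.
func timesOut(workMS, timeoutMS int) bool {
	work := func() error {
		time.Sleep(time.Duration(workMS) * time.Millisecond)
		return nil
	}
	err := verifyWithTimeout(context.Background(), time.Duration(timeoutMS)*time.Millisecond, work)
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	fmt.Println(timesOut(500, 20)) // slow check, short timeout
	fmt.Println(timesOut(1, 500))  // fast check, generous timeout
}
```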
Andrew Harding
5c744d7ed4 storagenode/pieces: close reader after use
Change-Id: Icd9df821edb668c5521732396b7d6be3b8e75c7a
2023-03-13 14:06:10 +00:00
paul cannon
ed7c82439d storage/filestore: avoid stat() during walkNamespaceInPath
Calling stat() (really, lstat()) on every file during a directory walk
is the step that takes up the most time. Furthermore, not all directory
walk uses _need_ to have a stat done on every file. Therefore, in this
commit we avoid doing the stat at the lowest level of
walkNamespaceInPath. The stat will still be done when it is requested,
with the Stat() method on the blobInfo object.

The major upside of this is that we can avoid the stat call on most
files during a Retain operation. This should speed up garbage collection
considerably.

The major downside is that walkNamespaceInPath will no longer
automatically skip over directories that are named like blob files, or
blob files which are deleted between readdir() and stat(). Callers to
walkNamespaceInPath and its variants (WalkNamespace,
WalkSatellitePieces, etc) are now expected to handle these cases
individually.

Thanks to forum member Toyoo for the insight that this would speed up
garbage collection.

Refs: https://github.com/storj/storj/issues/5454
Change-Id: I72930573d58928fa25057ed89cd4ec474b884199
2023-01-30 13:47:03 +00:00
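
The trade-off in this commit can be illustrated with a short sketch: list a directory without a stat per entry, and stat only on demand, with the caller handling vanished files and directories. The function names and file names are hypothetical, and os.ReadDir is used as a stand-in for the filestore walk; on common platforms it typically avoids a per-file lstat, though that depends on the filesystem.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// listNames returns the entry names in dir without stat-ing each entry, so
// the result may include directories or files deleted right afterwards.
func listNames(dir string) ([]string, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return nil, err
	}
	names := make([]string, 0, len(entries))
	for _, e := range entries {
		names = append(names, e.Name())
	}
	return names, nil
}

// blobSize stats one entry on demand. The caller now handles the cases the
// walk no longer filters out: vanished files and directories.
func blobSize(dir, name string) (int64, bool, error) {
	info, err := os.Lstat(filepath.Join(dir, name))
	if errors.Is(err, fs.ErrNotExist) {
		return 0, false, nil // deleted between readdir and stat
	}
	if err != nil {
		return 0, false, err
	}
	if info.IsDir() {
		return 0, false, nil // directory named like a blob file
	}
	return info.Size(), true, nil
}

// demo builds a temp dir with one file and one directory, then walks it.
func demo() (listed, blobs int, err error) {
	dir, err := os.MkdirTemp("", "blobs")
	if err != nil {
		return 0, 0, err
	}
	defer os.RemoveAll(dir)
	if err := os.WriteFile(filepath.Join(dir, "piece-a"), []byte("abc"), 0o600); err != nil {
		return 0, 0, err
	}
	if err := os.Mkdir(filepath.Join(dir, "piece-b"), 0o700); err != nil {
		return 0, 0, err
	}
	names, err := listNames(dir)
	if err != nil {
		return 0, 0, err
	}
	for _, name := range names {
		_, ok, err := blobSize(dir, name)
		if err != nil {
			return 0, 0, err
		}
		if ok {
			blobs++
		}
	}
	return len(names), blobs, nil
}

func main() {
	fmt.Println(demo())
}
```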
Egon Elbre
90b7076d26 storagenode/pieces: fix log line
Change-Id: I8dba6b0f3d6af3140dfa503c8d6b33e6808d004f
2023-01-17 11:04:47 +02:00
Egon Elbre
9544a670d7 storagenode/pieces: fix concurrent empty and restore trash
This ensures that empty trash and restore trash cannot run at the same
time.

Fixes https://github.com/storj/storj/issues/5416

Change-Id: I9d2e3aa3d66e61e5c8a7427a95208bb96089792d
2023-01-03 15:01:54 +00:00
Michal Niewrzal
5110803102 storagenode/piecestore: add Exists endpoint
Adds a new method, Exists, which can be used to verify which
requested piece IDs exist on the storage node. It verifies only pieces
that belong to the satellite that called the endpoint.

Minimum WASM size was increased a bit.

https://github.com/storj/storj/issues/5415

Change-Id: Ia5f9cadeb526541b2776a8973eb7d50133ad8636
2022-12-17 04:08:26 +00:00
Egon Elbre
ee71fbb41d storagenode/piecestore: start restore trash in the background
Starting restore trash in the background allows the satellite to
continue to the next storagenode without needing to wait until
completion.

Of course, this means the satellite doesn't get feedback on whether it
succeeds. This means that the restore-trash needs to
be executed several times.

Change-Id: I62d43f6f2e4a07854f6d083a65badf897338083b
2022-12-16 18:15:52 +02:00
Clement Sam
f5156296d4 storagenode/pieces: warn and trash v0 pieces when not found in v0pieceInfoDB
Context: https://github.com/storj/storj/issues/4225#issuecomment-1307575782

Closes https://github.com/storj/storj/issues/4225

Change-Id: Ib8c3189f86118338556d48a6af657e6dc109b4c0
2022-11-14 14:54:16 +00:00
Egon Elbre
ff22fc7ddd all: fix deprecated ioutil commands
Change-Id: I59db35116ec7215a1b8e2ae7dbd319fa099adfac
2022-10-11 15:27:29 +00:00
Clement Sam
07beef378d storagenode/collector: delete expired piece info if file does not exist
The collector tries deleting a piece over and over again, though
the piece does not exist on the storagenode's filesystem.
We need to delete the piece info from the expired db if the
targeted file does not exist.
This does not resolve the base problem of why the file
is deleted before the collector tries deleting it.
This change deletes the piece info from the expired db
if the file does not exist, since we're already trying
to delete that piece anyway.

Closes https://github.com/storj/storj/issues/4192

Change-Id: If659185ca14f1cb29fd3c4237374df6fcd535df8
2022-09-15 12:29:29 +00:00
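
A minimal sketch of the behaviour this commit describes: treat "file does not exist" as success and still drop the expiration record. The expiredDB map and collect function are hypothetical stand-ins for the real expiration database and collector.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// expiredDB is a hypothetical stand-in for the expiration database,
// keyed by piece path.
type expiredDB map[string]bool

// collect deletes one expired piece. If the file is already gone, the
// record is still removed so the collector does not retry forever.
func collect(db expiredDB, path string) error {
	err := os.Remove(path)
	if err != nil && !errors.Is(err, fs.ErrNotExist) {
		return err // real failure: keep the record and retry later
	}
	delete(db, path)
	return nil
}

// demo collects one existing and one missing piece and returns how many
// records remain afterwards.
func demo() (remaining int, err error) {
	dir, err := os.MkdirTemp("", "pieces")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)
	present := filepath.Join(dir, "present")
	missing := filepath.Join(dir, "missing")
	if err := os.WriteFile(present, []byte("x"), 0o600); err != nil {
		return 0, err
	}
	db := expiredDB{present: true, missing: true}
	for _, p := range []string{present, missing} {
		if err := collect(db, p); err != nil {
			return 0, err
		}
	}
	return len(db), nil
}

func main() {
	fmt.Println(demo())
}
```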
Márton Elek
4b1be6bf8e storagenode/satellite: support different piece hash algorithms
Change-Id: I3db321e79f12f3ebaa249e6c32fa37fd9615687e
2022-08-23 18:15:06 +00:00
Óscar de Arriba
4fdb81c510 storagenode/pieces: allow to configure initial piece scan (#5024)
* Allow configure initial piece scan

* Update tests to include new flag

* Update default config specification

* Rename configuration flag

* Rename variable and fix formatting

* Fix format

* Fix typo

Co-authored-by: Stefan Benten <mail@stefan-benten.de>
Co-authored-by: Clement Sam <clementsam75@gmail.com>
Co-authored-by: littleskunk <jens.heimbuerge@googlemail.com>
2022-08-07 22:40:59 +00:00
Clement Sam
0d58172c38 storagenode: add doc.go files for sno packages
Change-Id: I23d4b8b462e1b03718d0c4801cc2aaff520e7356
2021-09-29 08:24:56 +00:00
Egon Elbre
1aec831d98 satellite/audit,storage: increase sleep delay in TestMaxVerifyCount
Currently TestMaxVerifyCount flakes in some tests, try increasing the
sleep time to ensure that things are slow enough to trigger the error
condition.

Also pass ctx to all the funcs so we can handle sleep better.

Change-Id: I605b6ea8b14a0a66d81a605ce3251f57a1669c00
2021-09-10 15:30:37 +00:00
Qweder93
4d0fe39235 storagenode/satellites: address added, caching satellite's addresses from trust
Change-Id: Ica3eea5b8d81b176c6a4385fea803730b08ece16
2021-07-08 15:38:23 +00:00
Egon Elbre
10372afbe4 ci: fix lint errors
Change-Id: Ib5893440807811f77175ccd347aa3f8ca9cccbdf
2021-05-17 13:37:31 +00:00
Jennifer Johnson
71072eb593 storagenode/pieces: send piece deletions to trash
This is a temporary precaution to avoid incorrectly auditing nodes for pieces that were deleted between database backups if we have to restore from a previous backup.

Here we send pieces to trash rather than directly deleting them from storage nodes so we can restore from trash after a db restoration.

Change-Id: Icd979d2a9a755e7428190c0129c9bc969649d544
2021-04-07 16:52:10 +00:00
Stefan Benten
494bd5db81 all: golangci-lint v1.33.0 fixes (#3985)
2020-12-05 17:01:42 +01:00
Egon Elbre
2268cc1df3 all: fix linter complaints
Change-Id: Ia01404dbb6bdd19a146fa10ff7302e08f87a8c95
2020-10-13 15:59:01 +03:00
nerdatwork
870abd8676 storagenode/pieces: tidying trash log
2020-09-24 11:55:06 +03:00
nerdatwork
54dd430048 storagenode/pieces: fix typo for satellite id and piece id
2020-09-22 08:19:12 +03:00
nerdatwork
96ec44ff1b storagenode/pieces: make log more legible
2020-09-18 15:10:13 +03:00
Cameron Ayer
ca0c1a5f0c storagenode/{monitor,pieces}, storage/filestore: add loop to check storage directory writability
Periodically create and delete a temp file in the storage directory
to verify writability. If this check fails, shut the node down.

Change-Id: I433e3a8d1d775fc779ae78e7cf3144a05ffd0574
2020-08-31 21:20:49 +00:00
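
The periodic writability check described in this commit can be sketched as follows. checkWritable and monitor are hypothetical names; the real node shuts down on failure, which is represented here by monitor returning the first error to its caller.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// checkWritable creates and removes a temp file in dir.
func checkWritable(dir string) error {
	f, err := os.CreateTemp(dir, "write-test-*")
	if err != nil {
		return err
	}
	name := f.Name()
	if err := f.Close(); err != nil {
		_ = os.Remove(name)
		return err
	}
	return os.Remove(name)
}

// monitor calls check every interval until ctx is done or check fails,
// and returns the first failure so the caller can shut down.
func monitor(ctx context.Context, interval time.Duration, check func() error) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return nil
		case <-ticker.C:
			if err := check(); err != nil {
				return fmt.Errorf("storage directory not writable: %w", err)
			}
		}
	}
}

// demo checks a writable temp dir and a path that does not exist.
func demo() (okErr, missingErr error) {
	dir, err := os.MkdirTemp("", "storage")
	if err != nil {
		return err, nil
	}
	defer os.RemoveAll(dir)
	return checkWritable(dir), checkWritable(filepath.Join(dir, "missing"))
}

func main() {
	okErr, missingErr := demo()
	fmt.Println(okErr == nil, missingErr != nil)

	// Simulate a directory that stops being writable on the third check.
	calls := 0
	err := monitor(context.Background(), 10*time.Millisecond, func() error {
		calls++
		if calls == 3 {
			return errors.New("disk unplugged")
		}
		return nil
	})
	fmt.Println(calls, err)
}
```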
Cameron Ayer
586e6f2f13 private/testblobs, storage, storage/filestore: add storage dir verification to filestore
Sometimes SNOs fail to properly configure or lose connection to their storage directory
which can result in DQ. This causes unnecessary repair and is unfortunate for all parties.

This change introduces the creation of a special file in the storage directory at runtime
containing the node ID. While the storage node runs, it periodically verifies that it can
find said file with the correct contents in the correct location. If not, the node will
shut down with an error message.

This change will solve the issue of nodes losing access to the storage directory, but it will not
solve the issue of nodes pointing to the wrong directory, as the identifying file is created each
time the node starts up. After this change has been the minimum version for a few releases, we will
remove the creation of the directory-identifying file from the storage node run command and add it
to the setup command.

Change-Id: Ib7b10e96ac07373219835e39239e93957e7667a4
2020-08-19 17:18:14 +00:00
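
A rough sketch of the verification-file idea from this commit: write the node ID into a marker file, then periodically confirm the file is present with the expected contents. The file name and function names below are hypothetical, chosen only for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// verificationFileName is a hypothetical name for the marker file.
const verificationFileName = "storage-dir-verification"

// writeVerification stores the node ID in the marker file inside dir.
func writeVerification(dir, nodeID string) error {
	return os.WriteFile(filepath.Join(dir, verificationFileName), []byte(nodeID), 0o600)
}

// verifyDir checks that the marker file exists and holds the expected node ID.
func verifyDir(dir, nodeID string) error {
	data, err := os.ReadFile(filepath.Join(dir, verificationFileName))
	if err != nil {
		return fmt.Errorf("cannot read verification file: %w", err)
	}
	if string(data) != nodeID {
		return errors.New("verification file holds a different node ID")
	}
	return nil
}

// demo verifies a correct directory, a wrong node ID, and a missing directory.
func demo() (same, other, missing, setupErr error) {
	dir, err := os.MkdirTemp("", "storage")
	if err != nil {
		return nil, nil, nil, err
	}
	defer os.RemoveAll(dir)
	if err := writeVerification(dir, "node-A"); err != nil {
		return nil, nil, nil, err
	}
	same = verifyDir(dir, "node-A")
	other = verifyDir(dir, "node-B")
	missing = verifyDir(filepath.Join(dir, "unmounted"), "node-A")
	return same, other, missing, nil
}

func main() {
	same, other, missing, err := demo()
	fmt.Println(same, other, missing, err)
}
```

As the commit message notes, a marker that is recreated on every startup catches a lost directory but not a node pointed at the wrong one; only creating it once at setup would cover that case.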
Qweder93
53a5d18e1a storagenode: fixed logging about piece being moved to trash, and added logging when piece was actually deleted
Change-Id: I46f6a141b27033c2087b5c4681506d80b90f4a18
2020-08-02 20:00:05 +03:00
Egon Elbre
e70da5cd4e all: fix comments
Change-Id: I2d2307e3fab87de47a72b3595d051e2c95ff4f8a
2020-07-16 19:13:14 +03:00
Egon Elbre
080ba47a06 all: fix dots
Change-Id: I6a419c62700c568254ff67ae5b73efed2fc98aa2
2020-07-16 14:58:28 +00:00
stefanbenten
257855b5de all: replace == comparison with errors.Is
Change-Id: I05d9a369c7c6f144b94a4c524e8aea18eb9cb714
2020-07-14 15:50:25 +00:00
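
The motivation for this kind of change is easiest to see with a wrapped error: == compares only the outermost value, while errors.Is walks the wrap chain. The helper names below are illustrative and not taken from the codebase.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// wrapEOF returns io.EOF wrapped with extra context.
func wrapEOF() error { return fmt.Errorf("reading piece: %w", io.EOF) }

// equalsEOF uses the old-style comparison that the commit replaces.
func equalsEOF(err error) bool { return err == io.EOF }

// isEOF uses errors.Is, which also matches wrapped errors.
func isEOF(err error) bool { return errors.Is(err, io.EOF) }

func main() {
	err := wrapEOF()
	fmt.Println(equalsEOF(err)) // false: the wrapper is a different value
	fmt.Println(isEOF(err))     // true: errors.Is unwraps the chain
}
```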
Qweder93
0521435e08 storagenode/gracefulexit: added deletion of all files left in storage/blobs/satellite after successful GE
https://storjlabs.atlassian.net/browse/SG-368

Change-Id: I29a978fe0d0153aedf2be91dc7f45b4ef386d447
2020-07-08 14:38:31 +03:00
Qweder93
f2a0c64425 storage/filestore: log potential disk corruption
In walkNamespaceWithPrefix log in case of "lstat" error, because this may indicate an underlying disk corruption.

SG-50

Change-Id: I867c3ffc47cfac325ae90658ec4780d213ff3e63
2020-05-27 12:12:55 +00:00
littleskunk
ef2671927d storagenode/piecestore: move queue size defaults (#3881)
2020-05-15 19:10:26 +02:00
Egon Elbre
c630cf2490 storagenode/pieces: implement buffering for writing
Currently uploads can cause a lot of IOPS. Reduce this by introducing an
in-memory buffer on top of the file.

Change-Id: I5f4e3e01c0a36258271d180b922107de447bcb59
2020-05-04 06:01:32 +00:00
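
The effect of write buffering can be sketched with bufio.Writer from the standard library, which is an assumption here; the commit does not say how its buffer is implemented. countingWriter is a hypothetical stand-in for the file that counts how many writes actually reach it.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
)

// countingWriter stands in for a file and counts the writes that reach it.
type countingWriter struct{ calls int }

func (w *countingWriter) Write(p []byte) (int, error) {
	w.calls++
	return len(p), nil
}

// writeChunks writes `chunks` blocks of `size` bytes to w.
func writeChunks(w io.Writer, chunks, size int) error {
	buf := make([]byte, size)
	for i := 0; i < chunks; i++ {
		if _, err := w.Write(buf); err != nil {
			return err
		}
	}
	return nil
}

// compare returns the number of underlying writes without and with a buffer.
func compare() (direct, buffered int, err error) {
	d := &countingWriter{}
	if err := writeChunks(d, 100, 1024); err != nil {
		return 0, 0, err
	}

	b := &countingWriter{}
	bw := bufio.NewWriterSize(b, 64*1024) // in-memory buffer in front of the "file"
	if err := writeChunks(bw, 100, 1024); err != nil {
		return 0, 0, err
	}
	if err := bw.Flush(); err != nil { // push out whatever is still buffered
		return 0, 0, err
	}
	return d.calls, b.calls, nil
}

func main() {
	fmt.Println(compare())
}
```

The same 100 KiB of data reaches the underlying writer in far fewer calls when buffered, at the cost of holding up to one buffer of data in memory per upload.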
Egon Elbre
d225e2de48 all: add missing ctx.Cleanup calls in tests
Change-Id: Iaa65f90b9731d721691322bb92fc3da736aa10fe
2020-04-29 17:58:40 +00:00
Isaac Hess
237d9da477 storagenode/pieces: Deleter can handle multiple tests
Before, the deleter would close its done channel once, so if additional
tests shared a storagenode, even if not in parallel, the later waits
would not work properly. This fixes that problem.

Change-Id: I7dcacf6699cef7c2c2948ba0f4369ef520601bf5
2020-04-29 11:26:56 -06:00
Isaac Hess
13bf0c62ab satellite/pieces: Fix race in piece deleter
There was a race in the test code for piece deleter, which made it
possible to broadcast on the condition variable before anyone was
waiting. This change fixes that and has Wait take a context so it times
out with the context.

Change-Id: Ia4f77a7b7d2287d5ab1d7ba541caeb1ba036dba3
2020-04-28 10:50:20 -06:00
Isaac Hess
db0371703f storagenode/pieces: Return UnhandledCount to satellite
When we receive a piece deletion request, include the number of piece
IDs we couldn't add to the queue in the response.

Change-Id: Ibebbe92ac50105bb5c74b18211ed38d468eb33f3
2020-04-27 08:56:56 -06:00
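
One way to count piece IDs that could not be queued is a non-blocking send on a bounded channel, sketched below. The Deleter type and its methods here are hypothetical and simplified; they are not the actual storagenode types.

```go
package main

import "fmt"

// Deleter holds a bounded queue of piece IDs awaiting deletion.
type Deleter struct {
	queue chan string
}

// NewDeleter creates a Deleter whose queue holds at most queueSize IDs.
func NewDeleter(queueSize int) *Deleter {
	return &Deleter{queue: make(chan string, queueSize)}
}

// Enqueue adds as many IDs as fit without blocking and returns how many
// could not be added.
func (d *Deleter) Enqueue(pieceIDs []string) (unhandled int) {
	for i, id := range pieceIDs {
		select {
		case d.queue <- id:
		default:
			// Queue is full: everything from this ID onward is unhandled.
			return len(pieceIDs) - i
		}
	}
	return 0
}

func main() {
	d := NewDeleter(2)
	fmt.Println(d.Enqueue([]string{"a", "b", "c", "d", "e"}))
}
```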
Isaac Hess
edda8d73bd storagenode/pieces: Piece deleter monitor queue
Each time we process a piece deletion on the storagenode, monitor how
long the item was in the queue and the size of the queue.

Change-Id: I23f1a44f8b9cecb901bdf4739d55c005ffed4bef
2020-04-27 08:55:43 -06:00
Isaac Hess
a785d37157 storagenode/pieces: Process deletes asynchronously
To improve delete performance, we want to process deletes asynchronously
once the message has been received from the satellite. This change makes
it so that storagenodes will send the delete request to a piece Deleter,
which will process a "best-effort" delete asynchronously and return a
success message to the satellite.

There is a configurable number of max delete workers and a max delete
queue size.

Change-Id: I016b68031f9065a9b09224f161b6783e18cf21e5
2020-04-23 11:51:19 -06:00
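
A minimal sketch of best-effort asynchronous deletion with a fixed number of workers and a bounded queue. The type and function names are hypothetical, and the remove callback stands in for the real piece deletion.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// asyncDeleter processes deletes on background workers.
type asyncDeleter struct {
	queue chan string
	wg    sync.WaitGroup
}

// startDeleter launches `workers` goroutines draining a queue of
// capacity queueSize.
func startDeleter(workers, queueSize int, remove func(pieceID string)) *asyncDeleter {
	d := &asyncDeleter{queue: make(chan string, queueSize)}
	for i := 0; i < workers; i++ {
		d.wg.Add(1)
		go func() {
			defer d.wg.Done()
			for id := range d.queue {
				remove(id) // best effort: failures would be logged, not returned
			}
		}()
	}
	return d
}

// Submit queues a delete and returns immediately; false means the queue is full.
func (d *asyncDeleter) Submit(id string) bool {
	select {
	case d.queue <- id:
		return true
	default:
		return false
	}
}

// Close stops accepting work and waits for the workers to drain the queue.
func (d *asyncDeleter) Close() {
	close(d.queue)
	d.wg.Wait()
}

// deleteAll submits every ID and reports how many were accepted and how
// many the workers processed.
func deleteAll(ids []string) (accepted, deleted int) {
	var n int64
	d := startDeleter(2, len(ids), func(string) { atomic.AddInt64(&n, 1) })
	for _, id := range ids {
		if d.Submit(id) {
			accepted++
		}
	}
	d.Close()
	return accepted, int(atomic.LoadInt64(&n))
}

func main() {
	fmt.Println(deleteAll([]string{"a", "b", "c"}))
}
```

The caller can acknowledge the request as soon as Submit returns, which is what lets the storagenode reply to the satellite before the deletes have actually happened.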
Egon Elbre
11a44cdd88 all: don't depend on gogo/proto directly
Change-Id: I8822dea0d1b7b99e0b828e0373a0308a42dde2be
2020-04-08 17:32:15 +00:00