87 Commits

Author SHA1 Message Date
Michal Niewrzal
6ac5bf0d7c satellite/gracefulexit: remove segments loop parts
We are switching completely to ranged loop.

https://github.com/storj/storj/issues/5368

Change-Id: Ia3e2d7879d91f7f5ffa99b8e8f108380e3b39f31
2023-04-24 15:00:26 +00:00
Egon Elbre
f5020de57c storagenode/blobstore: move blob store logic
The blobstore implementation is entirely related to storagenode, so the
rightful place is together with the storagenode implementation.

Fixes https://github.com/storj/storj/issues/5754

Change-Id: Ie6637b0262cf37af6c3e558556c7604d9dc3613d
2023-04-05 18:06:20 +00:00
paul cannon
ed7c82439d storage/filestore: avoid stat() during walkNamespaceInPath
Calling stat() (really, lstat()) on every file during a directory walk
is the step that takes up the most time. Furthermore, not all directory
walk uses _need_ to have a stat done on every file. Therefore, in this
commit we avoid doing the stat at the lowest level of
walkNamespaceInPath. The stat will still be done when it is requested,
with the Stat() method on the blobInfo object.

The major upside of this is that we can avoid the stat call on most
files during a Retain operation. This should speed up garbage collection
considerably.

The major downside is that walkNamespaceInPath will no longer
automatically skip over directories that are named like blob files, or
blob files which are deleted between readdir() and stat(). Callers to
walkNamespaceInPath and its variants (WalkNamespace,
WalkSatellitePieces, etc) are now expected to handle these cases
individually.

Thanks to forum member Toyoo for the insight that this would speed up
garbage collection.

Refs: https://github.com/storj/storj/issues/5454
Change-Id: I72930573d58928fa25057ed89cd4ec474b884199
2023-01-30 13:47:03 +00:00
Egon Elbre
e9692c5681 storagenode/gracefulexit: remove unused interface
Change-Id: Ie6c3d69f5177872d8f4308ac476bc87655da9e4b
2022-08-04 11:26:14 +03:00
Egon Elbre
cf92220c20 {satellite,storagenode}/gracefulexit: simplify limiter usage
Change-Id: Ied7091fe5355b96d327e3f893c5bdd4946a9e6af
2022-08-04 08:18:15 +00:00
Egon Elbre
bc9ab8ee5e satellite/audit,storagenode/gracefulexit: fixes to limiter
Ensure we don't rely on limiter to wait multiple times.

Change-Id: I75d48420236216d4c2fc6fa99293f51f80cd9c33
2022-08-03 10:24:16 +03:00
Fadila Khadar
c00ecae75c satellite/gracefulexit: stop using gracefulexit_transfer_queue
Remove the logic associated with the old transfer queue.
A new transfer queue (gracefulexit_segment_transfer_queue) was created for the migration to segmentLoop.
Transfers from the old queue were not moved to the new queue.
Instead, the old queue was still used for nodes that had initiated graceful exit before the migration.
No such nodes are left, so we can remove all this logic.
As a next step, we will drop the table.

Change-Id: I3aa9bc29b76065d34b57a73f6e9c9d0297587f54
2021-09-14 11:52:34 +00:00
Michał Niewrzał
c258f4bbac private/testplanet: move Metabase outside Metainfo for satellite
At some point we moved the metabase package outside Metainfo,
but we didn't do that for the satellite structure. This change
refactors only tests.
Once uplink is adjusted, we can remove the old entries in the
Metainfo struct.

Change-Id: I2b66ed29f539b0ec0f490cad42c72840e0351bcb
2021-09-09 07:15:51 +00:00
Fadila Khadar
c4202b9451 satellite/gracefulexit: use graceful_exit_segment_transfer_queue
To be able to use the segment metainfo loop, graceful exit transfers have to include the segment stream_id/position instead of the path. For this, we created a new table, graceful_exit_segment_transfer_queue, that will replace graceful_exit_transfer_queue. The table was created in a previous migration and made accessible through the graceful exit db in another one.
This change makes graceful exit enqueue transfer items for newly exiting nodes in the new table.

Change-Id: I7bd00de13e749be521d63ef3b80c168df66b9433
2021-07-21 14:02:20 +00:00
Fadila Khadar
b0d98b1c1a satellite/gracefulexit: allow use of graceful_exit_segment_transfer_queue
To be able to use the segment metainfo loop, graceful exit transfers have to include the segment stream_id/position instead of the path. For this, we created a new table, graceful_exit_segment_transfer_queue, that will replace graceful_exit_transfer_queue. The table was created in a previous migration.
This change gives access to this table.
Graceful Exit doesn't use the table yet; this will be done in a later change.

Change-Id: I6c09cff4cc45f0529813a8898ddb2d14aadb2cb8
2021-07-21 12:34:44 +00:00
Egon Elbre
86e698f572 pb: use *UnimplementedServer to avoid breaking API changes
Change-Id: I99a34eeb37ac4453411f273511710562a519f57a
2021-03-29 12:26:10 +03:00
Kaloyan Raev
6f3d0c4ad5 Merge remote-tracking branch 'origin/main' into multipart-upload
Conflicts:
	go.mod
	go.sum
	satellite/repair/repair_test.go
	satellite/repair/repairer/segments.go

Change-Id: Ie51a56878bee84ad9f2d31135f984881a882e906
2021-02-02 19:19:04 +02:00
paul cannon
c489a70e62 storagenode/gracefulexit: omit finished exits from ListPendingExits
From the name of the function and from the way it is used (only called
in one place, from "storj.io/storagenode/gracefulexit".(*Chore).Run()),
it should not return graceful exits that have already completed.

In particular, this causes a problem in the case that a node has already
completed a graceful exit from one satellite, after which the satellite
was decommissioned and no longer in the "trusted" list. This causes an
error message to show up in the node logs every single minute like
"failed to get satellite address ... satellite \"X\" is untrusted".

https://forum.storj.io/t/error-gracefulexit-service-failed-to-get-satellite-address/11372

This change causes ListPendingExits to list pending exits only, not all
exits.

Correspondingly, the check for whether an exit is already completed, in
(*Chore).Run(), becomes unnecessary and is here removed.

Change-Id: Ia3e9bb3e92be4a32ebcbda0321e3fe61d77deaa8
2021-02-01 15:28:50 +00:00
nerdatwork
74e293693e storagenode/gracefulexit: improve error message
2021-02-01 15:09:18 +02:00
Michal Niewrzal
b3aa28cc02 satellite/gracefulexit: migrate to metabase
Change-Id: I8be9cc68894124427e4a30d7631126b3afb1f281
2020-12-18 10:57:39 +00:00
Michal Niewrzal
7dde184cb5 Merge 'master' branch
Change-Id: I6070089128a150a4dd501bbc62a1f8b394aa643e
2020-11-10 11:58:59 +00:00
Moby von Briesen
db6bc6503d satellite/metainfo: Update metainfo RS config to more easily support multiple RS schemes.
Make metainfo.RSConfig a valid pflag config value. This allows us to
configure the RSConfig as a string like k/m/o/n-shareSize, which makes
having multiple supported RS schemes easier in the future.

RS-related config values that are no longer needed have been removed
(MinTotalThreshold, MaxTotalThreshold, MaxBufferMem, Verify).

Change-Id: I0178ae467dcf4375c504e7202f31443d627c15e1
2020-11-09 22:16:13 +00:00
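
As a rough illustration of the `k/m/o/n-shareSize` string form described in the commit above, the sketch below parses such a value with the standard library only. It is not the storj code. The `RSConfig` struct here is a simplified, hypothetical version; the reading of k, m, o, n as minimum, repair, success, and total thresholds is an assumption; and the share size is assumed to be a plain byte count. The three methods match the method set of the `pflag.Value` interface (`String`, `Set`, `Type`), so a type like this could be registered as a flag value.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// RSConfig is a simplified, hypothetical version of the type named in the
// commit. Field meanings (k=Min, m=Repair, o=Success, n=Total) are assumed.
type RSConfig struct {
	Min, Repair, Success, Total int
	ShareSize                   int // bytes; assumed to be a plain integer
}

// Type is part of the pflag.Value method set.
func (rs *RSConfig) Type() string { return "metainfo.RSConfig" }

// String renders the config in the k/m/o/n-shareSize form.
func (rs *RSConfig) String() string {
	return fmt.Sprintf("%d/%d/%d/%d-%d",
		rs.Min, rs.Repair, rs.Success, rs.Total, rs.ShareSize)
}

// Set parses a k/m/o/n-shareSize string and stores it in rs.
func (rs *RSConfig) Set(s string) error {
	scheme, share, ok := strings.Cut(s, "-")
	if !ok {
		return fmt.Errorf("invalid RS config %q: missing share size", s)
	}
	parts := strings.Split(scheme, "/")
	if len(parts) != 4 {
		return fmt.Errorf("invalid RS config %q: want k/m/o/n", s)
	}
	var vals [4]int
	for i, p := range parts {
		v, err := strconv.Atoi(strings.TrimSpace(p))
		if err != nil || v <= 0 {
			return fmt.Errorf("invalid RS config %q: bad value %q", s, p)
		}
		vals[i] = v
	}
	size, err := strconv.Atoi(strings.TrimSpace(share))
	if err != nil || size <= 0 {
		return fmt.Errorf("invalid RS config %q: bad share size %q", s, share)
	}
	// Sanity check on the assumed ordering of the thresholds.
	if vals[0] > vals[1] || vals[1] > vals[2] || vals[2] > vals[3] {
		return fmt.Errorf("invalid RS config %q: want k <= m <= o <= n", s)
	}
	*rs = RSConfig{vals[0], vals[1], vals[2], vals[3], size}
	return nil
}

func main() {
	var rs RSConfig
	if err := rs.Set("29/35/80/110-256"); err != nil {
		panic(err)
	}
	fmt.Println(rs.String())
}
```

Keeping the whole scheme in one string means that supporting a second scheme is a matter of accepting a second value of the same type, rather than adding another set of separate threshold flags.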
Egon Elbre
1903b15474 storagenode/internalpb: move gracefulexit.proto
Change-Id: Ia3614846ed49a39c8f39331516d16d45a695240b
2020-10-30 15:24:56 +02:00
Egon Elbre
76f4619a9c {satellite,storagenode}/gracefulexit: ensure client is closed
Change-Id: I576a955a5578caf7fcbee832beca28cef2b0c83e
2020-10-27 23:27:07 +02:00
paul cannon
76d4977b6a storagenode/gracefulexit: logic moved from worker to service
Change-Id: I8b12606a96b712050bf40d587664fb1b2c578fbc
2020-10-22 23:19:30 +00:00
Egon Elbre
0bdb952269 all: use keyed special comment
Change-Id: I57f6af053382c638026b64c5ff77b169bd3c6c8b
2020-10-13 15:13:41 +03:00
Michal Niewrzal
aa47e70f03 satellite/metainfo: use metabase.SegmentKey with metainfo.Service
Instead of using string or []byte we will be using dedicated type
SegmentKey.

Change-Id: I6ca8039f0741f6f9837c69a6d070228ed10f2220
2020-09-03 15:11:32 +00:00
Egon Elbre
f0ef01de5b storagenode/gracefulexit: retry workers faster
Change-Id: Ica20a691ff117a2b36a6362ee1fed21ce49a9ac1
2020-08-24 12:27:27 +03:00
Egon Elbre
e6bea41083 Revert "gracefulexit: reconnect added"
This reverts commit cff44fbd19.

Change-Id: I6590f483493e308b8244151e1df7570fd32ca2f8
2020-08-23 18:11:24 +03:00
Qweder93
cff44fbd19 gracefulexit: reconnect added
Change-Id: I236689af944effe3e79ef92e852ae264d3b372e5
2020-08-22 14:59:46 +03:00
littleskunk
db57d76ee9 storagenode/gracefulexit: fix wrong error handling for corrupted pieces (#3930)
2020-08-21 11:35:03 +02:00
Ethan
5445d595c0 storagenode/gracefulexit: Wait for the worker delete and transfer goroutines to finish before completing the exit
A failed test showed the same piece being deleted twice. This happens if the graceful exit completes before a previous piece deletion finishes. This change adds a "wait" on the limiter before executing the delete all step when GE is done.

Change-Id: I1c8c49d1e501c2728c80d4224a4854e742be27da
2020-08-19 14:20:26 +00:00
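
The ordering problem described in the commit above can be illustrated with a small sketch. This is not the storj limiter: `limiter`, `store`, and `finishExit` are hypothetical names built on `sync.WaitGroup` and a channel semaphore. The point is only that the final delete-all step runs after `Wait()` returns, so no individual piece deletion can still be in flight.

```go
package main

import (
	"fmt"
	"sync"
)

// limiter is a hypothetical minimal concurrency limiter.
type limiter struct {
	slots chan struct{}
	wg    sync.WaitGroup
}

func newLimiter(n int) *limiter {
	return &limiter{slots: make(chan struct{}, n)}
}

// Go blocks until a slot is free, then runs fn in a new goroutine.
func (l *limiter) Go(fn func()) {
	l.slots <- struct{}{}
	l.wg.Add(1)
	go func() {
		defer func() {
			<-l.slots
			l.wg.Done()
		}()
		fn()
	}()
}

// Wait blocks until every started goroutine has finished.
func (l *limiter) Wait() { l.wg.Wait() }

// store is a toy piece store guarded by a mutex.
type store struct {
	mu     sync.Mutex
	pieces map[string]bool
}

func (s *store) delete(id string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.pieces, id)
}

// deleteAll removes everything that is left and reports how many
// pieces it removed.
func (s *store) deleteAll() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	n := len(s.pieces)
	s.pieces = map[string]bool{}
	return n
}

// finishExit deletes the transferred pieces concurrently, waits for all
// of those deletions, and only then runs the delete-all step.
func finishExit(s *store, transferred []string, concurrency int) int {
	l := newLimiter(concurrency)
	for _, id := range transferred {
		id := id
		l.Go(func() { s.delete(id) })
	}
	l.Wait() // without this, deleteAll could race with pending deletes
	return s.deleteAll()
}

func main() {
	s := &store{pieces: map[string]bool{"a": true, "b": true, "c": true, "d": true}}
	left := finishExit(s, []string{"a", "b"}, 2)
	fmt.Println("pieces removed by delete-all:", left)
}
```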
Egon Elbre
d8dcae3075 all: fix error checking
Change-Id: Ia0da1bbd6ce695139922f94096c2419281905e32
2020-07-16 19:13:14 +03:00
Egon Elbre
e70da5cd4e all: fix comments
Change-Id: I2d2307e3fab87de47a72b3595d051e2c95ff4f8a
2020-07-16 19:13:14 +03:00
Egon Elbre
080ba47a06 all: fix dots
Change-Id: I6a419c62700c568254ff67ae5b73efed2fc98aa2
2020-07-16 14:58:28 +00:00
Qweder93
f73e92c268 storagenode/gracefulexit: added blobs clean
On node start, check whether any of the trusted satellites has GE status "Exited successfully".
If so, try to delete the blobs/satellite folder, so that no trash is left on the SNO.

Change-Id: I566266c84f2a872df54cd01bc2f15a9934f138ed
2020-07-13 11:49:18 +00:00
Qweder93
0521435e08 storagenode/gracefulexit: added deletion of all files left in storage/blobs/satellite after successful GE
https://storjlabs.atlassian.net/browse/SG-368

Change-Id: I29a978fe0d0153aedf2be91dc7f45b4ef386d447
2020-07-08 14:38:31 +03:00
Yingrong Zhao
51dfc6bf4f storagenode/gracefulexit: make minimum transfer speed to be 5KiB
With 128B/sec, a satellite with the default 10min timeout could already
have closed its connection to a node even though the node was able to
complete the transfer.

Change-Id: I6173d6473a62c6d0b0e0a8765c1ae0a5e57b0a08
2020-06-23 21:14:18 +00:00
Qweder93
e52809d53e cmd/storagenode: add check if satellites available to gracefulexit
Change-Id: I8747507593d810bbdec0d140de0600ee147011c3
2020-06-10 13:38:36 +00:00
paul cannon
7395dd1e6e storagenode/gracefulexit: revalidate existing pieces
..before they are transferred to another node and submitted to the
satellite as successful piece transfers, because if we submit an invalid
signature, the node will be marked as a cheater and disqualified
immediately.

These signatures should have been validated when the piece was
originally stored, but bitrot does happen and needn't be cause for an
immediate DQ.

Change-Id: I8b0ebd5812ea8a2e60766005b7251fbb74ef7857
2020-05-28 09:50:14 -05:00
Egon Elbre
94b2b315f7 storagenode/trust: refactor GetAddress to GetNodeURL
Most places now need the NodeURL rather than the ID and Address
separately. This simplifies code in multiple places.

Change-Id: I52621d8ca52296a8b5bf7afbc1001cf8bfb44239
2020-05-20 11:05:15 +00:00
Egon Elbre
ed627144ed all: use DialNodeURL throughout the codebase
Change-Id: Iaf9ae3aeef7305c937f2660c929744db2d88776c
2020-05-20 10:36:30 +00:00
Egon Elbre
c630cf2490 storagenode/pieces: implement buffering for writing
Currently uploads can cause a lot of IOPS. Reduce this by introducing an
in-memory buffer on top of the file.

Change-Id: I5f4e3e01c0a36258271d180b922107de447bcb59
2020-05-04 06:01:32 +00:00
Egon Elbre
11a44cdd88 all: don't depend on gogo/proto directly
Change-Id: I8822dea0d1b7b99e0b828e0373a0308a42dde2be
2020-04-08 17:32:15 +00:00
Egon Elbre
e1a443b04a private/testplanet: allow modifying created database
Instead of providing the database from outside to testplanet create it
inside and then allow wrapping and modifying it. This is more convenient
to use.

Change-Id: I9b8f69e6e0a19ff984b4e2bfe927c9100c77bc6c
2020-03-27 19:14:48 +00:00
Egon Elbre
e8f18a2cfe private/testplanet: expose storagenode and satellite Config
Change-Id: I80fe7ed8ef7356948879afcc6ecb984c5d1a6b9d
2020-03-27 17:01:25 +02:00
Yingrong Zhao
b7b19289d1 bump storj.io/common to latest
Change-Id: I16e337660ce8e1ef332cc842dbf4cfa067b9b98b
2020-03-25 09:08:40 -04:00
Bill Thorp
94c11c5212 satellite: remove some unnecessary UTC() calls
Fixes some easy cases of extraneous UTC() calls

Change-Id: I3f4c287ae622a455b9a492a8892a699e0710ca9a
2020-03-13 13:49:44 +00:00
Egon Elbre
5342dd9fe6 go.mod: update uplink
Change-Id: I867a6a1eef8aa5d60bb676e5112b98c4192ce811
2020-02-21 16:08:12 +02:00
Jeff Wendling
7999d24f81 all: use monkit v3
this commit updates our monkit dependency to the v3 version where
it outputs in an influx style. this makes discovery much easier
as many tools are built to look at it this way.

graphite and rothko will suffer some due to no longer being a tree
based on dots. hopefully time will exist to update rothko to
index based on the new metric format.

it adds an influx output for the statreceiver so that we can
write to influxdb v1 or v2 directly.

Change-Id: Iae9f9494a6d29cfbd1f932a5e71a891b490415ff
2020-02-05 23:53:17 +00:00
Egon Elbre
10be538602 storagenode: add pkg/debug support
Change-Id: If941095b886c28a0d53fff4c9bf9fa0ce7471dea
2020-01-29 16:30:31 -05:00
littleskunk
5c68f4fc7c storagenode/gracefulexit: higher concurrency and shorter timeouts
1 transfer with a minimum speed of 128 B/s was a nice try but it is
way too low. Even a pi3 was able to handle 7 grpc transfers. We have 4
satellites and with 5 concurrent transfers that should be a total of 20
concurrent transfers. Each transfer will have a minimum speed of 5KB/s.
That should give us better throughput and still be OK on a pi3.

Change-Id: I650a7baf890080901ef70ea3b5636d93009b4e60
2020-01-24 23:51:39 +00:00
Isaac Hess
44de90ecc8 storagenode/pieces: Rename vars and update comments
A few variables were not renamed to the new standard piecesTotal and
piecesContentSize, so it was unclear which value was being used. These
have been updated, and some comments made more thorough.

Change-Id: I363bad4dec2a8e5c54d22c3c4cd85fc3d2b3096c
2020-01-23 11:00:24 -07:00
Isaac Hess
14fd6a9ef0 storagenode/pieces: Track total piece size
This change updates the storagenode piecestore apis to expose access to
the full piece size stored on disk. Previously we only had access to
(and only kept a cache of) the content size used for all pieces. This
was inaccurate when reporting the amount of disk space used by nodes.

We now have access to the total content size, as well as the total disk
usage, of all pieces. The pieces cache also keeps a cache of the total
piece size along with the content size.

Change-Id: I4fffe7e1257e04c46021a2e37c5adc6fe69bee55
2020-01-23 11:00:24 -07:00
Michal Niewrzal
6502454947 satellite/metainfo: move RS configuration to satellite
With this change the RS configuration will be set on the satellite. Uplink
will get the RS values with the BeginObject request and use them. For backward
compatibility, and to avoid a very large change, the redundancy scheme stored
with the bucket is not touched. This can be done in the future.

Change-Id: Ia5f76fc10c37e2c44e4f7b8754f28eafe1f97eff
2020-01-22 09:33:53 +00:00