JakeHillion/storj

Commit Graph


Author

SHA1

Message

Date

paul cannon

72189330fd

satellite/gracefulexit: revamp graceful exit

Currently, graceful exit is a complicated subsystem that keeps a queue
of all pieces expected to be on a node, and asks the node to transfer
those pieces to other nodes one by one. The complexity of the system
has, unfortunately, led to numerous bugs and unexpected behaviors.

We have decided to remove this entire subsystem and restructure graceful
exit as follows:

* Nodes will signal their intent to exit gracefully
* The satellite will not send any new pieces to gracefully exiting nodes
* Pieces on gracefully exiting nodes will be considered by the repair
  subsystem as "retrievable but unhealthy". They will be repaired off of
  the exiting node as needed.
* After one month (with an appropriately high online score), the node
  will be considered exited, and held amounts for the node will be
  released. The repair worker will continue to fetch pieces from the
  node as long as the node stays online.
* If, at the end of the month, a node's online score is below a certain
  threshold, its graceful exit will fail.

Refs: https://github.com/storj/storj/issues/6042
Change-Id: I52d4e07a4198e9cb2adf5e6cee2cb64d6f9f426b

2023-09-27 08:40:01 +00:00

Andrew Harding

4241e6bf5f

satellite/gracefulexit: implement rangedloop observer

The tests are forked from the chore tests with slight adaptations for
being run against the ranged loop. I also moved a benchmark for the
database from chore_test.go to db_test.go.

The pathcollector is reused as a rangedloop.Partial.

https://github.com/storj/storj/issues/5234

Change-Id: I56182031d133812a9f4d4a433c01b9150af39f31

2022-12-22 10:47:10 -07:00

2 Commits