Commit Graph

24 Commits

Author SHA1 Message Date
Ethan Adams
43103ae13f
lower storage node counts in tests (#3427) 2019-10-31 10:57:54 -04:00
Natalie Villasana
4878135068
satellite/gracefulexit, storagenode/gracefulexit: add timeouts (#3407) 2019-10-30 13:40:57 -04:00
Egon Elbre
65a8e0bcbc
{satellite,storagenode}/gracefulexit: clearer log messages (#3413) 2019-10-30 10:21:27 +02:00
Maximillian von Briesen
54594e79c3 satellite/gracefulexit: add metrics on satellite for graceful exit (#3355) 2019-10-29 16:22:20 -04:00
Maximillian von Briesen
cd0940724c satellite/gracefulexit: use sync2 cycle inside satellite graceful exit endpoint (#3394) 2019-10-29 14:40:42 -04:00
Maximillian von Briesen
cd3d3850f9 satellite/gracefulexit: only allow one connection per node to graceful exit endpoint (#3357) 2019-10-29 13:23:17 -04:00
Ethan Adams
7c2daa4dd9
Use original pointer when calling UpdatePieces (#3397) 2019-10-28 14:43:46 -04:00
Ethan Adams
9905f2c61e add piece num to transfer queue PK (#3390) 2019-10-28 11:08:33 -04:00
Yingrong Zhao
292e64ee2f
satellite/gracefulexit: check duplicate node id before update pointer (#3380)
* check duplicate node id before update pointer

* add test for transfer failure when pointer already contain the receiving node id

* check exiting and receiving nod are still in the pointer

* check node id only exists once in a pointer

* return error if the existing node doesn't match with the piece info in the pointer

* try to recreate the issue on jenkins

* should not remove exiting node piece in test

* Update satellite/gracefulexit/endpoint.go

Co-Authored-By: Maximillian von Briesen <mobyvb@gmail.com>

* Update satellite/gracefulexit/endpoint.go

Co-Authored-By: Maximillian von Briesen <mobyvb@gmail.com>
2019-10-27 14:20:22 -04:00
Maximillian von Briesen
a4e618fd1f handle context cancelled in satellite graceful exit endpoint (#3388) 2019-10-27 10:14:25 -04:00
Ethan Adams
e54d290d2e satellite/gracefulexit: Add signatures for success/failed exit finished messages. (#3368)
* add signatures, fix process loop bug, move delete to on success

* added tests for signatures

* PR comment updates

* fixed setting reason by default.

* updates for PR comments

* added signed failure when verificationi fails

* moved to sign_test

* fix panic

* removed testplanet from test
2019-10-25 16:36:26 -04:00
Maximillian von Briesen
6df4d7bc73
storagenode/gracefulexit + satellite/gracefulexit: add storagenode-side transfer validation (#3371)
* Make the exiting node check piece hashes, piece IDs, and piece hash signatures before relaying successful transfer data to the satellite.
* Enable immediate graceful exit failure for "successful" transfers that fail satellite-side validation.
* Move transfer piece logic in storagenode worker to separate function (to make the worker easier to understand)
2019-10-25 13:16:20 -04:00
Ethan Adams
f0caa6ce5e
satellite:gracefulexit: Update pointer after success (#3369) 2019-10-25 11:14:22 -04:00
Natalie Villasana
696c567e89
satellite/gracefulexit: add piece hash validation for successful transfer (#3313) 2019-10-24 15:38:40 -04:00
Yingrong Zhao
fa1ac24e19
satellite/gracefulexit: add failure threshold check (#3329)
* add overall failure percentage check and inactive time frame check before sending a response to sno

* update comment

* delete node from transfer queue if it has been inactive for too long

* fix linting error

* add test config value

* fix nil pointer

* add config value into testplanet

* add unit test for overall failure threshold

* move timeframe threshold to chore

* update protolock

* add chore test

* add per peiece failure count logic

* change config name from EndpointMaxFailures to MaxFailuresPerPiece

* address comments

* fix linting error

* add error handling for no row returned from progress table

* fix test for graceful exit chore on storagenode

* fix typo InActive -> Inactive

* improve readability for failure threshold calculation

* update config lock

* change error handling for GetProgress in graceful exit endpoint on the satellite side

* return proper rpc error in endpoint

* add check in chore test for checking finish timestamp and queue
2019-10-24 12:24:42 -04:00
Maximillian von Briesen
abb567f6ae
cmd/satellite: add graceful exit reports command to satellite CLI (#3300)
* update lock file and add comment

* add created at and bytes transferred

* cleanup

* rename db func to GetGracefulExitNodesByTimeFrame

* fix flag

* split into two overlay functions

* := to =

* fix test

* add node not found error class

* fix overlay test

* suggested test changes

* review suggestions

* get exit status from overlay.Get()

* check rows.Err

* fix panic when ExitFinishedAt is nil

* fix comments in cmdGracefulExit
2019-10-22 21:06:01 -04:00
Ethan Adams
3e0d12354a
storagenode/gracefulexit: Implement storage node graceful exit worker - part 1 (#3322) 2019-10-22 16:42:21 -04:00
Natalie Villasana
22d0f89941
satellite/gracefulexit: use zap.Stringer instead of zap.String (#3299) 2019-10-17 10:29:35 -04:00
Ethan Adams
37ab84355f
satellite/gracefulexit: protobuf field name updates (#3284)
rename piece_id to original_piece_id
2019-10-15 15:59:12 -04:00
Ethan Adams
78ccf14837 fix interface and EOF check (#3251) 2019-10-12 07:06:20 -06:00
Ethan Adams
a1275746b4
satellite/gracefulexit: Implement the 'process' endpoint on the satellite (#3223) 2019-10-11 17:18:05 -04:00
Ethan Adams
4c4519f0be
satellite/gracefulexit: add transfer queue for pieces (#3174)
initial impl of transfer queue
updated docs represent the new design how we handle durability during exit
2019-10-07 16:38:05 -04:00
Natalie Villasana
4f2f8ae11b satellite/overlay: add UpdateExitStatus and GetExitingNodes for graceful exit (#3087) 2019-10-01 18:18:21 -04:00
Ethan Adams
9edfb6efe0
satellite/satellitedb: Initial GE Satellite DB Implementation (#3049)
Initial GE Satellite DB impl
Add basic CRUD operations for graceful_exit_progress and graceful_exit_transfer_queue tables.
2019-09-25 11:12:44 -06:00