* pkg/pg: Add new service function storage node
Add a new service function to the storage node piece store for deleting
pieces when satellites request them.
* storagenode/piecestore: Add endpoint to delete piece
Add a new endpoint to receive from trusted satellites to delete a piece.
* private/testplanet: Fix storagenode mock
Add to the storagenode mock the new endpoint method.
* proto.lock: Update it with the last protbuff changes
* storagenode/piecestore: Reuse test piece upload
Extract the repeated logic from several tests functions for uploading a
test piece to a test helper function.
* uplink/piecestore: Implement client side method
Implement the client side method of the new piecestore RPC function.
* storagenode/piecestore: Add test DeletePiece endpoint
Implement a test for the DeletePiece new endpoint method.
* put TestCreateV0 back in StoreForTest
* avoid direct handles to V0 pieceinfo db
* type mismatch fix
* use storage.Blobs interface in store_test.go
..instead of filestore.Store. this will allow filestore.Store to become
unexported.
* unexport filestore.Store
rename it to blobStore. things should use the storage.Blobs interface
instead. changes in this commit are purely mechanical (made through the
"refactor" tool in Gocode followed by search/replace on the word "Store"
within the storage/filestore/ directory).
* kill filestore.StoreForTest
now that filestore.blobStore is unexported, there isn't a need for a
specialized wrapper type. this (not coincidentally) also makes it
possible for the WriterForFormatVersion() method on
storagenode/pieces.StoreForTest to work, without requiring everything to
wrap the store.blobs attribute in a filestore.StoreForTest, which was
impractical.
* change satellite.Peer name to Core
* change to Core in testplanet
* missed a few places
* keep shared stuff in peer.go to stay consistent with storj/docs
* improve errors in satellite contact endpoints
* add changes per CR comments
* update pingback method so it still updates node table
* fix err and returns
* fix zap logging to be better
When the contact chore starts running before the monitor service has
provided any useful capacity data, the first outgoing contact has
not-very-helpful data for the satellite. This change causes the contact
chore to wait until capacity data is available. The wait should be quite
short in all reasonable cases: even when a node starts with a lot of
stored pieces and no cached spaceUsedDB data, new data will have been
calculated and cached by the call to
`peer.Storage2.CacheService.Init(ctx)` in `storagenode.cmdRun()` before
`peer.Run(ctx)`.
Change-Id: Ibc26d5c1fc10a23006c00bc3f13ff6cf71f8bf1d
* Make the exiting node check piece hashes, piece IDs, and piece hash signatures before relaying successful transfer data to the satellite.
* Enable immediate graceful exit failure for "successful" transfers that fail satellite-side validation.
* Move transfer piece logic in storagenode worker to separate function (to make the worker easier to understand)
* add overall failure percentage check and inactive time frame check before sending a response to sno
* update comment
* delete node from transfer queue if it has been inactive for too long
* fix linting error
* add test config value
* fix nil pointer
* add config value into testplanet
* add unit test for overall failure threshold
* move timeframe threshold to chore
* update protolock
* add chore test
* add per peiece failure count logic
* change config name from EndpointMaxFailures to MaxFailuresPerPiece
* address comments
* fix linting error
* add error handling for no row returned from progress table
* fix test for graceful exit chore on storagenode
* fix typo InActive -> Inactive
* improve readability for failure threshold calculation
* update config lock
* change error handling for GetProgress in graceful exit endpoint on the satellite side
* return proper rpc error in endpoint
* add check in chore test for checking finish timestamp and queue
libuplink was incorrectly setting timeouts to 10 seconds still, but
should have been at least 10 minutes. the order sender was setting them
to 1 hour. we don't want timeouts in uplink-side logic as it establishes
a minimum rate on tcp streams.
instead of all of this, just use tcp keep alive. tcp keep alive packets are
sent every 15 seconds and if the peer stops responding the connection
dies. this is enabled by default with go. this will kill tcp connections
when they stop working.
Change-Id: I3d7ad49f71950b3eb43044eedf4b17993116045b
The upload code currently updates the usage in a deferred call to saveOrder().
The consequence is that in the success case, the RPC is completed before
the usage has been updated.
This change repurposes the deferred call to update usage in the
failure case, while explicitly updating the usage before completing the
RPC.
This fixes some test flakiness when using dRPC. gRPC waits until the final status is written before a Recv call completes, and the final status is written by the server after the handler function has exited. In practice this means that the client is blocked until the defer call is also finished. So this change will not change performance at all.
It has two advantages:
(1) It fixes test flakiness
and, more importantly:
(2) reduces the chances that someone will accidentally write a flaky test in the future
* add exit-status command
* remove todo and fix format
* fix status display
* change startExit to exit progress
* fix linting error
* add successful column in exit progress
* fix test
* remove extra new line
* fix TYPOS
* format the percentage better