* storagenode/piecestore: track live requests together
Change-Id: I9ed44e4484b97bcbe076c222450c3449fe8b1075
* show grpc status codes in monkit failures
Change-Id: I68bc3a8d24a372e8147ef2a74636fc3e40fa799a
* small nit
Change-Id: I722b09345377b079e41c5a3dc86d7fd6232c9d24
* pkg/server: don't use global logger
* satellite/overlay: use correct logger
* pkg/kademlia: use correct logger
* linksharing: use conventional way to pass in logger
* use zaptest in tests
* rename pkg/linksharing to linksharing
* rename pkg/httpserver to linksharing/httpserver
* rename pkg/eestream to uplink/eestream
* rename pkg/stream to uplink/stream
* rename pkg/metainfo/kvmetainfo to uplink/metainfo/kvmetainfo
* rename pkg/auth/signing to pkg/signing
* rename pkg/storage to uplink/storage
* rename pkg/accounting to satellite/accounting
* rename pkg/audit to satellite/audit
* rename pkg/certdb to satellite/certdb
* rename pkg/discovery to satellite/discovery
* rename pkg/overlay to satellite/overlay
* rename pkg/datarepair to satellite/repair
* Added a gc package at satellite/gc, which contains the gc.Service, which runs garbage collection integrated with the metainfoloop, and the gc PieceTracker, which implements the metainfo loop Observer interface and stores all of the filters (about which pieces are good) for each node.
* Added a gc config located at satellite/gc/service.go (loop disabled by default in release)
* Creates bloom filters with pieces to be retained inside the metainfo loop
* Sends RetainRequests (or filters with good piece ids) to all storage nodes.
* pkg/datarepair/repairer: Track always time for repair
Make a minor change in the worker function of the repairer, that when
successful, always track the metric time for repair independently if the
time since checker queue metric can be tracked.
* storage/postgreskv: Wrap error in Get func
Wrap the returned error of the Get function as it is done when the
query doesn't return any row.
* satellite/metainfo: Move debug msg to the right place
NewStore function was writing a debug log message when the DB was
connected, however it was always writing it out despite if an error
happened when getting the connection.
* pkg/datarepair/repairer: Wrap error before logging it
Wrap the error returned by process which is executed by the Run method
of the repairer service to add context to the error log message.
* pkg/datarepair/repairer: Make errors more specific in worker
Make the error messages of the "worker" method of the Service more
specific and the logged message for such errors.
* pkg/storage/repair: Improve error reporting Repair
In order of improving the error reporting by the
pkg/storage/repair.Repair method, several errors of this method and
functions/methods which this one relies one have been updated to be
wrapper into their corresponding classes.
* pkg/storage/segments: Track path param of Repair method
Track in monkit the path parameter passed to the Repair method.
* satellite/satellitedb: Wrap Error returned by Delete
Wrap the error returned by repairQueue.Delete method to enhance the
error with a class and stack and the
pkg/storage/segments.Repairer.Repair method get a more contextualized
error from it.
Create a new variable rather than reusing the existing one because the
name of the existing one is confusing when reading the logic and it
requires more time that the logic doesn't have a bug.
* Added the ability to pass timeout settings from cmd/uplink to libuplink.
* Removed commented out code.
* Updated 2min timeouts for the uplink CLI.
* Removed comment.
* Made transport defaultDialTimeout and defaultRequestTimeout public
* Added comments to describe where these defaults apply.
* Added a new defaults to libuplink and added tests.
* Added a new defaults to libuplink and added tests.
* pkg/datarepair: Add test to check num upload pieces
Add a new test for ensuring the number of pieces that the repair process
upload when a segment is injured.
* satellite/orders: Don't create "put order limits" over total
Repair must not create "put order limits" more than the total count.
* pkg/datarepair: Update upload repair pieces test
Update the test which checks the number of pieces which are uploaded
during a repair for using the same excess over the success threshold
value than the implementation.
* satellites/orders: Limit repair put order for not being total
Limit the number of put orders to be used by repair for only uploading
pieces to a % excess over the successful threshold.
* pkg/datarepair: Change DataRepair test to pass again
Make some changes in the DataRepair test to make pass again after the
repair upload repaired pieces only until a % excess over success
threshold.
Also update the steps description of the DataRepair test after it has been
changed, to match on what's now, besides to leave it more generic for
avoiding having to update it on minimal future refactorings.
* satellite: Make repair excess optimal threshold configurable
Add a new configuration parameter to the satellite for being able to
configure the percentage excess over the optimal threshold, used for
determining how many pieces should be repaired/uploaded, rather than
having the value hard coded.
* repairer: Add configurable param to segments/repairer
Add a new parameters to the segment/repairer to calculate the maximum
number of excess nodes, based on the optimal threshold, that repaired
pieces can be uploaded.
This new parameter has been added for not returning more nodes than the
number of upload orders for data repair satellite service calculate for
repairing pieces.
* pkg/storage/ec: Update log message in clien.Repair
* satellite: Update configuration lock file
checker_segment_total_count - Number of total segments in pointer during checker iteration
checker_segment_healthy_count - Number of healthy segments in pointer during checker iterationn
time_since_checker_queue - Seconds elapsed between checker queue and beginning repair
time_for_repair - Seconds elapsed between beginning repair and ending repair/dequeueing