* satellite: update log levels
Change-Id: I86bc32e042d742af6dbc469a294291a2e667e81f
* log version on start up for every service
Change-Id: Ic128bb9c5ac52d4dc6d6c4cb3059fbad73f5d3de
* Use monkit for tracking failed ip resolutions
Change-Id: Ia5aa71d315515e0c5f62c98d9d115ef984cd50c2
* fix compile errors
Change-Id: Ia33c8b6e34e780bd1115120dc347a439d99e83bf
* add request limit value to storage node rpc err
Change-Id: I1ad6706a60237928e29da300d96a1bafa94156e5
* we cant track storage node ids in monkit metrics so lets use logging to track that for expired orders
Change-Id: I1cc1d240b29019ae2f8c774792765df3cbeac887
* fix build errs
Change-Id: I6d0ffe058e9a38b7ed031c85a29440f3d68e8d47
Add flag to satellite repairer, "InMemoryRepair" that allows the
satellite to decide whether to download the entire segment being
repaired into memory (this is what the satellite already does), or to
download it into temporary files on disk that will be read from in the
upload phase of repair.
This should help with handling high repair traffic on satellites that
cannot afford to spend 64mb of memory per repair worker.
Updates tests to test repair for both in memory and to disk.
Change-Id: Iddf591e165621497c98533d45bfea3c28b08a194
Previously, we were simply discarding rows from the repair queue when
they couldn't be repaired (either because the overlay said too many
nodes were down, or because we failed to download enough pieces).
Now, such segments will be put into the irreparableDB for further
and (hopefully) more focused attention.
This change also better differentiates some error cases from Repair()
for monitoring purposes.
Change-Id: I82a52a6da50c948ddd651048e2a39cb4b1e6df5c
all of the packages and tests work with both grpc and
drpc. we'll probably need to do some jenkins pipelines
to run the tests with drpc as well.
most of the changes are really due to a bit of cleanup
of the pkg/transport.Client api into an rpc.Dialer in
the spirit of a net.Dialer. now that we don't need
observers, we can pass around stateless configuration
to everything rather than stateful things that issue
observations. it also adds a DialAddressID for the
case where we don't have a pb.Node, but we do have an
address and want to assert some ID. this happened
pretty frequently, and now there's no more weird
contortions creating custom tls options, etc.
a lot of the other changes are being consistent/using
the abstractions in the rpc package to do rpc style
things like finding peer information, or checking
status codes.
Change-Id: Ief62875e21d80a21b3c56a5a37f45887679f9412
* update audit status as failed for nodes that failed piece hash verification
* remove comment
* fix lint error
* add test
* fix format
* use named return value for Get
* add comments
* add more better comment
* format
* add outline for ECRepairer
* add description of process in TODO comments
* begin download/getting hash for a single piece
* verify piece hash and order limit during download
* fix download piece
* begin filling out ESREpair. Get
* wip move ecclient.Repair to ecrepairer.Repair
* pass satellite signee into repairer
* reconstruct original stripe from pieces
* move rebuildStripe()
* calculate piece size differently, increment successful count
* fix shares slices initialization
* rename stripeData to segment
* do not pad reader in Repair()
* temp debug
* create unsafeRSScheme
* use decode reader
* rename file name to be all lowercase
* make repair downloader async
* declare condition variable inside Get method
* set downloadAndVerifyPiece's in-memory buffer to be share size
* update unusedLimits var
* address comments
* remove unnecessary comments
* move initialization of segmentRepaire to be outside of repairer service
* use ReadAll during download
* remove dots and move hashing to after validating for order limit signature
* wip test
* make sure files exactly at min threshold are repaired
* remove unused code
* use corrput data and write back to storagenode
* only create corrupted node and piece ids once
* add comment
* address nat's comment
* fix linting and checker_test
* update comment
* add comments
* remove "copied from ecclient" comments
* add clarification comments in ec.Repair