This change makes dial timeout configurable and change it also from
defatul 20s to 5s. Main motivation is that during repair we often loose
lots of time to dial which eventually will fail. New timeout should be
still enough to dial but we will move forward quicker to next node if
that one will fail.
Timeout is also applied directly as context timeout in case we will
use noise of tcp fast open one day.
Change-Id: I021bf459af49b11241e314fa1a7887c81d5214ea
We avoid putting more than one piece of a segment on the same /24
network (or /64 for ipv6). However, it is possible for multiple pieces
of the same segment to move to the same network over time. Nodes can
change addresses, or segments could be uploaded with dev settings, etc.
We will call such pieces "clumped", as they are clumped into the same
net, and are much more likely to be lost or preserved together.
This change teaches the repair checker to recognize segments which have
clumped pieces, and put them in the repair queue. It also teaches the
repair worker to repair such segments (treating clumped pieces as
"retrievable but unhealthy"; i.e., they will be replaced on new nodes if
possible).
Refs: https://github.com/storj/storj/issues/5391
Change-Id: Iaa9e339fee8f80f4ad39895438e9f18606338908
It was surprising that `satellite auditor` complained about SMTP mail settings, even if it's not supposed to sending any mail.
Looks like we can remove the mail service dependency, as it's not a hard requirement for overlay.Service.
Change-Id: I29a52eeff3f967ddb2d74a09458dc0ee2f051bd7
This code is essentially replacement for eestream.CalcPieceSize. To call
eestream.CalcPieceSize we need eestream.RedundancyStrategy which is not
trivial to get as it requires infectious.FEC. For example infectious.FEC
creation is visible on GE loop observer CPU profile because we were
doing this for each segment in DB.
New method was added to storj.Redundancy and here we are just wiring it
with metabase Segment.
BenchmarkSegmentPieceSize
BenchmarkSegmentPieceSize/eestream.CalcPieceSize
BenchmarkSegmentPieceSize/eestream.CalcPieceSize-8 5822 189189 ns/op 9776 B/op 8 allocs/op
BenchmarkSegmentPieceSize/segment.PieceSize
BenchmarkSegmentPieceSize/segment.PieceSize-8 94721329 11.49 ns/op 0 B/op 0 allocs/op
Change-Id: I5a8b4237aedd1424c54ed0af448061a236b00295
* use the same DB application name for satellite and metabase
* use noop orders DB implementation to avoid storing allocated bandwidth
in DB
Change-Id: I20e88c694d38240fe1a20c45719e210cfb76402c
Reputation updates during repair currently consumes a lot of database
resources. Sometimes increasing the rate of repair is more important
than auditing a node based on whether they have or don't have the
correct piece during repair. This is the job of the audit service.
This commit is to implement an intermediate solution from this issue: https://github.com/storj/storj/issues/5089
This commit does not address the more in-depth fix discussed here: https://github.com/storj/storj/issues/4939
Change-Id: I4163b18d78a96fadf5265789fd73c8aa8def0e9f
If we are processing list of segments (csv) we should not stop if one of
segments is not found in DB.
Change-Id: I720f85dc7601c2ca77032e20c1577de55092bd9b
Current option is to put stream id and position as an input but
it's not very efficient when we have long list of segments to repair.
This change adds option to read whole csv file and process each entry
one by one.
If command will have single argument then it will treat it as csv file
location and if will have two arguments then it will parse it just as
stream id and position.
Change-Id: I1e91cf57a794d81d74af0091c24a2e7d00d1fab9
Implements logic for satellite command to repair a single segment.
Segment will be repaired even if it's healthy. This is not checked
during this process. As a part of repair command will download whole
segment into memory and will try to reupload segment to number of new
nodes that equal to existing number of pieces. After successful
upload new pieces will completely replace existing pieces.
Command:
satellite repair-segment <streamid> <position>
https://github.com/storj/storj/issues/5254
Change-Id: I8e329718ecf8e457dbee3434e1c68951699007d9
Add nodeevents.DB to satellite overlay service so we can insert node
events into the nodeevents DB.
Change-Id: I642c0ccc9941ecdb08cb22d5c8cf701959a55156
New flag 'MultipleVersions' was not correctly passed from metainfo
configuration to metabase configuration. Because configuration was
set correctly for unit tests we didn't catch it and issue was found
while testing on QA satellite.
This change reduce number of places where new metabase flags needs
to be propagated from metainfo configuration to avoid problems with
setting new flags in the future.
Fixes https://github.com/storj/storj/issues/5274
Change-Id: I74bc122649febefd87f665be2fba628f6bfd9044