Commit Graph

26 Commits

Author SHA1 Message Date
Egon Elbre
7ce372c686 satellite/internalpb: add inspectors
Change-Id: Ib688e43d05135c0c31ae95df533f1e4535ea396a
2020-10-30 13:28:17 +02:00
Egon Elbre
004e610d0f satellite/internalpb: move datarepair.pb to internal
Change-Id: If901d9ff4e5ee6715b963eeeb46513a602a44b3d
2020-10-30 13:28:14 +02:00
Egon Elbre
0bdb952269 all: use keyed special comment
Change-Id: I57f6af053382c638026b64c5ff77b169bd3c6c8b
2020-10-13 15:13:41 +03:00
Egon Elbre
080ba47a06 all: fix dots
Change-Id: I6a419c62700c568254ff67ae5b73efed2fc98aa2
2020-07-16 14:58:28 +00:00
littleskunk
048ca4558f
satellite/repair: clean up logging (#3833)
Co-authored-by: Michal Niewrzal <michal@storj.io>
2020-03-30 11:59:56 +02:00
Moby von Briesen
a933bcc99a satellite/repair/repairer/ec.go: add option for downloading pieces onto disk instead of in memory during repair
Add flag to satellite repairer, "InMemoryRepair" that allows the
satellite to decide whether to download the entire segment being
repaired into memory (this is what the satellite already does), or to
download it into temporary files on disk that will be read from in the
upload phase of repair.

This should help with handling high repair traffic on satellites that
cannot afford to spend 64mb of memory per repair worker.

Updates tests to test repair for both in memory and to disk.

Change-Id: Iddf591e165621497c98533d45bfea3c28b08a194
2020-03-27 16:41:00 +00:00
paul cannon
79553059cb satellite/repair: put irreparable segments in irreparableDB
Previously, we were simply discarding rows from the repair queue when
they couldn't be repaired (either because the overlay said too many
nodes were down, or because we failed to download enough pieces).

Now, such segments will be put into the irreparableDB for further
and (hopefully) more focused attention.

This change also better differentiates some error cases from Repair()
for monitoring purposes.

Change-Id: I82a52a6da50c948ddd651048e2a39cb4b1e6df5c
2020-03-09 21:45:16 +00:00
paul cannon
92d86fa044 satellite/repair: fix repair concurrency
This new repair timeout (configured as TotalTimeout) will include both
the time to download pieces and the time to upload pieces, as well as
the time to pop the segment from the repair queue.

This is a move from Github PR #3645.

Change-Id: I47d618f57285845d8473fcd285f7d9be9b4318c8
2020-02-24 19:57:09 +00:00
Jeff Wendling
7999d24f81 all: use monkit v3
this commit updates our monkit dependency to the v3 version where
it outputs in an influx style. this makes discovery much easier
as many tools are built to look at it this way.

graphite and rothko will suffer some due to no longer being a tree
based on dots. hopefully time will exist to update rothko to
index based on the new metric format.

it adds an influx output for the statreceiver so that we can
write to influxdb v1 or v2 directly.

Change-Id: Iae9f9494a6d29cfbd1f932a5e71a891b490415ff
2020-02-05 23:53:17 +00:00
Moby von Briesen
006a2824ba satellite/repair: lock monkit stats in checker and repairer
Change-Id: Ia10fc8da0177389a500359ce51d21a5806f3f7b1
2020-01-30 14:09:56 +00:00
Egon Elbre
8dea4f52db satellite: add control panel
Change-Id: Id48246e9bcd4c6ec643277fe740937b2e42ad85b
2020-01-30 08:06:43 -05:00
Egon Elbre
6615ecc9b6 common: separate repository
Change-Id: Ibb89c42060450e3839481a7e495bbe3ad940610a
2019-12-27 14:11:15 +02:00
littleskunk
71b58edb2c satellite/repair: decrease repair interval
Change-Id: Id9efdbfaa82521c35dc41e7a52b700522c197e77
2019-12-10 00:36:00 +00:00
littleskunk
c52c7275ad
satellite/repair: reduce upload timeout (#3597) 2019-11-18 18:52:56 +01:00
Egon Elbre
ee6c1cac8a
private: rename internal to private (#3573) 2019-11-14 21:46:15 +02:00
Yingrong Zhao
bfa6699e2c
satellite/repair: add timeout for repair download from a single node(#3418) 2019-10-30 16:31:08 -04:00
littleskunk
2a5526fcc4
satellite/repair: reduce timeout (#3302) 2019-10-18 13:43:24 +02:00
littleskunk
6e7607239c
satellite/repair: improve logging (#3287)
* satellite/repair: improve logging

* use Stringer wherever possible
2019-10-16 17:28:56 +02:00
Maximillian von Briesen
289cfe8ff2 satellite/repair: do not log "retrieved segment" if repair queue empty (#2995) 2019-09-11 16:06:36 +03:00
Egon Elbre
a801fab66a
all: add archview annotations (#2964) 2019-09-10 16:24:16 +03:00
Maximillian von Briesen
fb10815229 Repair with hashes (#2925)
* add outline for ECRepairer

* add description of process in TODO comments

* begin download/getting hash for a single piece

* verify piece hash and order limit during download

* fix download piece

* begin filling out ESREpair. Get

* wip move ecclient.Repair to ecrepairer.Repair

* pass satellite signee into repairer

* reconstruct original stripe from pieces

* move rebuildStripe()

* calculate piece size differently, increment successful count

* fix shares slices initialization

* rename stripeData to segment

* do not pad reader in Repair()

* temp debug

* create unsafeRSScheme

* use decode reader

* rename file name to be all lowercase

* make repair downloader async

* declare condition variable inside Get method

* set downloadAndVerifyPiece's in-memory buffer to be share size

* update unusedLimits var

* address comments

* remove unnecessary comments

* move initialization of segmentRepaire to be outside of repairer service

* use ReadAll during download

* remove dots and move hashing to after validating for order limit signature

* wip test

* make sure files exactly at min threshold are repaired

* remove unused code

* use corrput data and write back to storagenode

* only create corrupted node and piece ids once

* add comment

* address nat's comment

* fix linting and checker_test

* update comment

* add comments

* remove "copied from ecclient" comments

* add clarification comments in ec.Repair
2019-09-06 15:20:36 -04:00
Egon Elbre
c8edeb0257
satellite/overlay: rename overlay.Cache to overlay.Service (#2717) 2019-08-06 19:35:59 +03:00
Bill Thorp
fcbc9d71da
satellite/repair: add shouldDelete (#2702)
* add shouldDelete to repair
2019-08-05 11:09:16 -04:00
Alexander Leitner
4632ab0a67
Delete irreparable segments (#2642)
* Delete irreparable segments
2019-07-30 11:38:25 -04:00
Egon Elbre
e75813d094 satellite/repair: move segment repairer to satellite and simplify (#2651) 2019-07-29 13:24:56 +02:00
Egon Elbre
5d0816430f
rename all the things (#2531)
* rename pkg/linksharing to linksharing
* rename pkg/httpserver to linksharing/httpserver
* rename pkg/eestream to uplink/eestream
* rename pkg/stream to uplink/stream
* rename pkg/metainfo/kvmetainfo to uplink/metainfo/kvmetainfo
* rename pkg/auth/signing to pkg/signing
* rename pkg/storage to uplink/storage
* rename pkg/accounting to satellite/accounting
* rename pkg/audit to satellite/audit
* rename pkg/certdb to satellite/certdb
* rename pkg/discovery to satellite/discovery
* rename pkg/overlay to satellite/overlay
* rename pkg/datarepair to satellite/repair
2019-07-28 08:55:36 +03:00