Commit Graph

118 Commits

Author SHA1 Message Date
Ivan Fraixedes
b6ae6a4e2c
pkg/datarepair/checker: Fix doc comments (#2401) 2019-07-01 18:13:18 +02:00
Natalie Villasana
3ffb42483f
account for disqualified nodes in data repair tests (#2315) 2019-07-01 11:34:42 -04:00
Natalie Villasana
3f643551e7 remove flakiness in TestDataRepair and TestSegmentStoreRepair (#2335)
* stop audit loop in repair tests to prevent possible timeout
2019-07-01 11:15:45 -04:00
Egon Elbre
b6ad3e9c9f
internal/testrand: new package for random data (#2282) 2019-06-26 13:38:51 +03:00
Egon Elbre
c28f800098
Skip TestDataRepair and TestUplinksParallel, because they are flaky (#2337) 2019-06-25 19:30:39 +03:00
Kaloyan Raev
964c87c476 Fix checks around repair threshold (#2246) 2019-06-19 22:13:11 +02:00
JT Olio
e58a06bd0c config: update release values to match prod (#2192) 2019-06-15 18:19:19 +02:00
Kaloyan Raev
ebd9b375fc
Repair should not corrupt files (#2194) 2019-06-14 12:16:31 +03:00
ethanadams
8f2dca8437 Re-enabling and fixing repairer tests (#2099)
* Disabled discovery service by changiing from Stop() to Pause()

Paused to solve race condition.  If discovery is running, it may mark a node "up" after they've been manually marked "down" in this test.

* Extend to the repair timeout

Fixes intermittent test failures when repairs were taking more than 2 seconds.

* Re-enabled test. Disabled discovery service by changiing from Stop() to Pause()

* Changed back to Stop.

* Revert "Changed back to Stop."

This reverts commit 46d410e72dfae63e0c44915be42784cc9a7b5abf.

* re-enabling TestIdentifyInjuredSegments

* Changed Pause to Stop.  Commented on timeout change

* testing...

* temporarily skipping audit tests

* changing back to discover Stop for testing via jenkins

* Revert "changing back to discover Stop for testing via jenkins"

This reverts commit 6aa8558b11a0053c30e0c8b2dbf0d6c0cb34ee6c.

* Changing back to Stop().  Depends on PR 2137

* Revert "temporarily skipping audit tests"

This reverts commit 1940ed9b315d663a0eb6c95521780cbcb48cb121.

* Removed reference to Graveyard since its been removed
2019-06-10 09:06:21 +02:00
JT Olio
43d4f3daf5 discovery: remove graveyard (#2145) 2019-06-07 08:40:51 +03:00
JT Olio
f1641af802 storage: add monkit task to missing places (#2122)
* storage: add monkit task to missing places

Change-Id: I9e17a6b14f7c25bbf698eeecf32785e9add3f26e

* fix tests

Change-Id: Id078276fa3de61a28eb3d01d4e751732ecbb173f

* import order

Change-Id: I814e33755b9f10b5219af37cd828cd75eb3da1a4

* remove part of other commit

Change-Id: Idaa4c95cd65e97567fb466de49718db8203cfbe1
2019-06-05 16:23:10 +02:00
JT Olio
3fe8343b6c repairer: fix config comments (#2105) 2019-06-04 14:13:31 +02:00
JT Olio
9c5708da32 pkg/*: add monkit task to missing places (#2109) 2019-06-04 13:36:27 +02:00
aligeti
4ad5120923
Checker service refactor (v3-1871) (#2082)
*  refactor the checker service

* monkit update
2019-05-31 10:12:49 -04:00
aligeti
934ebf9cbf
Added the irreparable repair functionality (#1955)
* Added the irreparable repair functionality
2019-05-30 11:18:20 -04:00
Maximillian von Briesen
da91d22376 properly check last iteration of checker (#2040) 2019-05-23 18:14:08 +02:00
Maximillian von Briesen
b4f18226db
Send number of files as part of durability stats (#2030) 2019-05-22 18:50:43 -04:00
Maximillian von Briesen
45a2253628 Send durability stats after iterating over all segments (#2028) 2019-05-22 17:17:52 -04:00
Bill Thorp
6522579ecb better repairer logging (#2006)
* logging and delete only repairs with no errors

* removing delete logi~c
2019-05-21 00:05:28 +02:00
Bill Thorp
91721f63ba
Bt/repair no nodes (#1974)
* handle cases where repair is equal to total
2019-05-17 15:02:40 -04:00
aligeti
60cf1dafb0
repair segment reassess it missing pieces just before repair (#1939)
* repair segment reaccess it missing pieces just before repair to see if it actually needs repair
2019-05-16 09:49:10 -04:00
Bill Thorp
4002ed4463 unskip TestIdentifyIrreparableSegments (#1927) 2019-05-09 15:55:34 +03:00
Natalie Villasana
b48f584cea
repair checker resumes iterating where left off (#1879) 2019-05-08 13:59:50 -04:00
Bill Thorp
ea978dd674
hopefully sensible satellite defaults (#1888)
* hopefully sensible satellite defaults
2019-05-07 10:44:47 -04:00
Bill Thorp
6ece4f11ad
moved invalid/offline back into SQL (#1838)
* moved invalid/offline back into SQL, removed GetAll()
2019-05-01 09:45:52 -04:00
Bill Thorp
2c9ef5b107
longer repair window (#1866) 2019-04-30 11:20:18 -04:00
Egon Elbre
db939d37ec
cover all the things (#1818) 2019-04-26 16:39:11 +03:00
Michal Niewrzal
fe3dfc1587
Move pointerdb.Service to satellite (#1826) 2019-04-25 10:46:32 +02:00
Egon Elbre
c284cfde30
ensure TestParallel doesn't deadlock on error (#1808) 2019-04-24 13:15:46 +03:00
Bill Thorp
cd4a3e06d8
wired up IsHealthy to config (#1820)
* wired up IsHealthy to config
2019-04-23 18:45:50 -04:00
Fadila
8ddf481b33 Checker: invalid and offline nodes search update (#1812)
* simplified invalid and offline login into getMissingPieces
2019-04-23 16:54:39 -04:00
Egon Elbre
bdd0d778eb
interface tests belong to the interface, not the implementation (#1794) 2019-04-23 11:47:16 +03:00
Egon Elbre
f49b0acc5b repair until queue is empty (#1716)
* repair until queue is empty
2019-04-22 11:16:21 -04:00
Natalie Villasana
8d1f614662 removes unused queue code, moves queue_test.go to repairqueue_test.go in satellitedb dir (#1783) 2019-04-22 13:35:52 +03:00
Bill Thorp
9dc4e82437
removed commented code, removed unnecessary pointer (#1766)
* removed commented code
2019-04-16 15:55:28 -04:00
Bill Thorp
17a227e6e9
refactor injuredsegments db so that we can't have duplicates (#1717)
made repairqueue not use a true queue, forbid duplicates
2019-04-16 14:14:09 -04:00
JT Olio
ffdb2e7728
actually skip the data repair test (#1728)
Change-Id: I76286fc6cc5129d8be50d45a684a3e0dce9c0cc6
2019-04-09 23:29:05 -06:00
JT Olio
61ec92f2e8
disable datarepair test for now (#1727)
Change-Id: I1854817ea051ab621936f587b198de2da07c9960
2019-04-09 22:31:55 -06:00
Maximillian von Briesen
3fb4813227
Fix data repair checker missing pieces list (#1705) 2019-04-08 15:46:23 -04:00
Maximillian von Briesen
bb3b4e4816 Data repair integration test (#1582) 2019-04-08 13:33:47 -04:00
littleskunk
43ef0eb4c3
Don't crash on audit and repair failures (#1622)
* Fix satellite crash on repair

(cherry picked from commit cabf6c9f97780f900d76e2388ffa54b916f14528)

* Fix satellite crash on audit

(cherry picked from commit 9da67488c4b36a378f346fbb27651316284b0f36)
2019-04-01 11:16:17 +02:00
Egon Elbre
be06fdfd6c Create orders.Service (#1593) 2019-03-28 22:09:23 +02:00
Kaloyan Raev
d1639c4157 Merge statdb pkg into overlay pkg (#1570) 2019-03-25 18:25:09 -04:00
Egon Elbre
94e79eda6d
remove overlay endpoint (#1521) 2019-03-23 10:06:11 +02:00
Kaloyan Raev
d057efb05e
Add Repair method to ECClient (#1509) 2019-03-19 15:14:59 +02:00
Egon Elbre
05d148aeb5
Storage node and upload/download protocol refactor (#1422)
refactor storage node server
refactor upload and download protocol
2019-03-18 12:55:06 +02:00
Cameron
c7ffbe1c28
Add ability to view irreparable segments on satellite (#1448)
* define irreparable inspector protobuf

* add IrreparableDB method GetLimited

* fill out irreparable inspector API

* add IrreparableInspector server to satellite, fix small error

* refactor IrreparableDB to use pb.IrreparableSegment instead of irreparable.RemoteSegmentInfo
2019-03-15 16:21:52 -04:00
Bill Thorp
665fd33e3c
Repair queue isolation level fix (#1466)
Implemented custom SQLite and Postgres Repairqueue Dequeue handlers
2019-03-14 17:12:47 -04:00
aligeti
c6ad7644d2
Total file count through Monkit (#1351)
* segment, file, byte stats, total and per-bucket; checker: report segment health stats; reports the total num of lost files

* code review updates
2019-02-26 10:17:51 -05:00
Bill Thorp
9b580c5fb6 Repair checker is checking the same 1000 elements all the time (#1297)
* removed limit on repair, now using cycle

* added BatchIteratorOptions

* consolidated boltdb common.go

* PR feedback cleanup
2019-02-14 13:33:41 +01:00