Egon Elbre
b6ad3e9c9f
internal/testrand: new package for random data ( #2282 )
2019-06-26 13:38:51 +03:00
Egon Elbre
c28f800098
Skip TestDataRepair and TestUplinksParallel, because they are flaky ( #2337 )
2019-06-25 19:30:39 +03:00
Kaloyan Raev
964c87c476
Fix checks around repair threshold ( #2246 )
2019-06-19 22:13:11 +02:00
JT Olio
e58a06bd0c
config: update release values to match prod ( #2192 )
2019-06-15 18:19:19 +02:00
Kaloyan Raev
ebd9b375fc
Repair should not corrupt files ( #2194 )
2019-06-14 12:16:31 +03:00
ethanadams
8f2dca8437
Re-enabling and fixing repairer tests ( #2099 )
...
* Disabled discovery service by changing from Stop() to Pause()
Paused to solve race condition. If discovery is running, it may mark a node "up" after they've been manually marked "down" in this test.
* Extend to the repair timeout
Fixes intermittent test failures when repairs were taking more than 2 seconds.
* Re-enabled test. Disabled discovery service by changing from Stop() to Pause()
* Changed back to Stop.
* Revert "Changed back to Stop."
This reverts commit 46d410e72dfae63e0c44915be42784cc9a7b5abf.
* re-enabling TestIdentifyInjuredSegments
* Changed Pause to Stop. Commented on timeout change
* testing...
* temporarily skipping audit tests
* changing back to discover Stop for testing via jenkins
* Revert "changing back to discover Stop for testing via jenkins"
This reverts commit 6aa8558b11a0053c30e0c8b2dbf0d6c0cb34ee6c.
* Changing back to Stop(). Depends on PR 2137
* Revert "temporarily skipping audit tests"
This reverts commit 1940ed9b315d663a0eb6c95521780cbcb48cb121.
* Removed reference to Graveyard since it's been removed
2019-06-10 09:06:21 +02:00
JT Olio
43d4f3daf5
discovery: remove graveyard ( #2145 )
2019-06-07 08:40:51 +03:00
JT Olio
f1641af802
storage: add monkit task to missing places ( #2122 )
...
* storage: add monkit task to missing places
Change-Id: I9e17a6b14f7c25bbf698eeecf32785e9add3f26e
* fix tests
Change-Id: Id078276fa3de61a28eb3d01d4e751732ecbb173f
* import order
Change-Id: I814e33755b9f10b5219af37cd828cd75eb3da1a4
* remove part of other commit
Change-Id: Idaa4c95cd65e97567fb466de49718db8203cfbe1
2019-06-05 16:23:10 +02:00
JT Olio
3fe8343b6c
repairer: fix config comments ( #2105 )
2019-06-04 14:13:31 +02:00
JT Olio
9c5708da32
pkg/*: add monkit task to missing places ( #2109 )
2019-06-04 13:36:27 +02:00
aligeti
4ad5120923
Checker service refactor (v3-1871) ( #2082 )
...
* refactor the checker service
* monkit update
2019-05-31 10:12:49 -04:00
aligeti
934ebf9cbf
Added the irreparable repair functionality ( #1955 )
...
* Added the irreparable repair functionality
2019-05-30 11:18:20 -04:00
Maximillian von Briesen
da91d22376
properly check last iteration of checker ( #2040 )
2019-05-23 18:14:08 +02:00
Maximillian von Briesen
b4f18226db
Send number of files as part of durability stats ( #2030 )
2019-05-22 18:50:43 -04:00
Maximillian von Briesen
45a2253628
Send durability stats after iterating over all segments ( #2028 )
2019-05-22 17:17:52 -04:00
Bill Thorp
6522579ecb
better repairer logging ( #2006 )
...
* logging and delete only repairs with no errors
* removing delete logic
2019-05-21 00:05:28 +02:00
Bill Thorp
91721f63ba
Bt/repair no nodes ( #1974 )
...
* handle cases where repair is equal to total
2019-05-17 15:02:40 -04:00
aligeti
60cf1dafb0
repair segment reassess its missing pieces just before repair ( #1939 )
...
* repair segment reassess its missing pieces just before repair to see if it actually needs repair
2019-05-16 09:49:10 -04:00
Bill Thorp
4002ed4463
unskip TestIdentifyIrreparableSegments ( #1927 )
2019-05-09 15:55:34 +03:00
Natalie Villasana
b48f584cea
repair checker resumes iterating where left off ( #1879 )
2019-05-08 13:59:50 -04:00
Bill Thorp
ea978dd674
hopefully sensible satellite defaults ( #1888 )
...
* hopefully sensible satellite defaults
2019-05-07 10:44:47 -04:00
Bill Thorp
6ece4f11ad
moved invalid/offline back into SQL ( #1838 )
...
* moved invalid/offline back into SQL, removed GetAll()
2019-05-01 09:45:52 -04:00
Bill Thorp
2c9ef5b107
longer repair window ( #1866 )
2019-04-30 11:20:18 -04:00
Egon Elbre
db939d37ec
cover all the things ( #1818 )
2019-04-26 16:39:11 +03:00
Michal Niewrzal
fe3dfc1587
Move pointerdb.Service to satellite ( #1826 )
2019-04-25 10:46:32 +02:00
Egon Elbre
c284cfde30
ensure TestParallel doesn't deadlock on error ( #1808 )
2019-04-24 13:15:46 +03:00
Bill Thorp
cd4a3e06d8
wired up IsHealthy to config ( #1820 )
...
* wired up IsHealthy to config
2019-04-23 18:45:50 -04:00
Fadila
8ddf481b33
Checker: invalid and offline nodes search update ( #1812 )
...
* simplified invalid and offline logic into getMissingPieces
2019-04-23 16:54:39 -04:00
Egon Elbre
bdd0d778eb
interface tests belong to the interface, not the implementation ( #1794 )
2019-04-23 11:47:16 +03:00
Egon Elbre
f49b0acc5b
repair until queue is empty ( #1716 )
...
* repair until queue is empty
2019-04-22 11:16:21 -04:00
Natalie Villasana
8d1f614662
removes unused queue code, moves queue_test.go to repairqueue_test.go in satellitedb dir ( #1783 )
2019-04-22 13:35:52 +03:00
Bill Thorp
9dc4e82437
removed commented code, removed unnecessary pointer ( #1766 )
...
* removed commented code
2019-04-16 15:55:28 -04:00
Bill Thorp
17a227e6e9
refactor injuredsegments db so that we can't have duplicates ( #1717 )
...
made repairqueue not use a true queue, forbid duplicates
2019-04-16 14:14:09 -04:00
JT Olio
ffdb2e7728
actually skip the data repair test ( #1728 )
...
Change-Id: I76286fc6cc5129d8be50d45a684a3e0dce9c0cc6
2019-04-09 23:29:05 -06:00
JT Olio
61ec92f2e8
disable datarepair test for now ( #1727 )
...
Change-Id: I1854817ea051ab621936f587b198de2da07c9960
2019-04-09 22:31:55 -06:00
Maximillian von Briesen
3fb4813227
Fix data repair checker missing pieces list ( #1705 )
2019-04-08 15:46:23 -04:00
Maximillian von Briesen
bb3b4e4816
Data repair integration test ( #1582 )
2019-04-08 13:33:47 -04:00
littleskunk
43ef0eb4c3
Don't crash on audit and repair failures ( #1622 )
...
* Fix satellite crash on repair
(cherry picked from commit cabf6c9f97780f900d76e2388ffa54b916f14528)
* Fix satellite crash on audit
(cherry picked from commit 9da67488c4b36a378f346fbb27651316284b0f36)
2019-04-01 11:16:17 +02:00
Egon Elbre
be06fdfd6c
Create orders.Service ( #1593 )
2019-03-28 22:09:23 +02:00
Kaloyan Raev
d1639c4157
Merge statdb pkg into overlay pkg ( #1570 )
2019-03-25 18:25:09 -04:00
Egon Elbre
94e79eda6d
remove overlay endpoint ( #1521 )
2019-03-23 10:06:11 +02:00
Kaloyan Raev
d057efb05e
Add Repair method to ECClient ( #1509 )
2019-03-19 15:14:59 +02:00
Egon Elbre
05d148aeb5
Storage node and upload/download protocol refactor ( #1422 )
...
refactor storage node server
refactor upload and download protocol
2019-03-18 12:55:06 +02:00
Cameron
c7ffbe1c28
Add ability to view irreparable segments on satellite ( #1448 )
...
* define irreparable inspector protobuf
* add IrreparableDB method GetLimited
* fill out irreparable inspector API
* add IrreparableInspector server to satellite, fix small error
* refactor IrreparableDB to use pb.IrreparableSegment instead of irreparable.RemoteSegmentInfo
2019-03-15 16:21:52 -04:00
Bill Thorp
665fd33e3c
Repair queue isolation level fix ( #1466 )
...
Implemented custom SQLite and Postgres Repairqueue Dequeue handlers
2019-03-14 17:12:47 -04:00
aligeti
c6ad7644d2
Total file count through Monkit ( #1351 )
...
* segment, file, byte stats, total and per-bucket; checker: report segment health stats; reports the total num of lost files
* code review updates
2019-02-26 10:17:51 -05:00
Bill Thorp
9b580c5fb6
Repair checker is checking the same 1000 elements all the time ( #1297 )
...
* removed limit on repair, now using cycle
* added BatchIteratorOptions
* consolidated boltdb common.go
* PR feedback cleanup
2019-02-14 13:33:41 +01:00
Michal Niewrzal
b2f9453184
Disable Checker subsystem in tests ( #1279 )
...
* Disable Checker subsystem in tests
* rename field
* remove sleeps and errgroup.Group
2019-02-11 22:06:39 +01:00
JT Olio
2a59679766
pkg/transport: require tls configuration for dialing ( #1286 )
...
* separate TLS options from server options (because we need them for dialing too)
* stop creating transports in multiple places
* ensure that we actually check revocation, whitelists, certificate signing, etc, for all connections.
2019-02-11 13:17:32 +02:00
Egon Elbre
bb11d83ed0
Proper planet shutdown ( #1249 )
2019-02-06 15:19:14 +02:00