Commit Graph

56 Commits

Author SHA1 Message Date
Ivan Fraixedes
f420b29d35
[V3-1927] Repairer uploads to max threshold instead of success… (#2423)
* pkg/datarepair: Add test to check num upload pieces
  Add a new test for ensuring the number of pieces that the repair process
  upload when a segment is injured.
* satellite/orders: Don't create "put order limits" over total
  Repair must not create "put order limits" more than the total count.
* pkg/datarepair: Update upload repair pieces test
  Update the test which checks the number of pieces which are uploaded
  during a repair for using the same excess over the success threshold
  value than the implementation.
* satellites/orders: Limit repair put order for not being total
  Limit the number of put orders to be used by repair for only uploading
  pieces to a % excess over the successful threshold.
* pkg/datarepair: Change DataRepair test to pass again
  Make some changes in the DataRepair test to make pass again after the
  repair upload repaired pieces only until a % excess over success
  threshold.
  Also update the steps description of the DataRepair test after it has been
  changed, to match on what's now, besides to leave it more generic for
  avoiding having to update it on minimal future refactorings.
* satellite: Make repair excess optimal threshold configurable
  Add a new configuration parameter to the satellite for being able to
  configure the percentage excess over the optimal threshold, used for
  determining how many pieces should be repaired/uploaded, rather than
  having the value hard coded.
* repairer: Add configurable param to segments/repairer
  Add a new parameters to the segment/repairer to calculate the maximum
  number of excess nodes, based on the optimal threshold, that repaired
  pieces can be uploaded.
  This new parameter has been added for not returning more nodes than the
  number of upload orders for data repair satellite service calculate for
  repairing pieces.
* pkg/storage/ec: Update log message in clien.Repair
* satellite: Update configuration lock file
2019-07-12 00:44:47 +02:00
Michal Niewrzal
268c629ba8
Replace base64 encoding for path segments (#2345) 2019-07-11 13:26:07 -04:00
Maximillian von Briesen
de85d17069
Add checker metrics (#2487)
checker_segment_total_count - Number of total segments in pointer during checker iteration
checker_segment_healthy_count - Number of healthy segments in pointer during checker iterationn
time_since_checker_queue - Seconds elapsed between checker queue and beginning repair
time_for_repair - Seconds elapsed between beginning repair and ending repair/dequeueing
2019-07-10 17:27:46 -04:00
Maximillian von Briesen
52e5a4eee3 pass logger into repairer and ecclient (#2365) 2019-07-02 13:08:02 +03:00
JT Olio
e58a06bd0c config: update release values to match prod (#2192) 2019-06-15 18:19:19 +02:00
JT Olio
3fe8343b6c repairer: fix config comments (#2105) 2019-06-04 14:13:31 +02:00
JT Olio
9c5708da32 pkg/*: add monkit task to missing places (#2109) 2019-06-04 13:36:27 +02:00
Bill Thorp
6522579ecb better repairer logging (#2006)
* logging and delete only repairs with no errors

* removing delete logi~c
2019-05-21 00:05:28 +02:00
aligeti
60cf1dafb0
repair segment reassess it missing pieces just before repair (#1939)
* repair segment reaccess it missing pieces just before repair to see if it actually needs repair
2019-05-16 09:49:10 -04:00
Bill Thorp
ea978dd674
hopefully sensible satellite defaults (#1888)
* hopefully sensible satellite defaults
2019-05-07 10:44:47 -04:00
Bill Thorp
2c9ef5b107
longer repair window (#1866) 2019-04-30 11:20:18 -04:00
Michal Niewrzal
fe3dfc1587
Move pointerdb.Service to satellite (#1826) 2019-04-25 10:46:32 +02:00
Egon Elbre
f49b0acc5b repair until queue is empty (#1716)
* repair until queue is empty
2019-04-22 11:16:21 -04:00
Bill Thorp
17a227e6e9
refactor injuredsegments db so that we can't have duplicates (#1717)
made repairqueue not use a true queue, forbid duplicates
2019-04-16 14:14:09 -04:00
Maximillian von Briesen
bb3b4e4816 Data repair integration test (#1582) 2019-04-08 13:33:47 -04:00
Egon Elbre
be06fdfd6c Create orders.Service (#1593) 2019-03-28 22:09:23 +02:00
Egon Elbre
94e79eda6d
remove overlay endpoint (#1521) 2019-03-23 10:06:11 +02:00
Kaloyan Raev
d057efb05e
Add Repair method to ECClient (#1509) 2019-03-19 15:14:59 +02:00
Egon Elbre
05d148aeb5
Storage node and upload/download protocol refactor (#1422)
refactor storage node server
refactor upload and download protocol
2019-03-18 12:55:06 +02:00
JT Olio
2a59679766 pkg/transport: require tls configuration for dialing (#1286)
* separate TLS options from server options (because we need them for dialing too)
* stop creating transports in multiple places
* ensure that we actually check revocation, whitelists, certificate signing, etc, for all connections.
2019-02-11 13:17:32 +02:00
Egon Elbre
bb11d83ed0
Proper planet shutdown (#1249) 2019-02-06 15:19:14 +02:00
Egon Elbre
5a63c00442
Fix issues with blocking during startup (#1212) 2019-02-01 19:28:40 +02:00
Egon Elbre
d5346982c2
Delete provider package (#1177) 2019-01-30 22:47:21 +02:00
Jennifer Li Johnson
856b98997c
updates copyright 2018 to 2019 (#1133) 2019-01-24 15:15:10 -05:00
Egon Elbre
5de7f8af7f
Satellite Peer (#1119) 2019-01-23 21:58:44 +02:00
Egon Elbre
78dc02b758 Satellite Peer (#1034)
* add satellite peer

* Add overlay

* reorganize kademlia

* add RunRefresh

* add refresh to storagenode.Peer

* add discovery

* add agreements and metainfo

* rename

* add datarepair checker

* add repair

* add todo notes for audit

* add testing interface

* add into testplanet

* fixes

* fix compilation errors

* fix compilation errors

* make testplanet run

* remove audit refrences

* ensure that audit tests run

* dev

* checker tests compilable

* fix discovery

* fix compilation

* fix

* fix

* dev

* fix

* disable auth

* fixes

* revert go.mod/sum

* fix linter errors

* fix

* fix copyright

* Add address param for SN dashboard (#1076)

* Rename storj-sdk to storj-sim (#1078)

* Storagenode logs and config improvements  (#1075)

* Add more info to SN logs

* remove config-dir from user config

* add output where config was stored

* add message for successful connection

* fix linter

* remove storage.path from user config

* resolve config path

* move success  message to info

* log improvements

* Remove captplanet (#1070)

* pkg/server: include production cert (#1082)

Change-Id: Ie8e6fe78550be83c3bd797db7a1e58d37c684792

* Generate Payments Report (#1079)

* memory.Size: autoformat sizes based on value entropy (#1081)

* Jj/bytes (#1085)

* run tally and rollup

* sets dev default tally and rollup intervals

* nonessential storj-sim edits (#1086)

* Closing context doesn't stop storage node (#1084)

* Print when cancelled

* Close properly

* Don't log nil

* Don't print error when closing dashboard

* Fix panic in inspector if ping fails (#1088)

* Consolidate identity management to identity cli commands (#1083)

* Consolidate identity management:

Move identity cretaion/signing out of storagenode setup command.

* fixes

* linters

* Consolidate identity management:

Move identity cretaion/signing out of storagenode setup command.

* fixes

* sava backups before saving signed certs

* add "-prebuilt-test-cmds" test flag

* linters

* prepare cli tests for travis

* linter fixes

* more fixes

* linter gods

* sp/sdk/sim

* remove ca.difficulty

* remove unused difficulty

* return setup to its rightful place

* wip travis

* Revert "wip travis"

This reverts commit 56834849dcf066d3cc0a4f139033fc3f6d7188ca.

* typo in travis.yaml

* remove tests

* remove more

* make it only create one identity at a time for consistency

* add config-dir for consitency

* add identity creation to storj-sim

* add flags

* simplify

* fix nolint and compile

* prevent overwrite and pass difficulty, concurrency, and parent creds

* goimports
2019-01-18 08:54:08 -05:00
Bill Thorp
342dc857f5 rollup query (#1056)
* implemention notes

* more notes

* starting rollup query

* not working yet

* fixed build

* fixed cfg bug

* change context cancelled errs to debugs

* using byte hours for at rest tally

* revert changes to go.mod

* comment fixes

* prevent double recording tallies in rollup

* linting

* stop leaking dbx

* nodeid changes

* fix build
2019-01-16 14:30:33 -05:00
Alexander Leitner
bfde515391
Clean up Storage node setup (#1013)
* Edit config on Setup

* Default to 1TiB storage space and 500GiB bandwidth

* Use human readable formats

* Use memory

* units of 1024 are measured with KiB/MiB etc

* pkg/cfgstruct: allow values to be configured with human readable sizes

Change-Id: Ic4e9ae461516d1d26fb81f6e44c5ac5cfccf777f

* Modify tests

* Removed comments

* More merge conflict stuff resolved

* Fix lint

* test fixin

Change-Id: I3a008206bf03a4446da19f642a2f9c1f9acaae36

* Remove commented code but secretly leave it in the histroy forever

* Move flag definition to struct
2019-01-14 16:19:15 -05:00
Michal Niewrzal
b712fbcbb0
Fix 'empty queue' error when satellite starts (#939) 2019-01-02 17:00:32 +01:00
Egon Elbre
4346cd060f
Implement mutex around satellitedb (#932) 2018-12-27 11:56:25 +02:00
Cameron
f70b826fd4
repair queue masterDB support (#865)
* add injuredsegment model to satellitedb.dbx

* add context to queue.RepairQueue interface

* use queue.RepairQueue interface, use masterdb
2018-12-21 10:11:19 -05:00
Egon Elbre
d9b9ae6ffa
Cleanup overlay methods and names. (#914) 2018-12-20 15:57:54 +02:00
Natalie Villasana
17c60c1f06
moves node selection config setup from uplink to satellite (#891) 2018-12-17 16:05:05 -05:00
Kaloyan Raev
2a8fef8062
Data repairer should use the redundancy strategy from segment's pointer (#838) 2018-12-13 09:12:36 +02:00
JT Olio
ceb590fa67
capt: reduce nodes to 10 (#793)
Change-Id: Ief380fe29e3043657705cd7505c266fd774181a4
2018-12-11 11:40:54 -07:00
Natalie Villasana
9e1ec97b31
adds node selection config (#782) 2018-12-11 12:30:14 -05:00
Bryan White
2a0c4e60d2
preparing for use of customtype gogo extension with NodeID type (#693)
* preparing for use of `customtype` gogo extension with `NodeID` type

* review changes

* preparing for use of `customtype` gogo extension with `NodeID` type

* review changes

* wip

* tests passing

* wip fixing tests

* more wip test fixing

* remove NodeIDList from proto files

* linter fixes

* linter fixes

* linter/review fixes

* more freaking linter fixes

* omg just kill me - linterrrrrrrr

* travis linter, i will muder you and your family in your sleep

* goimports everything - burn in hell travis

* goimports update

* go mod tidy
2018-11-29 19:39:27 +01:00
Michal Niewrzal
376bd74bed
Stops logging repairer errors for empty queue (#717) 2018-11-27 16:57:51 +01:00
Jennifer Li Johnson
93c5f385a8
Enable checker in captplanet and staging (#643)
* enable checker

* add option to use mock overlay in checker

* adds logs to checker

* appease linter
2018-11-20 10:54:22 -05:00
Jennifer Li Johnson
e678e52229
Creates Accounting Pkg to tally at rest node storage (#568)
* creates accounting package with tally service

* adds cancel on context

* test online nodes
2018-11-08 11:18:28 -05:00
Bill Thorp
07ed38c930
Added distqueue interface and redis and test queue (#555) 2018-11-08 08:53:27 -05:00
Bryan White
ee62e2a9d8
Use transport client and cleanup all the clients (#574)
* wip

* linter fixes

* linter fixes

* test fixes

* linter fixes

* fix merge + restructure piecestore packages

* review feedback

* linter fixes

* linter fixes

* remove unnecessary aliases to piecestore

* more merge fixing
2018-11-06 18:49:17 +01:00
Cameron
de46a999bc
Integrate SegmentStore Repair method with repair service (#582)
* add storeConfig struct and getSegmentStore helper for creating a segment store

* implement segment store in repairer, remove unnecessary repairer Repair method

* change repair method parameter from int to int32 to match type being passed in

* implement repairer service in captplanet

* rework Config, set Config defaults in captplanet/setup
2018-11-06 09:52:11 -05:00
Egon Elbre
2a8b681c4d
Run repairer and checker early (#565)
* Run repairers, checker, auditors first time they run to detect potential setup problems.
* Fix error handling in audit.Service
2018-11-01 16:03:45 +02:00
Dennis Coyle
a3becb8a7b
Add repairer & checker to Satelite (#561)
* Added repairer & checker to Satellite

* fixed repairer and checker configs
2018-10-31 12:22:35 -04:00
Jennifer Li Johnson
7ae2fa3575
moves bulk of code from ticker case to outside for indentation's sake (#559)
* moves bulk of code from ticker case to outside for indentation's sake

* adds whitespace

* removes break
2018-10-30 16:14:15 -04:00
Jennifer Li Johnson
1fb96689b8
creates run loop for data repair checker (#490)
* creates run loop for data repair checker

* moves actual checking and repairing under ticker case

* fixes mismatched queueaddrs
2018-10-30 15:16:40 -04:00
Cameron
f7828e73ea
remove ctx from repairer struct (#535) 2018-10-25 14:59:36 -04:00
Egon Elbre
8efb4f0e89
Fix repairing run (#523)
* Fix repairing run
* Fix concurrency bugs
* Add sync2.Limiter concurrency primitive
2018-10-24 15:35:59 +03:00
Alexander Leitner
3e1b16ea99
Start redis (#470)
* Start miniredis, repairer, and checker with captplanet
2018-10-12 14:04:16 -04:00