storj/internal
Egon Elbre 62e3bf5b34 storagenode/retain: fix concurrency issues (#2828)
* nicer flags

* fix concurrency

* add concurrent workers

* initialize things

* fix tests

* close retain service

* ensure we don't have workers working on the same satellite

* ensure things compile

* fix other compilation issues

* concurrency changes

ran this with `go test -count=1000` and it passed all of them.

- we add a closed channel so that we can select on it with
  context cancellation.
- we put a once in so we only close the channel once.
- every time the queue/running state changes, we have to broadcast
  because we may want to wake up N pending Wait calls or other
  concurrent workers.
- because we broadcast, we don't need to do the polling in Wait
  anymore.
- ensure Run doesn't start multiple times so that we don't have
  to worry about concurrent Close with multiple Runs.
- hold the lock while we start workers so that a concurrent Close
  with Run can't decide that there's nothing started and exit
  and then have Run start things.
- make sure to poll the closed/context channels through loops
  or at the start of Run calls in case Close happens first.
- these polls should be under a mutex because they have a default
  case: without the lock, a poll can be scheduled before Close has
  executed the channel close, and Run then starts more work.
- cancel a local Run context when it's going to exit to make sure
  that any retainPieces calls have a canceled context.
- hopefully enough comments to both check my work and help readers
  digest what's going on.

Change-Id: Ida0e226a7e01e8ae64fa2c59dd5a84b04bccfbd7

* use the retain error class

Change-Id: I1511eaef135f98afd57b878e997e4c8a0d11cafc

* concurrency fixes again

- forgot to update the gc test to use the old Wait API.
- we need to drop the lock while we wait for the workers
  to exit, because they may be blocked on the condition
  variable.
- additionally, we need to broadcast when we close the
  signal channel because the state changed: they want
  to wake up and exit.
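
A rough sketch of the shutdown ordering described above, under assumptions: `pool`, `newPool` and `worker` are hypothetical names, and a plain `closed` flag stands in for the signal channel to keep the example short. The point is that Close broadcasts the state change and releases the lock before waiting, since workers need the lock to return from `cond.Wait`.

```go
package main

import (
	"fmt"
	"sync"
)

// pool is a hypothetical stand-in for the service and its workers.
type pool struct {
	mu      sync.Mutex
	cond    *sync.Cond
	closed  bool
	workers sync.WaitGroup
}

func newPool(n int) *pool {
	p := &pool{}
	p.cond = sync.NewCond(&p.mu)
	// Hold the lock while starting workers so a concurrent Close
	// cannot observe a half-started pool.
	p.mu.Lock()
	defer p.mu.Unlock()
	for i := 0; i < n; i++ {
		p.workers.Add(1)
		go p.worker()
	}
	return p
}

// worker blocks on the condition variable until the pool is closed.
func (p *pool) worker() {
	defer p.workers.Done()
	p.mu.Lock()
	defer p.mu.Unlock()
	for !p.closed {
		p.cond.Wait()
	}
}

// Close marks the pool closed, wakes the workers, and only then waits.
// Waiting while still holding p.mu would deadlock: a worker woken from
// cond.Wait must reacquire p.mu before it can exit.
func (p *pool) Close() {
	p.mu.Lock()
	p.closed = true
	p.cond.Broadcast() // state changed: workers should wake up and exit
	p.mu.Unlock()
	p.workers.Wait()
}

func main() {
	p := newPool(3)
	p.Close()
	fmt.Println("all workers exited")
}
```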

Change-Id: I4204699792275260cd912f29aa73720f7d9b14b5

* undo my misguided rename

Change-Id: I6baffe1eb0434e260212c485bbcc01bed3250881

* remove pollInterval

* format paragraph more nicely

* move skew calculation into retain pieces
2019-08-28 16:35:25 -04:00
cui better batch-generation (#1219) 2019-02-06 09:04:12 +01:00
currency satellite/rewards: use USD type (#2384) 2019-07-01 15:16:49 -04:00
date storagenode/nodestats: cache node stats (#2543) 2019-08-08 16:47:04 +03:00
dbutil storagenode/nodestats: cache node stats (#2543) 2019-08-08 16:47:04 +03:00
debugging Identity versioning (#1389) 2019-04-08 20:15:19 +02:00
errs2 libuplink upload/download err handling improvements (#2725) 2019-08-07 16:28:27 +02:00
fpath Fix storing struct in context (#2126) 2019-06-05 16:02:29 +02:00
memory Don't convert to float to avoid losing precision. (#2092) 2019-06-03 18:37:40 +02:00
migrate fix import ordering (#2322) 2019-06-25 12:46:29 +03:00
post lint: add linting for errs package (#2881) 2019-08-27 19:07:12 +03:00
processgroup updates copyright 2018 to 2019 (#1133) 2019-01-24 15:15:10 -05:00
readcloser remove utils.CombineErrors and utils.ErrorGroup (#1603) 2019-03-29 14:30:23 +02:00
s3client aws s3 performance tests (#2060) 2019-05-28 11:46:58 -07:00
sync2 internal/sync2: Fix typo in doc comment (#2685) 2019-08-01 17:10:27 +02:00
testblobs move piece info into files (#2629) 2019-08-07 20:47:30 -05:00
testcontext Re-implement libstorj API (V2) using libuplink (V3) (#2573) 2019-07-30 13:40:05 +02:00
testidentity lib/uplink: remove redis and bolt dependencies (#2812) 2019-08-19 16:10:38 -06:00
testpeertls Enable Scopelint Linter (#2049) 2019-05-29 09:30:16 -04:00
testplanet storagenode/retain: fix concurrency issues (#2828) 2019-08-28 16:35:25 -04:00
testrand storagenode/nodestats: cache node stats (#2543) 2019-08-08 16:47:04 +03:00
testrevocation pkg/revocation: ensure we close revocation databases (#2825) 2019-08-20 18:04:17 +03:00
teststorj Refactor pb.Node protobuf (#1785) 2019-04-22 12:07:50 +03:00
version don't use global loggers (#2675) 2019-07-31 17:38:44 +03:00