storj/satellite/satellitedb
Jessica Grebenschikov 0649d2b930 satellite/repair: improve contention for injuredsegments table on CRDB
We migrated satelliteDB off of Postgres and over to CockroachDB (crdb), but there was way too much contention on the injuredsegments table, so we had to roll back to Postgres for the repair queue. A couple of things contributed to this problem:
1) crdb doesn't support `FOR UPDATE SKIP LOCKED`
2) the original crdb Select query was doing 2 full table scans and not using any indexes
3) the SLC Satellite (where we were doing the migration) was running 48 repair worker processes, each of which runs up to 5 goroutines that are all trying to select out of the repair queue, and this was causing a ton of contention.

The changes in this PR should help to reduce that contention and improve performance on CRDB.
The changes include:
1) Use an update/set query instead of select/update to capitalize on the new `UPDATE` implicit row locking ability in CRDB.
- Details: As of CRDB v20.2.2, there is implicit row locking with update/set queries (contention reduction and performance gains are described in this blog post: https://www.cockroachlabs.com/blog/when-and-why-to-use-select-for-update-in-cockroachdb/).

2) Remove the `ORDER BY` clause since this was causing a full table scan and also prevented the use of the row locking capability.
- While long term it is very important to `ORDER BY segment_health`, the change here is only supposed to be a temporary band-aid to get us migrated over to CRDB quickly. Since segment_health has been set to infinity for some time now (re: https://review.dev.storj.io/c/storj/storj/+/3224), it seems like it might be ok to continue not making use of this for the short term. However, long term this needs to be fixed with a redesign of the repair workers, possibly in the trusted delegated repair design (https://review.dev.storj.io/c/storj/storj/+/2602), or with something similar to what is recommended here on how to implement a queue on CRDB (https://dev.to/ajwerner/quick-and-easy-exactly-once-distributed-work-queues-using-serializable-transactions-jdp), or by migrating to a RabbitMQ priority queue or something similar.

This PR's improved query uses the index to avoid full scans and also locks the row it's going to update, and CRDB retries for us if there are any lock errors.
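
For illustration only, the single-statement claim described above might look roughly like the sketch below. It is not the query from this change: the `attempted` and `data` columns, the 6-hour retry window, and the names `claimQuery`, `claimSegment` and `errQueueEmpty` are all assumptions made up for the example. Only the `injuredsegments` table name and the general shape (an `UPDATE` with no `ORDER BY` that returns the claimed row) come from the description.

```go
package main

import (
	"context"
	"database/sql"
	"errors"
)

// claimQuery marks one eligible row as attempted and returns its payload in a
// single statement, so there is no separate SELECT ... FOR UPDATE step.
// Column names and the retry interval are hypothetical.
const claimQuery = `UPDATE injuredsegments
SET attempted = now()
WHERE attempted IS NULL OR attempted < now() - interval '6 hours'
LIMIT 1
RETURNING data`

// errQueueEmpty is a hypothetical sentinel for "nothing to repair right now".
var errQueueEmpty = errors.New("repair queue: no eligible segment")

// claimSegment runs the claim statement and returns the claimed row's data.
// With no ORDER BY, which eligible row gets picked is unspecified.
func claimSegment(ctx context.Context, db *sql.DB) ([]byte, error) {
	var data []byte
	err := db.QueryRowContext(ctx, claimQuery).Scan(&data)
	if errors.Is(err, sql.ErrNoRows) {
		// The UPDATE matched no rows, so RETURNING produced nothing.
		return nil, errQueueEmpty
	}
	if err != nil {
		return nil, err
	}
	return data, nil
}
```

A repair worker goroutine would call `claimSegment` in its loop and back off when it gets `errQueueEmpty`. Because the claim is one statement, each worker makes a single round trip per claimed segment.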

Change-Id: Id29faad2186627872fbeb0f31536c4f55f860f23
2020-12-10 09:51:26 -08:00
dbx satellite/satellitedb: add ListAllBuckets method 2020-12-10 14:19:27 +00:00
satellitedbtest Allow for DB application names per process. (#3983) 2020-12-04 11:24:39 +01:00
testdata satellite/accounting: account for old orders that can be submitted in satellite rollup 2020-11-18 14:46:00 -05:00
apikeys.go all: fix linter complaints 2020-10-13 15:59:01 +03:00
attribution_test.go all: remove old uuid 2020-04-02 19:30:36 +03:00
attribution.go all: fix dots 2020-07-16 14:58:28 +00:00
audithistory_test.go satellite/{overlay,satellitedb}: always show node's real online score 2020-10-02 12:28:11 -04:00
audithistory.go satellite/internalpb: move audithistory.pb 2020-10-30 15:30:11 +02:00
buckets_test.go satellite/satellitedb: add ListAllBuckets method 2020-12-10 14:19:27 +00:00
buckets.go satellite/satellitedb: add ListAllBuckets method 2020-12-10 14:19:27 +00:00
coinpaymentstxs_test.go satellite/satellitedb: Coinpayments repeat insert bug fix 2020-07-20 20:21:35 +00:00
coinpaymentstxs.go satellite/satellitedb: Coinpayments repeat insert bug fix 2020-07-20 20:21:35 +00:00
compensation.go all: fix dots 2020-07-16 14:58:28 +00:00
consoledb_test.go satellite/satellitedb/satellitedbtest: pass ctx as an argument 2020-01-20 16:35:42 +02:00
consoledb.go satellite/payments: fix promotional coupons 2020-01-29 16:40:43 +02:00
containment.go all: fix dots 2020-07-16 14:58:28 +00:00
coupons.go all: fix dots 2020-07-16 14:58:28 +00:00
customers.go all: replace == comparison with errors.Is 2020-07-14 15:50:25 +00:00
database.go Allow for DB application names per process. (#3983) 2020-12-04 11:24:39 +01:00
gracefulexit.go go.mod: update pgx to v4.9.0 2020-09-29 19:03:08 +00:00
invoiceprojectrecords.go satellite/payments: delete credits and credits_spendings db tables 2020-07-30 12:19:57 +03:00
irreparabledb.go satellite/internalpb: add inspectors 2020-10-30 13:28:17 +02:00
migrate_test.go Allow for DB application names per process. (#3983) 2020-12-04 11:24:39 +01:00
migrate.go satellite/accounting: account for old orders that can be submitted in satellite rollup 2020-11-18 14:46:00 -05:00
nodeapiversion.go all: use jackc/pgx in place of lib/pq 2020-07-13 15:54:41 +00:00
nodeselection.go all: golangci-lint v1.33.0 fixes (#3985) 2020-12-05 17:01:42 +01:00
offers.go all: add missing dots 2020-08-11 17:50:01 +03:00
orders.go satellite/orders: ensure that expired deletion doesn't stall 2020-11-23 14:52:40 +02:00
overlaycache_test.go satellite: remove IsUp field from overlay.UpdateRequest 2020-11-02 15:34:17 -05:00
overlaycache.go satellite/overlay: Add retry to all selects in overlaycache 2020-11-29 16:46:57 -05:00
payout.go storagenode: heldamount renamed to payouts, renamed some methods and structs to more meaningful names. grouped estimated payout with pathouts 2020-09-16 14:57:35 +00:00
peeridentities.go all: fix dots 2020-07-16 14:58:28 +00:00
projectaccounting.go all: fix defers in loop 2020-11-02 15:06:38 +02:00
projectmembers_test.go web/satellite: project members sorting fixed (#3231) 2019-10-15 15:24:53 +03:00
projectmembers.go all: fix linter complaints 2020-10-13 15:59:01 +03:00
projects_test.go satellite/satellitedb/dbx: name the package dbx 2020-01-15 15:16:39 -07:00
projects.go satellite: make limits be nullable 2020-09-21 19:34:19 +00:00
regtokens.go all: fix dots 2020-07-16 14:58:28 +00:00
repairqueue.go satellite/repair: improve contention for injuredsegments table on CRDB 2020-12-10 09:51:26 -08:00
resetpasstokens.go all: fix dots 2020-07-16 14:58:28 +00:00
revocation_test.go satellite: Check macaroon revocation 2020-06-22 13:50:07 -06:00
revocation.go all: fix dots 2020-07-16 14:58:28 +00:00
storagenodeaccounting.go satellitedb: retry GetBandwidthSince on cockroach 2020-11-29 16:36:15 -07:00
stripecoinpaymentsdb.go satellite/payments: delete credits and credits_spendings db tables 2020-07-30 12:19:57 +03:00
usercredits.go all: fix dots 2020-07-16 14:58:28 +00:00
users_test.go satellite/satellitedb/dbx: name the package dbx 2020-01-15 15:16:39 -07:00
users.go all: fix dots 2020-07-16 14:58:28 +00:00