Jessica Grebenschikov 0649d2b930 satellite/repair: improve contention for injuredsegments table on CRDB
We migrated satelliteDB off of Postgres and over to CockroachDB (crdb), but contention on the injuredsegments table was far too high, so we had to roll back to Postgres for the repair queue. A few things contributed to this problem:
1) crdb doesn't support `FOR UPDATE SKIP LOCKED`
2) the original crdb Select query was doing 2 full table scans and not using any indexes
3) the SLC Satellite (where we were doing the migration) was running 48 repair worker processes, each of which runs up to 5 goroutines, all of them trying to select out of the repair queue, and this was causing a ton of contention.

The changes in this PR should help to reduce that contention and improve performance on CRDB.
The changes include:
1) Use an update/set query instead of select/update to capitalize on the new `UPDATE` implicit row locking ability in CRDB.
- Details: As of CRDB v20.2.2, there is implicit row locking with update/set queries (contention reduction and performance gains are described in this blog post: https://www.cockroachlabs.com/blog/when-and-why-to-use-select-for-update-in-cockroachdb/).

2) Remove the `ORDER BY` clause, since it caused a full table scan and also prevented the use of the row locking capability.
- While long term it is very important to `ORDER BY segment_health`, the change here is only supposed to be a temporary bandaid to get us migrated over to CRDB quickly. Since segment_health has been set to infinity for some time now (re: https://review.dev.storj.io/c/storj/storj/+/3224), it seems like it might be ok to continue not making use of this for the short term. However, long term this needs to be fixed with a redesign of the repair workers, possibly in the trusted delegated repair design (https://review.dev.storj.io/c/storj/storj/+/2602), or something similar to what is recommended here on how to implement a queue on CRDB (https://dev.to/ajwerner/quick-and-easy-exactly-once-distributed-work-queues-using-serializable-transactions-jdp), or a migration to a RabbitMQ priority queue or something similar.

This PR's improved query uses the index to avoid full scans and also locks the row it's going to update, and CRDB retries for us if there are any lock errors.

Change-Id: Id29faad2186627872fbeb0f31536c4f55f860f23
2020-12-10 09:51:26 -08:00
.github pr template: migrations run concurrently with api servers now 2019-10-31 09:27:46 -06:00
certificate certificate/authorization: add ctx to OpenDB 2020-10-29 09:46:23 +02:00
cmd cmd/storj-sim: fix 32bit code 2020-12-09 09:49:33 +02:00
docs Fixed typos in downtime tracking with audits doc (#3977) 2020-11-27 17:25:21 +01:00
installer/windows storagenode/windows-installer: ignore set firewall exception error 2020-05-27 17:56:49 +03:00
multinode Allow for DB application names per process. (#3983) 2020-12-04 11:24:39 +01:00
pkg pkg/revocation: pass ctx into opening the database 2020-10-29 07:15:36 +00:00
private private/testplanet: add helper OpenProject method to testplanet uplink 2020-12-07 13:45:47 +00:00
resources cmd: add ca-certificates to Docker images (#3986) 2020-12-08 01:38:33 +01:00
satellite satellite/repair: improve contention for injuredsegments table on CRDB 2020-12-10 09:51:26 -08:00
scripts scripts/tests/testversions: fix indentation 2020-12-04 21:54:55 +00:00
storage Allow for DB application names per process. (#3983) 2020-12-04 11:24:39 +01:00
storagenode storagenode/console: diskSpaceInfo extended with overused diskspace, getDashboardData updated. 2020-12-08 14:55:55 +00:00
versioncontrol cmd/satellite: proper log usage 2020-10-13 16:56:35 +03:00
web web/satellite: dashboard lag message tooltip (#3982) 2020-12-08 19:01:51 +02:00
.clabot added myself to the clabot list. (#3988) 2020-12-09 19:34:22 +01:00
.dockerignore Forward-port release-alpha8 build script issues (#1726) 2019-04-09 23:01:10 -06:00
.gitignore cmd/gateway: remove gateway command from repository and adjust `make 2020-03-27 07:42:42 +00:00
CODE_OF_CONDUCT.md Adding CODE_OF_CONDUCT to storj/storj repo (#779) 2018-12-07 15:10:02 -05:00
docker-compose.yaml satellite/testing: Change testing to use PG 12.3 (#3913) 2020-06-25 20:17:39 +03:00
go.mod Upgrade to uplink v1.4.2 2020-12-10 15:47:11 +02:00
go.sum satellite/satellitedb: add ListAllBuckets method 2020-12-10 14:19:27 +00:00
Jenkinsfile Makefile: Update Go version security patch 2020-11-15 00:36:54 +01:00
Jenkinsfile.public ci: ensure cockroach doesn't pollute repo 2020-11-13 16:07:01 +00:00
LICENSE license code with agplv3 (#126) 2018-07-05 10:24:26 -04:00
Makefile Makefile: Update Go version security patch 2020-11-15 00:36:54 +01:00
monkit.lock satellite/{accounting, contact}: Remove periods and spaces from metrics. 2020-12-03 15:33:01 +00:00
package-lock.json Satellite api keys frontend (#1039) 2019-02-01 18:19:30 +02:00
README.md updating Aha link on our readme 2020-05-04 14:21:58 -04:00

Storj V3 Network


Storj is building a decentralized cloud storage network. Check out our white paper for more info!


Storj is an S3-compatible platform and suite of decentralized applications that allows you to store data in a secure and decentralized manner. Your files are encrypted, broken into little pieces and stored in a global decentralized network of computers. Luckily, we also support allowing you (and only you) to retrieve those files!

Contributing to Storj

All of our code for Storj v3 is open source. Have a code change you think would make Storj better? Please send a pull request along! Make sure to sign our Contributor License Agreement (CLA) first. See our license section for more details.

Have comments or bug reports? Want to propose a PR before hand-crafting it? Jump on to our forum and join the Engineering Discussions to say hi to the developer community and to talk to the Storj core team.

Want to vote on or suggest new features? Post it on ideas.storj.io.

Issue tracking and roadmap

See the breakdown of what we're building by checking out the following resources:

Install required packages

To get started running Storj locally, download and install the latest release of Go (at least Go 1.13) at golang.org.

You will also need Git (brew install git, apt-get install git, etc.). If you're building on Windows, you also need to install gcc and have it set up correctly.

We support Linux, Mac, and Windows operating systems. Other operating systems supported by Go should also be able to run Storj.

Download and compile Storj

Aside about GOPATH: Go 1.11 supports a new feature called Go modules, and Storj has adopted Go module support. If you've used previous Go versions, Go modules no longer require a GOPATH environment variable. Go by default falls back to the old behavior if you check out code inside of the directory referenced by your GOPATH variable, so make sure to use another directory, unset GOPATH entirely, or set GO111MODULE=on before continuing with these instructions.

First, fork our repo and clone your copy of our repository.

git clone git@github.com:<your-username>/storj storj
cd storj

Then, let's install Storj.

go install -v ./cmd/...

Make changes and test

Make the changes you want to see! Once you're done, you can run all of the unit tests:

go test -v ./...

You can also execute only a single test package if you like. For example: go test ./pkg/identity. Add -v for more information about the executed unit tests.

Push up a pull request

Use Git to push your changes to your fork:

git commit -a -m 'my changes!'
git push origin master

Use GitHub to open a pull request!

A Note about Versioning

While we are practicing semantic versioning for our client libraries such as uplink, we are not practicing semantic versioning in this repo, as we do not intend for it to be used via Go modules. We may have backwards-incompatible changes between minor and patch releases in this repo.

Start using Storj

Our wiki has documentation and tutorials. Check out these three tutorials:

License

The network under construction (this repo) is currently licensed with the AGPLv3 license. Once the network reaches beta phase, we will be licensing all client-side code via the Apache v2 license.

For code released under the AGPLv3, we request that contributors sign our Contributor License Agreement (CLA) so that we can relicense the code under Apache v2, or other licenses in the future.

Support

If you have any questions or suggestions please reach out to us on our community forum or email us at support@tardigrade.io.