it was noticed that if you had a long lived transaction A that
was blocking some other transaction B and A was being aborted
due to retriable errors, then transaction B was never given
priority. this was due to using savepoints to do lightweight
retries.
this behavior was problematic becaue we had some queries blocked
for over 16 hours, so this commit addresses the issue with two
prongs:
1. bound the amount of time we will retry a transaction
2. create new transactions when a retry is needed
the first ensures that we never wait for 16 hours, and the value
chosen is 10 minutes. that should be long enough for an ample
amount of retries for small queries, and huge queries probably
shouldn't be retried, even if possible: it's more preferrable to
find a way to make them smaller.
the second ensures that even in the case of retries, queries that
are blocked on the aborted transaction gain priority to run.
between those two changes, the maximum stall time due to retries
should be bounded to around 10 minutes.
Change-Id: Icf898501ef505a89738820a3fae2580988f9f5f4
The old migration was not working. It was updateding pending (status 0)
and failed (status -1) to completed (status 100).
Change-Id: I808ff3cc692fe6c698ce26a8b411b134e67b752b
Currently Cockroach DB setup takes a significant amount of time.
This flattens the database setup into a single query,
which improves the test time significantly.
The migration tests still test each migration separately.
Change-Id: Iaca16f34a6af3926fa2b5ebf618f939fd59460b3
Limits how many times metainfo APIs can be called per second by project ID. If limit is exceeded, the API will return Unauthorized/Too Many requests.
Limit per second and the size of the limiter cache per project are configurable, as well as whether the limiter is enabled.
Tests added/updated for the new rate_limit field in projects table.
Tests added for exceeding limits and disableing limiter.
Change-Id: Ic8ad102de3b690a475809d4f684156d5715f20fa
Also added temporary types withRebind and withTagTx,
which will be later removed. Currently they help to avoid
changing the whole codebase at the same time.
Change-Id: I7f07ba8f4709a23a463bfa67464628665a05808f
warning: databases migrated to version 77 before this commit
is merged must be manually re-migrated. this should not be a
problem for anything but staging databases.
Change-Id: Ie1631c48379472352014183ee43f1465e22200f7
this commit introduces the reported_serials table. its purpose is
to allow for blind writes into it as nodes report in so that we have
minimal contention. in order to continue to accurately account for
used bandwidth, though, we cannot immediately add the settled amount.
if we did, we would have to give up on blind writes.
the table's primary key is structured precisely so that we can quickly
find expired orders and so that we maximally benefit from rocksdb
path prefix compression. we do this by rounding the expires at time
forward to the next day, effectively giving us storagenode petnames
for free. and since there's no secondary index or foreign key
constraints, this design should use significantly less space than
the current used_serials table while also reducing contention.
after inserting the orders into the table, we have a chore that
periodically consumes all of the expired orders in it and inserts
them into the existing rollups tables. this is as if we changed
the nodes to report as the order expired rather than as soon as
possible, so the belief in correctness of the refactor is higher.
since we are able to process large batches of orders (typically
a day's worth), we can use the code to maximally batch inserts into
the rollup tables to make inserts as friendly as possible to
cockroach.
Change-Id: I25d609ca2679b8331979184f16c6d46d4f74c1a6
This reverts commit 8e242cd012.
Revert because lib/pq has known issues with context cancellation.
These issues need to be resolved before these changes can be merged.
Change-Id: I160af51dbc2d67c5449aafa406a403e5367bb555
this will allow for some nice runtime analysis down the road.
also, this allows for wrapping database handles in a way that
can interact with these contexts
requires https://review.dev.storj.io/c/storj/dbx/+/514
Change-Id: Ib087b7cd73296dd2c1e0331314da34d861f61d2b
When an uplink requests an upload or download from the satellite we are trackig the
allocated bandwidth twice. The value in bucket_bandwidth_rollups is used
for project limits but the value in storagenode_bandwidth_rollups is not
used at all. We can increase the performance by removing it. Uplinks
will get a faster response from the satellite.
Change-Id: Icccd41f94107ef34668f30f99bf5f728c384b07e
Backstory: I needed a better way to pass around information about the
underlying driver and implementation to all the various db-using things
in satellitedb (at least until some new "cockroach driver" support makes
it to DBX). After hitting a few dead ends, I decided I wanted to have a
type that could act like a *dbx.DB but which would also carry
information about the implementation, etc. Then I could pass around that
type to all the things in satellitedb that previously wanted *dbx.DB.
But then I realized that *satellitedb.DB was, essentially, exactly that
already.
One thing that might have kept *satellitedb.DB from being directly
usable was that embedding a *dbx.DB inside it would make a lot of dbx
methods publicly available on a *satellitedb.DB instance that previously
were nicely encapsulated and hidden. But after a quick look, I realized
that _nothing_ outside of satellite/satellitedb even needs to use
satellitedb.DB at all. It didn't even need to be exported, except for
some trivially-replaceable code in migrate_postgres_test.go. And once
I made it unexported, any concerns about exposing new methods on it were
entirely moot.
So I have here changed the exported *satellitedb.DB type into the
unexported *satellitedb.satelliteDB type, and I have changed all the
places here that wanted raw dbx.DB handles to use this new type instead.
Now they can just take a gander at the implementation member on it and
know all they need to know about the underlying database.
This will make it possible for some other pending code here to
differentiate between postgres and cockroach backends.
Change-Id: I27af99f8ae23b50782333da5277b553b34634edc
for storj-sim to work, we need to avoid schemas in cockroach urls
so we have storj-sim create namespaced databases instead of schemas
and we have the migrate command create the database in the same way
that it would create a schema for postgres. then it works!
a follow up commit will move the creation of the database/schemas
into storj-sim's setup step so that we can avoid doing these icky
creations during normal migration calls. it will also make the
pointerdb have an explicit call to migrate instead of just doing
it every time it's opened.
Change-Id: If69ef5cb96b6866b0438c761bd445afb3597ae5f
satellitedb migration tests ran against multiple base versions, however after the merging all the steps the base versions didn't exists anymore - which meant none of the migration tests were actually running.
first, so that they all work the same way, because it's getting
complicated, and second, so that we can do the appropriate thing
instead of CREATE SCHEMA for cockroachdb.
Change-Id: I27fbaeeb6223a3e06d97bcf692a2d014b31465f7
* update migration steps, add crdb support to testplanet
* add crdb support
* have jenkins run a bares bones crdb compat test
* skip crdb tests
* skip crdb tests
* fix root_piece_id column
* write crdb store to tmp dir
* escape
* merge migration
* rm migration versions
* rm unneeded migration test data
* create index w/postgres + crdb compatible syntax
* add default to offers.invitee_credit_duration_days
* changes so that schema matches from master to branch
* change to be crdb compatible
* add check to confirm db version
* mv version check to migration
* update tests
* add minversion to sadb migration, update tests
* confirm min version for all dbs in a migration
* add validate migration to sadb
* fix lint err
* rm min version check from migrate
* change sadb check
* hard code min db version
* fix comment
* separate sadb migration, add version check
* update checkversion to do same validation as migration
* changes per CR
* add sa migration to storj-sim
* add different debug port in storj-sim for migration
* add wait for exit for storj-sim migration
* update sa docker entrypoint to support migration
* storj-sim satellite parts all wait for migration
* upgrade golang-migrate/migrate to v4 because bug
* fix go mod tidy
* create upsert query for check-in method
* add tests
* fix lint err
* add benchmark test for db query
* fix lint and tests
* add a unit test, fix lint
* add address to tests
* replace print w/ b.Fatal
* refactor query per CR comments
* fix disqualified, only set if null
* fix query
* add version to updatecheckin query
* fix version
* fix tests
* change version for tests
* add version to tests
* add IP, add transport, mv unit test
* use node.address as arg
* add last ip
* fix lint
* satellitedb/certDB: refactors of the node certificate storage DB table
The existing implementation doesnt allow to store the complete certificate chain of uplinkIDs or storagenodeIDs, so the current table is dropped and new table will be added which addresses the storage and retrieval of certificates
pkg/identity: fixes spelling mistakes that I missed on PR#2754
Fixes V3-1992/V3-2388
* parent 13dd501042
author Yingrong Zhao <yingrong.zhao@gmail.com> 1563560530 -0400
committer Yingrong Zhao <yingrong.zhao@gmail.com> 1563581673 -0400
parent 13dd501042
author Yingrong Zhao <yingrong.zhao@gmail.com> 1563560530 -0400
committer Yingrong Zhao <yingrong.zhao@gmail.com> 1563581428 -0400
satellite/console: add referral link logic (#2576)
* setup referral route
* referredBy
* add user id
* modify user query
* separate optional field from userInfo
* get current reward on init of satellite gui
* remove unsed code
* fix format
* only apply 0 credit on registration
* only pass required information for rewards
* fix time parsing
* fix test and linter
* rename method
* add todo
* remove user referral logic
* add null check and fix format
* get current offer
* remove partnerID on CreateUser struct
* fix storj-sim user creation
* only redeem credit when there's an offer
* fix default offer configuration
* fix migration
* Add helper function for get correct credit duration
* add comment
* only store userid into user_credit table
* add check for partner id to set correct offer type
* change free credit to use invitee credits
* remove unecessary code
* add credit update in activateAccount
* remove unused code
* fix format
* close reader and fix front-end build
* move create credit logic into CreateUser method
* when there's no offer set, user flow shouldn't be interrupted by referral program
* add appropriate error messages
* remove unused code
* add comment
* add error class for no current offer error
* add error class for credits update
* add comment for migration
* only log secret when it's in debug level
* fix typo
* add testdata
* satellite/satellitedb: User var block for Error
To follow with the code style of the majority of the sources of the
current code base the Error variable should be in a block.
Replacing a single var expression to a block one makes the godoc more
consistent across the repository.
* satellite/satellitedb: Remove empty spaces end of line
* setup referral route
* referredBy
* add user id
* modify user query
* separate optional field from userInfo
* get current reward on init of satellite gui
* remove unsed code
* fix format
* only apply 0 credit on registration
* only pass required information for rewards
* fix time parsing
* fix test and linter
* rename method
* add todo
* remove user referral logic
* add null check and fix format
* get current offer
* remove partnerID on CreateUser struct
* fix storj-sim user creation
* only redeem credit when there's an offer
* fix default offer configuration
* fix migration
* Add helper function for get correct credit duration
* add comment
* only store userid into user_credit table
* add check for partner id to set correct offer type
* change free credit to use invitee credits
* remove unecessary code
* add default offer for offers table
* fix migration test
* Trigger Jenkins
* set the default value to be correct type
* skip soon will deleted test
* fix test data
* add orderby for ListAll
* change durations, redeemable cap to be a nullable field
* remove unecessary code
* add bucket metadata table in SA masterDB
* fix indentation
* update db model per CR comments
* update testdata
* add missing field on sql testdata
* fix args to testdata
* unique bucket name
* fix fkey constraint for test
* fix one too many commas
* update timestamp type
* Trigger Jenkins
* Trigger Jenkins yet again
* v3-2023: add project_id migration for bucket_storage_tallies and bucket_bandwidth_rollups
* added test data for migration 37
* corrected data format
* test sql update
* migrate script updates
* adding previous data
Adds a migration step to pull in old reputation success / total counts into modern alpha / beta scores
If audit success count is less than 50, audit alpha will be set to 50
If uptime success count is less than 100, uptime alpha will be set to 100
This helps us deal with cases where nodes have not been audited or checked for uptime yet, in which case alpha/beta values of 0/0 would cause a node to be considered disqualified.
A node with audit alpha/beta of 50/0 will be disqualified on the 19th check
A node with uptime alpha/beta of 100/0 will be disqualified on the 44th check
This does not affect brand new nodes (nodes that were not in the database before this change). The alpha/beta values for those nodes will be set to 1/0 as before
* satellite/satellitedb: Alter nodes disqualification column
Change the type of the 'disqualification' column of the nodes table from
boolean to timestamp.
* overlay/cache: Change Disqualified field type
Change the Disqualified field type the NodeDossier struct type from bool
to time.Time to match with the disqualified type used by the DB layer.
* satellite/satellitedb: Update queries uses disqualified
Update the queries which uses the disqualified column due to the column
type has been changed from boolean to nullable timestamp.
* docs/design: Update disqualification due impl changes
Update the disqualification design document to contain the architectural
change required to be able to restore unfair disqualified nodes in case
of an unexpected cause (bug, mistake, hard network disconnection, etc.).
* add user credits table
* change primary key, change type for credit_type, and change relation kind of foreign keys from cascade to restrict
* modify table and query methods
* modify schema
* add dbx queries
* add migration file
* add orderby to read available credit entries
* adds model to satellite dbx
* cleans up model spacing
* generated golang from dbx
* added migration steps
* Added testdata
* changed node_id -> bucket_id
* adds -- NEW DATA -- to testdata
* more testdata changes
* adds -- NEW DATA -- line
* dbx makes the table plural
* missed a singular value_attribution
* restart jenkins
* Update satellitedb.dbx
* adjust to PR comments
* autogenerated dbx models
* restart jenkins
* init marketing service
Fix linting error
Create offerdb implementation
Create offers service
Add update method
Create offer table and migration
Fix linting error
fix conflicts
Insert new data
Change duration to have clear indication to be based on days
add error wrapper
Change from using uuid to int for id field
* Create Marketing service
* make error virable name more readable
* add condition in update service method to check offer status
* generate lock file
Change get to listAllOffers
* Add method for getting current offer
wip
* add check for expires_at in update method
* Fix conflicts
* add copyright header
* Fix linting error
* only allow update to active offers
* add isDefault argument to GetCurrent
* Update lock file
* add migration file
* finish migrate for adding credit_in_cents for both award and invitee
* save 100 years as expiration date for default offers
* create crud test for offers
* add GetCurrent test
* modify doc
* Fix GetCurrent to work with default offer
* fix linting issue
* add more tests and address feedbacks
* fix migration file
* add type column back to match with mockup design
* add type column back to match with mockup design
* move doc changes to new pr
* add comments
* change GetCurrent to GetCurrentByType
* fix typo
What: Changes to support custom usage limit for the project. With this implementation by default project usage limit is taken from configuration flag. If project DB field usage_limit will be set to value larger than 0 it will become custom usage limit and we will be used to verify is limit was exceeded.
Whats changed:
usage_limit (bigint) field added to projects table (with migration)
things related to project usage moved from metainfo endpoint to project usage type
accounting.ProjectAccounting extended with GetProjectUsageLimits() method
Why: We need to have different usage limits per project. https://storjlabs.atlassian.net/browse/V3-1814
* add last_ip field to dbx model node, generate dbx
* add last_ip to node proto, generate pb
* migrate
* resolve address in transport.DialNode, update lastIp in cache.UpdateAddress
* use net.SplitHostPort to isolate host address from port
* define DistinctIPs flag
* add test for GetIP
* select last_ip when querying for nodes
* if distinctIPs flag == true, query for nodes with distinct IPs
* some basic tests
* change last_ip to field 14 in proto
* remove comments
* check err
* change distinctIPs to distinctIP
* exclude IPs from newNodes in query for reputable nodes
* add index on last_ip
* only add to excludedIPs if flag is true
* test half new nodes returns distinct IPs
* fix alignment
* add test
* rework ip filter query, add retry logic, add switch for database driver
* add retry to SelectNewNodes
* change discovery intervals so IPs don't get overwritten
* remove TestGetIP
* edit updating node stats in test
* split exclude into nodeIDs and IPs
* separate non-distinct IP query into other function
* trigger checks
* remove else block
* add flags to sotrj-sim for SA dbs
* add schema to postgres
* add createschema with parse to sa
* add metainfo db postgres support
* add kv default as bolt
* add debug log to see db source
* add env var for postgres to test-sim.sh
* fix lint errs
* dynamically add postgres to args
* add postgres to integration tests
* add sqlite and postgres integration jenkins
* fix db name
* merge integration tests into one step
* test integration tests w/psql
* try using different schema
* debug failure
* use correct host for running storj-sim
* rm sqlite integration
* add back integration
* Initial Webserver Draft for Version Controlling
* Rename type to avoid confusion
* Move Function Calls into Version Package
* Fix Linting and Language Typos
* Fix Linting and Spelling Mistakes
* Include Copyright
* Include Copyright
* Adjust Version-Control Server to return list of Versions
* Linting
* Improve Request Handling and Readability
* Add Configuration File Option
Add Systemd Service file
* Add Logging to File
* Smaller Changes
* Add Semantic Versioning and refuses outdated Software from Startup (#1612)
* implements internal Semantic Version library
* adds version logging + reporting to process
* Advance SemVer struct for easier handling
* Add Accepted Version Store
* Fix Function
* Restructure
* Type Conversion
* Handle Version String properly
* Add Note about array index
* Set temporary Default Version
* Add Copyright
* Adding Version to Dashboard
* Adding Version Info Log
* Renaming and adding CheckerProcess
* Iteration Sync
* Iteration V2
* linting
* made LogAndReportVersion a go routine
* Refactor to Go Routine
* Add Context to Go Routine and allow Operation if Lookup to Control Server fails
* Handle Unmarshal properly
* Linting
* Relocate Version Checks
* Relocating Version Check and specified default Version for now
* Linting Error Prevention
* Refuse Startup on outdated Version
* Add Startup Check Function
* Straighten Logging
* Dont force Shutdown if --dev flag is set
* Create full Service/Peer Structure for ControlServer
* Linting
* Straighting Naming
* Finish VersionControl Service Layout
* Improve Error Handling
* Change Listening Address
* Move Checker Function
* Remove VersionControl Peer
* Linting
* Linting
* Create VersionClient Service
* Renaming
* Add Version Client to Peer Definitions
* Linting and Renaming
* Linting
* Remove Transport Checks for now
* Move to Client Side Flag
* Remove check
* Linting
* Transport Client Version Intro
* Adding Version Client to Transport Client
* Add missing parameter
* Adding Version Check, to set Allowed = true
* Set Default to true, testing
* Restructuring Code
* Uplink Changes
* Add more proper Defaults
* Renaming of Version struct
* Dont pass Service use Pointer
* Set Defaults for Versioning Checks
* Put HTTP Server in go routine
* Add Versioncontrol to Storj-Sim
* Testplanet Fixes
* Linting
* Add Error Handling and new Server Struct
* Move Lock slightly
* Reduce Race Potentials
* Remove unnecessary files
* Linting
* Add Proper Transport Handling
* small fixes
* add fence for allowed check
* Add Startup Version Check and Service Naming
* make errormessage private
* Add Comments about VersionedClient
* Linting
* Remove Checks that refuse outgoing connections
* Remove release cmd
* Add Release Script
* Linting
* Update to use correct Values
* Change Timestamp handling
* Adding Protobuf changes back in
* Adding SatelliteDB Changes and adding Storj Node Version to PB
* Add Migration Table
* Add Default Stats for Creation
* Move to BigInt
* Proper SQL Migration
* Ensure minimum Version is passed to the node selection
* Linting...
* Remove VersionedClient and adjust smaller changes from prior merge
* Linting
* Fix PB Message Handling and Query for Node Selection
* some future-proofing type changes
Change-Id: I3cb5018dcccdbc9739fe004d859065992720caaf
* fix a compiler error
Change-Id: If66bb92d8b98e31cd618ecec9c6448ab9b037fa5
* Comment on Constant for Overlay
* Remove NOT NULL and add epoch call as function
* add versions to bootstrap and satellites
Change-Id: I436944589ea5f21600cdd997742a84fe0b16e47b
* Change Update Migration
* Fix DB Migration
* Increase Timeout temporarily, to see whats going on
* Remove unnecessary const and vars
Cleanup Function calls from deprecated NodeVersion struct
* Updated Protopuf, removed depcreated Code from Inspector
* Implement NodeVersion into InfoResponse
* Regenerated locked.go
* Linting
* Fix Tests
* Remove unnecessary constant
* Update Function and Flag Description
* Remove Empty Stat Creation
* return properly with error
* Remove unnecessary struct
* simplify migration step
* Update Inspector to return Version Info
* Update local Endpoint Version Handling
* Reset Travis Timeout
* Add Default for CommitHash
* single quotes
* reorg uplink cmd files for consistency
* init implementation of usage limiting
* Revert "reorg uplink cmd files for consistency"
This reverts commit 91ced7639bf36fc8af1db237b01e233ca92f1890.
* add changes per CR comments
* fix custom query to use rebind
* updates per convo about what to limit on
* changes per comments
* fix syntax and comments
* add integration test, add db methods for test
* update migration, add rebind to query
* update testdata for psql
* remove unneeded drop index statement
* fix migrations, fix calculate usage limit
* fix comment
* add audit test back
* change methods to use bucketName/projectID, fix tests
* add changes per CR comments
* add test for uplink upload and err ssg
* changes per CR comments
* check get/put limit separately
* go gen
* undo changes to bucket usage
* update locked
* spacing
* moves changes to migration v11
* minor changes to fix lint and test err
* change sql to fix errs
* Move from Unique to Index
* Remove Index
* Make some more Indexes Unique and adjust migration
* Fix Migration Statements
* Fix Typo
* Fix Migration of older Table
* Exchange DROP statement
* Remove "if not exists"
* Revert Change in old Migration