* pkg/process/metrics: add an instance prefix
the distinction between which satellite is sending which
data should go in the instance field, not the suffix or application
fields. (un)fortunately, the instance id is deliberately not
configurable because we don't want it to be easy to accidentally
have multiple applications collide with the same instance id.
so we're currently stuffing the human readable instance in the
suffix. :(
perhaps a reasonable tradeoff would be an optional instance
prefix that allows operators to put their domain name in
the instance
Change-Id: I6fcc8498be908c5740439cc00f77474ad151febd
* linting
Change-Id: I9f9a44fa9a2634ef5e4f89548d42d57ce9e4450e
* add path implementation
This commit adds a pkg/paths package which contains two types,
Encrypted and Unencrypted, to statically enforce what is contained
in a path. It's part of a refactoring of the code base to be more
clear about what is contained in a storj.Path at all the layers.
Change-Id: Ifc4d4932da26a97ea99749b8356b4543496a8864
* add encryption store
This change adds an encryption.Store type to keep a collection
of root keys for arbitrary locations in some buckets. It allows
one to look up all of the necessary information to encrypt paths,
decrypt paths and decrypt list operations.
It adds some exported functions to perform encryption on paths
using a Store.
Change-Id: I1a3d230c521d65f0ede727f93e1cb389f8be9497
* add shim around streams store
This commit changes no functionality, but just reorganizes the code
so that changes can be made directly to the streams store
implementation without affecting callers.
It also adds a Path type that will be used at the interface boundary
for the streams store so that it can be sure that it's getting well
formed paths that it expects.
Change-Id: I50bd682995b185beb653b00562fab62ef11f1ab5
* refactor streams to use encryption store
This commit changes the streams store to use the path type as
well as the encryption store to handle all of it's encryption
and decryption.
Some changes were made to how the default key is returned in
the encryption store to have it include the case when the bucket
exists but no paths matched. The path iterator could also be
simplified to not report if a consume was valid: that information
is no longer necessary.
The kvmetainfo tests were changed to appropriately pass the
subtests *testing.T rather than having the closure it executes
use the parent one. The test framework now correctly reports
which test did the failing.
There are still some latent issues with listing in that listing
for "a/" and listing for "a" are not the same operation, but we
treat them as such. I suspect that there are also issues with
paths like "/" or "//foo", but that's for another time.
Change-Id: I81cad4ba2850c3d14ba7e632777c4cac93db9472
* use an encryption store at the upper layers
Change-Id: Id9b4dd5f27b3ecac863de586e9ae076f4f927f6f
* fix linting failures
Change-Id: Ifb8378879ad308d4d047a0483850156371a41280
* fix linting in encryption test
Change-Id: Ia35647dfe18b0f20fe13763b28e53294f75c38fa
* get rid of kvmetainfo rootKey
Change-Id: Id795ca03d9417e3fe9634365a121430eb678d6d5
* Fix linting failure for return with else
Change-Id: I0b9ffd92be42ffcd8fef7ea735c5fc114a55d3b5
* fix some bugs adding enc store to kvmetainfo
Change-Id: I8e765970ba817289c65ec62971ae3bfa2c53a1ba
* respond to review feedback
Change-Id: I43e2ce29ce2fb6677b1cd6b9469838d80ec92c86
* add voucher service on storage node
* config field tag syntax, go routines for requests
* hook up voucher service in storagenode/peer.go
* add voucher config to testplanet
* add voucher config to testplanet
* add voucher response status INVALID, ACCEPTED, REJECTED
* add a test for vouchers service
* handle no row from GetValid, test it
* add trust pool to voucher service
* use trusted list to get satellites
* verify vouchers upon receipt
* test VerifyVoucher
This commit adds two functions that implement the algorithms
described in the password key derivation design document. They
will be used during setup to derive bucket level root keys or
default passwords to use when buckets do not have their own
independent key.
Change-Id: Ie7fb2d8d549ba7465d0722716a2c1ac0ad907286
* pkg/audit: Add DQ test for too many failed audits
Add an integration test which checks that a node which fails several
audits gets disqualified but not before it reaches the audit reputation
disqualification cut-off.
* internal/testplanet: Set DQ cut-off config values
Set the values of the Overlay cache DQ cut-off configuration parameters
used by testplanet.
Move 2 helper function used for test which relay on testplanet from the
test file where they were created to separated file to contain them
because they are not only used in the test file were initially they were
created.
* add counters for nodes that have/have not been seen in the past 24 hours/week
* add additional uptime counters
* add monkit stats for containment mode
* satellite/satellitedb: Alter nodes disqualification column
Change the type of the 'disqualification' column of the nodes table from
boolean to timestamp.
* overlay/cache: Change Disqualified field type
Change the Disqualified field type the NodeDossier struct type from bool
to time.Time to match with the disqualified type used by the DB layer.
* satellite/satellitedb: Update queries uses disqualified
Update the queries which uses the disqualified column due to the column
type has been changed from boolean to nullable timestamp.
* docs/design: Update disqualification due impl changes
Update the disqualification design document to contain the architectural
change required to be able to restore unfair disqualified nodes in case
of an unexpected cause (bug, mistake, hard network disconnection, etc.).
* reputation: Add configuration parameters
Add the configuration parameters which will be used by the algorithm
which will calculate the storage node reputation.
Because the reputation calculation is based on audit and uptime check
results some configuration parameters are in pkg/audit, others in
pkg/discovery and other in the satellite which will combine the both
reputation results to obtain the storage node reputation for repair and
uplink.
* satellite-config: Refresh lock file with new params
Refresh the Satellite configuration yaml lock file with the new
parameters added in this branch.
* Disabled discovery service by changiing from Stop() to Pause()
Paused to solve race condition. If discovery is running, it may mark a node "up" after they've been manually marked "down" in this test.
* Extend to the repair timeout
Fixes intermittent test failures when repairs were taking more than 2 seconds.
* Re-enabled test. Disabled discovery service by changiing from Stop() to Pause()
* Changed back to Stop.
* Revert "Changed back to Stop."
This reverts commit 46d410e72dfae63e0c44915be42784cc9a7b5abf.
* re-enabling TestIdentifyInjuredSegments
* Changed Pause to Stop. Commented on timeout change
* testing...
* temporarily skipping audit tests
* changing back to discover Stop for testing via jenkins
* Revert "changing back to discover Stop for testing via jenkins"
This reverts commit 6aa8558b11a0053c30e0c8b2dbf0d6c0cb34ee6c.
* Changing back to Stop(). Depends on PR 2137
* Revert "temporarily skipping audit tests"
This reverts commit 1940ed9b315d663a0eb6c95521780cbcb48cb121.
* Removed reference to Graveyard since its been removed
Fix a bug in a range loop which used the reference of the variable assigned in the same loop to be appended in a slice, turning out that the slice will contain the same reference rather the a reference for each value of the ranged items.
What: add monkit.Task to a bunch of functions that are missing it
Why: this will significantly help our instrumentation, data collection, and tracing about what's going on in the network
What: this will make it so release binaries default to whatever-release instead of whatever-dev in metrics collection
Why: So we can monitor release binaries with default configuration without getting drowned out by dev binaries
* fix bug for setting flag only values in process setup
when the code was changed to directly load values into the config
structs, it was missed that some configuration is only defined
through flags, but can be loaded from config files still.
so, we need to propogate the settings to the flag only values.
* add test for setting propagation
* fix linting error
* set up voucher service skeleton, basic test
* add VetNode db method
* basic test for VetNode
* encode and sign voucher functions
* fill out and sign vouchers
* test pass/fail voucher request
* match EncodeVoucher to other Encode functions
* change BindSetup to be an option to Bind
* add process.Bind to allow composite structures
* hack fix for noprefix flags
* used tagged version of structs
Before this PR, some flags were created by calling `cfgstruct.Bind` and having their fields create a flag. Once the flags were parsed, `viper` was used to acquire all the values from them and config files, and the fields in the struct were set through the flag interface.
This doesn't work for slices of things on config structs very well, since it can only set strings, and for a string slice, it turns out that the implementation in `pflag` appends an entry rather than setting it.
This changes three things:
1. Only have a `Bind` call instead of `Bind` and `BindSetup`, and make `BindSetup` an option instead.
2. Add a `process.Bind` call that takes in a `*cobra.Cmd`, binds the struct to the command's flags, and keeps track of that struct in a global map keyed by the command.
3. Use `viper` to get the values and load them into the bound configuration structs instead of using the flags to propagate the changes.
In this way, we can support whatever rich configuration we want in the config yaml files, while still getting command like flags when important.
* added scopelint and correcte issues found
* corrected scopelint issue
* made updates based on Ivan's suggestions
Most were around naming conventions
Some were false positives, but I kept them since the test.Run could eventually be changed to run in parallel, which could cause a bug
Others were false positives. Added // nolint: scopelint
* first round cleanup based on go-critic
* more issues resolved for ifelsechain and unlambda checks
* updated from master and gocritic found a new ifElseChain issue
* disable appendAssign. i reports false positives
* re-enabled go-critic appendAssign and disabled lint check at code line level
* fixed go-critic lint error
* fixed // nolint add gocritic specifically
What: Changes to support custom usage limit for the project. With this implementation by default project usage limit is taken from configuration flag. If project DB field usage_limit will be set to value larger than 0 it will become custom usage limit and we will be used to verify is limit was exceeded.
Whats changed:
usage_limit (bigint) field added to projects table (with migration)
things related to project usage moved from metainfo endpoint to project usage type
accounting.ProjectAccounting extended with GetProjectUsageLimits() method
Why: We need to have different usage limits per project. https://storjlabs.atlassian.net/browse/V3-1814
* add repair monkit stats
* rename values, use meter instead of counter, use success threshold instead of repair threshold
* Counter -> Meter
* add repair segment size
* update names and use ratios for healthy before/after repair
* restart jenkins
* add last_ip field to dbx model node, generate dbx
* add last_ip to node proto, generate pb
* migrate
* resolve address in transport.DialNode, update lastIp in cache.UpdateAddress
* use net.SplitHostPort to isolate host address from port
* define DistinctIPs flag
* add test for GetIP
* select last_ip when querying for nodes
* if distinctIPs flag == true, query for nodes with distinct IPs
* some basic tests
* change last_ip to field 14 in proto
* remove comments
* check err
* change distinctIPs to distinctIP
* exclude IPs from newNodes in query for reputable nodes
* add index on last_ip
* only add to excludedIPs if flag is true
* test half new nodes returns distinct IPs
* fix alignment
* add test
* rework ip filter query, add retry logic, add switch for database driver
* add retry to SelectNewNodes
* change discovery intervals so IPs don't get overwritten
* remove TestGetIP
* edit updating node stats in test
* split exclude into nodeIDs and IPs
* separate non-distinct IP query into other function
* trigger checks
* remove else block
* uplink: Add a new flag to set the filepath of the file which is used for
saving the encryption key and rename the one that hold the encryption key and
establish that it has priority over the key stored in the file to make the
configuration usable without having a huge refactoring in test-sim.
* cmd/uplink: Adapt the setup subcommand for storing the user input key to a file
and adapt the rest of the subcommands for reading the key from the key-file when
the key isn't explicitly set with a command line flag.
* cmd/gateway: Adapt it to read the encryption key from the key-file or use the
one passed by a command line flag.
* pkg/process: Export the default configuration filename so other packages which
use the same value can reference to it rather than having it hardcoded.
* Adapt several integrations (scripts, etc.) to consider the changes applied in uplink and cmd packages.
* repair no cutoff longtail
* commit repair pieces even if not hitting success threshold
* commit repair pieces even if not hitting success threshold
* remove useless condition
* better error message
* cmd/uplink: add share command to restrict an api key
This commit is an early bit of work to just implement restricting
macaroon api keys from the command line. It does not convert
api keys to be macaroons in general.
It also does not apply the path restriction caveats appropriately
yet because it does not encrypt them.
* cmd/uplink: fix path encryption for shares
It should now properly encrypt the path prefixes when adding
caveats to a macaroon.
* fix up linting problems
* print summary of caveat and require iso8601
* make clone part more clear
A previous change reused the same timeout for dialing as well as
requesting in order to speed up some tests. This change introduces
a distinct timeout so that the different operations can have different
timeouts.
Ran into difficulties trying to find the ideal solution for sharing
these counts between multiple satellite servers, so for now this is a
dumb solution storing recent space-usage changes in a big dumb in-memory
map with a big dumb lock around it. The interface used, though, should
allow us to swap out the implementation without much difficulty
elsewhere once we know what we want it to be.
The timeout tests were configured to use very short timeouts but
for some reason they took many seconds to complete. This commit
fixes two issues:
1. The transports were always using the default timeout rather than
the timeout specified.
2. The tests were possibly calling t.FailNow inside of the non-test
goroutine which causes it to exit, possibly losing an error from
the error group. Additionally, it didn't seem to be testing that
the error came back as a deadline wrapped in a transport error.
The tests run in ~3s instead of ~60s now.
* set all intervals to UTC in rollupStats map, only delete latest day after both rollups
* clean up usage of interval, use intervalEndTime rather than createdAt
* change some variable names, add comments
* add flag for tally deletion
* adds deletetallies flag to testplanet
* space
* Removes println:
* adds test for deletes false
* tie defaults to releases
this change makes it so that by default, the flag defaults are
chosen based on whether the build was built as a release build or
an ordinary build. release builds by default get release defaults,
whereas ordinary builds by default get dev defaults.
any binary can have its defaults changed by specifying
--defaults=dev
or
--defaults=release
Change-Id: I6d216aa345d211c69ad913159d492fac77b12c64
* make release defaults more clear
this change extends cfgstruct structs to support either
a 'default' tag, or a pair of 'devDefault' and 'releaseDefault'
tags, but not both, for added clarity
Change-Id: Ia098be1fa84b932fdfe90a4a4d027ffb95e249c6
* clarify cfgstruct.DefaultsFlag
Change-Id: I55f2ff9080ebbc0ce83abf956e085242a92f883e
this adds a flag to all pkg/process binaries where if set,
will write a process-level trace for all monkit-instrumented
functions as an svg to the target path.
usage like:
uplink cp --debug.trace-out=trace.svg file sj://bucket/file
Change-Id: I8f6bbfc488f0f1ac270cb28a61347c6b9698cea7
What: This change moves project-level bucket metadata encryption information to the volatile section, because it is unlikely to remain in future releases
Why: Ultimately, the web user interface will allow bucket management (creation, removal, etc), but not object management as that requires an encryption key for sure and we don't want to have users give the satellite their encryption keys.
At a high level, a (*Project) type should map to all of the things you can do inside the web user interface within a project, which by necessity cannot have an encryption key. So, we really don't want an encryption key in the non-volatile section of this library.
* Initial Webserver Draft for Version Controlling
* Rename type to avoid confusion
* Move Function Calls into Version Package
* Fix Linting and Language Typos
* Fix Linting and Spelling Mistakes
* Include Copyright
* Include Copyright
* Adjust Version-Control Server to return list of Versions
* Linting
* Improve Request Handling and Readability
* Add Configuration File Option
Add Systemd Service file
* Add Logging to File
* Smaller Changes
* Add Semantic Versioning and refuses outdated Software from Startup (#1612)
* implements internal Semantic Version library
* adds version logging + reporting to process
* Advance SemVer struct for easier handling
* Add Accepted Version Store
* Fix Function
* Restructure
* Type Conversion
* Handle Version String properly
* Add Note about array index
* Set temporary Default Version
* Add Copyright
* Adding Version to Dashboard
* Adding Version Info Log
* Renaming and adding CheckerProcess
* Iteration Sync
* Iteration V2
* linting
* made LogAndReportVersion a go routine
* Refactor to Go Routine
* Add Context to Go Routine and allow Operation if Lookup to Control Server fails
* Handle Unmarshal properly
* Linting
* Relocate Version Checks
* Relocating Version Check and specified default Version for now
* Linting Error Prevention
* Refuse Startup on outdated Version
* Add Startup Check Function
* Straighten Logging
* Dont force Shutdown if --dev flag is set
* Create full Service/Peer Structure for ControlServer
* Linting
* Straighting Naming
* Finish VersionControl Service Layout
* Improve Error Handling
* Change Listening Address
* Move Checker Function
* Remove VersionControl Peer
* Linting
* Linting
* Create VersionClient Service
* Renaming
* Add Version Client to Peer Definitions
* Linting and Renaming
* Linting
* Remove Transport Checks for now
* Move to Client Side Flag
* Remove check
* Linting
* Transport Client Version Intro
* Adding Version Client to Transport Client
* Add missing parameter
* Adding Version Check, to set Allowed = true
* Set Default to true, testing
* Restructuring Code
* Uplink Changes
* Add more proper Defaults
* Renaming of Version struct
* Dont pass Service use Pointer
* Set Defaults for Versioning Checks
* Put HTTP Server in go routine
* Add Versioncontrol to Storj-Sim
* Testplanet Fixes
* Linting
* Add Error Handling and new Server Struct
* Move Lock slightly
* Reduce Race Potentials
* Remove unnecessary files
* Linting
* Add Proper Transport Handling
* small fixes
* add fence for allowed check
* Add Startup Version Check and Service Naming
* make errormessage private
* Add Comments about VersionedClient
* Linting
* Remove Checks that refuse outgoing connections
* Remove release cmd
* Add Release Script
* Linting
* Update to use correct Values
* Change Timestamp handling
* Adding Protobuf changes back in
* Adding SatelliteDB Changes and adding Storj Node Version to PB
* Add Migration Table
* Add Default Stats for Creation
* Move to BigInt
* Proper SQL Migration
* Ensure minimum Version is passed to the node selection
* Linting...
* Remove VersionedClient and adjust smaller changes from prior merge
* Linting
* Fix PB Message Handling and Query for Node Selection
* some future-proofing type changes
Change-Id: I3cb5018dcccdbc9739fe004d859065992720caaf
* fix a compiler error
Change-Id: If66bb92d8b98e31cd618ecec9c6448ab9b037fa5
* Comment on Constant for Overlay
* Remove NOT NULL and add epoch call as function
* add versions to bootstrap and satellites
Change-Id: I436944589ea5f21600cdd997742a84fe0b16e47b
* Change Update Migration
* Fix DB Migration
* Increase Timeout temporarily, to see whats going on
* Remove unnecessary const and vars
Cleanup Function calls from deprecated NodeVersion struct
* Updated Protopuf, removed depcreated Code from Inspector
* Implement NodeVersion into InfoResponse
* Regenerated locked.go
* Linting
* Fix Tests
* Remove unnecessary constant
* Update Function and Flag Description
* Remove Empty Stat Creation
* return properly with error
* Remove unnecessary struct
* simplify migration step
* Update Inspector to return Version Info
* Update local Endpoint Version Handling
* Reset Travis Timeout
* Add Default for CommitHash
* single quotes
* update ProjectStorageTotals func to get all records for project
* fix var name
* update test to catch bug
* fix spelling
* modify query so we only have to make 1
Make separate "CreateCertificate" and "CreateSelfSignedCertificate"
functions to take the two roles of NewCert. These names should help
clarify that they actually make certificates and not just allocate new
"Cert" or "Certificate" objects.
Secondly, in the case of non-self-signed certs, require a public and a
private key to be passed in instead of two private keys, because it's
pretty hard to tell when reading code which one is meant to be the
signer and which one is the signee. With a public and private key, you
know.
(These are some changes I made in the course of the openssl port,
because the NewCert function kept being confusing to me. It's possible
I'm just being ridiculous, and this doesn't help improve readability for
anyone else, but if I'm not being ridiculous let's get this in)
* Initial Webserver Draft for Version Controlling
* Rename type to avoid confusion
* Move Function Calls into Version Package
* Fix Linting and Language Typos
* Fix Linting and Spelling Mistakes
* Include Copyright
* Include Copyright
* Adjust Version-Control Server to return list of Versions
* Linting
* Improve Request Handling and Readability
* Add Configuration File Option
Add Systemd Service file
* Add Logging to File
* Smaller Changes
* Add Semantic Versioning and refuses outdated Software from Startup (#1612)
* implements internal Semantic Version library
* adds version logging + reporting to process
* Advance SemVer struct for easier handling
* Add Accepted Version Store
* Fix Function
* Restructure
* Type Conversion
* Handle Version String properly
* Add Note about array index
* Set temporary Default Version
* Add Copyright
* Adding Version to Dashboard
* Adding Version Info Log
* Renaming and adding CheckerProcess
* Iteration Sync
* Iteration V2
* linting
* made LogAndReportVersion a go routine
* Refactor to Go Routine
* Add Context to Go Routine and allow Operation if Lookup to Control Server fails
* Handle Unmarshal properly
* Linting
* Relocate Version Checks
* Relocating Version Check and specified default Version for now
* Linting Error Prevention
* Refuse Startup on outdated Version
* Add Startup Check Function
* Straighten Logging
* Dont force Shutdown if --dev flag is set
* Create full Service/Peer Structure for ControlServer
* Linting
* Straighting Naming
* Finish VersionControl Service Layout
* Improve Error Handling
* Change Listening Address
* Move Checker Function
* Remove VersionControl Peer
* Linting
* Linting
* Create VersionClient Service
* Renaming
* Add Version Client to Peer Definitions
* Linting and Renaming
* Linting
* Remove Transport Checks for now
* Move to Client Side Flag
* Remove check
* Linting
* Transport Client Version Intro
* Adding Version Client to Transport Client
* Add missing parameter
* Adding Version Check, to set Allowed = true
* Set Default to true, testing
* Restructuring Code
* Uplink Changes
* Add more proper Defaults
* Renaming of Version struct
* Dont pass Service use Pointer
* Set Defaults for Versioning Checks
* Put HTTP Server in go routine
* Add Versioncontrol to Storj-Sim
* Testplanet Fixes
* Linting
* Add Error Handling and new Server Struct
* Move Lock slightly
* Reduce Race Potentials
* Remove unnecessary files
* Linting
* Add Proper Transport Handling
* small fixes
* add fence for allowed check
* Add Startup Version Check and Service Naming
* make errormessage private
* Add Comments about VersionedClient
* Linting
* Remove Checks that refuse outgoing connections
* Remove release cmd
* Add Release Script
* Linting
* Update to use correct Values
* Move vars private and set minimum default versions for testing builds
* Remove VersionedClient
* Better Error Handling and naked return removal
* Straighten the Regex and string conversion
* Change Check to allows testplanet and storj-sim to run without the
need to pass an LDFlag
* Cosmetic Change to Dashboard
* Cleanup Returns and remove commented code
* Remove Version Check if no build options are passed in
* Pass in Config Values instead of Pointers
* Handle missed Error
* Update Endpoint URL
* Change Type of Release Flag
* Add additional Logging
* Remove Versions Logging of other Services
* minor fixes
Change-Id: I5cc04a410ea6b2008d14dffd63eb5f36dd348a8b
* timeout calculation
* Initial Migration Draft to wipe the network gracefully (#1578)
* Initial Migration Draft to wipe the network gracefully
* Change from DATE to UNIX Timestamp (INT)
* SQL Derps
* Remove Satellite Migration - gets done manually
* Move from possible STRING to INT Vars
* Add TestTable for Migration Tests
* Adjust Test data
* Update TestData Part 2
* Update Wipeout to 28th 00:00
* merge spelling
We want to use those fields in the bucket-level Pointer objects as
bucket defaults, but we need to be able to get at them first.
I don't see any strong reason not to make these available, except
that it was kind of a pain.
* reorg uplink cmd files for consistency
* init implementation of usage limiting
* Revert "reorg uplink cmd files for consistency"
This reverts commit 91ced7639bf36fc8af1db237b01e233ca92f1890.
* add changes per CR comments
* fix custom query to use rebind
* updates per convo about what to limit on
* changes per comments
* fix syntax and comments
* add integration test, add db methods for test
* update migration, add rebind to query
* update testdata for psql
* remove unneeded drop index statement
* fix migrations, fix calculate usage limit
* fix comment
* add audit test back
* change methods to use bucketName/projectID, fix tests
* add changes per CR comments
* add test for uplink upload and err ssg
* changes per CR comments
* check get/put limit separately
* add MetadataSize to stats
* add logic for accumulating bucket stats in calculateAtRestData
* rename stats to BucketTally, move to accounting package
* define method on accountingDB for inserting bucketTallies
* insert bucketTallies into bucket_storage_tally table