Don't abbreviate multinode in the command help message because there
isn't a need for it and the abbreviation isn't clear at all.
Change-Id: I7a1f2be6ae1f7d4b287c18c48b22c630549b731f
Now that we have both the storagenode and updater processes running
in a single docker container, we need a way to know which log entry
is logged by any of the processes.
This change includes a Process field in the log entries.
Resolves https://github.com/storj/storj/issues/4648
Change-Id: I167b9ab65728a41136d264b5fe2c41bb64ed1785
Change the order of when the storage node setup node loads the identity
for avoiding to write anything in the disk in the case that there is an
error loading the identity.
This bug was reported by @onionjake Github username's and the specific
changes to make.
Closes#4387#4396
Change-Id: I360fff3c23b160c9e055203d3526d749edfd9129
Currently the address being used is most of the time just :28967, which is not the correct address to reach the node from the public on.
This change uses the designated contact external address value that contains the set and preferred way to reach the node.
Change-Id: I99e979c2541043755b81e65c36c4289bfa3f60f3
The info command prints the details of the storagenode
to stdout.
It returns the storagenode info in JSON format
if --json flag is specified which can be piped
to the multinode add command.
Change-Id: I0163db8e02c4ec7346bfa69274d1772669357c6c
func init() code isn't that well defined and reordering of them
could cause problems when starting the whole process from it.
Change-Id: I4088a0db156ece15354877011a481f6f91c9b332
Currently TextMaxVerifyCount flakes in some tests, try increasing the
sleep time to ensure that things are slow enough to trigger the error
condition.
Also pass ctx to all the funcs so we can handle sleep better.
Change-Id: I605b6ea8b14a0a66d81a605ce3251f57a1669c00
Initially there were pkg and private packages, however for all practical
purposes there's no significant difference between them. It's clearer to
have a single private package - and when we do get a specific
abstraction that needs to be reused, we can move it to storj.io/common
or storj.io/private.
Change-Id: Ibc2036e67f312f5d63cb4a97f5a92e38ae413aa5
Previously, we created a new file to use for directory verification
every time the storage node starts. This is not helpful if the storage node
points to the wrong directory when restarting. Now we will only create the file
on setup. Now the file should be created only once and will be verified at
runtime.
Change-Id: Id529f681469138d368e5ea3c63159befe62b1a5b
log.Fatal immediately terminates the program without running any defers.
We should properly close all the services and databases.
Change-Id: I5e959cef3eafedeacb3a2062e3da47e8d04e8e75
To prevent storagenode from implicitly recreating missing dbs and storage,
as such behaviour leads to audit failures. Do not allow storagenode to
start if any of dbs or storage is missing, corrupted, or dedicated storage disk is
unmounted, to get downtime instead.
Change-Id: Ic64e1f0ff4d8ef5b2fddbe7a7e53df4f4bd8652e
Add a long description to the graceful exit command to clearly mention
that the command is interactive asking which satellite the SNO wants to
exit.
Change-Id: Icd4056a470e707322f600133e63d9dc56eb877b7
See https://storjlabs.atlassian.net/browse/SM-752
These changes allow us to change the log level at runtime through a handler off of the debug endpoint.
Examples of changing the log level on storj-sim
To get the current level for the satellite api process:
curl -XGET 'http://127.0.0.1:10009/logging' --header 'Content-Type: text/plain'
To change the log level:
curl -XPUT 'http://127.0.0.1:10009/logging' --header 'Content-Type: text/plain' --data-raw '{"level":"error"}'
Change-Id: I05d164b290929fa06b6d78c01075ee41f8238044
Currently uploads can cause a lot of IOPS, reduce this by introducing a
in-memory buffer on-top of the file.
Change-Id: I5f4e3e01c0a36258271d180b922107de447bcb59
CreateTables hasn't been quite true for a while now, rename to
MigrateToLatest to be clearer in it's behavior.
Change-Id: Ida48e95122a5d9b7a814e922d3698e00024a2ba7
Previous split to a storj.io/private repository broke tag-release.sh
script. This is the minimal temporary fix to make things work.
This links the build information to specified variables and sets them
inline. This approach, of course, is very fragile.
Change-Id: I73db2305e6c304146e5a14b13f1d917881a7455c
* debug
* traces
* cfgstruct
* process
Package `storj/private/version` will be removed as a separate change.
Change-Id: Iadc40faa782e6225513b28218952f02d9c240a9f
storagenode database preflight check.
Disable preflight database check by default, and have the option to
enable it. This will allow us to enable it once it is definitely
working.
Also change the name of the config flag for preflight time sync.
Change-Id: Ie2e20f9e25dcb38794eafa7e1505e7c6ff287c99
for storagenode
Ensure that database schema matches latest test migration schema before
allowing the node to start up.
Ensure minimal read/write functionality for each storagenode database
before allowing the node to start up.
This will eliminate many unhandled audit errors we are seeing.
Change-Id: Ic0e628b04a9c35b7a8243f6a81d4683918170ba9
* add exit-status command
* remove todo and fix format
* fix status display
* change startExit to exit progress
* fix linting error
* add successful column in exit progress
* fix test
* remove extra new line
* fix TYPOS
* format the percentage better
* storagenode/storagenodedb: Migrate to separate dbs
* storagenode/storagenodedb: Add migration to drop versions tables
* Put drop table statements into a transaction.
* Fix CI errors.
* Fix CI errors.
* Changes requested from PR feedback.
* storagenode/storagenodedb: fix tx commit
What:
cmd/inspector/main.go: removes kad commands
internal/testplanet/planet.go: Waits for contact chore to finish
satellite/contact/nodesservice.go: creates an empty nodes service implementation
satellite/contact/service.go: implements Local and FetchInfo methods & adds external address config value
satellite/discovery/service.go: replaces kad.FetchInfo with contact.FetchInfo in Refresh() & removes Discover()
satellite/peer.go: sets up contact service and endpoints
storagenode/console/service.go: replaces nodeID with contact.Local()
storagenode/contact/chore.go: replaces routing table with contact service
storagenode/contact/nodesservice.go: creates empty implementation for ping and request info nodes service & implements RequestInfo method
storagenode/contact/service.go: creates a service to return the local node and update its own capacity
storagenode/monitor/monitor.go: uses contact service in place of routing table
storagenode/operator.go: moves operatorconfig from kad into its own setup
storagenode/peer.go: sets up contact service, chore, pingstats and endpoints
satellite/overlay/config.go: changes NodeSelectionConfig.OnlineWindow default to 4hr to allow for accurate repair selection
Removes kademlia setups in:
cmd/storagenode/main.go
cmd/storj-sim/network.go
internal/testplane/planet.go
internal/testplanet/satellite.go
internal/testplanet/storagenode.go
satellite/peer.go
scripts/test-sim-backwards.sh
scripts/testdata/satellite-config.yaml.lock
storagenode/inspector/inspector.go
storagenode/peer.go
storagenode/storagenodedb/database.go
Why: Replacing Kademlia
Please describe the tests:
• internal/testplanet/planet_test.go:
TestBasic: assert that the storagenode can check in with the satellite without any errors
TestContact: test that all nodes get inserted into both satellites' overlay cache during testplanet setup
• satellite/contact/contact_test.go:
TestFetchInfo: Tests that the FetchInfo method returns the correct info
• storagenode/contact/contact_test.go:
TestNodeInfoUpdated: tests that the contact chore updates the node information
TestRequestInfoEndpoint: tests that the Request info endpoint returns the correct info
Please describe the performance impact: Node discovery should be at least slightly more performant since each node connects directly to each satellite and no longer needs to wait for bootstrapping. It probably won't be faster in real time on start up since each node waits a random amount of time (less than 1 hr) to initialize its first connection (jitter).
* add cache, update cache w/piece create/delete
* add service w/loop to cache to recalculate space used cache
* add piecestore cache to other sn svcs to use
* add table to persist the total space used
* rm cache where not needed
* rm stuff from sn svcs
* start fixing tests, changes per comments
* update commits
* add unit tests
* fix commiting before we write header bytes
* fix cache create test
* copy cache map, add started back to recalc
* fix test
* add test, update comments
What: Change cmd/uplink to use scopes
It moves the fields that will be subsumed by scopes into an explicit legacy section and hides their configuration flags.
Why: So that it can read scopes in from files and stuff
* change BindSetup to be an option to Bind
* add process.Bind to allow composite structures
* hack fix for noprefix flags
* used tagged version of structs
Before this PR, some flags were created by calling `cfgstruct.Bind` and having their fields create a flag. Once the flags were parsed, `viper` was used to acquire all the values from them and config files, and the fields in the struct were set through the flag interface.
This doesn't work for slices of things on config structs very well, since it can only set strings, and for a string slice, it turns out that the implementation in `pflag` appends an entry rather than setting it.
This changes three things:
1. Only have a `Bind` call instead of `Bind` and `BindSetup`, and make `BindSetup` an option instead.
2. Add a `process.Bind` call that takes in a `*cobra.Cmd`, binds the struct to the command's flags, and keeps track of that struct in a global map keyed by the command.
3. Use `viper` to get the values and load them into the bound configuration structs instead of using the flags to propagate the changes.
In this way, we can support whatever rich configuration we want in the config yaml files, while still getting command like flags when important.