Reverts https://review.dev.storj.io/c/storj/storj/+/2074
That change broke all billing commands with `sql: database is closed`.
Fixing the issue does not make sense as the whole setup of billing
commands is now error-prone. It's better to revert it.
Change-Id: Id13f817b73c7313ba60181f740b0712e4bdce077
setup
We don't have default ports as a part of configuration anymore because
satellite-addr flag was removed.
Change-Id: Ibf9fc4b399beaf51ebb9461de2d8994a322f9686
This changeset allows a user agent string to be set during uplink setup, which
is thereafter used for partner value attribution. EG
uplink setup --client.user-agent ”MyCompany”
Change-Id: Iefa8755fccc06acb8a303a342b943cece44a81f7
Remove unneeded observer's tests cases after the changes that enhanced
the segment-reaper not to be limited to objects up to 65 segments.
Also remove some tests cases that were already covered by the
TestObserver_processSegment_from_to test function.
Change-Id: I2385b406b50f7810ed22d3297b903a993f83fcfe
Viper has a feature/bug where all YAML config keys are cast
to lowercase (see https://github.com/spf13/viper/issues/260
and https://github.com/spf13/viper/issues/411).
Since we use Viper to load our config values, it doesn't seem
like there's an easy way to preserve case sensitivity for the
access names chosen by users right now. Adding this prompt
should help user experience by clarifying that all access
names must be lowercase.
Change-Id: I47e8344bb0ca7e78458405496f20e78e3c9f9a88
Rename a test function to indicate the function that it actually tests
rather than indicating the function that calls the function under test.
Change-Id: I2c1fe165e3acf952993a9a65dfbfbdcda0d1cccc
After the segment-reaper was enhanced to support objects with any number
of segments, the test helper function to generate test data doesn't have
to support to generate objects with a number of segments up to or over
the previous limit.
The commit remove that unnecessary test logic.
Change-Id: I2897c5a96fb133f61de2cdb0ef7d13ee5e0151e8
This allows to seeing logs in the output of the invoice commands.
Existing ensure-stripe-customer commands is moved from the 'reports' to
the new 'billing' root command.
Change-Id: I752c7ab6ca59bfac8e0f174a45d2ab45fc18e467
able to cover more testing scenarios
Currently, its hard to implement test suite for payments because
mockpayments is on to high level and we cannot emulate many things e.g.
adding credit card. This change is first to be able to add mock for
Stripe client and do more granular tests.
Change-Id: Ied85d4bd0642debdffe1161657c1e475202e9d23
To avoid including multiple months in a single invoice, we need all
inspector's invoice commands to run in for specific period.
See https://storjlabs.atlassian.net/browse/USR-725
Change-Id: I3637dc189234f02350daca8d897c21765762ea55
See https://storjlabs.atlassian.net/browse/SM-752
These changes allow us to change the log level at runtime through a handler off of the debug endpoint.
Examples of changing the log level on storj-sim
To get the current level for the satellite api process:
curl -XGET 'http://127.0.0.1:10009/logging' --header 'Content-Type: text/plain'
To change the log level:
curl -XPUT 'http://127.0.0.1:10009/logging' --header 'Content-Type: text/plain' --data-raw '{"level":"error"}'
Change-Id: I05d164b290929fa06b6d78c01075ee41f8238044
Reorganize some operations in the observer.processSegment method:
* to minimal reduce the memory usage removing the segments of the
objects marked as skipped. Skipped objects aren't discarded of the
analysis stage so the segments aren't needed.
* returns earlier when an object is skipped because isn't needed a
further processing.
Change-Id: I210a26c394477ee411ff7f640507dcc07733a47f
Rename a test name and an error class because the bitmask was renamed to
bitArray in a previous commit, however, they weren't updated.
Change-Id: I92afcf747dec9e0a15d4c59b896e07b942f3518b
The bitArray.IsSequence method had a bug that caused that sometimes the
test "Bitmask/IsSequence/sequence starts at other index than 0" to fail
due to it uses random values.
The test mostly found a corner case, I've added new tests to catch the
bug and then apply the fix.
Change-Id: If01dc07006d261ba40bbd99d36ef776e09ed3efc
Change the bitmask used by segment reaper to use []byte rather than uint64
This passes tests but I have literally no clue how to integration test this.
Change-Id: I393f4598b27cae6e427da2190dd3109bca721c34
Currently uploads can cause a lot of IOPS, reduce this by introducing a
in-memory buffer on-top of the file.
Change-Id: I5f4e3e01c0a36258271d180b922107de447bcb59
CreateTables hasn't been quite true for a while now, rename to
MigrateToLatest to be clearer in it's behavior.
Change-Id: Ida48e95122a5d9b7a814e922d3698e00024a2ba7
when tracing is enabled, we should also set sampling rate to
a non-zero value. For now, we will set it to 1.
Uplink CLI users should be able to override it with the sample
flag.
Change-Id: I8bcf514fb14c2a1c4349b7957dd24ec23e4a85e5
In cases like the segment reaper script connecting to the metainfodb,
we don't want a db migration to happen automatically when we call
metainfo.NewStore. This adds MigrateToLatest method for postgreskv
and cockroackv, and calls MigrateToLatest in places where NewStore used
to create tables.
Change-Id: I682d0f26d609af0601dfdb32a24866cdf5d32a7e
Previously we are using tracing.sampled to be the switch for turning on/off tracing.
However we would like to separate sampling rate from being the switch,
so we can set sampling rate to be 0 but still intialize tracing for
satellite and storagenodes
Change-Id: I27e6ba25ea6f6b612b4e1a57cf1301889ded41ec
We want to make tracing to be opt-in.
For now, we will use `tracing.sample` as the toggle config to enable or
disable tracing and default to sample every traces from uplink cli.
If user wants to change the default sampling rate, they can do so by
using the `--tracing.sample` flag to override the default value
Change-Id: I6f25dac0f43024c50a8aaf6c549e6a514211f834
currently production uses a different application suffix for gc
services, so chronograf can distinguish between gc processes and core
processes, but it'd be nice to be a bit more consistent with repairers
and api servers
Change-Id: Icb96fed006c59d7afd730317d35636a6e4573b58
Currently storj-sim relies on the log lines to be exactly the same,
when they change it cannot find the necessary information from log.
Change-Id: Ia039915ef3375a7cf60f107b2c05c958de15b6d5
uuid.UUID implements driver.Value so it can be directly used as a
scannable result.
Replace uses of dbutil.BytesToUUID with uuid.FromBytes.
Change-Id: I51a670185ceb3cc2199d5aa2b76bc3fc191ca8fe
Previous split to a storj.io/private repository broke tag-release.sh
script. This is the minimal temporary fix to make things work.
This links the build information to specified variables and sets them
inline. This approach, of course, is very fragile.
Change-Id: I73db2305e6c304146e5a14b13f1d917881a7455c
* debug
* traces
* cfgstruct
* process
Package `storj/private/version` will be removed as a separate change.
Change-Id: Iadc40faa782e6225513b28218952f02d9c240a9f
We are not generating identity for gateway during setup anymore, its
generated on-fly by libuplink. With this change we can remove
`--identity-dir` from gateway.
Change-Id: I63298115d4399564b2f29541b8dc16e3b3acdcf2
The admission/v3 protocol now supports arbitrary key/value headers to be
included in each packet of metrics. This commit creates support for
this, so the lua config file can declare a filter taking into account
the key/value headers.
Change-Id: I41de8c018d33304ccf46ec221ae689d55c5fb1ee
Previously, we were simply discarding rows from the repair queue when
they couldn't be repaired (either because the overlay said too many
nodes were down, or because we failed to download enough pieces).
Now, such segments will be put into the irreparableDB for further
and (hopefully) more focused attention.
This change also better differentiates some error cases from Repair()
for monitoring purposes.
Change-Id: I82a52a6da50c948ddd651048e2a39cb4b1e6df5c
New API has limited number of options to configure at the moment. We
should remove unused flags from Uplink CLI and add if needed in the
future.
Change-Id: Icf3f3dadd43cb61a3b408b02d0762aef34425dbf
On satellite, remove all references to free_bandwidth column in nodes table.
On storage node, remove references to AllocatedBandwidth and MinimumBandwidth and mark as deprecated.
Protobuf message, NodeCapacity, is left intact for backwards compatibility.
Once this is released to all satellites, we can drop the column from the DB.
Change-Id: I2ff6c6537fc9008a0c5588e951afea58ede85838
By using a require for storj.io/storj it will make the import
unambiguous. This means it is possible to have a module name
storj.io/storj/cmd/gateway.
Change-Id: I98439cbbaf433ae31309b7f80a19ced896018f65
Now that we are trying to identify the root cause of the satellite load limitations (i.e. currently the satellite has a max ability of 400 rps for uploads and we need this to be higher), we are using the golang diagnostic tools to collect insight into what the bottlenecks are. We currently have a debug endpoint to gather some cpu and mem data, but it could be useful to have continuous profiling. GCP stackdriver has support for continuous profiling so lets set that up and see if it is helpful to gather more data.
This PR adds support for [GCP continuous profiler](https://cloud.google.com/profiler) which allows enabling continuous cpu/mem profiling and the stats are sent to stackdriver in google cloud console.
To enable the continuous profiling for a storj component, do the following:
- prereq: the workload must be running in GKE and have Stackdriver Profiling IAM role permissions
- provide the config flag `debug.profilename` in the config.yaml file for the workload (i.e. satellite api process, etc). The profilename should be the workload name, for example "satellite-api".
- once the above config flag is provided, the profiler will be initialized and profiling stats will automatically be sent to GCP project where the workload is running and viewable in the Stackdriver Profile page in the console
The current implementation assumes the workload is running in GKE, however if we find if useful we can add support to enable this from anywhere. But for simplicity, its configured this way assuming the main goal is to enable in production systems.
Change-Id: Ibf8ebe2df7bf06fdd4951ee6a1e48854dd36ad47
The code to generate monkit.lock has a bug where it doesn't take
ScopeNamed into account and assumes the package. Since the downgrade
file was created from monkit.lock, we also assumed the package, so
we were downgrading to the wrong metric.
No other places call ScopeNamed that would cause a problem.
Change-Id: If9fbbd971a7d755f5de33ed20b8a6bcc95670ee3
This peer will contain our administrative panels.
It's completely separated from our other satellite
processes because it allows better control for restricting
access to it.
Change-Id: Ifca473bee82ff6c680b346918ba32b835a7a6847
In case the endpoint doesn't start, it might end up indefinitely
waiting for it to come up stalling jenkins.
Change-Id: Ib10bf1a25461e7532ec56ca705178bc9a7f85d12
this commit updates our monkit dependency to the v3 version where
it outputs in an influx style. this makes discovery much easier
as many tools are built to look at it this way.
graphite and rothko will suffer some due to no longer being a tree
based on dots. hopefully time will exist to update rothko to
index based on the new metric format.
it adds an influx output for the statreceiver so that we can
write to influxdb v1 or v2 directly.
Change-Id: Iae9f9494a6d29cfbd1f932a5e71a891b490415ff
Currently we risk losing pending bandwidth rollup writes even on a clean
shutdown. This change ensures that all pending writes are actually
written to the db when shutting down the satellite.
Change-Id: Ideab62fa9808937d3dce9585c52405d8c8a0e703
Setup command of uplink has to create the configuration directory just
before saving the configuration file for making it more robust than
creating in the initial state of the process.
When creating the directory at the beginning of the process leaves the
possibility to delete such directory during the setup process and leads
to a failure.
Ticket https://storjlabs.atlassian.net/browse/V3-3545
Change-Id: I30db0175e23a597e9675d267b4d7e25d5d4c5119
storagenode database preflight check.
Disable preflight database check by default, and have the option to
enable it. This will allow us to enable it once it is definitely
working.
Also change the name of the config flag for preflight time sync.
Change-Id: Ie2e20f9e25dcb38794eafa7e1505e7c6ff287c99
for storagenode
Ensure that database schema matches latest test migration schema before
allowing the node to start up.
Ensure minimal read/write functionality for each storagenode database
before allowing the node to start up.
This will eliminate many unhandled audit errors we are seeing.
Change-Id: Ic0e628b04a9c35b7a8243f6a81d4683918170ba9
this commit introduces the reported_serials table. its purpose is
to allow for blind writes into it as nodes report in so that we have
minimal contention. in order to continue to accurately account for
used bandwidth, though, we cannot immediately add the settled amount.
if we did, we would have to give up on blind writes.
the table's primary key is structured precisely so that we can quickly
find expired orders and so that we maximally benefit from rocksdb
path prefix compression. we do this by rounding the expires at time
forward to the next day, effectively giving us storagenode petnames
for free. and since there's no secondary index or foreign key
constraints, this design should use significantly less space than
the current used_serials table while also reducing contention.
after inserting the orders into the table, we have a chore that
periodically consumes all of the expired orders in it and inserts
them into the existing rollups tables. this is as if we changed
the nodes to report as the order expired rather than as soon as
possible, so the belief in correctness of the refactor is higher.
since we are able to process large batches of orders (typically
a day's worth), we can use the code to maximally batch inserts into
the rollup tables to make inserts as friendly as possible to
cockroach.
Change-Id: I25d609ca2679b8331979184f16c6d46d4f74c1a6
JIRA: https://storjlabs.atlassian.net/browse/V3-3499
The `uplink share` command does not print the restricted API key and the
restricted encryption access anymore.
Change-Id: Ie4ebe0b27067ee00af97c775f4e06f558b894fe2
We want to make using uplink as easy as possible. That's why we wan't to
avoid requiring setup or import command before normal usage if user
specified --access flag. If this flag is set then rest flags should be
set as defaults.
https://storjlabs.atlassian.net/browse/V3-3490
Change-Id: I95a7bd77a3f00b8d9981fee513e9e77aef298bca
We decided that better name for "scope" will be "access". This change
refactors cmd part of code but don't touch libuplink. For backward
compatibility old configs with "scope" field will be loaded without any
issue. Old flag "scope" won't be supported directly from command line.
https://storjlabs.atlassian.net/browse/V3-3488
Change-Id: I349d6971c798380d147937c91e887edb5e9ae4aa
Remove starting up messages from peers. We expect all of them to start,
if they don't, then they should return an error why they don't start.
The only informative message is when a service is disabled.
When doing initial database setup then each migration step isn't
informative, hence print only a single line with the final version.
Also use shorter log scopes.
Change-Id: Ic8b61411df2eeae2a36d600a0c2fbc97a84a5b93
As per discussed we decided to rate limit how fast we iterate through
the metainfo database in the metainfo loop. This puts in place a
mechanism for rate limiting and burst limiting if need be in the future.
The default for this rate limiting is still no limits so it stays the
same as our previous functionality.
Change-Id: I950f7192962b0e49f082d2c4284e2d52b0a925c7
Fixes Least Authority Issue F:
https://storjlabs.atlassian.net/browse/V3-3409
If the --allowed-path-prefix flag is not set to the `share` command, any
command arguments will be used as allowed path prefixes.
This patch also improves the output of the `share` command to print the
state of all restrictions, so users can confirm they match their
intention.
Change-Id: Id1b4df20b182d3fe04cb2196feea090975fce8b4
flate compression with default settings plays very poorly
together with race, causing test to take a significant amount
of time.
Use pass-through compression to avoid the issue.
Improves test from 2m45s to 17s.
Change-Id: Iadf1381c538736d48e018164697bdfd3356e24b8
This change updates the three satellite report commands that accept date
ranges to parse and treat those dates uniformly.
- End dates are now uniformly exclusive. Exclusive end dates helps
operators avoid one-off errors on month boundaries, as in the operator
does not have to remember how many days are in that month and can just
run the report from the 1st (inclusive) through the 1st (exclusive).
- Fixed the date range validity check which only failed if the start
date came after the end date (it should have failed dates that were
equal since the check happened after adjusting for inclusivity).
Change-Id: Ib2ee1c71ddb916c6e1906834d5ff0dc47d1a5801
- also updated ping chore to pick up trust changes
- fixed small typo in blueprint
- fixed flags for storj-sim
- wired up changes to testplanet
Change-Id: I02982f3a63a1b4150b82a009ee126b25ed51917d
Add randomized test cases for testing the observer processSegment method
when the observer has a time range (from and to) set and there are
objects with segments whose creation date is outside of this range.
Change-Id: Ieac82a21f278f0850b95275bfdd2e8a812cc57a5
the old code to eventually create an api key on a satellite had
some issues. namely, there were some ignored errors, swallowed
errors, incorrect returns of nil errors when there should have
been an error, and it did not handle working against an already
existing database.
this commit fixes the above issues and organizes the code into
a set of methods performing individual steps rather than one big
function. it adds retries and attempts to get existing values
instead of creating them when possible, which means that it will
work if the values already exist. additionally, it removes the
3 second sleep in favor of a bounded retry loop with a small sleep
which improves startup times.
Change-Id: I4de04659e5a62dd3f675fbf3c76f3311c410a03e
After changing how we execute the storagenode-updater process we lost
timestamps in the log.
The fix is to start using zap logging.
The Windows Installer is changed to register the storagenode-updater
service in a way that the Windows Service Manager passes the
--log.output flag instead of the old --log.
The old --log flag is deprecated, but not removed. We will support it
for backward compatibility. This is required as the storagenode-updater
can auto-updated itself, but the Windows Service Manager of this old
installtion will continue passing the old --log flag when starting it.
Change-Id: I690dff27e01335e617aa314032ecbadc4ea8cbd5
Signed-off-by: Kaloyan Raev <kaloyan@storj.io>
Implement some unit test cases for the observer.processSegment method and fix a bug found by these tests.
A production snapshot is more certain but it's huge for having in the repository and run the test with it by the CI.
We want to have tests for the different cases to detect zombie segments and relaying on production data cannot guarantee to have all of them.
NOTE the test has been implemented with random values for not having always the same combination of segments list and the same values avoiding that the implementation gets stale due to the test. The issue about this is that it's harder to understand and we could get only sometimes failures in the CI in case that the implementation has some bug.
The random tests allow to eventually check cases that a static test may now cover because it is not expressed or because it is needed to implement a large number of cases.
Because we are worried that the test implementation is complex and we could have bugs on it despite that the same bugs should exist in both, the implementation and the test, moreover that we have to consider that the implementation and the tests have been written by different people. Because of that, we may replace entirely these random tests by a list of static ones.
for storj-sim to work, we need to avoid schemas in cockroach urls
so we have storj-sim create namespaced databases instead of schemas
and we have the migrate command create the database in the same way
that it would create a schema for postgres. then it works!
a follow up commit will move the creation of the database/schemas
into storj-sim's setup step so that we can avoid doing these icky
creations during normal migration calls. it will also make the
pointerdb have an explicit call to migrate instead of just doing
it every time it's opened.
Change-Id: If69ef5cb96b6866b0438c761bd445afb3597ae5f
* Move the observer implementation and the type definitions related with
it and helper functions to its own file.
* Cluster struct type is used as a key for a ObjectsMap type.
Observer struct type has a field of ObjectsMap.
Cluster has a field for project ID.
Observer processSegment method uses its ObjectMap field for tracking
objects.
However Observer processSegment clears the map once the projectID
diverges from the one kept in the Observer lastProjectID field, meaning
that it isn't needed to keep the projectID as part of the ObjectMap key.
For this reason, ObjectMap can use as a key just only the bucket name
and Cluster struct isn't needed.
Because of such change, the ObjectMap type has been renamed to a more
descriptive name.
* Make the types defined for this specific package not being exported.
* Create a constructor function for observer to encapsulate the map
allocation.
* Don't throw away the entirely buckets objects map when an empty one
is used to reuse part of the allocations.
Encapsulate the clearing up logic into a method.
* Make the analyzeProject function to be a method of observer.
* change satellite.Peer name to Core
* change to Core in testplanet
* missed a few places
* keep shared stuff in peer.go to stay consistent with storj/docs
* separate sadb migration, add version check
* update checkversion to do same validation as migration
* changes per CR
* add sa migration to storj-sim
* add different debug port in storj-sim for migration
* add wait for exit for storj-sim migration
* update sa docker entrypoint to support migration
* storj-sim satellite parts all wait for migration
* upgrade golang-migrate/migrate to v4 because bug
* fix go mod tidy
* rm dup api code from sa peer, update storj-sim
* fix for backwards compat tests
* use env var instead of localhost
* changes per CR
* fix env var name
* skip peer for setup
* set up satellite repair run command
* add separated repair process to storj-sim
* add repairer peer to satellite in testplanet
* move api run cmd into api.go
* add satellite run repair to entrypoint
When the contact chore starts running before the monitor service has
provided any useful capacity data, the first outgoing contact has
not-very-helpful data for the satellite. This change causes the contact
chore to wait until capacity data is available. The wait should be quite
short in all reasonable cases: even when a node starts with a lot of
stored pieces and no cached spaceUsedDB data, new data will have been
calculated and cached by the call to
`peer.Storage2.CacheService.Init(ctx)` in `storagenode.cmdRun()` before
`peer.Run(ctx)`.
Change-Id: Ibc26d5c1fc10a23006c00bc3f13ff6cf71f8bf1d
* update lock file and add comment
* add created at and bytes transferred
* cleanup
* rename db func to GetGracefulExitNodesByTimeFrame
* fix flag
* split into two overlay functions
* := to =
* fix test
* add node not found error class
* fix overlay test
* suggested test changes
* review suggestions
* get exit status from overlay.Get()
* check rows.Err
* fix panic when ExitFinishedAt is nil
* fix comments in cmdGracefulExit
libuplink was incorrectly setting timeouts to 10 seconds still, but
should have been at least 10 minutes. the order sender was setting them
to 1 hour. we don't want timeouts in uplink-side logic as it establishes
a minimum rate on tcp streams.
instead of all of this, just use tcp keep alive. tcp keep alive packets are
sent every 15 seconds and if the peer stops responding the connection
dies. this is enabled by default with go. this will kill tcp connections
when they stop working.
Change-Id: I3d7ad49f71950b3eb43044eedf4b17993116045b
* set up redis support in live accounting
* move live.Service interface into accounting package and rename to Cache, pass into satellite
* refactor Cache to store one int64 total, add IncrBy method to redis client implementation
* add monkit tracing to live accounting
* add exit-status command
* remove todo and fix format
* fix status display
* change startExit to exit progress
* fix linting error
* add successful column in exit progress
* fix test
* remove extra new line
* fix TYPOS
* format the percentage better
all of the packages and tests work with both grpc and
drpc. we'll probably need to do some jenkins pipelines
to run the tests with drpc as well.
most of the changes are really due to a bit of cleanup
of the pkg/transport.Client api into an rpc.Dialer in
the spirit of a net.Dialer. now that we don't need
observers, we can pass around stateless configuration
to everything rather than stateful things that issue
observations. it also adds a DialAddressID for the
case where we don't have a pb.Node, but we do have an
address and want to assert some ID. this happened
pretty frequently, and now there's no more weird
contortions creating custom tls options, etc.
a lot of the other changes are being consistent/using
the abstractions in the rpc package to do rpc style
things like finding peer information, or checking
status codes.
Change-Id: Ief62875e21d80a21b3c56a5a37f45887679f9412
* storagenode/storagenodedb: Migrate to separate dbs
* storagenode/storagenodedb: Add migration to drop versions tables
* Put drop table statements into a transaction.
* Fix CI errors.
* Fix CI errors.
* Changes requested from PR feedback.
* storagenode/storagenodedb: fix tx commit
* test that all nodes can check in with all satellites
* keep kademlia config
* add untrusted satellite test
* use getversion
* remove kademlia config changes in test-sim-backwards.sh
* add kademlia flags back to storj-sim storagenode
* reset kademlia flags in storagenode entrypoint
What:
cmd/inspector/main.go: removes kad commands
internal/testplanet/planet.go: Waits for contact chore to finish
satellite/contact/nodesservice.go: creates an empty nodes service implementation
satellite/contact/service.go: implements Local and FetchInfo methods & adds external address config value
satellite/discovery/service.go: replaces kad.FetchInfo with contact.FetchInfo in Refresh() & removes Discover()
satellite/peer.go: sets up contact service and endpoints
storagenode/console/service.go: replaces nodeID with contact.Local()
storagenode/contact/chore.go: replaces routing table with contact service
storagenode/contact/nodesservice.go: creates empty implementation for ping and request info nodes service & implements RequestInfo method
storagenode/contact/service.go: creates a service to return the local node and update its own capacity
storagenode/monitor/monitor.go: uses contact service in place of routing table
storagenode/operator.go: moves operatorconfig from kad into its own setup
storagenode/peer.go: sets up contact service, chore, pingstats and endpoints
satellite/overlay/config.go: changes NodeSelectionConfig.OnlineWindow default to 4hr to allow for accurate repair selection
Removes kademlia setups in:
cmd/storagenode/main.go
cmd/storj-sim/network.go
internal/testplane/planet.go
internal/testplanet/satellite.go
internal/testplanet/storagenode.go
satellite/peer.go
scripts/test-sim-backwards.sh
scripts/testdata/satellite-config.yaml.lock
storagenode/inspector/inspector.go
storagenode/peer.go
storagenode/storagenodedb/database.go
Why: Replacing Kademlia
Please describe the tests:
• internal/testplanet/planet_test.go:
TestBasic: assert that the storagenode can check in with the satellite without any errors
TestContact: test that all nodes get inserted into both satellites' overlay cache during testplanet setup
• satellite/contact/contact_test.go:
TestFetchInfo: Tests that the FetchInfo method returns the correct info
• storagenode/contact/contact_test.go:
TestNodeInfoUpdated: tests that the contact chore updates the node information
TestRequestInfoEndpoint: tests that the Request info endpoint returns the correct info
Please describe the performance impact: Node discovery should be at least slightly more performant since each node connects directly to each satellite and no longer needs to wait for bootstrapping. It probably won't be faster in real time on start up since each node waits a random amount of time (less than 1 hr) to initialize its first connection (jitter).