* change satellite.Peer name to Core
* change to Core in testplanet
* missed a few places
* keep shared stuff in peer.go to stay consistent with storj/docs
* rm dup api code from sa peer, update storj-sim
* fix for backwards compat tests
* use env var instead of localhost
* changes per CR
* fix env var name
* skip peer for setup
* add overall failure percentage check and inactive time frame check before sending a response to sno
* update comment
* delete node from transfer queue if it has been inactive for too long
* fix linting error
* add test config value
* fix nil pointer
* add config value into testplanet
* add unit test for overall failure threshold
* move timeframe threshold to chore
* update protolock
* add chore test
* add per peiece failure count logic
* change config name from EndpointMaxFailures to MaxFailuresPerPiece
* address comments
* fix linting error
* add error handling for no row returned from progress table
* fix test for graceful exit chore on storagenode
* fix typo InActive -> Inactive
* improve readability for failure threshold calculation
* update config lock
* change error handling for GetProgress in graceful exit endpoint on the satellite side
* return proper rpc error in endpoint
* add check in chore test for checking finish timestamp and queue
* add metrics counter and chore
* updates metrics observer interval release default and dev default to 15min
* add more specific check for remote pointers
* add Counter field to metrics chore, add counter tests
* rm redundant ObjectCount suffix
* make pointer check easier to read
* change metrics.Config.Interval to ChoreInterval
* rm unneeded var
* fix comment
* update satellite config lock
* set up redis support in live accounting
* move live.Service interface into accounting package and rename to Cache, pass into satellite
* refactor Cache to store one int64 total, add IncrBy method to redis client implementation
* add monkit tracing to live accounting
* test that all nodes can check in with all satellites
* keep kademlia config
* add untrusted satellite test
* use getversion
* remove kademlia config changes in test-sim-backwards.sh
* add kademlia flags back to storj-sim storagenode
* reset kademlia flags in storagenode entrypoint
What:
cmd/inspector/main.go: removes kad commands
internal/testplanet/planet.go: Waits for contact chore to finish
satellite/contact/nodesservice.go: creates an empty nodes service implementation
satellite/contact/service.go: implements Local and FetchInfo methods & adds external address config value
satellite/discovery/service.go: replaces kad.FetchInfo with contact.FetchInfo in Refresh() & removes Discover()
satellite/peer.go: sets up contact service and endpoints
storagenode/console/service.go: replaces nodeID with contact.Local()
storagenode/contact/chore.go: replaces routing table with contact service
storagenode/contact/nodesservice.go: creates empty implementation for ping and request info nodes service & implements RequestInfo method
storagenode/contact/service.go: creates a service to return the local node and update its own capacity
storagenode/monitor/monitor.go: uses contact service in place of routing table
storagenode/operator.go: moves operatorconfig from kad into its own setup
storagenode/peer.go: sets up contact service, chore, pingstats and endpoints
satellite/overlay/config.go: changes NodeSelectionConfig.OnlineWindow default to 4hr to allow for accurate repair selection
Removes kademlia setups in:
cmd/storagenode/main.go
cmd/storj-sim/network.go
internal/testplane/planet.go
internal/testplanet/satellite.go
internal/testplanet/storagenode.go
satellite/peer.go
scripts/test-sim-backwards.sh
scripts/testdata/satellite-config.yaml.lock
storagenode/inspector/inspector.go
storagenode/peer.go
storagenode/storagenodedb/database.go
Why: Replacing Kademlia
Please describe the tests:
• internal/testplanet/planet_test.go:
TestBasic: assert that the storagenode can check in with the satellite without any errors
TestContact: test that all nodes get inserted into both satellites' overlay cache during testplanet setup
• satellite/contact/contact_test.go:
TestFetchInfo: Tests that the FetchInfo method returns the correct info
• storagenode/contact/contact_test.go:
TestNodeInfoUpdated: tests that the contact chore updates the node information
TestRequestInfoEndpoint: tests that the Request info endpoint returns the correct info
Please describe the performance impact: Node discovery should be at least slightly more performant since each node connects directly to each satellite and no longer needs to wait for bootstrapping. It probably won't be faster in real time on start up since each node waits a random amount of time (less than 1 hr) to initialize its first connection (jitter).
* Adjust arm32v6 and aarch64 docker images to match the hello-world image
* Update from master, fix potential bug in push-images target, and update storagenode deploy to handle arm64 image
Creates a new chore, dbcleanup, which can be used for routine deletion of items from the satellite database and adds functionality for deletion of expired serial numbers
* + add git worktree cleanup
+ specify commit in git worktree add
* fail backwards compatibility test if no postgres
* fix typo
* rm dir after worktree rm
* add jenkins runs backwards compat test, fix sa config issue
* try to use current branch
* add jenkins stage condition of branch name
* try when with branch env var
* only run test on master branch
* fix != to ==
* add comment, fix indentation
* run for all branches
* make comment more specific
* add stage to jenkins, add script for backwards compat tests
* debug backwards compat tests
* add setup in a script
* set env vars
* add more env ars
* mv to one script to debug
* debug api key problem
* add tmp dir
* rm print statements
What: Change cmd/uplink to use scopes
It moves the fields that will be subsumed by scopes into an explicit legacy section and hides their configuration flags.
Why: So that it can read scopes in from files and stuff
* Added batch update stats for recordAuditSuccessStatus
* Added batch update stats to recordAuditFailStatus
* added configurable batch size
* build individual update/delete statements so the statements can be batched into 1 call to the DB
* notified #config-changes channel and ran make update-satellite-config-lock
* updated tests to use batch update stats
* Added a gc package at satellite/gc, which contains the gc.Service, which runs garbage collection integrated with the metainfoloop, and the gc PieceTracker, which implements the metainfo loop Observer interface and stores all of the filters (about which pieces are good) for each node.
* Added a gc config located at satellite/gc/service.go (loop disabled by default in release)
* Creates bloom filters with pieces to be retained inside the metainfo loop
* Sends RetainRequests (or filters with good piece ids) to all storage nodes.
* pkg/datarepair: Add test to check num upload pieces
Add a new test for ensuring the number of pieces that the repair process
upload when a segment is injured.
* satellite/orders: Don't create "put order limits" over total
Repair must not create "put order limits" more than the total count.
* pkg/datarepair: Update upload repair pieces test
Update the test which checks the number of pieces which are uploaded
during a repair for using the same excess over the success threshold
value than the implementation.
* satellites/orders: Limit repair put order for not being total
Limit the number of put orders to be used by repair for only uploading
pieces to a % excess over the successful threshold.
* pkg/datarepair: Change DataRepair test to pass again
Make some changes in the DataRepair test to make pass again after the
repair upload repaired pieces only until a % excess over success
threshold.
Also update the steps description of the DataRepair test after it has been
changed, to match on what's now, besides to leave it more generic for
avoiding having to update it on minimal future refactorings.
* satellite: Make repair excess optimal threshold configurable
Add a new configuration parameter to the satellite for being able to
configure the percentage excess over the optimal threshold, used for
determining how many pieces should be repaired/uploaded, rather than
having the value hard coded.
* repairer: Add configurable param to segments/repairer
Add a new parameters to the segment/repairer to calculate the maximum
number of excess nodes, based on the optimal threshold, that repaired
pieces can be uploaded.
This new parameter has been added for not returning more nodes than the
number of upload orders for data repair satellite service calculate for
repairing pieces.
* pkg/storage/ec: Update log message in clien.Repair
* satellite: Update configuration lock file
* pkg/process/metrics: add an instance prefix
the distinction between which satellite is sending which
data should go in the instance field, not the suffix or application
fields. (un)fortunately, the instance id is deliberately not
configurable because we don't want it to be easy to accidentally
have multiple applications collide with the same instance id.
so we're currently stuffing the human readable instance in the
suffix. :(
perhaps a reasonable tradeoff would be an optional instance
prefix that allows operators to put their domain name in
the instance
Change-Id: I6fcc8498be908c5740439cc00f77474ad151febd
* linting
Change-Id: I9f9a44fa9a2634ef5e4f89548d42d57ce9e4450e
* move offer out of marketing package and remove marketing package
* fix imports
* fix rename errors
* remove offer service
* change package name from offers to rewards
* fix linting
* remove unused code and use appropriate comment
* scripts/tag-release.sh: libuplink release tagging
a couple of arch review meetings ago we discussed how to make sure
that it is much easier to get the release defaults for binaries,
libraries, and so on. we already imperfectly solved the binary
problem with the release.sh script, but (until now!) have not solved
the problem of getting release defaults for people building from
source.
the solution we seemed to all prefer was to make sure our tagged
version commits check the release state into the source code.
this script aides in tagging commits with version tags and
updating the source defaults. it still plays nicely with
release.sh and our other build processes.
after this is merged we should configure github/go modules to
prefer people use one of our tags instead of master (which will
keep dev defaults).
Change-Id: I36c5c33a1bc90ec1685f59b05dde779090e252b6
* gofmt release.go
Change-Id: I6e968eff86230496e9cbddecd767ca8d8ff36ba4
* regex for version tag
Change-Id: Icaa6d753ffc962115d961bcabe9daed89b16430c
* added some docs
Change-Id: Ide624fab794ce849e3a3e7254fb038251bba0c71
* add voucher service on storage node
* config field tag syntax, go routines for requests
* hook up voucher service in storagenode/peer.go
* add voucher config to testplanet
* add voucher config to testplanet
* add voucher response status INVALID, ACCEPTED, REJECTED
* add a test for vouchers service
* handle no row from GetValid, test it
* add trust pool to voucher service
* use trusted list to get satellites
* verify vouchers upon receipt
* test VerifyVoucher