Commit Graph

23 Commits

Author SHA1 Message Date
Yingrong Zhao
ee87846f0b satellite/contact: add placeholder for GetTime endpoint
Change-Id: I42f8479708f0558350c2280a398d84d145e8118f
2020-01-14 06:38:47 +00:00
Ethan
8859c36234 satellite/{downtime,contact}: Add CheckNodeAvailability for use within the downtime tracking chores.
https://storjlabs.atlassian.net/browse/V3-2545

Change-Id: I1dd54a0c77cb4905bb1f350beeb82c6f7700ee70
2020-01-02 18:24:11 +00:00
Egon Elbre
6615ecc9b6 common: separate repository
Change-Id: Ibb89c42060450e3839481a7e495bbe3ad940610a
2019-12-27 14:11:15 +02:00
Egon Elbre
d55288cf68 pkg/rpc: replace methods with direct calls to pb
Change-Id: I8bd015d8d316a2c12c1daceca1d9fd257f6f57bc
2019-12-22 17:12:43 +02:00
Egon Elbre
006baa9ca6 pkg/rpc: remove drpc aliases
We need to split up pb package, which means we cannot have a core package
that depends on them.

Change-Id: I7f4f6fd82f89a51a9b2ad08bf2b1207253b8a215
2019-12-22 16:58:08 +02:00
littleskunk
8b3444e088
satellite/nodeselection: don't select nodes that haven't checked in for a while (#3567)
* satellite/nodeselection: dont select nodes that havent checked in for a while

* change testplanet online window to one minute

* remove satellite reconfigure online window = 0 in repair tests

* pass timestamp into UpdateCheckIn

* change timestamp to timestamptz

* edit tests to set last_contact_success to 4 hours ago

* fix syntax error

* remove check for last_contact_success > last_contact_failure in IsOnline
2019-11-15 23:43:06 +01:00
Egon Elbre
ee6c1cac8a
private: rename internal to private (#3573) 2019-11-14 21:46:15 +02:00
littleskunk
7eb6724c92
logging: unify logging around satellite ID, node ID and piece ID (#3491)
* logging: unify logging around satellite ID, node ID and piece ID

* unify segment index
2019-11-05 22:04:07 +01:00
Jess G
4d85b11574
satellite/contact: improve errors in contact endpoints (#3356)
* improve errors in satellite contact endpoints

* add changes per CR comments

* update pingback method so it still updates node table

* fix err and returns

* fix zap logging to be better
2019-10-30 11:57:21 -07:00
JT Olio
1b66517664 contact: small typo
Change-Id: If9126ae518b5672bfd9163a8c9fc518727d5138b
2019-10-14 13:21:05 -06:00
Jennifer Li Johnson
b185dbbee2
satellite/discovery: remove discovery related code (#3175) 2019-10-14 10:57:01 -04:00
Jeff Wendling
098cbc9c67 all: use pkg/rpc instead of pkg/transport
all of the packages and tests work with both grpc and
drpc. we'll probably need to do some jenkins pipelines
to run the tests with drpc as well.

most of the changes are really due to a bit of cleanup
of the pkg/transport.Client api into an rpc.Dialer in
the spirit of a net.Dialer. now that we don't need
observers, we can pass around stateless configuration
to everything rather than stateful things that issue
observations. it also adds a DialAddressID for the
case where we don't have a pb.Node, but we do have an
address and want to assert some ID. this happened
pretty frequently, and now there's no more weird
contortions creating custom tls options, etc.

a lot of the other changes are being consistent/using
the abstractions in the rpc package to do rpc style
things like finding peer information, or checking
status codes.

Change-Id: Ief62875e21d80a21b3c56a5a37f45887679f9412
2019-09-25 15:37:06 -06:00
Egon Elbre
9ceff9f9c6 satellite/overlay: move CheckIn benchmark to overlay (#3095) 2019-09-20 16:35:52 -04:00
Jennifer Li Johnson
724bb44723
Remove Kademlia dependencies from Satellite and Storagenode (#2966)
What:

cmd/inspector/main.go: removes kad commands
internal/testplanet/planet.go: Waits for contact chore to finish
satellite/contact/nodesservice.go: creates an empty nodes service implementation
satellite/contact/service.go: implements Local and FetchInfo methods & adds external address config value
satellite/discovery/service.go: replaces kad.FetchInfo with contact.FetchInfo in Refresh() & removes Discover()
satellite/peer.go: sets up contact service and endpoints
storagenode/console/service.go: replaces nodeID with contact.Local()
storagenode/contact/chore.go: replaces routing table with contact service
storagenode/contact/nodesservice.go: creates empty implementation for ping and request info nodes service & implements RequestInfo method
storagenode/contact/service.go: creates a service to return the local node and update its own capacity
storagenode/monitor/monitor.go: uses contact service in place of routing table
storagenode/operator.go: moves operatorconfig from kad into its own setup
storagenode/peer.go: sets up contact service, chore, pingstats and endpoints
satellite/overlay/config.go: changes NodeSelectionConfig.OnlineWindow default to 4hr to allow for accurate repair selection
Removes kademlia setups in:

cmd/storagenode/main.go
cmd/storj-sim/network.go
internal/testplane/planet.go
internal/testplanet/satellite.go
internal/testplanet/storagenode.go
satellite/peer.go
scripts/test-sim-backwards.sh
scripts/testdata/satellite-config.yaml.lock
storagenode/inspector/inspector.go
storagenode/peer.go
storagenode/storagenodedb/database.go
Why: Replacing Kademlia

Please describe the tests:
• internal/testplanet/planet_test.go:

TestBasic: assert that the storagenode can check in with the satellite without any errors
TestContact: test that all nodes get inserted into both satellites' overlay cache during testplanet setup
• satellite/contact/contact_test.go:

TestFetchInfo: Tests that the FetchInfo method returns the correct info
• storagenode/contact/contact_test.go:

TestNodeInfoUpdated: tests that the contact chore updates the node information
TestRequestInfoEndpoint: tests that the Request info endpoint returns the correct info
Please describe the performance impact: Node discovery should be at least slightly more performant since each node connects directly to each satellite and no longer needs to wait for bootstrapping. It probably won't be faster in real time on start up since each node waits a random amount of time (less than 1 hr) to initialize its first connection (jitter).
2019-09-19 15:56:34 -04:00
Jess G
93788e5218
remove kademlia: create upsert query to update uptime (#2999)
* create upsert query for check-in method

* add tests

* fix lint err

* add benchmark test for db query

* fix lint and tests

* add a unit test, fix lint

* add address to tests

* replace print w/ b.Fatal

* refactor query per CR comments

* fix disqualified, only set if null

* fix query

* add version to updatecheckin query

* fix version

* fix tests

* change version for tests

* add version to tests

* add IP, add transport, mv unit test

* use node.address as arg

* add last ip

* fix lint
2019-09-19 11:37:31 -07:00
Jess G
fae2c2c9f5 satellite/contact: return status codes from endpoint (#3086) 2019-09-19 11:01:34 +03:00
Cameron
ccdd435610
defer client.close() (#3084) 2019-09-18 16:17:04 -04:00
Jess G
d3ef574b20 pkg/pb: minor changes to contact.proto (#3048)
* minor fixes to contact proto

* simply and rm nodeAddr object from client
2019-09-13 19:37:32 -05:00
Maximillian von Briesen
82a651ac3a
satellite/contact: Populate PeerIdentities table in satellitedb (#2998)
* Add peer identities db dependency to contact service
* Update peer identities db on contact checkin
2019-09-12 12:33:04 -04:00
Jess G
2fc4d61610
implement contact.checkin method (#2952)
* implement contact.checkin method

* add batching to update uptime checks

* rm batching

* rm other unneeded things

* fix lint

* fix unit test

* changes per CR comments

* couple more CR changes

* add identity check into grpcOpt

* fix lint

* why do you fix the test

* revert test change

* stop contact chore for repair test

* put node in cache

* comment out contact chore. See what happens

* Revert "comment out contact chore. See what happens"

This reverts commit 2e45008e36a50e0a842ae455ac83de77093d4daa.

* try stopping contact earlier

* stop contact chore in uplink_test

* replace self on chore with *RoutingTable for access to latest node info

* Revert "stop contact chore in uplink_test"

This reverts commit 302db70f4071112d1b9f7ee0279225ea12757723.

* Revert "try stopping contact earlier"

This reverts commit 806cc3b82f9d598899dafd83da9315a1cb0cb43c.

* Revert "stop contact chore for repair test"

This reverts commit dd34de1cfdfc09b972186c9ab9a4f1e822446b79.
2019-09-10 09:05:07 -07:00
Egon Elbre
a801fab66a
all: add archview annotations (#2964) 2019-09-10 16:24:16 +03:00
Jennifer Li Johnson
3387750280 storagenode/contact: create chore for nodes to ping satellites (#2877)
Creates a chore for nodes to announce themselves to their trusted satellites. Runs on startup and every hour thereafter
2019-09-06 12:14:03 -04:00
paul cannon
adfa16188b pkg/contact: bare-bones service and endpoint (#2941)
* pkg/contact: bare-bones service and endpoint

* split contact package into satellite and node

* use new contact protobuf types
2019-09-04 11:29:34 -07:00