Commit Graph

7 Commits

Author SHA1 Message Date
Jessica Grebenschikov
803e2930f4 satellite: use IP for all uplink operations, use hostname for audit and repairs
My understanding is that the nodes table has the following fields:
- `address` field which can be a hostname or an IP
- `last_net` field that is the /24 subnet of the IP resolved from the address

This PR does the following:
1) add back the `last_ip` field to the nodes table
2) for uplink operations remove the calls that the satellite makes to `lookupNodeAddress` (which makes the DNS calls to resolve the IP from the hostname) and instead use the data stored in the nodes table `last_ip` field. This means that the IP that the satellite sends to the uplink for the storage nodes could be approx 1 hr stale. In the short term this is fine, next we will be adding changes so that the storage node pushes any IP changes to the satellite in real time.
3) use the address field for repair and audit since we want them to still make DNS calls to confirm the IP is up to date
4) try to reduce confusion about hostname, ip, subnet, and address in the code base

Change-Id: I96ce0d8bb78303f82483d0701bc79544b74057ac
2020-03-11 09:11:40 -07:00
Egon Elbre
f3b4bf2b7c satellite/satellitedb/satellitedbtest: pass ctx as an argument
ctx is created in most tests, instead pass in as argument
to reduce code duplication.

Change-Id: I466c51c008392001129c8b007c9d6b3619935ac4
2020-01-20 16:35:42 +02:00
Egon Elbre
6615ecc9b6 common: separate repository
Change-Id: Ibb89c42060450e3839481a7e495bbe3ad940610a
2019-12-27 14:11:15 +02:00
Egon Elbre
ee6c1cac8a
private: rename internal to private (#3573) 2019-11-14 21:46:15 +02:00
Yingrong Zhao
fa1ac24e19
satellite/gracefulexit: add failure threshold check (#3329)
* add overall failure percentage check and inactive time frame check before sending a response to sno

* update comment

* delete node from transfer queue if it has been inactive for too long

* fix linting error

* add test config value

* fix nil pointer

* add config value into testplanet

* add unit test for overall failure threshold

* move timeframe threshold to chore

* update protolock

* add chore test

* add per peiece failure count logic

* change config name from EndpointMaxFailures to MaxFailuresPerPiece

* address comments

* fix linting error

* add error handling for no row returned from progress table

* fix test for graceful exit chore on storagenode

* fix typo InActive -> Inactive

* improve readability for failure threshold calculation

* update config lock

* change error handling for GetProgress in graceful exit endpoint on the satellite side

* return proper rpc error in endpoint

* add check in chore test for checking finish timestamp and queue
2019-10-24 12:24:42 -04:00
Maximillian von Briesen
abb567f6ae
cmd/satellite: add graceful exit reports command to satellite CLI (#3300)
* update lock file and add comment

* add created at and bytes transferred

* cleanup

* rename db func to GetGracefulExitNodesByTimeFrame

* fix flag

* split into two overlay functions

* := to =

* fix test

* add node not found error class

* fix overlay test

* suggested test changes

* review suggestions

* get exit status from overlay.Get()

* check rows.Err

* fix panic when ExitFinishedAt is nil

* fix comments in cmdGracefulExit
2019-10-22 21:06:01 -04:00
Natalie Villasana
4f2f8ae11b satellite/overlay: add UpdateExitStatus and GetExitingNodes for graceful exit (#3087) 2019-10-01 18:18:21 -04:00