Commit Graph

83 Commits

Author SHA1 Message Date
Yingrong Zhao
6331f839ae
satellite/gracefulexit: not allow disqualified node to graceful exit (#3493) 2019-11-07 12:19:34 -05:00
Natalie Villasana
68a7790069 satellite/gracefulexit: select new node filtered by Distinct IP (#3435) 2019-11-06 16:38:51 -05:00
Egon Elbre
aa761700af satellite/satellitedb: update nodes in sorted order (#3446) 2019-11-01 18:07:23 +01:00
Maximillian von Briesen
54594e79c3 satellite/gracefulexit: add metrics on satellite for graceful exit (#3355) 2019-10-29 16:22:20 -04:00
Yingrong Zhao
fa1ac24e19
satellite/gracefulexit: add failure threshold check (#3329)
* add overall failure percentage check and inactive time frame check before sending a response to sno

* update comment

* delete node from transfer queue if it has been inactive for too long

* fix linting error

* add test config value

* fix nil pointer

* add config value into testplanet

* add unit test for overall failure threshold

* move timeframe threshold to chore

* update protolock

* add chore test

* add per peiece failure count logic

* change config name from EndpointMaxFailures to MaxFailuresPerPiece

* address comments

* fix linting error

* add error handling for no row returned from progress table

* fix test for graceful exit chore on storagenode

* fix typo InActive -> Inactive

* improve readability for failure threshold calculation

* update config lock

* change error handling for GetProgress in graceful exit endpoint on the satellite side

* return proper rpc error in endpoint

* add check in chore test for checking finish timestamp and queue
2019-10-24 12:24:42 -04:00
Maximillian von Briesen
abb567f6ae
cmd/satellite: add graceful exit reports command to satellite CLI (#3300)
* update lock file and add comment

* add created at and bytes transferred

* cleanup

* rename db func to GetGracefulExitNodesByTimeFrame

* fix flag

* split into two overlay functions

* := to =

* fix test

* add node not found error class

* fix overlay test

* suggested test changes

* review suggestions

* get exit status from overlay.Get()

* check rows.Err

* fix panic when ExitFinishedAt is nil

* fix comments in cmdGracefulExit
2019-10-22 21:06:01 -04:00
Bryan White
f468816f13
{internal/version,versioncontrol,cmd/storagenode-updater}: add rollout to storagenode updater (#3276) 2019-10-21 12:50:59 +02:00
Egon Elbre
3c438f31bd
satellite/satellitedb: remove sqlite support (#3296) 2019-10-19 00:27:57 +03:00
Natalie Villasana
45c35d7c3f
satellite/satellitedb: add exit_status column to nodes table (#3301) 2019-10-17 11:01:39 -04:00
Natalie Villasana
cf430d2d73
scripts: add check-monitoring script to detect changes to monkit calls (#3114) 2019-10-15 13:00:14 -04:00
Ethan Adams
a1275746b4
satellite/gracefulexit: Implement the 'process' endpoint on the satellite (#3223) 2019-10-11 17:18:05 -04:00
Maximillian von Briesen
f75893c1ba
satellite/overlay: do not include gracefully exiting nodes in node selection (#3211) 2019-10-08 15:03:38 -04:00
Maximillian von Briesen
0ea0d8c3da
satellite/overlay: remove overlay.IsVetted (#3203) 2019-10-08 09:25:41 -04:00
Natalie Villasana
4f2f8ae11b satellite/overlay: add UpdateExitStatus and GetExitingNodes for graceful exit (#3087) 2019-10-01 18:18:21 -04:00
Cameron
fd72de211c satellite/satellitedb: update node version columns in UpdateCheckIn (#3129)
* update node version columns in UpdateCheckIn

* tests and fix sqlite implementation

* check timestamps

* edit timestamp check
2019-09-26 02:07:39 +02:00
Jess G
93788e5218
remove kademlia: create upsert query to update uptime (#2999)
* create upsert query for check-in method

* add tests

* fix lint err

* add benchmark test for db query

* fix lint and tests

* add a unit test, fix lint

* add address to tests

* replace print w/ b.Fatal

* refactor query per CR comments

* fix disqualified, only set if null

* fix query

* add version to updatecheckin query

* fix version

* fix tests

* change version for tests

* add version to tests

* add IP, add transport, mv unit test

* use node.address as arg

* add last ip

* fix lint
2019-09-19 11:37:31 -07:00
Egon Elbre
8ef57a2af3
satellite/satellitedb: use noreturn (#3022) 2019-09-12 20:31:50 +03:00
Egon Elbre
3d410add40
satellite/overlay: avoid large statement for piece counts (#3001) 2019-09-12 00:38:58 +03:00
Jess G
2fc4d61610
implement contact.checkin method (#2952)
* implement contact.checkin method

* add batching to update uptime checks

* rm batching

* rm other unneeded things

* fix lint

* fix unit test

* changes per CR comments

* couple more CR changes

* add identity check into grpcOpt

* fix lint

* why do you fix the test

* revert test change

* stop contact chore for repair test

* put node in cache

* comment out contact chore. See what happens

* Revert "comment out contact chore. See what happens"

This reverts commit 2e45008e36a50e0a842ae455ac83de77093d4daa.

* try stopping contact earlier

* stop contact chore in uplink_test

* replace self on chore with *RoutingTable for access to latest node info

* Revert "stop contact chore in uplink_test"

This reverts commit 302db70f4071112d1b9f7ee0279225ea12757723.

* Revert "try stopping contact earlier"

This reverts commit 806cc3b82f9d598899dafd83da9315a1cb0cb43c.

* Revert "stop contact chore for repair test"

This reverts commit dd34de1cfdfc09b972186c9ab9a4f1e822446b79.
2019-09-10 09:05:07 -07:00
Bryan White
a33106df1c
satellite/satellitedb: persist piece counts to/from db (#2803) 2019-08-27 14:37:42 +02:00
Egon Elbre
00b2e1a7d7 all: enable staticcheck (#2849)
* by having megacheck in disable it also disabled staticcheck

* fix closing body

* keep interfacer disabled

* hide bodies

* don't use deprecated func

* fix dead code

* fix potential overrun

* keep stylecheck disabled

* don't pass nil as context

* fix infinite recursion

* remove extraneous return

* fix data race

* use correct func

* ignore unused var

* remove unused consts
2019-08-22 13:40:15 +02:00
Egon Elbre
2d69d47655
all: fix Error.New formatting (#2840) 2019-08-21 19:30:29 +03:00
Bryan White
6400d63a6c
satellite/satellitedb: Add piece count column to nodes table (#2795) 2019-08-19 12:58:13 +02:00
Yaroslav Vorobiov
2c769fe9d9
satellite/overlaycache: add missing audit and uptime success count (#2788) 2019-08-14 20:53:39 +03:00
ethanadams
c9b46f2fe2
V3-1987: Optimize audits stats persistence (#2632)
* Added batch update stats for recordAuditSuccessStatus
* Added batch update stats to recordAuditFailStatus
* added configurable batch size
* build individual update/delete statements so the statements can be batched into 1 call to the DB
* notified #config-changes channel and ran make update-satellite-config-lock
* updated tests to use batch update stats
2019-07-31 13:21:06 -04:00
Marc Schubert
64199d37bd
Improve Node Selection Query (#2648)
* Update overlaycache.go

Removes one select statement and columns gets filtered in first query. 

Needs to be tested agains real database that this query is working and faster!

* Correct linting

reorder scans that this fit to new sql result order
2019-07-29 14:32:43 +02:00
Egon Elbre
5d0816430f
rename all the things (#2531)
* rename pkg/linksharing to linksharing
* rename pkg/httpserver to linksharing/httpserver
* rename pkg/eestream to uplink/eestream
* rename pkg/stream to uplink/stream
* rename pkg/metainfo/kvmetainfo to uplink/metainfo/kvmetainfo
* rename pkg/auth/signing to pkg/signing
* rename pkg/storage to uplink/storage
* rename pkg/accounting to satellite/accounting
* rename pkg/audit to satellite/audit
* rename pkg/certdb to satellite/certdb
* rename pkg/discovery to satellite/discovery
* rename pkg/overlay to satellite/overlay
* rename pkg/datarepair to satellite/repair
2019-07-28 08:55:36 +03:00
Stefan Benten
e260f9f1cc
Remove unnecessary UNION and Selects (#2624) 2019-07-25 15:17:12 -04:00
Alexander Leitner
64b2769de3
discovery: parallelize refresh (#2535)
* parallelize discovery refresh

* add paginateQualifiedtest, address pr comments

* Remove duplicate uptime update

* Lower concurrency in Testplanet for discovery
2019-07-12 10:35:48 -04:00
JT Olio
a79c7d77f3 overlay cache: slight modification of node-is-online rules (#2490) 2019-07-09 22:36:09 -04:00
Egon Elbre
674742d1a7
satellite/datarepair: use reliability cache (#1976) 2019-07-09 01:04:35 +03:00
Alexander Leitner
19ab9852f2
Update node.proto to use time.Time instead of timestamp (#2482) 2019-07-08 14:24:42 -04:00
Kaloyan Raev
d32c907440
overlay.UpdateStats removes node from containment mode (#2419) 2019-07-02 18:16:25 +03:00
Stefan Benten
01beaa289a
Mask IP Addresses to subnets (#2305) 2019-06-24 17:33:18 +02:00
paul cannon
8948459166
do ip filtering in a more correct way? (#2301)
This doesn't solve much of the performance difficulty but ought to be
lots more correct in terms of proper node selection semantics.
2019-06-23 16:16:45 -05:00
Bill Thorp
8f47fca5d3
Remove audit / uptime ratio fields (#2247)
* removed ratios
2019-06-21 13:14:53 -04:00
Natalie Villasana
9386187fe6
add disqualification and new reputation system into overlay cache (#2227) 2019-06-20 09:56:04 -04:00
Maximillian von Briesen
ad8cad4909
Expand the inspector tool to provide node id's for each segment, rather than just numeric totals (#2205) 2019-06-18 18:22:14 -04:00
Bill Thorp
119a8fd3cc removed fields (#2234)
* removed fields

* sql tweaks
2019-06-18 13:40:28 -04:00
Bill Thorp
81134e97cc
nodes db - adds alpha and beta reputation fields (#2215)
* node dbx and migrate adding alpha and beta reputation
2019-06-18 09:45:02 -04:00
Maximillian von Briesen
8398fae9b5
Add node churn and containment/reverify monkit stats (#2217)
* add counters for nodes that have/have not been seen in the past 24 hours/week

* add additional uptime counters

* add monkit stats for containment mode
2019-06-18 08:54:52 -04:00
Ivan Fraixedes
35c8648330
[v3-1914] Storage node disqualification: Change type from bool to timestamp (#2212)
* satellite/satellitedb: Alter nodes disqualification column
  Change the type of the 'disqualification' column of the nodes table from
  boolean to timestamp.
* overlay/cache: Change Disqualified field type
  Change the Disqualified field type the NodeDossier struct type from bool
  to time.Time to match with the disqualified type used by the DB layer.
* satellite/satellitedb: Update queries uses disqualified
  Update the queries which uses the disqualified column due to the column
  type has been changed from boolean to nullable timestamp.
* docs/design: Update disqualification due impl changes
  Update the disqualification design document to contain the architectural
  change required to be able to restore unfair disqualified nodes in case
  of an unexpected cause (bug, mistake, hard network disconnection, etc.).
2019-06-18 11:14:31 +02:00
Natalie Villasana
25d7dda135 add disqualified check to node selection queries (#2102)
add disqualified check to queries, skip TestStatDB
2019-06-05 20:21:32 -04:00
JT Olio
29d16b4d68 satellite: add monkit task to missing places (#2108) 2019-06-04 13:55:37 +02:00
Cameron
e077b0d380
rename VetNode to IsVetted (#2097)
* rename VetNode to IsVetted
2019-06-03 10:53:30 -04:00
Egon Elbre
b8e0ac6377
satellitedb/overlaycache: avoid tx leak in case of error (#2095) 2019-06-03 17:37:43 +03:00
Natalie Villasana
6db9388082 add disqualified column to nodes table (#2086)
* add disqualified column to nodes table, update migrate script and testdata

* fix crazy formatting of postgres.v25.sql
2019-05-30 17:38:23 -04:00
Cameron
590b1a5a1d
Satellite voucher service (#2043)
* set up voucher service skeleton, basic test

* add VetNode db method

* basic test for VetNode

* encode and sign voucher functions

* fill out and sign vouchers

* test pass/fail voucher request

* match EncodeVoucher to other Encode functions
2019-05-30 15:52:33 -04:00
Cameron
4058c29ca4
filter duplicate node IPs (#1890)
* add last_ip field to dbx model node, generate dbx

* add last_ip to node proto, generate pb

* migrate

* resolve address in transport.DialNode, update lastIp in cache.UpdateAddress

* use net.SplitHostPort to isolate host address from port

* define DistinctIPs flag

* add test for GetIP

* select last_ip when querying for nodes

* if distinctIPs flag == true, query for nodes with distinct IPs

* some basic tests

* change last_ip to field 14 in proto

* remove comments

* check err

* change distinctIPs to distinctIP

* exclude IPs from newNodes in query for reputable nodes

* add index on last_ip

* only add to excludedIPs if flag is true

* test half new nodes returns distinct IPs

* fix alignment

* add test

* rework ip filter query, add retry logic, add switch for database driver

* add retry to SelectNewNodes

* change discovery intervals so IPs don't get overwritten

* remove TestGetIP

* edit updating node stats in test

* split exclude into nodeIDs and IPs

* separate non-distinct IP query into other function

* trigger checks

* remove else block
2019-05-22 16:06:27 -04:00
Egon Elbre
42562429f5 Optimize KnownUnreliableOrOffline SQL query (#1968) 2019-05-19 17:10:46 +02:00