storj

Author	SHA1	Message	Date
Cameron Ayer	b22bf16b35	satellite/overlay: add config flag for node selection free disk requirement Currently SNs report their free disk space once per hour. If a node becomes full, it has to wait until the next contact cycle begins to report, all the while receiving and failing upload requests. By increasing the minimum required disk space, we can give the storage nodes more time to report their space before they completely fill up. This change goes hand-in-hand with another change we want to implement: trigger capacity report on SN immediately upon falling below threshold. Change-Id: I12f778286c6c3f582438b0e2949765ac43325e27	2020-02-11 18:08:25 +00:00
Natalie Ventura Villasana	3900dadafd	satellite/overlay: find new nodes with ExcludedIPs Adds ExcludedIPs to the NodeCriteria for selecting new storage nodes. Previously, ExcludedIPs was only added to the NodeCriteria for selecting reputable storage nodes. Now that both are included in the FindStorageNodesWithPreferences call, it should no longer be possible to repair pieces to nodes that are on the same IP as nodes already storing pieces from that segment. Adds TestSelectNewStorageNodesExcludedIPs to make sure that SelectNewStorageNodes returns nodes with different IP addresses. https://storjlabs.atlassian.net/browse/V3-3011 Change-Id: Ic2d5e607cadeba6e8d5c40f9717149cb30880335	2020-02-10 23:45:17 +00:00
Jeff Wendling	7999d24f81	all: use monkit v3 this commit updates our monkit dependency to the v3 version where it outputs in an influx style. this makes discovery much easier as many tools are built to look at it this way. graphite and rothko will suffer some due to no longer being a tree based on dots. hopefully time will exist to update rothko to index based on the new metric format. it adds an influx output for the statreceiver so that we can write to influxdb v1 or v2 directly. Change-Id: Iae9f9494a6d29cfbd1f932a5e71a891b490415ff	2020-02-05 23:53:17 +00:00
Egon Elbre	0c0b47823d	satellite: use require.WithinDuration Noticed that assert/require has WithinDuration for comparing time.Time-s. Change-Id: Ia340896443f610d38799b7ef245b5775eecfc92b	2020-01-21 19:43:53 +02:00
Egon Elbre	f3b4bf2b7c	satellite/satellitedb/satellitedbtest: pass ctx as an argument ctx is created in most tests, instead pass in as argument to reduce code duplication. Change-Id: I466c51c008392001129c8b007c9d6b3619935ac4	2020-01-20 16:35:42 +02:00
Egon Elbre	a4026f97b8	satellite: fix test time comparisons Correct way to compare time that may have an error is to use InDelta. Change-Id: I0140892119c44c63fa042bbc7292ab91bb33a350	2020-01-20 10:17:20 +00:00
Natalie Ventura Villasana	6b1829f3c3	satellite/downtime: new chore estimates downtime Adds EstimationChore to the downtime package, which is an independent chore that finds offline nodes given a configurable limit, then uptime checks those nodes, and sets a last contact success or failure given a response. For failed nodes, the chore updates the amount of downtime the node has been offline in the DowntimeTracking table. Design doc section: https://github.com/storj/storj/blob/master/docs/blueprints/storage-node-downtime-tracking.md#estimating-offline-time Jira: https://storjlabs.atlassian.net/browse/V3-2545 Change-Id: I60af95803930bf9b33232b248bb20cca6f0e0b5f	2020-01-09 15:05:13 -05:00
Yingrong Zhao	76ee8a1b4c	satellite: remove UptimeReputation configs from codebase With the new storage node downtime tracking feature, we need to remove the current uptime reputation configs: UptimeReputationAlpha, UptimeReputationBeta, and UptimeReputationDQ. This is the first step of removing the uptime reputation columns from satellitedb. Change-Id: Ie8fab13295dbf545e33aeda0c4306cda4ba54e36	2020-01-08 18:54:15 +00:00
Natalie Ventura Villasana	1cb0f80a8d	satellite/gracefulexit: dq node on exit fail Disqualifies a node when the node fails to complete a graceful exit. Adds a new DisqualifyNode method to the overlay cache, since there wasn't an existing method to disqualify a node but do nothing else to its stats. Adds checks to existing tests to make sure that a storage node that fails a graceful exit is marked as disqualified in the overlay cache. https://storjlabs.atlassian.net/browse/V3-3342 Change-Id: I4d554a519ab59db31ad3b8e28764c8683a6e3888	2020-01-06 19:16:26 -05:00
Moby von Briesen	6c2e4cc0cd	satellite/overlay: Return NodeLastContact instead of a node dossier from overlay.GetOfflineNodesLimited We only care about node ID, address, and last contact success/failure from the downtime service, so the overlay should only return these values for the downtime-specific queries. Change-Id: I08a6ecfdd2a12b82cae62e87d6adeab53975bfce	2020-01-06 17:12:30 -05:00
Ethan	05b406e992	satellite:{downtime,overlay}: Implement offline node detection chore https://storjlabs.atlassian.net/browse/V3-3398 Change-Id: I598c3bad819026377d1d113c099dc9bba8b02742	2020-01-03 17:10:03 +00:00
Moby von Briesen	ff74b44c5f	satellite/overlay: Add ability for overlay to get offline nodes ordered by last checked time This is required for the downtime tracking service: https://storjlabs.atlassian.net/browse/V3-2545 Change-Id: I286cdc07d802393948eb10c25c45ba78cc3ceafc	2020-01-02 16:39:38 +00:00
Egon Elbre	6615ecc9b6	common: separate repository Change-Id: Ibb89c42060450e3839481a7e495bbe3ad940610a	2019-12-27 14:11:15 +02:00
Kaloyan Raev	f8d0864630	satellite/metainfo: use KnownReliable in DeleteObjectPieces This reduces the number of queries to the overlay when deleting objects. Change-Id: I28ed2c2d225e0c5eb1a8d952235fa7e5837a48d1	2019-12-18 12:38:22 +00:00
Kaloyan Raev	5ee1a00857	satellite/overlay: filter reliable nodes from a list Adds the KnownReliable method to Overlay Service that filters all nodes from the given list to be only reliable nodes (online and qualified). The method returns []*pb.Node of reliable nodes. The pb.Node values are ready for dialing. The first use case is when deleting an object to efficiently dial all reliable nodes holding a piece of that object and send them a delete request. Change-Id: I13e0a8666f3807c5c31ef1a1087476018a5d3acb	2019-12-17 21:20:08 +00:00
Egon Elbre	7a36507a0a	private/testcontext: ensure we call cleanup everywhere Change-Id: Icb921144b651611d78f3736629430d05c3b8a7d3	2019-12-17 14:16:09 +00:00
littleskunk	8b3444e088	satellite/nodeselection: don't select nodes that haven't checked in for a while (#3567) * satellite/nodeselection: don't select nodes that haven't checked in for a while * change testplanet online window to one minute * remove satellite reconfigure online window = 0 in repair tests * pass timestamp into UpdateCheckIn * change timestamp to timestamptz * edit tests to set last_contact_success to 4 hours ago * fix syntax error * remove check for last_contact_success > last_contact_failure in IsOnline	2019-11-15 23:43:06 +01:00
Egon Elbre	ee6c1cac8a	private: rename internal to private (#3573)	2019-11-14 21:46:15 +02:00
Egon Elbre	1e64006e32	lint: add staticcheck as a separate step (#3569)	2019-11-14 10:31:30 +02:00
Natalie Villasana	68a7790069	satellite/gracefulexit: select new node filtered by Distinct IP (#3435)	2019-11-06 16:38:51 -05:00
Maximillian von Briesen	54594e79c3	satellite/gracefulexit: add metrics on satellite for graceful exit (#3355)	2019-10-29 16:22:20 -04:00
Yingrong Zhao	fa1ac24e19	satellite/gracefulexit: add failure threshold check (#3329) * add overall failure percentage check and inactive time frame check before sending a response to sno * update comment * delete node from transfer queue if it has been inactive for too long * fix linting error * add test config value * fix nil pointer * add config value into testplanet * add unit test for overall failure threshold * move timeframe threshold to chore * update protolock * add chore test * add per piece failure count logic * change config name from EndpointMaxFailures to MaxFailuresPerPiece * address comments * fix linting error * add error handling for no row returned from progress table * fix test for graceful exit chore on storagenode * fix typo InActive -> Inactive * improve readability for failure threshold calculation * update config lock * change error handling for GetProgress in graceful exit endpoint on the satellite side * return proper rpc error in endpoint * add check in chore test for checking finish timestamp and queue	2019-10-24 12:24:42 -04:00
Maximillian von Briesen	abb567f6ae	cmd/satellite: add graceful exit reports command to satellite CLI (#3300) * update lock file and add comment * add created at and bytes transferred * cleanup * rename db func to GetGracefulExitNodesByTimeFrame * fix flag * split into two overlay functions * := to = * fix test * add node not found error class * fix overlay test * suggested test changes * review suggestions * get exit status from overlay.Get() * check rows.Err * fix panic when ExitFinishedAt is nil * fix comments in cmdGracefulExit	2019-10-22 21:06:01 -04:00
Natalie Villasana	45c35d7c3f	satellite/satellitedb: add exit_status column to nodes table (#3301)	2019-10-17 11:01:39 -04:00
Jennifer Li Johnson	b185dbbee2	satellite/discovery: remove discovery related code (#3175)	2019-10-14 10:57:01 -04:00
Ethan Adams	a1275746b4	satellite/gracefulexit: Implement the 'process' endpoint on the satellite (#3223)	2019-10-11 17:18:05 -04:00
Maximillian von Briesen	f75893c1ba	satellite/overlay: do not include gracefully exiting nodes in node selection (#3211)	2019-10-08 15:03:38 -04:00
Maximillian von Briesen	0ea0d8c3da	satellite/overlay: remove overlay.IsVetted (#3203)	2019-10-08 09:25:41 -04:00
Jennifer Li Johnson	7ceaabb18e	Delete Bootstrap and Kademlia (#2974)	2019-10-04 16:48:41 -04:00
Natalie Villasana	4f2f8ae11b	satellite/overlay: add UpdateExitStatus and GetExitingNodes for graceful exit (#3087)	2019-10-01 18:18:21 -04:00
Cameron	fd72de211c	satellite/satellitedb: update node version columns in UpdateCheckIn (#3129) * update node version columns in UpdateCheckIn * tests and fix sqlite implementation * check timestamps * edit timestamp check	2019-09-26 02:07:39 +02:00
Egon Elbre	9ceff9f9c6	satellite/overlay: move CheckIn benchmark to overlay (#3095)	2019-09-20 16:35:52 -04:00
paul cannon	53db517154	satellite/overlay: don't use transport observers (#2989)	2019-09-19 16:22:50 -04:00
Jess G	93788e5218	remove kademlia: create upsert query to update uptime (#2999) * create upsert query for check-in method * add tests * fix lint err * add benchmark test for db query * fix lint and tests * add a unit test, fix lint * add address to tests * replace print w/ b.Fatal * refactor query per CR comments * fix disqualified, only set if null * fix query * add version to updatecheckin query * fix version * fix tests * change version for tests * add version to tests * add IP, add transport, mv unit test * use node.address as arg * add last ip * fix lint	2019-09-19 11:37:31 -07:00
Jennifer Li Johnson	ce3203e910	update NodeSelectionConfig.OnlineWindow to 4hr default (#3082)	2019-09-18 14:57:57 -04:00
Maximillian von Briesen	574c96c350	satellite/metainfo: Verify storagenode signature on satellite upload (#2985)	2019-09-18 09:50:33 -04:00
Natalie Villasana	aa3567187e	satellite/audit: worker now verifies and reverifies (#2965)	2019-09-11 18:37:01 -04:00
Egon Elbre	3d410add40	satellite/overlay: avoid large statement for piece counts (#3001)	2019-09-12 00:38:58 +03:00
Egon Elbre	a801fab66a	all: add archview annotations (#2964)	2019-09-10 16:24:16 +03:00
Bryan White	a33106df1c	satellite/satellitedb: persist piece counts to/from db (#2803)	2019-08-27 14:37:42 +02:00
aligeti	33aff71959	satellitedb/overlay: add database for storing peer identities (#2764)	2019-08-26 19:49:42 +03:00
Egon Elbre	00b2e1a7d7	all: enable staticcheck (#2849) * by having megacheck in disable it also disabled staticcheck * fix closing body * keep interfacer disabled * hide bodies * don't use deprecated func * fix dead code * fix potential overrun * keep stylecheck disabled * don't pass nil as context * fix infinite recursion * remove extraneous return * fix data race * use correct func * ignore unused var * remove unused consts	2019-08-22 13:40:15 +02:00
Bryan White	6400d63a6c	satellite/satellitedb: Add piece count column to nodes table (#2795)	2019-08-19 12:58:13 +02:00
Isaac Hess	e34b2c553c	Reduce UpdateAddress calls with address cache (#2681)	2019-08-06 16:56:12 -06:00
Egon Elbre	c8edeb0257	satellite/overlay: rename overlay.Cache to overlay.Service (#2717)	2019-08-06 19:35:59 +03:00
ethanadams	c9b46f2fe2	V3-1987: Optimize audits stats persistence (#2632) * Added batch update stats for recordAuditSuccessStatus * Added batch update stats to recordAuditFailStatus * added configurable batch size * build individual update/delete statements so the statements can be batched into 1 call to the DB * notified #config-changes channel and ran make update-satellite-config-lock * updated tests to use batch update stats	2019-07-31 13:21:06 -04:00
Egon Elbre	ec3d5c0bdd	don't use global loggers (#2671) * pkg/server: don't use global logger * satellite/overlay: use correct logger * pkg/kademlia: use correct logger * linksharing: use conventional way to pass in logger * use zaptest in tests	2019-07-31 15:09:45 +03:00
Egon Elbre	5d0816430f	rename all the things (#2531) * rename pkg/linksharing to linksharing * rename pkg/httpserver to linksharing/httpserver * rename pkg/eestream to uplink/eestream * rename pkg/stream to uplink/stream * rename pkg/metainfo/kvmetainfo to uplink/metainfo/kvmetainfo * rename pkg/auth/signing to pkg/signing * rename pkg/storage to uplink/storage * rename pkg/accounting to satellite/accounting * rename pkg/audit to satellite/audit * rename pkg/certdb to satellite/certdb * rename pkg/discovery to satellite/discovery * rename pkg/overlay to satellite/overlay * rename pkg/datarepair to satellite/repair	2019-07-28 08:55:36 +03:00

48 Commits