storj

Author	SHA1	Message	Date
Cameron	0596651580	satellite/satellitedb: fix updating nodes.last_software_update_email The CASE expression used to determine which value to set last_software_update_email to did not have an ELSE clause. Therefore, when the node is both below the minimum version and did not receive a version update email (no condition is true), the value would be set to NULL. Additionally, replace `time.Now()` with `timestamp` in the check to determine if the email cooldown has passed. Change-Id: I2e2e93f1a865e123ed8b665be9621cebfb72236f	2023-02-01 17:25:58 -05:00
paul cannon	b6bcb32ecf	satellite/reputation: more accurate "reputation changes" list `overlay.(Service).UpdateReputation()` takes a "reputationChanges" parameter, a slice of node events indicating whether we think the node's disqualification or suspension status is changing. This is necessary so that the overlay service can notify the nodeevents DB about these changes. In several cases, however, this list of events is not constructed correctly, because of missing information about the previous state. In most cases, this is because the node was offline, and the order limit creation functions (which usually obtain and return the prior reputation status) ignored that node. This change makes it so that all callers to `overlay.(Service).UpdateReputation()` can be expected to provide a correct list of change events (as correct as feasible, given that we can't lock the node's information in the database during the entire operation). It ended up that there was only one caller we needed to worry about, and that was reputation.(*Service).ApplyAudit(). So the bulk of this change is teaching that function how to recognize when the prior reputation status was not filled in, and fill it in. Refs: https://github.com/storj/storj/issues/5464 Change-Id: I52ce385fc9c0ce3b283b998d517998e7f4ec8792	2023-01-31 18:39:40 +00:00
JT Olio	e40191afd6	storj: upgrade to use latest storj/common NodeAddress Change-Id: I5987391bcfe5f6dfd7b525698c337a4cbda9b76e	2023-01-25 01:37:26 +00:00
Fadila Khadar	5c3a148d6e	satellite/overlaycache: fix typo in UpdateCheckIn request - fix a malformed SQL query - add test to be sure we don't have this problem again. Change-Id: I3fde8c59ba01335411e51d964bec95bc26cfc961	2022-12-14 22:21:45 +01:00
paul cannon	ed0fa59f23	satellite/overlay: add SetNodeContained() method SetNodeContained() will change the contained flag in the nodes table, which will affect whether nodes are selected for new uploads. This flag _should_ correlate with whether or not a given node has any entries in the reverification queue. However, the reverification queue is intended to be 'safely partitionable' from the nodes table, so we can't enforce that characteristic transactionally. But this is ok; there are no dire consequences if they are out of sync. We will be adding a chore that updates the contained flag based on the contents of the reverification queue periodically, if something fails to set it directly when appropriate. Refs: https://github.com/storj/storj/issues/5231 Change-Id: I26460d8718dee63fd55d00a44568b2065fc8fe30	2022-12-01 12:43:40 +00:00
Cameron	87660bd9b3	satellite/overlay/offlinenodes: insert offline nodes into node events Add a new chore to periodically insert nodes who are offline and have not gotten an offline email in a certain amount of time into node events Change-Id: I658b385bb777b0240c98092946a93d65bee94abc	2022-11-18 12:10:06 -05:00
Cameron	8681a36164	satellite/overlay: add ability to get offline nodes in need of email Add LastOfflineEmail to overlay.NodeDossier. This is the last time a node got an offline email. Add two new overlay db methods, GetOfflineNodesForEmail and UpdateLastOfflineEmail. Edit db method UpdateCheckIn to nullify last_offline_email if node is up. Change-Id: I1ee60e7d98dd1b68348a57f9a4fb77c6c9895d6d	2022-11-17 19:03:04 +00:00
Cameron	2a25974261	satellite/overlay: insert node software update events into node events When a node checks in and its version is below the minimum, insert BelowMinVersion event into node events Change-Id: I0e437ac34496778369515cbc40c15676da8b27ae	2022-11-11 20:43:56 +00:00
Cameron	d856569935	satellite/overlay: node software update email cooldown config Change-Id: I792eef4a570e38e94f79a3f73c204b65f86ab541	2022-11-11 20:14:25 +00:00
Cameron	cb0c359b81	satellite/overlay: insert DQ node events for stray nodes Change-Id: I99da11e506ab7f6bcebdb08a5815078a3297c932	2022-11-04 15:48:17 +00:00
Cameron	74ddfab810	satellite/overlay: insert DQ event into node events in overlay.DisqualifyNode Also, return node email from overlaycache db DisqualifyNode to be used in node events insertion Change-Id: I41534cf01351c1690c3966a8055c5fe6fcf0d6a6	2022-11-04 15:18:31 +00:00
Cameron	68fe26ebe5	satellite/overlay: insert reputation events into node events Insert reputation event into node events if reputation change occurs. Change-Id: If1c5526092cb6834fe2faa6aa6e0306d4d88a4b7	2022-11-02 18:32:20 +00:00
Cameron	865974950d	satellite/overlay: insert node online events into node events table Instead of sending emails at the time the node is seen to be back online, we have decided to send the event to the node events table, which will initiate the email sending process at some point. Change-Id: Id756209498112579de8e78ee20ad2df54571a617	2022-11-02 16:26:19 +00:00
Cameron	f06da25c3d	satellite/overlay: add nodeevents.DB to satellite overlay service Add nodeevents.DB to satellite overlay service so we can insert node events into the nodeevents DB. Change-Id: I642c0ccc9941ecdb08cb22d5c8cf701959a55156	2022-11-02 15:56:37 +00:00
Cameron	8c8688ca6b	satellite/overlay: return email as part of NodeReputation Make email accessible for email sending after reputation updates Change-Id: I760feee0f6ca58b76a2955a04c0c366c618656bb	2022-10-26 17:33:22 +00:00
Cameron	a2ca443e29	satellite/overlay: send Node Online emails Send an email when a node goes from offline to online. Change-Id: I82d3f9001b9b0669e096d784edf097ffdcb1697d	2022-10-20 18:56:35 +00:00
Cameron	a52f766273	satellite/overlay: add email-sending functionality to overlay service We want to send emails to SNOs. Node status changes go through the overlay service, so it's a good place to add the mail service. Add the mailservice.Service, satellite address, and satellite name to overlay service. Also add feature flag --overlay.send-node-emails Change-Id: I3bd2cb3bf22f9724954ce2374f8b651b902b3a24	2022-10-13 18:01:05 +00:00
Michal Niewrzal	a22e6bdf67	satellite/gc/bloomfilter: use int64 to count pieces Pieces count in DB are stored as int64 and we would like to align bloom filter processing with this type. Change-Id: Iaec767e609a40d802077ae057520541805a7c44f	2022-09-22 09:39:53 +00:00
Egon Elbre	cf50696745	cmd/tools/segment-verify: wire up overlay logic Change-Id: I0a4c737a8b0995a1c3e3adeac728fe833d0ce684	2022-09-19 11:32:18 +03:00
paul cannon	07bbe7d340	satellite/overlay: don't insert new nodes if contact check failed Currently, the satellite tracks connectivity information about all nodes that have contacted it, even if we have never successfully contacted the node back. This behavior was leveraged during a security audit to create hundreds of thousands of "junk nodes" in the nodes table on one satellite, which affected performance of queries such as node selection. With this change, we should no longer track information about nodes that have never been successfully contacted. Note that it will still be possible to cause the creation of "junk node" entries in the db; the attacker just has to set up individual publicly-routable IP+port pairs for each node as it is created, so it can respond to a PingBack. Change-Id: Ibb6da6cc908fd4fc85aae1ba00313ba2738409ab	2022-08-26 09:05:04 +00:00
JT Olio	12ef68bdf4	satellitedb: restore-trash shouldn't bother with nodes we've never talked to Change-Id: I3c631e92a9a2670c52c9fa23b2b1baf67d3c9a9c	2022-08-25 16:14:29 +00:00
paul cannon	0dcc0a9ee0	satellite/reputation: reconfigure lambda and alpha This is in response to community feedback that our existing reputation calculation is too likely to disqualify storage nodes unfairly with extreme swings up and down. For details and analysis, please see the data_loss_vs_dq_chance_sim.py tool, the "tuning reputation further.ipynb" Jupyter notebook in the storj/datascience repository, and the discussion at https://forum.storj.io/t/tuning-audit-scoring/14084 In brief: changing the lambda and initial-alpha parameters in this way causes the swings in reputation to be smaller and less likely to put a node past the disqualification threshold unfairly. Note: this change will cause a one-time reset of all (non-disqualified) node reputations, because the new initial alpha value of 1000 is dramatically different, and the disqualification threshold is going to be much higher. Change-Id: Id6dc4ba8fde1be3db4255b72282207bab5491ca3	2022-08-17 18:52:53 +00:00
paul cannon	2f20bbf4d8	satellite/reputation: add a reputation write cache This should lower the amount of database load coming from reputation updates. Change-Id: Iaacfb81480075261da77c5cc93e08b24f69f8949	2022-07-14 21:40:16 +00:00
Egon Elbre	51d4e5c275	satellite/{orders,overlay}: use cache for downloads Use DownloadSelectionCache to avoid querying database for every download. This change only addresses downloads from users. The download selection cache is not currently used for audit and repair. Change-Id: I96a49e121dac0b4204f97592a63131edabd73fb5	2022-07-12 11:04:34 +00:00
Egon Elbre	48b0a65fbd	satellite/overlay: use ReadCache in Download/UploadSelectionCache sync2.ReadCache implements preemptive refreshing preventing stalling while it's being updated. Change-Id: Iee9ef36049b986f0e426c14a139b2bc9ac17fb53	2022-07-12 13:52:48 +03:00
Egon Elbre	65b5a0fe82	satellite/{overlay,satellitedb}: add AS OF to download selection This should reduce the load caused by download selection queries. Change-Id: Ic1a89d9c3eb5418f0792eb20ec2aece18dc63f2c	2022-06-28 05:18:01 +00:00
paul cannon	4e81a60838	satellite/overlay: fix TestNodeSelectionGracefulExit This test had an effective config.Reputation.AuditCount = 0, meaning all nodes that had _any_ positive audit results were considered vetted. Because of that, only one node in the test setup was "new". And that node was marked as being in GE, so could not be returned by node selection. The reason the tests still worked is because of the node selection rule that says "if there are no new nodes at all, just get all reputable nodes to satisfy the request". This commit makes it so half of the nodes are vetted and half new, which makes the test somewhat more interesting (and means we aren't concentrating too much on testing details of behavior when AuditCount is 0). Change-Id: I09157b7dc20ecaddd2a6e60cfe146e9186e3603b	2022-06-17 18:28:04 +00:00
Paul Willoughby	911cc1e163	satellite/contact: reject privateIPs in PingMe and CheckIn endpoints prevent network enumeration by rejecting privateIPs in PingMe and Checkin endpoints Closes storj/storj-private#32 Change-Id: I63f00483ff4128ebd5fa9b7b8da826a5706748c9	2022-06-07 08:09:14 +00:00
Yaroslav Vorobiov	3f47d19aa6	satellite/overlay: add disqualification reason Add disqualification reason to NodeDossier. Extend DB.DisqualifyNode with disqualification reason. Extend reputation Service.TestDisqualifyNode with disqualification reason. Change-Id: I8611b6340c7f42ac1bb8bd0fd7f0648ad650ab2d	2022-04-20 13:29:31 +00:00
Yaroslav Vorobiov	4223fa01f8	satellite/reputation: add disqualification reason for status update Set disqualification reason when reputations stats are updated on DB.Update. Added tests for DisqualifyNode and for disqualification cases which happens during Update. Change-Id: I00130ab5d9722422805159ad2f183c205de60f7e	2022-04-20 13:29:10 +00:00
Fadila Khadar	29fd36a20e	satellite/repairer: handle excluded countries For nodes in excluded areas, we don't necessarily want to remove them from the pointer, but we do want to increase the number of pieces in the segment in case those excluded area nodes go down. To do that, we increase the number of pieces repaired by the number of pieces in excluded areas. Change-Id: I0424f1bcd7e93f33eb3eeeec79dbada3b3ea1f3a	2022-03-14 10:59:36 -04:00
Moby von Briesen	b2d342aa9b	satellite/overlay: Add ability to exclude country codes on upload Create global config to specify a list of country codes that should be excluded from node selection during uploads. This exclusion is not implemented when the upload selection cache is disabled. Change-Id: Ic41e8b4f18857a11045668eac23107da99668a72	2022-03-03 16:58:48 +00:00
Fadila Khadar	e776c65172	satellite/checker: pieces in excluded countries are not healthy Add a RepairExcludedCountryCodes config flag for overlay for providing a list of country codes to exclude nodes from target repair selection. Mark segments with less than repairThreshold pieces in countries not in the RepairExcludedCountryCodes as not healthy. With this change, the repair process is not affected. The segment will be removed from the repair queue by the repairer. Another change will handle the logic at the repairer level. Fixes https://github.com/storj/team-metainfo/issues/95 Change-Id: I9231b32de117a116488de055a3e94efcabb46e81	2022-03-02 09:59:09 +00:00
Egon Elbre	be02aa9b17	satellite/overlay: add test for UpdateCheckIn panic The database table got invalid input and the resulting error was not checked. This adds updates that contain invalid fields to trigger different errors. Change-Id: Iacea32cbef5599aab562c88e4113073596cc9996	2022-01-24 15:01:59 +02:00
Yingrong Zhao	1f8f7ebf06	satellite/{audit, reputation}: fix potential nodes reputation status inconsistency The original design had a flaw which can potentially cause discrepancy for nodes reputation status between reputations table and nodes table. In the event of a failure(network issue, db failure, satellite failure, etc.) happens between update to reputations table and update to nodes table, data can be out of sync. This PR tries to fix above issue by passing through node's reputation from the beginning of an audit/repair(this data is from nodes table) to the next update in reputation service. If the updated reputation status from the service is different from the existing node status, the service will try to update nodes table. In the case of a failure, the service will be able to try update nodes table again since it can see the discrepancy of the data. This will allow both tables to be in-sync eventually. Change-Id: Ic22130b4503a594b7177237b18f7e68305c2f122	2022-01-06 21:05:59 +00:00
Egon Elbre	a42b9d1a48	all: fix uses of email.com email.com is not a domain that should be used for examples nor tests. Change-Id: I654d4287d02633d5ed9740e81a79150470eeaf25	2022-01-05 16:29:19 +02:00
Márton Elek	76c2228fbd	satellite/metainfo: propagate geofencing between buckets and stream id Github: https://github.com/storj/storj/issues/4245 Change-Id: I83d34367aab1f3c0d46a044f54980b2d50174b19	2021-11-24 08:05:05 +00:00
Kaloyan Raev	f773bb80c8	mod: bump common to fetch latest placement type changes Change-Id: I3d0813f05622e706c9be1a578b5e4d4159d16dfc	2021-11-16 12:42:25 +00:00
Mya	bf51c286d9	satellite/geoip: update node check-in to associate a country code Resolves https://github.com/storj/storj/issues/4247 Change-Id: Idfd71bf1795d48ca3c686066bbdb95b9c6594f00	2021-11-10 16:44:41 +01:00
Egon Elbre	d5628740fd	satellite/overlay/straynodes: prevent disqualification in tests Currently the test threshold for unneeded node checkins are 30s. The disqualification threshold was at 30s, which means, it was possible for all the nodes to get disqualified. Hopefully fixes #4267 Change-Id: I6b0a10c09b7fd90a9729794885c9e7a593781bad	2021-11-09 12:29:21 +00:00
Márton Elek	fb604be460	satellite/gracefulexit: use nodecache with gracefulexit Change-Id: I700caf6dfd06bd3b3970a8cf7c5a79da4af27b5f	2021-11-05 19:14:14 +00:00
Cameron Ayer	1de8a695e8	satellite/{overlay,satellitedb}: fix stray nodes DQ bug We had a bug in the stray nodes chore where nodes who had not been seen in several months were not being DQd. We figured out that this was happening because we were using two queries: The first to grab nodes where last_contact_success < some cutoff, the second to DQ them unless last_contact_success == '0001-01-01 00:00:00+00'. The problem is that if all of the nodes returned from the first query had last_contact_success of '0001-01-01 00:00:00+00', we would pass them to the second query which would not DQ them. This would result in the stray nodes DQ loop ending since we found a number of nodes to DQ less than the limit. The fix: add the "WHERE last_contact_success != '0001-01-01 00:00:00+00'::timestamptz" to the selection query. Change-Id: I4e60de90b68d8745d641b4467c2b23e0e56f7dff	2021-11-02 17:05:00 +00:00
Egon Elbre	f2d8e97d97	satellite/satellitedb: simplify select nodes query construction Change-Id: I07009b28762d4485929a2a999e8f4be8179bee51	2021-10-22 07:41:07 +00:00
Egon Elbre	51f488fc4b	satellite/satellitedb: fix selection query with AOST Union query for both reputable and new nodes didn't properly work. The top level query is required to have an `AS OF SYSTEM TIME` statement as well. Change-Id: I8ee6dd5b700c2b1ed2aa562962bfa72be7eec30a	2021-10-19 16:59:40 +03:00
Cameron Ayer	bb21551a9c	satellite/satellitedb: remove references to contained column in nodes table We don't use this column for anything. If you want to know if a node is contained, you can check the pending_audits table. Change-Id: I8da1d8e01a2dcaff63c5067a7927b5451424ad04	2021-10-14 19:17:46 +00:00
Yingrong Zhao	e00b573f8f	satellite/overlay: fix UpdateCheckIn comment Change-Id: I8584895249d7e5be6dbec79974fc9f77b7a1930c	2021-08-06 05:54:00 +00:00
Egon Elbre	65804801ec	all: fix mon.Task leak Change-Id: Ifd58c7ac5631b9c3c750b3f4cc50525167e90709	2021-08-05 14:07:45 +03:00
Yingrong Zhao	ae02f6deda	satellite/reputation: return default reputation stats when node is not found Change-Id: I587d0ab36ffa0efaf345a6a6e221ae5d2068e1c5	2021-08-04 19:34:54 +00:00
Yingrong Zhao	646ce5b8cc	satellite/overlay: remove reputation logic from overlay Change-Id: I3492860e4537c7a8e4e824ec4c9c8d179134a0c0	2021-07-28 15:15:28 -04:00
Yingrong Zhao	f8914ccce0	satellite/{repair, overlay}: use reputation store in repair Change-Id: I48db9e68f48239d48621ccc77d33618ecb83ce1a	2021-07-28 13:22:05 -04:00
Yingrong Zhao	e91574cee1	satellite/{reputation, gracefulexit}: use reputation store in gracefulexit With the effort to move audit related data into reputation store, this PR updates gracefulexit endpoint to use reputation service to get a node's audit score Change-Id: Iad93ea689ad67ff9c57c7be16687e21e715fab7a	2021-07-28 13:21:41 -04:00
Yingrong Zhao	6c7bf357cd	satellite/{reputation,audit,overlay}: replace overlay with reputation package in audit This PR implements reputation store and replace overlay in audit service to use such store for storing node's audit stats. In order to keep the changeset smaller, most of the changes in this PR is for copying audit logic in overlay to reputation package. In a following PR, the duplicating code will be removed from overlay. Change-Id: I16c12494a0970f44c422b26cf603c1dc489e5bc1	2021-07-28 13:10:48 -04:00
Cameron Ayer	8c124c6fa4	satellite/{reputation,overlay,satellitedb}: create reputation service, DB, add overlay method UpdateReputation Define service and DB interface for storing node reputation data and updating the overlay cache. Add overlay service and DB method UpdateReputation. See https://github.com/storj/storj/pull/4144 Change-Id: Iedd8bd3274457d26c595919303d55327c1464b8c	2021-06-24 16:19:15 +00:00
Egon Elbre	59e3b586e7	satellite/{gracefulexit,overlay}: enable as of system time queries Change-Id: I2af5eb0e8a51fca7893ce07b78b5633be71dfef8	2021-06-22 11:50:50 +00:00
Jeremy Wharton	8a070e7c25	satellite/overlay: Ignore unnecessary check-ins This prevents the database from being contacted unnecessarily, reducing load. Change-Id: Ib2420f68a20636ec35eb3dd3df8e02bd5341b419	2021-06-22 09:00:41 +00:00
JT Olio	da9ca0c650	testplanet/satellite: reduce the number of places default values need to be configured Satellites set their configuration values to default values using cfgstruct, however, it turns out our tests don't test these values at all! Instead, they have a completely separate definition system that is easy to forget about. As is to be expected, these values have drifted, and it appears in a few cases test planet is testing unreasonable values that we won't see in production, or perhaps worse, features enabled in production were missed and weren't enabled in testplanet. This change makes it so all values are configured the same, systematic way, so it's easy to see when test values are different than dev values or release values, and it's less hard to forget to enable features in testplanet. In terms of reviewing, this change should be actually fairly easy to review, considering private/testplanet/satellite.go keeps the current config system and the new one and confirms that they result in identical configurations, so you can be certain that nothing was missed and the config is all correct. You can also check the config lock to see what actual config values changed. Change-Id: I6715d0794887f577e21742afcf56fd2b9d12170e	2021-06-01 22:14:17 +00:00
Egon Elbre	10372afbe4	ci: fix lint errors Change-Id: Ib5893440807811f77175ccd347aa3f8ca9cccbdf	2021-05-17 13:37:31 +00:00
Egon Elbre	8f15f975a2	satellite/overlay: improve contended update checkin Improve UpdateCheckIn on a contended row: name old time/op new time/op delta UpdateCheckInContended-100x-32 2.29s ±55% 0.17s ±61% -92.45% (p=0.008 n=5+5) Change-Id: I053ab9f1cff136c306e5fb57f5e355cdc0269a8c	2021-05-16 20:41:12 +03:00
Egon Elbre	0858c3797a	satellite/{metabase,satellitedb}: deduplicate AS OF SYSTEM TIME code Currently we were duplicating code for AS OF SYSTEM TIME in several places. This replaces the code with using a method on dbutil.Implementation. As a consequence it's more useful to use a shorter name for implementation - 'impl' should be sufficiently clear in the context. Similarly, using AsOfSystemInterval and AsOfSystemTime to distinguish between the two modes is useful and slightly shorter without causing confusion. Change-Id: Idefe55528efa758b6176591017b6572a8d443e3d	2021-05-11 12:40:36 +03:00
Egon Elbre	d2033c2f52	satellite/nodeselection/uploadselection: rename package Currently nodeselection package only contained state for uploads, move these to a subpackage, such that we can make another "downloadselection" for downloads. Then move selection logic from overlay to nodeselection. Change-Id: I0fc42bcae3a29db2728dae9f3863b1e95bf5165b	2021-05-04 15:50:00 +00:00
Egon Elbre	961e841bd7	all: fix error naming errs.Class should not contain "error" in the name, since that causes a lot of stutter in the error logs. As an example a log line could end up looking like: ERROR node stats service error: satellitedbs error: node stats database error: no rows Whereas something like: ERROR nodestats service: satellitedbs: nodestatsdb: no rows Would contain all the necessary information without the stutter. Change-Id: I7b7cb7e592ebab4bcfadc1eef11122584d2b20e0	2021-04-29 15:38:21 +03:00
Cameron Ayer	a0c5da6643	satellite/satellitedb: in stray nodes DQ, don't DQ nodes where last_contact_success = '0001-01-01 00:00:00+00' When nodes check in for the very first time, if the satellite can't ping them back, they are inserted into the nodes table with last_contact_success of '0001-01-01 00:00:00+00'. If the stray nodes chore runs before the node can fix their problem, they are DQd. Solution: when DQing stray nodes, don't DQ where last_contact_success = '0001-01-01 00:00:00+00'::timestamptz Change-Id: I477a02d5ef85b2c930ed6b7d99a4d1995169bca8	2021-04-22 10:13:13 -04:00
Egon Elbre	267506bb20	satellite/metabase: move package one level higher metabase has become a central concept and it's more suitable for it to be directly nested under satellite rather than being part of metainfo. metainfo is going to be the "endpoint" logic for handling requests. Change-Id: I53770d6761ac1e9a1283b5aa68f471b21e784198	2021-04-21 15:54:22 +03:00
Jeff Wendling	a65aecfd98	compensation: always generate invoices for every node instead of only generating invoices for nodes that had some activity, we generate it for every node so that we can find and pay terminal nodes that did not meet thresholds before we recognized them as terminal. Change-Id: Ibb3433e1b35f1ddcfbe292c034238c9fa1b66c44	2021-03-29 14:15:45 +00:00
Egon Elbre	d57873fd45	satellite/overlay: remove Inspector Currently overlay.Inspector had two rpc methods and both of them were unimplemented. Change-Id: I1a2ecc7b7113898fa234a1c1fe451c8cc9e2ee81	2021-03-29 12:26:10 +03:00
Egon Elbre	86e698f572	pb: use *UnimplementedServer to avoid breaking API changes Change-Id: I99a34eeb37ac4453411f273511710562a519f57a	2021-03-29 12:26:10 +03:00
Cameron Ayer	05f8d2d0b1	satellite/satellitedb: filter offline suspended nodes from selection Change-Id: I5a6f413453332238d579a7bf50eb30e9156f96c2	2021-03-27 23:36:46 +00:00
Cameron Ayer	1a51049ac0	satellite/{overlay,satellitedb}: add flag to toggle suspending nodes for offline audits This change introduces a new config flag, --overlay.audit-history.offline-suspension-enabled, to toggle suspending nodes for offline audits. If the flag is set to true, nodes will be suspended if they meet the requirements. If the flag is false, nodes will not be suspended. If they are already suspended and/or under review, these will be cleared. Change-Id: Ibeba759c42d6e504f6b7598120d4fd4dab85ca74	2021-03-27 16:28:27 +00:00
Cameron Ayer	eb44dc21b4	satellite/satellitedb: select stray nodes for DQ in separate tx from update Previously we would select a limited number of nodes for DQ in a CTE and run the update on that set in a single transaction. This could lead to locking on the table, so instead we select and update in separate transactions. Change-Id: I1e802c0845e829eeadcee4fa382f58462515fdb1	2021-03-27 00:00:23 +00:00
Michał Niewrzał	237782813b	Merge remote-tracking branch 'origin/multipart-upload' Change-Id: If6c5a450b238adab55d1e0dea67d01e5f5768a9f	2021-03-23 09:44:49 +01:00
Cameron Ayer	864ad70fe2	satellite/overlay/straynodes: set --stray-nodes.enable-dq release default to true Since we will enable this on all satellites, just set default to true Change-Id: Ibc86a0afd0b0f57e86bd067abb9cdf06c295a467	2021-03-22 17:25:09 +00:00
Cameron Ayer	2607b16070	satellite/{overlay/straynodes,satellitedb}: rework DQNodesLastSeenBefore to return DQd node IDs and last contact successes We would like to log Node IDs and last contact successes of nodes DQd in this manner. We would also like to avoid returning an unbounded list of items from the db. Therefore we change the query to select a limited number of nodes that meet the DQ conditions and iterate until 0 rows are returned. Each column of the query is already indexed. Change-Id: Iaec2d9b56e7202b7c2028ba21750d40c8dd506ee	2021-03-22 13:01:30 -04:00
Michał Niewrzał	d995fb497f	Merge remote-tracking branch 'origin/main' into multipart-upload Change-Id: I367da03351ab80f7343332420490dde9282aa47a	2021-02-23 12:31:31 +01:00
Cameron Ayer	549033f2e6	satellite/satellitedb: don't include DQd and exited nodes in DQStrayNodes Don't update DQ time of already DQd nodes. Don't DQ nodes who exited. Change-Id: I4528a9ba9f8e278987165ad337a9b34dadb9788b	2021-02-19 15:12:30 -05:00
Egon Elbre	1137620baf	satellite/satellitedb: move tests to their domains Testing interfaces is slightly clearer when it's in the package needing the database rather than each individual implementation. Change-Id: I10334c214a205f7e510b939b4359a2214c4e060a	2021-02-19 17:29:15 +02:00
JT Olio	b2ed7edd30	cmd/satellite: restore-trash parallel workers Change-Id: Ic7466b21c20bda334e7ba4268a494e96b6528ac1	2021-02-18 19:11:19 +02:00
JT Olio	3ae3389ddc	cmd/satellite: restore-trash command Change-Id: I80fc932c12147692d49cde277784871ac611fcad	2021-02-18 09:19:22 -07:00
Michał Niewrzał	12402eb729	Merge remote-tracking branch 'origin/main' into multipart-upload Change-Id: I38adf8218c1415c7ea1910f8bd6bed13544b0f03	2021-02-17 08:50:38 +01:00
Egon Elbre	f7ad86521e	satellite/overlay: fix data race in TestAuditHistoryBasic Change-Id: I196f10973fe10b10b226ac3a63e62bf4fe9c256b	2021-02-16 12:32:19 +02:00
Michał Niewrzał	908a96ae30	Merge remote-tracking branch 'origin/main' into multipart-upload Change-Id: I075aaff42ca3f5dc538356cedfccd5939c75e791	2021-02-11 11:48:23 +01:00
Yaroslav Vorobiov	966535e9de	{storagenode,satellite}/nodeoperator: add wallet features Change-Id: Iac7eb40a52b8fddcc573aebaad2e3a30a10cded9	2021-02-08 22:09:45 +02:00
Kaloyan Raev	d0612199f0	Merge remote-tracking branch 'origin/main' into multipart-upload Conflicts: go.mod go.sum satellite/metainfo/config.go satellite/metainfo/metainfo_test.go Change-Id: I95cf3c1d020a7918795b5eec63f36112fdb86749	2021-02-01 14:32:12 +02:00
Egon Elbre	b7a0739219	satellite/overlay: use DownloadSelectionCache for getting node IPs Change-Id: Ib8f4eedb2bf465767050693a1e961b37a294ca06	2021-01-29 16:47:10 +02:00
Egon Elbre	54e01d37f9	satellite/overlay: add DownloadSelectionCache Change-Id: Ic0779280172325f8d03f55a2e9673722f72bdd44	2021-01-29 16:47:06 +02:00
Egon Elbre	19e3dc4ec0	satellite/overlay: rename NodeSelectionCache to UploadSelectionCache It wasn't obvious that NodeSelectionCache was only for uploads. Change-Id: Ifeeaa6fdb50a4b7916245b48d8634d70ac54459c	2021-01-28 14:56:53 +02:00
littleskunk	0b2568d712	satellite/overlay/straynodes: increase development duration without contact Stopping storj-sim for over 5 minutes caused nodes to be disqualified. Set development max duration without contact to 300 days.	2021-01-26 12:24:39 +02:00
Kaloyan Raev	c24ada7114	Merge remote-tracking branch 'origin/main' into multipart-upload Conflicts: go.mod go.sum Change-Id: Icf7c029e9d800e5f6a9fdd208c36f28e05468690	2021-01-20 17:35:57 +02:00
Cameron Ayer	d14607a5f7	satellite/{contact,nodestats,overlay,satellitedb}: remove references to total_uptime_count and uptime_success_count columns Change-Id: I1f92022909bc564e9b1e31bf937fdfe7c16554de	2021-01-19 15:43:02 -05:00
Cameron Ayer	75d828200c	private,satellite: add chore to dq stray nodes Full scope: private/testplanet,satellite/{overlay,satellitedb} Description: In most cases, downtime tracking with audits will eventually lead to DQ for nodes who are unresponsive. However, if a stray node has no pieces, it will not be audited and will thus never be disqualified. This chore will check for nodes who have not successfully been contacted in some set time and DQ them. There are some new flags for toggling DQ of stray nodes and the timeframes for running the chore and how long nodes can go without contact. Change-Id: Ic9d41fdbf214736798925e728245180fb3c55615	2021-01-19 14:21:56 -05:00
Kaloyan Raev	6dff40f5c5	Merge remote-tracking branch 'origin/main' into multipart-upload Conflicts: go.mod go.sum satellite/metainfo/metainfo.go Change-Id: Ib5c49f3c911c58319855a171f9ce73657da976d9	2021-01-14 14:33:59 +02:00
Egon Elbre	85fb964afe	satellite/{metainfo,overlay}: improvements to GetObjectIPs * Deduplicate NodeID list prior to fetching IPs. * Use NodeSelectionCache for fetching reliable IPs. * Return number of segments, reliable pieces and all pieces. Change-Id: I13e679caab275488b4037624b840a4068dad9589	2021-01-14 09:12:45 +00:00
Cameron Ayer	0403e99a5b	satellite/{overlay,satellitedb}: remove unused methods for old downtime tracking GetSuccessfulNodeNotCheckedInSince and GetOfflineNodesLimited are overlay methods which were only used by the previous downtime tracking system which has been removed. These methods should also be removed. Change-Id: Idb829d742e1f987e095604423fff656fe581183e	2021-01-11 15:21:28 +00:00
Michał Niewrzał	ec88d21a3c	Merge 'main' branch. Change-Id: I6e8162d1a6caf75e89c9f9c9f9522730aebf83ae	2021-01-11 10:26:58 +01:00
Moby von Briesen	6e2ef3b9ee	Revert "satellite/satellitedb: Do not consider nodes with offline_suspended as reputable." This reverts commit `e24262c2c9`. Change-Id: I287deb2e52d03bbd698ed055f0f216b0b5bf2798	2021-01-04 14:28:37 +00:00
Michał Niewrzał	ad3e3a38c5	Merge 'main' branch Change-Id: Ia0db1b1f9ef3e0671d3f2208881b0abc3064e200	2021-01-04 12:13:45 +01:00
Moby von Briesen	edbee53888	satellite,storagenode: Pass audit history over GetStats endpoint Full prefix: satellite/{overlay,nodestats},storagenode/{reputation,nodestats} Allow the storagenode to receive its audit history data from the satellite via the satellite's GetStats endpoint. The storagenode does not save this data for use in the API yet. Change-Id: I9488f4d7a4ccb4ccf8336b8e4aeb3e5beee54979	2020-12-30 19:13:26 +00:00
Moby von Briesen	825dc71227	satellite/{overlay, satellitedb}: Refactor audit history * Separate audit history interface into its own file in the overlay package * Add overlay.AuditHistory struct so that internalpb.AuditHistory is only used from within the database layer * Add overlay.GetAuditHistory function for features that will require access to detailed audit history information * Do not return full audit history from UpdateAuditHistory - callers to that function only need to know the online score and whether a full tracking period has been completed * Move audit history tests out of satellite/satellitedb, since they are independent of database implementation Change-Id: I35b0c4ac23bbaabd80624f8a9631c3cb1a1f33bd	2020-12-29 18:50:22 +00:00
Moby von Briesen	e24262c2c9	satellite/satellitedb: Do not consider nodes with offline_suspended as reputable. Nodes which are offline_suspended will no longer be considered for new uploads. The current threshold that enters a node into offline suspension is 0.6. Disqualification for offline suspension is still disabled. Change-Id: I0da9abf47167dd5bf6bb21e0bc2186e003e38d1a	2020-12-29 17:59:09 +00:00
Ethan Adams	6070018021	satellite/overlay: use AS OF SYSTEM TIME with Cockroach Query nodes table using AS OF SYSTEM TIME '-10s' (by default) when on CRDB to alleviate contention on the nodes table and minimize CRDB retries. Queries for standard uploads are already cached, and node lookups for graceful exit uploads has retry logic so it isn't necessary for the nodes returned to be current.	2020-12-22 21:07:07 +02:00
Kaloyan Raev	9aa61245d0	satellite/audits: migrate to metabase Change-Id: I480c941820c5b0bd3af0539d92b548189211acb2	2020-12-17 14:38:48 +02:00

264 Commits