stats/size/count is not used by any production code, and it's not required, as we can assert the state with other checks.
real motivation: the next commits will make the Selector of the State configurable; therefore we won't have one single Stat, as it will depend on the request parameters.
(we plan to support both network- and ID-based randomization)
Change-Id: I631828fc0046d2fef5b7a674fc0268a0446e9655
Currently, we have an issue where, while counting unhealthy pieces, we
count a piece twice when it is in an excluded country and outside the
segment placement. This can cause unnecessary repair.
This change also takes another step toward moving
RepairExcludedCountryCodes from the overlay config into the repair
package.
Change-Id: I3692f6e0ddb9982af925db42be23d644aec1963f
placement.AllowedCountry is the old way to specify placement; with the new approach we can use a more generic (dynamic) method, which can check full node information instead of just the country code.
90% of this patch is just search and replace:
* we need to use NodeFilters instead of placement.AllowedCountry
* which means we need an initialized PlacementRules available everywhere
* which means we need to configure the placement rules
The remaining 10% is placement.go, where we introduced a new type of configuration (a lightweight expression language) to define any kind of placement without code changes.
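To illustrate the direction only (the types and names below are simplified assumptions, not the exact production API): a node filter is just a predicate over the full node record, and an allowed-country check becomes one implementation among many.

    package main

    import "fmt"

    // SelectedNode is a simplified stand-in for the full node record.
    type SelectedNode struct {
        ID          string
        CountryCode string
        LastNet     string
    }

    // NodeFilter is a predicate over the whole node, not just its country.
    type NodeFilter interface {
        Match(node *SelectedNode) bool
    }

    // CountryFilter reproduces the old placement.AllowedCountry behavior.
    type CountryFilter struct{ allowed map[string]bool }

    func (f CountryFilter) Match(node *SelectedNode) bool {
        return f.allowed[node.CountryCode]
    }

    // NodeFilters combines filters; a node must match all of them.
    type NodeFilters []NodeFilter

    func (fs NodeFilters) Match(node *SelectedNode) bool {
        for _, f := range fs {
            if !f.Match(node) {
                return false
            }
        }
        return true
    }

    func main() {
        eu := NodeFilters{CountryFilter{allowed: map[string]bool{"DE": true, "FR": true}}}
        fmt.Println(eu.Match(&SelectedNode{ID: "n1", CountryCode: "DE"})) // true
    }

The expression-based configuration can then be parsed into chains of such filters, which is what allows new placements to be defined without code changes.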
Change-Id: Ie644b0b1840871b0e6bbcf80c6b50a947503d7df
All the files in uploadselection are (in fact) related to generic node selection, and are used not only for upload,
but also for download, repair, etc.
Change-Id: Ie4098318a6f8f0bbf672d432761e87047d3762ab
ReliabilityCache will now use the refactored overlay Reliable method.
This method will provide more info about nodes (e.g. country code) and
with this we are able to add two dedicated methods to classify pieces:
* OutOfPlacementPieces
* PiecesNodesLastNetsInOrder
With those new methods we will fix the issue where an offline but
reliable node wasn't checked for clumped pieces and out-of-placement
pieces.
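Roughly the shape of those helpers, with simplified stand-in types (not the exact signatures in the checker):

    package main

    import "fmt"

    // Piece and nodeInfo are simplified stand-ins for the real types.
    type Piece struct {
        Number int
        NodeID string
    }

    type nodeInfo struct {
        CountryCode string
        LastNet     string
    }

    type reliabilityCache struct {
        nodes map[string]nodeInfo // cached per-node metadata from Reliable
    }

    // outOfPlacementPieces returns pieces whose node does not satisfy the
    // placement (here reduced to an allowed-country check).
    func (c *reliabilityCache) outOfPlacementPieces(pieces []Piece, allowed map[string]bool) []Piece {
        var out []Piece
        for _, p := range pieces {
            if info, ok := c.nodes[p.NodeID]; ok && !allowed[info.CountryCode] {
                out = append(out, p)
            }
        }
        return out
    }

    // piecesNodesLastNetsInOrder returns the cached last_net per piece, in
    // piece order, so the caller can detect clumped pieces on one network.
    func (c *reliabilityCache) piecesNodesLastNetsInOrder(pieces []Piece) []string {
        nets := make([]string, len(pieces))
        for i, p := range pieces {
            nets[i] = c.nodes[p.NodeID].LastNet
        }
        return nets
    }

    func main() {
        cache := &reliabilityCache{nodes: map[string]nodeInfo{
            "a": {CountryCode: "DE", LastNet: "10.0.0"},
            "b": {CountryCode: "RU", LastNet: "10.0.0"},
        }}
        pieces := []Piece{{Number: 0, NodeID: "a"}, {Number: 1, NodeID: "b"}}
        fmt.Println(cache.outOfPlacementPieces(pieces, map[string]bool{"DE": true}))
        fmt.Println(cache.piecesNodesLastNetsInOrder(pieces))
    }

Because the cache now holds metadata for offline-but-reliable nodes as well, those nodes are no longer skipped by these checks.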
https://github.com/storj/storj/issues/5998
Change-Id: I9ffbed9f07f4881c9db3bd0e5f0412f1a418dd82
Currently we are using Reliable to get missing pieces for the repair
checker. The issue is that the checker now looks at more things than
just missing pieces (clumped and out-of-placement pieces), and using
only node IDs is not enough. We have an issue where we are skipping
offline nodes in the clumped and out-of-placement pieces checks.
Reliable was refactored to get data (e.g. country, lastNet) about all
reliable nodes. The list is split into online and offline. This data
will be cached for quick use by the repair checker. It will also be
possible to check node metadata like country code or lastNet.
We are also slowly moving the `RepairExcludedCountryCodes` config from
overlay to repair, where it makes more sense.
This is the first part of the changes.
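The new shape is roughly the following (signatures simplified and assumed, not copied from the code):

    package main

    import (
        "fmt"
        "time"
    )

    // SelectedNode is a simplified stand-in for the overlay node record.
    type SelectedNode struct {
        ID          string
        LastNet     string
        CountryCode string
        LastContact time.Time
    }

    // reliable splits all reliable (not disqualified, not exited) nodes into
    // online and offline sets, keeping metadata such as country and last_net
    // so the checker can run clumped/placement checks on both sets.
    func reliable(all []SelectedNode, onlineWindow time.Duration, now time.Time) (online, offline []SelectedNode) {
        for _, node := range all {
            if now.Sub(node.LastContact) <= onlineWindow {
                online = append(online, node)
            } else {
                offline = append(offline, node)
            }
        }
        return online, offline
    }

    func main() {
        now := time.Now()
        nodes := []SelectedNode{
            {ID: "a", LastContact: now.Add(-time.Minute)},
            {ID: "b", LastContact: now.Add(-5 * time.Hour)},
        }
        online, offline := reliable(nodes, 4*time.Hour, now)
        fmt.Println(len(online), len(offline)) // 1 1
    }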
https://github.com/storj/storj/issues/5998
Change-Id: If534342488c0e440affc2894a8fbda6507b8959d
We use two different Node types in `overlay` and `uploadnodeselection` and convert back and forth between them.
Using the same object would allow us to use a unified node selection interface everywhere.
Change-Id: Ie71e29d60184ee0e5b4547eb54325f09c418f73c
Currently we are using KnownUnreliableOrOffline to get missing pieces
for the segment repairer (GetMissingPieces). The issue is that the
repairer now looks at more things than just missing pieces (clumped and
out-of-placement pieces).
KnownReliable was refactored to get data (e.g. country, lastNet) about
all reliable nodes from the provided list. The list is split into
online and offline. This way we will be able to use the results from
this method for all checks: missing pieces, clumped pieces, and
out-of-placement pieces.
This is the first part of the changes to handle different kinds of
pieces in the segment repairer.
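As a sketch of how the split could feed the repairer's checks (the rules here are simplified assumptions; only the mechanics are illustrated):

    package main

    import "fmt"

    // Piece is a simplified stand-in for a segment piece.
    type Piece struct {
        Number int
        NodeID string
    }

    // classify: pieces on nodes that are not online count as missing, while
    // pieces on any known reliable node (online or offline) still have
    // metadata available for the clumped and out-of-placement checks.
    func classify(pieces []Piece, online, offline map[string]bool) (missing, checkable []Piece) {
        for _, p := range pieces {
            if !online[p.NodeID] {
                missing = append(missing, p)
            }
            if online[p.NodeID] || offline[p.NodeID] {
                checkable = append(checkable, p)
            }
        }
        return missing, checkable
    }

    func main() {
        online := map[string]bool{"a": true}
        offline := map[string]bool{"b": true}
        pieces := []Piece{{0, "a"}, {1, "b"}, {2, "c"}}
        missing, checkable := classify(pieces, online, offline)
        fmt.Println(len(missing), len(checkable)) // 2 2
    }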
https://github.com/storj/storj/issues/5998
Change-Id: I6cbaf59cff9d6c4346ace75bb814ccd985c0e43e
We would like to verify whether nodes match a specific placement, e.g.
to validate that segment pieces are correctly geofenced.
https://github.com/storj/storj/issues/5896
Change-Id: I842767dccc121a3c60224f677ab55e5dc150c76e
It was surprising that `satellite auditor` complained about SMTP mail settings, even though it's not supposed to send any mail.
Looks like we can remove the mail service dependency, as it's not a hard requirement for overlay.Service.
Change-Id: I29a52eeff3f967ddb2d74a09458dc0ee2f051bd7
Up to now, we have been implementing the DistinctIP preference with code
in two places:
1. On check-in, the last_net is determined by taking the /24 or /64
(in ResolveIPAndNetwork()) and we store it with the node record.
2. On node selection, a preference parameter defines whether to return
results that are distinct on last_net.
It can be observed that we have never yet had the need to switch from
DistinctIP to !DistinctIP, or from !DistinctIP to DistinctIP, on the
same satellite, and we will probably never need to do so in an automated
way. It can also be observed that this arrangement makes tests more
complicated, because we often have to arrange for test nodes to have IP
addresses in different /24 networks (a particular pain on macOS).
Those two considerations, plus some pending work on the repair framework
that will make repair take last_net into consideration, motivate this
change.
With this change, in the #2 place, we will _always_ return results that
are distinct on last_net. We implement the DistinctIP preference, then,
by making the #1 place (ResolveIPAndNetwork()) more flexible. When
DistinctIP is enabled, last_net will be calculated as it was before. But
when DistinctIP is _off_, last_net can be the same as address (IP and
port). That will effectively implement !DistinctIP because every
record will have a distinct last_net already.
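A hedged sketch of that rule (simplified, not the actual ResolveIPAndNetwork() code):

    package main

    import (
        "fmt"
        "net"
    )

    // lastNet: with DistinctIP enabled the address collapses to its /24 (IPv4)
    // or /64 (IPv6) network; with DistinctIP off the full ip:port is kept, so
    // every record already has a distinct last_net.
    func lastNet(ipPort string, distinctIP bool) (string, error) {
        if !distinctIP {
            return ipPort, nil
        }
        host, _, err := net.SplitHostPort(ipPort)
        if err != nil {
            return "", err
        }
        ip := net.ParseIP(host)
        if ip == nil {
            return "", fmt.Errorf("not an IP address: %q", host)
        }
        if ip4 := ip.To4(); ip4 != nil {
            return ip4.Mask(net.CIDRMask(24, 32)).String(), nil
        }
        return ip.Mask(net.CIDRMask(64, 128)).String(), nil
    }

    func main() {
        fmt.Println(lastNet("203.0.113.7:28967", true))  // 203.0.113.0 <nil>
        fmt.Println(lastNet("203.0.113.7:28967", false)) // 203.0.113.7:28967 <nil>
    }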
As a side effect, this flexibility will allow us to change the rules
about last_net construction arbitrarily. We can do tests where last_net
is set to the source IP, or to a /30 prefix, or a /16 prefix, etc., and
be able to exercise the production logic without requiring a virtual
network bridge.
This change should be safe to make without any migration code, because
all known production satellite deployments use DistinctIP, and the
associated last_net values will not change for them. They will only
change for satellites with !DistinctIP, which are mostly test
deployments that can be recreated trivially. For those satellites which
are both permanent and !DistinctIP, node selection will suddenly start
acting as though DistinctIP is enabled, until the operator runs a single
SQL update "UPDATE nodes SET last_net = last_ip_port". That can be done
either before or after deploying software with this change.
I also assert that this will not hurt performance for production
deployments. It's true that adding the distinct requirement to node
selection makes things a little slower, but the distinct requirement is
already present for all production deployments, and they will see no
change.
Refs: https://github.com/storj/storj/issues/5391
Change-Id: I0e7e92498c3da768df5b4d5fb213dcd2d4862924
we have two more fields in the database (noise_proto and
noise_public_key) that now need to go into pb.NodeAddress when
returning AddressedOrderLimits.
the only real complication is making sure type conversions between
database types and NodeURLs and so on don't lose this new
pb.NodeAddress field (NoiseInfo). otherwise this is a relatively
straightforward commit
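the shape of the concern, with simplified stand-in types (the real definitions live in storj.io/common/pb and may differ):

    package main

    import "fmt"

    // simplified stand-ins for the protobuf types; field names are assumptions.
    type NoiseInfo struct {
        Proto     int
        PublicKey []byte
    }

    type NodeAddress struct {
        Address string
        Noise   *NoiseInfo
    }

    // nodeRecord mimics what the satellite database stores per node.
    type nodeRecord struct {
        Address        string
        NoiseProto     int
        NoisePublicKey []byte
    }

    // toNodeAddress is the kind of conversion that must carry the noise fields
    // along when building addressed order limits, instead of dropping them.
    func toNodeAddress(rec nodeRecord) NodeAddress {
        addr := NodeAddress{Address: rec.Address}
        if len(rec.NoisePublicKey) > 0 {
            addr.Noise = &NoiseInfo{Proto: rec.NoiseProto, PublicKey: rec.NoisePublicKey}
        }
        return addr
    }

    func main() {
        rec := nodeRecord{Address: "node.example:28967", NoiseProto: 1, NoisePublicKey: []byte{0x01}}
        fmt.Printf("%+v\n", toNodeAddress(rec))
    }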
Change-Id: I45b59d7b2d3ae21c2e6eb95497f07cd388d454b3
The CASE expression used to determine which value to set
last_software_update_email to did not have an ELSE clause. Therefore,
when the node is below the minimum version but did not receive a
version update email (so no condition is true), the value would be set
to NULL.
Additionally, replace `time.Now()` with `timestamp` in the check to
determine if the email cooldown has passed.
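For illustration only (the column is the one described above, but the query text and conditions here are simplified assumptions), the general shape of the fix is adding an ELSE branch that keeps the current value:

    package main

    import "fmt"

    func main() {
        // Without the ELSE branch, a row matching none of the WHEN conditions
        // would get last_software_update_email = NULL; the ELSE keeps the
        // previously stored value instead.
        query := `
            UPDATE nodes SET
                last_software_update_email = CASE
                    WHEN $1 = true THEN $2          -- version email sent now: record the timestamp
                    ELSE last_software_update_email -- otherwise keep the existing value
                END
            WHERE id = $3`
        fmt.Println(query)
    }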
Change-Id: I2e2e93f1a865e123ed8b665be9621cebfb72236f
`overlay.(*Service).UpdateReputation()` takes a "reputationChanges"
parameter, a slice of node events indicating whether we think the node's
disqualification or suspension status is changing. This is necessary so
that the overlay service can notify the nodeevents DB about these
changes.
In several cases, however, this list of events is not constructed
correctly, because of missing information about the previous state.
In most cases, this is because the node was offline, and the order limit
creation functions (which usually obtain and return the prior reputation
status) ignored that node.
This change makes it so that all callers to
`overlay.(*Service).UpdateReputation()` can be expected to provide a
correct list of change events (as correct as feasible, given that we
can't lock the node's information in the database during the entire
operation).
It ended up that there was only one caller we needed to worry about, and
that was reputation.(*Service).ApplyAudit(). So the bulk of this change
is teaching that function how to recognize when the prior reputation
status was not filled in, and fill it in.
Refs: https://github.com/storj/storj/issues/5464
Change-Id: I52ce385fc9c0ce3b283b998d517998e7f4ec8792
Add LastOfflineEmail to overlay.NodeDossier. This is the last time a
node got an offline email. Add two new overlay db methods,
GetOfflineNodesForEmail and UpdateLastOfflineEmail. Edit db method
UpdateCheckIn to nullify last_offline_email if node is up.
Change-Id: I1ee60e7d98dd1b68348a57f9a4fb77c6c9895d6d
When a node checks in and its version is below the minimum, insert
BelowMinVersion event into node events
Change-Id: I0e437ac34496778369515cbc40c15676da8b27ae
Instead of sending emails at the time the node is seen to be back
online, we have decided to send the event to the node events table,
which will initiate the email sending process at some point.
Change-Id: Id756209498112579de8e78ee20ad2df54571a617
Add nodeevents.DB to satellite overlay service so we can insert node
events into the nodeevents DB.
Change-Id: I642c0ccc9941ecdb08cb22d5c8cf701959a55156
We want to send emails to SNOs. Node status changes go through the
overlay service, so it's a good place to add the mail service.
Add the mailservice.Service, satellite address, and satellite name to
overlay service. Also add feature flag --overlay.send-node-emails
Change-Id: I3bd2cb3bf22f9724954ce2374f8b651b902b3a24
Set disqualification reason when reputation stats are updated on DB.Update.
Added tests for DisqualifyNode and for disqualification cases which happen during Update.
Change-Id: I00130ab5d9722422805159ad2f183c205de60f7e
For nodes in excluded areas, we don't necessarily want to remove them
from the pointer, but we do want to increase the number of pieces in the
segment in case those excluded area nodes go down. To do that, we
increase the number of pieces repaired by the number of pieces in
excluded areas.
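In other words (illustrative arithmetic only, names assumed):

    package main

    import "fmt"

    // repairTarget pads the number of pieces to repair by the count of pieces
    // currently held by nodes in excluded countries, so losing those nodes
    // would not immediately drop the segment below its optimal share count.
    func repairTarget(optimalShares, healthyShares, piecesInExcludedCountries int) int {
        target := optimalShares - healthyShares + piecesInExcludedCountries
        if target < 0 {
            return 0
        }
        return target
    }

    func main() {
        // e.g. optimal 80 shares, 60 healthy, 5 of those in excluded countries
        fmt.Println(repairTarget(80, 60, 5)) // 25
    }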
Change-Id: I0424f1bcd7e93f33eb3eeeec79dbada3b3ea1f3a
Add a RepairExcludedCountryCodes config flag to overlay for providing a list of country codes whose nodes should be excluded from target repair selection.
Mark segments with fewer than repairThreshold pieces in countries not in RepairExcludedCountryCodes as unhealthy.
With this change, the repair process is not affected. The segment will be removed from the repair queue by the repairer.
Another change will handle the logic at the repairer level.
Fixes https://github.com/storj/team-metainfo/issues/95
Change-Id: I9231b32de117a116488de055a3e94efcabb46e81
The database table got invalid input and the resulting error
was not checked. This adds updates that contain invalid fields
to trigger different errors.
Change-Id: Iacea32cbef5599aab562c88e4113073596cc9996
inconsistency
The original design had a flaw which can potentially cause a
discrepancy in a node's reputation status between the reputations table
and the nodes table.
If a failure (network issue, db failure, satellite failure, etc.)
happens between the update to the reputations table and the update to
the nodes table, the data can be out of sync.
This PR tries to fix the above issue by passing the node's reputation
from the beginning of an audit/repair (this data is from the nodes
table) through to the next update in the reputation service. If the
updated reputation status from the service is different from the
existing node status, the service will try to update the nodes table.
In the case of a failure, the service will be able to try to update the
nodes table again, since it can see the discrepancy in the data. This
will allow both tables to eventually be in sync.
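A condensed sketch of the reconciliation idea (types and names below are simplified assumptions):

    package main

    import (
        "fmt"
        "time"
    )

    // ReputationStatus is a simplified stand-in for the reputation fields that
    // are mirrored into the nodes table.
    type ReputationStatus struct {
        Disqualified     *time.Time
        OfflineSuspended *time.Time
    }

    func statusEqual(a, b ReputationStatus) bool {
        same := func(x, y *time.Time) bool { return (x == nil) == (y == nil) }
        return same(a.Disqualified, b.Disqualified) &&
            same(a.OfflineSuspended, b.OfflineSuspended)
    }

    // applyAudit: `existing` is the status read from the nodes table at the
    // start of the audit/repair, `updated` is what the reputations table says
    // afterwards; the nodes table is only written when the two disagree. If
    // that write fails, the next audit still sees the discrepancy and retries,
    // so the tables converge eventually.
    func applyAudit(existing, updated ReputationStatus, updateNodesTable func(ReputationStatus) error) error {
        if statusEqual(existing, updated) {
            return nil
        }
        return updateNodesTable(updated)
    }

    func main() {
        now := time.Now()
        err := applyAudit(
            ReputationStatus{},                       // nodes table: not suspended
            ReputationStatus{OfflineSuspended: &now}, // reputations table: suspended
            func(ReputationStatus) error { fmt.Println("updating nodes table"); return nil },
        )
        fmt.Println(err)
    }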
Change-Id: Ic22130b4503a594b7177237b18f7e68305c2f122
We don't use this column for anything. If you want to know if a node is
contained, you can check the pending_audits table.
Change-Id: I8da1d8e01a2dcaff63c5067a7927b5451424ad04
gracefulexit
As part of the effort to move audit-related data into the reputation
store, this PR updates the gracefulexit endpoint to use the reputation
service to get a node's audit score.
Change-Id: Iad93ea689ad67ff9c57c7be16687e21e715fab7a
package in audit
This PR implements a reputation store and replaces overlay in the
audit service with that store for storing a node's audit stats.
In order to keep the changeset smaller, most of the changes in this PR
just copy the audit logic from overlay into the reputation package. In
a following PR, the duplicated code will be removed from overlay.
Change-Id: I16c12494a0970f44c422b26cf603c1dc489e5bc1
Define service and DB interface for storing node reputation data
and updating the overlay cache.
Add overlay service and DB method UpdateReputation.
See https://github.com/storj/storj/pull/4144
Change-Id: Iedd8bd3274457d26c595919303d55327c1464b8c
GetSuccessfulNodeNotCheckedInSince and GetOfflineNodesLimited are overlay methods
which were only used by the previous downtime tracking system, which has been removed.
These methods should also be removed.
Change-Id: Idb829d742e1f987e095604423fff656fe581183e
Nodes which are offline_suspended will no longer be considered for new
uploads. The current threshold that enters a node into offline
suspension is 0.6. Disqualification for offline suspension is still
disabled.
Change-Id: I0da9abf47167dd5bf6bb21e0bc2186e003e38d1a
Query the nodes table using AS OF SYSTEM TIME '-10s' (by default) when on CRDB to alleviate contention on the nodes table and minimize CRDB retries. Queries for standard uploads are already cached, and node lookups for graceful exit uploads have retry logic, so it isn't necessary for the nodes returned to be current.
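For illustration (the query and the placement of the clause are simplified, not the exact production statement):

    package main

    import "fmt"

    // asOfSystemTime appends the CockroachDB-only clause; on PostgreSQL the
    // query is left untouched. The interval mirrors the '-10s' default.
    func asOfSystemTime(query string, isCockroach bool, interval string) string {
        if !isCockroach || interval == "" {
            return query
        }
        return query + " AS OF SYSTEM TIME '" + interval + "'"
    }

    func main() {
        base := "SELECT id, address, last_net FROM nodes"
        fmt.Println(asOfSystemTime(base, true, "-10s"))
        // SELECT id, address, last_net FROM nodes AS OF SYSTEM TIME '-10s'
    }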