storj

Author	SHA1	Message	Date
Egon Elbre	24833465e6	satellite/metainfo/metabase: avoid magic constant Change-Id: I4f01e38f67e18ae9cb9845a8e75a987acba66427	2021-01-11 10:22:21 +00:00
Jessica Grebenschikov	1709117b0d	satellite/console/wasm: add more unit tests Change-Id: Ie134f8a08d690ce013039ed1a4e484f8b6a1a6d5	2021-01-08 18:50:29 +00:00
Jeff Wendling	2d2359667d	satellite/orders: remove unused satelliteAddress field Change-Id: I58091769472688433c48becc8dfc9029bddd87aa	2021-01-08 12:25:39 -05:00
Egon Elbre	ba5461562d	satellite/orders: remove satellite address SatelliteAddress in OrderLimit is not being used anymore and some satellite addresses may consume too many bytes. Change-Id: Ic7a0efe5b6211c2f3b91af67b293cde98b29d074	2021-01-08 16:57:36 +00:00
Egon Elbre	51731db121	satellite/orders: use smaller encrypted metadata Avoid using project uuid string representation, because it uses more bandwidth. This reduces the encrypted metadata size from 118 -> 97 bytes. Change-Id: Ic53a81b83acc065f24f28cd404f9c0b1fe592594	2021-01-08 16:40:31 +00:00
JT Olio	8907180e81	satellite: pass contact.external-address config to web ui Change-Id: I54978aa34aa9eb98876fab6460a5737d718d6135	2021-01-06 10:11:20 -07:00
Egon Elbre	9cb4466eb0	cmd/storj-sim: use dev setup by default for consistency Fixes bug when using release binaries together with storj-sim. Change-Id: I077bedc1486ac85aa1f04fcc0ed4098cd313f2fc	2021-01-05 13:47:30 +02:00
Moby von Briesen	a90d6fcad8	satellite/repair/checker: Use segment health on checker insert Do not insert the number of healthy pieces for segment health anymore. Rather, insert the segment health calculated by our new priority function. Change-Id: Ieee7fb2deee89f4d79ae85bac7f577befa2a0c7f	2021-01-04 11:48:17 -05:00
Moby von Briesen	6e2ef3b9ee	Revert "satellite/satellitedb: Do not consider nodes with offline_suspended as reputable." This reverts commit `e24262c2c9`. Change-Id: I287deb2e52d03bbd698ed055f0f216b0b5bf2798	2021-01-04 14:28:37 +00:00
Michał Niewrzał	d4ebdba48c	satellite/payments/stripecoinpayments: fix tests failing in 2021 We had some tests with hardcoded year 2020. Change-Id: I0184c3ece819cb764eb305751a1d8d4056b6af17	2021-01-04 10:47:31 +01:00
Moby von Briesen	edbee53888	satellite,storagenode: Pass audit history over GetStats endpoint Full prefix: satellite/{overlay,nodestats},storagenode/{reputation,nodestats} Allow the storagenode to receive its audit history data from the satellite via the satellite's GetStats endpoint. The storagenode does not save this data for use in the API yet. Change-Id: I9488f4d7a4ccb4ccf8336b8e4aeb3e5beee54979	2020-12-30 19:13:26 +00:00
Rafael Gomes	8b2e4bfa7e	satellite/metainfo/piecedeletion Remove spaces from metrics. Change-Id: Iaf1d8a96a43087f2fcc579347f581e8a78a0fb58	2020-12-30 14:27:39 -03:00
paul cannon	7246368ca1	satellite/repair: clamp totalNodes to 100 or higher Change-Id: I239418ed3671b1cee30b0b1797dc434244e72448	2020-12-30 10:39:14 -06:00
Moby von Briesen	825dc71227	satellite/{overlay, satellitedb}: Refactor audit history * Separate audit history interface into its own file in the overlay package * Add overlay.AuditHistory struct so that internalpb.AuditHistory is only used from within the database layer * Add overlay.GetAuditHistory function for features that will require access to detailed audit history information * Do not return full audit history from UpdateAuditHistory - callers to that function only need to know the online score and whether a full tracking period has been completed * Move audit history tests out of satellite/satellitedb, since they are independent of database implementation Change-Id: I35b0c4ac23bbaabd80624f8a9631c3cb1a1f33bd	2020-12-29 18:50:22 +00:00
Moby von Briesen	85ae13f11d	satellite/satellitedb: Drop nodes_offline_times table. Now that the deprecated downtime tracking service is removed (`3fc76f4ffe`), we can safely remove the nodes_offline_times table. Change-Id: Ia7c6efe32ba104dff5a830af5f2beee3337eefe5	2020-12-29 18:17:50 +00:00
Moby von Briesen	e24262c2c9	satellite/satellitedb: Do not consider nodes with offline_suspended as reputable. Nodes which are offline_suspended will no longer be considered for new uploads. The current threshold that enters a node into offline suspension is 0.6. Disqualification for offline suspension is still disabled. Change-Id: I0da9abf47167dd5bf6bb21e0bc2186e003e38d1a	2020-12-29 17:59:09 +00:00
Stefan Benten	ad58459198	satellite/admin: allow more than just "paid" invoice status during user deletion Currently we do not allow anything other than the "paid" status for invoices when trying to delete a user. However there can be a couple of other states that are still fine to accept during deletion of a user. This change reverses the order to check for the status that we do not want to allow. Change-Id: I78d85af6438015c55100fa201ccffc731c91de1c	2020-12-23 16:40:44 +01:00
JT Olio	7faaeed2bf	satellite/access grant wizard: don't hardcode the satellites Change-Id: Id9fbf68882cdb2fce846b7a2604cf965cc53ab1a	2020-12-22 21:24:45 -07:00
JT Olio	efde103dba	accounting: rollup test is broken for the hour before midnight UTC this change isn't the real fix. it's just ignoring the problem. i don't know what the real fix is. is the problem with the test, or is there actually a problem with the rollup code? Change-Id: I552bdd947deadc212cc56efc5f818942b9827126	2020-12-22 14:14:52 -07:00
Ethan Adams	6070018021	satellite/overlay: use AS OF SYSTEM TIME with Cockroach Query nodes table using AS OF SYSTEM TIME '-10s' (by default) when on CRDB to alleviate contention on the nodes table and minimize CRDB retries. Queries for standard uploads are already cached, and node lookups for graceful exit uploads have retry logic, so it isn't necessary for the nodes returned to be current.	2020-12-22 21:07:07 +02:00
Ethan Adams	563197c628	satellite/overlay: Add index on nodes table (#4012) satellite/accounting: Add index for project_id on bucket_storage_tallies	2020-12-21 12:48:48 -05:00
Ethan Adams	9b52283570	satellite/accounting: Add index for project_id on bucket_storage_tallies (#4010) Change-Id: I47ab2d1e24f94307c3383c497cffe2a150fa8ab7	2020-12-21 11:42:00 -05:00
Ethan Adams	6e501898c3	satellite/accounting: Performance improvements to getNodeIds used by GetBandwidthSince (#4009)	2020-12-21 16:37:01 +01:00
Jessica Grebenschikov	d961437889	satellite/orders: remove the config IncludeEncryptedMetadata Since the Satellite now requires the order encryption functionality (since serial_number table is deprecated) to properly function, we can remove the config flag to turn on/off the feature. Change-Id: Ie973f72a9a05a81cef9e53dc9c99d22c940c2488	2020-12-18 10:39:29 -08:00
Jessica Grebenschikov	da0327c9b7	satellite/dbcleanup: remove expired serial chore Change-Id: Ib71d41eb6679d6435e5bc10b6244dac66380a74e	2020-12-18 09:36:28 -08:00
Jessica Grebenschikov	97a5e6c814	satellite/orders: stop inserting/reading from serial_numbers table This PR contains the minimum changes needed to stop inserting into the serial_numbers table. This is the first step in completely deprecating that table. The next step is to create another PR to remove the expiredSerial chore, fix more tests, and remove any other methods on the serial_number table. Change-Id: I5f12a56ebf3fa4d1a1976141d2911f25a98d2cc3	2020-12-18 08:35:13 -08:00
littleskunk	2437d5b171	satellite/access-grants: default auth service url (#4002) * satellite/access-grants: default auth service url	2020-12-17 23:38:16 +01:00
paul cannon	d3604a5e90	satellite/repair: use survivability model for segment health The chief segment health models we've come up with are the "immediate danger" model and the "survivability" model. The former calculates the chance of a segment becoming lost in the next time period (using the CDF of the binomial distribution to estimate the chance of x nodes failing in that period), while the latter estimates the number of iterations for which a segment can be expected to survive (using the mean of the negative binomial distribution). The immediate danger model was a promising one for comparing segment health across segments with different RS parameters, as it is more precisely what we want to prevent, but it turns out that practically all segments in production have infinite health, as the chance of losing segments with any reasonable estimate of node failure rate is smaller than DBL_EPSILON, the smallest possible difference from 1.0 representable in a float64 (about 1e-16). Leaving aside the wisdom of worrying about the repair of segments that have less than a 1e-16 chance of being lost, we want to be extremely conservative and proactive in our repair efforts, and the health of the segments we have been repairing thus far also evaluates to infinity under the immediate danger model. Thus, we find ourselves reaching for an alternative. Dr. Ben saves the day: the survivability model is a reasonably close approximation of the immediate danger model, and even better, it is far simpler to calculate and yields manageable values for real-world segments. The downside to it is that it requires as input an estimate of the total number of active nodes. This change replaces the segment health calculation to use the survivability model, and reinstates the call to SegmentHealth() where it was reverted. It gets estimates for the total number of active nodes by leveraging the reliability cache. Change-Id: Ia5d9b9031b9f6cf0fa7b9005a7011609415527dc	2020-12-17 21:30:17 +00:00
littleskunk	3feee9f4f8	satellite/accounting: default project limits (#4001)	2020-12-17 22:27:05 +01:00
Cameron Ayer	28eaae66af	satellite/satellitedb: drop num_healthy_pieces column from injuredsegments This column is no longer used as it has been replaced by the segment_health column. Change-Id: I6b4df89cd4f994d8418976f88e8c5f57615f8115	2020-12-17 20:17:08 +00:00
VitaliiShpital	f4bbd0f5df	web/satellite: use brotli instead of gzip WHAT: we'll use brotli instead of gzip from now on WHY: better compression Change-Id: Ibeadd6bfc783e9c15cf3f62f719af692071a7721	2020-12-17 19:23:44 +00:00
Egon Elbre	12055e7864	all: minor cleanups Change-Id: I4248dbe36a62a223b06135254b32851485a2eec1	2020-12-16 10:47:46 +00:00
Cameron Ayer	8c52bb3a18	satellite/checker: use numHealthy as segment health in repair queue A few weeks ago it was discovered that the segment health function was not working as expected with production values. As a bandaid, we decided to insert the number of healthy pieces into the segment health column. This should have effectively reverted our means of prioritizing repair to the previous implementation. However, it turns out that the bandaid was placed into the code which removes items from the irreparable db and inserts them into the repair queue. This change: insert number of healthy pieces into the repair queue in the method, RemoteSegment Change-Id: Iabfc7984df0a928066b69e9aecb6f615253f1ad2	2020-12-15 17:16:59 -05:00
Cameron Ayer	2ac72eaf16	satellite/repair/checker: add new monkit stats tagged with rs scheme There is a new checker field called statsCollector. This contains a map of stats pointers where the key is a stringified redundancy scheme. stats contains all tagged monkit metrics. These metrics exist under the key name, "tagged_repair_stats", which is tagged with the name of each metric and a corresponding rs scheme. As the metainfo observer works on a segment, it checks statsCollector for a stats corresponding to the segment's redundancy scheme. If one doesn't exist, it is created and chained to the monkit scope. Now we can call Observe, Inc, etc on the fields just like before, and they have tags! durabilityStats has also been renamed to aggregateStats. At the end of the metainfo loop, we insert the aggregateStats totals into the corresponding stats fields for metric reporting. Change-Id: I8aa1918351d246a8ef818b9712ed4cb39d1ea9c6	2020-12-15 14:08:01 +00:00
Stefan Benten	9fe477899b	satellite/satellitedb: add lint ignore rule to support staticcheck 2020.2 staticcheck 2020.2 is not liking our dbx files, so we need to ignore them. Change-Id: I6becc3619bb088473f9776d0878ce240d4935936	2020-12-14 21:16:31 +00:00
Jessica Grebenschikov	3cc98de3ee	satellite/console/wasm: reduce size to <9MB Make changes so that we only import the necessary files from the console package so that the generated wasm code is as small as possible. This change gets the compiled wasm code down to 8.6MB uncompressed and 2MB when compressed with `gzip --best`. https://review.dev.storj.io/c/storj/storj/+/3396 Change-Id: Ifdd4be285810757b46bbbe43327c0d0139e5f8f7	2020-12-14 16:41:39 +00:00
Ivan Fraixedes	2dddcffe43	satellite/accounting/rollout: Remove unused variable Remove a declared variable that's set but never read nor passed to any function, so it's unused code. Change-Id: I8daf9d1f71d29ab39d7a80011d1b4813ada1c67d	2020-12-14 14:11:41 +00:00
Brandon Iglesias	ca1e6b9756	Adding Fastly (#3994)	2020-12-11 15:53:05 +02:00
Stefan Benten	8fe829d5fd	build: add wasm bits to Dockerfile and bump to go v1.15.6 (#3992)	2020-12-11 02:23:39 +01:00
Jessica Grebenschikov	0649d2b930	satellite/repair: improve contention for injuredsegments table on CRDB We migrated satelliteDB off of Postgres and over to CockroachDB (crdb), but there was way too high contention for the injuredsegments table so we had to rollback to Postgres for the repair queue. A couple things contributed to this problem: 1) crdb doesn't support `FOR UPDATE SKIP LOCKED` 2) the original crdb Select query was doing 2 full table scans and not using any indexes 3) the SLC Satellite (where we were doing the migration) was running 48 repair worker processes, each of which run up to 5 goroutines which all are trying to select out of the repair queue and this was causing a ton of contention. The changes in this PR should help to reduce that contention and improve performance on CRDB. The changes include: 1) Use an update/set query instead of select/update to capitalize on the new `UPDATE` implicit row locking ability in CRDB. - Details: As of CRDB v20.2.2, there is implicit row locking with update/set queries (contention reduction and performance gains are described in this blog post: https://www.cockroachlabs.com/blog/when-and-why-to-use-select-for-update-in-cockroachdb/). 2) Remove the `ORDER BY` clause since this was causing a full table scan and also prevented the use of the row locking capability. - While long term it is very important to `ORDER BY segment_health`, the change here is only supposed to be a temporary bandaid to get us migrated over to CRDB quickly. Since segment_health has been set to infinity for some time now (re: https://review.dev.storj.io/c/storj/storj/+/3224), it seems like it might be ok to continue not making use of this for the short term. However, long term this needs to be fixed with a redesign of the repair workers, possibly in the trusted delegated repair design (https://review.dev.storj.io/c/storj/storj/+/2602) or something similar to what is recommended here on how to implement a queue on CRDB https://dev.to/ajwerner/quick-and-easy-exactly-once-distributed-work-queues-using-serializable-transactions-jdp, or migrate to rabbit MQ priority queue or something similar. This PR's improved query uses the index to avoid full scans and also locks the row it's going to update and CRDB retries for us if there are any lock errors. Change-Id: Id29faad2186627872fbeb0f31536c4f55f860f23	2020-12-10 09:51:26 -08:00
Michal Niewrzal	c2a97aeb14	satellite/satellitedb: add ListAllBuckets method We need to be able to list all buckets in DB without knowing project ID. This method will be used to list buckets for metainfo loop implementation based on metabase. Change-Id: Iac75af0eee4f31e80a15577575a8249cbca787b2	2020-12-10 14:19:27 +00:00
Stefan Benten	494bd5db81	all: golangci-lint v1.33.0 fixes (#3985)	2020-12-05 17:01:42 +01:00
Ethan Adams	f90ea10a4a	Allow for DB application names per process. (#3983)	2020-12-04 11:24:39 +01:00
Moby von Briesen	d75e4be11f	satellite/{accounting, contact}: Remove periods and spaces from metrics. Change-Id: I84179c2931293e3a1eb0ff8050416d25e481ce07	2020-12-03 15:33:01 +00:00
Moby von Briesen	3fc76f4ffe	satellite/downtime: Remove deprecated downtime tracking service. We are no longer planning on implementing downtime penalization using the method described in docs/blueprints/archive/storage-node-downtime-tracking-deprecated.md. Now, we are implementing the design described in docs/blueprints/storage-node-downtime-tracking-with-audits.md. This change removes the downtime estimation chores from the satellite core as well as the package satellite/downtime. A future change will remove the database table. Change-Id: I1a1d3cf9dceeba36255d25243294865b89925518	2020-12-02 15:16:13 -05:00
JT Olio	1728c3a992	satellite/dbx: standardize on assignment Change-Id: I8f87bc8391e765e4480b0590d92d3601248e1f93	2020-12-01 16:10:18 +00:00
Jessica Grebenschikov	b261110352	satellite/orders: get bucketID from encrypted metadata in order instead of serial_numbers table We want to stop using the serial_numbers table in satelliteDB. One of the last places using the serial_numbers table is when storagenodes settle orders, we look up the bucket name and project ID from the serial number from the serial_numbers table. Now that we have support to add encrypted metadata into the OrderLimit, this PR makes use of that and now attempts to read the project ID and bucket name from the encrypted orderLimit metadata instead of from the serial_numbers table. For backwards compatibility and to ensure no errors, we will still fall back to the old way of getting that info from the serial_numbers table, but this will be removed in the next release as long as there are no errors. All processes that create orderLimits must have an orders.encryption-keys set. The services that create orderLimits (and thus need to encrypt the order metadata) are the satellite apiProcess, the repair process, audit service (core process), and graceful exit (core process). Only the satellite api process decrypts the order metadata when storagenodes settle orders. This means that the same encryption key needs to be provided in the config for the satellite api process, repair process, and the core process like so: orders.include-encrypted-metadata=true orders.encryption-keys="<encryptionKeyID>=<encryptionKey>" Change-Id: Ie2c037971713d6fbf69d697bfad7f8b672eedd66	2020-12-01 15:29:32 +00:00
JT Olio	70b91aac54	satellitedb: remove cruft caused by https://review.dev.storj.io/c/storj/storj/+/3223 Change-Id: I198bb2f869cc7177b9ecafdd8932bbf2b58be5b8	2020-12-01 00:16:26 +00:00
Yingrong Zhao	d8ba7b3057	satellite/console: only allow project member to get all bucket names Change-Id: I8ceb0b7eb19e221072b4ff3411a4ec1a7817d16f	2020-11-30 15:41:35 -05:00
Egon Elbre	f456d7ce03	satellite: remove implementation detail from DB interface Which database access and how it internally does migrations is an implementation detail and does not belong in the requirements interface. Change-Id: Ia4a6994f39470063a96a8e5f3a1bd27aa79fe5cd	2020-11-30 13:29:20 +02:00
Egon Elbre	28ea63be92	satellite/repair: avoid TestDBAccess Change-Id: I34adb58cd67fba5917032f2f328d75b1c4afdbbf	2020-11-30 13:29:08 +02:00
VitaliiShpital	0771cdb0b1	web/satellite: create access grant: generate gateway credentials step WHAT: generate gateway credentials step for create access grant flow WHY: part of the flow Change-Id: I6496712b43f78a818ba0582b586cfae3a44683e6	2020-11-30 10:36:29 +00:00
VitaliiShpital	bb7677a85f	web/satellite: get gateway credentials request using url from config WHAT: POST request to get gateway credentials using access grant. Put request url to config and use it for request. WHY: to show gateway credentials on UI Change-Id: I15ef43ecdeed69b0961d5796aacb47f36d560b1b	2020-11-30 10:36:23 +00:00
Michal Niewrzal	21602e0494	satellite/metainfo: enable commented test Test was commented to make uplink refactoring possible. Now we can bring back this test. Change-Id: I0511b76073efaafed8aac97f8e845dcec93dd059	2020-11-30 10:49:23 +01:00
JT Olio	71e11b27f3	satellite/dbx: only retry with cockroach Change-Id: Id3630c26dbfda36dcbece2849e2353d5ab2882af	2020-11-29 18:10:07 -07:00
JT Olio	bd23d12bb9	satellite/dbx: add cockroach retries for other QueryContext operations Change-Id: Ia30fbba55c926892702fa96fb9dd01b75347d351	2020-11-29 18:09:56 -07:00
JT Olio	ea2f39ca7f	satellite/dbx: add retries for QueryRowContext-based operations Change-Id: Ie2527b673dd4ce5250cf5c0cbf8f14921262f665	2020-11-29 18:09:46 -07:00
JT Olio	d3b0691bbd	satellite/dbx: import dbx templates these are unchanged from storj.io/dbx. we're importing them because in a later commit we will change them, and it'd be nice to see that diff as a separate commit. Change-Id: I8315130ed6bab397bd65b9a1a90c29d130b8c02d	2020-11-29 18:09:33 -07:00
JT Olio	5d8a67a4f7	satellitedb: retry GetBandwidthSince on cockroach Change-Id: I2bf20f3a19e7f3af97630d8a679410feba70661e	2020-11-29 16:36:15 -07:00
Ethan	5dc013d3bd	satellite/overlay: Add retry to all selects in overlaycache Change-Id: I0356d71a35701f8e0ca04a34b2bb2aea666c1394	2020-11-29 16:46:57 -05:00
JT Olio	6bce907cb0	satellite: try to stream rollups to aggregation function to use less memory this change tries really hard to never have all of the storage node rollups in memory at the same time, up until the rollups are actually getting summed together. Change-Id: If67f49e7d71106798d996a6850b3e48671bd9e18	2020-11-29 10:26:32 -07:00
JT Olio	6aae21541f	satellitedb: do saverollup in batches Change-Id: I78278a192cba60541eee2986f54a88d5a479bd3e	2020-11-28 19:26:46 -07:00
JT Olio	0ba516d405	satellite: support pointing db components at different databases the immediate need is to be able to move the repair queue back out of cockroach if we can't save it. Change-Id: If26001a4e6804f6bb8713b4aee7e4fd6254dc326	2020-11-28 18:39:16 +00:00
Moby von Briesen	75f0f713a3	satellite/repair/checker/checker.go: Use number of healthy pieces instead of SegmentHealth for injured segments queue. We did not test the SegmentHealth function with actual production values, and it turns out that values such as 52 healthy, 35 minimum result in +Inf segment health - so pretty much all segments put into the repair queue have the same health, which means we effectively aren't sorting by health. This change inserts numHealthy as segment health into the database so the segments are ordered as they were before. We need to refine the SegmentHealth function before we can support multi RS. Change-Id: Ief19bbfee3594c5dfe94ca606bc930f05f85ff74	2020-11-28 12:16:32 -05:00
Ivan Fraixedes	7eb3b2d6d0	satellite/gc: Init map with an approx size The PieceTracker receives a piece count per node, which approximates the number of pieces that the metainfo loop is going to report, so we can use it as a good guess of the map's size and initialize the map with it. Change-Id: I644db40926c03e4c457457fb41d2ec1da059cea6	2020-11-27 10:44:19 +01:00
Ivan Fraixedes	319d2cad11	satellite: Fix typo in a comment Change-Id: I151b824e868db1cc1e8b8e8af9f35b027db1e6ff	2020-11-26 15:44:49 +01:00
Michal Niewrzal	8ceef9f357	satellite/metainfo: temporarily disable one assertion in test This is needed to merge https://review.dev.storj.io/c/storj/uplink/+/3208, after that this code will be back. Change-Id: If9f2f1db95c7a1bba64a41c45a39bd3096a519e7	2020-11-25 13:21:41 +00:00
Egon Elbre	3792e2921c	satellite/accounting/tally: make test less fragile MetadataSize can slightly vary and checking for an exact value makes it difficult to change what's being encoded in metadata. Change-Id: I5f1ade41bc26d115e6743367ee35cf1ba74795c9	2020-11-25 13:33:24 +02:00
Kaloyan Raev	53b7fd7b00	satellite/{audit,gracefulexit}: remove logic for PieceHashesVerified We now have the piece hashes verified for all segments on all production satellites. We can remove the code that handles the case where piece hashes are not verified. This would make the migration of services from PointerDB to the new metabase easier. For consistency, PieceHashesVerified is still set to true in PointerDB for new segments. Change-Id: Idf0ccce4c8d01ae812f11e8384a7221d90d4c183	2020-11-24 11:09:48 +02:00
Egon Elbre	9de1617db0	satellite/orders: ensure encryption keys handles set twice Currently flag parsing seems to call Set twice, which causes problems with encryption keys. We can clear for every set for now. Change-Id: Id5c695b4020194ac1c50a2da9c7d2a896cb9216f	2020-11-23 19:47:22 +00:00
Moby von Briesen	575f50df84	satellite/repair: Update repair override config to support multiple RS schemes. Rather than having a single repair override value, we will now support repair override values based on a particular segment's RS scheme. The new format for RS override values is "k/o/n-override,k/o/n-override..." Change-Id: Ieb422638446ef3a9357d59b2d279ee941367604d	2020-11-23 18:01:15 +00:00
Egon Elbre	55d5e1fd7d	satellite/orders: ensure that expired deletion doesn't stall Add checks to ensure that when somebody uses empty options, the deletion doesn't loop infinitely. Change-Id: I1738fb1e7e1f8efbbb954c491cb6489f7bcdc2db	2020-11-23 14:52:40 +02:00
Jessica Grebenschikov	5beb2f5737	satellite/orders: add factory function to encryption key Change-Id: I9a1020c63e4ebc6d73683cf1749366e9b9f20f07	2020-11-20 11:40:15 -08:00
Ethan	2b92bba563	satellite/satellitedb/orders: Handle serial_numbers deletes in smaller increments on CRDB CRDB doesn't like large deletes. While testing in the POC environment we found that deletes on the serial_numbers table could take hours. This change limits deletes to 1000 at a time (configurable) to avoid blocking other queries. Change-Id: I08455e25db1574579dd4d7b7125a08e9c913dff1	2020-11-20 13:44:52 +00:00
Moby von Briesen	a8b66dce17	satellite/accounting: account for old orders that can be submitted in satellite rollup With the new phase 3 order submission, orders can be added to the storage and bandwidth rollup tables at timestamps before the most recent rollup was run. This change shifts the start time of each new rollup window to account for any unexpired orders that might have been added since the previous rollup. A satellitedb migration is necessary to allow upserts in the accounting_rollups table when entries with identical node_ids and start_times are inserted. Change-Id: Ib3022081f4d6be60cfec8430b45867ad3c01da63	2020-11-18 14:46:00 -05:00
Egon Elbre	aeb801604e	{satellite,storagenode}/orders: fix flaky tests Before manipulating order information on storagenodes we need to wait for the orders to propagate to the database. Some of that happens async with uplink. Change-Id: Iaacfd7db0909ab5d2831d06388e5fb27b6d4778f	2020-11-18 13:44:02 +00:00
paul cannon	2b59640f18	cmd/satellite: ignore Canceled in exit from repair worker Firstly, this changes the repair functionality to return Canceled errors when a repair is canceled during the Get phase. Previously, because we do not track individual errors per piece, this would just show up as a failure to download enough pieces to repair the segment, which would cause the segment to be added to the IrreparableDB, which is entirely unhelpful. Then, ignore Canceled errors in the return value of the repair worker. Apparently, when the worker returns an error, that makes Cobra exit the program with a nonzero exit code, which causes some piece of our deployment automation to freak out and page people. And when we ask the repair worker to shut down, "canceled" errors are what we _expect_, not an error case. Change-Id: Ia3eb1c60a8d6ec5d09e7cef55dea523be28e8435	2020-11-17 21:37:59 +00:00
Moby von Briesen	0ec685b173	satellite/{satellitedb, repair/{queue, checker}}: Use new column "segmentHealth" instead of "numHealthy" in injured segments queue We plan to add support for a new Reed-Solomon scheme soon, but our repair queue orders segments by least number of healthy pieces first. With a second RS scheme, fewer healthy pieces will not necessarily correlate to lower health. This change just adds the new column in a migration. A separate change will add the new health function. Right now, since we only support one RS scheme, behavior will not change. Number of healthy pieces is being inserted as "segment health" until the new health function is merged. Segment health is calculated with a new priority function created in commit `3e5640359`. In order to use the function, a new config value is added, called NodeFailureRate, representing the approximate probability of any individual node going down in the duration of one checker run. Change-Id: I51c4202203faf52528d923befbe886dbf86d02f2	2020-11-16 21:18:09 +00:00
VitaliiShpital	51a712f9e8	satellite/console: get all bucket names endpoint and service method WHAT: new endpoint for fetching all bucket names WHY: used by new access grant flow Change-Id: I356a3381359665fd2726120139b34b1e611fe3c4	2020-11-16 17:51:40 +02:00
Jessica Grebenschikov	f558cc825e	satellite/orders: add storagenode_bw_phase2 table and dont delete tallies for longer It turns out we need to make 2 more changes in order for the new order submission phase 3 to get deployed. This PR makes 2 changes: 1) when the rollup service deletes tallies, we now keep tallies around until orders expire (vs 1 day like before). 2) the reported rollup chore will now write the storagenode_bandwidth_rollups to a new table _phase2 as an intermediary step so it doesn't conflict with phase 3 order settlement. These changes need to be deployed for 2 days before we can turn on phase 3 of the new orders settlement workflow. Change-Id: Iafbff577ba7d55f8f17b7db857311b2ce799de60	2020-11-13 17:15:24 +00:00
Yaroslav Vorobiov	1b4bfbb9d2	multinode/console: nodes addition and removal Change-Id: I60c685953a8d0e24f78b1414c34a28d4b87863b0	2020-11-12 20:26:08 +02:00
Jessica Grebenschikov	226e13e616	satellite/console: add tests for wasm access code Change-Id: I78f71b2f0bef03b6e87cd7d79ccaef5f45393b55	2020-11-12 08:03:36 -08:00
paul cannon	3e56403599	satellite/repair: add a repair health function This will be used to rank segments in need of repair for attention by the repair workers. Change-Id: I5b70650cec933696b4c6d73bb7efb97e3efdf24a	2020-11-11 18:48:51 +00:00
Jeff Wendling	31533ed1a1	satellite/console/wasm: remove storj.io/uplink dependency Change-Id: Iee95389e4ba24618e31aff7be44d05377b2e2419	2020-11-11 16:51:14 +00:00
Cameron Ayer	da9f1f0611	satellite/repair: add monkit counter for segments below minimum required The current monkit reporting for "remote_segments_lost" is not usable for triggering alerts, as it has reported no data. To allow alerting, two new metrics "checker_segments_below_min_req" and "repairer_segments_below_min_req" will increment by zero on each segment unless it is below the minimum required piece count. The two metrics report what is found by the checker and the repairer respectively. Change-Id: I98a68bb189eaf68a833d25cf5db9e68df535b9d7	2020-11-11 12:48:23 +00:00
Yingrong Zhao	2ce3170bb4	satellite/console/wasm: expose method to add caveats in the browser This PR does the following three things: 1. Defines a high-level interface for this wasm package - All return values from this package will be wrapped in a result object that contains a value field and an error field 2. Exposes two new functions to allow users to add permissions for a given API key - newPermission() - setAPIKeyPermission() 3. Adds API documentation for the newly added API functions Change-Id: Id995189702b369bba18fa344bef4ddfb0f3f1f44	2020-11-10 20:10:53 +00:00
Brandon Iglesias	3ba52b25a9	satellite/rewards: update partners to include MAXN	2020-11-10 14:08:32 +02:00
Moby von Briesen	db6bc6503d	satellite/metainfo: Update metainfo RS config to more easily support multiple RS schemes. Make metainfo.RSConfig a valid pflag config value. This allows us to configure the RSConfig as a string like k/m/o/n-shareSize, which makes having multiple supported RS schemes easier in the future. RS-related config values that are no longer needed have been removed (MinTotalThreshold, MaxTotalThreshold, MaxBufferMem, Verify). Change-Id: I0178ae467dcf4375c504e7202f31443d627c15e1	2020-11-09 22:16:13 +00:00
Cameron Ayer	d63b7658e8	satellite/repair: fix lastSeenSegmentKey bug in IrreparableProcess A change was made to use a metabase.SegmentKey (a byte slice alias) as the last seen item to iterate through the irreparable DB in a for loop. However, this SegmentKey was not initialized, thus it was nil. This caused the DB query to return nothing, and healthy segments could not be cleaned out of the irreparable DB. Change-Id: Idb30d6fef6113a30a27158d548f62c7443e65a81	2020-11-09 14:48:15 +00:00
VitaliiShpital	f8c3848c78	satellite/console: change user's email endpoint/feature WHAT: an endpoint for changing a user's email and the corresponding service method were implemented WHY: make it possible to change a user's email for a temporary filezilla account Change-Id: Ieea41bf49819a42b5f433e8dfaeec24c6d5ddc9f	2020-11-06 11:54:07 +00:00
jessicagreben	c4c29e370a	wasm: add webassembly code for creating access grant in console web UI Change-Id: I3c6d9afc660f3d959d6138db84341e9460b877a1	2020-11-04 12:08:30 -08:00
Ivan Fraixedes	2dffaebc6f	satellite/accounting: Fix and enhance code doc comments Fix and enhance the source code documentation comments for the satellite/accounting package. Change-Id: I965742cf378e8b6b80d18bc84a4ff76e9af1e8b7	2020-11-04 09:50:48 +00:00
paul cannon	8616fc146d	satellite/orders: send IPs for graceful exit Storage nodes undergoing Graceful Exit have up to now been receiving hostnames for all other storage nodes they need to contact when transferring pieces. This adds up to a lot of DNS lookups, which apparently overwhelm some home routers. There does not seem to be any need for us to send hostnames for graceful exit as opposed to IP addresses; we already use IP addresses (as given by the last_ip_port column in the nodes table) for all the GET and PUT orders we send out. This change causes IP addresses to be used instead. I started trying to construct a test to ensure that the behavior changed, but it was rabbit-holing, so I've begun to feel that maybe this change doesn't require one; it is a very simple change, and very much of the same nature as what we already do for IPs in CreateGetOrderLimits and CreatePutOrderLimits (and others). Change-Id: Ib2b5ffe7a9310e9cdbe7464450cc7c934fa229a1	2020-11-04 00:17:20 +00:00
Cameron Ayer	dc67ce74c9	satellite: remove IsUp field from overlay.UpdateRequest With the new overlay.AuditOutcome type for offline audits, the IsUp field is redundant. If AuditOutcome != AuditOffline, then the node is online. In addition to removing the field itself, other changes needed to be made regarding the relationship between 'uptime' and 'audits'. Previously, uptime and audit outcome were completely separated. For example, it was possible to update a node's stats to give it a successful/failed/unknown audit while simultaneously indicating that the node was offline by setting IsUp to false. This is no longer possible under this changeset. Some tests that did this have been changed slightly in order to pass. Also add new benchmarks for UpdateStats and BatchUpdateStats with different audit outcomes. Change-Id: I998892d615850b1f138dc62f9b050f720ea0926b	2020-11-02 15:34:17 -05:00
Egon Elbre	7183dca6cb	all: fix defers in loop defer should not be called in a loop. Change-Id: Ifa5a25a56402814b974bcdfb0c2fce56df8e7e59	2020-11-02 15:06:38 +02:00
Egon Elbre	fd8e697ab2	{satellite,storagenode}/internalpb: use specific package name Ensure we don't register types with the same name into protobuf. Change-Id: I53d025863fff8c91a067ca5819befa87eb5e35bb	2020-10-30 17:31:08 +02:00
Michal Niewrzal	0205f0d807	satellite/metainfo: fix usage of types from internalpb After moving SatStreamID and SatSegmentID from common I missed changing some methods in metainfo endpoint. This change is a fix for that. Change-Id: I34e121fce47371ee4cfd92cce03809520b68859f	2020-10-30 16:03:45 +02:00
Egon Elbre	77c4f99fa0	satellite/internalpb: move delegated_repair.proto Change-Id: If4f37c52b151e09cf35d2145b463ef1e9ab529ae	2020-10-30 15:31:32 +02:00
Egon Elbre	11338e9beb	satellite/internalpb: move audithistory.pb Change-Id: I8eee84d49ed90459168ddaf04ae57f790c2a22c4	2020-10-30 15:30:11 +02:00
Egon Elbre	7ce372c686	satellite/internalpb: add inspectors Change-Id: Ib688e43d05135c0c31ae95df533f1e4535ea396a	2020-10-30 13:28:17 +02:00

1626 Commits