storj

Author	SHA1	Message	Date
Egon Elbre	85fb964afe	satellite/{metainfo,overlay}: improvements to GetObjectIPs * Deduplicate NodeID list prior to fetching IPs. * Use NodeSelectionCache for fetching reliable IPs. * Return number of segments, reliable pieces and all pieces. Change-Id: I13e679caab275488b4037624b840a4068dad9589	2021-01-14 09:12:45 +00:00
Egon Elbre	d11c2b709e	go.mod: bump storj.io/common * Add missing endpoints. * Fix deprecated packages and funcs. Change-Id: I756090c46a4d15eabf6d413a593cdc64c5809bc7	2021-01-13 14:51:08 +00:00
Ivan Fraixedes	a4d06b9b1e	satellite/metainfo: Don't response errors when Redis down To have resilient multi-region satellites, we cannot stop processing client upload/download requests when Redis isn't responding properly. These changes avoid stopping the processing of client requests when we cannot check whether the client exceeds its storage or bandwidth limits, or cannot update its used storage/bandwidth, because Redis is not responding successfully or the satellite database returns an error. Change-Id: Ia7f12c07fc9ffdfad0e7ff052ff3fd81eca0f0e3	2021-01-13 14:30:44 +00:00
Ivan Fraixedes	a73c59bbdd	satellite/console/consoleweb: Change status codes usage limits Respond to the HTTP clients which request the project usage limits with different status codes depending on the error class returned by the satellite/accounting Service. Change-Id: I6f486ea55517f616c7cec81dbbe77e997484180f	2021-01-13 15:00:12 +01:00
Cameron Ayer	0184d33e96	satellite/satellitedb: set default 0 on uptime columns This is the first step in the removal of uptime columns on the nodes table. These columns are no longer used: uptime_success_count total_uptime_count uptime_reputation_alpha uptime_reputation_beta In order to avoid breaking backwards compatibility, we need to remove all references to these columns before removing the columns themselves from the database. However, since uptime_success_count and total_uptime_count are NOT NULLABLE, we can't remove them from the insert statements in the overlay. So we can't remove the columns because of the references, and we can't remove the references because the columns can't be null. What a pickle. To remedy this, we will set a default on the columns. Then we should be able to remove them from the insert statements Change-Id: I75f6c56fb7897835bbf29869f86f39de1d9dd345	2021-01-12 17:44:37 +00:00
Ivan Fraixedes	ce26616647	satellite/accounting/live: Use Redis client directly We have to adapt the live accounting so that the packages using it can differentiate between errors, ignore them where appropriate, and make our satellite resilient to Redis downtime. Differentiating errors requires changes in the live accounting and also in storage/redis.Client; however, changing the latter may need some dirty workarounds or break other parts of the implementation that depend on it. On the other hand, we want to get rid of storage/redis.Client because it has more functionality than we are using, and work to remove it has already started. Hence, we have refactored the live accounting to use the Redis client library directly, so that a future commit can adapt the satellite to be resilient to Redis downtime. Last but not least, a test for expired bandwidth keys has been added, and with it a bug was spotted and fixed. Change-Id: Ibd191522cd20f6a9a15e5ccb7beb83a678e530ff	2021-01-12 15:33:29 +01:00
JT Olio	1ad69b9f96	satellite: make external address calc more robust Change-Id: Icbdc6fb4e3fc85076dcb7bcc4b3ec36baad308d2	2021-01-11 16:47:25 +00:00
Cameron Ayer	0403e99a5b	satellite/{overlay,satellitedb}: remove unused methods for old downtime tracking GetSuccessfulNodeNotCheckedInSince and GetOfflineNodesLimited are overlay methods which were only used by the previous downtime tracking system which has been removed. These methods should also be removed. Change-Id: Idb829d742e1f987e095604423fff656fe581183e	2021-01-11 15:21:28 +00:00
Egon Elbre	24833465e6	satellite/metainfo/metabase: avoid magic constant Change-Id: I4f01e38f67e18ae9cb9845a8e75a987acba66427	2021-01-11 10:22:21 +00:00
Jessica Grebenschikov	1709117b0d	satellite/console/wasm: add more unit tests Change-Id: Ie134f8a08d690ce013039ed1a4e484f8b6a1a6d5	2021-01-08 18:50:29 +00:00
Jeff Wendling	2d2359667d	satellite/orders: remove unused satelliteAddress field Change-Id: I58091769472688433c48becc8dfc9029bddd87aa	2021-01-08 12:25:39 -05:00
Egon Elbre	ba5461562d	satellite/orders: remove satellite address SatelliteAddress in OrderLimit is not being used anymore and some satellite addresses may consume too many bytes. Change-Id: Ic7a0efe5b6211c2f3b91af67b293cde98b29d074	2021-01-08 16:57:36 +00:00
Egon Elbre	51731db121	satellite/orders: use smaller encrypted metadata Avoid using project uuid string representation, because it uses more bandwidth. This reduces the encrypted metadata size from 118 -> 97 bytes. Change-Id: Ic53a81b83acc065f24f28cd404f9c0b1fe592594	2021-01-08 16:40:31 +00:00
JT Olio	8907180e81	satellite: pass contact.external-address config to web ui Change-Id: I54978aa34aa9eb98876fab6460a5737d718d6135	2021-01-06 10:11:20 -07:00
Egon Elbre	9cb4466eb0	cmd/storj-sim: use dev setup by default for consistency Fixes bug when using release binaries together with storj-sim. Change-Id: I077bedc1486ac85aa1f04fcc0ed4098cd313f2fc	2021-01-05 13:47:30 +02:00
Moby von Briesen	a90d6fcad8	satellite/repair/checker: Use segment health on checker insert Do not insert the number of healthy pieces for segment health anymore. Rather, insert the segment health calculated by our new priority function. Change-Id: Ieee7fb2deee89f4d79ae85bac7f577befa2a0c7f	2021-01-04 11:48:17 -05:00
Moby von Briesen	6e2ef3b9ee	Revert "satellite/satellitedb: Do not consider nodes with offline_suspended as reputable." This reverts commit `e24262c2c9`. Change-Id: I287deb2e52d03bbd698ed055f0f216b0b5bf2798	2021-01-04 14:28:37 +00:00
Michał Niewrzał	d4ebdba48c	satellite/payments/stripecoinpayments: fix tests failing in 2021 We had some tests with hardcoded year 2020. Change-Id: I0184c3ece819cb764eb305751a1d8d4056b6af17	2021-01-04 10:47:31 +01:00
Moby von Briesen	edbee53888	satellite,storagenode: Pass audit history over GetStats endpoint Full prefix: satellite/{overlay,nodestats},storagenode/{reputation,nodestats} Allow the storagenode to receive its audit history data from the satellite via the satellite's GetStats endpoint. The storagenode does not save this data for use in the API yet. Change-Id: I9488f4d7a4ccb4ccf8336b8e4aeb3e5beee54979	2020-12-30 19:13:26 +00:00
Rafael Gomes	8b2e4bfa7e	satellite/metainfo/piecedeletion Remove spaces from metrics. Change-Id: Iaf1d8a96a43087f2fcc579347f581e8a78a0fb58	2020-12-30 14:27:39 -03:00
paul cannon	7246368ca1	satellite/repair: clamp totalNodes to 100 or higher Change-Id: I239418ed3671b1cee30b0b1797dc434244e72448	2020-12-30 10:39:14 -06:00
Moby von Briesen	825dc71227	satellite/{overlay, satellitedb}: Refactor audit history * Separate audit history interface into its own file in the overlay package * Add overlay.AuditHistory struct so that internalpb.AuditHistory is only used from within the database layer * Add overlay.GetAuditHistory function for features that will require access to detailed audit history information * Do not return full audit history from UpdateAuditHistory - callers to that function only need to know the online score and whether a full tracking period has been completed * Move audit history tests out of satellite/satellitedb, since they are independent of database implementation Change-Id: I35b0c4ac23bbaabd80624f8a9631c3cb1a1f33bd	2020-12-29 18:50:22 +00:00
Moby von Briesen	85ae13f11d	satellite/satellitedb: Drop nodes_offline_times table. Now that the deprecated downtime tracking service is removed (`3fc76f4ffe`), we can safely remove the nodes_offline_times table. Change-Id: Ia7c6efe32ba104dff5a830af5f2beee3337eefe5	2020-12-29 18:17:50 +00:00
Moby von Briesen	e24262c2c9	satellite/satellitedb: Do not consider nodes with offline_suspended as reputable. Nodes which are offline_suspended will no longer be considered for new uploads. The current threshold that enters a node into offline suspension is 0.6. Disqualification for offline suspension is still disabled. Change-Id: I0da9abf47167dd5bf6bb21e0bc2186e003e38d1a	2020-12-29 17:59:09 +00:00
Stefan Benten	ad58459198	satellite/admin: allow more than just "paid" invoice status during user deletion Currently we do not allow anything other than the "paid" status for invoices when trying to delete a user. However there can be a couple of other states that are still fine to accept during deletion of a user. This change reverses the order to check for the status that we do not want to allow. Change-Id: I78d85af6438015c55100fa201ccffc731c91de1c	2020-12-23 16:40:44 +01:00
JT Olio	7faaeed2bf	satellite/access grant wizard: don't hardcode the satellites Change-Id: Id9fbf68882cdb2fce846b7a2604cf965cc53ab1a	2020-12-22 21:24:45 -07:00
JT Olio	efde103dba	accounting: rollup test is broken for the hour before midnight UTC this change isn't the real fix. it's just ignoring the problem. i don't know what the real fix is. is the problem with the test, or is there actually a problem with the rollup code? Change-Id: I552bdd947deadc212cc56efc5f818942b9827126	2020-12-22 14:14:52 -07:00
Ethan Adams	6070018021	satellite/overlay: use AS OF SYSTEM TIME with Cockroach Query nodes table using AS OF SYSTEM TIME '-10s' (by default) when on CRDB to alleviate contention on the nodes table and minimize CRDB retries. Queries for standard uploads are already cached, and node lookups for graceful exit uploads has retry logic so it isn't necessary for the nodes returned to be current.	2020-12-22 21:07:07 +02:00
Ethan Adams	563197c628	satellite/overlay: Add index on nodes table (#4012) satellite/accounting: Add index for project_id on bucket_storage_tallies	2020-12-21 12:48:48 -05:00
Ethan Adams	9b52283570	satellite/accounting: Add index for project_id on bucket_storage_tallies (#4010) Change-Id: I47ab2d1e24f94307c3383c497cffe2a150fa8ab7	2020-12-21 11:42:00 -05:00
Ethan Adams	6e501898c3	satellite/accounting: Performance improvements to getNodeIds used by GetBandwidthSince (#4009)	2020-12-21 16:37:01 +01:00
Jessica Grebenschikov	d961437889	satellite/orders: remove the config IncludeEncryptedMetadata Since the Satellite now requires the order encryption functionality (since serial_number table is deprecated) to properly function, we can remove the config flag to turn on/off the feature. Change-Id: Ie973f72a9a05a81cef9e53dc9c99d22c940c2488	2020-12-18 10:39:29 -08:00
Jessica Grebenschikov	da0327c9b7	satellite/dbcleanup: remove expired serial chore Change-Id: Ib71d41eb6679d6435e5bc10b6244dac66380a74e	2020-12-18 09:36:28 -08:00
Jessica Grebenschikov	97a5e6c814	satellite/orders: stop inserting/reading from serial_numbers table This PR contains the minimum changes needed to stop inserting into the serial_numbers table. This is the first step in completely deprecating that table. The next step is to create another PR to remove the expiredSerial chore, fix more tests, and remove any other methods on the serial_number table. Change-Id: I5f12a56ebf3fa4d1a1976141d2911f25a98d2cc3	2020-12-18 08:35:13 -08:00
littleskunk	2437d5b171	satellite/access-grants: default auth service url (#4002) * satellite/access-grants: default auth service url	2020-12-17 23:38:16 +01:00
paul cannon	d3604a5e90	satellite/repair: use survivability model for segment health The chief segment health models we've come up with are the "immediate danger" model and the "survivability" model. The former calculates the chance of a segment becoming lost in the next time period (using the CDF of the binomial distribution to estimate the chance of x nodes failing in that period), while the latter estimates the number of iterations for which a segment can be expected to survive (using the mean of the negative binomial distribution). The immediate danger model was a promising one for comparing segment health across segments with different RS parameters, as it is more precisely what we want to prevent, but it turns out that practically all segments in production have infinite health, as the chance of losing segments with any reasonable estimate of node failure rate is smaller than DBL_EPSILON, the smallest possible difference from 1.0 representable in a float64 (about 1e-16). Leaving aside the wisdom of worrying about the repair of segments that have less than a 1e-16 chance of being lost, we want to be extremely conservative and proactive in our repair efforts, and the health of the segments we have been repairing thus far also evaluates to infinity under the immediate danger model. Thus, we find ourselves reaching for an alternative. Dr. Ben saves the day: the survivability model is a reasonably close approximation of the immediate danger model, and even better, it is far simpler to calculate and yields manageable values for real-world segments. The downside to it is that it requires as input an estimate of the total number of active nodes. This change replaces the segment health calculation to use the survivability model, and reinstates the call to SegmentHealth() where it was reverted. It gets estimates for the total number of active nodes by leveraging the reliability cache. Change-Id: Ia5d9b9031b9f6cf0fa7b9005a7011609415527dc	2020-12-17 21:30:17 +00:00
littleskunk	3feee9f4f8	satellite/accounting: default project limits (#4001)	2020-12-17 22:27:05 +01:00
Cameron Ayer	28eaae66af	satellite/satellitedb: drop num_healthy_pieces column from injuredsegments This column is no longer used as it has been replaced by the segment_health column. Change-Id: I6b4df89cd4f994d8418976f88e8c5f57615f8115	2020-12-17 20:17:08 +00:00
VitaliiShpital	f4bbd0f5df	web/satellite: use brotli instead of gzip WHAT: we'll use brotli instead of gzip from now on WHY: better compression Change-Id: Ibeadd6bfc783e9c15cf3f62f719af692071a7721	2020-12-17 19:23:44 +00:00
Egon Elbre	12055e7864	all: minor cleanups Change-Id: I4248dbe36a62a223b06135254b32851485a2eec1	2020-12-16 10:47:46 +00:00
Cameron Ayer	8c52bb3a18	satellite/checker: use numHealthy as segment health in repair queue A few weeks ago it was discovered that the segment health function was not working as expected with production values. As a bandaid, we decided to insert the number of healthy pieces into the segment health column. This should have effectively reverted our means of prioritizing repair to the previous implementation. However, it turns out that the bandaid was placed into the code which removes items from the irreparable db and inserts them into the repair queue. This change: insert number of healthy pieces into the repair queue in the method, RemoteSegment Change-Id: Iabfc7984df0a928066b69e9aecb6f615253f1ad2	2020-12-15 17:16:59 -05:00
Cameron Ayer	2ac72eaf16	satellite/repair/checker: add new monkit stats tagged with rs scheme There is a new checker field called statsCollector. This contains a map of stats pointers where the key is a stringified redundancy scheme. stats contains all tagged monkit metrics. These metrics exist under the key name, "tagged_repair_stats", which is tagged with the name of each metric and a corresponding rs scheme. As the metainfo observer works on a segment, it checks statsCollector for a stats corresponding to the segment's redundancy scheme. If one doesn't exist, it is created and chained to the monkit scope. Now we can call Observe, Inc, etc on the fields just like before, and they have tags! durabilityStats has also been renamed to aggregateStats. At the end of the metainfo loop, we insert the aggregateStats totals into the corresponding stats fields for metric reporting. Change-Id: I8aa1918351d246a8ef818b9712ed4cb39d1ea9c6	2020-12-15 14:08:01 +00:00
Stefan Benten	9fe477899b	satellite/satellitedb: add lint ignore rule to support staticcheck 2020.2 staticcheck 2020.2 is not liking our dbx files, so we need to ignore them. Change-Id: I6becc3619bb088473f9776d0878ce240d4935936	2020-12-14 21:16:31 +00:00
Jessica Grebenschikov	3cc98de3ee	satellite/console/wasm: reduce size to <9MB Make changes so that we only import the necessary files from the console package so that the generated wasm code is as small as possible. This change gets the compiled wasm code down to 8.6MB uncompressed and 2MB when compressed with `gzip --best`. https://review.dev.storj.io/c/storj/storj/+/3396 Change-Id: Ifdd4be285810757b46bbbe43327c0d0139e5f8f7	2020-12-14 16:41:39 +00:00
Ivan Fraixedes	2dddcffe43	satellite/accounting/rollout: Remove unused variable Remove a declared variable that's set but never read nor passed to any function, so it's unused code. Change-Id: I8daf9d1f71d29ab39d7a80011d1b4813ada1c67d	2020-12-14 14:11:41 +00:00
Brandon Iglesias	ca1e6b9756	Adding Fastly (#3994)	2020-12-11 15:53:05 +02:00
Stefan Benten	8fe829d5fd	build: add wasm bits to Dockerfile and bump to go v1.15.6 (#3992)	2020-12-11 02:23:39 +01:00
Jessica Grebenschikov	0649d2b930	satellite/repair: improve contention for injuredsegments table on CRDB We migrated satelliteDB off of Postgres and over to CockroachDB (crdb), but there was way too high contention for the injuredsegments table so we had to rollback to Postgres for the repair queue. A couple things contributed to this problem: 1) crdb doesn't support `FOR UPDATE SKIP LOCKED` 2) the original crdb Select query was doing 2 full table scans and not using any indexes 3) the SLC Satellite (where we were doing the migration) was running 48 repair worker processes, each of which run up to 5 goroutines which all are trying to select out of the repair queue and this was causing a ton of contention. The changes in this PR should help to reduce that contention and improve performance on CRDB. The changes include: 1) Use an update/set query instead of select/update to capitalize on the new `UPDATE` implicit row locking ability in CRDB. - Details: As of CRDB v20.2.2, there is implicit row locking with update/set queries (contention reduction and performance gains are described in this blog post: https://www.cockroachlabs.com/blog/when-and-why-to-use-select-for-update-in-cockroachdb/). 2) Remove the `ORDER BY` clause since this was causing a full table scan and also prevented the use of the row locking capability. - While long term it is very important to `ORDER BY segment_health`, the change here is only supposed to be a temporary bandaid to get us migrated over to CRDB quickly. Since segment_health has been set to infinity for some time now (re: https://review.dev.storj.io/c/storj/storj/+/3224), it seems like it might be ok to continue not making use of this for the short term. However, long term this needs to be fixed with a redesign of the repair workers, possibly in the trusted delegated repair design (https://review.dev.storj.io/c/storj/storj/+/2602) or something similar to what is recommended here on how to implement a queue on CRDB https://dev.to/ajwerner/quick-and-easy-exactly-once-distributed-work-queues-using-serializable-transactions-jdp, or migrate to rabbit MQ priority queue or something similar. This PR's improved query uses the index to avoid full scans and also locks the row it's going to update, and CRDB retries for us if there are any lock errors. Change-Id: Id29faad2186627872fbeb0f31536c4f55f860f23	2020-12-10 09:51:26 -08:00
Michal Niewrzal	c2a97aeb14	satellite/satellitedb: add ListAllBuckets method We need to be able to list all buckets in DB without knowing project ID. This method will be used to list buckets for metainfo loop implementation based on metabase. Change-Id: Iac75af0eee4f31e80a15577575a8249cbca787b2	2020-12-10 14:19:27 +00:00
Stefan Benten	494bd5db81	all: golangci-lint v1.33.0 fixes (#3985)	2020-12-05 17:01:42 +01:00
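
The message of commit d3604a5e90 describes the survivability model only in words: the expected number of iterations a segment survives, taken from the mean of the negative binomial distribution and given an estimate of the total number of active nodes. A minimal sketch of one way to read that description follows. It is not the satellite's SegmentHealth() implementation. The function name, its parameters, and the simplification that every node failure hits one of the segment's pieces with a fixed probability healthy/totalNodes are all assumptions made for illustration.

```go
package main

import "fmt"

// expectedFailuresSurvived is a hypothetical helper, not the satellite's
// SegmentHealth(). It assumes each node failure in the network independently
// hits one of the segment's healthy pieces with fixed probability
// healthy/totalNodes. The segment is lost once (healthy - minPieces + 1)
// failures have hit it. The mean of the negative binomial distribution then
// gives the expected number of failures that miss the segment before that.
func expectedFailuresSurvived(healthy, minPieces, totalNodes int) float64 {
	if healthy <= 0 || healthy < minPieces {
		return 0 // already below the reconstruction threshold
	}
	if totalNodes < healthy {
		totalNodes = healthy // the estimate cannot be smaller than the pieces we see
	}
	hitsToLose := float64(healthy - minPieces + 1)
	hit := float64(healthy) / float64(totalNodes)
	// Mean number of misses before hitsToLose hits: r * (1 - p) / p.
	return hitsToLose * (1 - hit) / hit
}

func main() {
	// Illustrative values only: 29 pieces needed, 1000 active nodes assumed.
	for _, healthy := range []int{35, 50, 80} {
		fmt.Printf("healthy=%d expected failures survived=%.1f\n",
			healthy, expectedFailuresSurvived(healthy, 29, 1000))
	}
}
```

Unlike a probability that rounds to 1.0 in a float64, a quantity of this kind stays finite and comparable across segments with different RS parameters, which is the property the commit message highlights.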


1584 Commits