storj

Author	SHA1	Message	Date
Cameron Ayer	bb7be23115	satellite/{audit,overlay,satellitedb}: enable reporting offline audits - Remove flag for switching off offline audit reporting. - Change the overlay method used from UpdateUptime to BatchUpdateStats, as this is where the new online scoring is done. - Add a new overlay.AuditOutcome type: AuditOffline. Since we now use the same method to record offline audits as success, failure, and unknown, we need to distinguish offline audits from the rest. Change-Id: Iadcfe10cf13466fa1a1c2dc542db8994a6423355	2020-10-27 10:44:46 +00:00
Ethan	9a29ec5b3e	Add index to graceful_exit_transfer_queue table This fixes a slow query that was taking up to 4 seconds in production SELECT node_id, path, piece_num, root_piece_id, durability_ratio, queued_at, requested_at, last_failed_at, last_failed_code, failed_count, finished_at, order_limit_send_count FROM graceful_exit_transfer_queue WHERE node_id = '[redacted]' AND finished_at is NULL AND last_failed_at is NULL ORDER BY durability_ratio asc, queued_at asc LIMIT 300 OFFSET 0; Change-Id: Ib89743ca35f1d8d0a1456b20fa08c683ebdc1549	2020-10-26 14:47:48 +00:00
Moby von Briesen	7c3afe164b	satellite/overlay: uncomment dq for offline and disable with feature flag Change-Id: Ib39e2be32e880b822a94eddfb81af99a38843a27	2020-10-16 12:55:16 +00:00
Yaroslav Vorobiov	139a7ee959	private/migrate: add ability to create dbs during migration Use tagsql.DB pointer as step database, to propagate changes back and forth between actual database and migration. Adds CreateDB operation to the migration step to be able to create new dbs before executing migration action. Adjusts storagenode database migration to use inner tagsql.DB pointer of each database as step.DB. Adjusts satellite database migration, adds proxy migrationDB field to satellite db that wraps itself as tagsql.DB, pointer of which is used as step.DB. Change-Id: Ifed4de5b01a356cf7b37db64d2eaeb7b61982c5c	2020-10-15 15:28:04 +03:00
Stefan Benten	0b43b93259	satellite/satellitedb: make limits per default NULL This change completes the column migration of `5f6fccc6e8` and `2f648fd981`. It resets every users project limits who are below or equal to our current production defaults. Change-Id: Ie041d08bb67b62844f6023190fc00bc2dad5b1cb	2020-10-14 20:28:16 +00:00
Egon Elbre	2268cc1df3	all: fix linter complaints Change-Id: Ia01404dbb6bdd19a146fa10ff7302e08f87a8c95	2020-10-13 15:59:01 +03:00
Egon Elbre	0bdb952269	all: use keyed special comment Change-Id: I57f6af053382c638026b64c5ff77b169bd3c6c8b	2020-10-13 15:13:41 +03:00
Jeff Wendling	0f0faf0a9f	satellite/orders: do a better job limiting concurrent requests Doing it at the ProcessOrders level was insufficient: the endpoints make multiple database calls. It was a misguided attempt to only have one spot enter the semaphore. By putting it in the endpoint we can not only be sure that the concurrency is correctly limited but it can be configurable easily. Change-Id: I937149dd077adf9eb87fce52a1a17dc0afe96f64	2020-10-09 16:27:15 -04:00
Jeff Wendling	7c303208ff	satellite/satellitedb: emergency temporary order processing semaphore we have thundering herds of order submissions that take all of the database connections causing temporary periodic outages. limit the amount of concurrent order processing to 2. Change-Id: If3f86cdbd21085a4414c2ff17d9ef6d8839a6c2b	2020-10-08 19:16:47 +00:00
Cameron Ayer	b39a99bae6	satellite/{overlay,satellitedb}: always show node's real online score Previously if a node did not have audit history data for each of the windows over the tracking period, we would give them the benefit of the doubt and set their score to 1. This was to prevent nodes from being suspended right out the gate. We need a minimum amount of data to evaluate them. However, a node who is actually failing at being online will have no idea until they have received enough audits and we suspend them. Instead, we will always use their real score, but use a flag to determine whether they are eligible for suspension/dq. Change-Id: I382218f12e8770f95d4bcddcf101ef348940cadf	2020-10-02 12:28:11 -04:00
Cameron Ayer	c2525ba2b5	satellite/{repair,satellitedb}: clean up healthy segments from repair queue at end of checker iteration Repair workers prioritize the most unhealthy segments. This has the consequence that when we finally begin to reach the end of the queue, a good portion of the remaining segments are healthy again as their nodes have come back online. This makes it appear that there are more injured segments than there actually are. solution: Any time the checker observes an injured segment it inserts it into the repair queue or updates it if it already exists. Therefore, we can determine which segments are no longer injured if they were not inserted or updated by the last checker iteration. To do this we add a new column to the injured segments table, updated_at, which is set to the current time when a segment is inserted or updated. At the end of the checker iteration, we can delete any items where updated_at < checker start. Change-Id: I76a98487a4a845fab2fbc677638a732a95057a94	2020-09-29 20:38:22 +00:00
Egon Elbre	c23a8e3b81	go.mod: update pgx to v4.9.0 Fix query to use TextArray instead of VarcharArray. Fix queries to use the correct type. Change-Id: Ibb7e55adba277d05778118d81ca697470e72c374	2020-09-29 19:03:08 +00:00
Egon Elbre	2d27bc8787	satellite/satellitedb: separate cockroach for migration tests Currently Cockroach migration test is the most heavy with regards to schema changes. This causes other tests to time out. This adds an alternate cockroach instance that is used for migration tests. Change-Id: I01fe9313527ff002f0bb0914dd52c3645b8eaf6d	2020-09-29 09:31:33 +00:00
Jessica Grebenschikov	4a2c66fa06	satellite/accounting: add cache for getting project storage and bw limits This PR adds the following items: 1) an in-memory read-only cache that stores project limit info for projectIDs This cache is stored in-memory since this is expected to be a small amount of data. In this implementation we are only storing in the cache projects that have been accessed. Currently for the largest Satellite (eu-west) there are about 4500 total projects. So storing the storage limit (int64) and the bandwidth limit (int64), this would end up being about 200kb (including the 32 byte project ID) if all 4500 projectIDs were in the cache. So this all fits in memory for the time being. At some point it may not as usage grows, but that seems years out. The cache is a read only cache. When requests come in to upload/download a file, we will read from the cache what the current limits are for that project. If the cache does not contain the projectID, it will get the info from the database (satellitedb project table), then add it to the cache. The only time the values in the cache are modified is when either a) the project ID is not in the cache, or b) the item in the cache has expired (default 10mins), then the data gets refreshed out of the database. This occurs by default every 10 mins. This means that if we update the usage limits in the database, that change might not show up in the cache for 10 mins, which means it will not be reflected to limit end users uploading/downloading files for that time period. Change-Id: I3fd7056cf963676009834fcbcf9c4a0922ca4a8f	2020-09-25 16:28:49 +00:00
Stefan Benten	38108828ac	satellite/satellitedb: enable multiple projects existing users Change-Id: I2ef77182d5464d72574698c8abfbbfdbda3f5a9e	2020-09-23 18:17:38 +02:00
Stefan Benten	5f6fccc6e8	satellite/satellitedb: makes limits nullable change backwards compatible Our current endpoints bail on us, if the column data is null. Thus we need to take the intermediate step and set the default to a fixed value and reset those with the following release. It sets the default column value to our current config values of 50GB for storage and bandwidth and 100 buckets, while still enabling the field to be nullable. All 0 values are migrated to be the default as well to ensure they can keep using their projects, as with the original change, 0 actually means 0. Change-Id: I797be80ce2d2105091599dc1b3fc76f74336b66b	2020-09-23 17:54:42 +02:00
Stefan Benten	2f648fd981	satellite: make limits be nullable Currently we have no way to actually set one of the following limits to 0 (meaning not usable): - maxBuckets - usageLimit - bandwidthLimit With having the field nullable, NULL corresponds to the global default, 0 now actually 0 and a set value determines a custom limit. Change-Id: I92bb77529dcbd0881ae8368921be9d246eb0919e	2020-09-21 19:34:19 +00:00
Qweder93	8182fdad0b	storagenode: heldamount renamed to payouts, renamed some methods and structs to more meaningful names. grouped estimated payout with payouts satellite: heldamount renamed to SNOpayouts. Change-Id: I244b4d2454e0621f4b8e22d3c0d3e602c0bbcb02	2020-09-16 14:57:35 +00:00
Cameron Ayer	e7c34a053d	satellite/satellitedb: add column and index "updated_at" to injuredsegments Change-Id: I59e9bb2077885f09e17795375fe98ed31bd83d54	2020-09-14 12:53:04 -04:00
Michal Niewrzal	27a9d14e2a	satellite/repair: use metabase.SegmentKey type in repair package Another change which is a part of refactoring to replace path parameter (string/[]byte) with key parameter (metabase.SegmentKey) Change-Id: I617878442442e5d59bbe5c995f913c3c93c16928	2020-09-08 19:35:20 +00:00
Jennifer Johnson	4e2413a99d	satellite/satellitedb: uses vetted_at field to select for reputable nodes Additionally, this PR changes NewNodeFraction devDefault and testplanet config from 0.05 to 1. This is because many tests relied on selecting nodes that were reputable based on audit and uptime counts of 0, in effect, selecting new nodes as reputable ones. However, since reputation is now indicated by a vetted_at db field that is explicitly set rather than implied by audit and uptime counts, it would be more complicated to try to update all of the nodes' reputations before selecting nodes for tests. Now we just allow all test nodes to be new if needed. Change-Id: Ib9531be77408662315b948fd029cee925ed2ca1d	2020-09-04 16:45:32 +00:00
Michal Niewrzal	8649a00557	satellite/gracefulexit: replace `Path []byte` to `Key metabaseSegmentKey` TransferQueueItem We are unifying which name (and type) we are using for value we are using to point to segment. We want to use `key` instead of `path`. Dedicated type `metabase.SegmentKey` was created for this purposes also. This change is doing refactoring around gracefulexit. Change-Id: I90d51ff087b206179e61d5f1bc95f4709d76f917	2020-09-04 11:09:48 +00:00
Egon Elbre	dc48197bd8	satellite/orders: add bucket id to order limit Change-Id: I9019ec77d692e62ac17b67a1da71dc3535cde50c	2020-09-03 10:50:11 +03:00
Michal Niewrzal	0604a672c1	satellite/metainfo: use metabase in loop Change-Id: I1bb0c6fe0a762895fde950690b06f7dd9d77e178	2020-09-01 10:06:16 +00:00
Moby von Briesen	2d01dd9732	satellite/satellitedb: Add online_score column to nodes table Add online score used for the new audit history offline tracking system to the nodes table. This allows us easy access to the node's online score for the storagenode dashboard as well as for data analysis. Change-Id: Ie99be1192e5236862a5b3dbed2e5ef03b9169410	2020-08-31 15:07:07 +00:00
Moby von Briesen	60a95d0dc9	satellite/{satellitedb,overlay}: Enable offline suspension and review period When a node's audit history "online score" passes below a configured threshold, the node goes into "offline suspension" mode and begins a review period, where the operator is given an opportunity to bring their node back online. After the review period passes, offline suspension is turned off for the node. In the future, if a node still has a bad online score at the end of the review period, it will be disqualified. This is disabled right now. In the future, if a node is in offline suspension, it will be treated as "unhealthy". Right now, there are no consequences for being in offline suspension. Minor changes: * Moves AuditHistoryConfig out of UpdateStats/BatchUpdateStats args and into UpdateRequest. * Adds "now" argument to UpdateStats/BatchUpdateStats args for easy testing. * Changes formatting strings inside buildUpdateStatement to use specific types. Change-Id: I032b60298840fc16e6ef831da750f2d57619a397	2020-08-28 16:35:48 +00:00
Bill Thorp	729079965f	satellite/satellitedb: remove migration steps 69-102 Jenkins has been failing a lot lately due to test timeouts with CockroachDB. TestMigrateCockroach previously took around 5 minutes, now it takes 2. Why 103? I couldn't get 100 to work due to an error w/ NOT NULL and PKs. Change-Id: Iec95d4e25f9d6cd36920e7f43272c486a17fa879	2020-08-27 07:36:05 +00:00
Moby von Briesen	959cd5cd83	satellite/satellitedb: Update audit history from overlay.UpdateStats and overlay.BatchUpdateStats Change-Id: Ib530b61895ca4a8b12ba022c408a416b237b56d7	2020-08-20 22:46:28 +00:00
Moby von Briesen	5f0477ebe9	satellite/{overlay,satellitedb}: Create database functionality for updating audit history Add a function to the overlay cache called UpdateAuditHistory, which allows us to add online or offline audits to a particular node's audit history, and get that node's "online score" for the configured tracking period. The next step will be to use UpdateAuditHistory from inside BatchUpdateStats/UpdateStats, so that audit history is actually updated when nodes get audited, and we can suspend nodes based on their online score. Change-Id: I2289105e6961e68e829a987ff756b0e576fab120	2020-08-20 17:34:27 +00:00
Egon Elbre	94a09ce20b	all: add missing dots Change-Id: I93b86c9fb3398c5d3c9121b8859dad1c615fa23a	2020-08-11 17:50:01 +03:00
Ethan	ab1d0f097d	satellite/storageusage: Group accounting rollups at_rest_total by day When investigating a gap in storage usage data in the SN dashboard, I noticed that there were 2 entries in the accounting_rollups table on the date of the gap. This change accounts for multiple entries in the accounting_rollups table for a given day. Change-Id: Ibf2b5d0455117cb0417163e8fcfb7e509d594171	2020-08-10 15:03:15 +00:00
Kaloyan Raev	7552ff26ec	satellite/db: drop project_invoice_stamps table It's an obsolete table from earlier state of Stripe invoices implementation. No code is currently using it. It is confirmed that this table is currently empty across all satellites. Change-Id: I12d2756578faf8418ea8f3b09088e885694b8925	2020-08-10 13:22:10 +00:00
Kaloyan Raev	edfd3d7661	satellite/payments: delete `credits` and `credits_spendings` db tables Jira: https://storjlabs.atlassian.net/browse/USR-822 This is the last step of dropping these 2 db tables. It also deletes all code associated with them. Change-Id: I8be840dc2a7be255cf6308c9434b729fe4d9391e	2020-07-30 12:19:57 +03:00
Egon Elbre	36ed939b89	satellite/orders: add buckets db to service We need to add bucket UUID into the order limit, hence we need access to the buckets table. Change-Id: I348ce1f709c9fcdec5c4034acaab59805b33da9f	2020-07-24 17:36:49 +03:00
Ethan	cfca021839	satellite/accounting: Add chore to cleanup old project bandwidth rollups data Removes old project_bandwidth_rollups records that are no longer used. Uses a retain months configuration to determine how many months to save. Current month cannot be removed. Tests retainMonths=-1, 0, 2 Change-Id: Ia4be2546cdb28802427acf41ecd85ad66df3e62c	2020-07-22 18:56:49 +00:00
Bill Thorp	65408db6e0	satellite/satellitedb: Coinpayments repeat insert bug fix I introduced a bug with https://review.dev.storj.io/c/storj/storj/+/2216 Because the log change allowed insert to be called multiple times. This changes the insert logic to do nothing if the PK already exists. Change-Id: I90d192a0f6619bfbb360ea104066f00a3348f6dd	2020-07-20 20:21:35 +00:00
Isaac Hess	67a292d135	satellite/satellitedb: Monitor node tallies We are adding a monkit evaluation for the total sum of data stored on the nodes before it is inserted into the database. This will give us a time-series history of total data stored so we can see it change over time. Change-Id: I41145a2d7a09c8e63b42ae578bd081035b60e529	2020-07-17 10:21:42 -06:00
Egon Elbre	d8dcae3075	all: fix error checking Change-Id: Ia0da1bbd6ce695139922f94096c2419281905e32	2020-07-16 19:13:14 +03:00
Egon Elbre	e70da5cd4e	all: fix comments Change-Id: I2d2307e3fab87de47a72b3595d051e2c95ff4f8a	2020-07-16 19:13:14 +03:00
Egon Elbre	080ba47a06	all: fix dots Change-Id: I6a419c62700c568254ff67ae5b73efed2fc98aa2	2020-07-16 14:58:28 +00:00
stefanbenten	9ace375ee0	satellite/{console,satellitedb}: change project limiting based on new users field This change switches the backend logic to use the new DB column on the users table to restrict project creation. Furthermore it back fills the existing limits from registration tokens to the new column to ensure no users are reset to the new default. UI is updated to reflect ability to create several projects Change-Id: Ie29157430ae6b065411ca4c4557c9f1be69cdc4f	2020-07-16 10:57:47 +00:00
stefanbenten	0209a2095f	satellite/{console,satellitedb}: add project_limit column to users table Change-Id: I603f085f17ca5b413dd1c6837c2081f9e7e791a1	2020-07-15 17:27:31 +00:00
stefanbenten	2c2d284f3d	satellite/admin: add bucket limit handling endpoint Change-Id: I4b199277cff30f11f4a9fff3b0ac4017b694f2e8	2020-07-15 17:27:23 +00:00
Jennifer Johnson	784a156eea	satellite: prevents uplink from creating a bucket once it exceeds the max bucket allocation. Change-Id: I4b3822ed723c03dbbc0df136b2201027e19ba0cd	2020-07-15 17:27:05 +00:00
stefanbenten	257855b5de	all: replace == comparison with errors.Is Change-Id: I05d9a369c7c6f144b94a4c524e8aea18eb9cb714	2020-07-14 15:50:25 +00:00
stefanbenten	0a32ba0e6b	satellite/admin: add project rename functionality Change-Id: I4c0f42d4c2c26859279f247f94cef97a8ff630a9	2020-07-14 11:36:49 +00:00
stefanbenten	f768302c91	satellite/admin: harden project deletion requirements Change-Id: Ia7ea469f87469b16e464dc22af24b98a6ef1873d	2020-07-14 11:36:29 +00:00
Jessica Grebenschikov	8abb907010	satellite/orders: add settle orders with window Why: We need a way to cut down on database traffic due to bandwidth measurement and tracking. What: This changeset is the Satellite side of settling orders in 1 hr windows. See design doc for more details: https://review.dev.storj.io/c/storj/storj/+/1732 Change-Id: I2e1c151e2e65516ebe1b7f47b7c5f83a3a220b31	2020-07-13 15:41:29 -07:00
paul cannon	bbdb351e5e	all: use jackc/pgx in place of lib/pq What: Use the github.com/jackc/pgx postgresql driver in place of github.com/lib/pq. Why: github.com/lib/pq has some problems with error handling and context cancellations (i.e. it might even issue queries or DML statements more than once! see https://github.com/lib/pq/issues/939). The github.com/jackc/pgx library appears not to have these problems, and also appears to be better engineered and implemented (in particular, it doesn't use "exceptions by panic"). It should also give us some performance improvements in some cases, and even more so if we can use it directly instead of going through the database/sql layer. Change-Id: Ia696d220f340a097dee9550a312d37de14ed2044	2020-07-13 15:54:41 +00:00
Egon Elbre	9dc9cd8a17	tests: allow STORJ_TEST_POSTGRES STORJ_POSTGRES_TEST naming was not consistent with STORJ_SIM_POSTGRES. This allows to use STORJ_TEST_POSTGRES for clarity, it still has a fallback to STORJ_POSTGRES_TEST. Change-Id: I6f294c66c80fcfd6750fea2a89795f3b7f5dd691	2020-07-10 16:43:49 +03:00

584 Commits