storj

Author	SHA1	Message	Date
Jennifer Johnson	699b635e5d	satellite/overlay: rename newNodePercentage to newNodeFraction Change-Id: Ie66de91f88183b44de0773589e83e4ade9aa997a	2020-03-19 20:09:32 +00:00
Qweder93	0df586c3a8	satellitedb/heldamount updated, tests added + storagenode console updated Change-Id: I10f568a426d0fc42069d025de2accbef5b26dc0c	2020-03-19 15:37:45 +02:00
Kaloyan Raev	10b032e484	libuplink: return deleted bucket/object (step 4) Switch back to the original DeleteBucket and DeleteObject methods. Next step: remove the DeleteBucketReturnDeleted and DeleteObjectReturnDeleted from storj.io/uplink. Change-Id: I273a305326d411e51ce354ce72fcc6ecadf4dd5f	2020-03-19 13:32:07 +02:00
Jessica Grebenschikov	5142874144	satellite/gc: move garbage collection to its own process Change-Id: I7235aa83f7c641e31c62ba9d42192b2232dca4a5	2020-03-18 16:44:01 +00:00
Egon Elbre	09e0f3de63	satellite/metainfo/piecedeletion: add Service Change-Id: Id7e32ed569701fa0be66f9527c43a67052994570	2020-03-18 14:50:08 +00:00
Bill Thorp	94c11c5212	satellite: remove some unnecessary UTC() calls Fixes some easy cases of extraneous UTC() calls Change-Id: I3f4c287ae622a455b9a492a8892a699e0710ca9a	2020-03-13 13:49:44 +00:00
Jeff Wendling	41887883f3	satellite/satellitedb: check indexes on migration Change-Id: I5ba7ae2b512d77c70405ce332158f12128e27eed	2020-03-13 10:45:22 +00:00
JT Olio	051569c69f	satellite: enable open registration (and add flag that disables it) SM-441 Change-Id: I47bfedb312089f6d2bfbab013bd74ad4b8aa5f5e	2020-03-11 03:53:34 +01:00
paul cannon	79553059cb	satellite/repair: put irreparable segments in irreparableDB Previously, we were simply discarding rows from the repair queue when they couldn't be repaired (either because the overlay said too many nodes were down, or because we failed to download enough pieces). Now, such segments will be put into the irreparableDB for further and (hopefully) more focused attention. This change also better differentiates some error cases from Repair() for monitoring purposes. Change-Id: I82a52a6da50c948ddd651048e2a39cb4b1e6df5c	2020-03-09 21:45:16 +00:00
Michal Niewrzal	d7b5df70d3	cmd/uplink: remove unused flag The new API has a limited number of options to configure at the moment. We should remove unused flags from the Uplink CLI and add them back if needed in the future. Change-Id: Icf3f3dadd43cb61a3b408b02d0762aef34425dbf	2020-03-09 13:44:46 +00:00
Michal Niewrzal	c20cf25f35	cmd: migrate uplink CLI to new API Change-Id: I8f8fcc8dd9a68aac18fd79c4071696fb54853a60	2020-03-09 13:26:29 +00:00
Egon Elbre	f4d5d89b68	private/testplanet: add WaitForStorageNodeEndpoints After calling uplink.Upload it is not guaranteed that the storage node has yet saved all the orders since it happens asynchronously. Hence we need a separate func to wait for them to complete. Change-Id: I0c34b3ea6c98dbcf37f80493c0e10a8bdbbb2aaf	2020-03-05 10:33:56 +00:00
Jennifer Johnson	1c1750e6be	removes bandwidth limiting On satellite, remove all references to free_bandwidth column in nodes table. On storage node, remove references to AllocatedBandwidth and MinimumBandwidth and mark as deprecated. Protobuf message, NodeCapacity, is left intact for backwards compatibility. Once this is released to all satellites, we can drop the column from the DB. Change-Id: I2ff6c6537fc9008a0c5588e951afea58ede85838	2020-03-04 14:04:00 +00:00
Cameron Ayer	7244a6a84e	storagenode/{contact, piecestore}: implement low disk notification with cooldown When a storagenode begins to run low on capacity, we want to notify the satellite before completely running out of space. To achieve this, at the end of an upload request, the SN checks if its available space has fallen below a certain threshold. If so, it triggers a notification to the satellites. The new NotifyLowDisk method on the monitor chore is implemented using the common/sync2.Cooldown type, which allows us to execute contact only once within a given timeframe, avoiding hammering the satellites with requests. This PR contains changes to the storagenode/contact package, namely moving methods involving the actual satellite communication out of Chore and into Service. This allows us to ping satellites from the monitor chore. Change-Id: I668455748cdc6741291b61130d8ef9feece86458	2020-03-03 10:45:37 -05:00
Michal Niewrzal	d384e48ad7	private/testplanet: set rollout seed to avoid warnings in logs Each test log starts with warnings like this: "rollout config error: empty seed {"binary": "Identity"}". It makes no sense to print them and pollute the output. Change-Id: Ib50e28d09d8b259106d3b79d8f1262954a7aed63	2020-03-03 12:58:54 +00:00
Egon Elbre	decb2ec69a	private/processgroup: moved to storj.io/common/processgroup Change-Id: I1ec0bb440dda757d8f9a6f564a0084dde2f9cc84	2020-03-03 10:50:33 +00:00
Jeff Wendling	443aa08a06	private/dbutil/txutil: remove the individual retry events Change-Id: I63d06e57d7e6723b4d00d51f77c46345a11c4671	2020-03-03 08:38:19 +00:00
Qweder93	484ec7463a	storagenode: notifications on outdated software version Change-Id: If19b075c78a7b2c441e11b783c3c09fed55060c7	2020-03-02 16:48:02 +00:00
Egon Elbre	1f7c3be8f9	private/testplanet: add option to run testplanet databases non-parallel NonParallel running is needed for gateway tests, because minio unfortunately relies on global state. Change-Id: If730db2ab86d10f4d02e1ac3128f758e9c18cdff	2020-02-27 15:49:22 +02:00
Egon Elbre	f85606b5a7	private/grpctlsopts: grpc related tlsopts This moves grpc related tlsopts methods to private/grpctlsopts. This allows removing the grpc dependency from tlsopts. Change-Id: I25090b82b1e7a0633417ad600f8587b0c30ace73	2020-02-26 22:46:06 +02:00
Egon Elbre	64330c55b3	all: use pbgrpc common/pb moved grpc to a separate package common/pb/pbgrpc. This updates this repository to use it. Change-Id: I2de2a190688871cf9cb61f7ea511f8a01e264e4e	2020-02-26 21:27:47 +02:00
Egon Elbre	9752d01884	private/prompt: remove dependency to go-prompt Change-Id: Ida8ef731ce806cec076343dc77d72a3b0d7736b4	2020-02-25 13:09:41 +02:00
paul cannon	92d86fa044	satellite/repair: fix repair concurrency This new repair timeout (configured as TotalTimeout) will include both the time to download pieces and the time to upload pieces, as well as the time to pop the segment from the repair queue. This is a move from Github PR #3645. Change-Id: I47d618f57285845d8473fcd285f7d9be9b4318c8	2020-02-24 19:57:09 +00:00
Cameron Ayer	f22bddf122	{storagenode/contact, private/testplanet}: remove ErrFailureToStart and panic in testplanet.Start Change-Id: I252e8c9407400af7bda95a7657c8154660c3c801	2020-02-24 18:24:23 +00:00
Egon Elbre	e30f7b35b6	cmd/gateway: use a separate repository Change-Id: Idbb0b2b6cf0e60c6d5d91218c24524d72285cf26	2020-02-24 10:03:03 +02:00
Yingrong Zhao	5011e78311	storagenode/piecestore: remove unused DeletePiece endpoint With commit: `3331b443e7`, satellite will start calling `DeletePieces`. Therefore, we can remove the old endpoint once the above commit is deployed with all satellites Change-Id: I0124bc00a7cb808d119eb59f8fcd7fadf68158bb	2020-02-21 21:03:49 +00:00
Egon Elbre	5342dd9fe6	go.mod: update uplink Change-Id: I867a6a1eef8aa5d60bb676e5112b98c4192ce811	2020-02-21 16:08:12 +02:00
Egon Elbre	fd5611fb5e	private/testplanet: ensure server is closed in test Change-Id: I12eafadfb1794cd84a288e39740f703919a9ddc6	2020-02-21 10:10:51 +02:00
Yingrong Zhao	77f67a8086	satellite/metainfo: add timeout for delete request Change-Id: I9cad6d7ea185fc2c0ed4e58b42e4e3a78178a79f	2020-02-20 09:10:16 +00:00
Cameron Ayer	3e70a893dd	storagenode/{piecestore, contact}: report capacity to satellites if below specific threshold Currently, storage nodes only report their capacity to satellites once per hour. If a node fills up, it will fail all uploads until the next contact cycle begins. With these changes, at the end of an upload we check whether the MinimumDiskSpace threshold has been passed. If so, trigger the monitor chore to update the node's capacity, then trigger the contact chore to report the new capacity to the satellites. Change-Id: Ie6aadaade1e2c12c87e03f8ff9059a50121380a0	2020-02-18 15:42:48 -05:00
Jeff Wendling	948589d38b	private/dbutil/txutil: include details about retry attempts in error Change-Id: I978ae44c4890df31185ec6077c9fb3a2b2fce8f1	2020-02-17 14:18:13 +00:00
Egon Elbre	892b190db6	satellite/admin: add project limit modification and authorization token Change-Id: If9a7214a940b8544f8023c2cd82da21f19d3f521	2020-02-17 07:56:16 +00:00
Michal Niewrzal	cea4c25f53	mod: bump common and uplink version Change-Id: Ia063d33c087dd91a46c008e154b078f11fa21527	2020-02-12 14:33:54 +00:00
Egon Elbre	dbf46c4aa7	satellite/admin: administrative endpoint Admin server allows creating basic REST and html API-s for different administrative tasks. Change-Id: I3dc1786abe1c87350eed60ec90e48130f44e63cf	2020-02-12 12:12:50 +02:00
Cameron Ayer	33d696b096	storage/redis/redisserver: simplify redisserver creation Change-Id: I881576a7881db671b5abeeca7120a022987cc47f	2020-02-11 19:11:57 +00:00
Cameron Ayer	b22bf16b35	satellite/overlay: add config flag for node selection free disk requirement Currently SNs report their free disk space once per hour. If a node becomes full, it has to wait until the next contact cycle begins to report, all the while receiving and failing upload requests. By increasing the minimum required disk space, we can give the storage nodes more time to report their space before they completely fill up. This change goes hand-in-hand with another change we want to implement: trigger capacity report on SN immediately upon falling below threshold. Change-Id: I12f778286c6c3f582438b0e2949765ac43325e27	2020-02-11 18:08:25 +00:00
Egon Elbre	429f08b4f0	satellite: add Admin peer This peer will contain our administrative panels. It's completely separated from our other satellite processes because it allows better control for restricting access to it. Change-Id: Ifca473bee82ff6c680b346918ba32b835a7a6847	2020-02-11 16:15:33 +00:00
Michal Niewrzal	426c8eb31a	private/testplanet: add DeleteBucket method for uplink New method added to make it easy to delete a bucket during tests. Change-Id: Iaae89618cc676ddbbbd4b0df2eeacd143ea6f3c2	2020-02-11 15:58:13 +00:00
Jeff Wendling	99c3ba5bbf	testplanet: log stack trace for error during creation Change-Id: Ifcd2cba4195413a7213ba4d113c43f9fb3cbc3e5	2020-02-10 21:59:20 +00:00
Jeff Wendling	7999d24f81	all: use monkit v3 this commit updates our monkit dependency to the v3 version where it outputs in an influx style. this makes discovery much easier as many tools are built to look at it this way. graphite and rothko will suffer some due to no longer being a tree based on dots. hopefully time will exist to update rothko to index based on the new metric format. it adds an influx output for the statreceiver so that we can write to influxdb v1 or v2 directly. Change-Id: Iae9f9494a6d29cfbd1f932a5e71a891b490415ff	2020-02-05 23:53:17 +00:00
Jeff Wendling	d20db90cff	private/dbutil/txutil: create new transactions for retries it was noticed that if you had a long lived transaction A that was blocking some other transaction B and A was being aborted due to retriable errors, then transaction B was never given priority. this was due to using savepoints to do lightweight retries. this behavior was problematic because we had some queries blocked for over 16 hours, so this commit addresses the issue with two prongs: 1. bound the amount of time we will retry a transaction 2. create new transactions when a retry is needed the first ensures that we never wait for 16 hours, and the value chosen is 10 minutes. that should be long enough for an ample amount of retries for small queries, and huge queries probably shouldn't be retried, even if possible: it's preferable to find a way to make them smaller. the second ensures that even in the case of retries, queries that are blocked on the aborted transaction gain priority to run. between those two changes, the maximum stall time due to retries should be bounded to around 10 minutes. Change-Id: Icf898501ef505a89738820a3fae2580988f9f5f4	2020-02-01 18:34:28 +00:00
Michal Niewrzal	a181e0b627	libuplink: adjust tests to changes in encryption store We moved PathCipher to encryption.Store and we need to adjust storj/uplink for those changes. The uplink repo is also using libuplink to run tests, so we need to first adjust storj/storj libuplink and later storj/uplink. Change-Id: I84f23e6bad18ac139f72c19939dc526f9f46d88b	2020-01-30 22:00:24 +00:00
Egon Elbre	f237d70098	storagenode,satellite: use pkg/debug Use debug.Server in storage node and satellite for customizing debug server. Change-Id: I7979412376d028cadf29656d838ab94f18e2aa99	2020-01-29 16:30:31 -05:00
Ethan	149273c63f	satellite/metainfo: add cache expiration for project level rate limiting Allow rate limit project cache to expire so we can make project level rate limit changes without restarting the satellite process. Change-Id: I159ea22edff5de7cbfcd13bfe70898dcef770e42	2020-01-29 16:14:10 +00:00
Egon Elbre	e319660f7a	private/lifecycle: implement Group lifecycle.Group implements controlling multiple items such that their startup and close works. Change-Id: Idb4f4a6c3a1f07cdcf44d3147a6c959686df0007	2020-01-29 00:37:33 +00:00
paul cannon	5a1838bc28	private/dbutil: retry single statements on cockroachdb This ought to make it so that all single statements (Exec- or Query-) on a CockroachDB backend will get retried as necessary. As there is no need for savepoints to be allocated or released in this case, there is no round-trip overhead except when statements actually do need to be retried. Change-Id: Ibd7f1725ff727477c456cb309120d080f3cd7099	2020-01-24 09:01:47 +00:00
Isaac Hess	2f77ce48f0	private/testplanet: Add databases to testplanet.databases near creation We now close databases in testplanet in reverse order, knowing that some caches and other objects need to close prior to the underlying db. Some dbs were not being added to the list of closeable databases near their creation, causing an issue with shutdown order. Change-Id: I23391f4d77649030493e47bd7169002a72b3bf7a	2020-01-23 15:30:52 -07:00
Jeff Wendling	16bb374deb	storagenode/piecestore: add large timeouts to read/write operations this is to help protect against intentional or unintentional slowloris style problems where a client keeps a tcp connection alive but never sends any data. because grpc is great, we have to spawn a separate goroutine for every read/write to the stream so that we can return from the server handler to cancel it if necessary. yep. really. additionally, we update the rpcstatus package to do some stack trace capture and add a Wrap method for the times where we want to just use the existing error. also fixes a number of TODOs where we attach status codes to the returned errors in the endpoints. Change-Id: Id8bb8ff84aa34e0f711b0cf9bce3908b36a1d3c1	2020-01-23 19:20:49 +00:00
Egon Elbre	89a148047d	private/testplanet: shutdown databases in reverse order Since we have caches on top of databases and they are included in the databases list, we need to shut them down in-reverse order to avoid issues with flushing to a closed database. Change-Id: I3f23a527a2a5425638b1a7e2cab84741f019d493	2020-01-23 18:55:57 +00:00
paul cannon	fd84fa6316	private/dbutil: rollback pending transactions on panic We don't do a lot of panicking in our main code, so hopefully this won't matter much, but we /do/ call panic a lot in our tests (t.Fatal, require.NoError, etc). And when that happens, we need pending transactions to be aborted or we can get into a deadlock situation when something else tries to /Close/ that connection. Change-Id: Idaf0d543ac95afea34f9b2393d1187f5322e9f0f	2020-01-23 16:30:19 +00:00
Isaac Hess	40a890639d	satellite/orders: Flush all pending bandwidth rollup writes on shutdown Currently we risk losing pending bandwidth rollup writes even on a clean shutdown. This change ensures that all pending writes are actually written to the db when shutting down the satellite. Change-Id: Ideab62fa9808937d3dce9585c52405d8c8a0e703	2020-01-23 08:12:41 -07:00
Egon Elbre	c6f94ce9e4	satellite/metainfo: remove support for boltdb based pointerDB By previous changes we can now remove testplanet.New and also remove metainfo boltdb support. Change-Id: I5bdfbbbb45967492728e705b34b2fedb4f28c381	2020-01-23 13:54:00 +02:00
Egon Elbre	5a4745eddb	all: remove usages of testplanet.New Ensure that tests use testplanet.Run, so we always require running against all database backends. Change-Id: I6b0209e6a4912cf3328bd35b2c31bb8598930acb	2020-01-22 22:42:57 +02:00
Jeff Wendling	3b86917cc9	private/dbutil/pgutil: faster cockroach constraint finding Change-Id: Ia100b9ef7d2d59dfad0389feb8f2e7c47c2c4c9b	2020-01-22 15:47:04 +00:00
Egon Elbre	fc2766eefc	private/testplanet: flatten migration for running tests Currently Cockroach DB setup takes a significant amount of time. This flattens the database setup into a single query, which improves the test time significantly. The migration tests still test each migration separately. Change-Id: Iaca16f34a6af3926fa2b5ebf618f939fd59460b3	2020-01-22 15:09:11 +00:00
Egon Elbre	8b3db70329	private/testplanet: increase metainfo rate limit Rate limit was causing tests to fail due to making too many requests. Change-Id: Iafbc97b4880b6d98c86045b28ca7583d27f51720	2020-01-22 13:57:38 +00:00
Michal Niewrzal	6502454947	satellite/metainfo: move RS configuration to satellite With this change, the RS configuration will be set on the satellite. Uplink will get the RS values with the BeginObject request and will use them. For backward compatibility, and to avoid a very large change, the redundancy scheme stored with the bucket is not touched. This can be done in the future. Change-Id: Ia5f76fc10c37e2c44e4f7b8754f28eafe1f97eff	2020-01-22 09:33:53 +00:00
Ethan	21a5d70a83	satellite/metainfo: Rate limiting - API requests Limits how many times metainfo APIs can be called per second by project ID. If the limit is exceeded, the API will return Unauthorized/Too Many requests. Limit per second and the size of the limiter cache per project are configurable, as well as whether the limiter is enabled. Tests added/updated for the new rate_limit field in projects table. Tests added for exceeding limits and disabling the limiter. Change-Id: Ic8ad102de3b690a475809d4f684156d5715f20fa	2020-01-21 14:25:04 +00:00
Michal Niewrzal	86f194769f	uplink: adjust to changes in storj/uplink This change is adjusting code base to changes in storj/uplink. https://review.dev.storj.io/c/storj/uplink/+/643 Change-Id: Ieca87f9f5983e391bf4b4fec8b9d5491fd32bfa1	2020-01-20 22:06:19 +00:00
Egon Elbre	c1c878efcf	all: fix import groupings check-imports was broken and didn't complain about things. Change-Id: I38adafd16b4aba86f0eb4f53427b4393f9a6c710	2020-01-20 17:47:44 +00:00
Egon Elbre	1279eeae39	private/tagsql,storage: fixes to context cancellation Replace all the remaining uses of sql.DB with tagsql.DB to fix issues with context cancellation. Introduce tagsql.Open which helps to get rid of all tagsql.Wrap-s. Use tagsql in cockroachkv and postgreskv. Change-Id: I8946d203341cb85a25976896fc7881e1f704e779	2020-01-20 15:44:39 +02:00
Egon Elbre	10d932fd65	lib/uplinkc: fix test flakiness by setting MaxTimeSkew Not having a skew caused an issue where: 1. Uplink calls "begin segment", where segment isn't committed to the database. 2. Uplink stores piece X to the storage node A with timestamp 1. 3. Satellite runs garbage collection with timestamp 2. 4. Satellite sends retain request to storage node A with timestamp 2. 5. Storage node A deletes piece X, because 1 < 2. 6. Uplink calls "commit segment" with storage node A in it. 7. Download of segment fails, because A doesn't have piece X. In production this is not an issue since the MaxTimeSkew is 72h by default. Change-Id: Id87ca3ddc44103dcd85d031b1367168c014b8e7b	2020-01-20 12:44:42 +00:00
Egon Elbre	ee0293c212	private/dbutil/sqliteutil: add missing err check Change-Id: Ie18c76d0e6d02a5c55e2d6503437b8a07b47a64e	2020-01-19 19:24:58 +00:00
Egon Elbre	1abfe42142	satellite: use tagsql Change-Id: I2170dee409fb0c2fe85913ddd36e7811a3b853ed	2020-01-19 14:39:16 +02:00
Egon Elbre	25b76fe63f	storagenode/storagenodedb: use tagsql Change-Id: Iba3b34a97b982deb4f72ce55517a294f249b6b55	2020-01-19 14:39:16 +02:00
Egon Elbre	59d06644b9	private/migrate: switch to tagsql Also added temporary types withRebind and withTagTx, which will be later removed. Currently they help to avoid changing the whole codebase at the same time. Change-Id: I7f07ba8f4709a23a463bfa67464628665a05808f	2020-01-19 14:39:16 +02:00
Egon Elbre	5fd833b108	private/dbutil: remove basic Query dbschema.Query is used only for testing and sqlite, so this won't cause us problems in production. Change-Id: Ib296a7daf161a9d3de23a7dfdc4f505d47ac4a37	2020-01-19 14:39:16 +02:00
stefanbenten	f4097d518c	satellite: reduce logging of node status Change-Id: I6618cf4bf31b856acd7a28b54011a943c03ab22a	2020-01-18 17:47:59 +00:00
Moby von Briesen	273eb66fae	cmd/storagenode,storagenode/preflight: add config flag to disable storagenode database preflight check. Disable preflight database check by default, and have the option to enable it. This will allow us to enable it once it is definitely working. Also change the name of the config flag for preflight time sync. Change-Id: Ie2e20f9e25dcb38794eafa7e1505e7c6ff287c99	2020-01-17 17:53:17 +00:00
Egon Elbre	5d80e22af9	private/tagsql: implement wrapper for sql.DB Wrapper adds tracing and fixes context usage issues. Change-Id: Ie6f7650eac87e2a2b64b760198498ba5857ad535	2020-01-17 13:52:12 +00:00
Cameron Ayer	4424697d7f	satellite/accounting: refactor live accounting to hold current estimated totals live accounting used to be a cache to store writes before they are picked up during the tally iteration, after which the cache is cleared. This created a window in which users could potentially exceed the storage limit. This PR refactors live accounting to hold current estimations of space used per project. This should also reduce DB load since we no longer need to query the satellite DB when checking space used for limiting. The mechanism by which the new live accounting system works is as follows: During the upload of any segment, the size of that segment is added to its respective project total in live accounting. At the beginning of the tally iteration we record the current values in live accounting as `initialLiveTotals`. At the end of the tally iteration we again record the current totals in live accounting as `latestLiveTotals`. The metainfo loop observer in tally allows us to get the project totals from what it observed in metainfo DB which are stored in `tallyProjectTotals`. However, for any particular segment uploaded during the metainfo loop, the observer may or may not have seen it. Thus, we take half of the difference between `latestLiveTotals` and `initialLiveTotals`, and add that to the total that was found during tally and set that as the new live accounting total. Initially, live accounting was storing the total stored amount across all nodes rather than the segment size, which is inconsistent with how we record amounts stored in the project accounting DB, so we have refactored live accounting to record segment size Change-Id: Ie48bfdef453428fcdc180b2d781a69d58fd927fb	2020-01-16 10:26:49 -05:00
Jeff Wendling	78c6d5bb32	satellite/satellitedb: reported_serials table for processing orders this commit introduces the reported_serials table. its purpose is to allow for blind writes into it as nodes report in so that we have minimal contention. in order to continue to accurately account for used bandwidth, though, we cannot immediately add the settled amount. if we did, we would have to give up on blind writes. the table's primary key is structured precisely so that we can quickly find expired orders and so that we maximally benefit from rocksdb path prefix compression. we do this by rounding the expires at time forward to the next day, effectively giving us storagenode petnames for free. and since there's no secondary index or foreign key constraints, this design should use significantly less space than the current used_serials table while also reducing contention. after inserting the orders into the table, we have a chore that periodically consumes all of the expired orders in it and inserts them into the existing rollups tables. this is as if we changed the nodes to report as the order expired rather than as soon as possible, so the belief in correctness of the refactor is higher. since we are able to process large batches of orders (typically a day's worth), we can use the code to maximally batch inserts into the rollup tables to make inserts as friendly as possible to cockroach. Change-Id: I25d609ca2679b8331979184f16c6d46d4f74c1a6	2020-01-15 19:21:21 -07:00
Yingrong Zhao	db8aee0806	satellite/contact; storagenode/preflight: add clock check on startup for storagenode add config preflight.enabled-local-time Change-Id: I7b942c9bee063aae409ee6721ae9d079dff0144f	2020-01-15 15:35:26 +00:00
Egon Elbre	08f63614be	private/context2: add WithoutCancellation Change-Id: I38557c16f41b8983886f256353cc6afb7634d9e6	2020-01-15 14:23:46 +02:00
Egon Elbre	64fb2d3d2f	Revert "dbutil: statically require all databases accesses to use contexts" This reverts commit `8e242cd012`. Revert because lib/pq has known issues with context cancellation. These issues need to be resolved before these changes can be merged. Change-Id: I160af51dbc2d67c5449aafa406a403e5367bb555	2020-01-15 07:28:00 +00:00
JT Olio	c01cbe0130	satellitedb: save out all db-touching traces Change-Id: Ib1e192221f9da813fd9cbb55f620a047b82c9523	2020-01-14 18:47:45 -05:00
JT Olio	8e242cd012	dbutil: statically require all databases accesses to use contexts this will allow for some nice runtime analysis down the road. also, this allows for wrapping database handles in a way that can interact with these contexts requires https://review.dev.storj.io/c/storj/dbx/+/514 Change-Id: Ib087b7cd73296dd2c1e0331314da34d861f61d2b	2020-01-14 18:20:47 -05:00
Egon Elbre	64f056bee4	private/dbutil/sqlutil: use context in queries Change-Id: Icb92daa483d13e6d57013f3917571d476126bfd2	2020-01-14 20:27:09 +00:00
Egon Elbre	df9e53ea0b	private: ensure we don't eat the underlying error When error is formatted using %v it's not possible to check whether the error was caused by a context cancellation. Change-Id: I164d1c83cdf5e7e6eacf082145b5c6a47078d041	2020-01-14 20:26:51 +00:00
Egon Elbre	cd4ff0722e	private/testplanet: use defaultInterval Change-Id: Ife2810be46faaaf8cd51b193a859a88fff894a0e	2020-01-14 16:07:36 +00:00
Isaac Hess	4950d7106a	satellite/orders: Add write cache for bw rollups Change-Id: I8ba454cb2ab4742cafd6ed09120e4240874831fc	2020-01-13 22:40:51 +00:00
Egon Elbre	b9740f0c0a	storage/cockroachkv: add ctx argument Change-Id: Ib6c29f44722b0354afcd499a0e567f04aef7eb28	2020-01-13 15:57:47 +02:00
Egon Elbre	ff267168c5	private/migrate: add ctx argument Change-Id: I3d65912d89261386413c494c7ed1576fed4dcaf4	2020-01-13 15:52:26 +02:00
Egon Elbre	24958bd7d3	satellite: add ctx to DB.CreateTables Change-Id: I9ecad624cf5a7fc9c86bb91c68f96a3a4efd2e92	2020-01-13 15:31:09 +02:00
Egon Elbre	0835b9024c	private/dbutil/pgutil: add ctx argument Change-Id: Icfd56ca8c1f831ad56c0195a0b883e8f0618daaf	2020-01-13 15:27:06 +02:00
Egon Elbre	c7b846589e	private/dbutil/sqliteutil: add ctx argument Change-Id: If1caa9cde746817e62cae32a152eeec81959129c	2020-01-13 15:03:30 +02:00
Michal Niewrzal	b579c260ab	cmd: rename "scope" flag to "access" We decided that a better name for "scope" would be "access". This change refactors the cmd part of the code but doesn't touch libuplink. For backward compatibility, old configs with the "scope" field will be loaded without any issue. The old flag "scope" won't be supported directly from the command line. https://storjlabs.atlassian.net/browse/V3-3488 Change-Id: I349d6971c798380d147937c91e887edb5e9ae4aa	2020-01-10 15:27:53 +00:00
Natalie Ventura Villasana	6b1829f3c3	satellite/downtime: new chore estimates downtime Adds EstimationChore to the downtime package, which is an independent chore that finds offline nodes given a configurable limit, then uptime checks those nodes, and sets a last contact success or failure given a response. For failed nodes, the chore updates the amount of downtime the node has been offline in the DowntimeTracking table. Design doc section: https://github.com/storj/storj/blob/master/docs/blueprints/storage-node-downtime-tracking.md#estimating-offline-time Jira: https://storjlabs.atlassian.net/browse/V3-2545 Change-Id: I60af95803930bf9b33232b248bb20cca6f0e0b5f	2020-01-09 15:05:13 -05:00
Egon Elbre	8d8d57c3b5	mod: update sqlite module to v2.0.2 This updates SQLite amalgamation from 3.29.0 to 3.30.1. The module contains fixes for races. Change-Id: Ic6a06a43ba404de0091d8a2f7444a8f4b1d5d54c	2020-01-08 21:21:15 +02:00
Yingrong Zhao	76ee8a1b4c	satellite: remove UptimeReputation configs from codebase With the new storage node downtime tracking feature, we need to remove the current uptime reputation configs: UptimeReputationAlpha, UptimeReputationBeta, and UptimeReputationDQ. This is the first step of removing the uptime reputation columns from satellitedb. Change-Id: Ie8fab13295dbf545e33aeda0c4306cda4ba54e36	2020-01-08 18:54:15 +00:00
Egon Elbre	082ec81714	uplink: move to storj.io/uplink (#3746)	2020-01-08 15:40:19 +02:00
Egon Elbre	cf2128d3b9	uplink: avoid cyclic dependency to storj.io This helps to simplify splitting and running tests. Change-Id: I4aaf077df7fd6bd6f14f10cb902850883349eaf5	2020-01-08 14:51:33 +02:00
paul cannon	0c88a7b475	private/migrate: use transactional helpers and not Begin() This code needs to work against cockroachDB, so transactions must be retried when a retryable error is returned. This change puts migrate transactions into the dbutil.WithTx transactional helpers to achieve this in the easiest way. Change-Id: Ib930e82d55cb0257357a222ce9131e6e53372c03	2020-01-07 18:25:38 +00:00
paul cannon	6231842422	private/dbutil: add WithTx transaction helpers These helpers will work similar to the WithTx method we have added to our dbx.DB instances, but it will use crdb.ExecuteTx or crdb.ExecuteInTx when the backend is CockroachDB, so that transactions are retried correctly. Anything that uses transactions and might need to work against CockroachDB needs to handle "RetriableError" from cockroachdb by restarting the transaction. This will probably be a large pain if not using these helpers or something very like them. Subsequent changes will undertake transforming all db-transaction uses in satellite code so that they are cockroach-safe. Change-Id: I648b8de2168612c67b9d6eb8402bccf8286249a9	2020-01-06 20:06:45 +00:00
Egon Elbre	f41d440944	all: reduce number of log messages Remove starting up messages from peers. We expect all of them to start, if they don't, then they should return an error why they don't start. The only informative message is when a service is disabled. When doing initial database setup then each migration step isn't informative, hence print only a single line with the final version. Also use shorter log scopes. Change-Id: Ic8b61411df2eeae2a36d600a0c2fbc97a84a5b93	2020-01-06 19:03:46 +00:00
paul cannon	a33734bee7	satellite/satellitedb/dbx: add cockroach driver type Change-Id: I7a0da6e066c67a521fc1b23b085ab8554eee0d4c	2020-01-06 18:01:03 +00:00
Cameron Ayer	0038abb51b	private/testplanet: use redis for live accounting storing live accounting in memory will not work, as the core and api each create their own instance. Using redis will allow each to access the same store Change-Id: I4c8250b579d7b6b6d8991bc890894573626effe6	2020-01-03 21:04:50 +00:00
Ethan	05b406e992	satellite:{downtime,overlay}: Implement offline node detection chore https://storjlabs.atlassian.net/browse/V3-3398 Change-Id: I598c3bad819026377d1d113c099dc9bba8b02742	2020-01-03 17:10:03 +00:00
Ethan	8859c36234	satellite/{downtime,contact}: Add CheckNodeAvailability for use within the downtime tracking chores. https://storjlabs.atlassian.net/browse/V3-2545 Change-Id: I1dd54a0c77cb4905bb1f350beeb82c6f7700ee70	2020-01-02 18:24:11 +00:00
Ivan Fraixedes	c3b58f1656	satellite/metainfo: Make BeginDeleteObject delete pieces To improve deletion performance, we are shifting the responsibility for deleting the pieces of an object from the Uplink to the Satellite. BeginDeleteObject was the first call; it returned the stream ID, which was then used to retrieve the list of segments and get addressed order limits for deleting the pieces (of each segment) from the storage nodes. Now we want the Satellite to delete the pieces of all the object segments from the storage nodes, so we no longer need several network round trips between the Uplink and the Satellite, because the Satellite can delete all of them in the initial BeginDeleteObject request. satellite/metainfo.ListSegments has been changed to return 0 items if the pointer of the last segment of an object is not found, because we need to preserve backward compatibility with Uplinks that won't be updated to the latest release and that rely on listing the segments after calling BeginDeleteObject to retrieve the addressed order limits used to contact the storage nodes to delete the pieces. Change-Id: I5f99ecf27d62d65b0a062936b9b17581ef692af0	2020-01-02 15:53:59 +00:00
Egon Elbre	e03d3fb577	uplink: move configs to cmd/uplink/cmd Change-Id: Ifc1d3440dcef429c2a6142c16f3e991abf49f1d2	2020-01-02 09:40:57 +00:00
Egon Elbre	2680bae88c	private/testplanet: remove dependency to uplink Remove direct dependency on uplink.RSConfig, this simplifies moving the config file without introducing weird dependencies. Change-Id: I7fd2a145401e0205d7047631df9d2810241efeec	2020-01-02 09:40:46 +00:00
Natalie Ventura Villasana	aa3e183c2e	satellite/gracefulexit: add ge eligibility check Adds check to see if storage nodes are eligible to initiate graceful exit, by checking their CreatedAt date and seeing if their "age" is greater than the new config value: NodeMinAgeInMonths The default for this value is 6 months for now. https://storjlabs.atlassian.net/browse/V3-3357 Change-Id: Ib807ab8987ddb5a38a27a83886490f73fe8c5816	2019-12-31 09:31:58 -05:00
Stefan Benten	758fe35aba	storagenode/orders: adding jitter to sending (#3725)	2019-12-30 21:35:26 +01:00
Egon Elbre	6615ecc9b6	common: separate repository Change-Id: Ibb89c42060450e3839481a7e495bbe3ad940610a	2019-12-27 14:11:15 +02:00
Fadila	115b8b0fc8	storagenode/piecestore: delete several pieces in a single request This is part of the deletion performance improvement. See https://storjlabs.atlassian.net/browse/V3-3349 Change-Id: Idcd83a302f2bd5cc3299e1a4195a7e177f452599	2019-12-27 10:58:04 +00:00
Isaac Hess	7d1e28ea30	storagenode: Include trash space when calculating space used This commit adds functionality to include the space used in the trash directory when calculating available space on the node. It also includes this trash value in the space used cache, with methods to keep the cache up-to-date as files are trashed, restored, and emptied. As part of the commit, the RestoreTrash and EmptyTrash methods have slightly changed signatures. RestoreTrash now also returns the keys that were restored, while EmptyTrash also returns the total disk space recovered. Each of these changes makes it possible to keep the cache up-to-date and know how much space is being used/recovered. Also changed is the signature of the PieceStoreAccess.ContentSize method. Previously this method returned only the content size of the blob, removing the size of any header data. This method has been renamed `Size` and returns both the full disk size and content size of the blob. This allows us to only stat the file once, and in some instances (i.e. cache) knowing the full file size is useful. Note: This commit simply adds the trash size data to the piece size data we were already collecting. The piece size data is not accurate for all use-cases (e.g. because it does not contain piece header data); however, this commit does not fix that problem. Now that the ContentSize (Size) method returns the full size of the file, it should be easier to fix this problem in a future commit. Change-Id: I4a6cae09e262c8452a618116d1dc66b687f59f85	2019-12-23 19:07:03 -07:00
Egon Elbre	d55288cf68	pkg/rpc: replace methods with direct calls to pb Change-Id: I8bd015d8d316a2c12c1daceca1d9fd257f6f57bc	2019-12-22 17:12:43 +02:00
Egon Elbre	acb4435a67	satellite/satellitedb: improve Cockroach migrate test Load schemas in parallel instead of one-by-one. Optimizes from 2m30s to 1m15s. Change-Id: I0bf6381a0ae99b44271fe55d4ee658683064c097	2019-12-21 10:58:43 +00:00
Egon Elbre	ea455b6df0	all: remove code to default to grpc We have moved to drpc so we don't need to have code for building with grpc only. Change-Id: I55732314dca0d5b4ce1132b68de4186a15d91b21	2019-12-20 20:12:04 +02:00
Egon Elbre	9e4d833170	private/testplanet: use default interval The default interval tries to balance: 1. ensure that most things run at least once during tests 2. ensure that they won't run over 10 times Change-Id: I911b57b595ffbef1963654bf4a42efad1534b058	2019-12-20 17:01:30 +00:00
Ivan Fraixedes	46c8d2e9c7	private/testplanet: Wait until peer ends when closing it Closing a peer didn't guarantee that the peer had ended its services; we want the peer's services to have actually finished when a StopPeer method returns. Change-Id: If97f41b7e404990555640c71e097ebc719678ae7	2019-12-20 14:23:25 +00:00
Egon Elbre	2daf24a1ea	private/testcontext: remove version dependency Change-Id: Ibabf5ec774dcdb1e4fc2f200368281c69b62e6c2	2019-12-18 15:24:44 +00:00
Bryan White	67892b4add	private/testidentity: clone identities for each version test Change-Id: Ic5e5c8e0b19d3b4f86d91e1ae22a26035fd63224	2019-12-17 17:21:16 +01:00
Egon Elbre	7455ab771b	pkg/peertls/tlsopts: move test that requires testplanet For splitting core repository we need it not to pull in testplanet even in tests. Change-Id: I04d46b418e6e908185a4da694cf47dc3c5cc65f0	2019-12-17 13:45:51 +00:00
Egon Elbre	b04f9996c5	pkg/rpc: move test that needs testplanet Move rpc test that uses testplanet into private/testplanet. This ensures that rpc doesn't have the whole system as a dependency making it easier to separate. This unfortunately leaves pkg/rpc without specific tests, but we would need to write new tests that only use the core packages. Change-Id: I402ab3c2d50282af159c2ef3371d23b0997fef0a	2019-12-17 13:31:12 +00:00
Cameron Ayer	a4f9865b47	satellite: adds and enables cockroachdb compatibility for tests Change-Id: I85a3ad8c3b9d7e15ea8675b6c55af0002933db57	2019-12-16 22:29:25 +00:00
paul cannon	2f7465c294	private/dbutil: register "cockroach" as sql.DB driver this will allow us to inspect the type of `db.Driver()` on *sql.DB connections to correctly differentiate between pg and crdb conns. as a bonus, this moves all concerns about when to replace "cockroach://" with "postgres://" out of view, letting the thin shim "driver" take care of that. Change-Id: Ib24103ab7c508231e681f89a7321b623e4e125e9	2019-12-16 19:10:00 +00:00
Vitalii Shpital	53d9bc4530	storagenode/notifications: db created (#3707)	2019-12-16 19:59:01 +02:00
Andrew Harding	cb89496569	storagenode/trust: wire up list into pool - also updated ping chore to pick up trust changes - fixed small typo in blueprint - fixed flags for storj-sim - wired up changes to testplanet Change-Id: I02982f3a63a1b4150b82a009ee126b25ed51917d	2019-12-13 20:32:50 +00:00
Jeff Wendling	fb8e78132d	storagenodedb: reenable utccheck in tests Change-Id: If7d64dd4ae58e4b656ff9122ae3195b2a5173cb3	2019-12-10 23:17:14 +00:00
Jessica Grebenschikov	d8a8f92e30	private/dbutil/cockroachutil: keep crdb connstr for tests Change-Id: Icad19d6b0093e7bf0fff709330164bfcbd733911	2019-12-10 17:24:35 +00:00
Cameron Ayer	6fae361c31	replace planet.Start in tests with planet.Run planet.Start starts a testplanet system, whereas planet.Run starts a testplanet and runs a test against it with each DB backend (cockroach compat). Change-Id: I39c9da26d9619ee69a2b718d24ab00271f9e9bc2	2019-12-10 16:55:54 +00:00
Jeff Wendling	48da8baab5	storj-sim: work with cockroach:// urls for satellite databases for storj-sim to work, we need to avoid schemas in cockroach urls so we have storj-sim create namespaced databases instead of schemas and we have the migrate command create the database in the same way that it would create a schema for postgres. then it works! a follow up commit will move the creation of the database/schemas into storj-sim's setup step so that we can avoid doing these icky creations during normal migration calls. it will also make the pointerdb have an explicit call to migrate instead of just doing it every time it's opened. Change-Id: If69ef5cb96b6866b0438c761bd445afb3597ae5f	2019-12-09 23:44:00 +00:00
Jeff Wendling	1df7b360d7	satellite/metainfo: Use cockroachdb client for metainfo db Change-Id: I3cf7a00de4f654eacaffbb494f4841c64a2d9ce6	2019-12-05 10:33:54 -07:00
paul cannon	378b863b2b	private,satellite: unite all the "temp db schema" things first, so that they all work the same way, because it's getting complicated, and second, so that we can do the appropriate thing instead of CREATE SCHEMA for cockroachdb. Change-Id: I27fbaeeb6223a3e06d97bcf692a2d014b31465f7	2019-12-05 15:36:59 +00:00
paul cannon	850c358087	private/dbutil/pgutil: make QuerySchema work on crdb Adjust the pg_constraint query so that it works without a LATERAL JOIN, since CockroachDB doesn't like that. This isn't hooked up to a cockroach test yet, but that's coming. Change-Id: I0df6b477d958996b673fc121eaa1f7c35e5cc504	2019-12-04 18:55:26 +00:00
Jennifer Johnson	ecb960f506	private/dbutil: distinguishes between db drivers and implementations to allow for different implementations of SQL queries. Change-Id: I2dc8d1d371139aa8bc805e92a2b80b71f580fd64	2019-12-04 18:31:26 +00:00
Andrew Harding	2461ccd469	pkg/private/fpath: subsume AtomicWriteFile AtomicWriteFile is useful primitive to use throughout the codebase Change-Id: I338fc4505ba20d5aece09ddc257286f46298e083	2019-12-03 18:14:08 +00:00
Ivan Fraixedes	bf97ef06fc	storagenode: Add new endpoint to receive satellite requests for… (#3590) * pkg/pg: Add new service function storage node Add a new service function to the storage node piece store for deleting pieces when satellites request them. * storagenode/piecestore: Add endpoint to delete piece Add a new endpoint to receive requests from trusted satellites to delete a piece. * private/testplanet: Fix storagenode mock Add to the storagenode mock the new endpoint method. * proto.lock: Update it with the latest protobuf changes * storagenode/piecestore: Reuse test piece upload Extract the repeated logic from several test functions for uploading a test piece to a test helper function. * uplink/piecestore: Implement client side method Implement the client side method of the new piecestore RPC function. * storagenode/piecestore: Add test DeletePiece endpoint Implement a test for the DeletePiece new endpoint method.	2019-11-26 18:47:19 +01:00
Jess G	854e5507ab	crdb uses namespaced db for each test (#3646) * crdb uses namespaced db for each test * add test for me test * fix lint and tests * updates per cr comments * rm all replaceall	2019-11-26 08:39:57 -08:00
Isaac Hess	56f8fd2dd7	storagenode/pieces: Add EmptyTrash functionality (#3640) * storagenode/pieces: Add EmptyTrash functionality * storagenode/pieces: Fix err * storagenode/pieces: Fix lint	2019-11-26 09:25:21 -07:00
Vitalii Shpital	038ac58600	web/storagenode: minimal allowed version view implemented (#3583)	2019-11-26 18:08:24 +02:00
Egon Elbre	36fead0093	satellite/metainfo: add UserAgent support to endpoints (#3548)	2019-11-26 03:12:37 -08:00
Yingrong Zhao	79a4fff6c7	satellite/referrals: set up referrals service and http endpoints (#3566)	2019-11-25 16:36:36 -05:00
Jess G	388f33b84d	satellitedb: add support to testplanet for cockroachdb (#3634) * update migration steps, add crdb support to testplanet * add crdb support * have jenkins run a bares bones crdb compat test * skip crdb tests * skip crdb tests * fix root_piece_id column * write crdb store to tmp dir * escape	2019-11-22 11:59:46 -08:00
Yingrong Zhao	63e51df9a6	private/testplanet: add a mock referral manager server into testplanet (#3631)	2019-11-21 17:34:49 -05:00
Isaac Hess	6aeddf2f53	storagenode/pieces: Add Trash and RestoreTrash to piecestore (#3575) * storagenode/pieces: Add Trash and RestoreTrash to piecestore * Add index for expiration trash	2019-11-20 09:28:49 -07:00
Jess G	e9c3194c82	satellitedb: merge migration into one step (#3551) * merge migration * rm migration versions * rm unneeded migration test data * create index w/postgres + crdb compatible syntax * add default to offers.invitee_credit_duration_days * changes so that schema matches from master to branch * change to be crdb compatible * add check to confirm db version * mv version check to migration * update tests * add minversion to sadb migration, update tests * confirm min version for all dbs in a migration * add validate migration to sadb * fix lint err * rm min version check from migrate * change sadb check * hard code min db version * fix comment	2019-11-19 12:52:57 -08:00
Maximillian von Briesen	8653dda2b1	satellite/audit: do not contain nodes for unknown errors (#3592) * skip unknown errors (wip) * add tests to make sure nodes that time out are added to containment * add bad blobs store * call "Skipped" "Unknown" * add tests to ensure unknown errors do not trigger containment * add monkit stats to lockfile * typo * add periods to end of bad blobs comments	2019-11-19 17:30:28 +01:00
littleskunk	8b3444e088	satellite/nodeselection: don't select nodes that haven't checked in for a while (#3567) * satellite/nodeselection: dont select nodes that havent checked in for a while * change testplanet online window to one minute * remove satellite reconfigure online window = 0 in repair tests * pass timestamp into UpdateCheckIn * change timestamp to timestamptz * edit tests to set last_contact_success to 4 hours ago * fix syntax error * remove check for last_contact_success > last_contact_failure in IsOnline	2019-11-15 23:43:06 +01:00
Yehor Butko	a8e4e9cb03	satellite/payments: project usage charges (#3512)	2019-11-15 16:27:44 +02:00
Isaac Hess	2166c2a21b	storage/filestore: Add Trash and RestoreTrash to Blobs (#3529) * storage/filestore: Add Trash and RestoreTrash to Blobs * Change restore to be satellite-specific * Fix comment * Fix merge rename conflict	2019-11-14 15:19:15 -07:00
Egon Elbre	ee6c1cac8a	private: rename internal to private (#3573)	2019-11-14 21:46:15 +02:00

394 Commits