storj

Author	SHA1	Message	Date
Cameron Ayer	f22bddf122	{storagenode/contact, private/testplanet}: remove ErrFailureToStart and panic in testplanet.Start Change-Id: I252e8c9407400af7bda95a7657c8154660c3c801	2020-02-24 18:24:23 +00:00
Yingrong Zhao	5011e78311	storagenode/piecestore: remove unused DeletePiece endpoint With commit: `3331b443e7`, satellite will start calling `DeletePieces`. Therefore, we can remove the old endpoint once the above commit is deployed with all satellites Change-Id: I0124bc00a7cb808d119eb59f8fcd7fadf68158bb	2020-02-21 21:03:49 +00:00
Egon Elbre	5342dd9fe6	go.mod: update uplink Change-Id: I867a6a1eef8aa5d60bb676e5112b98c4192ce811	2020-02-21 16:08:12 +02:00
Egon Elbre	fd5611fb5e	private/testplanet: ensure server is closed in test Change-Id: I12eafadfb1794cd84a288e39740f703919a9ddc6	2020-02-21 10:10:51 +02:00
Yingrong Zhao	77f67a8086	satellite/metainfo: add timeout for delete request Change-Id: I9cad6d7ea185fc2c0ed4e58b42e4e3a78178a79f	2020-02-20 09:10:16 +00:00
Cameron Ayer	3e70a893dd	storagenode/{piecestore, contact}: report capacity to satellites if below specific threshold Currently, storage nodes only report their capacity to satellites once per hour. If a node fills up, it will fail all uploads until the next contact cycle begins. With these changes, at the end of an upload we check whether the MinimumDiskSpace threshold has been passed. If so, trigger the monitor chore to update the node's capacity, then trigger the contact chore to report the new capacity to the satellites Change-Id: Ie6aadaade1e2c12c87e03f8ff9059a50121380a0	2020-02-18 15:42:48 -05:00
Egon Elbre	892b190db6	satellite/admin: add project limit modification and authorization token Change-Id: If9a7214a940b8544f8023c2cd82da21f19d3f521	2020-02-17 07:56:16 +00:00
Michal Niewrzal	cea4c25f53	mod: bump common and uplink version Change-Id: Ia063d33c087dd91a46c008e154b078f11fa21527	2020-02-12 14:33:54 +00:00
Egon Elbre	dbf46c4aa7	satellite/admin: administrative endpoint Admin server allows creating basic REST and HTML APIs for different administrative tasks. Change-Id: I3dc1786abe1c87350eed60ec90e48130f44e63cf	2020-02-12 12:12:50 +02:00
Cameron Ayer	33d696b096	storage/redis/redisserver: simplify redisserver creation Change-Id: I881576a7881db671b5abeeca7120a022987cc47f	2020-02-11 19:11:57 +00:00
Cameron Ayer	b22bf16b35	satellite/overlay: add config flag for node selection free disk requirement Currently SNs report their free disk space once per hour. If a node becomes full, it has to wait until the next contact cycle begins to report; all the while receiving and failing upload requests. By increasing the minimum required disk space, we can give the storage nodes more time to report their space before they completely fill up. This change goes hand-in-hand with another change we want to implement: trigger capacity report on SN immediately upon falling below threshold. Change-Id: I12f778286c6c3f582438b0e2949765ac43325e27	2020-02-11 18:08:25 +00:00
Egon Elbre	429f08b4f0	satellite: add Admin peer This peer will contain our administrative panels. It's completely separated from our other satellite processes because it allows better control for restricting access to it. Change-Id: Ifca473bee82ff6c680b346918ba32b835a7a6847	2020-02-11 16:15:33 +00:00
Michal Niewrzal	426c8eb31a	private/testplanet: add DeleteBucket method for uplink New method added to make it easy to delete a bucket during tests. Change-Id: Iaae89618cc676ddbbbd4b0df2eeacd143ea6f3c2	2020-02-11 15:58:13 +00:00
Jeff Wendling	99c3ba5bbf	testplanet: log stack trace for error during creation Change-Id: Ifcd2cba4195413a7213ba4d113c43f9fb3cbc3e5	2020-02-10 21:59:20 +00:00
Egon Elbre	f237d70098	storagenode,satellite: use pkg/debug Use debug.Server in storage node and satellite for customizing debug server. Change-Id: I7979412376d028cadf29656d838ab94f18e2aa99	2020-01-29 16:30:31 -05:00
Ethan	149273c63f	satellite/metainfo: add cache expiration for project level rate limiting Allow rate limit project cache to expire so we can make project level rate limit changes without restarting the satellite process. Change-Id: I159ea22edff5de7cbfcd13bfe70898dcef770e42	2020-01-29 16:14:10 +00:00
Isaac Hess	2f77ce48f0	private/testplanet: Add databases to testplanet.databases near creation We now close databases in testplanet in reverse order, knowing that some caches and other objects need to close prior to the underlying db. Some dbs were not being added to the list of closeable databases near their creation, causing an issue with shutdown order. Change-Id: I23391f4d77649030493e47bd7169002a72b3bf7a	2020-01-23 15:30:52 -07:00
Jeff Wendling	16bb374deb	storagenode/piecestore: add large timeouts to read/write operations this is to help protect against intentional or unintentional slowloris style problems where a client keeps a tcp connection alive but never sends any data. because grpc is great, we have to spawn a separate goroutine for every read/write to the stream so that we can return from the server handler to cancel it if necessary. yep. really. additionally, we update the rpcstatus package to do some stack trace capture and add a Wrap method for the times where we want to just use the existing error. also fixes a number of TODOs where we attach status codes to the returned errors in the endpoints. Change-Id: Id8bb8ff84aa34e0f711b0cf9bce3908b36a1d3c1	2020-01-23 19:20:49 +00:00
Egon Elbre	89a148047d	private/testplanet: shutdown databases in reverse order Since we have caches on top of databases and they are included in the databases list, we need to shut them down in reverse order to avoid issues with flushing to a closed database. Change-Id: I3f23a527a2a5425638b1a7e2cab84741f019d493	2020-01-23 18:55:57 +00:00
Isaac Hess	40a890639d	satellite/orders: Flush all pending bandwidth rollup writes on shutdown Currently we risk losing pending bandwidth rollup writes even on a clean shutdown. This change ensures that all pending writes are actually written to the db when shutting down the satellite. Change-Id: Ideab62fa9808937d3dce9585c52405d8c8a0e703	2020-01-23 08:12:41 -07:00
Egon Elbre	c6f94ce9e4	satellite/metainfo: remove support for boltdb based pointerDB By previous changes we can now remove testplanet.New and also remove metainfo boltdb support. Change-Id: I5bdfbbbb45967492728e705b34b2fedb4f28c381	2020-01-23 13:54:00 +02:00
Egon Elbre	5a4745eddb	all: remove usages of testplanet.New Ensure that tests use testplanet.Run, so we always require running against all database backends. Change-Id: I6b0209e6a4912cf3328bd35b2c31bb8598930acb	2020-01-22 22:42:57 +02:00
Egon Elbre	fc2766eefc	private/testplanet: flatten migration for running tests Currently Cockroach DB setup takes a significant amount of time. This flattens the database setup into a single query, which improves the test time significantly. The migration tests still test each migration separately. Change-Id: Iaca16f34a6af3926fa2b5ebf618f939fd59460b3	2020-01-22 15:09:11 +00:00
Egon Elbre	8b3db70329	private/testplanet: increase metainfo rate limit Rate limit was causing tests to fail due to making too many requests. Change-Id: Iafbc97b4880b6d98c86045b28ca7583d27f51720	2020-01-22 13:57:38 +00:00
Michal Niewrzal	6502454947	satellite/metainfo: move RS configuration to satellite With this change RS configuration will be set on satellite. Uplink will get RS values with the BeginObject request and will use them. For backward compatibility, and to avoid a very large change, the redundancy scheme stored with the bucket is not touched. This can be done in the future. Change-Id: Ia5f76fc10c37e2c44e4f7b8754f28eafe1f97eff	2020-01-22 09:33:53 +00:00
Ethan	21a5d70a83	satellite/metainfo: Rate limiting - API requests Limits how many times metainfo APIs can be called per second by project ID. If limit is exceeded, the API will return Unauthorized/Too Many requests. Limit per second and the size of the limiter cache per project are configurable, as well as whether the limiter is enabled. Tests added/updated for the new rate_limit field in projects table. Tests added for exceeding limits and disabling limiter. Change-Id: Ic8ad102de3b690a475809d4f684156d5715f20fa	2020-01-21 14:25:04 +00:00
Egon Elbre	10d932fd65	lib/uplinkc: fix test flakiness by setting MaxTimeSkew Not having a skew caused an issue where: 1. Uplink calls "begin segment", where segment isn't committed to the database. 2. Uplink stores piece X to the storage node A with timestamp 1. 3. Satellite runs garbage collection with timestamp 2. 4. Satellite sends retain request to storage node A with timestamp 2. 5. Storage node A deletes piece X, because 1 < 2. 6. Uplink calls "commit segment" with storage node A in it. 7. Download of segment fails, because A doesn't have piece X. In production this is not an issue since the MaxTimeSkew is 72h by default. Change-Id: Id87ca3ddc44103dcd85d031b1367168c014b8e7b	2020-01-20 12:44:42 +00:00
stefanbenten	f4097d518c	satellite: reduce logging of node status Change-Id: I6618cf4bf31b856acd7a28b54011a943c03ab22a	2020-01-18 17:47:59 +00:00
Moby von Briesen	273eb66fae	cmd/storagenode,storagenode/preflight: add config flag to disable storagenode database preflight check. Disable preflight database check by default, and have the option to enable it. This will allow us to enable it once it is definitely working. Also change the name of the config flag for preflight time sync. Change-Id: Ie2e20f9e25dcb38794eafa7e1505e7c6ff287c99	2020-01-17 17:53:17 +00:00
Cameron Ayer	4424697d7f	satellite/accounting: refactor live accounting to hold current estimated totals live accounting used to be a cache to store writes before they are picked up during the tally iteration, after which the cache is cleared. This created a window in which users could potentially exceed the storage limit. This PR refactors live accounting to hold current estimations of space used per project. This should also reduce DB load since we no longer need to query the satellite DB when checking space used for limiting. The mechanism by which the new live accounting system works is as follows: During the upload of any segment, the size of that segment is added to its respective project total in live accounting. At the beginning of the tally iteration we record the current values in live accounting as `initialLiveTotals`. At the end of the tally iteration we again record the current totals in live accounting as `latestLiveTotals`. The metainfo loop observer in tally allows us to get the project totals from what it observed in metainfo DB which are stored in `tallyProjectTotals`. However, for any particular segment uploaded during the metainfo loop, the observer may or may not have seen it. Thus, we take half of the difference between `latestLiveTotals` and `initialLiveTotals`, and add that to the total that was found during tally and set that as the new live accounting total. Initially, live accounting was storing the total stored amount across all nodes rather than the segment size, which is inconsistent with how we record amounts stored in the project accounting DB, so we have refactored live accounting to record segment size Change-Id: Ie48bfdef453428fcdc180b2d781a69d58fd927fb	2020-01-16 10:26:49 -05:00
Jeff Wendling	78c6d5bb32	satellite/satellitedb: reported_serials table for processing orders this commit introduces the reported_serials table. its purpose is to allow for blind writes into it as nodes report in so that we have minimal contention. in order to continue to accurately account for used bandwidth, though, we cannot immediately add the settled amount. if we did, we would have to give up on blind writes. the table's primary key is structured precisely so that we can quickly find expired orders and so that we maximally benefit from rocksdb path prefix compression. we do this by rounding the expires at time forward to the next day, effectively giving us storagenode petnames for free. and since there's no secondary index or foreign key constraints, this design should use significantly less space than the current used_serials table while also reducing contention. after inserting the orders into the table, we have a chore that periodically consumes all of the expired orders in it and inserts them into the existing rollups tables. this is as if we changed the nodes to report as the order expired rather than as soon as possible, so the belief in correctness of the refactor is higher. since we are able to process large batches of orders (typically a day's worth), we can use the code to maximally batch inserts into the rollup tables to make inserts as friendly as possible to cockroach. Change-Id: I25d609ca2679b8331979184f16c6d46d4f74c1a6	2020-01-15 19:21:21 -07:00
Yingrong Zhao	db8aee0806	satellite/contact; storagenode/preflight: add clock check on startup for storagenode add config preflight.enabled-local-time Change-Id: I7b942c9bee063aae409ee6721ae9d079dff0144f	2020-01-15 15:35:26 +00:00
Egon Elbre	cd4ff0722e	private/testplanet: use defaultInterval Change-Id: Ife2810be46faaaf8cd51b193a859a88fff894a0e	2020-01-14 16:07:36 +00:00
Isaac Hess	4950d7106a	satellite/orders: Add write cache for bw rollups Change-Id: I8ba454cb2ab4742cafd6ed09120e4240874831fc	2020-01-13 22:40:51 +00:00
Egon Elbre	24958bd7d3	satellite: add ctx to DB.CreateTables Change-Id: I9ecad624cf5a7fc9c86bb91c68f96a3a4efd2e92	2020-01-13 15:31:09 +02:00
Egon Elbre	0835b9024c	private/dbutil/pgutil: add ctx argument Change-Id: Icfd56ca8c1f831ad56c0195a0b883e8f0618daaf	2020-01-13 15:27:06 +02:00
Michal Niewrzal	b579c260ab	cmd: rename "scope" flag to "access" We decided that a better name for "scope" is "access". This change refactors the cmd part of the code but doesn't touch libuplink. For backward compatibility old configs with "scope" field will be loaded without any issue. Old flag "scope" won't be supported directly from command line. https://storjlabs.atlassian.net/browse/V3-3488 Change-Id: I349d6971c798380d147937c91e887edb5e9ae4aa	2020-01-10 15:27:53 +00:00
Natalie Ventura Villasana	6b1829f3c3	satellite/downtime: new chore estimates downtime Adds EstimationChore to the downtime package, which is an independent chore that finds offline nodes given a configurable limit, then uptime checks those nodes, and sets a last contact success or failure given a response. For failed nodes, the chore updates the amount of downtime the node has been offline in the DowntimeTracking table. Design doc section: https://github.com/storj/storj/blob/master/docs/blueprints/storage-node-downtime-tracking.md#estimating-offline-time Jira: https://storjlabs.atlassian.net/browse/V3-2545 Change-Id: I60af95803930bf9b33232b248bb20cca6f0e0b5f	2020-01-09 15:05:13 -05:00
Yingrong Zhao	76ee8a1b4c	satellite: remove UptimeReputation configs from codebase With the new storage node downtime tracking feature, we need to remove the current uptime reputation configs: UptimeReputationAlpha, UptimeReputationBeta, and UptimeReputationDQ. This is the first step of removing the uptime reputation columns from satellitedb Change-Id: Ie8fab13295dbf545e33aeda0c4306cda4ba54e36	2020-01-08 18:54:15 +00:00
Egon Elbre	082ec81714	uplink: move to storj.io/uplink (#3746)	2020-01-08 15:40:19 +02:00
Cameron Ayer	0038abb51b	private/testplanet: use redis for live accounting storing live accounting in memory will not work, as the core and api each create their own instance. Using redis will allow each to access the same store Change-Id: I4c8250b579d7b6b6d8991bc890894573626effe6	2020-01-03 21:04:50 +00:00
Ethan	05b406e992	satellite:{downtime,overlay}: Implement offline node detection chore https://storjlabs.atlassian.net/browse/V3-3398 Change-Id: I598c3bad819026377d1d113c099dc9bba8b02742	2020-01-03 17:10:03 +00:00
Ethan	8859c36234	satellite/{downtime,contact}: Add CheckNodeAvailability for use within the downtime tracking chores. https://storjlabs.atlassian.net/browse/V3-2545 Change-Id: I1dd54a0c77cb4905bb1f350beeb82c6f7700ee70	2020-01-02 18:24:11 +00:00
Ivan Fraixedes	c3b58f1656	satellite/metainfo: Make BeginDeleteObject to delete pieces To improve deletion performance we are shifting the responsibility for deleting the pieces of an object from the Uplink to the Satellite. BeginDeleteObject was the first call, returning the stream ID, which was then used to retrieve the list of segments and to get addressed order limits for deleting the pieces (of each segment) from the storage nodes. Now we want the Satellite to delete the pieces of all the object's segments from the storage nodes, so we no longer need several network round trips between the Uplink and the Satellite: the Satellite can delete all of them in the initial BeginDeleteObject request. satellite/metainfo.ListSegments has been changed to return 0 items if the pointer of the last segment of an object is not found, because we need to preserve backward compatibility with Uplinks that won't be updated to the latest release and that rely on listing the segments after calling BeginDeleteObject to retrieve the addressed order limits and contact the storage nodes to delete the pieces. Change-Id: I5f99ecf27d62d65b0a062936b9b17581ef692af0	2020-01-02 15:53:59 +00:00
Egon Elbre	e03d3fb577	uplink: move configs to cmd/uplink/cmd Change-Id: Ifc1d3440dcef429c2a6142c16f3e991abf49f1d2	2020-01-02 09:40:57 +00:00
Egon Elbre	2680bae88c	private/testplanet: remove dependency to uplink Remove direct dependency on uplink.RSConfig, this simplifies moving the config file without introducing weird dependencies. Change-Id: I7fd2a145401e0205d7047631df9d2810241efeec	2020-01-02 09:40:46 +00:00
Natalie Ventura Villasana	aa3e183c2e	satellite/gracefulexit: add ge eligibility check Adds check to see if storage nodes are eligible to initiate graceful exit, by checking their CreatedAt date and seeing if their "age" is greater than the new config value: NodeMinAgeInMonths The default for this value is 6 months for now. https://storjlabs.atlassian.net/browse/V3-3357 Change-Id: Ib807ab8987ddb5a38a27a83886490f73fe8c5816	2019-12-31 09:31:58 -05:00
Stefan Benten	758fe35aba	storagenode/orders: adding jitter to sending (#3725)	2019-12-30 21:35:26 +01:00
Egon Elbre	6615ecc9b6	common: separate repository Change-Id: Ibb89c42060450e3839481a7e495bbe3ad940610a	2019-12-27 14:11:15 +02:00
Fadila	115b8b0fc8	storagenode/piecestore: delete several pieces in a single request This is part of the deletion performance improvement. See https://storjlabs.atlassian.net/browse/V3-3349 Change-Id: Idcd83a302f2bd5cc3299e1a4195a7e177f452599	2019-12-27 10:58:04 +00:00
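
The reconciliation step described in commit 4424697d7f (satellite/accounting) can be sketched in Go as below. This is a minimal illustration under stated assumptions, not the satellite's code: the function name `reconcileLiveTotals`, the use of plain maps keyed by a string project ID, and the sample numbers are all hypothetical. Only the names `initialLiveTotals`, `latestLiveTotals` and `tallyProjectTotals` come from the commit message.

```go
package main

import "fmt"

// reconcileLiveTotals is a hypothetical helper illustrating the rule from the
// commit message: for each project, take the total observed by tally and add
// half of the growth that live accounting saw while the tally loop was running.
// Projects that tally did not observe are ignored in this sketch.
func reconcileLiveTotals(initialLiveTotals, latestLiveTotals, tallyProjectTotals map[string]int64) map[string]int64 {
	newTotals := make(map[string]int64, len(tallyProjectTotals))
	for project, tallied := range tallyProjectTotals {
		// Bytes uploaded during the tally iteration; the observer may or may
		// not have seen any given one of these segments.
		delta := latestLiveTotals[project] - initialLiveTotals[project]
		newTotals[project] = tallied + delta/2
	}
	return newTotals
}

func main() {
	initial := map[string]int64{"project-a": 1000} // recorded when tally starts
	latest := map[string]int64{"project-a": 1400}  // recorded when tally ends
	tally := map[string]int64{"project-a": 1100}   // observed in the metainfo loop

	// 1100 + (1400-1000)/2 = 1300
	fmt.Println(reconcileLiveTotals(initial, latest, tally)["project-a"])
}
```

Splitting the difference is an estimate: a segment uploaded during the loop is counted in full if the observer saw it and not at all if it did not, so half of the in-flight growth is used as the correction.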

69 Commits