storj

Author	SHA1	Message	Date
TungHoang	e1379bea0f	storagenode/piecestore: allow rejecting slow clients Estimate speed of the uploads by calculating the time a client has been in a non-congested state. Storage node is uncongested when it has used up over 80% of its maximum concurrent requests capacity. Currently it's disabled by default.	2021-07-01 09:59:04 +03:00
Yaroslav Vorobiov	c9cfb5ed0c	storagenode/retain: add more verbose monkit monitoring Change-Id: Ibb9804268751b4b1842eb729bc510dba83e9b28b	2021-06-04 20:20:11 +00:00
Egon Elbre	10372afbe4	ci: fix lint errors Change-Id: Ib5893440807811f77175ccd347aa3f8ca9cccbdf	2021-05-17 13:37:31 +00:00
Egon Elbre	86e698f572	pb: use *UnimplementedServer to avoid breaking API changes Change-Id: I99a34eeb37ac4453411f273511710562a519f57a	2021-03-29 12:26:10 +03:00
Ivan Fraixedes	2223877439	storgenode/piecestore: Log upload size on Upload Add the upload size to the log lines of the storagenode Upload endpoint to provides the information to Storage node operators. Change-Id: Ife661d28be72c2bf02579093e21fa811566ac8dd	2020-12-28 10:28:09 +00:00
Egon Elbre	2268cc1df3	all: fix linter complaints Change-Id: Ia01404dbb6bdd19a146fa10ff7302e08f87a8c95	2020-10-13 15:59:01 +03:00
Moby von Briesen	fbf2c0b242	storagenode/orders: Refactor orders store Abstract details of writing and reading data to/from orders files so that adding V1 and future maintenance are easier. Change-Id: I85f4a91761293de1a782e197bc9e09db228933c9	2020-10-06 15:28:07 -04:00
Cameron Ayer	ca0c1a5f0c	storagenode/{monitor,pieces}, storage/filestore: add loop to check storage directory writability periodically create and delete a temp file in the storage directory to verify writability. If this check fails, shut the node down. Change-Id: I433e3a8d1d775fc779ae78e7cf3144a05ffd0574	2020-08-31 21:20:49 +00:00
Moby von Briesen	68b67c83a7	storagenode/{orders,piecestore}: Always unlock unsent orders file, even with an empty order. When we call ordersStore.BeginEnqueue, the unsent orders file for that satellite and hour is prevented from being sent. It is freed when the commit callback returned by BeginEnqueue is used. This change ensures that we always call the commit callback, even when we have an empty order or an order with Amount <= 0. Change-Id: Ic4678f7eaa1e6957dd77d4bb5a23bb35d25b1e93	2020-08-21 11:35:31 -04:00
Jeff Wendling	91698207cf	storagenode: live tracking of order window usage This change accomplishes multiple things: 1. Instead of having a max in flight time, which means we effectively have a minimum bandwidth for uploads and downloads, we keep track of what windows have active requests happening in them. 2. We don't double check when we save the order to see if it is too old: by then, it's too late. A malicious uplink could just submit orders outside of the grace window and receive all the data, but the node would just not commit it, so the uplink gets free traffic. Because the endpoints also check for the order being too old, this would be a very tight race that depends on knowledge of the node system clock, but best to not have the race exist. Instead, we piggy back off of the in flight tracking and do the check when we start to handle the order, and commit at the end. 3. Change the functions that send orders and list unsent orders to accept a time at which that operation is happening. This way, in tests, we can pretend we're listing or sending far into the future after the windows are available to send, rather than exposing test functions to modify internal state about the grace period to get the desired effect. This brings tests closer to actual usage in production. 4. Change the calculation for if an order is allowed to be enqueued due to the grace period to just look at the order creation time, rather than some computation involving the window it will be in. In this way, you can easily answer the question of "will this order be accepted?" by asking "is it older than X?" where X is the grace period. 5. Increases the frequency we check to send up orders to once every 5 minutes instead of once every hour because we already have hour-long buffering due to the windows. This decreases the maximum latency that an order will be reported back to the satellite by 55 minutes. Change-Id: Ie08b90d139d45ee89b82347e191a2f8db1b88036	2020-08-19 19:42:33 +00:00
Cameron Ayer	0155c21b44	private/testplanet, storagenode/{monitor,pieces}: write storage dir verification file on run and verify on loop On run, write the storage directory verification file. Every time the node runs it will write the file even if it already exists. The reason we do this is because if the verification file is missing, the SN doesn't know whether it is an incorrect directory, or it simply hasn't written the file yet, and we want to keep nodes running without needing operator intervention. Once this change has been a part of the minimum version for several releases, we will move the file creation from the run command to the setup command. Run will only verify its existence. Change-Id: Ib7d20e78e711c63817db0ab3036a50af0e8f49cb	2020-08-19 19:12:21 +00:00
Moby von Briesen	708cb48aa6	storagenode/orders: implement orders filestore on storagenode * Add all new orders to the orders filestore instead of the database. * Submit orders from the filestore to the new satellite SettleWindow endpoint. The orders filestore will eventually replace the orders DB completely. For now, we will still be checking the orders DB and submitting those orders if they exist. In a later release, we will completely remove the orders DB, but we need both the DB and filestore for the transitionary period. Change-Id: Iac8780fd5ab770296181bbd313e1d335f072d4dc	2020-08-19 15:00:35 +00:00
Cameron Ayer	1f5d5235a6	storagenode/{monitor,piecestore}: if free disk < expected available space, return free disk We only sync free disk and available space, if necessary, on startup. If the SNs disk fills up with non-storj data, we will not know about it when reporting available space to the satellite. Solution: whenever we check the node's capacity, double check free disk. If free disk < than expected available space, return free disk. Change-Id: I66265c16e03be45b6e1f5817c70df7eac0a76455	2020-07-22 15:08:37 +00:00
Egon Elbre	080ba47a06	all: fix dots Change-Id: I6a419c62700c568254ff67ae5b73efed2fc98aa2	2020-07-16 14:58:28 +00:00
Moby von Briesen	e9dd5b2845	storagenode/piecestore: Properly log/send metrics for all successful pieces When an uplink or repair work finishes uploading a piece to a storagenode, it has no reason to wait another round trip after the piece is committed to gracefully close the connection - in many (most?) cases, the connection is simply canceled once the upload is complete. This has the unintended side effect of producing a lot of "piece canceled" logs and metrics on the storagenode side, when the reality is that the piece uploads were successful, and not really canceled. This commit fixes that. Change-Id: Icbc1f7857d380134560219c1c19c186df2783cd0	2020-07-07 15:19:17 +00:00
Egon Elbre	4d0cb1af7e	storagenode/piecestore: verify before checking free disk Change-Id: I3fe0383f9b13b99ef9d63ff235616ff204cf6d76	2020-06-02 17:49:14 +03:00
Qweder93	89c9672ce0	storagenode/piecestore: available storage check added in Upload Change-Id: I71e9e5f335d4320d5de8b374fe747fec43179f78	2020-06-01 16:55:22 +00:00
Moby von Briesen	dc57640d9c	storagenode/piecestore: switch usedserials db for in-memory usedserials store Part 2 of moving usedserials in memory * Drop usedserials table in storagenodedb * Use in-memory usedserials store in place of db for order limit verification * Update order limit grace period to be only one hour - this means uplinks must send their order limits to storagenodes within an hour of receiving them Change-Id: I37a0e1d2ca6cb80854a3ef495af2d1d1f92e9f03	2020-05-28 12:52:52 -04:00
littleskunk	2fbb34c3ea	nodeselection: Increase minimum free space to 500MB (#3898 )	2020-05-25 12:13:28 +02:00
littleskunk	ef2671927d	storagenode/piecestore: move queue size defaults (#3881 )	2020-05-15 19:10:26 +02:00
Egon Elbre	ec589a8289	all: fix comments about grpc Change-Id: Id830fbe2d44f083c88765561b6c07c5689afe5bd	2020-05-11 13:05:34 +03:00
Egon Elbre	7d29f2e0d3	all: remove drpc wrappers Change-Id: I45016f7d2a771dc00776196c1f531f3343e93b40	2020-05-11 08:20:34 +03:00
Egon Elbre	e6d5ce6b77	all: remove grpc It seems everyone has migrated to drpc. Change-Id: Ica6b2d0bdef68c6603083f2963458843eca71e9e	2020-05-10 06:36:09 +00:00
Jeff Wendling	57eb8a17e2	storagenode: allow configuring database path independently Fixes #3852 Change-Id: I021c29c4dd7c393399f6abef41d8457514032833	2020-05-04 06:04:31 +00:00
Matt Robinson	4a95ad0b2f	storagenode/piecestore: remove error from info log for download canceled (#3869 )	2020-04-30 18:42:16 +03:00
Isaac Hess	db0371703f	storagenode/pieces: Return UnhandledCount to satellite When we receive a piece deletion request, include the number of piece IDs we couldn't add to the queue in the reponse Change-Id: Ibebbe92ac50105bb5c74b18211ed38d468eb33f3	2020-04-27 08:56:56 -06:00
Isaac Hess	a785d37157	storagenode/pieces: Process deletes asynchronously To improve delete performance, we want to process deletes asynchronously once the message has been received from the satellite. This change makes it so that storagenodes will send the delete request to a piece Deleter, which will process a "best-effort" delete asynchronously and return a success message to the satellite. There is a configurable number of max delete workers and a max delete queue size. Change-Id: I016b68031f9065a9b09224f161b6783e18cf21e5	2020-04-23 11:51:19 -06:00
littleskunk	1336070fec	storagenode/piecestore: add missing log message about audit errors (#3861 ) * storagenode/piecestore: add missing log message about audit errors * storagenode/piecestore: add monkit data for oder limit verification errors	2020-04-23 13:20:47 -04:00
Jess G	dc78cd9634	storagenode/piecestore: fix annoying info log for upload canceled (#3857 ) * storagenode/piecestore: fix annoying info upload canceled log Change-Id: I3dd0f44226e7b946a2b30c3d0f30f28749ca6e88 * keep as info Change-Id: I7ebb5c19e4865e3030a8a6bb7f6279d316853e89 Co-authored-by: Fadila <Fadila82@users.noreply.github.com>	2020-04-15 14:15:07 -07:00
Jess G	75b9a5971e	satellite: update log levels (#3851 ) * satellite: update log levels Change-Id: I86bc32e042d742af6dbc469a294291a2e667e81f * log version on start up for every service Change-Id: Ic128bb9c5ac52d4dc6d6c4cb3059fbad73f5d3de * Use monkit for tracking failed ip resolutions Change-Id: Ia5aa71d315515e0c5f62c98d9d115ef984cd50c2 * fix compile errors Change-Id: Ia33c8b6e34e780bd1115120dc347a439d99e83bf * add request limit value to storage node rpc err Change-Id: I1ad6706a60237928e29da300d96a1bafa94156e5 * we cant track storage node ids in monkit metrics so lets use logging to track that for expired orders Change-Id: I1cc1d240b29019ae2f8c774792765df3cbeac887 * fix build errs Change-Id: I6d0ffe058e9a38b7ed031c85a29440f3d68e8d47	2020-04-15 12:32:22 -07:00
Egon Elbre	1b6ab173a8	private/context2: moved to storj.io/common/context2 Change-Id: Ic1dd1ed645ff3e1057c9b2b143e2c3ddf29d678e	2020-03-20 14:39:46 +00:00
Jennifer Johnson	1c1750e6be	removes bandwidth limiting On satellite, remove all references to free_bandwidth column in nodes table. On storage node, remove references to AllocatedBandwidth and MinimumBandwidth and mark as deprecated. Protobuf message, NodeCapacity, is left intact for backwards compatibility. Once this is released to all satellites, we can drop the column from the DB. Change-Id: I2ff6c6537fc9008a0c5588e951afea58ede85838	2020-03-04 14:04:00 +00:00
Cameron Ayer	7244a6a84e	storagenode/{contact, piecestore}: implement low disk notification with cooldown When a storagenode begins to run low on capacity, we want to notify the satellite before completely running out of space. To achieve this, at the end of an upload request, the SN checks if its available space has fallen below a certain threshold. If so, trigger a notification to the satellites. The new NotifyLowDisk method on the monitor chore is implemented using the common/syn2.Cooldown type, which allows us to execute contact only once within a given timeframe; avoiding hammering the satellites with requests. This PR contains changes to the storagenode/contact package, namely moving methods involving the actual satellite communication out of Chore and into Service. This allows us to ping satellites from the monitor chore Change-Id: I668455748cdc6741291b61130d8ef9feece86458	2020-03-03 10:45:37 -05:00
Egon Elbre	64330c55b3	all: use pbgrpc common/pb moved grpc to a separate package common/pb/pbgrpc. This updates this repository to use it. Change-Id: I2de2a190688871cf9cb61f7ea511f8a01e264e4e	2020-02-26 21:27:47 +02:00
Cameron Ayer	d578102672	storagenode/piecestore: add workgroup to endpoint to prevent stray goroutine after shutdown Change-Id: Ie8444c3c8f870745b73342de2e9a93027fcad371	2020-02-24 21:38:52 +00:00
Cameron Ayer	f22bddf122	{storagenode/contact, private/testplanet}: remove ErrFailureToStart and panic in testplanet.Start Change-Id: I252e8c9407400af7bda95a7657c8154660c3c801	2020-02-24 18:24:23 +00:00
Yingrong Zhao	5011e78311	storagenode/piecestore: remove unused DeletePiece endpoint With commit: `3331b443e7`, satellite will start calling `DeletePieces`. Therefore, we can remove the old endpoint once the above commit is deployed with all satellites Change-Id: I0124bc00a7cb808d119eb59f8fcd7fadf68158bb	2020-02-21 21:03:49 +00:00
Cameron Ayer	3e70a893dd	storagenode/{piecestore, contact}: report capacity to satellites if below specific threshold Curently, storage nodes only report their capacity to satellites once per hour. If a node fills up, it will fail all uploads until the next contact cycle begins. With these changes, at the end of an upload we check whether the MinimumDiskSpace threshold has been passed. If so, trigger the monitor chore to update the node's capacity, then trigger the contact chore to report the new capacity to the satellites Change-Id: Ie6aadaade1e2c12c87e03f8ff9059a50121380a0	2020-02-18 15:42:48 -05:00
Egon Elbre	8f20085683	storagenode/piecestore: clearer client cancellation error message Change-Id: Ia0595f71eb3eb1c0f091e615652e2de376d5609d	2020-02-14 09:36:03 +00:00
Jeff Wendling	7999d24f81	all: use monkit v3 this commit updates our monkit dependency to the v3 version where it outputs in an influx style. this makes discovery much easier as many tools are built to look at it this way. graphite and rothko will suffer some due to no longer being a tree based on dots. hopefully time will exist to update rothko to index based on the new metric format. it adds an influx output for the statreceiver so that we can write to influxdb v1 or v2 directly. Change-Id: Iae9f9494a6d29cfbd1f932a5e71a891b490415ff	2020-02-05 23:53:17 +00:00
Jeff Wendling	03166d6be3	storagenode/piecestore: log available bandwidth and space on uploads Change-Id: Ia92228cb2a178da45f4f123b48c476e5ec821fe8	2020-01-31 19:47:14 +00:00
Jeff Wendling	16bb374deb	storagenode/piecestore: add large timeouts to read/write operations this is to help protect against intentional or unintentional slowloris style problems where a client keeps a tcp connection alive but never sends any data. because grpc is great, we have to spawn a separate goroutine for every read/write to the stream so that we can return from the server handler to cancel it if necessary. yep. really. additionally, we update the rpcstatus package to do some stack trace capture and add a Wrap method for the times where we want to just use the existing error. also fixes a number of TODOs where we attach status codes to the returned errors in the endpoints. Change-Id: Id8bb8ff84aa34e0f711b0cf9bce3908b36a1d3c1	2020-01-23 19:20:49 +00:00
Egon Elbre	08f63614be	private/context2: add WithoutCancellation Change-Id: I38557c16f41b8983886f256353cc6afb7634d9e6	2020-01-15 14:23:46 +02:00
Egon Elbre	5af1f9e6d1	storagenode/{piecestore,storagenodedb}: use context in queries In endpoint.saveOrder, ensure we always try to save orders such that they can be settled. Change-Id: Ic9ac8f4bf684d8493282912ca97f386c1762e364	2020-01-14 20:27:26 +00:00
Egon Elbre	6615ecc9b6	common: separate repository Change-Id: Ibb89c42060450e3839481a7e495bbe3ad940610a	2019-12-27 14:11:15 +02:00
Fadila	115b8b0fc8	storagenode/piecestore: delete several pieces in a single request This is part of the deletion performance improvement. See https://storjlabs.atlassian.net/browse/V3-3349 Change-Id: Idcd83a302f2bd5cc3299e1a4195a7e177f452599	2019-12-27 10:58:04 +00:00
Andrew Harding	cb89496569	storagenode/trust: wire up list into pool - also updated ping chore to pick up trust changes - fixed small typo in blueprint - fixed flags for storj-sim - wired up changes to testplanet Change-Id: I02982f3a63a1b4150b82a009ee126b25ed51917d	2019-12-13 20:32:50 +00:00
Ivan Fraixedes	42c61138e8	storage: Improve doc comments delete methods (#3591 ) Improve the documentation of several methods involved in the delete operation to make clear their behavior without having to inspect their logic.	2019-12-02 12:18:20 +01:00
Ivan Fraixedes	bf97ef06fc	storagenode: Add new endpoint to receive satellite requests for… (#3590 ) * pkg/pg: Add new service function storage node Add a new service function to the storage node piece store for deleting pieces when satellites request them. * storagenode/piecestore: Add endpoint to delete piece Add a new endpoint to receive from trusted satellites to delete a piece. * private/testplanet: Fix storagenode mock Add to the storagenode mock the new endpoint method. * proto.lock: Update it with the last protbuff changes * storagenode/piecestore: Reuse test piece upload Extract the repeated logic from several tests functions for uploading a test piece to a test helper function. * uplink/piecestore: Implement client side method Implement the client side method of the new piecestore RPC function. * storagenode/piecestore: Add test DeletePiece endpoint Implement a test for the DeletePiece new endpoint method.	2019-11-26 18:47:19 +01:00
Isaac Hess	6aeddf2f53	storagenode/pieces: Add Trash and RestoreTrash to piecestore (#3575 ) * storagenode/pieces: Add Trash and RestoreTrash to piecestore * Add index for expiration trash	2019-11-20 09:28:49 -07:00

1 2 3

115 Commits