The order service still tries to settle orders in all cases, even
when the satellite is marked as untrusted by the trust service. This
always fails because the trust cache no longer has a record of the
satellite's URL, and the service keeps retrying.
This leaves a lot of "satellite is untrusted" errors in the logs.
There have been several complaints on the forum because this was
happening a lot for the stefanlite satellite, and I expect it will be
the same issue for the decommissioned satellites US2 and EUN-1 once
the forget-satellite command is run to clean up the satellite.
This change makes the order service archive the unsent orders of any
untrusted satellite instead of attempting to settle them.
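A minimal sketch of the intended behaviour, assuming hypothetical TrustChecker/OrderStore interfaces (the real order service uses the trust pool and the unsent-orders file store):

```go
package orders

import "context"

// Hypothetical interfaces for illustration only.
type TrustChecker interface {
	IsTrusted(ctx context.Context, satelliteID string) bool
}

type OrderStore interface {
	ListUnsentBySatellite(ctx context.Context) (map[string][]Order, error)
	Archive(ctx context.Context, satelliteID string, orders []Order) error
	Settle(ctx context.Context, satelliteID string, orders []Order) error
}

type Order struct{ SerialNumber string }

// SendOrders archives unsent orders belonging to untrusted satellites
// instead of trying (and failing) to settle them.
func SendOrders(ctx context.Context, trust TrustChecker, store OrderStore) error {
	unsent, err := store.ListUnsentBySatellite(ctx)
	if err != nil {
		return err
	}
	for satelliteID, list := range unsent {
		if !trust.IsTrusted(ctx, satelliteID) {
			// Satellite is untrusted: archive, don't settle.
			if err := store.Archive(ctx, satelliteID, list); err != nil {
				return err
			}
			continue
		}
		if err := store.Settle(ctx, satelliteID, list); err != nil {
			return err
		}
	}
	return nil
}
```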
https://github.com/storj/storj/issues/6262
Change-Id: If0f7f1783587cd18fab8917d45948f22df5b1dcf
This patch adds two new monkit metrics:
* piece_writer_io: the total time spent in io.Write during a piece upload (excluding the fs sync of the commit)
* piece_writer_hash: the total time spent hashing
The second is especially important. My storagenode (hosted on a cloud server) spends ~30 ms hashing data, while piece_writer_io is usually ~5 ms for me.
These metrics can help us identify the reasons for slowness on the storagenode side.
Both of these depend on the size of the piece. To make them more meaningful without exploding the cardinality, I created a few size categories and classified the pieces based on these. Measurements show that this provides useful results (>2 MB uploads are usually 23-28 ms).
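A sketch of the bucketing idea; the category boundaries below are illustrative, not necessarily the ones used by the patch, and the measured duration would be reported via monkit together with the category:

```go
package pieces

import (
	"io"
	"time"
)

// sizeCategory buckets a piece size into a coarse label so the timing
// metrics stay low-cardinality.
func sizeCategory(size int64) string {
	switch {
	case size < 128*1024:
		return "<128KB"
	case size < 1024*1024:
		return "<1MB"
	case size < 2*1024*1024:
		return "<2MB"
	default:
		return ">=2MB"
	}
}

// timedWrite measures only the time spent inside io.Write; together with
// sizeCategory it is what piece_writer_io would report.
func timedWrite(w io.Writer, p []byte) (time.Duration, error) {
	start := time.Now()
	_, err := w.Write(p)
	return time.Since(start), err
}
```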
Change-Id: Ifa1c205a490046655bcc34891003e7b43ed9c0bc
If the storagenode chore is left running and it has a chance to check in
again after we move time forward (line 139), then the satellite will
mark it as having finished GE before we check which nodes are still in
GE (line 149).
Change-Id: I350e1ef2e943f758d44132aaddd05fe248b30f3e
Currently, graceful exit is a complicated subsystem that keeps a queue
of all pieces expected to be on a node, and asks the node to transfer
those pieces to other nodes one by one. The complexity of the system
has, unfortunately, led to numerous bugs and unexpected behaviors.
We have decided to remove this entire subsystem and restructure graceful
exit as follows:
* Nodes will signal their intent to exit gracefully
* The satellite will not send any new pieces to gracefully exiting nodes
* Pieces on gracefully exiting nodes will be considered by the repair
subsystem as "retrievable but unhealthy". They will be repaired off of
the exiting node as needed.
* After one month (with an appropriately high online score), the node
will be considered exited, and held amounts for the node will be
released. The repair worker will continue to fetch pieces from the
node as long as the node stays online.
* If, at the end of the month, a node's online score is below a certain
threshold, its graceful exit will fail (see the sketch below).
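A minimal sketch of the month-end evaluation, assuming an illustrative 30-day period and 0.8 online-score threshold (the real values are satellite configuration):

```go
package gracefulexit

import "time"

// Illustrative values; the actual period and threshold are config options.
const (
	exitPeriod     = 30 * 24 * time.Hour
	minOnlineScore = 0.8
)

// exitStatus reports whether a gracefully exiting node has reached the end
// of its exit period and, if so, whether the exit succeeded based on the
// node's online score.
func exitStatus(exitStarted time.Time, onlineScore float64, now time.Time) (done, succeeded bool) {
	if now.Sub(exitStarted) < exitPeriod {
		return false, false
	}
	return true, onlineScore >= minOnlineScore
}
```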
Refs: https://github.com/storj/storj/issues/6042
Change-Id: I52d4e07a4198e9cb2adf5e6cee2cb64d6f9f426b
This change adds a new forget-satellite sub-command to
the storagenode CLI which cleans up untrusted satellite
data.
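Roughly how such a sub-command could be wired up with cobra; the argument handling and cleanup body here are placeholders, not the actual CLI surface:

```go
package main

import (
	"fmt"

	"github.com/spf13/cobra"
)

// newForgetSatelliteCmd sketches a "forget-satellite" sub-command; the real
// implementation removes the blobs, trash, and local state kept for the
// given untrusted satellite(s).
func newForgetSatelliteCmd() *cobra.Command {
	return &cobra.Command{
		Use:   "forget-satellite [satellite-id]",
		Short: "Remove data stored for an untrusted satellite",
		Args:  cobra.MinimumNArgs(1),
		RunE: func(cmd *cobra.Command, args []string) error {
			for _, id := range args {
				// Placeholder for the actual cleanup logic.
				fmt.Println("cleaning up data for satellite", id)
			}
			return nil
		},
	}
}

func main() {
	root := &cobra.Command{Use: "storagenode"}
	root.AddCommand(newForgetSatelliteCmd())
	_ = root.Execute()
}
```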
Issue: https://github.com/storj/storj/issues/6068
Change-Id: Iafa109fdc98afdba7582f568a61c22222da65f02
This change allows a node to look for a piece in the trash when
serving a download request.
If the piece is found in the trash, the node restores it to the blobs
directory and continues to serve the request as expected.
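A sketch of the lookup order, with a made-up minimal blob store interface standing in for the real storagenode blobstore:

```go
package pieces

import (
	"context"
	"errors"
	"io/fs"
)

// Blobs is a made-up minimal interface for the sketch.
type Blobs interface {
	Open(ctx context.Context, ref string) ([]byte, error)
	RestoreFromTrash(ctx context.Context, ref string) error
}

// openWithTrashFallback tries the blobs directory first and, if the piece is
// missing, attempts to restore it from the trash before retrying the open.
func openWithTrashFallback(ctx context.Context, blobs Blobs, ref string) ([]byte, error) {
	data, err := blobs.Open(ctx, ref)
	if !errors.Is(err, fs.ErrNotExist) {
		return data, err
	}
	if restoreErr := blobs.RestoreFromTrash(ctx, ref); restoreErr != nil {
		// Not in the trash either; report the original "not found" error.
		return nil, err
	}
	return blobs.Open(ctx, ref)
}
```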
Resolves https://github.com/storj/storj/issues/6145
Change-Id: Ibfa3c0b4954875fa977bc995fc4dd2705ca3ce42
All the files in uploadselection are (in fact) related to generic node selection, and are used not only for upload,
but also for download, repair, etc...
Change-Id: Ie4098318a6f8f0bbf672d432761e87047d3762ab
This adds tests to the zapwrapper package and also adds a test
to verify the issue in https://github.com/storj/storj/issues/6006
Change-Id: Iec3f568e72683af71e1718017109a1ed52794b0b
There are many cases where the keywords `free` and `available`
are confused in their usage.
In most cases, `free` space is the amount of free space left
on the whole disk, and not just in the allocation, while
`available` space is the amount of free space left in the
allocated disk space.
What the user/SNO wants to see is not the free space but the
available space. To the SNO, free space means the free space
left in the allocated disk space.
Because of this confusion, the multinode dashboard displays
the `free` disk space instead of the free space in the
allocated disk space (https://github.com/storj/storj/issues/5248),
while the storagenode dashboard shows the correct free space
in the allocation.
This change fixes the wrong free disk space. I also added a
few comments to make a distinction between the `free`
and `available` fields in the `DiskSpace*` structs.
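Roughly the distinction the added comments draw; the field set below is illustrative, not the exact `DiskSpace*` definition:

```go
package multinode

// DiskSpace is an illustrative version of the storagenode disk space report.
type DiskSpace struct {
	Allocated int64 // total disk space the SNO allocated to the node
	Used      int64 // allocated space already occupied by pieces
	Trash     int64 // allocated space occupied by trashed pieces
	Free      int64 // free space left on the whole disk, not just in the allocation
	Available int64 // free space left inside the allocation; what the SNO wants to see
}
```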
Change-Id: I11b372ca53a5ac05dc3f79834c18f85ebec11855
We use two different Node types in `overlay` and `uploadnodeselection` and convert back and forth between them.
Using the same object would allow us to use a unified node selection interface everywhere.
Change-Id: Ie71e29d60184ee0e5b4547eb54325f09c418f73c
The test needs to wait for the upload information to be saved to the
database.
Fixes https://github.com/storj/storj/issues/6008
Change-Id: I1f258c923a4b33cbc571f97bad046cec70642a0b
Storagenodes are currently getting larger signed orders due to
a performance optimization in uplink, which now messes with the
ingress graph because the storagenode plots the graph using
the order amount instead of the actually uploaded bytes. This
change fixes that.
The egress graph might have a similar issue if the order amount
is larger than the actually downloaded bytes, but since we pay
for orders, whether fulfilled or unfulfilled, we continue using
the order amount for the egress graph.
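The reasoning boils down to choosing which value feeds each graph; a trivial sketch with made-up names:

```go
package graphs

// graphValue returns the byte count to plot for a bandwidth graph entry.
// Ingress uses the bytes actually uploaded to the node; egress keeps using
// the signed order amount, since orders are paid whether or not they are
// fully fulfilled.
func graphValue(isIngress bool, orderAmount, transferredBytes int64) int64 {
	if isIngress {
		return transferredBytes
	}
	return orderAmount
}
```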
Resolves https://github.com/storj/storj/issues/5853
Change-Id: I2af7ee3ff249801ce07714bba055370ebd597c6e
* storagenode/orders/ordersfiles: unit test coverage
This change implements unit testing on common.go from the ordersfile package.
* storagenode/orders/ordersfiles: unit test coverage
This change uses the zeebo assert library instead of gotools so as not to introduce a new dependency.
Lazyfilewalker was failing with SIGPIPE, which was quite
misleading. The command was failing because the
value of the --lower-io-priority flag was assumed
to be an argument, since it was passed as
"--lower-io-priority true" instead of "--lower-io-priority=true".
Resolves https://github.com/storj/storj/issues/5900
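The difference, sketched with os/exec; the binary and subcommand names here are placeholders, only the flag comes from this change:

```go
package cmd

import "os/exec"

// buildFilewalkerCmd shows why the value must be attached with "=": passed
// as a separate token, "true" is treated as a positional argument by the
// subprocess and the run fails.
func buildFilewalkerCmd() *exec.Cmd {
	// Broken: "true" becomes an extra argument.
	//   exec.Command("storagenode", "lazyfilewalker", "--lower-io-priority", "true")

	// Fixed: the value is part of the same token as the flag.
	return exec.Command("storagenode", "lazyfilewalker", "--lower-io-priority=true")
}
```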
Change-Id: Icf79fcce76dafee21659d76ee0ce19d8520c8f1d
Instead of the hardcoded payout rates that were assumed for all satellites,
this change adds a new endpoint for fetching the pricing model from
each satellite.
The pricing model is then displayed in the Info & Estimation table
on the dashboard.
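A rough idea of the shape of this on the node side; the struct fields and method names are invented for the sketch:

```go
package payouts

import "context"

// PricingModel holds a satellite's payout rates (illustrative field set).
type PricingModel struct {
	EgressBandwidth int64 // price per TB of customer egress
	AuditBandwidth  int64 // price per TB of audit/repair egress
	DiskSpace       int64 // price per TB-month of stored data
}

// Client fetches the pricing model from a single satellite.
type Client interface {
	GetPricingModel(ctx context.Context) (PricingModel, error)
}

// pricingForSatellites collects each satellite's pricing model so the
// dashboard can show per-satellite estimates instead of a hardcoded rate.
func pricingForSatellites(ctx context.Context, clients map[string]Client) (map[string]PricingModel, error) {
	models := make(map[string]PricingModel, len(clients))
	for satellite, client := range clients {
		model, err := client.GetPricingModel(ctx)
		if err != nil {
			return nil, err
		}
		models[satellite] = model
	}
	return models, nil
}
```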
Updates https://github.com/storj/storj-private/issues/245
Change-Id: Iac7669e3e6eb690bbaad6e64bbbe42dfd775f078
This is particularly useful for monitoring the lazyfilewalker to
make sure it is not checking the wrong directory.
Updates https://github.com/storj/storj/issues/5349
Change-Id: I7e5fcfd4545ec4157d33a9225cd1bce607ccd154
The execwrapper package wraps exec.Cmd and provides a Command
interface that mimics the behaviour of exec.Cmd.
This is useful for testing the lazyfilewalker subprocesses
by stubbing instead of spawning a real subprocess.
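The idea, sketched with an abbreviated interface (the real package exposes more of exec.Cmd's surface):

```go
package execwrapper

import "os/exec"

// Command is the abbreviated idea: an interface that a real exec.Cmd already
// satisfies and that a test stub can implement instead.
type Command interface {
	Start() error
	Wait() error
	Run() error
}

// realCmd delegates to exec.Cmd.
func realCmd(name string, args ...string) Command {
	return exec.Command(name, args...)
}

// stubCmd is what a test would supply instead of spawning a subprocess.
type stubCmd struct{ err error }

func (s *stubCmd) Start() error { return s.err }
func (s *stubCmd) Wait() error  { return s.err }
func (s *stubCmd) Run() error   { return s.err }
```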
Updates https://github.com/storj/storj/issues/5349
Change-Id: I14084139c76a531f2b6d7163f9aa35c3f5e192d7
We've had issues with forgetting to close readers and writers.
Add leak tracking to find those pesky issues.
Change-Id: If6b0ad6e9958318a7e0affee9c6d0a1ece412b6d
As part of fixing the IO priority of filewalker related
processes such as the garbage collection and used-space
calculation, this change allows the initial used-space
calculation to run as a separate subprocess with lower
IO priority.
This can be enabled with the `--storage2.enable-lazy-filewalker`
config item. It falls back to the old behaviour when the
subprocess fails.
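The enable-and-fall-back behaviour, reduced to a sketch (the two function arguments stand in for the subprocess and in-process implementations):

```go
package pieces

import "context"

// usedSpace runs the lazy (subprocess) used-space calculation when enabled
// and falls back to the old in-process walk if the subprocess fails.
func usedSpace(ctx context.Context, lazyEnabled bool,
	lazyWalk, inProcessWalk func(ctx context.Context) (int64, error),
) (int64, error) {
	if lazyEnabled {
		if total, err := lazyWalk(ctx); err == nil {
			return total, nil
		}
		// Lazy filewalker subprocess failed; fall back to the old behaviour.
	}
	return inProcessWalk(ctx)
}
```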
Updates https://github.com/storj/storj/issues/5349
Change-Id: Ia6ee98ce912de3e89fc5ca670cf4a30be73b36a6
We automatically start a chore to check whether the blobstore is
writeable and readable; however, we don't want the tests to fail for
that reason. Usually we want to test some other failure.
There probably should be some nicer way to achieve this, but this is an
easier fix.
Change-Id: I77ada75329f88d3ea52edd2022e811e337c5255a
this change makes it so that the storage node no longer
cares if the cert of peers it talks to has been signed
by the sno registration server. this is fine because
the only reason a storage node would talk to a peer
besides the explicitly configured satellites is because
a satellite told it to.
we have already disabled this on uplinks (uplinks don't
care about the peer ca whitelist), and we are starting
to consider disabling this on satellites entirely.
however, before we really can disable it on satellites,
we need to disable it on storage nodes so that graceful
exit and node to node transfers can work correctly.
Change-Id: I2e0a0781bd247e574b82f0065aafb88804e59c71
The blobstore implementation is entirely related to storagenode, so the
rightful place is together with the storagenode implementation.
Fixes https://github.com/storj/storj/issues/5754
Change-Id: Ie6637b0262cf37af6c3e558556c7604d9dc3613d
storj/storj uses storj/uplink, and storj/uplink uses storj/storj (for integration tests).
Unless storj/storj uses the real defaults (instead of hard-coded ones), we can't modify them: modifying a default in uplink would break when storj/storj runs the integration test with the unchanged, hard-coded defaults.
Change-Id: Ifa68567dc2d5c8d08af8041ac338870c4fc26d45
This is not recommended for most nodes; leaving your node running when
it can't handle requests fast enough is a good way to fail audits and
get disqualified, which may happen before you even know about the
problem.
But some Windows users are finding that this is being triggered
regularly on their nodes, and that it apparently causes the whole system
to lock up occasionally. We are adding this option as a way to mitigate
that problem until we can collect more information.
Change-Id: I7a652b0f9f970bbb9ed9f2cb3ad1cb89d90db8d7
FileWalker implements methods to walk over pieces
in a storage directory.
This is just a refactor to separate filewalker functions
from pieces.Store. This is needed to simplify the work
to create a separate filewalker subprocess and reduce the
number of config flags passed to the subprocess.
You might want to check https://review.dev.storj.io/c/storj/storj/+/9773
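Illustrative shape of the split; names and the dependency are approximate (see the linked review for the actual interface):

```go
package pieces

import "context"

// BlobLister is a made-up minimal dependency standing in for the blob store.
type BlobLister interface {
	ListPieces(ctx context.Context, satellite string, fn func(pieceID string) error) error
}

// FileWalker owns only the directory-walking logic that used to live on
// pieces.Store; the store delegates to it.
type FileWalker struct {
	blobs BlobLister
}

// WalkSatellitePieces calls fn for every piece stored for the given satellite.
func (w *FileWalker) WalkSatellitePieces(ctx context.Context, satellite string, fn func(pieceID string) error) error {
	return w.blobs.ListPieces(ctx, satellite, fn)
}
```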
Change-Id: I4e9567024e54fc7c0bb21a7c27182ef745839fff
Download is served by two goroutines:
* one waits for the orders (and updates the actual limit)
* the other sends the valuable bytes back to the client (when the actual order is big enough)
These two tasks are synchronized with the help of a `sync2.NewThrottle()`.
But all of this happens in the same method, therefore we have no idea how much time is spent waiting for the next orders
(the throttle can wait until we receive a new order limit), and how much time is spent on the actual work.
This patch moves the actual work (after the sending routine is woken up) to a separate method to have better visibility and to measure the actual work (read data + send it).
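A simplified picture of the split; a plain channel stands in for `sync2.Throttle`, and the extracted method is what the patch measures:

```go
package piecestore

import (
	"context"
	"io"
	"time"
)

// sendAvailable is the extracted "actual work": read the allowed chunk and
// send it to the client, so its duration can be measured separately from
// the time spent waiting for the next order.
func sendAvailable(ctx context.Context, src io.Reader, dst io.Writer, allowed int64) (time.Duration, error) {
	start := time.Now()
	_, err := io.CopyN(dst, src, allowed)
	return time.Since(start), err
}

// download waits for newly allowed amounts (the channel stands in for the
// throttle fed by the order-receiving goroutine) and calls sendAvailable
// for each of them.
func download(ctx context.Context, allowed <-chan int64, src io.Reader, dst io.Writer) error {
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case n, ok := <-allowed:
			if !ok {
				return nil
			}
			if _, err := sendAvailable(ctx, src, dst, n); err != nil {
				return err
			}
		}
	}
}
```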
Change-Id: Ia5068c544560a53bc2fcea6cb6fce85cfacbd95b
if a drpc.ClosedError was returned, it would always take the
first (failure) branch, despite the second branch's existence.
Change-Id: Ife3b27869c4e9d37ca2914e2d1d1a2c60d326309
to support TCP_FAST_OPEN, we're considering just using
two TCP connections in parallel per request, one with
and one without. this allows us to safely fire both
concurrently without stressing out the node too much.
see https://review.dev.storj.io/c/storj/storj/+/9933
Change-Id: I9aa8a0252350db5ace04ee125bfe469203e980ec
Storagenode download metrics are not accurate:
* the current code bumps the cancel metrics only for specific error messages, but there are cases where the error is already handled (err == nil)
* instead of the full size of the piece, we need to use the size of the downloaded bytes
Change-Id: I6ca75770e2d40bf514f5e273785c78e02968c919