storj

Author	SHA1	Message	Date
Michal Niewrzal	6ac5bf0d7c	satellite/gracefulexit: remove segments loop parts We are switching completely to ranged loop. https://github.com/storj/storj/issues/5368 Change-Id: Ia3e2d7879d91f7f5ffa99b8e8f108380e3b39f31	2023-04-24 15:00:26 +00:00
Clement Sam	f076238748	storagenode: run used-space filewalker as a low IO subprocess As part of fixing the IO priority of filewalker related processes such as the garbage collection and used-space calculation, this change allows the initial used-space calculation to run as a separate subprocess with lower IO priority. This can be enabled with the `--storage2.enable-lazy-filewalker` config item. It falls back to the old behaviour when the subprocess fails. Updates https://github.com/storj/storj/issues/5349 Change-Id: Ia6ee98ce912de3e89fc5ca670cf4a30be73b36a6	2023-04-14 04:16:14 +00:00
Egon Elbre	8fba740332	storagenode/blobstore/testblobs: don't error checks in BadDB We automatically start a chore to check whether the blobstore is writeable and readable, however, we don't want to fail the tests due to that reason. Usually we want to test some other failure. There probably should be some nicer way to achieve this, but this is an easier fix. Change-Id: I77ada75329f88d3ea52edd2022e811e337c5255a	2023-04-11 16:14:31 +03:00
JT Olio	1d63395fd1	storagenode/peer: don't require CA whitelist any longer this change makes it so that the storage node no longer cares if the cert of peers it talks to has been signed by the sno registration server. this is fine because the only reason a storage node would talk to a peer besides the explicitly configured satellites is because a satellite told it to. we have already disabled this on uplinks (uplinks don't care about the peer ca whitelist), and we are starting to consider disabling this on satellites entirely. however, before we really can disable it on satellites, we need to disable it on storage nodes so that graceful exit and node to node transfers can work correctly. Change-Id: I2e0a0781bd247e574b82f0065aafb88804e59c71	2023-04-07 21:53:00 +00:00
Egon Elbre	f5020de57c	storagenode/blobstore: move blob store logic The blobstore implementation is entirely related to storagenode, so the rightful place is together with the storagenode implementation. Fixes https://github.com/storj/storj/issues/5754 Change-Id: Ie6637b0262cf37af6c3e558556c7604d9dc3613d	2023-04-05 18:06:20 +00:00
Márton Elek	462c16eb9b	storagenode/piecestore: use actual Initial/MaxStep defaults storj/storj uses storj/uplink and storj/uplink uses storj/storj (for integration test). Without using the real defaults (instead of hard coded ones) in storj/storj, we couldn't modify them. (modification in uplink will fail when storj/storj is used for integration test, with the unchanged, hard-coded defaults). Change-Id: Ifa68567dc2d5c8d08af8041ac338870c4fc26d45	2023-04-05 19:18:12 +02:00
paul cannon	556250911c	storagenode/monitor: add option to log only when verification check fails This is not recommended for most nodes; leaving your node running when it can't handle requests fast enough is a good way to fail audits and get disqualified, which may happen before you even know about the problem. But some Windows users are finding that this is being triggered regularly on their nodes, and that it apparently causes the whole system to lock up occasionally. We are adding this option as a way to mitigate that problem until we can collect more information. Change-Id: I7a652b0f9f970bbb9ed9f2cb3ad1cb89d90db8d7	2023-04-04 12:41:33 +00:00
Clement Sam	e5c43722dc	storagenode/pieces: introduce FileWalker FileWalker implements methods to walk over pieces in a storage directory. This is just a refactor to separate filewalker functions from pieces.Store. This is needed to simplify the work to create a separate filewalker subprocess and reduce the number of config flags passed to the subprocess. You might want to check https://review.dev.storj.io/c/storj/storj/+/9773 Change-Id: I4e9567024e54fc7c0bb21a7c27182ef745839fff	2023-03-30 18:33:52 +00:00
Márton Elek	00420b5904	storagenode/piecestore: better monkit metric for download Download is served from two goroutines: * one is waiting for the orders (and updates the actual limit) * other one sends the valuable bytes back to the client (in case the actual order is big enough) These two tasks are synchronized with the help of a `sync2.NewThrottle()` But all of this happens in the same method, therefore we have no idea how much time is spent on waiting for next orders (throttle can wait until we receive new orderlimit), and how much time is spent with actual work. This patch moves the actual work (after the sending routine is woken up) to a separate method to have better visibility and measure the actual work (read data + send it). Change-Id: Ia5068c544560a53bc2fcea6cb6fce85cfacbd95b	2023-03-30 10:46:18 +00:00
JT Olio	58f465c8d8	storagenode/piecestore: improve cancelation detection logic if a drpc.ClosedError was returned, it would always take the first (failure) branch, despite the second branch's existence. Change-Id: Ife3b27869c4e9d37ca2914e2d1d1a2c60d326309	2023-03-21 17:50:49 +00:00
JT Olio	0b4b04900a	private/server: debounce noise and tls connections to support TCP_FAST_OPEN, we're considering just using two TCP connections in parallel per request, one with and one without. this allows us to safely fire both concurrently without stressing out the node too much. see https://review.dev.storj.io/c/storj/storj/+/9933 Change-Id: I9aa8a0252350db5ace04ee125bfe469203e980ec	2023-03-21 16:51:31 +00:00
Clement Sam	c3d5965ef2	storagenode/monitor: add timeout to storage dir verification Resolves https://github.com/storj/storj/issues/4567 Change-Id: Ia071c476bcd1f5c99a9874801c94db86d1e105c6	2023-03-14 13:43:14 +00:00
Andrew Harding	5c744d7ed4	storagenode/pieces: close reader after use Change-Id: Icd9df821edb668c5521732396b7d6be3b8e75c7a	2023-03-13 14:06:10 +00:00
JT Olio	ade808375c	storagenode/piecestore: be more flexible with bandwidth usage max this is to prepare for https://review.dev.storj.io/c/storj/uplink/+/9814 Change-Id: Ib74fd63a352a0ae4cc1f8b66fb70df649844c33a	2023-03-06 18:00:17 +00:00
Egon Elbre	71e42ac2a6	storagenode/storagenodedb: fix database queries Change-Id: Ic5036c30aed7f68b9c28d6190650028461503465	2023-03-01 15:57:08 +02:00
Márton Elek	644aca0e42	storagenode: fix piecestore download metrics Storagenode download metrics are not accurate: * the current metric bump cancel metrics only for specific error messages, but there are cases where the error is already handled (err == nil) * instead of the full size of the piece we need to use the size of the downloaded bytes Change-Id: I6ca75770e2d40bf514f5e273785c78e02968c919	2023-02-28 12:52:58 +00:00
JT Olio	529e3674e4	storagenode/piecestore: handle upload write if provided in first message we may in the future want to accept writes and commits as part of the initial request message, just like https://review.dev.storj.io/c/storj/storj/+/9245 this change is forward compatible but continues to work with existing clients. Change-Id: Ifd3ac8606d498a43bb35d0a3751859656e1e8995	2023-02-23 18:33:08 +00:00
JT Olio	ca13eca718	go.mod: bump libuplink to include noise Change-Id: If5bceb139ce6fdf6c0792b4bb536bc61b54e32bb	2023-02-10 17:03:51 +00:00
JT Olio	522aed083d	private/server,satellite/contact,misc: use new storj/common noise helpers this change uses the new storj/common noise helpers, which: * add a security fix (require an expected node id for validating noise key attestations) * stops doing an unnecessary order signature validation (it's already been done inside of PutPiece) * removes some duplicate code Change-Id: I5e67a08ff216cd9c5b0b82e40b4d9de664b6b0fc	2023-02-07 09:53:45 -05:00
Clement Sam	3d3f9d133a	storagenode: fix B*h to bytes disk usage conversion The used space graph values are correct when a single satellite is selected but wrong for 'All satellites'. This is related to the queries for getting the individual disk usages for all satellites per day and the summary and average for all satellites per day: 1. dividing the sum of at_rest_total by the total_hours is wrong. Simply put, we were assuming that, for example (4/2)+(6/3) equals (4+6)/(2+3), assuming we had 4 and 6 at_rest_total values with 2 and 3 respective hours. 2. To get the average, we need to first find the sum of the at_rest_total_bytes for each timestamp across all satellites before taking the average of the sums instead of just taking the average from the individual satellite values. Closes https://github.com/storj/storj/issues/5519 Change-Id: Ib1314e238b695a6c1ecd9f9171ee86dd56bb3b24	2023-02-06 18:50:31 +00:00
JT Olio	ae9ea22193	storagenode/piecestore: return node certificate chain at upload conclusion uplinks currently get the node's certificate chain over TLS. once Noise is in use, uplinks will no longer be able to do this. we should start having the upload request return the certificate chain in the same release that starts supporting noise. Change-Id: I619b23cb8e25691bcc62d760f884403a4ccd64a0	2023-02-01 01:49:50 +00:00
JT Olio	382af95499	storagenode/contact: send noise key and settings as contact info Change-Id: I1e7a83de36d5cf16eed8874091b15af1e0b73df7	2023-01-31 21:49:20 +00:00
paul cannon	2f04e20627	storage/filestore: better error message on data corruption A user on the forum was seeing the error "bad message", which was not very helpful. This came from the ext4 filesystem using the code EBADMSG to indicate it detected an invalid CRC, suggesting disk corruption. This change adds some explanatory information about probable disk corruption to all errors coming from the (*blobInfo).Stat() call, which is where storagenode fs corruption problems will usually manifest. Refs: https://github.com/storj/storj/issues/5375 Change-Id: I87f4a800236050415c4191ef1a0fc952f9def315	2023-01-30 08:54:06 -06:00
paul cannon	ed7c82439d	storage/filestore: avoid stat() during walkNamespaceInPath Calling stat() (really, lstat()) on every file during a directory walk is the step that takes up the most time. Furthermore, not all directory walk uses _need_ to have a stat done on every file. Therefore, in this commit we avoid doing the stat at the lowest level of walkNamespaceInPath. The stat will still be done when it is requested, with the Stat() method on the blobInfo object. The major upside of this is that we can avoid the stat call on most files during a Retain operation. This should speed up garbage collection considerably. The major downside is that walkNamespaceInPath will no longer automatically skip over directories that are named like blob files, or blob files which are deleted between readdir() and stat(). Callers to walkNamespaceInPath and its variants (WalkNamespace, WalkSatellitePieces, etc) are now expected to handle these cases individually. Thanks to forum member Toyoo for the insight that this would speed up garbage collection. Refs: https://github.com/storj/storj/issues/5454 Change-Id: I72930573d58928fa25057ed89cd4ec474b884199	2023-01-30 13:47:03 +00:00
JT Olio	e40191afd6	storj: upgrade to use latest storj/common NodeAddress Change-Id: I5987391bcfe5f6dfd7b525698c337a4cbda9b76e	2023-01-25 01:37:26 +00:00
Clement Sam	95960572b3	storagenode/piecestore: improve logs for incoming requests - Adds "Remote Address" field to all INFO logs related to GET, PUT, and DELETE requests - Adds Offset and Size fields to all info logs related to GET requests Resolves https://github.com/storj/storj/issues/5404 Change-Id: I5dab1867619385362e5f1e0455dfab17d295a37a	2023-01-24 10:23:34 +00:00
JT Olio	8d69837f02	storagenode/piecestore: handle order if provided in first message we may in the future want to accept orders as part of the initial request message (e.g. https://review.dev.storj.io/c/storj/uplink/+/9246). this change is forward compatible but continues to work with existing clients. Change-Id: I475ad50d6cbfee8a1f843383230698e4ef9b9e54	2023-01-20 08:53:22 +00:00
Egon Elbre	90b7076d26	storagenode/pieces: fix log line Change-Id: I8dba6b0f3d6af3140dfa503c8d6b33e6808d004f	2023-01-17 11:04:47 +02:00
Egon Elbre	9544a670d7	storagenode/pieces: fix concurrent empty and restore trash This ensures that empty trash and restore trash cannot run at the same time. Fixes https://github.com/storj/storj/issues/5416 Change-Id: I9d2e3aa3d66e61e5c8a7427a95208bb96089792d	2023-01-03 15:01:54 +00:00
Michal Niewrzal	d1d617d654	storagenode/piecestore: small cleanup * normalize ExistsCheckWorkers flag to avoid setting incorrect values * wait for limiter to finish if context was canceled Change-Id: I688a395bd958cd09233605fc264d43f91ec45ed1	2022-12-19 12:37:12 +00:00
Michal Niewrzal	5110803102	storagenode/piecestore: add Exists endpoint Adds new method Exists which can be used to verify which requested piece ids exist on storage node. Will verify only pieces which belong to the satellite that used that endpoint. Minimum WASM size was increased a bit. https://github.com/storj/storj/issues/5415 Change-Id: Ia5f9cadeb526541b2776a8973eb7d50133ad8636	2022-12-17 04:08:26 +00:00
Egon Elbre	ee71fbb41d	storagenode/piecestore: start restore trash in the background Starting restore trash in the background allows the satellite to continue to the next storagenode without needing to wait until completion. Of course, this means the satellite doesn't get feedback whether it succeeds or not. This means that the restore-trash needs to be executed several times. Change-Id: I62d43f6f2e4a07854f6d083a65badf897338083b	2022-12-16 18:15:52 +02:00
Clement Sam	951d5db7f7	storagenode: fix hour_interval for first day defaulted to 24h Previously, because a LAG was used to calculate the hour_interval, the first record, which is usually the first day of the month, had no previous record and its at_rest_total was always assumed to cover 24 hours. Resolves https://github.com/storj/storj/issues/5390 Change-Id: Id532f8b38fe9df61432e62655318ff119a733d13	2022-12-15 13:30:11 +00:00
Clement Sam	7461ffe148	{storagenode,web/multinode}: fix storage usage db/cache retrieval queries The query changes we did while fixing the usage graph led to wrong payout calculations directly linked to disk space. This change: - avoids converting from Bh to B directly in the query - returns the at_rest_total in the original byteshour value - returns at_rest_total_bytes as the calculated disk space used in bytes - uses the at_rest_total_bytes only for the disk space graph - returns summary_bytes as the average disk space used within the specified date - updates the disk space graph header to "average disk space used this month" The total disk used in the month is also displayed in B not Bday Resolves https://github.com/storj/storj/issues/5355 Change-Id: I2cfefb0fe711f9c59de2adb547c4ab50b05c7cbb	2022-12-09 11:07:33 +00:00
Egon Elbre	8777523255	private/testplanet: disable WAL for storagenodes Change-Id: I1be4f7901c830e829118afeb90f04b87df555459	2022-12-05 11:41:06 +00:00
Andrew Harding	4fdea51d5c	storagenode/storagenodedb: faster test db init Running all of the migrations necessary to initialize a storage node database takes a significant amount of time during runs. The package currently supports initializing a database from manually coalesced migration data (i.e. snapshot) which improves the situation somewhat. This change takes things a bit further by changing the snapshot code to instead hydrate the database directory from a pre-generated snapshot zip file. name old time/op new time/op delta Run_StorageNodeCount_4/Postgres-16 2.50s ± 0% 0.16s ± 0% ~ (p=1.000 n=1+1) Change-Id: I213bbba5f9199497fbe8ce889b627e853f8b29a0	2022-12-01 20:45:36 +00:00
Clement Sam	3e0a4230a5	storagenode/payout: fix disk space value at payout Payout is still calculating using the tb*h. So it’s getting the total disk space used this month and dividing by (24*30) instead of just 30. More context here: https://forum.storj.io/t/current-month-earnings-in-node-v1-67-1/20319/5 Follow up PR for https://github.com/storj/storj/issues/5146 Change-Id: Ie2d48497f2a9bdbc995c99ee27e70b46580ff638	2022-11-18 01:04:20 +00:00
Clement Sam	f5156296d4	storagenode/pieces: warn and trash v0 pieces when not found in v0pieceInfoDB Context: https://github.com/storj/storj/issues/4225#issuecomment-1307575782 Closes https://github.com/storj/storj/issues/4225 Change-Id: Ib8c3189f86118338556d48a6af657e6dc109b4c0	2022-11-14 14:54:16 +00:00
Clement Sam	59b37db670	storagenode: overhaul QUIC check implementation The current implementation blocks the startup until one or none of the trusted satellites is able to reach the node via QUIC. This can cause delayed startup. Also, the quic check is done once during startup, and if there is a misconfiguration later, snos would have to restart the node. In this change, we reuse the contact service which pings the satellite periodically for node checkin. During checkin the satellite tries pinging the node back via both TCP and QUIC and reports both statuses. With this, we are able to get a periodic update of the QUIC status without restarting the node. Also adds the time the node was last pinged via QUIC to the tooltip on the QUIC status tab. Resolves https://github.com/storj/storj/issues/4398 Change-Id: I18aa2a8e8d44e8187f8f2eb51f398fa6073882a4	2022-11-09 03:15:57 +00:00
Egon Elbre	aeb645d32b	all: replace deprecated ioutil Change-Id: I60b0bbf5b68b066e2d44b8b99438594d600a3c2d	2022-10-31 15:50:41 +00:00
Clement Sam	d625eb85fd	storagenode: use bytes instead of bytes*hour unit for used space graph Closes https://github.com/storj/storj/issues/5146 Change-Id: I1b135da81a68193b5b50c761088d79471ca3a2fe	2022-10-28 18:42:45 +00:00
JT Olio	58a9c55f36	mod: bump dependencies - storj.io/common Change-Id: Ib78154acc253a13683495abfdd96d702625fdce8	2022-10-19 17:01:53 +00:00
Egon Elbre	ff22fc7ddd	all: fix deprecated ioutil commands Change-Id: I59db35116ec7215a1b8e2ae7dbd319fa099adfac	2022-10-11 15:27:29 +00:00
Erik van Velzen	e6b5501f9b	satellite/gc/sender: new service to send retain filters Implement a new service to read retain filters from a bucket and send them out to storagenodes. This allows the retain filters to be generated by a separate command on a backup of the database. Parallelism (setting ConcurrentSends) and end-to-end garbage collection tests will be restored in a subsequent commit. Solves https://github.com/storj/team-metainfo/issues/121 Change-Id: Iaf8a33fbf6987676cc3cf74a18a8078916fe673d	2022-09-20 11:49:40 +00:00
Clement Sam	07beef378d	storagenode/collector: delete expired piece info if file does not exist The collector tries deleting a piece over and over again, though the piece does not exist on the storagenode's filesystem. We need to delete the piece info from the expired db if the targeted file does not exist. This does not resolve the base problem of why the file is deleted before the collector tries deleting it. This change deletes the piece info from the expired db if the file does not exist, since we're already trying to delete that piece anyway. Closes https://github.com/storj/storj/issues/4192 Change-Id: If659185ca14f1cb29fd3c4237374df6fcd535df8	2022-09-15 12:29:29 +00:00
Clement Sam	a848c29b9b	storagenode/nodestats: add monkit metrics for reputation scores Closes https://github.com/storj/storj/issues/4835 Change-Id: Ib56e34145b962bede3525066f9bd7ef950d21e9b	2022-09-15 08:43:48 +00:00
Clement Sam	64e5fb7772	storagenode/collector: fix error check when file does not exist The collector calls the Delete() method on the pieces which returns an error which is wrapped by many error classes. Delete() method is using Stat() from `1aec831d98/storage/filestore/dir.go (L328)` under the hood. os.IsNotExist(errors.Unwrap(err)) will always be false unless errors.Unwrap(err) is called multiple times till it gets to the core os.ErrNotExist. Here is a test case to explain better: func TestABC(t *testing.T) { classA := errs.Class("A") classB := errs.Class("B") wrappedError := classB.Wrap(classA.Wrap(os.ErrNotExist)) require.True(t, os.IsNotExist(errs.Unwrap(wrappedError))) require.True(t, os.IsNotExist(errors.Unwrap(wrappedError))) } Using errs.Is() seems to resolve this even without unwrapping the error: func TestABC(t *testing.T) { classA := errs.Class("A") classB := errs.Class("B") wrappedError := classB.Wrap(classA.Wrap(os.ErrNotExist)) require.True(t, errs.Is(wrappedError, os.ErrNotExist)) require.False(t, errs.Is(wrappedError, os.ErrExist)) require.False(t, os.IsNotExist(wrappedError)) } Does not resolve the collector issue here but enhances it: https://github.com/storj/storj/issues/4192 Change-Id: Ifb75dd15b54c1e1a5e23f6eba2d621d64874a5cc	2022-09-02 12:26:33 +00:00
Márton Elek	7e71986493	storagenode: accept HTTP calls on public port, listening for monitoring requests Today each storagenode should have a port which is opened for the internet, and handles DRPC protocol calls. When we do a HTTP call on the DRPC endpoint, it hangs until a timeout. This patch changes the behavior: the main DRPC port of the storagenodes can accept HTTP requests and can be used to monitor the status of the node: * it returns with HTTP 200 only if the storagenode is healthy (not suspended / disqualified + online score > 0.9) * it CAN include information about the current status (per satellite). It's opt-in, you should configure it so. In this way it becomes extremely easy to monitor storagenodes with external uptime services. Note: this patch exposes some information which was not easily available before (especially the node status, and used satellites). I think it should be acceptable: * Until having more community satellites, all storagenodes are connected to the main Storj satellites. * With community satellites, it's a good thing to have more transparency (easy way to check who is connected to which satellites) The implementation is based on this line: ``` http.Serve(NewPrefixedListener([]byte("GET / HT"), publicMux.Route("GET / HT")), p.public.http) ``` This line answers to the TCP requests with `GET / HT...` (GET HTTP request to the route), but puts back the removed prefix. Change-Id: I3700c7e24524850825ecdf75a4bcc3b4afcb3a74	2022-08-26 09:38:09 +00:00
Márton Elek	4b1be6bf8e	storagenode/satellite: support different piece hash algorithms Change-Id: I3db321e79f12f3ebaa249e6c32fa37fd9615687e	2022-08-23 18:15:06 +00:00
Clement Sam	bac0155664	storagenode/storagenodedb: fix null at_rest_total values for storage usage Wrapping a COALESCE around the computed at_rest_total value to fall back to the original at_rest_total value when the computed value is null. https://forum.storj.io/t/release-preparation-v1-62/19444/5?u=clement Change-Id: Ifa268ccbe35a63e3b68f07464194fa034ad261b5	2022-08-23 12:56:28 +00:00


807 Commits