Commit Graph

1272 Commits

Author SHA1 Message Date
paul cannon
c3b5c18d00 cmd/tools/segment-verify: learn to take a CSV list of segments as input
This will allow us to retry some specific segments from
segments-retry.csv with particularly high counts of "retry" pieces.

Change-Id: I48fd419cc0350a3be4c9e77ce8d28871565b7f97
2023-01-18 20:53:27 +00:00
Jeremy Wharton
5d656e66bf satellite/payments/stripecoinpayments: implement invoice price override
This change allows for overriding project usage prices for a specific
partner so that users who sign up with that partner do not need their
invoices to be manually adjusted.

Relates to storj/storj-private#90

Change-Id: Ia54a9cc7c2f8064922bbb15861f974e5dea82d5a
2023-01-17 14:32:10 +00:00
Michal Niewrzal
0185bba90a cmd: cleanup segment verify/repair tools
* use the same DB application name for satellite and metabase
* use noop orders DB implementation to avoid storing allocated bandwidth
in DB

Change-Id: I20e88c694d38240fe1a20c45719e210cfb76402c
2023-01-12 15:27:07 +00:00
Clement Sam
b5d0021fb6 cmd/storagenode-updater: restart storagenode after update on BSD unix derivatives
On BSD, the storagenode-updater falls back to renaming the storagenode without
doing anything to restart the service.
Like the approach we have for linux, this change finds the process ID of the
storagenode using pgrep and sends an interrupt signal to the process.

Closes https://github.com/storj/storj/issues/5333

Change-Id: Icced90ea3e831528804784c2170a3b8b14952e8c
2023-01-11 12:38:26 +00:00
Clement Sam
5fdfc6c4f3 cmd/{storagenode,satellite}: remove unused docker compose YAML files
Change-Id: I8d35cc723fa6b25d2baff58d848e0888a863cf4c
2023-01-10 18:17:42 +00:00
paul cannon
246c193145 cmd/tools/segment-verify: add timeout to VerifyWithExists
We have to wait until the slowest node is done being tested before we
can move on to the next batch of segments. Since the slowest node can be
arbitrarily slow, we'll set a timeout and treat too-slow nodes as
temporarily offline.

Change-Id: I80fe865dd4e8f826700430fb0140c2d3aefca381
2023-01-06 17:25:30 +00:00
paul cannon
23acee2df0 cmd/tools/segment-verify: better handle timeouts
When we are verifying pieces by downloading the first byte, if we
encounter a timeout, treat the node as if we failed to connect to it,
and log the error once instead of twice.

Change-Id: I70602d554183c98f1213f3ffb1bfec41100ea0e7
2023-01-06 16:52:28 +00:00
paul cannon
3a9ad48345 cmd/tools/segment-verify: fix problem-pieces.csv
This csv file was being closed as soon as the service was created.
All subsequent writes to the closed file handle produced errors,
which were logged but otherwise ignored.

Instead, we would like the file to remain open and writable, until
the service is destroyed.

Change-Id: Ib29944d25b2f5b2d0f90fdbdcde44fea8d769321
2023-01-06 16:23:34 +00:00
Andrew Harding
5efb08cd7b cmd/uplink: skip abort on committed multipart writes
The copyFile method has some safeguards to ensure that the multipart
write is aborted. This is accomplished by always calling abort on the
MultiWriteHandle when the copy is finished, whether or not there was a
failure or it was successfully committed. If the copy was committed,
then this RPC is a no-op on the metainfo server.

Regardless, the calls to abort constitute an additional RPC to the
satellite for no benefit. This is exacerbated by the fact that the code
currently ends up calling abort twice.

This change updates the libuplink-backed MultiWriteHandle implementation
to not call abort if the write is committed and vice-versa. This
eliminates the two wasteful RPC calls.

Change-Id: I13679234f6f473e9a93179e6791fb57eac512f25
2023-01-05 18:30:23 +00:00
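The idea behind 5efb08cd7b can be sketched with a handle that remembers whether it has already been finalized. This is a simplified model under stated assumptions: `writeHandle` is a hypothetical type, and the counters stand in for RPCs to the satellite; the real MultiWriteHandle interface is not reproduced here.

```go
package main

import "fmt"

// writeHandle is a simplified, hypothetical stand-in for a multipart write
// handle. The counters stand for RPCs sent to the satellite.
type writeHandle struct {
	done    bool
	commits int
	aborts  int
}

// Commit finalizes the write unless it was already committed or aborted.
func (h *writeHandle) Commit() {
	if h.done {
		return
	}
	h.done = true
	h.commits++
}

// Abort cancels the write unless it was already committed or aborted.
func (h *writeHandle) Abort() {
	if h.done {
		return
	}
	h.done = true
	h.aborts++
}

// copyFile always defers Abort as a safeguard. After a successful Commit the
// deferred Abort does nothing, so no extra RPC is sent.
func copyFile(h *writeHandle, fail bool) {
	defer h.Abort()
	if fail {
		return
	}
	h.Commit()
}

func main() {
	ok := &writeHandle{}
	copyFile(ok, false)
	fmt.Println(ok.commits, ok.aborts) // 1 0

	failed := &writeHandle{}
	copyFile(failed, true)
	fmt.Println(failed.commits, failed.aborts) // 0 1
}
```

Keeping the unconditional `defer` preserves the safeguard on every error path while the `done` flag removes the wasted calls.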
Qweder93
8c69ee62fc {cmd/storj-sim, satellite/rangedloop}: added rangedloop to storj-sim, removed identity
Added a rangedloop to storj-sim for each satellite to verify that it works for the metrics observer.
Removed the identity from the rangedloop peer, as we never use it. Added logs on service run, used a loop
in the service instead of an endless for loop, and added an interval value to the config.

Closes: https://github.com/storj/storj/issues/5414

Change-Id: Ibc3b06071b68feda4a35b45da2bbe36e22a02fc8
2023-01-05 11:29:00 +00:00
paul cannon
6e1554652a cmd/tools/segment-verify: handle dq'd nodes
Previously, if any pieces are still on disqualified nodes, this tool
would treat those pieces as fine (if the disqualified node is still
online) or temporarily unavailable (if the disqualified node is
offline). Instead, we should treat such pieces as lost.

This also fixes a slight problem with the code that handles a broken
alias. This is not likely to happen, but if we do see an alias that is
not in the alias map, we return an error instead of nil.

Change-Id: Ib4e2e729ef0535dd7bd9ce2f621680d9f959891c
2023-01-04 17:54:03 +00:00
Ethan Adams
d29abed2aa cmd/satellite: Add run auditor to satellite entrypoint.
Needed to support new auditor processes.

Change-Id: I6687a667c123c69a335317216affad3a14ab7b9c
2023-01-04 15:31:06 +00:00
paul cannon
2feb49afc3 cmd/tools/segment-verify: don't cache offline status forever
Because it was originally intended to work on only a few pieces from
each segment at a time, and would frequently have reset its list of
online nodes, segment-verify has been taking nodes out of its
onlineNodes set and never putting them back. This means that over a long
run in Check=0 mode, we end up treating more and more nodes as offline,
permanently. This trend obfuscates the number of missing pieces that
each segment really has, because we don't check pieces on offline nodes.

This commit changes the onlineNodes set to an "offlineNodes" set, with
an expiration time on the offline-ness quality. So nodes are added to
the offlineNodes set when we see they are offline, and then we only
treat them as offline for the next 30 minutes (configurable). After that
point, we will try connecting to them again.

Change-Id: I14f0332de25cdc6ef655f923739bcb4df71e079e
2023-01-03 23:11:42 +00:00
paul cannon
46d99a06d5 cmd/tools/segment-verify: write to pieces csv from WithExists methods
The WithExists methods previously were not writing problematic pieces to
problem-pieces.csv. With this change, they will.

Change-Id: I51eadd3d8f4299e1efa787c9266a7aacfa525eb3
2022-12-28 01:12:46 +00:00
paul cannon
9544936794 cmd/tools/segment-verify: don't double-count notfound
When this branch is followed, `audit.OutcomeFailure` is returned, and
`MarkNotFound()` is immediately called again (in
`(*NodeVerifier).Verify()`). Calling `MarkNotFound()` twice for the same
piece is not correct.

Change-Id: I1a2764bc32ed015628fcd9353ac3307f269b4bbd
2022-12-28 00:37:14 +00:00
paul cannon
aec596bb39 cmd/tools/segment-verify: monkit-ify WithExists methods
It may help to know how much faster these methods are than the
alternative (asking nodes for each piece in turn).

Change-Id: Ieb7c963f62b662f72c84a49de8a09c065c14f782
2022-12-28 00:00:40 +00:00
paul cannon
42e2a14316 cmd/tools/segment-verify: flush after write to pieces csv
It was ok as it was, but since we want to keep a close eye on progress
while the tool is running, it will help to have results written to the
output file immediately instead of after the buffer is full or the
program exits.

Change-Id: Ie027f05771a637afb06969ec775cd32b142b7635
2022-12-27 14:58:54 -06:00
Kaloyan Raev
56896353b6 cmd/uplink: add buffering while reading from stdin
This change is similar to
https://review.dev.storj.io/c/storj/storj/+/7687 but applied when
uploading from stdin with parallelism > 1.

Currently, the parallelism from stdin scales up to 3 or 4, but not
greater than that. If we buffer the content from stdin more aggressively
the parallelism scales to higher levels and reaches the performance of
reading directly from a file.

Change-Id: I1f447686a88074882709992ee6d52dd262e220fb
2022-12-23 16:40:54 +00:00
paul cannon
b2422caaef cmd/tools/segment-verify: log fewer retry segments
When Check == 0 (check all pieces), there is nearly always a piece left
in the retry count, so most segments get logged in segments-retry.csv.
This change makes it so we require retry>5 before adding to
segments-retry.csv (only in the check==0 case).

Change-Id: Iaea523c27eb777e3c248c27c7ef5effe77ae54cf
2022-12-23 14:29:25 +00:00
paul cannon
0b790070a3 cmd/tools/segment-verify: pass over bad segments
Change-Id: I1b4dd9da755c6a2028760723e15219f5821f702f
2022-12-22 18:12:12 -06:00
Kaloyan Raev
bfd189c3b0 cmd/uplink: add --inmemory-erasure-coding flag to cp command
This new advanced flag configures libuplink to store in-memory the
erasure-coded pieces that are temporarily created during upload.

By default, libuplink writes the erasure-coded pieces as temp files on
the disk, but this results in additional IOPS that affect the
performance in hot-rodded scenarios.

If the erasure-coded pieces are kept in-memory and the system has enough
RAM, the upload speed may be boosted by 20-30%.

The flag is added as "advanced" as we don't recommend it by default.

Co-authored-by: Stefan Benten <mail@stefan-benten.de>

Change-Id: Icc54f03b6c0bc27c97126f6f1d22748d21a15959
2022-12-22 19:48:58 +00:00
Michal Niewrzal
4851b4e06d cmd/tools/segment-verify: small improvements
* better error handling when Exists method is not available on SN
* more optimal processing of response from Exists method

Change-Id: I6d61c09473e9f5ab76a4601720e8bd520767f4c2
2022-12-22 15:21:33 +00:00
Clement Sam
cda1d67465 cmd/tools/segment-verify: adjust to SN Exists endpoint
Change-Id: I409aeae29aa87996f2a6047f976d215a69e9d7f5
2022-12-21 19:24:31 +00:00
Fadila Khadar
d23e25ce0f cmd/tools/segment-verify: remove unused test code
Accidentally added some code to a test. As it is unused, this PR removes it.

Change-Id: I7adddc78c5ed747225e365989ab58504a9625ad7
2022-12-19 14:33:08 +00:00
Ethan Adams
1c309a0318 cmd/tools/segment-verify: check for unvetted nodes
This also renames the command from `duplicates` to `node-check`.

Change-Id: Idd303b17ec03f5b55fbbb1f4039a7761da37abe6
2022-12-19 09:59:13 +00:00
paul cannon
7b851b42f7 satellite/audit: split out auditor process
This change creates a new independent process, the 'auditor', comparable
to the repairer, gc, and api processes. This will allow auditors to be
scaled independently of the core.

Refs: https://github.com/storj/storj/issues/5251
Change-Id: I8a29eeb0a6e35753dfa0eab5c1246048065d1e91
2022-12-16 12:44:32 -06:00
paul cannon
fc905a15f7 satellite/audit: newContainment->containment
Now that all the reverification changes have been made and the old code
is out of the way, this commit renames the new things back to the old
names. Mostly, this involves renaming "newContainment" to "containment"
or "NewContainment" to "Containment", but there are a few other renames
that have been promised and are carried out here.

Refs: https://github.com/storj/storj/issues/5230
Change-Id: I34e2b857ea338acbb8421cdac18b17f2974f233c
2022-12-16 17:59:52 +00:00
paul cannon
0342ca1aa6 satellite/audit: delete now-unused code
Now that we are doing scalable piecewise reverifications, the code for
handling the old way of doing things (containment, pending audits,
reporting, testing) can now be removed.

Refs: https://github.com/storj/storj/issues/5230
Change-Id: Ief1a75f423eff682e8f3d57804e343b3409a6631
2022-12-16 14:53:39 +00:00
Egon Elbre
04f16f8768 cmd/tools/segment-verify: tool for checking duplicate net
Change-Id: Ie47c1282e580ffc418bf3b1f3c8820a48973aefc
2022-12-15 22:58:36 +00:00
paul cannon
727136141a satellite/cmd/tools/segment-verify: check all pieces
This adds the capability to the segment-verify tool of checking all
pieces of every indicated segment.

Pieces which could not be accessed (i.e. we couldn't get a single
byte from them) are recorded in a csv file.

I haven't been able to test this in any very meaningful way, yet, but I
am comforted by the fact that the worst things it could possibly do are
(a) download pieces too many times, and (b) miss downloading some
pieces.

Change-Id: I3aba30921572c974993363eb36d0fd5b3ae97907
2022-12-14 19:06:08 +00:00
Fadila Khadar
995f78d579 satellite/cmd: segment-verify verifies segments in given bucket list
Provides the `segment-verify run buckets` command for verifying segments within a list of buckets.

The bucket list is a CSV file with `project_id,bucket_name` entries to be checked.

https://github.com/storj/storj-private/issues/101

Change-Id: I3d25c27b56fcab4a6a1aebb6f87514d6c97de3ff
2022-12-13 20:10:00 +00:00
Jeremy Wharton
ba7d2c2dbe satellite/payments/stripecoinpayments: add config for price overrides
This change adds configuration flags for defining partner-specific
project usage price overrides.

Resolves https://github.com/storj/storj-private/issues/61

Change-Id: Ia535ac22576382211d045f9ff2c9b983a07e86f3
2022-12-09 15:33:27 +00:00
Jeff Wendling
f2fdd6ca33 cmd/uplink: fix some issues with share
Because --readonly is default true, passing something like
--disallow-deletes=false would not actually update that
value because the readonly flag would override. This makes it
so that the --disallow-* flags override the --readonly and
--writeonly flags.

Also fixes some minor formatting issues with share like an
extra space after the "Public Access:" entry.

Simplifies the handling of the explicit "none" by making the
flags for the dates optional and using nil to signify that
the value was left unset.

Bump the go.mod to go1.18 to enable the use of generics and
add a small generic function. This can easily be backed out
if it causes problems.

Change-Id: I1c5f1321ad17b8ace778ce55561cbbfc24321a68
2022-12-08 17:46:02 +00:00
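The override rule from f2fdd6ca33 (an explicit --disallow-* flag wins over --readonly) can be modeled with optional values, where nil means "left unset". The helper below is a guess at the kind of small generic function the commit mentions, not the actual one; `valueOr` and `disallowDeletes` are hypothetical names, and generics require Go 1.18 or later.

```go
package main

import "fmt"

// valueOr returns *explicit when the flag was set, and fallback otherwise.
// Hypothetical helper; not taken from the uplink source.
func valueOr[T any](explicit *T, fallback T) T {
	if explicit != nil {
		return *explicit
	}
	return fallback
}

// disallowDeletes resolves the effective setting: an explicit
// --disallow-deletes wins, otherwise --readonly decides.
func disallowDeletes(explicit *bool, readonly bool) bool {
	return valueOr(explicit, readonly)
}

func main() {
	no := false
	fmt.Println(disallowDeletes(nil, true)) // true: the readonly default applies
	fmt.Println(disallowDeletes(&no, true)) // false: the explicit flag overrides readonly
}
```

Using a pointer distinguishes "flag not given" from "flag given as false", which is exactly the case that was being lost when --readonly defaulted to true.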
paul cannon
378b8915c4 satellite/{satellitedb,audit}: add NewContainment
NewContainment will replace Containment later in this commit chain, but
for now it is not yet being used.

NewContainment will allow a node to be contained for multiple pending
reverify jobs at a time. It is implemented by way of the reverify queue.

Refs: https://github.com/storj/storj/issues/5231
Change-Id: I126eda0b3dfc4710a88fe4a5f41780618ec19101
2022-12-07 18:03:37 +00:00
Giulio Fidente
5d956c9dc5 Expose LOG_LEVEL env variable via Dockerfile for storagenode (#5362)
The default 'info' level for the storagenode will dump dozens of
lines every second. This change adds the ability to configure
the log.level argument at run time using LOG_LEVEL env variable.

Co-authored-by: Clement Sam <clementsam75@gmail.com>
2022-12-02 22:25:13 +00:00
Márton Elek
b4d8cbfbbf cmd/uplink: add options to save pprof/trace information
Change-Id: I5bcc602366de4ebd9b761e641a3806ddaeb9ecba
2022-11-30 11:53:29 +00:00
Márton Elek
e617db832e cmd/uplink: ability to set experimental flag from environment variable
Change-Id: I440764a54ac83e5a85e14f64843260d9c4f993fd
2022-11-29 12:11:18 +00:00
Ivan Fraixedes
ef4b564b82 cmd/uplink: Update error message referring to 'import'
Uplink doesn't have a `save` command; however, it is referred to in an error
message that's returned when the `access register` command is executed
without having any default access configured.

The correct command to mention is `import`.

Change-Id: Ia2092d02965737f421683fc98c52a51c9529b86e
2022-11-25 18:54:51 +01:00
Moby von Briesen
3501656e98 satellite/repair: Add flag to allow disabling reputation updates
Reputation updates during repair currently consume a lot of database
resources. Sometimes increasing the rate of repair is more important
than auditing a node based on whether they have or don't have the
correct piece during repair. This is the job of the audit service.

This commit is to implement an intermediate solution from this issue: https://github.com/storj/storj/issues/5089
This commit does not address the more in-depth fix discussed here: https://github.com/storj/storj/issues/4939

Change-Id: I4163b18d78a96fadf5265789fd73c8aa8def0e9f
2022-11-24 08:31:11 -05:00
Erik van Velzen
b574ee5e6d satellite/metabase/rangedloop: service skeleton
Create a skeleton for the multi-threaded segment loop, along with the peer and cmd command for rangedloop.

Change-Id: I52c78a313f15070d43207c52ea94e53169821654
2022-11-22 15:21:41 +02:00
Michal Niewrzal
8e9b7736cc cmd/satellite: repair-segment; don't stop processing if segment is not found
If we are processing a list of segments (CSV), we should not stop if one of
the segments is not found in the DB.

Change-Id: I720f85dc7601c2ca77032e20c1577de55092bd9b
2022-11-22 08:31:16 +00:00
Michal Niewrzal
2ac5d16faf cmd/satellite: fix repair-segment command args validation
After adding the option to pass only a CSV file, the number of allowed input parameters was not adjusted.

Change-Id: I55096be02d8e692de2f04571309be6b56d18bf67
2022-11-21 12:23:31 +00:00
Michal Niewrzal
8d5a2a90f2 cmd/satellite: repair-segment; add option to process csv file directly
The current option is to pass a stream ID and position as input, but
it's not very efficient when we have a long list of segments to repair.
This change adds an option to read a whole CSV file and process each entry
one by one.

If the command has a single argument, it treats it as the CSV file
location; if it has two arguments, it parses them as the
stream ID and position.

Change-Id: I1e91cf57a794d81d74af0091c24a2e7d00d1fab9
2022-11-18 17:40:17 +00:00
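The argument handling described in 8d5a2a90f2 (one argument means a CSV path, two mean a stream ID and position) can be sketched like this. `repairInput` and `parseRepairArgs` are hypothetical names, and treating the position as an unsigned integer is an assumption; the real command may encode it differently.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// repairInput is a hypothetical result of parsing the command arguments.
type repairInput struct {
	CSVPath  string // set when a single argument is given
	StreamID string // set together with Position when two arguments are given
	Position uint64
}

// parseRepairArgs dispatches on the number of arguments.
func parseRepairArgs(args []string) (repairInput, error) {
	switch len(args) {
	case 1:
		return repairInput{CSVPath: args[0]}, nil
	case 2:
		pos, err := strconv.ParseUint(args[1], 10, 64)
		if err != nil {
			return repairInput{}, fmt.Errorf("invalid position %q: %w", args[1], err)
		}
		return repairInput{StreamID: args[0], Position: pos}, nil
	default:
		return repairInput{}, errors.New("expected a CSV path, or a stream ID and a position")
	}
}

func main() {
	fromCSV, err := parseRepairArgs([]string{"segments.csv"})
	fmt.Println(fromCSV.CSVPath, err)

	single, err := parseRepairArgs([]string{"stream-1", "7"})
	fmt.Println(single.StreamID, single.Position, err)
}
```

Validating the argument count in the same place as the dispatch avoids the mismatch that 2ac5d16faf later had to fix.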
Michal Niewrzal
ec777855e1 cmd/satellite: add segment-repair command
Implements logic for a satellite command to repair a single segment.
The segment will be repaired even if it's healthy; this is not checked
during the process. As part of the repair, the command downloads the whole
segment into memory and tries to reupload it to a number of new
nodes equal to the existing number of pieces. After a successful
upload, the new pieces completely replace the existing pieces.

Command:
    satellite repair-segment <streamid> <position>

https://github.com/storj/storj/issues/5254

Change-Id: I8e329718ecf8e457dbee3434e1c68951699007d9
2022-11-18 16:18:08 +01:00
Márton Elek
8c569866aa uplink/share: support share via insecure authservice protocol
This patch makes it possible to use `uplink share` in a test environment (like storj-up) where the authservice doesn't have a fully secure endpoint.

This is supposed to be an undocumented feature (no flag, just a custom prefix) to avoid any confusion for regular users.

Change-Id: I256aefc944066e52c72224e7b6f1a593b5bc57f7
2022-11-10 15:16:37 +00:00
Cameron
f06da25c3d satellite/overlay: add nodeevents.DB to satellite overlay service
Add nodeevents.DB to satellite overlay service so we can insert node
events into the nodeevents DB.

Change-Id: I642c0ccc9941ecdb08cb22d5c8cf701959a55156
2022-11-02 15:56:37 +00:00
Michal Niewrzal
d21bbab2b2 satellite: fix metabase configuration wiring
The new flag 'MultipleVersions' was not correctly passed from the metainfo
configuration to the metabase configuration. Because the configuration was
set correctly for unit tests, we didn't catch it, and the issue was found
while testing on the QA satellite.

This change reduces the number of places where new metabase flags need
to be propagated from the metainfo configuration, to avoid problems with
setting new flags in the future.

Fixes https://github.com/storj/storj/issues/5274

Change-Id: I74bc122649febefd87f665be2fba628f6bfd9044
2022-11-02 15:17:34 +00:00
Ethan
4efde65c9e satellite/gc: Optionally run the GC bloomfilter process once, instead of in a loop
The current deployment strategy requires that the GC bloomfilter generation process executes only once and exits.

Change-Id: I952991f126596aa165d1f2e9fce6f8548c21bdba
2022-11-01 18:19:40 +00:00
Egon Elbre
4b05beb3f5 cmd/satellite: add 'repair-segment' command
This is mostly wiring up the necessary systems.

Change-Id: I9e03b3240235ca8e4cc51ddf6a41e27d9dbc198c
2022-11-01 14:31:39 +00:00
Egon Elbre
aeb645d32b all: replace deprecated ioutil
Change-Id: I60b0bbf5b68b066e2d44b8b99438594d600a3c2d
2022-10-31 15:50:41 +00:00