As part of fixing the IO priority of filewalker-related
processes such as garbage collection and used-space
calculation, this change allows the initial used-space
calculation to run as a separate subprocess with lower
IO priority.
This can be enabled with the `--storage2.enable-lazy-filewalker`
config item. It falls back to the old behaviour when the
subprocess fails.
Updates https://github.com/storj/storj/issues/5349
Change-Id: Ia6ee98ce912de3e89fc5ca670cf4a30be73b36a6
the parallelism and parallelism-chunk-size flags
which used to control how many parts to split a
segment into and how many to perform in parallel
are now deprecated and replaced by
maximum-concurrent-pieces and long-tail-margin.
now, for an individual transfer, the total number
of piece uploads that transfer will perform is
controlled by maximum-concurrent-pieces, and
segments within that transfer will automatically
be performed in parallel. so if you used to set
your parallelism to n, a good value for the pieces
might be something approximately like 130*n, and
the parallelism-chunk-size is unnecessary.
Change-Id: Ibe724ca70b07eba89dad551eb612a1db988b18b9
We avoid putting more than one piece of a segment on the same /24
network (or /64 for ipv6). However, it is possible for multiple pieces
of the same segment to move to the same network over time. Nodes can
change addresses, or segments could be uploaded with dev settings, etc.
We will call such pieces "clumped", as they are clumped into the same
net, and are much more likely to be lost or preserved together.
This change teaches the repair checker to recognize segments which have
clumped pieces, and put them in the repair queue. It also teaches the
repair worker to repair such segments (treating clumped pieces as
"retrievable but unhealthy"; i.e., they will be replaced on new nodes if
possible).
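A minimal sketch of how clumping could be detected, assuming each piece is identified by its node's IP address. The helper names `lastNet` and `clumpedPieces` are hypothetical and are not taken from the repair checker; only the /24 and /64 rule comes from the description above.

```go
package main

import (
	"fmt"
	"net/netip"
)

// lastNet returns the /24 network (IPv4) or /64 network (IPv6) that
// contains ip. The name and the string form of the result are
// illustrative assumptions.
func lastNet(ip string) (string, error) {
	addr, err := netip.ParseAddr(ip)
	if err != nil {
		return "", err
	}
	addr = addr.Unmap() // treat IPv4-mapped IPv6 addresses as IPv4
	bits := 64
	if addr.Is4() {
		bits = 24
	}
	prefix, err := addr.Prefix(bits)
	if err != nil {
		return "", err
	}
	return prefix.String(), nil
}

// clumpedPieces returns the indexes of pieces whose node shares a
// network with an earlier piece of the same segment.
func clumpedPieces(nodeIPs []string) ([]int, error) {
	seen := make(map[string]bool)
	var clumped []int
	for i, ip := range nodeIPs {
		network, err := lastNet(ip)
		if err != nil {
			return nil, err
		}
		if seen[network] {
			clumped = append(clumped, i)
			continue
		}
		seen[network] = true
	}
	return clumped, nil
}

func main() {
	clumped, err := clumpedPieces([]string{
		"10.0.1.5", "10.0.2.9", "10.0.1.77", "2001:db8::1", "2001:db8::2",
	})
	fmt.Println(clumped, err)
}
```

In this sketch the first piece seen on a network counts as healthy and only the later ones are reported as clumped; which piece to keep is a design choice that the text above does not specify.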
Refs: https://github.com/storj/storj/issues/5391
Change-Id: Iaa9e339fee8f80f4ad39895438e9f18606338908
The cmd/storagenode/main.go is a big mess right now with so many
unneeded config structures initialized and shared by several
subcommands.
There are many instances where the config structure of one subcommand
is mistakenly used for another subcommand.
This change is an attempt to clean up the main.go by moving the
subcommands to separate `cmd_*.go` files with separate config structures
for each subcommand.
Resolves https://github.com/storj/storj/issues/5756
Change-Id: I85adf2439acba271c023c269739f7fa3c6d49f9d
updates flag descriptions with correct punctuation, and fixes error
messages so that they are not capitalized.
Updates #5623
Change-Id: I9c6ef6d9888b2fb90b17db8775cc6abe803e102f
The assignment `err = nil` is not used in the rest of the code;
however, it was a protective assignment.
Change-Id: Id70fb2a2e68b91e2481952d865334e603ca41188
adds an additional flag to return an additional TXT record that will
enable TLS on custom domains with Linksharing.
Closes #5623
Change-Id: I941616362d7dcd9aec20dfd10346e483021516a4
Option added to `uplink access setup` and `uplink access create`
commands to disable object key encryption.
Related to https://github.com/storj/storj/issues/5678
Change-Id: I4789a94143742ff4b232fd60decc029ad2883c2a
We have lots of direct DB requests to get API keys. They should be handled
by the cache, but its default value is very low at the moment.
Fixes https://github.com/storj/storj/issues/5665
Change-Id: I214ebebd6e397cacff80b2f36dc4a2eea388f93d
We do regular testing by executing uplink, but sometimes the recorded execution time shows spikes.
It would be nice to know the reason for the spikes (just an internet blip, or something we should be worried about).
We can collect distributed traces, but it's not easy to find the right trace in Jaeger.
* We can provide a random trace-id, but it should be persisted / processed
* We can also save standard output and use `--trace-verbose` which prints out the used trace id, but it's also complicated to collect all of them in a DB
It would be nice to attach additional metadata to traces so that we can filter all traces of one specific kind of test.
This patch provides this feature:
* It always adds the hostname to the trace (if you opt in to distributed tracing, which is turned off by default)
* Additional tags can be defined with a CLI flag
Tags can be used to find the right trace in Jaeger (or in the Elasticsearch backend of Jaeger).
Change-Id: I08f10023bbebd783f812cfca95ac6237360ac2b0
Remove generate-missing-project-salt migration tool code and related
tests. This migration has already been run and this code is no longer
needed.
Issue https://github.com/storj/storj-private/issues/163
Change-Id: I4e36dcd95a07c5305c597113a7fd08148e100ccc
This test involves a satellite with dev defaults (DistinctIP=no) being
upgraded past commit 2522ff09b6, which
means we need to run the dev-defaults-satellite-upgrade migration SQL
to avoid getting DistinctIP=yes behavior (which breaks the tests).
Change-Id: I29fb596d1ffa568dad635d98cfe9abacd3aaa48f
Only the API peer needs access to the order DB (and rollups cache) because it's
the only place where we create orders for PUT and GET operations. For
other peers, like the auditor and repairer, we can set a noop implementation to
reduce the number of dependencies they need.
Change-Id: Ic32d1879f0b97ffc4516f401898e31e95ae892e4
It was surprising that `satellite auditor` complained about SMTP mail settings, even though it's not supposed to send any mail.
It looks like we can remove the mail service dependency, as it's not a hard requirement for overlay.Service.
Change-Id: I29a52eeff3f967ddb2d74a09458dc0ee2f051bd7
quic is still configurable based on the quic rollout
environment variables in storj.io/common. this stops
using a method removed in:
https://review.dev.storj.io/c/storj/uplink/+/9815
Change-Id: Ibfe28cfb19e5672630970b9e2c8c6ac0c98d4822
I use the `uplink share` command, but I always fail to set the --not-before parameter.
* Usually I try +2d when I see in the help that +2h is possible --> fail
* When it fails, I try to set an explicit date, like 2012-12-23 --> fail
This patch makes it possible to use:
* day duration (like +3d)
* shorter date definition (like `2023-12-12` or `2023-12-12T12:40`)
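A rough sketch of how such input could be parsed, assuming a hypothetical helper named `parseNotBefore`; the actual uplink implementation, the exact set of accepted layouts, and its handling of time zones are not shown here and may differ. Since `time.ParseDuration` has no day unit, the `d` suffix is handled separately.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// parseNotBefore is a hypothetical helper: it accepts relative offsets
// such as "+2h" or "+3d" and absolute dates such as "2023-12-12" or
// "2023-12-12T12:40". Absolute dates are read in now's location.
func parseNotBefore(now time.Time, s string) (time.Time, error) {
	if strings.HasPrefix(s, "+") || strings.HasPrefix(s, "-") {
		sign := 1
		if s[0] == '-' {
			sign = -1
		}
		rest := s[1:]
		// time.ParseDuration does not know days, so handle "d" here.
		if strings.HasSuffix(rest, "d") {
			days, err := strconv.Atoi(strings.TrimSuffix(rest, "d"))
			if err != nil {
				return time.Time{}, fmt.Errorf("invalid day count in %q: %w", s, err)
			}
			return now.Add(time.Duration(sign*days) * 24 * time.Hour), nil
		}
		d, err := time.ParseDuration(rest)
		if err != nil {
			return time.Time{}, err
		}
		return now.Add(time.Duration(sign) * d), nil
	}
	layouts := []string{time.RFC3339, "2006-01-02T15:04:05", "2006-01-02T15:04", "2006-01-02"}
	for _, layout := range layouts {
		if t, err := time.ParseInLocation(layout, s, now.Location()); err == nil {
			return t, nil
		}
	}
	return time.Time{}, fmt.Errorf("unrecognized time %q", s)
}

// demo parses s relative to a fixed reference time (2023-01-01 00:00 UTC)
// and returns the result in RFC 3339 form, or "error".
func demo(s string) string {
	ref := time.Date(2023, 1, 1, 0, 0, 0, 0, time.UTC)
	t, err := parseNotBefore(ref, s)
	if err != nil {
		return "error"
	}
	return t.Format(time.RFC3339)
}

func main() {
	for _, s := range []string{"+2h", "+3d", "2023-12-12", "2023-12-12T12:40"} {
		fmt.Println(s, "=>", demo(s))
	}
}
```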
Change-Id: I2243b36f59c8929eb0473c4bb4fed19220890c71
The tests were using global variables for keeping the mock state, which
was indexed by the satellite ID. However, the satellite IDs are
deterministic and it's possible for two tests to end up using the same
mocks.
Instead, make the mock creation not depend on the satellite ID and
require it to be configured via paymentsconfig.
This fixes TestAutoFreezeChore failure.
Change-Id: I531d3550a934fbb36cff2973be96fd43b7edc44a
This code is essentially a replacement for eestream.CalcPieceSize. To call
eestream.CalcPieceSize we need an eestream.RedundancyStrategy, which is not
trivial to get as it requires infectious.FEC. For example, infectious.FEC
creation is visible on the GE loop observer CPU profile because we were
doing this for each segment in the DB.
A new method was added to storj.Redundancy and here we are just wiring it
up with metabase Segment.
BenchmarkSegmentPieceSize
BenchmarkSegmentPieceSize/eestream.CalcPieceSize
BenchmarkSegmentPieceSize/eestream.CalcPieceSize-8 5822 189189 ns/op 9776 B/op 8 allocs/op
BenchmarkSegmentPieceSize/segment.PieceSize
BenchmarkSegmentPieceSize/segment.PieceSize-8 94721329 11.49 ns/op 0 B/op 0 allocs/op
Change-Id: I5a8b4237aedd1424c54ed0af448061a236b00295
This change removes the trailing slash from the account activation and
password recovery URLs, making them consistent with the rest. The URLs'
previous forms are still supported, however, in order to not invalidate
emails containing them.
Resolves storj/customer-issues#491
Change-Id: Ie774a87698d8e9edd1836611968fc3911c6cc56f
The peer for generating bloom filters will be able to use the ranged loop.
In addition, some cleanup was done:
* removed unused parts of GC BF peer (identity, version control)
* added missing Close method for ranged loop service
* added some additional tests
https://github.com/storj/storj/issues/5545
Change-Id: I9a3d85f5fffd2ebc7f2bf7ed024220117ab2be29
Previously we were exposing the testing facilities via interface casting
the necessary parts, however, when things are not part of the main
satellite.DB interface they need to be manually propagated. Rather than
relying on using hidden methods, let's expose things as long as they don't
create a direct dependency to the database driver.
Change-Id: I2eb7d8b60f4b64de1320c2d32581f7be267c0f57
Users with a partner package plan should be unable to replace their
plan's coupon. This change enforces this behavior by rejecting coupon
application attempts from users that meet this criterion.
Change-Id: I6383d19f2c7fbd9e1a2826473b2f867ea8a8ea3e
listing "/" on windows was not returning files from
the root because it was adding an extra separator
unconditionally. the docs for filepath.Clean say
The returned path ends in a slash only if it represents
a root directory, such as "/" on Unix or `C:\` on Windows.
so we need to add the slash only if it doesn't already have
one to avoid the double slash problem while still ensuring
the path ends with a slash.
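The fix described above can be sketched as follows. The function name `withTrailingSeparator` is made up for illustration; only `filepath.Clean`, `filepath.Separator`, and `strings.HasSuffix` are real standard-library names.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// sep is the OS-specific path separator as a string.
const sep = string(filepath.Separator)

// withTrailingSeparator cleans path and makes sure the result ends with
// exactly one separator. filepath.Clean already leaves a trailing
// separator on root directories, so it is only appended when missing.
func withTrailingSeparator(path string) string {
	cleaned := filepath.Clean(path)
	if !strings.HasSuffix(cleaned, sep) {
		cleaned += sep
	}
	return cleaned
}

func main() {
	fmt.Println(withTrailingSeparator(sep))             // root stays a single separator
	fmt.Println(withTrailingSeparator("a" + sep + "b")) // separator is appended
}
```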
Change-Id: I98afc1f1a06bb06035c7647ecb0da3214080162d
Move global variables to be local for each test to reduce the likelihood
of unexpected bugs. Also parallelize the different db tests and clean up
unnecessary lines/checks.
Change-Id: I9dc3894d0945430908b10af5aeeba2f9246caf2a
Satellite DB tests will log a warning (WARN) if a full table scan is
detected. The test won't fail automatically. That's because we currently
have multiple queries that do full table scans and it's not
trivial to change them.
We may change that behavior once we figure out how to skip
specific queries from detection or once we fix all problematic queries.
https://github.com/storj/storj/issues/5471
Change-Id: Icafe782257a0d353e8bcdf6fa8a19c20b1091a0b
This change causes the bucket's partner info to be used rather than the
user's when calculating project usage prices. This ensures that users
who own differently-partnered buckets will be charged correctly for
usage based on the specific bucket they are utilizing.
Related to storj/storj-private#90
Change-Id: Ieeedfcc5451e254216918dcc9f096758be6a8961
`storage.KeyValueStore` requires ordered iteration, which redis
doesn't support natively. This would require loading all the keys
into memory and then processing them, rather than iterating over them
one-by-one.
This adds a temporary `IterateUnordered` to handle the migrations
more gracefully.
Change-Id: I55b763500523077c7ab8fdfad175c32cc7788e47
This tool is being removed because it has served its purpose and was blocking another removal from being verified.
Change-Id: Ie888aa7ae1b153a34210af3a5d5a3682b381ba82
Add migration tool (and test) to update salt column in projects table
with the SHA-256 hash of the project ID when null
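As a rough illustration of the rule, here is a sketch that fills a missing salt with the SHA-256 hash of the project ID. It assumes the ID is hashed as its raw 16 bytes; whether the migration hashes the raw bytes or some other encoding is not stated above, and the type and function names are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// projectID stands in for a 16-byte UUID (an assumption for this sketch).
type projectID [16]byte

// defaultSalt returns the SHA-256 hash of the raw project ID bytes.
func defaultSalt(id projectID) []byte {
	sum := sha256.Sum256(id[:])
	return sum[:]
}

// fillSalt keeps an existing salt and only computes one when it is nil,
// mirroring the "when null" condition.
func fillSalt(id projectID, salt []byte) []byte {
	if salt != nil {
		return salt
	}
	return defaultSalt(id)
}

func main() {
	fmt.Printf("%x\n", fillSalt(projectID{1}, nil))
}
```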
Issue https://github.com/storj/storj-private/issues/66
Change-Id: Ib8d484ac8d6ee25859064d803e2ac8fb46b45921
Add a flag to enable/disable analytics so uplink can be run
non-interactively. Also when run non-interactively for the first time
it will no longer error but will instead default to disabling analytics.
Part of https://github.com/storj/storj/issues/5126
Change-Id: I07ac8a040664334efcb4e2536f26c330c1751a6f
Add a docker image for uplink-cli and push it to docker hub.
We used to have this before the change to uplinkng. I'm not
sure if the pushing works, we'll see after merge.
To test, build an image with `make uplink-image`, read the tag from the
output and run normal uplink-cli commands using
`docker run -it storjlabs/uplink:df9bbceca-uplink-docker-go1.18.8-amd64 [command]`
Part of https://github.com/storj/uplink/issues/109
Change-Id: I8a10aab2b778951ff42a22ba2f252c581eb66b65
We were reading in a segment's stream ID and position, and assuming that
was enough for the downloader. But of course, the downloader needs
AliasPieces filled in. So now we request each segment record from the
metabase and fill in the VerifySegment records entirely.
Change-Id: If85236388eb99a65e2cb739aa976bd49ee2b2c89
This will allow us to retry some specific segments from
segments-retry.csv with particularly high counts of "retry" pieces.
Change-Id: I48fd419cc0350a3be4c9e77ce8d28871565b7f97
This change allows for overriding project usage prices for a specific
partner so that users who sign up with that partner do not need their
invoices to be manually adjusted.
Relates to storj/storj-private#90
Change-Id: Ia54a9cc7c2f8064922bbb15861f974e5dea82d5a
* use the same DB application name for satellite and metabase
* use noop orders DB implementation to avoid storing allocated bandwidth
in DB
Change-Id: I20e88c694d38240fe1a20c45719e210cfb76402c
On BSD, the storagenode-updater falls back to renaming the storagenode without
doing anything to restart the service.
Like the approach we have for linux, this change finds the process ID of the
storagenode using pgrep and sends an interrupt signal to the process.
Closes https://github.com/storj/storj/issues/5333
Change-Id: Icced90ea3e831528804784c2170a3b8b14952e8c
We have to wait until the slowest node is done being tested before we
can move on to the next batch of segments. Since the slowest node can be
arbitrarily slow, we'll set a timeout and treat too-slow nodes as
temporarily offline.
Change-Id: I80fe865dd4e8f826700430fb0140c2d3aefca381
When we are verifying pieces by downloading the first byte, if we
encounter a timeout, treat the node as if we failed to connect to it,
and log the error once instead of twice.
Change-Id: I70602d554183c98f1213f3ffb1bfec41100ea0e7
This csv file was being closed as soon as the service was created.
All subsequent writes to the closed file handle produced errors,
which were logged but otherwise ignored.
Instead, we would like the file to remain open and writable, until
the service is destroyed.
Change-Id: Ib29944d25b2f5b2d0f90fdbdcde44fea8d769321
The copyFile method has some safeguards to ensure that the multipart
write is aborted. This is accomplished by always calling abort on the
MultiWriteHandle when the copy is finished, whether or not there was a
failure or it was successfully committed. If the copy was committed,
then this RPC is a no-op on the metainfo server.
Regardless, the calls to abort constitute an additional RPC to the
satellite for no benefit. This is exacerbated by the fact that the code
currently ends up calling abort twice.
This change updates the libuplink-backed MultiWriteHandle implementation
to not call abort if the write is committed and vice-versa. This
eliminates the two wasteful RPC calls.
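A simplified sketch of the idea, using a hypothetical `multiWriteHandle` type whose `commit` and `abort` fields stand in for the RPCs to the satellite. It is not the libuplink-backed implementation, only an illustration of tracking completion so the safeguard abort becomes a local no-op.

```go
package main

import "fmt"

// multiWriteHandle is an illustrative stand-in, not the real type.
type multiWriteHandle struct {
	done   bool
	commit func() error // stands in for the commit RPC
	abort  func() error // stands in for the abort RPC
}

// Commit issues the commit RPC unless the handle is already finished.
func (h *multiWriteHandle) Commit() error {
	if h.done {
		return nil
	}
	if err := h.commit(); err != nil {
		return err // done stays false so a later Abort still cleans up
	}
	h.done = true
	return nil
}

// Abort issues the abort RPC unless the handle is already finished.
func (h *multiWriteHandle) Abort() error {
	if h.done {
		return nil
	}
	h.done = true
	return h.abort()
}

// countRPCs simulates a successful copy followed by two safeguard
// Abort calls and reports how many RPCs were actually issued.
func countRPCs() (commits, aborts int) {
	h := &multiWriteHandle{
		commit: func() error { commits++; return nil },
		abort:  func() error { aborts++; return nil },
	}
	_ = h.Commit()
	_ = h.Abort() // safeguard call, now a local no-op
	_ = h.Abort() // repeated safeguard call, also a no-op
	return commits, aborts
}

func main() {
	commits, aborts := countRPCs()
	fmt.Println(commits, aborts)
}
```

Leaving `done` unset after a failed commit is a choice made for this sketch, so that a deferred abort can still clean up; the text above does not say how the real handle treats that case.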
Change-Id: I13679234f6f473e9a93179e6791fb57eac512f25
added rangedloop for each satellite in storj-sim, to verify it works for the metrics observer,
removed identity from the rangedloop peer as we never use it, added logs on service run, added a loop
to the service instead of an endless for loop, and added an interval value to the config
Closes: https://github.com/storj/storj/issues/5414
Change-Id: Ibc3b06071b68feda4a35b45da2bbe36e22a02fc8
Previously, if any pieces are still on disqualified nodes, this tool
would treat those pieces as fine (if the disqualified node is still
online) or temporarily unavailable (if the disqualified node is
offline). Instead, we should treat such pieces as lost.
This also fixes a slight problem with the code that handles a broken
alias. This is not likely to happen, but if we do see an alias that is
not in the alias map, we return an error instead of nil.
Change-Id: Ib4e2e729ef0535dd7bd9ce2f621680d9f959891c
Because it was originally intended to work on only a few pieces from
each segment at a time, and would frequently have reset its list of
online nodes, segment-verify has been taking nodes out of its
onlineNodes set and never putting them back. This means that over a long
run in Check=0 mode, we end up treating more and more nodes as offline,
permanently. This trend obfuscates the number of missing pieces that
each segment really has, because we don't check pieces on offline nodes.
This commit changes the onlineNodes set to an "offlineNodes" set, with
an expiration time on the offline-ness quality. So nodes are added to
the offlineNodes set when we see they are offline, and then we only
treat them as offline for the next 30 minutes (configurable). After that
point, we will try connecting to them again.
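The expiring set can be sketched roughly like this. The type and method names are hypothetical, node IDs are plain strings for brevity, and the 30-minute value is just the default mentioned above passed in as a parameter.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// offlineNodes remembers until when each node should be treated as offline.
type offlineNodes struct {
	mu      sync.Mutex
	ttl     time.Duration
	expires map[string]time.Time
}

func newOfflineNodes(ttl time.Duration) *offlineNodes {
	return &offlineNodes{ttl: ttl, expires: make(map[string]time.Time)}
}

// MarkOffline records that the node was seen offline at time now.
func (s *offlineNodes) MarkOffline(id string, now time.Time) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.expires[id] = now.Add(s.ttl)
}

// IsOffline reports whether the node is still within its offline window.
// Expired entries are dropped so the node gets contacted again.
func (s *offlineNodes) IsOffline(id string, now time.Time) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	until, ok := s.expires[id]
	if !ok {
		return false
	}
	if !now.Before(until) {
		delete(s.expires, id)
		return false
	}
	return true
}

// offlineAfter marks a node offline and asks again `minutes` later,
// using a 30-minute expiration.
func offlineAfter(minutes int) bool {
	start := time.Date(2023, 1, 1, 0, 0, 0, 0, time.UTC)
	set := newOfflineNodes(30 * time.Minute)
	set.MarkOffline("node-a", start)
	return set.IsOffline("node-a", start.Add(time.Duration(minutes)*time.Minute))
}

func main() {
	fmt.Println(offlineAfter(10), offlineAfter(45))
}
```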
Change-Id: I14f0332de25cdc6ef655f923739bcb4df71e079e
The WithExists methods previously were not writing problematic pieces to
problem-pieces.csv. With this change, they will.
Change-Id: I51eadd3d8f4299e1efa787c9266a7aacfa525eb3
When this branch is followed, `audit.OutcomeFailure` is returned, and
`MarkNotFound()` is immediately called again (in
`(*NodeVerifier).Verify()`). Calling `MarkNotFound()` twice for the same
piece is not correct.
Change-Id: I1a2764bc32ed015628fcd9353ac3307f269b4bbd
It may help to know how much faster these methods are than the
alternative (asking nodes for each piece in turn).
Change-Id: Ieb7c963f62b662f72c84a49de8a09c065c14f782
It was ok as it was, but since we want to keep a close eye on progress
while the tool is running, it will help to have results written to the
output file immediately instead of after the buffer is full or the
program exits.
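For illustration, a sketch of flush-per-record writing with the standard `encoding/csv` package. The helper names and the sample record are made up, and this is not necessarily how the tool implements it.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// writeRecord writes one record and flushes right away, so the line
// reaches the underlying writer without waiting for the buffer to fill.
func writeRecord(w *csv.Writer, record []string) error {
	if err := w.Write(record); err != nil {
		return err
	}
	w.Flush()
	return w.Error()
}

// visibleAfterWrite returns what has reached the underlying writer
// after a single record, with or without the immediate flush.
func visibleAfterWrite(flush bool) string {
	var out strings.Builder
	w := csv.NewWriter(&out)
	record := []string{"stream-id", "7"}
	if flush {
		_ = writeRecord(w, record)
	} else {
		_ = w.Write(record) // stays in the csv.Writer's internal buffer
	}
	return out.String()
}

func main() {
	fmt.Printf("%q %q\n", visibleAfterWrite(true), visibleAfterWrite(false))
}
```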
Change-Id: Ie027f05771a637afb06969ec775cd32b142b7635
This change is similar to
https://review.dev.storj.io/c/storj/storj/+/7687 but applied when
uploading from stdin with parallelism > 1.
Currently, the parallelism from stdin scales up to 3 or 4, but not
greater than that. If we buffer the content from stdin more aggressively
the parallelism scales to higher levels and reaches the performance of
reading directly from a file.
Change-Id: I1f447686a88074882709992ee6d52dd262e220fb
When Check == 0 (check all pieces), there is nearly always a piece left
in the retry count, so most segments get logged in segments-retry.csv.
This change makes it so we require retry>5 before adding to
segments-retry.csv (only in the check==0 case).
Change-Id: Iaea523c27eb777e3c248c27c7ef5effe77ae54cf
This new advanced flag configures libuplink to store in-memory the
erasure-coded pieces that are temporarily created during upload.
By default, libuplink writes the erasure-coded pieces as temp files on
the disk, but this results in additional IOPS that affect the
performance in hot-rodded scenarios.
If the erasure-coded pieces are kept in-memory and the system has enough
RAM, the upload speed may be boosted by 20-30%.
The flag is added as "advanced" as we don't recommend it by default.
Co-authored-by: Stefan Benten <mail@stefan-benten.de>
Change-Id: Icc54f03b6c0bc27c97126f6f1d22748d21a15959
* better error handling when Exists method is not available on SN
* more optimal processing of response from Exists method
Change-Id: I6d61c09473e9f5ab76a4601720e8bd520767f4c2
This change creates a new independent process, the 'auditor', comparable
to the repairer, gc, and api processes. This will allow auditors to be
scaled independently of the core.
Refs: https://github.com/storj/storj/issues/5251
Change-Id: I8a29eeb0a6e35753dfa0eab5c1246048065d1e91
Now that all the reverification changes have been made and the old code
is out of the way, this commit renames the new things back to the old
names. Mostly, this involves renaming "newContainment" to "containment"
or "NewContainment" to "Containment", but there are a few other renames
that have been promised and are carried out here.
Refs: https://github.com/storj/storj/issues/5230
Change-Id: I34e2b857ea338acbb8421cdac18b17f2974f233c
Now that we are doing scalable piecewise reverifications, the code for
handling the old way of doing things (containment, pending audits,
reporting, testing) can now be removed.
Refs: https://github.com/storj/storj/issues/5230
Change-Id: Ief1a75f423eff682e8f3d57804e343b3409a6631
This adds the capability to the segment-verify tool of checking all
pieces of every indicated segment.
Pieces which could not be accessed (i.e. we couldn't get a single
byte from them) are recorded in a csv file.
I haven't been able to test this in any very meaningful way, yet, but I
am comforted by the fact that the worst things it could possibly do are
(a) download pieces too many times, and (b) miss downloading some
pieces.
Change-Id: I3aba30921572c974993363eb36d0fd5b3ae97907
Provides the `segment-verify run buckets` command for verifying segments within a list of buckets.
Bucket list is a csv file with `project_id,bucket_name` to be checked.
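A minimal sketch of reading such a bucket list, assuming two fields per row and no header row (the description above does not say whether a header is expected). The `bucketRef` type and function names are hypothetical, and the project ID is kept as an unparsed string.

```go
package main

import (
	"encoding/csv"
	"errors"
	"fmt"
	"io"
	"strings"
)

// bucketRef is one `project_id,bucket_name` row.
type bucketRef struct {
	ProjectID  string
	BucketName string
}

// readBucketList parses rows of exactly two fields from r.
func readBucketList(r io.Reader) ([]bucketRef, error) {
	cr := csv.NewReader(r)
	cr.FieldsPerRecord = 2 // reject rows with a different field count
	var buckets []bucketRef
	for {
		rec, err := cr.Read()
		if errors.Is(err, io.EOF) {
			return buckets, nil
		}
		if err != nil {
			return nil, err
		}
		buckets = append(buckets, bucketRef{ProjectID: rec[0], BucketName: rec[1]})
	}
}

// parseBucketList is a convenience wrapper for in-memory input.
func parseBucketList(s string) ([]bucketRef, error) {
	return readBucketList(strings.NewReader(s))
}

func main() {
	buckets, err := parseBucketList("project-1,photos\nproject-2,backups\n")
	fmt.Println(buckets, err)
}
```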
https://github.com/storj/storj-private/issues/101
Change-Id: I3d25c27b56fcab4a6a1aebb6f87514d6c97de3ff
Because --readonly is default true, passing something like
--disallow-deletes=false would not actually update that
value because the readonly flag would override it. This makes it
so that the --disallow-* flags override the --readonly and
--writeonly flags.
Also fixes some minor formatting issues with share like an
extra space after the "Public Access:" entry.
Simplifies the handling of the explicit "none" by making the
flags for the dates optional and using nil to signify that
the value was left unset.
Bump the go.mod to go1.18 to enable the use of generics and
add a small generic function. This can easily be backed out
if it causes problems.
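A sketch of the override logic, using nil pointers for "flag not set" and a small generic helper. All names here are hypothetical, and the mapping of --readonly to writes and deletes and of --writeonly to reads and lists is an assumption made for illustration, not taken from the text above.

```go
package main

import "fmt"

// valueOr returns *p when the flag was set explicitly, and def otherwise.
func valueOr[T any](p *T, def T) T {
	if p != nil {
		return *p
	}
	return def
}

// ptr returns a pointer to v, handy for "explicitly set" flag values.
func ptr[T any](v T) *T { return &v }

// shareFlags models the flags; nil means the --disallow-* flag was not given.
type shareFlags struct {
	Readonly, Writeonly bool
	DisallowReads       *bool
	DisallowLists       *bool
	DisallowWrites      *bool
	DisallowDeletes     *bool
}

// permission is the resulting set of allowed operations.
type permission struct {
	AllowDownload, AllowList, AllowUpload, AllowDelete bool
}

// permission lets an explicit --disallow-* value win over the broader
// --readonly and --writeonly defaults.
func (f shareFlags) permission() permission {
	return permission{
		AllowDownload: !valueOr(f.DisallowReads, f.Writeonly),
		AllowList:     !valueOr(f.DisallowLists, f.Writeonly),
		AllowUpload:   !valueOr(f.DisallowWrites, f.Readonly),
		AllowDelete:   !valueOr(f.DisallowDeletes, f.Readonly),
	}
}

func main() {
	f := shareFlags{Readonly: true, DisallowDeletes: ptr(false)}
	fmt.Printf("%+v\n", f.permission())
}
```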
Change-Id: I1c5f1321ad17b8ace778ce55561cbbfc24321a68
NewContainment will replace Containment later in this commit chain, but
for now it is not yet being used.
NewContainment will allow a node to be contained for multiple pending
reverify jobs at a time. It is implemented by way of the reverify queue.
Refs: https://github.com/storj/storj/issues/5231
Change-Id: I126eda0b3dfc4710a88fe4a5f41780618ec19101
The default 'info' level for the storagenode will dump dozens of
lines every second. This change adds the ability to configure
the log.level argument at run time using the LOG_LEVEL env variable.
Co-authored-by: Clement Sam <clementsam75@gmail.com>
Uplink doesn't have a `save` command; however, it's referred to in an error
message that's returned when the `access register` command is executed
without having any default access configured.
The correct command to mention is `import`.
Change-Id: Ia2092d02965737f421683fc98c52a51c9529b86e
Reputation updates during repair currently consume a lot of database
resources. Sometimes increasing the rate of repair is more important
than auditing a node based on whether they have or don't have the
correct piece during repair. This is the job of the audit service.
This commit is to implement an intermediate solution from this issue: https://github.com/storj/storj/issues/5089
This commit does not address the more in-depth fix discussed here: https://github.com/storj/storj/issues/4939
Change-Id: I4163b18d78a96fadf5265789fd73c8aa8def0e9f
If we are processing a list of segments (csv), we should not stop if one of
the segments is not found in the DB.
Change-Id: I720f85dc7601c2ca77032e20c1577de55092bd9b
The current option is to pass a stream id and position as input, but
that's not very efficient when we have a long list of segments to repair.
This change adds an option to read a whole csv file and process each entry
one by one.
If the command has a single argument, it is treated as the csv file
location; if it has two arguments, they are parsed as
stream id and position.
Change-Id: I1e91cf57a794d81d74af0091c24a2e7d00d1fab9
Implements logic for a satellite command to repair a single segment.
The segment will be repaired even if it's healthy; this is not checked
during the process. As part of the repair, the command will download the whole
segment into memory and try to reupload it to a number of new
nodes equal to the existing number of pieces. After a successful
upload, the new pieces will completely replace the existing pieces.
Command:
satellite repair-segment <streamid> <position>
https://github.com/storj/storj/issues/5254
Change-Id: I8e329718ecf8e457dbee3434e1c68951699007d9
This patch makes it possible to use `uplink share` in a test environment (like storj-up) where authservice doesn't have a fully secure endpoint.
This is supposed to be an undocumented feature (no flag, just a custom prefix) to avoid any confusion for regular users.
Change-Id: I256aefc944066e52c72224e7b6f1a593b5bc57f7
Add nodeevents.DB to satellite overlay service so we can insert node
events into the nodeevents DB.
Change-Id: I642c0ccc9941ecdb08cb22d5c8cf701959a55156
The new flag 'MultipleVersions' was not correctly passed from the metainfo
configuration to the metabase configuration. Because the configuration was
set correctly for unit tests, we didn't catch it, and the issue was found
while testing on the QA satellite.
This change reduces the number of places where new metabase flags need
to be propagated from the metainfo configuration, to avoid problems with
setting new flags in the future.
Fixes https://github.com/storj/storj/issues/5274
Change-Id: I74bc122649febefd87f665be2fba628f6bfd9044
The current deployment strategy requires that the GC bloomfilter generation process executes only once and exits.
Change-Id: I952991f126596aa165d1f2e9fce6f8548c21bdba
It's quite likely that we hit some time limit. Rather than giving up
immediately, let's retry after the throttle amount.
Change-Id: I20944b058d771f5d4bfa0eea7a2c26cefcd74739
We want to send emails to SNOs. Node status changes go through the
overlay service, so it's a good place to add the mail service.
Add the mailservice.Service, satellite address, and satellite name to
overlay service. Also add feature flag --overlay.send-node-emails
Change-Id: I3bd2cb3bf22f9724954ce2374f8b651b902b3a24