the progress bar was being set to inconsistent lengths
multiple times, causing a crash. this fixes that by
only setting the progress bar length once to the length
of the full object. it avoids a round trip by doing so
only after it has gotten the first read handle from the
source, so the length information is cached.
Change-Id: I112d7c79016e54ba3794e96c6174cc01b8baedb4
for very large machines (>10Gbit) it is still useful
to have parallelism for uploads because we're actually
bound by getting new pieces from the satellite, so doing
that in parallel provides a big win.
this change adds the parallelism flag back for uploads, and
removes the backwards compatibility code that tied it to
maximum-concurrent-pieces, as the two are now independent.
the upload code parallelism story is now this:
- each object is a transfer
- each transfer happens in N parts (size dynamically
chosen to avoid having >10000 parts)
- each part can happen in parallel up to the limit
specified
- each parallel part can have up to the limit of
max concurrent pieces and segments
this change also changes some defaults to be better.
- the connection pool capacity now takes into account
transfers, parallelism and max concurrent pieces
- the default smallest part size is 1GiB to allow the
new upload code path to upload multiple segments
Change-Id: Iff6709ae73425fbc2858ed360faa2d3ece297c2d
downloads still need the old copy code because they aren't
parallel in the same way uploads are. revert all the code
that removed the parallel copy, only use the non-parallel
copy for uploads, and add back the parallelism and chunk
size flags and have them set the maximum concurrent pieces
flags to values based on each other when only one is set
for backwards compatibility.
mostly reverts 54ef1c8ca2
Change-Id: I8b5f62bf18a6548fa60865c6c61b5f34fbcec14c
also change the config creation to be more robust to
changes that add defaults in the future by not fully
reconstructing the config value passed in to the
project.
Change-Id: I673e8b54ce0b951ae735bf4658525c477c26ac5a
the parallelism and parallelism-chunk-size flags,
which used to control how many parts to split a
segment into and how many to perform in parallel,
are now deprecated and replaced by
maximum-concurrent-pieces and long-tail-margin.
now, for an individual transfer, the total number
of piece uploads that transfer will perform is
controlled by maximum-concurrent-pieces, and
segments within that transfer will automatically
be performed in parallel. so if you used to set
your parallelism to n, a good value for the pieces
might be something approximately like 130*n, and
the parallelism-chunk-size is unnecessary.
Change-Id: Ibe724ca70b07eba89dad551eb612a1db988b18b9
This change is similar to
https://review.dev.storj.io/c/storj/storj/+/7687 but applied when
uploading from stdin with parallelism > 1.
Currently, the parallelism from stdin scales up to 3 or 4, but not
beyond that. If we buffer the content from stdin more aggressively,
the parallelism scales to higher levels and reaches the performance of
reading directly from a file.
Change-Id: I1f447686a88074882709992ee6d52dd262e220fb
This new advanced flag configures libuplink to store in-memory the
erasure-coded pieces that are temporarily created during upload.
By default, libuplink writes the erasure-coded pieces as temp files on
the disk, but this results in additional IOPS that affect the
performance in hot-rodded scenarios.
If the erasure-coded pieces are kept in memory and the system has enough
RAM, the upload speed may improve by 20-30%.
The flag is added as "advanced" as we don't recommend it by default.
Co-authored-by: Stefan Benten <mail@stefan-benten.de>
Change-Id: Icc54f03b6c0bc27c97126f6f1d22748d21a15959
allow multiple source paths and a single destination path.
this makes commands like `uplink cp foo* sj://bucket` work
as expected.
require at least one remote path when copying. this ensures
that users don't accidentally overwrite their local files
with other local files, which is almost never what they wanted
because they would just use cp.
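
a minimal sketch of the argument rule described above, not the
actual uplink code. splitCopyArgs and isRemote are hypothetical
names; the only thing taken from this message is that remote paths
use the sj:// scheme and that the last argument is the destination.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// isRemote reports whether p uses the sj:// scheme shown in the example above.
func isRemote(p string) bool {
	return strings.HasPrefix(p, "sj://")
}

// splitCopyArgs is a hypothetical helper: the last argument is the
// destination and everything before it is a source. It rejects calls
// where no path at all is remote.
func splitCopyArgs(args []string) (sources []string, dest string, err error) {
	if len(args) < 2 {
		return nil, "", errors.New("need at least one source and one destination")
	}
	sources, dest = args[:len(args)-1], args[len(args)-1]

	for _, p := range args {
		if isRemote(p) {
			return sources, dest, nil
		}
	}
	return nil, "", errors.New("at least one remote path is required")
}

func main() {
	// The shell expands foo* into several sources before the command runs.
	sources, dest, err := splitCopyArgs([]string{"foo1", "foo2", "sj://bucket"})
	fmt.Println(sources, dest, err)

	// Local-to-local copies are refused.
	_, _, err = splitCopyArgs([]string{"a.txt", "b.txt"})
	fmt.Println(err)
}
```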
Change-Id: I28948f4ff735d29db06de81fc8c2a15b9f4ee3f5
As a reminder: the latest clingy no longer requires a custom context type (which made context.WithValue harder to use) and uses a plain context instead.
Clingy stores stdin/stdout/stderr in the context (previously in a separate context type) to make them available for unit testing.
Change-Id: I8896574f4670721de43a577cd4b35952e3b5d00e
Currently, piping to stdout is synchronous, so the --parallelism
flag gives no advantage. This change adds a buffer when writing
to stdout. Each part is first read into the buffer and flushed
only once all data from that part has been read.
https://github.com/storj/uplink/issues/105
Change-Id: I07bec0f4864dc4fccb42224e450d85d4d196f2ee
The main issue was that when one part copy failed inside a
goroutine (limiter) while another part was still collecting its src/dst
parts, the errors from the failed part copy could be dropped. This was
possible because the context is canceled on failure, so the code still
getting the src/dst part returned an error immediately and the error
group holding the errors from the goroutines was ignored.
Change-Id: I75c6799eba358741629795f2971c7a964cb2c9ce
A few improvements were made to how we handle errors during
parallel upload/download of a single object:
* stop hiding the real error behind 'context canceled', which was
shown in most cases
* add the part number to the error message
* don't try to commit if any error occurs during the operation
* combine errors into a more readable form, for example:
---
failed to download part 3: uplink: eestream: failed to download stripe 0:
error retrieving piece 00: ecclient: piecestore: rpc: tcp connector failed: rpc: dial tcp 97.119.158.36:28967: i/o timeout
...
error retrieving piece 89: ecclient: piecestore: rpc: tcp connector failed: rpc: dial tcp 161.129.152.194:28967: i/o timeout
failed to download part 1: uplink: eestream: failed to download stripe 0:
error retrieving piece 01: io: read/write on closed pipe
...
error retrieving piece 97: io: read/write on closed pipe
failed to download part 2: uplink: eestream: failed to download stripe 0:
error retrieving piece 00: io: read/write on closed pipe
...
error retrieving piece 01: ecclient: piecestore: rpc: tcp connector failed: rpc: dial tcp 180.183.132.234:28967: operation was canceled
error retrieving piece 96: io: read/write on closed pipe
main.(*cmdCp).parallelCopy:418
main.(*cmdCp).copyFile:262
main.(*cmdCp).Execute:156
main.(*external).Wrap:123
github.com/zeebo/clingy.(*Environment).dispatchDesc:126
github.com/zeebo/clingy.(*Environment).dispatch:53
github.com/zeebo/clingy.Environment.Run:34
main.main:26
runtime.main:250
---
Change-Id: I9bb70b3f754567761fa8d17bef8ef59b0709e33b
At some point the uplink CLI lost the ability to set metadata. This
change brings the functionality back for the 'cp' operation.
https://github.com/storj/storj/issues/3848
Change-Id: Ia5f60eb577fcab8a38d94730d8cdc6e0338d3b46
Uplink can upload from stdin and download to stdout. We had
such tests for the old binary, but they were missing for the new one.
Change-Id: I5110a9f531f5cc21277fa53611995fb5b556ff16
This change allows fetching the file size more easily (for supported
files) so that the multipart part size can be calculated from it
afterwards.
Change-Id: Idabba4c2ee794ee471973889f5843174a7acad35
This change allows the uplink to bump the part size based on the
content length that is being copied. This ensures we are staying
below the 10k part limit currently enforced on the satellites.
If the user specifies the flag, it will error out if the value
chosen by the user is too low. Otherwise it will use it.
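
A minimal sketch of this rule, under stated assumptions:
choosePartSize is a hypothetical name, basePartSize (64 MiB) is an
illustrative default that does not come from this change, and
rounding up to a multiple of the base size is a design choice of
the sketch. Only the 10k part limit and the "error if the user
value is too low" behavior come from the text above.

```go
package main

import "fmt"

// maxParts is the 10k part limit mentioned above.
const maxParts = 10000

// basePartSize is an assumed default for illustration only.
const basePartSize int64 = 64 << 20 // 64 MiB

// choosePartSize returns the part size for an object of contentLength bytes.
// userPartSize == 0 means the flag was not set.
func choosePartSize(contentLength, userPartSize int64) (int64, error) {
	// Smallest part size that keeps the part count at or below maxParts.
	minSize := (contentLength + maxParts - 1) / maxParts

	if userPartSize > 0 {
		if userPartSize < minSize {
			return 0, fmt.Errorf(
				"part size %d is too small for %d bytes: need at least %d",
				userPartSize, contentLength, minSize)
		}
		return userPartSize, nil
	}

	if minSize <= basePartSize {
		return basePartSize, nil
	}
	// Bump to the next multiple of the base size.
	multiples := (minSize + basePartSize - 1) / basePartSize
	return multiples * basePartSize, nil
}

func main() {
	for _, length := range []int64{1 << 30, 1 << 40} {
		size, err := choosePartSize(length, 0)
		fmt.Println(length, size, err)
	}
	_, err := choosePartSize(1<<40, 1<<20)
	fmt.Println(err)
}
```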
Change-Id: I00d30f603d941c2f7703ba19d5923e668629a7b9
When ctx is cancelled, the limiter won't start a new goroutine.
The code didn't immediately return an error in that case.
The dst.Commit(ctx) would fail anyway due to the cancelled ctx.
However, we can make the behavior clearer by returning immediately.
Change-Id: I65df7ca85de55813f3200a50db2eaaa7a297ba2c
Also ensure that abort is given at least 5 seconds to clear up any
pending uploads on cancellation.
Change-Id: I814aa407ee5783f2609a76b54de2879dcd5f89bb
If the cp command is executed with a higher level of parallelism, it
opens more connections to storage nodes at the same time. Therefore, the
connection pool capacity should be expanded accordingly.
The pool capacity is set to 100 * parallelism.
Change-Id: Ia8b3ab6a99340d8cbb87a7b80c3354b2b21c1958
I don't think it should matter for correctness whether this matches the
segment size or not, so I think there is something else wrong. However,
making this change seems to eliminate the "corruption when ulimit -n is
too low" problem we're seeing right now.
Change-Id: I232fe0d0a371b86ddf902e8c2d4778e140b2f1fc
When copying an object from the CLI you can now set the expiry.
It uses the same datetime format as restricting access grants.
Closes https://github.com/storj/storj/issues/4595
Change-Id: Icab73a64a9589817d6bc6d702b765b166ca1350d