Commit Graph

29 Commits

Author SHA1 Message Date
Jeff Wendling
b70fb2f87f cmd/uplink: fix progress bar crash
the progress bar was being set to inconsistent lengths
multiple times, causing a crash. this fixes that by
only setting the progress bar length once to the length
of the full object. it avoids a round trip by doing so
only after it has gotten the first read handle from the
source, so the length information is cached.

Change-Id: I112d7c79016e54ba3794e96c6174cc01b8baedb4
2023-08-15 13:10:03 +00:00
Jeff Wendling
1cbad0fcab cmd/uplink: add back parallelism
for very large machines (>10Gbit) it is still useful
to have parallelism for uploads because we're actually
bound by getting new pieces from the satellite, so doing
that in parallel provides a big win.

this change adds that flag back for uploads, and removes
the backwards compatibility code tying the flag to
maximum-concurrent-pieces, as they are now independent.

the upload code parallelism story is now this:

    - each object is a transfer
    - each transfer happens in N parts (size dynamically
      chosen to avoid having >10000 parts)
    - each part can happen in parallel up to the limit
      specified
    - each parallel part can have up to the limit of
      max concurrent pieces and segments

this change also changes some defaults to be better.

    - the connection pool capacity now takes into account
      transfers, parallelism and max concurrent pieces
    - the default smallest part size is 1GiB to allow the
      new upload code path to upload multiple segments

Change-Id: Iff6709ae73425fbc2858ed360faa2d3ece297c2d
2023-08-14 20:28:58 -04:00
Jeff Wendling
f3c58174c4 cmd/uplink: only use new code path for uploads
downloads still need the old copy code because they aren't
parallel in the same way uploads are. revert all the code
that removed the parallel copy, only use the non-parallel
copy for uploads, and add back the parallelism and chunk
size flags and have them set the maximum concurrent pieces
flags to values based on each other when only one is set
for backwards compatibility.

mostly reverts 54ef1c8ca2

Change-Id: I8b5f62bf18a6548fa60865c6c61b5f34fbcec14c
2023-06-09 23:45:30 +00:00
Jeff Wendling
ccfe5cae49 cmd/uplink: pass in a maximum concurrent segments value
also change the config creation to be more robust to
changes that add defaults in the future by not fully
reconstructing the config value passed in to the
project.

Change-Id: I673e8b54ce0b951ae735bf4658525c477c26ac5a
2023-05-25 10:53:50 -04:00
Jeff Wendling
a0fbc87b31 cmd/uplink: add upload log file flag
Change-Id: I740eaf431cfe5bb3dcb746e4128dea5efc935257
2023-05-25 14:14:00 +00:00
Egon Elbre
d94207048a go.mod: bump vbauerster/mpb/v8 for fixes
Change-Id: I2d24eec4389ca8e5effdbfc25d7012ce083d46f5
2023-04-18 15:53:35 +03:00
Jeff Wendling
54ef1c8ca2 cmd/uplink: use new upload code path
the parallelism and parallelism-chunk-size flags
which used to control how many parts to split a
segment into and how many to perform in parallel
are now deprecated and replaced by
maximum-concurrent-pieces and long-tail-margin.

now, for an individual transfer, the total number
of piece uploads that transfer will perform is
controlled by maximum-concurrent-pieces, and
segments within that transfer will automatically
be performed in parallel. so if you used to set
your parallelism to n, a good value for the pieces
might be something approximately like 130*n, and
the parallelism-chunk-size is unnecessary.

Change-Id: Ibe724ca70b07eba89dad551eb612a1db988b18b9
2023-04-13 16:52:38 -04:00
Andrew Harding
e676b5c893 cmd/uplink: progress bars for recursive copy
```
$ uplink cp -r -t=3 files/ sj://files
uploading 6 files...
files/bar/buz   (2 of 6) 134.22 MB / 134.22 MB [============================================] 100.00% 36.43 MiB/s
files/bar/baz   (1 of 6) 67.11 MB / 67.11 MB [==============================================] 100.00% 18.39 MiB/s
files/boo       (3 of 6) 67.11 MB / 67.11 MB [==============================================] 100.00% 20.42 MiB/s
files/foo       (4 of 6) 67.11 MB / 67.11 MB [==============================================] 100.00% 57.83 MiB/s
files/glue/flew (5 of 6) 67.11 MB / 67.11 MB [==============================================] 100.00% 55.01 MiB/s
files/stew      (6 of 6) 67.11 MB / 67.11 MB [==============================================] 100.00% 91.43 MiB/s
```

Change-Id: Ibd9d07a1291f7a599bd27fba93c1b2e0f17dc787
2023-04-10 15:13:22 +00:00
Kaloyan Raev
56896353b6 cmd/uplink: add buffering while reading from stdin
This change is similar to
https://review.dev.storj.io/c/storj/storj/+/7687 but applied when
uploading from stdin with parallelism > 1.

Currently, the parallelism from stdin scales up to 3 or 4, but not
beyond that. If we buffer the content from stdin more aggressively,
the parallelism scales to higher levels and reaches the performance of
reading directly from a file.

Change-Id: I1f447686a88074882709992ee6d52dd262e220fb
2022-12-23 16:40:54 +00:00
Kaloyan Raev
bfd189c3b0 cmd/uplink: add --inmemory-erasure-coding flag to cp command
This new advanced flag configures libuplink to store in-memory the
erasure-coded pieces that are temporarily created during upload.

By default, libuplink writes the erasure-coded pieces as temp files on
the disk, but this results in additional IOPS that affect the
performance in hot-rodded scenarios.

If the erasure-coded pieces are kept in memory and the system has enough
RAM, the upload speed may be boosted by 20-30%.

The flag is added as "advanced" as we don't recommend it by default.

Co-authored-by: Stefan Benten <mail@stefan-benten.de>

Change-Id: Icc54f03b6c0bc27c97126f6f1d22748d21a15959
2022-12-22 19:48:58 +00:00
Jeff Wendling
fa4af92392 cmd/uplink: improve cp behavior
allow multiple source paths and a single destination path.
this makes commands like `uplink cp foo* sj://bucket` work
as expected.

require at least one remote path when copying. this ensures
that users don't accidentally overwrite their local files
with other local files, which is almost never what they wanted
because they would just use cp.

Change-Id: I28948f4ff735d29db06de81fc8c2a15b9f4ee3f5
2022-09-27 10:23:41 +00:00
Márton Elek
ea1408f7a8 go.mod: bump clingy dependency
As a reminder: the latest clingy removed the requirement of having a custom context (which made the usage of context.WithValue harder) and uses a plain context instead.

Clingy saves stdin/stdout/stderr to the context (previously to a separate context type) to make them available for unit testing.

Change-Id: I8896574f4670721de43a577cd4b35952e3b5d00e
2022-08-31 10:24:27 +00:00
Michał Niewrzał
7e387af010 cmd/uplink: add buffering while writing to stdout
Currently, piping to stdout is synchronous, so the --parallelism flag
gives no advantage. This change adds a buffer for writes to stdout.
Each part is first read into the buffer and flushed only once all of
the data for that part has been read.

https://github.com/storj/uplink/issues/105

Change-Id: I07bec0f4864dc4fccb42224e450d85d4d196f2ee
2022-06-09 15:10:04 +00:00
Michał Niewrzał
ffbb43ddbc cmd/uplink: fix how we are collecting errors while copy in parallel
The main issue was that when one part copy failed inside a goroutine
(limiter) while the code was still collecting src/dst parts, the errors
from the failed part copy could be dropped. This happened because the
context is canceled on failure: if we were still getting a src/dst
part, that call returned an error immediately and the error group
holding the errors from the goroutines was ignored.

Change-Id: I75c6799eba358741629795f2971c7a964cb2c9ce
2022-05-31 10:18:51 +00:00
Michał Niewrzał
4f2fae4f28 cmd/uplink: better error handling for parallel transfer
A few improvements were made to how we handle errors during
parallel upload/download of a single object:
* surface the underlying error instead of the 'context canceled' that
was shown in most cases
* add the part number to the error message
* don't try to commit if any error occurs during the operation
* combine errors into a more readable form, for example:

---
failed to download part 3: uplink: eestream: failed to download stripe 0:
error retrieving piece 00: ecclient: piecestore: rpc: tcp connector failed: rpc: dial tcp 97.119.158.36:28967: i/o timeout
...
error retrieving piece 89: ecclient: piecestore: rpc: tcp connector failed: rpc: dial tcp 161.129.152.194:28967: i/o timeout
failed to download part 1: uplink: eestream: failed to download stripe 0:
error retrieving piece 01: io: read/write on closed pipe
...
error retrieving piece 97: io: read/write on closed pipe
failed to download part 2: uplink: eestream: failed to download stripe 0:
error retrieving piece 00: io: read/write on closed pipe
...
error retrieving piece 01: ecclient: piecestore: rpc: tcp connector failed: rpc: dial tcp 180.183.132.234:28967: operation was canceled
error retrieving piece 96: io: read/write on closed pipe
	main.(*cmdCp).parallelCopy:418
	main.(*cmdCp).copyFile:262
	main.(*cmdCp).Execute:156
	main.(*external).Wrap:123
	github.com/zeebo/clingy.(*Environment).dispatchDesc:126
	github.com/zeebo/clingy.(*Environment).dispatch:53
	github.com/zeebo/clingy.Environment.Run:34
	main.main:26
	runtime.main:250
---

Change-Id: I9bb70b3f754567761fa8d17bef8ef59b0709e33b
2022-05-27 14:00:35 +00:00
Paul Willoughby
1d97b2c855 cmd/uplink: add use - for stdout,stdin to cp help
Change-Id: Ife3a0972d1be119a73eaefc0e23407b74fe03f54
2022-05-26 10:27:20 -06:00
Michał Niewrzał
d90ce467fc cmd/uplink: bring back --metadata for cp command
At some point the uplink CLI lost the ability to set metadata. This
change brings the functionality back for the 'cp' operation.

https://github.com/storj/storj/issues/3848

Change-Id: Ia5f60eb577fcab8a38d94730d8cdc6e0338d3b46
2022-05-18 15:58:53 +00:00
Michał Niewrzał
f9a3f19443 scripts: add 'uplink cp' tests for stdin/stdout
Uplink can upload from stdin and download to stdout. We had
such tests for the old binary, but they were missing for the new one.

Change-Id: I5110a9f531f5cc21277fa53611995fb5b556ff16
2022-05-18 08:42:04 +00:00
Stefan Benten
7afdb15fc8 cmd/uplink: adding Length Method to MultiReadHandle
This change allows fetching the file size more easily (for supported
files) in order to afterwards calculate the multipart part size
accordingly.

Change-Id: Idabba4c2ee794ee471973889f5843174a7acad35
2022-05-13 21:31:45 +02:00
Stefan Benten
5e4ec0b3be cmd/uplink: adjust multipart part size based on file size
This change allows the uplink to bump the part size based on the
content length that is being copied. This ensures we are staying
below the 10k part limit currently enforced on the satellites.

If the user specifies the flag, it will error out if the value
chosen by the user is too low. Otherwise it will use it.

Change-Id: I00d30f603d941c2f7703ba19d5923e668629a7b9
2022-05-13 21:31:23 +02:00
Jeff Wendling
f25ead5f98 cmd/uplink: set default parallelism to 1
Change-Id: Ic4198131c9958cc864fd861f983e32776bf56595
2022-04-26 22:55:11 +00:00
Egon Elbre
1ed36e9fea cmd/uplink: make clearer ctx cancellation path in copy
When ctx is cancelled limiter won't start a new goroutine.
The code didn't immediately return an error in that case.

The dst.Commit(ctx) would fail anyways due to a cancelled ctx.
However, we can make the behavior clearer by returning immediately.

Change-Id: I65df7ca85de55813f3200a50db2eaaa7a297ba2c
2022-04-25 18:16:46 +03:00
Egon Elbre
847ddaaab0 cmd/uplink: cancel on failed copy
Also ensure that abort is given at least 5 seconds to clear up any
pending uploads on cancellation.

Change-Id: I814aa407ee5783f2609a76b54de2879dcd5f89bb
2022-04-22 14:57:24 +03:00
Kaloyan Raev
978e0f1a26 cmd/uplink: cp sets connection pool capacity based on parallelism
If the cp command is executed with a higher level of parallelism, it would
open more connections to storage nodes at the same time. Therefore, the
connection pool capacity should be expanded accordingly.

The pool capacity is set to 100 * parallelism.

Change-Id: Ia8b3ab6a99340d8cbb87a7b80c3354b2b21c1958
2022-04-21 14:10:08 +00:00
paul cannon
1422a1ff19 cmd/uplink: use 64 MiB for parallel chunk size, not 64 MB
I don't think it should matter for correctness whether this matches the
segment size or not, so I think there is something else wrong. However,
making this change seems to eliminate the "corruption when ulimit -n is
too low" problem we're seeing right now.

Change-Id: I232fe0d0a371b86ddf902e8c2d4778e140b2f1fc
2022-04-19 12:08:08 -05:00
Erik van Velzen
61a47f3e95 cmd/uplink: refactor date parsing
Change-Id: I6a5cbdf86eecdc5578f3dae7a8ab1b0d4485e1da
2022-04-05 01:03:20 +00:00
Qweder93
2a7b20e8e4 cmd/uplink: integrate server-side copy with uplink cp command
Resolves https://github.com/storj/storj/issues/4486

Change-Id: I42ac2ad2e1a05df4a83606f1990b639f08791403
2022-03-31 09:25:29 +00:00
Erik van Velzen
85fa78eae7 cmd/uplink: support expires in copy
When copying an object from cli you can now set the expiry.
It uses the same datetime format as restricting access grants.

Closes https://github.com/storj/storj/issues/4595

Change-Id: Icab73a64a9589817d6bc6d702b765b166ca1350d
2022-03-07 02:43:51 +01:00
Jeff Wendling
9061dd309f cmd/uplinkng: become cmd/uplink
Change-Id: If426c32219d32044d715ab6dfa9718807f32cb9f
2022-02-09 17:02:21 +00:00