Since we introduced the embedded updater, environment variables could
no longer be used safely to configure storagenode.
For example, STORJ_DEBUG_ADDR=localhost:11111 would try to set debug
port 11111 for both storagenode and storagenode-updater, causing a
port conflict.
This small change makes it possible to configure storagenode-updater
with STORJUPDATER_... environment variables.
Tested by creating a custom image and installing it on my own storage
node.
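A minimal sketch of the idea (the helper and wiring here are
illustrative, not the actual config code):

```go
package main

import (
	"fmt"
	"os"
)

// lookupEnv resolves a setting under a process-specific prefix, so
// STORJ_DEBUG_ADDR and STORJUPDATER_DEBUG_ADDR no longer collide.
func lookupEnv(prefix, key string) (string, bool) {
	return os.LookupEnv(prefix + key)
}

func main() {
	if addr, ok := lookupEnv("STORJUPDATER_", "DEBUG_ADDR"); ok {
		fmt.Println("updater debug address:", addr)
	}
}
```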
Change-Id: I6b0a601a4dc63d2d1ff3c191ae89981434e55c30
Sessions now expire after a much shorter amount of time, requiring
clients to issue API requests for session extension. This is handled
behind the scenes as the user interacts with the page, but once session
expiration is imminent, a modal appears which informs the user of
their inactivity and presents the choice of logging out or preserving
the session.
Change-Id: I68008d45859c814a835d65d882ad5ad2199d618e
When encoding structs into JSON, byte slices are marshaled as
base64-encoded strings using base64.StdEncoding.Encode():
ea9c3fd42d/src/encoding/json/encode.go (L833-L861)
We, however, expect API secrets to be encoded as base64URL, so when a
marshaled secret (with byte slice type) is added to the multinode
dashboard, it fails with `illegal base64 data at input byte XX`.
This change switches the type of the APISecret field in the
multinode/nodes.Nodes struct from []byte to the multinodeauth.Secret
type.
multinodeauth.Secret is extended with custom MarshalJSON and
UnmarshalJSON methods which implement the json.Marshaler and
json.Unmarshaler interfaces, respectively.
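A rough sketch of what these methods might look like (the real Secret
definition and error handling may differ):

```go
package multinodeauth

import (
	"encoding/base64"
	"encoding/json"
	"errors"
)

// Secret is a stand-in definition; the actual type may differ.
type Secret [32]byte

// MarshalJSON implements json.Marshaler, emitting base64URL instead
// of the base64.StdEncoding that encoding/json applies to plain byte
// slices.
func (secret Secret) MarshalJSON() ([]byte, error) {
	return json.Marshal(base64.URLEncoding.EncodeToString(secret[:]))
}

// UnmarshalJSON implements json.Unmarshaler.
func (secret *Secret) UnmarshalJSON(data []byte) error {
	var encoded string
	if err := json.Unmarshal(data, &encoded); err != nil {
		return err
	}
	decoded, err := base64.URLEncoding.DecodeString(encoded)
	if err != nil {
		return err
	}
	if len(decoded) != len(secret) {
		return errors.New("invalid secret length")
	}
	copy(secret[:], decoded)
	return nil
}
```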
Resolves https://github.com/storj/storj/issues/4949
Change-Id: Ib14b5f49ceaac109620c25d7ff83be865c698343
Added a Setup call on the access maker in cmd_access_setup so that
flags are applied during the command call.
Closes https://github.com/storj/storj/issues/4766
Change-Id: I0c75f224414099573b021b18b87d9e17192cecc5
It's possible that content was not being flushed from processes.
For now, ignore other process failures under storj-sim network test.
Once we get other processes stable, we can repropagate the error.
Change-Id: I01ed572d7c779ab6451124f1e24e3d1168b3ea79
This allows one to specify a trace ID and causes any remote spans to
be sent up to the configured destination. It doesn't collect any
local traces.
Change-Id: Ia87e294bb276d966f9f3dbfbaf6e7916b1ec7af9
Some nodes were added to the nodes table due to a bug in the
QUIC-based storagenode contact code. This is a tool to clean up these
nodes.
Delete with batch-size 1k seems to take ~400ms on local CockroachDB.
Change-Id: Ic0c1180528c27952e19c431fc9cc327292a10a5f
Don't abbreviate multinode in the command help message because there
isn't a need for it and the abbreviation isn't clear at all.
Change-Id: I7a1f2be6ae1f7d4b287c18c48b22c630549b731f
It seems the tests relied on time.Now(), which might cause some
discrepancies in calculations. Use a fixed time.Now() rather than
recalculating.
As a side fix, remove the "Test" prefix from t.Run names. These are
unnecessary.
Change-Id: I1de903fcf0fcf46fc8e3acf2463e17239b8e3cc6
Current piping to stdout is synchronous, so we don't get any
advantage from the --parallelism flag. This change adds buffering
while writing to stdout: each part is first read into the buffer and
flushed only when all data has been read from that part.
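Roughly, the buffering looks like this (the function shape is
hypothetical):

```go
package main

import (
	"bytes"
	"io"
)

// writePart reads one part fully into memory and flushes it to dst in
// a single write, so parts can be fetched in parallel while stdout
// still receives each part's data contiguously.
func writePart(dst io.Writer, part io.Reader) error {
	var buf bytes.Buffer
	if _, err := io.Copy(&buf, part); err != nil {
		return err
	}
	_, err := dst.Write(buf.Bytes())
	return err
}
```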
https://github.com/storj/uplink/issues/105
Change-Id: I07bec0f4864dc4fccb42224e450d85d4d196f2ee
This change adds project ID and bucket name columns to the generated
partner attribution report. Attribution values are now summed based on
their project ID and bucket name in addition to their user agent.
Additionally, the command to generate the attribution report has been
modified to optionally include only certain user agents.
Change-Id: I61a1d854379134f26b31467d9e83a787beb451dd
The admin UI assets aren't inside of the `web` directory; they are
directly in the `satellite` one.
The invalid path caused storj-sim to generate a satellite
configuration with an invalid path to the Admin UI static assets, so
the UI didn't load by default.
Change-Id: I49fb289377f51634057173690fbd8cf863ca9a9d
The main issue was that when one part copy failed inside a goroutine
(limiter) while another part was still collecting src/dst parts, it
was possible to drop errors from the failed part copy. This could
happen because the context was canceled on failure; if we were still
getting a part's src/dst, that returned an error immediately and the
error group holding the goroutines' errors was ignored.
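A sketch of the fixed flow (names hypothetical; the real code
differs):

```go
package main

import (
	"context"
	"errors"
	"io"

	"golang.org/x/sync/errgroup"
)

// copyParts schedules part copies; when collecting the next part
// fails (possibly because a failed copy canceled ctx), it still waits
// on the group so errors from in-flight copies are not dropped.
func copyParts(
	ctx context.Context,
	next func(context.Context) (part int, err error),
	copyPart func(context.Context, int) error,
) error {
	var group errgroup.Group
	for {
		part, err := next(ctx)
		if errors.Is(err, io.EOF) {
			return group.Wait() // all parts scheduled
		}
		if err != nil {
			// prefer the goroutine's error over a bare "context canceled"
			if copyErr := group.Wait(); copyErr != nil {
				return copyErr
			}
			return err
		}
		group.Go(func() error { return copyPart(ctx, part) })
	}
}
```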
Change-Id: I75c6799eba358741629795f2971c7a964cb2c9ce
A few improvements were made to how we handle errors while doing a
parallel upload/download for a single object:
* unhide the error previously masked as 'context canceled', which was
shown in most cases
* add the part number to the error message
* don't try to commit if any error occurs during the operation
* combine errors into a more readable form, for example:
---
failed to download part 3: uplink: eestream: failed to download stripe 0:
error retrieving piece 00: ecclient: piecestore: rpc: tcp connector failed: rpc: dial tcp 97.119.158.36:28967: i/o timeout
...
error retrieving piece 89: ecclient: piecestore: rpc: tcp connector failed: rpc: dial tcp 161.129.152.194:28967: i/o timeout
failed to download part 1: uplink: eestream: failed to download stripe 0:
error retrieving piece 01: io: read/write on closed pipe
...
error retrieving piece 97: io: read/write on closed pipe
failed to download part 2: uplink: eestream: failed to download stripe 0:
error retrieving piece 00: io: read/write on closed pipe
...
error retrieving piece 01: ecclient: piecestore: rpc: tcp connector failed: rpc: dial tcp 180.183.132.234:28967: operation was canceled
error retrieving piece 96: io: read/write on closed pipe
main.(*cmdCp).parallelCopy:418
main.(*cmdCp).copyFile:262
main.(*cmdCp).Execute:156
main.(*external).Wrap:123
github.com/zeebo/clingy.(*Environment).dispatchDesc:126
github.com/zeebo/clingy.(*Environment).dispatch:53
github.com/zeebo/clingy.Environment.Run:34
main.main:26
runtime.main:250
---
Change-Id: I9bb70b3f754567761fa8d17bef8ef59b0709e33b
At some point the uplink CLI lost the ability to set metadata. This
change brings back this functionality for the 'cp' operation.
https://github.com/storj/storj/issues/3848
Change-Id: Ia5f60eb577fcab8a38d94730d8cdc6e0338d3b46
Uplink can upload from stdin and download to stdout. We had such
tests for the old binary, but they were missing for the new one.
Change-Id: I5110a9f531f5cc21277fa53611995fb5b556ff16
If you somehow get an invalid access grant in your config file, it'd
be nice to be able to list it, delete it, and so on.
Change-Id: I7e335bf32353f294d5abb6a7c5f8f3aa18f2f6a7
The current supervisord configuration sets up the HTTP server to
listen on a TCP socket which is private, i.e. available only on
localhost. This poses a regression where multiple containers cannot
be run if the host network interface is used, i.e. when the docker
container is run with the `--network host` option.
This change adds a new env variable `SUPERVISOR_SERVER`, with
potential values `unix | private_port | public_port`, where `unix` is
the default value.
By default, the HTTP server is now set to listen on a UNIX
domain socket.
The file path is set to `/etc/supervisor/supervisor.sock`
instead of the /tmp directory since some systems
periodically delete older files in /tmp. If the socket file is
deleted, supervisorctl will be unable to connect to supervisord.
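For the default `unix` mode, the relevant supervisord sections might
look roughly like this (illustrative, not the exact shipped config):

```
[unix_http_server]
file=/etc/supervisor/supervisor.sock

[supervisorctl]
serverurl=unix:///etc/supervisor/supervisor.sock
```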
When SUPERVISOR_SERVER is set to `public_port` or `private_port`,
the HTTP server is set to listen on a TCP socket.
Resolves https://github.com/storj/storj/issues/4661
Change-Id: I224836dcae0293bcfe49874f2748be7723944687
This change allows fetching the file size more easily (for supported
files) in order to afterwards calculate the multipart part size
accordingly.
Change-Id: Idabba4c2ee794ee471973889f5843174a7acad35
This change allows the uplink to bump the part size based on the
content length that is being copied. This ensures we are staying
below the 10k part limit currently enforced on the satellites.
If the user specifies the flag, the command errors out if the chosen
value is too low; otherwise the user's value is used.
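The sizing rule amounts to something like this (the limit constant
and helper are assumptions, not the exact uplink code):

```go
package main

import "fmt"

// maxParts mirrors the satellite-enforced part limit mentioned above.
const maxParts = 10000

// choosePartSize picks the smallest part size that keeps the part
// count under the limit, or validates a user-provided size.
func choosePartSize(contentLength, requested int64) (int64, error) {
	minSize := (contentLength + maxParts - 1) / maxParts // ceiling division
	if requested > 0 {
		if requested < minSize {
			return 0, fmt.Errorf("part size %d is too small, need at least %d", requested, minSize)
		}
		return requested, nil
	}
	return minSize, nil
}
```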
Change-Id: I00d30f603d941c2f7703ba19d5923e668629a7b9
Things that make debugging easier.
* Added logging to automatic link clicking to make it obvious when
it fails.
* Added monitoring to oidc.
* Made dbx create calls noreturn for oauth_*
Change-Id: I37397b4e84ce5bfd82954aed9c38fdfd52595f24
Recursive copy had a bug with relative local paths. This fixes that
bug and changes the test framework to use more of the code that
actually runs in uplink, mocking out only the direct interaction with
the operating system.
Change-Id: I9da2a80bfda8f86a8d05879b87171f299f759c7e
Implement a buffer for inserting repair items into the queue in a batch.
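In outline, the buffer behaves like this (types and names are
stand-ins for the real queue code):

```go
package main

import "context"

// Queue stands in for the repair queue's batch insert.
type Queue interface {
	InsertBatch(ctx context.Context, items []Item) error
}

type Item struct{ StreamID string }

// batcher accumulates items and writes them out in one batch once
// the buffer is full; callers flush any remainder at the end.
type batcher struct {
	queue Queue
	batch []Item
	size  int
}

func (b *batcher) Insert(ctx context.Context, item Item) error {
	b.batch = append(b.batch, item)
	if len(b.batch) < b.size {
		return nil
	}
	return b.Flush(ctx)
}

func (b *batcher) Flush(ctx context.Context) error {
	if len(b.batch) == 0 {
		return nil
	}
	err := b.queue.InsertBatch(ctx, b.batch)
	b.batch = b.batch[:0]
	return err
}
```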
Part of https://github.com/storj/storj/issues/4727
Change-Id: I718472b2f2b1f4993c3d6f15c44923776407155a
The new storagenode base image version contains the fix for the
failing "processes" supervisord event listener.
Resolves https://github.com/storj/storj/issues/4772
Change-Id: I6d67aa6f85ee33cd9abe6a663e4f9a84ea57fdbf
/bin/stop-supervisor fails in a POSIX shell since the standard read
utility takes at least one variable name as an argument. Changing the
header from #!/bin/sh to #!/bin/bash fixes this issue; `read` with no
variable name works in bash.
It looks like the shell in alpine isn't POSIX-compliant, so we didn't
encounter this issue on alpine.
Also, I changed the name from "processes" to "processes-exit-eventlistener"
to make it clearer in the logs since supervisord spawns event listeners as
separate processes.
Change-Id: Ife9378c2013e2eb54f2adcd52a163d64eaacbbab
When running the docker auto-updater image as non-root user,
supervisord logs a "CRIT could not write pidfile /run/supervisord.pid"
since the user does not have permission to the /run directory.
Changing the location to /etc/supervisor fixes it because permissions
are set for non-root access of the /etc/supervisor directory.
Closes https://github.com/storj/storj/issues/4730
Change-Id: Id463f3a08db44dd9283921ece4575abdad9bd7f2
With this change users can use the uplink CLI in scripts (i.e. bash)
more easily, since the output can be switched to a more easily
processed JSON format. It keeps the default of tabbed output.
Change-Id: I37e2c55f75c2250c3119fd8df8b66a766ff9096b
When ctx is cancelled, the limiter won't start a new goroutine.
The code didn't immediately return an error in that case.
The dst.Commit(ctx) would fail anyways due to a cancelled ctx.
However, we can make the behavior clearer by returning immediately.
Change-Id: I65df7ca85de55813f3200a50db2eaaa7a297ba2c
It was possible for a previous write / part to fail or be aborted
and for the next part write to still happen. This causes a
data-ordering corruption.
The whole write to parallel stdout fails, so there shouldn't be
confusion with regard to the output's acceptability. However, it
would be clearer if we avoided writing out-of-order data... mainly to
make clear that we didn't corrupt the data, just that it's
incomplete.
Change-Id: I97b0d14404f29e8615e7d29b10cbd61ccb861e40
Also ensure that abort is given at least 5 seconds to clear up any
pending uploads on cancellation.
Change-Id: I814aa407ee5783f2609a76b54de2879dcd5f89bb
If the cp command is executed with a higher level of parallelism, it would
open more connections to storage nodes at the same time. Therefore, the
connection pool capacity should be expanded accordingly.
The pool capacity is set to 100 * parallelism.
Change-Id: Ia8b3ab6a99340d8cbb87a7b80c3354b2b21c1958
I don't think it should matter for correctness whether this matches the
segment size or not, so I think there is something else wrong. However,
making this change seems to eliminate the "corruption when ulimit -n is
too low" problem we're seeing right now.
Change-Id: I232fe0d0a371b86ddf902e8c2d4778e140b2f1fc
Attribution is attached to bucket usage, but that's more granular than
necessary for the attribution report. This change iterates over the
bucket attributions, parses the user agent, converts the first entry
to lower case, and uses that as the key to a map which holds the
attribution totals for each unique user agent.
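The aggregation is roughly (row shape and parsing are simplified
stand-ins):

```go
package main

import "strings"

// Row is a simplified stand-in for a bucket attribution row.
type Row struct {
	UserAgent string
	ByteHours float64
}

// totalsByUserAgent sums attribution per lowercased first user-agent
// entry, e.g. "Gateway-MT/1.0 uplink/1.2" -> "gateway-mt".
func totalsByUserAgent(rows []Row) map[string]float64 {
	totals := make(map[string]float64)
	for _, row := range rows {
		fields := strings.Fields(row.UserAgent)
		if len(fields) == 0 {
			continue
		}
		key := fields[0]
		if i := strings.IndexByte(key, '/'); i >= 0 {
			key = key[:i]
		}
		totals[strings.ToLower(key)] += row.ByteHours
	}
	return totals
}
```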
Change-Id: Ib2962ba0f57daa8a7298f11fcb1ac44a8bb97875
Now that we have both the storagenode and updater processes running
in a single docker container, we need a way to know which of the
processes logged a given entry.
This change includes a Process field in the log entries.
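One way this can be wired up with zap (a sketch; the actual hook-up
may differ):

```go
package main

import "go.uber.org/zap"

func main() {
	// Attach a constant Process field to every entry this logger emits.
	logger, _ := zap.NewProduction(zap.Fields(
		zap.String("Process", "storagenode-updater"),
	))
	defer logger.Sync()
	logger.Info("downloading new version")
	// -> {"level":"info","msg":"downloading new version","Process":"storagenode-updater",...}
}
```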
Resolves https://github.com/storj/storj/issues/4648
Change-Id: I167b9ab65728a41136d264b5fe2c41bb64ed1785
Before, the VA query was summing the total and dividing by the number of
rows. This gives the average bytes stored per hour, but we charge for
usage in byte-hours. Why not do value attribution the same way?
To do that, we don't divide by the number of rows. We also have object
and segment fees so return segment-hours and object-hours too.
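(For example, 100 GB held across all 720 hourly rows of a 30-day
month sums to 72,000 GB-hours; dividing by the 720 rows gives back
the 100 GB average rather than the GB-hours we bill.)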
Change-Id: I1f18b7e1b2bae1d3fae1ca3b93bfc24db5b9b0e6
We've had a lot of issues with alpine, and currently there's a
broken network issue on alpine for users running on RPI arm32
architecture which requires a workaround before docker is able to
sync time between the host and the container:
https://wiki.alpinelinux.org/wiki/Release_Notes_for_Alpine_3.13.0#time64_requirements
Since we're switching the base image of the storagenode to debian,
it's best to switch the base image of all our docker images to
debian as well for consistency; less drift across them and keeps
the push target consistent.
Change-Id: If3adf7a57dc59f19ef2221b892f340d919798fc5
In the migration that moves the partner name corresponding to a
partner id into the user agent column, part of the requirement was to
migrate the partner id itself if there was no associated partner
name. This turned out not to be so good: when we parse the user_agent
column later, it returns an error if the user agent is one of these
UUIDs.
Change-Id: I776ea458b82e1f99345005e5ba73d92264297bec
We are switching from alpine to debian due to a network issue
introduced in alpine 3.13+ which fails to verify certificates because
not all armhf boards meet the time64 requirement:
https://wiki.alpinelinux.org/wiki/Release_Notes_for_Alpine_3.13.0#time64_requirements
Also, Debian does not have official images for the arm32v6
architecture, so we are building with the arm32v5 arch in the
Makefile.
Change-Id: I3660c3f64b7c2b342dd4ccb876af5f4e3036ea9d
When there is an error fetching a piece, the reader might be present or
it might not, depending on how far the fetch operation got. The
fetch-pieces code did not handle the "reader-not-present" case. Now it
should.
Change-Id: I263657d544d0ab8ba5d307a34ffc76bbf56835d0
Updating the version of the base image for the storagenode docker
image.
Also fixes the non-root permission issue for the /app directory.
Change-Id: I8b55a1e3062f55ce6fc52e126ec1a18bfa24e669
This change fixes the following issues:
* wget: the Alpine docker image by default uses the builtin BusyBox
wget, which, unlike GNU wget, is not capable of handling SSL traffic
via a proxy. We have to replace BusyBox wget with GNU wget.
* updater failing to restart the node: supervisorctl was pointing to
the wrong config file. We remove the default configuration file and
point supervisorctl to the custom config in systemctl
updates https://github.com/storj/storj/issues/4489
Change-Id: I24a7f18377ba723bbc377bb5d25aaa14f37021b1
Add ability to limit updates in migrations.
To make sure things are looking okay in the migration, we can run it
with a limit of something like 10 or 30. We can look at the output of
the migrated columns to see if they are correct. This should have no
effect on subsequently running the full migration.
Change-Id: I2c74879c8909c7938f994e1bd972d19325bc01f0
This change fixes the `sed: can't create temp file
'/etc/supervisor/supervisord.confXXXXXX': Permission denied` issue
when editing the supervisord.conf file at runtime as a non-root user.
While editing the config file, sed creates a temporary file, saves
the result, and then finally replaces the original file with the
temporary one via mv. So we need to set the permissions on
/etc/supervisor, where the temporary file is created.
Change-Id: Ic9c147a9cf0a6ef94adf702e33054edce1828806
The supervisord.conf file is edited to set the args for the
storagenode and storagenode-updater binaries at runtime. This change
moves the config file to the base image so we can set the permissions
to allow non-root users to edit it.
Non-root user permission is also needed for the /app directory so we
can install/update the binaries when running as a non-root user.
Updates https://github.com/storj/storj/issues/4489
Change-Id: If7a51a00ea171253e41923501174a43393f4638c
When copying an object from cli you can now set the expiry.
It uses the same datetime format as restricting access grants.
Closes https://github.com/storj/storj/issues/4595
Change-Id: Icab73a64a9589817d6bc6d702b765b166ca1350d
Having the storagenode and storagenode-updater processes in one container
requires a process manager to properly handle the individual processes.
Using a process manager like supervisord requires that you package
supervisord and its configuration in the image, along with the
storagenode and storagenode-updater binaries.
Installing supervisord requires that we run apk to install it and its
dependencies at build time, which makes it difficult to build
multi-platform images; executing apk forces a requirement on the
build system to run foreign architectures.
This change adds a dockerfile which will be used to build the base image
for the storagenode and has supervisord packaged. The base image will be
built manually using docker buildx, with QEMU binfmt support.
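Roughly, the manual build looks like this (platforms and tag are
illustrative):

```
# one-time: register QEMU binfmt handlers for foreign architectures
docker run --privileged --rm tonistiigi/binfmt --install all

docker buildx build \
  --platform linux/amd64,linux/arm64,linux/arm/v5 \
  -t storjlabs/storagenode-base:latest \
  --push .
```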
Updates https://github.com/storj/storj/issues/4489
Change-Id: I33f8f01398a7207bca08d8a4a43f4ed56b6a2473
We would like to disable in production those parts of the code which
are now mixed with the new server-side copy logic.
Change-Id: Iff50682bc9545207330f58dd19b5eee53d404d7f
The text has been expanded a bit, with an example, to clarify that
it is necessary to create identity files before using the Docker
image.
Changed the <identity-dir> placeholder to <multinode-identity-dir> so no one confuses them with the storagenode identity files.
Changed the <storage-dir> placeholder to <multinode-config-dir> so no one confuses them with the storagenode 'config' folder.
Closes #4547
We do not build a docker image for the multinode dashboard, which
makes monitoring harder in docker-focused environments.
This adds the basic image and ties it into CI/CD.
Change-Id: I14c01a7f1f0019f6f5c1b8fd75dc424fc362b18d
Currently the metainfo/metabase DB connections are missing the
proper application_name needed to differentiate and filter queries on
the DB side for analytics.
Without it, it is very time-consuming to correlate processes and
their load.
This change adds the "check" on DB connection init and passes the
fallbacks in all places, to catch connection strings that do not set
it.
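The check amounts to something like this for URL-style connection
strings (helper name hypothetical):

```go
package main

import "net/url"

// ensureApplicationName fills in a fallback application_name when the
// connection string doesn't set one, so the process can be identified
// on the DB side.
func ensureApplicationName(connstr, fallback string) (string, error) {
	u, err := url.Parse(connstr)
	if err != nil {
		return "", err
	}
	q := u.Query()
	if q.Get("application_name") == "" {
		q.Set("application_name", fallback)
		u.RawQuery = q.Encode()
	}
	return u.String(), nil
}
```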
Change-Id: Iea5cea8658bc63778ff89038e5c1c352bf482cfd
The "satellite fetch-pieces" command allows a satellite operator to
fetch as many pieces of a segment as possible, along with their
original order limits and hashes as provided by the storage nodes. The
fetched pieces and associated info will be stored in a specified
folder as they are, rather than being RS-decoded or decrypted.
It is hoped that this will allow easier debugging of certain one-off
problems we've observed in the wild.
Change-Id: I42ae0e9ef0023538e42473a9be5a2460a3ac0f3a
Some old configs had a value like
access: <data>
in the yaml. This would end up causing migration to create a JSON
file that had no access values and kept the data under a default
name. That's not what the command expects to operate on, so now we
fix that during migration and add a little mini-migration for any
users that may have hit it.
Change-Id: I4c98ca5d09d043fe9338738ef6b4f930f933892c
In addition to upgrading the storj.io/common library, this change
moves off the TCPConnector in favor of the HybridConnector per
the deprecation warning.
Change-Id: I7e7e1e7568e8b95e4a99ad9caa158a799e68e1e3
Change the implementation of register and share so that it uses the
uplink method to contact the Auth Service. The network protocol switches
from HTTP to DRPC.
Closes https://github.com/storj/storj/issues/4324
Change-Id: Ib8fdb1665c6385bb39a546ba46a8df43a136df9c
Through `docker run storjlabs/storagenode:latest --help` we have always
made available around 100 command-line arguments.
However, if you now pass such an argument, it will be passed to
storagenode-updater and may no longer be recognized. This will cause
the storagenode not to start.
This was introduced in
https://review.dev.storj.io/c/storj/storj/+/5426
This change restores previous functionality.
Change-Id: I06823283ff82ffda12aee48c4d83717bddfbfdac
Change the order in which the storage node setup loads the identity,
to avoid writing anything to disk when there is an error loading the
identity.
This bug, and the specific changes to make, were reported by GitHub
user @onionjake.
Closes #4387 #4396
Change-Id: I360fff3c23b160c9e055203d3526d749edfd9129
Get the storagenode and storagenode-updater binaries during the run
of the container, so that we don't have to release a new docker image
on each new version of the storagenode binary.
Fixes https://github.com/storj/storj/issues/4176
Change-Id: I994c4942136a2cc7298eb0346238689eb406ae5b
Use satellite.DB method TestingMigrateToLatest instead of
MigrateToLatest. TestingMigrateToLatest is much faster.
Also, run package tests in parallel.
Change-Id: I18bc0926dcfb80ace30d0b401e64ed919bfb966f
Value attribution codes were converted into UUIDs and stored
in the users, projects, api_keys, bucket_metainfos, and
value_attributions tables in the partner_id column. This
migration will look up the appropriate partner name associated
with each of these UUIDs, and store the partner name directly
in the user_agent column within each table. If no corresponding
partner name exists for a partner_id, the partner_id value will
be stored instead.
Add migration for users table with tests.
Change-Id: I61254d9b81c474e76bcfc1c8cd863697c6ef44b6
Users signing up through a URL containing a promo code will have
that code applied to their stripe account instead of the free tier
coupon.
Change-Id: I071041b0934648ef3f5bdb05b6ec97c400f89ae4