This makes it possible to remove of this obsolete flag from the
multi-tenant gateway.
As a consequence, displaying the GATEWAY_0_ACCESS env var will always
require a running storj-sim. Until now, it was required only the first
time. Then the value was stored in the 'access' config. But this is now
not possible anymore.
The changes in StripeMock are required to fix failures in integration
tests. StripeMock is in-memory and its data does not survive restarts of
storj-sim. The second and following starts of storj-sim had invalid
state of StripeMock, which failed requests that were required to
populate the GATEWAY_0_ACCESS env var. The changes in StripeMock makes
it repopulate the Stripe customers from the database.
Change-Id: I981a208172b76577f12ecdaae485f5ae4ea269bc
log.Fatal immediately terminates the program without running any defers.
We should properly close all the services and databases.
Change-Id: I5e959cef3eafedeacb3a2062e3da47e8d04e8e75
Sadly the build process with this command is very, very flaky and often fails pulling down curl via apk.
As we currently do not need it anyway, it is safe to remove.
Change-Id: I8a396c560d61a7fe6324560152a68c07c6b31638
The VerifyPieceHashes method has a sanity check for the number pieces to
be removed from the pointer after the audit for verifying the piece
hashes.
This sanity check failed when we executed the command on the production
satellites because the Verify command removes Fails and PendingAudits
nodes from the audit report if piece_hashes_verified = false.
A new temporary UsedToVerifyPieceHashes flag is added to
audits.Verifier. It is set to true only by the verify-piece-hashes
command. If the flag is true then the Verify method will always include
Fails and PendingAudits nodes in the report.
Test case is added to cover this use case.
Change-Id: I2c7cb6b12029d52b2fc565365eee0826c3de6ee8
Remove usage of --non-interactive flag. It is not provided (and
necessary) by the multitenant S3 gateway anymore.
ACCESS_KEY and SECRET_KEY env vars are not provided anymore as they are
not generated by the multitenant S3 gateway.
Change-Id: I3ecfb92110e31ae13977de3899dad273daae6c1e
Jira: https://storjlabs.atlassian.net/browse/PG-69
There are a number of segments with piece_hashes_verified = false in
their metadata) on US-Central-1, Europe-West-1, and Asia-East-1
satellites. Most probably, this happened due to a bug we had in the
past. We want to verify them before executing the main migration to
metabase. This would simplify the main migration to metabase with one
less issue to think about.
Change-Id: I8831af1a254c560d45bb87d7104e49abd8242236
Jira: https://storjlabs.atlassian.net/browse/PG-67
There are a number of old-style objects (without the number of segments
in their metadata) on US-Central-1, Europe-West-1, and Asia-East-1
satellites. We want to migrate their metadata to contain the number of
segments before executing the main migration to metabase. This would
simplify the main migration to metabase with one less issue to think
about.
Change-Id: I42497ae0375b5eb972aab08c700048b9a93bb18f
Fix case where line is larger than maxline and writer ended up in an
infinite loop trying to buffer more data.
Change-Id: I243da738b331279d6bf27255778b5798e7f37f95
This PR updates `uplink rb --force` command to use the new libuplink API
`DeleteBucketWithObjects`.
It also updates `DeleteBucket` endpoint to return a specific error
message when a given bucket has concurrent writes while being deleted.
Change-Id: Ic9593d55b0c27b26cd8966dd1bc8cd1e02a6666e
periodically create and delete a temp file in the storage directory
to verify writability. If this check fails, shut the node down.
Change-Id: I433e3a8d1d775fc779ae78e7cf3144a05ffd0574
Jira: https://storjlabs.atlassian.net/browse/USR-822
This the last step of dropping these 2 db tables. It also deletes all
code associate with them.
Change-Id: I8be840dc2a7be255cf6308c9434b729fe4d9391e
Add a config so that some percent of users require credit cards /
account balances
in order to create a project or have a promotional coupon applied
UI was updated to match needed paywall status
At this point we decided not to use a field to store if a user is in an
A/B
test, and instead just use math to see if they're in a test. We decided
to use MD5 (because its in Postgres too) and User UUID for that math.
Change-Id: I0fcd80707dc29afc668632d078e1b5a7a24f3bb3
To prevent storagenode from implicitly recreating missing dbs and storage,
as such behaviour leads to audit failures. Do not allow storagenode to
start if any of dbs or storage is missing, corrupted, or dedicated storage disk is
unmounted, to get downtime instead.
Change-Id: Ic64e1f0ff4d8ef5b2fddbe7a7e53df4f4bd8652e
We passed in revocationDB and metainfoDB for no reason.
Lets remove it from the dependency list to further reduce the footprint.
Change-Id: Ic0317bb92670fbd305d4a8b0ed1cb82858e2f6d3
Jira: https://storjlabs.atlassian.net/browse/USR-821
The `migrate-credits` billing command checks the available credits
balance for all users and moves it to the Stripe balance by creating a
new credit balance transaction.
Change-Id: Iafc7b95a4edad47f7c145a22e210f8c821ac183d
Add a long description to the graceful exit command to clearly mention
that the command is interactive asking which satellite the SNO wants to
exit.
Change-Id: Icd4056a470e707322f600133e63d9dc56eb877b7
When a request comes in on the satellite api and we validate the
macaroon, we now also check if any of the macaroon's tails have been
revoked.
Change-Id: I80ce4312602baf431cfa1b1285f79bed88bb4497
when a user runs `uplink share`, they get a bunch of results back,
given their configuration and existing access. one of the results
is a URL for in-browser sharing and hosting of the file.
first off, we want to make sure this URL is read only. we want to
avoid a situation where someone post this URL to some public
location, not realizing the access allows writes or deletes. if
a user really wants a URL with write/delete access, they can
construct it themselves.
secondly, we want to make sure the url is sharing a single path or
path prefix. having a url for multiple paths/path prefixes
indepedently again can be constructed of course, but should not
be the default behavior
Change-Id: I2ca2ebeea9f1c7d4bfbd7a437a32dc7a3b2a32cc
Reverts https://review.dev.storj.io/c/storj/storj/+/2074
That change broke all billing commands with `sql: database is closed`.
Fixing the issue does not make sense as the whole setup of billing
commands is now error-prone. It's better to revert it.
Change-Id: Id13f817b73c7313ba60181f740b0712e4bdce077
setup
We don't have default ports as a part of configuration anymore because
satellite-addr flag was removed.
Change-Id: Ibf9fc4b399beaf51ebb9461de2d8994a322f9686
This changeset allows a user agent string to be set during uplink setup, which
is thereafter used for partner value attribution. EG
uplink setup --client.user-agent ”MyCompany”
Change-Id: Iefa8755fccc06acb8a303a342b943cece44a81f7
Remove unneeded observer's tests cases after the changes that enhanced
the segment-reaper not to be limited to objects up to 65 segments.
Also remove some tests cases that were already covered by the
TestObserver_processSegment_from_to test function.
Change-Id: I2385b406b50f7810ed22d3297b903a993f83fcfe
Viper has a feature/bug where all YAML config keys are cast
to lowercase (see https://github.com/spf13/viper/issues/260
and https://github.com/spf13/viper/issues/411).
Since we use Viper to load our config values, it doesn't seem
like there's an easy way to preserve case sensitivity for the
access names chosen by users right now. Adding this prompt
should help user experience by clarifying that all access
names must be lowercase.
Change-Id: I47e8344bb0ca7e78458405496f20e78e3c9f9a88
Rename a test function to indicate the function that it actually tests
rather than indicating the function that calls the function under test.
Change-Id: I2c1fe165e3acf952993a9a65dfbfbdcda0d1cccc
After the segment-reaper was enhanced to support objects with any number
of segments, the test helper function to generate test data doesn't have
to support to generate objects with a number of segments up to or over
the previous limit.
The commit remove that unnecessary test logic.
Change-Id: I2897c5a96fb133f61de2cdb0ef7d13ee5e0151e8
This allows to seeing logs in the output of the invoice commands.
Existing ensure-stripe-customer commands is moved from the 'reports' to
the new 'billing' root command.
Change-Id: I752c7ab6ca59bfac8e0f174a45d2ab45fc18e467
able to cover more testing scenarios
Currently, its hard to implement test suite for payments because
mockpayments is on to high level and we cannot emulate many things e.g.
adding credit card. This change is first to be able to add mock for
Stripe client and do more granular tests.
Change-Id: Ied85d4bd0642debdffe1161657c1e475202e9d23
To avoid including multiple months in a single invoice, we need all
inspector's invoice commands to run in for specific period.
See https://storjlabs.atlassian.net/browse/USR-725
Change-Id: I3637dc189234f02350daca8d897c21765762ea55
See https://storjlabs.atlassian.net/browse/SM-752
These changes allow us to change the log level at runtime through a handler off of the debug endpoint.
Examples of changing the log level on storj-sim
To get the current level for the satellite api process:
curl -XGET 'http://127.0.0.1:10009/logging' --header 'Content-Type: text/plain'
To change the log level:
curl -XPUT 'http://127.0.0.1:10009/logging' --header 'Content-Type: text/plain' --data-raw '{"level":"error"}'
Change-Id: I05d164b290929fa06b6d78c01075ee41f8238044
Reorganize some operations in the observer.processSegment method:
* to minimal reduce the memory usage removing the segments of the
objects marked as skipped. Skipped objects aren't discarded of the
analysis stage so the segments aren't needed.
* returns earlier when an object is skipped because isn't needed a
further processing.
Change-Id: I210a26c394477ee411ff7f640507dcc07733a47f
Rename a test name and an error class because the bitmask was renamed to
bitArray in a previous commit, however, they weren't updated.
Change-Id: I92afcf747dec9e0a15d4c59b896e07b942f3518b
The bitArray.IsSequence method had a bug that caused that sometimes the
test "Bitmask/IsSequence/sequence starts at other index than 0" to fail
due to it uses random values.
The test mostly found a corner case, I've added new tests to catch the
bug and then apply the fix.
Change-Id: If01dc07006d261ba40bbd99d36ef776e09ed3efc
Change the bitmask used by segment reaper to use []byte rather than uint64
This passes tests but I have literally no clue how to integration test this.
Change-Id: I393f4598b27cae6e427da2190dd3109bca721c34
Currently uploads can cause a lot of IOPS, reduce this by introducing a
in-memory buffer on-top of the file.
Change-Id: I5f4e3e01c0a36258271d180b922107de447bcb59
CreateTables hasn't been quite true for a while now, rename to
MigrateToLatest to be clearer in it's behavior.
Change-Id: Ida48e95122a5d9b7a814e922d3698e00024a2ba7
when tracing is enabled, we should also set sampling rate to
a non-zero value. For now, we will set it to 1.
Uplink CLI users should be able to override it with the sample
flag.
Change-Id: I8bcf514fb14c2a1c4349b7957dd24ec23e4a85e5
In cases like the segment reaper script connecting to the metainfodb,
we don't want a db migration to happen automatically when we call
metainfo.NewStore. This adds MigrateToLatest method for postgreskv
and cockroackv, and calls MigrateToLatest in places where NewStore used
to create tables.
Change-Id: I682d0f26d609af0601dfdb32a24866cdf5d32a7e
Previously we are using tracing.sampled to be the switch for turning on/off tracing.
However we would like to separate sampling rate from being the switch,
so we can set sampling rate to be 0 but still intialize tracing for
satellite and storagenodes
Change-Id: I27e6ba25ea6f6b612b4e1a57cf1301889ded41ec
We want to make tracing to be opt-in.
For now, we will use `tracing.sample` as the toggle config to enable or
disable tracing and default to sample every traces from uplink cli.
If user wants to change the default sampling rate, they can do so by
using the `--tracing.sample` flag to override the default value
Change-Id: I6f25dac0f43024c50a8aaf6c549e6a514211f834
currently production uses a different application suffix for gc
services, so chronograf can distinguish between gc processes and core
processes, but it'd be nice to be a bit more consistent with repairers
and api servers
Change-Id: Icb96fed006c59d7afd730317d35636a6e4573b58
Currently storj-sim relies on the log lines to be exactly the same,
when they change it cannot find the necessary information from log.
Change-Id: Ia039915ef3375a7cf60f107b2c05c958de15b6d5
uuid.UUID implements driver.Value so it can be directly used as a
scannable result.
Replace uses of dbutil.BytesToUUID with uuid.FromBytes.
Change-Id: I51a670185ceb3cc2199d5aa2b76bc3fc191ca8fe
Previous split to a storj.io/private repository broke tag-release.sh
script. This is the minimal temporary fix to make things work.
This links the build information to specified variables and sets them
inline. This approach, of course, is very fragile.
Change-Id: I73db2305e6c304146e5a14b13f1d917881a7455c
* debug
* traces
* cfgstruct
* process
Package `storj/private/version` will be removed as a separate change.
Change-Id: Iadc40faa782e6225513b28218952f02d9c240a9f
We are not generating identity for gateway during setup anymore, its
generated on-fly by libuplink. With this change we can remove
`--identity-dir` from gateway.
Change-Id: I63298115d4399564b2f29541b8dc16e3b3acdcf2
The admission/v3 protocol now supports arbitrary key/value headers to be
included in each packet of metrics. This commit creates support for
this, so the lua config file can declare a filter taking into account
the key/value headers.
Change-Id: I41de8c018d33304ccf46ec221ae689d55c5fb1ee
Previously, we were simply discarding rows from the repair queue when
they couldn't be repaired (either because the overlay said too many
nodes were down, or because we failed to download enough pieces).
Now, such segments will be put into the irreparableDB for further
and (hopefully) more focused attention.
This change also better differentiates some error cases from Repair()
for monitoring purposes.
Change-Id: I82a52a6da50c948ddd651048e2a39cb4b1e6df5c
New API has limited number of options to configure at the moment. We
should remove unused flags from Uplink CLI and add if needed in the
future.
Change-Id: Icf3f3dadd43cb61a3b408b02d0762aef34425dbf
On satellite, remove all references to free_bandwidth column in nodes table.
On storage node, remove references to AllocatedBandwidth and MinimumBandwidth and mark as deprecated.
Protobuf message, NodeCapacity, is left intact for backwards compatibility.
Once this is released to all satellites, we can drop the column from the DB.
Change-Id: I2ff6c6537fc9008a0c5588e951afea58ede85838
By using a require for storj.io/storj it will make the import
unambiguous. This means it is possible to have a module name
storj.io/storj/cmd/gateway.
Change-Id: I98439cbbaf433ae31309b7f80a19ced896018f65
Now that we are trying to identify the root cause of the satellite load limitations (i.e. currently the satellite has a max ability of 400 rps for uploads and we need this to be higher), we are using the golang diagnostic tools to collect insight into what the bottlenecks are. We currently have a debug endpoint to gather some cpu and mem data, but it could be useful to have continuous profiling. GCP stackdriver has support for continuous profiling so lets set that up and see if it is helpful to gather more data.
This PR adds support for [GCP continuous profiler](https://cloud.google.com/profiler) which allows enabling continuous cpu/mem profiling and the stats are sent to stackdriver in google cloud console.
To enable the continuous profiling for a storj component, do the following:
- prereq: the workload must be running in GKE and have Stackdriver Profiling IAM role permissions
- provide the config flag `debug.profilename` in the config.yaml file for the workload (i.e. satellite api process, etc). The profilename should be the workload name, for example "satellite-api".
- once the above config flag is provided, the profiler will be initialized and profiling stats will automatically be sent to GCP project where the workload is running and viewable in the Stackdriver Profile page in the console
The current implementation assumes the workload is running in GKE, however if we find if useful we can add support to enable this from anywhere. But for simplicity, its configured this way assuming the main goal is to enable in production systems.
Change-Id: Ibf8ebe2df7bf06fdd4951ee6a1e48854dd36ad47
The code to generate monkit.lock has a bug where it doesn't take
ScopeNamed into account and assumes the package. Since the downgrade
file was created from monkit.lock, we also assumed the package, so
we were downgrading to the wrong metric.
No other places call ScopeNamed that would cause a problem.
Change-Id: If9fbbd971a7d755f5de33ed20b8a6bcc95670ee3
This peer will contain our administrative panels.
It's completely separated from our other satellite
processes because it allows better control for restricting
access to it.
Change-Id: Ifca473bee82ff6c680b346918ba32b835a7a6847
In case the endpoint doesn't start, it might end up indefinitely
waiting for it to come up stalling jenkins.
Change-Id: Ib10bf1a25461e7532ec56ca705178bc9a7f85d12
this commit updates our monkit dependency to the v3 version where
it outputs in an influx style. this makes discovery much easier
as many tools are built to look at it this way.
graphite and rothko will suffer some due to no longer being a tree
based on dots. hopefully time will exist to update rothko to
index based on the new metric format.
it adds an influx output for the statreceiver so that we can
write to influxdb v1 or v2 directly.
Change-Id: Iae9f9494a6d29cfbd1f932a5e71a891b490415ff
Currently we risk losing pending bandwidth rollup writes even on a clean
shutdown. This change ensures that all pending writes are actually
written to the db when shutting down the satellite.
Change-Id: Ideab62fa9808937d3dce9585c52405d8c8a0e703
Setup command of uplink has to create the configuration directory just
before saving the configuration file for making it more robust than
creating in the initial state of the process.
When creating the directory at the beginning of the process leaves the
possibility to delete such directory during the setup process and leads
to a failure.
Ticket https://storjlabs.atlassian.net/browse/V3-3545
Change-Id: I30db0175e23a597e9675d267b4d7e25d5d4c5119
storagenode database preflight check.
Disable preflight database check by default, and have the option to
enable it. This will allow us to enable it once it is definitely
working.
Also change the name of the config flag for preflight time sync.
Change-Id: Ie2e20f9e25dcb38794eafa7e1505e7c6ff287c99
for storagenode
Ensure that database schema matches latest test migration schema before
allowing the node to start up.
Ensure minimal read/write functionality for each storagenode database
before allowing the node to start up.
This will eliminate many unhandled audit errors we are seeing.
Change-Id: Ic0e628b04a9c35b7a8243f6a81d4683918170ba9
this commit introduces the reported_serials table. its purpose is
to allow for blind writes into it as nodes report in so that we have
minimal contention. in order to continue to accurately account for
used bandwidth, though, we cannot immediately add the settled amount.
if we did, we would have to give up on blind writes.
the table's primary key is structured precisely so that we can quickly
find expired orders and so that we maximally benefit from rocksdb
path prefix compression. we do this by rounding the expires at time
forward to the next day, effectively giving us storagenode petnames
for free. and since there's no secondary index or foreign key
constraints, this design should use significantly less space than
the current used_serials table while also reducing contention.
after inserting the orders into the table, we have a chore that
periodically consumes all of the expired orders in it and inserts
them into the existing rollups tables. this is as if we changed
the nodes to report as the order expired rather than as soon as
possible, so the belief in correctness of the refactor is higher.
since we are able to process large batches of orders (typically
a day's worth), we can use the code to maximally batch inserts into
the rollup tables to make inserts as friendly as possible to
cockroach.
Change-Id: I25d609ca2679b8331979184f16c6d46d4f74c1a6
JIRA: https://storjlabs.atlassian.net/browse/V3-3499
The `uplink share` command does not print the restricted API key and the
restricted encryption access anymore.
Change-Id: Ie4ebe0b27067ee00af97c775f4e06f558b894fe2
We want to make using uplink as easy as possible. That's why we wan't to
avoid requiring setup or import command before normal usage if user
specified --access flag. If this flag is set then rest flags should be
set as defaults.
https://storjlabs.atlassian.net/browse/V3-3490
Change-Id: I95a7bd77a3f00b8d9981fee513e9e77aef298bca
We decided that better name for "scope" will be "access". This change
refactors cmd part of code but don't touch libuplink. For backward
compatibility old configs with "scope" field will be loaded without any
issue. Old flag "scope" won't be supported directly from command line.
https://storjlabs.atlassian.net/browse/V3-3488
Change-Id: I349d6971c798380d147937c91e887edb5e9ae4aa
Remove starting up messages from peers. We expect all of them to start,
if they don't, then they should return an error why they don't start.
The only informative message is when a service is disabled.
When doing initial database setup then each migration step isn't
informative, hence print only a single line with the final version.
Also use shorter log scopes.
Change-Id: Ic8b61411df2eeae2a36d600a0c2fbc97a84a5b93