Small cleanups of the observer stats init code:
1. Use sync.Once for race free addition to the monitoring chain
(purely defensive)
2. Set the observer durations before adding to the monitoring chain on
first use.
3. observerDurations slice does not need to be initialized to non-nil
Change-Id: I9ae8ec96debc7d52c4ee5d22762a89f21bb2e38c
* use the same DB application name for satellite and metabase
* use noop orders DB implementation to avoid storing allocated bandwidth
in DB
Change-Id: I20e88c694d38240fe1a20c45719e210cfb76402c
bucket_bandwidth_rollups table
We have performance problems with updating bucket_bandwidth_rollups. To
improve situation we can stop storing allocated bandwidth in this table.
This should reduce large number of updates which are comming from
metainfo endpoints, repair workers and audit.
Next step will be to drop `allocated` column completely from
bucket_bandwidth_rollups.
Allocated GET bandwidth is all we need and we are keeping it in
bucket_bandwidth_rollups table.
Change-Id: Ifdd26a89ba8262acbca6d794a6c02883ad0c0c9b
On generated console api endpoints allow either the project ID or the
public ID to be used as the ID parameter.
github issue: https://github.com/storj/storj/issues/5412
Change-Id: Ic9901ed273931a50ae12f20142a3c4938dfcc8c0
When an observer errors we still want to finish the other observers.
This changes store the error and continues the loop, skipping
the observer which errored out and setting the duration metric to -1.
When the error occurs in the process stage, it does continue the other
ranges of the same observer. It removes the observer entirely after the process
stage. To improve this would make it more complex due to race
conditions.
Closes https://github.com/storj/storj/issues/5389
Change-Id: I528432c491d4340817d6950f1200ee2b9e703309
Move the IsAuthenticated check until after initial parameter
parsing/validation. IsAuthenticated will be more expensive than
parsing/validation, so we should fail before auth if possible.
Change-Id: I96a020892eabcb750e8ec9ecc1d8b7d9bf8bf573
Add option AsOfSystemTime to segment provider to make it equivalent to
the old segment loop.
There's no comment on what it does because it's pretty complex and
makes no sense, but we can improve it later.
closes https://github.com/storj/storj/issues/5434
Change-Id: I8f09b03803e681e2fd41008c5dba67804b0f37a1
On BSD, the storagenode-updater falls back to renaming the storagenode without
doing anything to restart the service.
Like the approach we have for linux, this change finds the process ID of the
storagenode using pgrep and sends an interrupt signal to the process.
Closes https://github.com/storj/storj/issues/5333
Change-Id: Icced90ea3e831528804784c2170a3b8b14952e8c
The `coupons`, `offers`, `coupon_usages`, `coupon_codes`, and
`user_credits` tables are not used anymore. They exist due to
legacy billing code which has already been removed.
Change-Id: Ie93af90a06e5247a74015af2a78a223d69249565
The shell script has different behavior on different operating systems,
and now has enough logic that it makes sense to put it into a go script.
This change replaces dbx/gen.sh with dbx/gen/main.go, which accomplishes
the same thing.
Change-Id: Ibb53ebf9c2475377aae1165cfe245140d7162026
The current satellitedb.dbx has grown quite big over time.
Split the single file into multiple smaller ones.
Change-Id: I8d6ca851b834ac60820aff565906f04aab661875
We have to wait until the slowest node is done being tested before we
can move on to the next of segments. Since the slowest node can be
arbitrarily slow, we'll set a timeout and treat too-slow nodes as
temporarily offline.
Change-Id: I80fe865dd4e8f826700430fb0140c2d3aefca381
When we are verifying pieces by downloading the first byte, if we
encounter a timeout, treat the node as if we failed to connect to it,
and log the error once instead of twice.
Change-Id: I70602d554183c98f1213f3ffb1bfec41100ea0e7
This csv file was being closed as soon as the service was created.
All subsequent writes to the closed file handle produced errors,
which were logged but otherwise ignored.
Instead, we would like the file to remain open and writable, until
the service is destroyed.
Change-Id: Ib29944d25b2f5b2d0f90fdbdcde44fea8d769321
Update get usage-limits, daily-usage and salt endpoints to support
both project-ID and project-PublicID.
Issue: https://github.com/storj/storj/issues/5411
Change-Id: Iff0114a295d1a479b141bfffbfb31599844d1fc0
Update the delete API key by name and projectID to support project-ID
and project-publicID.
Issue: https://github.com/storj/storj/issues/5410
Change-Id: I3bd11b9c3ae1ad6ce662dfc18b42779d2e4edf9b
Add an observer to monitor ranged segment loop progress.
Tested by running the segment loop in storj-up and navigating to
http://<container>:11111/mon/stats and there is the entry:
rangedloop-live,scope=storj.io/storj/satellite/metabase/rangedloop numSegments=364523630000.000000
part of https://github.com/storj/storj/issues/5223
Change-Id: If3d2774d2f17f51eac86f47c6dda1fb8ad696dfe
The copyFile method has some safeguards to ensure that the multipart
write is aborted. This is accomplished by always calling abort on the
MultiWriteHandle when the copy is finished, whether or not there was a
failure or it was successfully committed. If the copy was committed,
then this RPC is a no-op on the metainfo server.
Regardless, the calls to abort to constitute an additional RPC to the
satellite for no benefit. This is exacerbated by the fact that the code
currently ends up calling abort twice.
This change updates the libuplink-backed MultiWriteHandle implementation
to not call abort if the write is committed and vice-versa. This
eliminates the two wasteful RPC calls.
Change-Id: I13679234f6f473e9a93179e6791fb57eac512f25
Removing all references to column last_verification_reminder which is to be removed, due to new column verification_reminders
Issue: https://github.com/storj/storj/issues/4560
Change-Id: I7c9a426e946c7aed58e62c1eef80629daf6b1272
Support interruption of the ranged segment loop through context.
Part of https://github.com/storj/storj/issues/5223
Change-Id: Iae0260e250f8ea33affed95c6592a1f42df384eb
added in storj-sim rangedloop for each satellite, to verify it works for metrics oveserver,
removed identity from rangedloop peer as we never use it, added logs on service run, added loop
to service instead of endless for loop, interval value to config
Closes: https://github.com/storj/storj/issues/5414
Change-Id: Ibc3b06071b68feda4a35b45da2bbe36e22a02fc8
Add public ID field to graphql Project so it can be used on the front
end. Additionally public_id needed to be added to the ListByOwnerID sql
query which is called by graphql OwnedProjectsQuery.
github issue: https://github.com/storj/storj/issues/5408
Change-Id: I2ec04363c20493dc0f9c70b6d1610f724f18ec2f
Previously, if any pieces are still on disqualified nodes, this tool
would treat those pieces as fine (if the disqualified node is still
online) or temporarily unavailable (if the disqualified node is
offline). Instead, we should treat such pieces as lost.
This also fixes a slight problem with the code that handles a broken
alias. This is not likely to happen, but if we do see an alias that is
not in the alias map, we return an error instead of nil.
Change-Id: Ib4e2e729ef0535dd7bd9ce2f621680d9f959891c
This testplan is going to cover the changes to allow for scaling audit workers. It will go over the scaling audit worker design doc found under blueprints.
Co-authored-by: Antonio Franco (He/Him) <antonio@storj.io>
Slightly modified the warning message on the "Add STORJ Tokens" modal;
And added a warning for zkSync.
Issue: https://github.com/storj/storj/issues/5418
Change-Id: I206e7b493c3fe04c69a3815a5f03bd6a07cfceae
Wire up duration measurement of observers with monkit.
Tested by attaching a SleepObserver, starting the rangedloop in storj-up
and navigating to http://<container>:11111/mon/stats. It reports the
following statistic:
completed-observer-duration,observer=*rangedlooptest.SleepObserver,scope=storj.io/storj/satellite/metabase/rangedloop duration=10.000117
Change-Id: Ief131d34001dd5d3ba1d7be6f161986e1f66440d
Before we introduced objects versions internally move operation was
always failing when under target location object exists. But then we
had only single version 1 all the time. With versions different than 1
we need to check all existing objects under target location.
To be backward compatible with our API new logic looks like this:
* if there is no object under target location use source object version
as target version
* if there are only pending objects find first free (highest) version
which could be used to move object there
* if there is committed object under target location reject move
operation
Fixes https://github.com/storj/storj/issues/5403
Change-Id: I717f3e7c42470b406287d6ec335f6f057d3fc3b5
Because it was originally intended to work on only a few pieces from
each segment at a time, and would frequently have reset its list of
online nodes, segment-verify has been taking nodes out of its
onlineNodes set and never putting them back. This means that over a long
run in Check=0 mode, we end up treating more and more nodes as offline,
permanently. This trend obfuscates the number of missing pieces that
each segment really has, because we don't check pieces on offline nodes.
This commit changes the onlineNodes set to an "offlineNodes" set, with
an expiration time on the offline-ness quality. So nodes are added to
the offlineNodes set when we see they are offline, and then we only
treat them as offline for the next 30 minutes (configurable). After that
point, we will try connecting to them again.
Change-Id: I14f0332de25cdc6ef655f923739bcb4df71e079e
This change implements the ranged loop observer to replace the audit
chore that builds the audit queue.
The strategy employed by this change is to use a collector for each
segment range to build separate per-node segment reservoirs that are
then merge them during the join step.
In previous observer migrations, there were only a handful of tests so
the strategy was to duplicate them. In this package, there are dozens
of tests that utilize the chore. To reduce code churn and maintenance
burden until the chore is removed, this change introduces a helper that
runs tests under both the chore and observer, providing a pair of
functions that can be used to pause or run the queueing function.
https://github.com/storj/storj/issues/5232
Change-Id: I8bb4b4e55cf98b1aac9f26307e3a9a355cb3f506
This ensures that empty trash and restore trash cannot run at the same
time.
Fixes https://github.com/storj/storj/issues/5416
Change-Id: I9d2e3aa3d66e61e5c8a7427a95208bb96089792d
The WithExists methods previously were not writing problematic pieces to
problem-pieces.csv. With this change, they will.
Change-Id: I51eadd3d8f4299e1efa787c9266a7aacfa525eb3
When this branch is followed, `audit.OutcomeFailure` is returned, and
`MarkNotFound()` is immediately called again (in
`(*NodeVerifier).Verify()`). Calling `MarkNotFound()` twice for the same
piece is not correct.
Change-Id: I1a2764bc32ed015628fcd9353ac3307f269b4bbd
It may help to know how much faster these methods are than the
alternative (asking nodes for each piece in turn).
Change-Id: Ieb7c963f62b662f72c84a49de8a09c065c14f782
It was ok as it was, but since we want to keep a close eye on progress
while the tool is running, it will help to have results written to the
output file immediately instead of after the buffer is full or the
program exits.
Change-Id: Ie027f05771a637afb06969ec775cd32b142b7635