Satellite caches the project bandwidth in Redis when it doesn't have it
because was not set or the key expired, however, it doesn't perform the
check and set if not exists in a transaction. It also uses the increase
function which increases the value if it exists otherwise it sets it.
This provokes that multiple concurrent request to the same project may
increase the total project by multiples of the bandwidth usage
registered in the database rather than setting it because they may check
if the key exists before any other has executed the increase and then
the first one executing it will set the value but the others will
increased causing that Redis has a wrong bandwidth usage value which is
N magnitude of the real one and making the satellite to deny the
downloading if it surpasses the project limit.
This commit changes the "update"" project bandwidth usage by an "insert"
but using a Redis function that only sets the value if the key doesn't
exists for solving the increase issue but also not overriding the value
due to may contain updates of other downloading requests which aren't
already registered in the DB.
Change-Id: I33e2fe462930b2fdb4061fc94002bd3544476f94
TestRollupNoDeletes is very flaky (passes locally but fails in the main branch build).
The exact reason is not clear, but stopping the loop seems to be async, the following lines may not stop the loops immediatelly which is a potential problem:
```
satellitePeer.Accounting.Rollup.Loop.Pause()
satellitePeer.Accounting.Tally.Loop.Pause()
```
Fortunatelly these test check only the database interfaces. Instead of testplanet.Run we can run only satellitedbtest.Run which is faster and more predictable (no background loops).
Other potential problem: comment claims that the default of DeleteTallies is false:
```
// In testplanet the setting config.Rollup.DeleteTallies defaults to false.
```
But it seems to be true (rollup.go):
```
DeleteTallies bool `help:"option for deleting tallies after they are rolled up" default:"true"`
```
This is also fixed in the patch (as we need set it explicit), but TBH it can be fixed with testplanet, too.
Change-Id: Id7ec80d5c069bed2c556f4d001c71aa23fc5af23
We don't have metric to track how many pending objects we have in the
system. This change is using tally objects loop to collect pending
objects per bucket and at the end its combining all buckets values
into single metric for all pending objects in a system.
Change-Id: Iac7a6bfb48854f7e70127d275ea8fdd60c4eb8b7
Recently we applied this optimization to metrics observer and time
used by its method dropped from 12m to 3m for us1 (220m segments).
It looks that it make sense to apply the same code to all observers.
Change-Id: I05898aaacbd9bcdf21babc7be9955da1db57bdf2
modify tally to calculate how we need to update segments in the live accounting
cache with UpdateProjectSegmentUsage method. adjust accounting.ExceedsUploadLimits
to use only cache for segment validation, if cache returns 0 or key not found then
we shouldn't reject such project as its possible that we won't have this value before first object iteration
https://github.com/storj/storj/issues/4744
Change-Id: I32c22d7fb71236e354653ba8719e029fc71f04c7
Doing server-side copy operation should not affect user monthly
bandwidth. This test covers that case.
Fixes https://github.com/storj/storj/issues/4717
Change-Id: I84ffab96b84851f395ea3a34d88f7dba424ec440
Initialy we wanted to put segment usage into cache values retrieved
directly from metabase but it cause performance issues. Now we
will be collecting segment usage from tally object loop and those
values will be put into a cache but becuse we cannot get those values
on demand we shouldn't expire cache value at all because objects
loop requires sustencial amount of time to be executed.
Part of solution for https://github.com/storj/storj/issues/4744
Change-Id: I3b37e23badeecebed0c95064156e85b38038bfe2
Fix for this customer issue
https://github.com/storj/customer-issues/issues/34
By this change we fetch bucket usage since its creation instead of using project's createdAt timestamp.
Change-Id: Ic0ea5d169056a5bd64ed143d13954d794da6e1d2
return storage and segment totals as a single result, instead of returning only storage
and bandwidth and segment values are filtered out, https://github.com/storj/storj/issues/4744
Change-Id: I624d67ed5205ae21ecd5a2f39775f63ed042e629
We were already able to override (or not) metadata with this method
but to be explicit we are introducting new option to control storing
metadata with object. Separate option should be less error prone.
https://github.com/storj/team-metainfo/issues/105
Change-Id: I4c5bce953a633a0009b05c5ca84266ca6ceefc26
We implemented server-side copy feature and we would like to
confirm that it is not affecting accounting/tally service.
Resolves https://github.com/storj/storj/issues/4697
Change-Id: I3944ea52c0acc68107ec15c1911750dc7d947501
Added new endpoint to get project's single bucket usage rollup.
Extended generation code to handle service method args.
Change-Id: Ief768632a801c047c66e0617056fbd7b30427b33
Added new projectaccounting query to get project's single bucket usage rollup.
Added new service method to call new query.
Added implementation for IsAuthenticated method which is used by new generated API.
Change-Id: I7cde5656d489953b6c7d109f236362eb465fa64a
Finished implementing queries for both bandwidth and storage using pgx.Batch.
Fixed CSP styling issue.
Change-Id: I5f9e10abe8096be3115b4e1f6ed3b13f1e7232df
Existing tests are checking errors by comparing error message.
We plan to expose storage/segment limit errors as a public uplink
API but before that will happen we need to update tests to handle
both cases as new uplink error will have a bit different error message
than current tested error.
https://github.com/storj/uplink/issues/77
Change-Id: Id2706323c60d050d96752e66e859d4ec051a69b9
Implemented endpoint and query to get bandwidth chart data for new project dashboard.
Connected backend with frontend.
Storage chart data is mocked right now.
Change-Id: Ib24d28614dc74bcc31b81ee3b8aa68b9898fa87b
For better performance we should use AOST for getting
data where being up to data it not very crucial. We don't
care about small differences for segment limit calculation.
Change-Id: I9b2d4f2bd15ebc9d1c46bc84dd51a2e9d9231506
Additional test case to cover situation when cache value
expired and we need to get this value directly from DB.
Change-Id: I07bda67e19333e09a567104ce70f112fd47a7845
We had two problems here. First was how we were handling
errors message from GetProjectSegmentUsage. We were always
returning error, even for ErrKeyNotFound which should be used
to refresh value in cache and not propagated out of method.
Second, we were using wrong value to update cache. We used
current value from cache which obviously it's not what we intend.
Fixes https://github.com/storj/storj/issues/4389
Change-Id: I4d7ca7a1a0a119587db6a5041b44319102ef64f8
We need to combine methods from accounting.Service (ExceedsStorageUsage and ExceedsSegmentUsage)
to run checks concurrently.
Resolves https://github.com/storj/team-metainfo/issues/73
Change-Id: I47831bca92457f16cfda789da89dbd460738ac97
We want to be able to limit the number of segments per project for users.
To limit this we need to check limit value associated with project
and value of used segments already in BeginMoveObject, BeginMoveSegment
and increment cache segments usage after each CommitSegment call.
Resolves https://github.com/storj/team-metainfo/issues/1
Change-Id: I6290e67c095a174b9d101c4521802d9bfe0453b8
Batching of the order submissions can lead to combining the allocated
traffic totals for two completely different time windows, resulting
in incorrect customer accounting. This change will group the batched
order submissions by projectID as well as time window, leading to
distinct updates of a buckets bandwidth rollup based on the hour
window in which the order was created.
Change-Id: Ifb4d67923eec8a533b9758379914f17ff7abea32
We want to set maximum number of segments per project. This change will
add functionality to get number of segments currently used by project.
To avoid often DB calls segment cound will be cached and refreshed every
few minutes.
Change-Id: I2ecb6484f5afc3875c0e0dfaea360e8872f9d196
Exposes functionality to get and update project segment
limit. It will be used to limit number of segments per project
while uploading object.
Change-Id: I971d48eebb4e7db8b01535c3091829e73437f48d
Added new query to get project object and segment count.
Added appropriate object and segment count view for new project dashboard.
Change-Id: I69a2e55442f318c51dc365c0c578b964f2f06c7f
Even though we want to start charging segment fee instead of object fee,
it's hard for users to understand what a segment is. This PR adds the
object count back in the UI alongside with segment count to help address
the issue.
Change-Id: I92eb42c769d350eba68a72443deffec5c278359c
Populate the egress_dead column for taking into account allocated bandwidth that can be removed because orders have been sent by the storage nodes. The bandwidth not used in these orders can be allocated again.
Change-Id: I78c333a03945cd7330aec052edd3562ec671118e
At some point we moved metabase package outside Metainfo
but we didn't do that for satellite structure. This change
refactors only tests.
When uplink will be adjusted we can remove old entries in
Metainfo struct.
Change-Id: I2b66ed29f539b0ec0f490cad42c72840e0351bcb
Bucket tally calculation will be removed from metaloop and will
use metabase objects iterator directly.
At the moment only bucket tally needs objects so it make no sense
to implement separate objects loop.
Change-Id: Iee60059fc8b9a1bf64d01cafe9659b69b0e27eb1
Current tally is calculating storage both for buckets and
storage nodes. This change is moving nodes storage
calculation to separate service that will be using
segment loop.
Change-Id: I9e68bfa0bc751c82ff738c71ca58d311f257bd8d
When a user adds a credit card, switch them to the paid tier and update
their projects with new bandwidth/storage limits. New projects for the
paid tier user will also have the updated limits.
The new limits are:
* storage per project - 50 GB free/25 TB paid
* bandwidth per project - 50 GB free/100 TB paid
Change-Id: I7d6467d077e8bb2bbe4bcf88ab8d75490f83165e
We want to calculate bucket tally only from iterating objects.
Object currently has an info about totals for bytes and segments.
We need to adjust tallies to keep those totals. Older entries will
be untouched and code will use totals only if available. Change
is adding columns for totals to bucket_storage_tally table and
is adding general handling for them.
Next step is to start using total columns instead of inline/remote.
This will be done with next change.
Change-Id: I37fed1b327789efcf1d0570318aee3045db17fad
We want to use StreamID/Position to identify injured
segment. As it is hard to alter existing injuredsegments
table we are adding a new table that will replace existing
one. Old table will be dropped later.
Change-Id: I0d3b06522645013178b6678c19378ebafe485c49
Adding AS OF SYSTEM TIME to query that is calculating project bandiwdth.
As an addition method for setting interval is added as test doesn't
work well with default interval.
Change-Id: Id1e15be4f6afff13b9dc2b7f595e2edb6de28db9
The redis key associated to bandwidth usage depended on the current month.
This change makes it depends also on the day, so that we update bandwidth usage
daily to take into account changes associated to expired but not used allocated bandwidth.
It also add a test to to check that the allocated but not settled bandwidth is not counted after 2 days.
Change-Id: Iee9dbe3517cc3b9825438360b276a07a43dfbc64
Satellites set their configuration values to default values using
cfgstruct, however, it turns out our tests don't test these values
at all! Instead, they have a completely separate definition system
that is easy to forget about.
As is to be expected, these values have drifted, and it appears
in a few cases test planet is testing unreasonable values that we
won't see in production, or perhaps worse, features enabled in
production were missed and weren't enabled in testplanet.
This change makes it so all values are configured the same,
systematic way, so it's easy to see when test values are different
than dev values or release values, and it's less hard to forget
to enable features in testplanet.
In terms of reviewing, this change should be actually fairly
easy to review, considering private/testplanet/satellite.go keeps
the current config system and the new one and confirms that they
result in identical configurations, so you can be certain that
nothing was missed and the config is all correct.
You can also check the config lock to see what actual config
values changed.
Change-Id: I6715d0794887f577e21742afcf56fd2b9d12170e
Replace GetProjectAllocatedBandwidth by GetProjectBandwidth which calculates
used bandwidth from allocated and settled bandwidth recorded in the
project_bandwidth_daily_rollups table.
For each day in the month, if the allocated bandwidth is expired, it uses the
settled bandwidth for computing used bandwidth.
Change-Id: Ife723c6d5275338f470619631acb25930d39ac3c
project_bandwidth_daily_rollups table
We want to calculate used bandwidth better so we need to calculate it
from allocated and settled bandwidth. To do this we need first populate
this new table.
https://storjlabs.atlassian.net/browse/PG-56
Change-Id: I308b737bf08ee48ce4e46a3605697ab2095f7257
errs.Class should not contain "error" in the name, since that causes a
lot of stutter in the error logs. As an example a log line could end up
looking like:
ERROR node stats service error: satellitedbs error: node stats database error: no rows
Whereas something like:
ERROR nodestats service: satellitedbs: nodestatsdb: no rows
Would contain all the necessary information without the stutter.
Change-Id: I7b7cb7e592ebab4bcfadc1eef11122584d2b20e0
Initially there were pkg and private packages, however for all practical
purposes there's no significant difference between them. It's clearer to
have a single private package - and when we do get a specific
abstraction that needs to be reused, we can move it to storj.io/common
or storj.io/private.
Change-Id: Ibc2036e67f312f5d63cb4a97f5a92e38ae413aa5
cache is really common variable and type name and we have already used
the package name alias in multiple places.
Change-Id: I6435785b7549b541d533de59ec94557b9bd11e04
Currently the loop handling is heavily related to the metabase rather
than metainfo.
metainfo over time has become related to the "public API" for accessing
the metabase data.
Currently updates monkit.lock, because monkit monitoring does not handle
ScopeNamed correctly. Needs a followup change to monitoring check.
Change-Id: Ie50519991d718dfb872ec9a0176a82e732c97584
metabase has become a central concept and it's more suitable for it to
be directly nested under satellite rather than being part of metainfo.
metainfo is going to be the "endpoint" logic for handling requests.
Change-Id: I53770d6761ac1e9a1283b5aa68f471b21e784198
Check that the bloom filter creation date is earlier than the
metainfo loop system time used for db scanning.
Change-Id: Ib0f47c124f5651deae0fd7e7996abcdcaac98fb4
Rename the functions that are prefixed with 'New' which connect with
Redis by 'Open' to make clear that they perform network operations.
Change-Id: I1351e89a642e8e2c2586626646315ad0fb2c6242
This is one step for implementing the free tier:
* Change the default project limit from 10 to 3
* Move storage and bandwidth project usage limits from the metainfo
package to the console package (otherwise there is a cyclical
dependency, and metainfo doesn't use these values anyway)
* Change the default storage usage limit per project from 500gb to 50gb
* Change the default bandwidth usage limit per project from 500gb to 50gb
* Migrate the database so that old users and projects continue to have
the old defaults (10 projects/500gb usage)
Change-Id: Ice9ee6a738bc6410da18c336c672d3fcd0cab1b9
Update the Redis dependency to use the last major production version.
The last version accepts a context parameter in all the network methods
so it allows us to pass it through them.
Change-Id: I34121b2ec3c2728602115c724933ad24c9e6e4fd
Move a specific interface & types used for testing to be a private
subpackage with a name that clearly identifies it for testing purpose.
Change-Id: I646cf3b6f0a3b518a6f9a125998dc5a02df02db6
For metainfo loop we need only some of Segment fields. By removing some of them we will reduce memory consumption during loop.
Change-Id: I4af8baab58f7de8ddf5e142380180bb70b1b442d
We want to read from DB only those fields that are used by metainfo loop so we need to remove most of fields from LoopObjectEntry.
Change-Id: I14ecae288f631dc0ff54f4c560ce43b736eccdcf
TestProjectUsage_ResetLimitsFirstDayOfNextMonth seems to be flaky.
It doesn't seem to reproduce easily, however it should flush orders from
storagenodes to satellites, that way the orders chore flushing on the
satellite makes sense.
Change-Id: I586d9d8ce10b5f6320e79324f6128b4f8c2cac5f
Delete satellite order methods and DB tables which aren't used anymore
after we have done a refactoring on the orders to stuck bucket
information in the orders' encrypted metadata.
There are also configuration parameters and a satellite chore that
aren't needed anymore after the orders refactoring.
Change-Id: Ida3682b95921df70792284b42c96d2508bf8ca9c
The rollup archiver chore moves bucket bandwidth rollups and
storagenode rollups that are older than a given duration
to two new archive tables.
Change-Id: I1626a3742ad4271bc744fbcefa6355a29d49c6a5
When we observed the value for total piecesizes stored in the network,
we were doing it after converting them to byte-hours, rather than using
the actual piece sizes. This fixes that issue.
Change-Id: I1564d21b519f70eb59f298d97dbd777baf127723
Tally shouldn't abort its cycle if the accounting cache return an error
because it isn't an essential requirement to update it.
Change-Id: I78fd2bd9cf253ddedfb9ada80c0fa2ddf438f647
We have to adapt the live accounting to allow the packages that use it
to differentiate about errors for being able to ignore them and make our
satellite resilient to Redis downtime.
For differentiating errors we should make changes in the live accounting
but also in the storage/redis.Client, however, we may need to do some
dirty workarounds or break other parts of the implementation that
depends on it.
On the other hand we want to get rid of the storage/redis.Client because
it has more functionality that the one that we are using and some
process has been started to remove it.
Hence, we have refactored the live accounting to directly use the Redis
client library for later on (in a future commit) adapt the satellite for
being resilient to Redis downtime.
Last but not least, a test for expired bandwidth keys have been added
and with it a bug was spotted and fix it.
Change-Id: Ibd191522cd20f6a9a15e5ccb7beb83a678e530ff
this change isn't the real fix. it's just ignoring the problem.
i don't know what the real fix is. is the problem with the test, or
is there actually a problem with the rollup code?
Change-Id: I552bdd947deadc212cc56efc5f818942b9827126
Remove a declared variable that's set by never read nor passed to any
function so it's unused code.
Change-Id: I8daf9d1f71d29ab39d7a80011d1b4813ada1c67d
We want to stop using the serial_numbers table in satelliteDB. One of the last places using the serial_numbers table is when storagenodes settle orders, we look up the bucket name and project ID from the serial number from the serial_numbers table.
Now that we have support to add encrypted metadata into the OrderLimit, this PR makes use of that and now attempts to read the project ID and bucket name from the encrypted orderLimit metadata instead of from the serial_numbers table. For backwards compatibility and to ensure no errors, we will still fallback to the old way of getting that info from the serial_numbers table, but this will be removed in the next release as long as there are no errors.
All processes that create orderLimits must have an orders.encryption-keys set. The services that create orderLimits (and thus need to encrypt the order metadata) are the satellite apiProcess, the repair process, audit service (core process), and graceful exit (core process). Only the satellite api process decrypts the order metadata when storagenodes settle orders. This means that the same encryption key needs to be provided in the config for the satellite api process, repair process, and the core process like so:
orders.include-encrypted-metadata=true
orders.encryption-keys="<"encryptionKeyID>=<encryptionKey>"
Change-Id: Ie2c037971713d6fbf69d697bfad7f8b672eedd66
this change tries really hard to never have all of the storage node
rollups in memory at the same time, up until the rollups are actually
getting summed together.
Change-Id: If67f49e7d71106798d996a6850b3e48671bd9e18
MetadataSize can slightly vary and checking for exact value makes
difficult to change what's being encoded in metadata.
Change-Id: I5f1ade41bc26d115e6743367ee35cf1ba74795c9