Because we are saving all tallies as a single SQL statement we finally
reached maximum message size. With this change we will call SaveTallies multiple times in batches.
https://github.com/storj/storj/issues/5977
Change-Id: I0c7dd27779b1743ede66448fb891e65c361aa3b0
Tallies are now created for buckets with no objects. Previously, the
bucket tally collector skipped empty buckets since it created tallies
only using information from the objects table. Methods that used
bucket tallies when calculating usage costs would return incorrect
results because of this.
Change-Id: I0b37fe7159a11cc02a51562000dad9258555d9f9
This change removes the use of the objects loop for calculating bucket
tallies. It has been superseded by a custom query.
Change-Id: I9ea4633006c9af3ea14d7de40871639e7b687c22
We will remove segments loop soon so we need first to move
Segment definition to rangedloop package.
https://github.com/storj/storj/issues/5237
Change-Id: Ibe6aad316ffb7073cc4de166f1f17b87aac07363
With this change we are replacing parsing code with existing go-redis
util.
We also switch redis client to version 9.
Change-Id: Ie4a651e3ae6960e68958c690873925d319b70e10
We want to eliminate usages of LoopSegmentEntry.Pieces, because it is
costing a lot of cpu time to look up node IDs with every piece of every
segment we read.
In this change, we are eliminating use of LoopSegmentEntry.Pieces in the
node tally observer (both the ranged loop and segments loop variants).
It is not necessary to have a fully resolved nodeID until it is time to
store totals in the database. We can use NodeAliases as the map key
instead, and resolve NodeIDs just before storing totals.
Refs: https://github.com/storj/storj/issues/5622
Change-Id: Iec12aa393072436d7c22cc5a4ae1b63966cbcc18
Metainfo needs to know rate and burst limit to be able to limit users
requests. We made cache for per project limiter but to make single
instance we need to know about limits. So far we were doing direct DB
call to get rate/burst limit for project but it's generating lots of
DB requests and can be easily cached as we even have project limit cache.
This change extends project limit cache with rate/burst limit and starts
using this change while creating project limiter instance for metainfo.
Because data size kept in project limit cache is quite small this change
also bumps a bit default capacity of the cache.
Fixes https://github.com/storj/storj/issues/5663
Change-Id: Icb42ec1632bfa0c9f74857b559083dcbd054d071
We just fixed case were project limit cache was not used properly. This
is test case to cover that fix.
Change-Id: Iee467f0a46836860a14ab6238a9842ffbf54ed4c
It looks that at some point we broke how project limits cache is used
and we were missing cache in most critical paths (upload/download).
This is fix for this issue.
I also adjusted cache methods naming.
Change-Id: Ic98372779a39365d0920fe3943f1f7a68b064173
Commit fb59974 disabled usage price overrides because of a failing
test. This change reenables it while resolving the issue that caused
the test to fail.
The previous version of the test passed Gerrit verification and was
merged, but it failed for the primary Jenkins pipeline after merge.
This is due to a difference in how the Jenkins build runs Cockroach
and Postgres for each pipeline.
This commit rewrites the test to be safe for concurrent execution by
ensuring any mutable variables are defined within each test so that
shared state across tests is reduced.
Change-Id: Ia4566c9cd2d698afdb2caa4b7e2808b17e18de4e
Project usage price overriding has been removed because it produces
incorrect results when tested. It should not be re-implemented until
the issues it causes are resolved.
Change-Id: Ic92eff374c9af4fea3bf32782a72303a7978b055
This code is essentially replacement for eestream.CalcPieceSize. To call
eestream.CalcPieceSize we need eestream.RedundancyStrategy which is not
trivial to get as it requires infectious.FEC. For example infectious.FEC
creation is visible on GE loop observer CPU profile because we were
doing this for each segment in DB.
New method was added to storj.Redundancy and here we are just wiring it
with metabase Segment.
BenchmarkSegmentPieceSize
BenchmarkSegmentPieceSize/eestream.CalcPieceSize
BenchmarkSegmentPieceSize/eestream.CalcPieceSize-8 5822 189189 ns/op 9776 B/op 8 allocs/op
BenchmarkSegmentPieceSize/segment.PieceSize
BenchmarkSegmentPieceSize/segment.PieceSize-8 94721329 11.49 ns/op 0 B/op 0 allocs/op
Change-Id: I5a8b4237aedd1424c54ed0af448061a236b00295
Additional elements added:
* monkit metric for observers methods like Start/Fork/Join/Finish to
be able to check how much time those methods are taking
* few more logs e.g. entries with processed range
* segmentsProcessed metric to be able to check loop progress
Change-Id: I65dd51f7f5c4bdbb4014fbf04e5b6b10bdb035ec
Add notifications for free account limits for segment usage
and update to follow the figma designs.
Issue: https://github.com/storj/storj/issues/5482
Change-Id: I8a2fe38d609d53e09bf5074484cedc343223bffd
To improve deletion of old entries in project_bandwidth_daily_rollup
we need index on `interval_day` column which is used to find those old
entries.
As an addition we are changing interval how often deletion is executed
from 7 to 1 day. We would like to have smaller portion of data to
delete.
Fixes https://github.com/storj/storj/issues/5465
Change-Id: Ie18ebe859887b93d6e4e6065a61fb9214c7ad27a
This change causes the bucket's partner info to be used rather than the
user's when calculating project usage prices. This ensures that users
who own differently-partnered buckets will be charged correctly for
usage based on the specific bucket they are utilizing.
according to the bucket's partner.
Related to storj/storj-private#90
Change-Id: Ieeedfcc5451e254216918dcc9f096758be6a8961
Add node tally ranged loop observer and partial.
Add node tally randed observer to range loop peer.
Add config flag to select which loop to use for node tally.
Update satellite core to use segement/ranged loop based on a flag.
Duplicate existing node tally test but using ranged loop.
Change-Id: I6786f1a16933463fab5f79601bf438203a7a5f9e
bucket_bandwidth_rollups table
We have performance problems with updating bucket_bandwidth_rollups. To
improve situation we can stop storing allocated bandwidth in this table.
This should reduce large number of updates which are comming from
metainfo endpoints, repair workers and audit.
Next step will be to drop `allocated` column completely from
bucket_bandwidth_rollups.
Allocated GET bandwidth is all we need and we are keeping it in
bucket_bandwidth_rollups table.
Change-Id: Ifdd26a89ba8262acbca6d794a6c02883ad0c0c9b
We have a bug where if number of buckets in the system will be
multiplication of batch size (2500) then loop that is going over
all buckets can run indefinitely.
Fixes https://github.com/storj/storj/issues/5374
Change-Id: Idd4d97c638db83e46528acb9abf223c98ad46223
We added alternative way to calculate bucket tallies for accounting and
now it's tested and we will enable it by default.
CollectBucketTallies was extended to support overriding current time
to be able to test handling expired objects.
Change-Id: I738b99a33fd2e086245f92d874c1cbb806e834c0
since amount of objects is growing and looping through all of them
starts taking lot of time, we are switching for SQL query to do it
in chunks of tallies per bucket. 2nd part of issue fix.
Closes https://github.com/storj/team-metainfo/issues/125
Change-Id: Ia26bcac0a7e2c6503df9ebbf4817a636841d3284
We removed monkit call from "object" method because it was using
too much cpu and was visible on cpu profile but we should first try
optimized version of this call.
Change-Id: Ib76d8a2968a704ce47235c6dac6edad4e40bde48
This is another change to remove monkit calls from fast methods. Those
calls are visible in CPU profiles.
Change-Id: Ib3beba0dca6a6d93c3342b0994c580f78bbdd50b
Benchmark against 'main':
name old time/op new time/op delta
RemoteSegment/Cockroach/multiple_segments-8 5.56µs ± 5% 0.69µs ±12% -87.57% (p=0.008 n=5+5)
name old alloc/op new alloc/op delta
RemoteSegment/Cockroach/multiple_segments-8 2.72kB ± 0% 0.00kB ~ (p=0.079 n=4+5)
name old allocs/op new allocs/op delta
RemoteSegment/Cockroach/multiple_segments-8 50.0 ± 0% 0.0 -100.00% (p=0.008 n=5+5)
Change-Id: I20527fb576cd81db667a81929fa95b810ee11b14
While investigating high memory consumption on a Redis instance, it was found that much of the memory consumed was from cached redis scripts.
Redis caches scripts based on the hash of the script itself. This change uses ARGV to reduce the number of cached scripts.
Change-Id: Ia878fe81552a3067f09e60c44bb4ace25c6b5f9a
We made optimization for segment loop observers to avoid
heavy monkit initialization on each call. It was applied to very
often executed methods. Unfortunately we used wrong monkit
method to track function times. Instead mon.Task we used
mon.Func().
https://github.com/spacemonkeygo/monkit#how-it-works
Change-Id: I9ca454dbd828c6b43ba09ca75c341991d2fd73a8
Adding an interval_end_time column to the accounting_rollups table
to keep the last interval_end_time for each daily storage tallies.
Updates https://github.com/storj/storj/issues/4178
Change-Id: If7a8210c5e9fe2fc9df84b137a8b6e3db2471c58
removed segment limit validation and checks in metainfo endpoint and accounting/projectusage
since feature is live and has always has segment limitation now
Resolves: https://github.com/storj/storj/issues/4470
Change-Id: I8cf87cbbc40ac61262f9f05e52573d3ae6410611
Previously there was no realtime administration of the storage usage
during copies. Now there is.
Closes https://github.com/storj/storj/issues/4719
Change-Id: I0d536bf551d16208116c3aceac89ed590ec473bf
Currently we have a significant number of tallies that need to be
deleted together. Add a limit (by default 10k) to how many will
be deleted at the same time.
Change-Id: If530383f19b4d3bb83ed5fe956610a2e52f130a1
It seems the tests relied on time.Now(), which might cause some
discrepancies in calculations. Use a fixed time.Now() rather than
recalculating.
As a sidefix, remove "Test" prefix from t.Run. These are unnecessary.
Change-Id: I1de903fcf0fcf46fc8e3acf2463e17239b8e3cc6
Satellite caches the project bandwidth in Redis when it doesn't have it
because was not set or the key expired, however, it doesn't perform the
check and set if not exists in a transaction. It also uses the increase
function which increases the value if it exists otherwise it sets it.
This provokes that multiple concurrent request to the same project may
increase the total project by multiples of the bandwidth usage
registered in the database rather than setting it because they may check
if the key exists before any other has executed the increase and then
the first one executing it will set the value but the others will
increased causing that Redis has a wrong bandwidth usage value which is
N magnitude of the real one and making the satellite to deny the
downloading if it surpasses the project limit.
This commit changes the "update"" project bandwidth usage by an "insert"
but using a Redis function that only sets the value if the key doesn't
exists for solving the increase issue but also not overriding the value
due to may contain updates of other downloading requests which aren't
already registered in the DB.
Change-Id: I33e2fe462930b2fdb4061fc94002bd3544476f94
TestRollupNoDeletes is very flaky (passes locally but fails in the main branch build).
The exact reason is not clear, but stopping the loop seems to be async, the following lines may not stop the loops immediatelly which is a potential problem:
```
satellitePeer.Accounting.Rollup.Loop.Pause()
satellitePeer.Accounting.Tally.Loop.Pause()
```
Fortunatelly these test check only the database interfaces. Instead of testplanet.Run we can run only satellitedbtest.Run which is faster and more predictable (no background loops).
Other potential problem: comment claims that the default of DeleteTallies is false:
```
// In testplanet the setting config.Rollup.DeleteTallies defaults to false.
```
But it seems to be true (rollup.go):
```
DeleteTallies bool `help:"option for deleting tallies after they are rolled up" default:"true"`
```
This is also fixed in the patch (as we need set it explicit), but TBH it can be fixed with testplanet, too.
Change-Id: Id7ec80d5c069bed2c556f4d001c71aa23fc5af23
We don't have metric to track how many pending objects we have in the
system. This change is using tally objects loop to collect pending
objects per bucket and at the end its combining all buckets values
into single metric for all pending objects in a system.
Change-Id: Iac7a6bfb48854f7e70127d275ea8fdd60c4eb8b7
Recently we applied this optimization to metrics observer and time
used by its method dropped from 12m to 3m for us1 (220m segments).
It looks that it make sense to apply the same code to all observers.
Change-Id: I05898aaacbd9bcdf21babc7be9955da1db57bdf2