also remove the continuation support from the queue, otherwise
we may end up sequential scanning the entire table to get
a few rows at the end.
then, in the core, instead of looping both to get a big enough
batch inside of the queue, as well as outside of it to ensure
we consume the whole queue, just get a single batch at a time.
also, make the queue size configurable because we'll need to
do some tuning in production.
Change-Id: If1a997c6012898056ace89366a847c4cb141a025
Replace most of old libuplink usages in testplanet. 100% migration will
be possible when we will be able to implement UploadWithClientConfig
with new libuplink.
Change-Id: I432d7d4917c7b67d46a058abd0a2a6a13f565ac4
until we do a good job of cleaning them up, we should at least
not charge or pay people for them. nodes already locally delete
expired segments.
subsumes the tests in 1112.
Change-Id: I5961185764e02f6136b3231b44ecc75a9a8832c9
by doing an indexed anti-join we're able to reduce the time to
select the pending orders by over 10x on postgres. this should
help us process pending orders much more quickly.
it probably won't do as good a job on cockroach because it does
not do an indexed anti-join and instead does a hash join after
scanning the entire consumed serials table. we should either
remove orders entirely or try to make that more efficient
when necessary.
Change-Id: I8ca0535acd21c51e74955b24c9b86d20e4f2ff9c
* change overlay.UpdateStats to allow a third audit outcome. Now it can
handle successful, failed, and unknown audits.
* when "unknown audit reputation"
(unknownAuditAlpha/(unknownAuditAlpha+unknownAuditBeta)) falls below the
DQ threshold, put node into suspension.
* when unknown audit reputation goes above the DQ threshold, remove node
from suspension.
* record unknown audits from audit reporter.
* add basic tests around unknown audits and suspension.
Change-Id: I125f06f3af52e8a29ba48dc19361821a9ff1daa1
The billing tests were flaky because some assertions ran before the
storage nodes finish their work.
A new helper function in testplanet has been added to allow to wait for
storage nodes endpoints to finish their work. This function now it's
used in the billing tests for avoiding their flakiness.
This commit closes the ticket:
https://storjlabs.atlassian.net/browse/SM-403
A part of fixing other billing tests flakiness.
Change-Id: Iacb750af435f515c04b1e1d3510a218d184c9abc
Add a test for checking that the billing:
* it doesn't include upload traffic
* it includes download traffic
Change-Id: I1655c15c1fad642f77dd210f2014b2586ae10104
Remove the check around consuming an expired serial so that we
have more time to run the migration. It does open a small race
of double spends for entries already counted and then added to
the pending queue right around when they're going to expire and
the consumed serials have already been removed, but that should
be rare if we keep the pending queue empty.
Change-Id: I000b15979b09c67751281ff675ea6c81fc9d22dc
This change adds two new tables to process orders as fast as we used
to but in an asynchronous manner and with hopefully less storage
usage. This should help scale on cockroach, but limits us to one
worker. It lays the groundwork for the order processing pipeline to
be queue rather than database driven.
For more details, see the added fast billing changes blueprint.
It also fixes the orders db so that all the timestamps that are
passed to columns that do not contain a time zone are converted to
UTC at the last possible opportunity, making it less likely to use
the APIs incorrectly. We really should migrate to include timezones
on all of our timestamp columns.
Change-Id: Ibfda8e7a3d5972b7798fb61b31ff56419c64ea35
at the end of tally iteration, in order to set the new live
accounting totals, we were iterating over all live accounting
projects. We found a bug with this when running storj-sim. If
we restarted the satellite live accounting would be cleared
because storj-sim was running the live accounting redis instance.
Since live accounting was cleared, at the end of tally, even if
it found data in projects, we would not update the live accounting
totals because we were iterating over the projects from live
accounting to do so. We now iterate over projects found from tally
in order to update live accounting
We also found that if a user deleted everything from their project,
tally would not find it and the live accounting would not be updated.
For this reason, we merge live accounting projects into tally projects
Change-Id: If0726ba0c7b692d69f42c5806e6c0f47eecccb73
if redis crashed in the middle of tally we could have a situation
where we erroneously subtract from a project total. Currently,
`latest` should never be less than `initial`
Change-Id: Ibb5ab724ac0ad4d684f7954fad7a9e061104b7df
Sometimes the upload that is supposed to fail due to excess usage
would pass. This looks to be because it's overwriting another object
uploaded earlier in the test and deleting the old pointer. If tally
happened to run after the pointer is deleted but before the current
upload reaches the live accounting check, it might pass through.
The solution is to upload to a different path each time.
Change-Id: Ie6c825b9c6eab9ed53426ae262e7997bcb6beb7f
this commit updates our monkit dependency to the v3 version where
it outputs in an influx style. this makes discovery much easier
as many tools are built to look at it this way.
graphite and rothko will suffer some due to no longer being a tree
based on dots. hopefully time will exist to update rothko to
index based on the new metric format.
it adds an influx output for the statreceiver so that we can
write to influxdb v1 or v2 directly.
Change-Id: Iae9f9494a6d29cfbd1f932a5e71a891b490415ff