WHAT:
people who sign up on US2 are not redirected to verifying page. From now on we have to set verify URL to make redirect happen
WHY:
user experience
Change-Id: I96c51a2c4f9cb6376cbfea639675b32918b58bee
This PR removes all back-end related referral program code including the
marketing portal.
We will have a separate PR for front-end code and database migration to
drop `offers` and `usercredits` table
Change-Id: If59f952cddfe0558a7dc03a0eac7cc1081517f88
We will add a cache to nodes, so using completely random nodes wouldn't
show the actual performance.
Change-Id: I94f18283712812f05f7795efd3c7cf57499fa52c
We need to keep an inmemory cache to avoid lookups into aliases table.
This adds the inmemory state of the cache.
Change-Id: Ief2b9bb19e10b46839b9208472dfc3035eb49af3
This is first step in supporting node aliases. It adds a table
that automatically assigns aliases to nodes inserted into the table.
Change-Id: Ibdf40097c3c1e5b371500203f8db203505a48adc
This ensures the caveats are unique even when they contain the same
permissions and will result in unique macaroons. This is important to
ensure revocation doesn't impact more macaroons than intended.
Change-Id: I6354edd0119f2d85eaf580f2d1926a3de9151b88
Remove the orders Settlement endpoint because it isn't used and it was
already always returning an error.
Change-Id: I81486fbe7044a1444182173bc0693698ee7cfe7e
These changes are independently tracked on
https://github.com/storj/storj/tree/jt/migration-reorder
The point of this is to make the distributed column
migration, needed for SNO invoice generation, the very
next one, so we can release it as a point release.
Change-Id: I26e1c03629c4f079b9ad12485e2b71a715d82b3b
allow disabling tcp/quic
In order to have more control of a server so that we can
simulate connection failures in `testplanet`, this PR changes
quic.Listener to accept an existing UDPConn instead of relying on the
quic-go library to create the UDPConn.
This PR also adds two flags on the `server.Config` struct to allow
enabling/disabling tcp/tls listener and quic listener. By default, they
are both set to true.
- `DisableTCPTLS`: internal flag, disables tcp/tls listener.
- `DisableQUIC`: hidden flag, disables quic listener
By making the `DisableQUIC` a hidden flag, it allows storagenode operators to
have the ability to disable quic traffic in case their set up can't work
with udp traffic.
Change-Id: I853b12435d988b9c41ad9b873fd57480d792e378
this changes from a satellite error to a local encryption
error with the upcoming permissions changes where we only
include keys for the paths that are allowed.
Change-Id: I7aa37cfbaee31a1e54afe0423b283b9f41d9345f
Limit bucket name lookup to date range of the calling methods since we only need distinct bucket names for that time period.
Adds new index and removes an index specific to project ID since it is no longer needed.
Change-Id: Ic07bbfb1c32280e0c0e39f8da020b284e1e5d974
It's impossible to time correctly this check. The segment may expire
just at the time we upload the repaired pieces to new storage nodes.
They will reject this as expired and the repair will fail.
Also, we penalize storage nodes with audit failure only if they fail
piece hash verification, i.e. return incorrect data, but only if they
have already deleted the piece.
So, it would be best if the repair service does not care about object
expiration at all. This is a responsibility of another service.
Removing this check will also simplify how we migrate this code
correctly to the metabase.
Change-Id: I09f7b372ae2602daee919a8a73cd0475fb263cd2
Fix an issue due to copy-paste problem that made that the Graceful Exit
test to be flaky.
The test uses a time created at the beginning of the test for avoiding
to get undeterministic time differences due to the fact of the response
time variation by the DB queries, however some part of the test were
using a current time rather than this base time, so they have been
addressed.
Change-Id: I4786f06209e041269875c07798a44c2850478438
We need this method to fix repairing pending objects. In another PR, it
will replace the GetObjectLatestVersion + GetSegmentByPosition calls
that are currently executed.
Change-Id: I4c5c2ab604edf898452b6fd21b86d4d3f970ce79
Delete satellite order methods and DB tables which aren't used anymore
after we have done a refactoring on the orders to stuck bucket
information in the orders' encrypted metadata.
There are also configuration parameters and a satellite chore that
aren't needed anymore after the orders refactoring.
Change-Id: Ida3682b95921df70792284b42c96d2508bf8ca9c
Add a command to the satellite for cleaning up the Graceful Exit (a.k.a
GE) transfer queue items of nodes that have exited.
The commit adds to the GE satellite DB a couple of new methods, and its
corresponding test, for performing the operations of the new command.
Change-Id: I29a572a59689d63b24990ac13c52e76d65aaa917
using redash i manually checked that the only times the sum of
the payments does not match the paid column is for 2020-12 and
if it does not match then there are no payments.
Change-Id: I71ce0571de7e38e21548d7d6757b25abc3bfa781
The rollup archiver chore moves bucket bandwidth rollups and
storagenode rollups that are older than a given duration
to two new archive tables.
Change-Id: I1626a3742ad4271bc744fbcefa6355a29d49c6a5
This index is obsolete and duplicates a similiar (project_id, name)
index on the same table.
Moreover, it might confuse CockroachDB which of the two index to use,
which may might affect DB performance.
Change-Id: If8d1df8347714942cea9dca82864ba5f4973bed3
Comparing the result from a subquery with the "IN" operator instead of
"=" makes a huge difference in the execution time of the SQL query on
CockroachDB.
Change-Id: I76e8f75a7bc95951667345d1ed9bd60f9aef3edb
When we observed the value for total piecesizes stored in the network,
we were doing it after converting them to byte-hours, rather than using
the actual piece sizes. This fixes that issue.
Change-Id: I1564d21b519f70eb59f298d97dbd777baf127723
We wanto have single uplink branch for standard and multipart-upload satellite but some tests are using helper methods from multipart. This change adds methods used by uplink test.
Change-Id: I82352ed56674ff7e8743b58061ba594018e78e3b
We are checking if satStreamID is created in the last 48 hours. If it is
older we treat is as expired an fail to unmarshal it.
Since the satStreamID is also the Upload ID for multipart uploads, this
means that all calls fail for multipart uploads older than 48 hours.
Even aborting old multipart uploads is not possible.
To resolve this issue, we should stop checking satStreamID for
expiration.
Change-Id: Ieaf53ed3cd800cdd08843676c2d9490b007d962e
Parts that have segment index gaps should be treated similarly how
multipart objects are, because direct calculation of the segment does
not work.
Change-Id: I2717eac36f085b5100f3d600fcf0ce056202a9eb
CreateGetOrderLimits is not used anymore because we have CreateGetOrderLimits2. We need to remove old method and fix name of second.
Change-Id: I59148b8d28fc9dbab7d452c884319125a02745d1
In some cases we need to set encryption parameters later, with CommitObject method. This change makes Encryption optional with BeginObject* methods and mandatory with CommitObject if not set earlier.
Change-Id: I812c9b0e8fc213ca32d4758e0e68227e0e9bdd32
In the past we were storing fixed segment size with StreamInfo, encrypted in metadata. The value was unencrypted size of segment, not encrypted one.
Change-Id: Id6b18440c674223eabbb152b1636c83e1ab6462c
Add ProjectsCursor type for pagination
Add PageCount, CurrentPage, and TotalCount ProjectsPage
This allows us to mimic the logic of GetBucketTotals and the
implementation of BucketUsages in graphql for the new ProjectsByOwnerID
functionality.
Change-Id: I4e1613859085db65971b44fcacd9813d9ddad8eb
Full scope:
private/testplanet,satellite/{overlay,satellitedb}
Description:
In most cases, downtime tracking with audits will eventually lead
to DQ for nodes who are unresponsive. However, if a stray node has no
pieces, it will not be audited and will thus never be disqualified.
This chore will check for nodes who have not successfully been contacted
in some set time and DQ them.
There are some new flags for toggling DQ of stray nodes and the timeframes
for running the chore and how long nodes can go without contact.
Change-Id: Ic9d41fdbf214736798925e728245180fb3c55615
Tally shouldn't abort its cycle if the accounting cache return an error
because it isn't an essential requirement to update it.
Change-Id: I78fd2bd9cf253ddedfb9ada80c0fa2ddf438f647
On upload we need to override pending and committed object. This change is adjusting DeleteObjectAllVersions to delete both.
Change-Id: Ib66c2af207c618119f7bf0de7fa9d3e5145d8641
* Deduplicate NodeID list prior to fetching IPs.
* Use NodeSelectionCache for fetching reliable IPs.
* Return number of segements, reliable pieces and all pieces.
Change-Id: I13e679caab275488b4037624b840a4068dad9589
For being able to have resilient multi-region satellites we cannot stop
processing uploads/download client request when Redis isn't responding
properly.
These changes avoid to stop the processing of the client requests when
we cannot check if the client exceeds its storage or bandwidth limits
and we cannot update its used storage/bandwidth limits because Redis is
not responding successfully or the satellite database returns an error.
Change-Id: Ia7f12c07fc9ffdfad0e7ff052ff3fd81eca0f0e3
Respond to the HTTP clients which request the project usage limits with
different status codes depending of the error class returned by the
satellite/accounting Service.
Change-Id: I6f486ea55517f616c7cec81dbbe77e997484180f
This is the first step in the removal of uptime columns on the
nodes table. These columns are no longer used:
uptime_success_count
total_uptime_count
uptime_reputation_alpha
uptime_reputation_beta
In order to avoid breaking backwards compatibility, we need to
remove all references to these columns before removing the columns
themselves from the database. However, since uptime_success_count
and total_uptime_count are NOT NULLABLE, we can't remove them from
the insert statements in the overlay. So we can't remove the columns
because of the references, and we can't remove the references because
the columns can't be null. What a pickle. To remedy this, we will set a
default on the columns. Then we should be able to remove them from the
insert statements
Change-Id: I75f6c56fb7897835bbf29869f86f39de1d9dd345
We have to adapt the live accounting to allow the packages that use it
to differentiate about errors for being able to ignore them and make our
satellite resilient to Redis downtime.
For differentiating errors we should make changes in the live accounting
but also in the storage/redis.Client, however, we may need to do some
dirty workarounds or break other parts of the implementation that
depends on it.
On the other hand we want to get rid of the storage/redis.Client because
it has more functionality that the one that we are using and some
process has been started to remove it.
Hence, we have refactored the live accounting to directly use the Redis
client library for later on (in a future commit) adapt the satellite for
being resilient to Redis downtime.
Last but not least, a test for expired bandwidth keys have been added
and with it a bug was spotted and fix it.
Change-Id: Ibd191522cd20f6a9a15e5ccb7beb83a678e530ff
GetSuccessfulNodeNotCheckedInSince and GetOfflineNodesLimited are overlay methods
which were only used by the previous downtime tracking system which has been removed.
These methods should also be removed.
Change-Id: Idb829d742e1f987e095604423fff656fe581183e
Non-multipart uplink implementation is always trying to download object
by downloading last segment first (PartNumber=0, Index=-1) but this
approach won't work with multipart object. We need to reject such old
style request with reasonable message.
Change-Id: I9221e019933565a8d25136bdfef3e054320bac3d
SatelliteAddress in OrderLimit is not being used anymore and some
satellite addresses may consume too much bytes.
Change-Id: Ic7a0efe5b6211c2f3b91af67b293cde98b29d074
Avoid using project uuid string representation, because
it uses more bandwidth.
This reduces the encrypted metadata size from 118 -> 97 bytes.
Change-Id: Ic53a81b83acc065f24f28cd404f9c0b1fe592594
The total_plain_size and total_encrypted_size columns in the objects
table were set as INT4, which limits the size of committed objects to
just 2 GiB.
This patch migrates the DB to change the type of these fields to INT8.
Change-Id: Iad7e7b44a652e6c5b8e17b80588637bb48390fe6
Do not insert the number of healthy pieces for segment health anymore.
Rather, insert the segment health calculated by our new priority
function.
Change-Id: Ieee7fb2deee89f4d79ae85bac7f577befa2a0c7f
Full prefix: satellite/{overlay,nodestats},storagenode/{reputation,nodestats}
Allow the storagenode to receive its audit history data from the
satellite via the satellite's GetStats endpoint.
The storagenode does not save this data for use in the API yet.
Change-Id: I9488f4d7a4ccb4ccf8336b8e4aeb3e5beee54979
* Separate audit history interface into its own file in the overlay
package
* Add overlay.AuditHistory struct so that internalpb.AuditHistory is
only used from within the database layer
* Add overlay.GetAuditHistory function for features that will require
access to detailed audit history information
* Do not return full audit history from UpdateAuditHistory - callers to
that function only need to know the online score and whether a full
tracking period has been completed
* Move audit history tests out of satellite/satellitedb, since they are
independent of database implementation
Change-Id: I35b0c4ac23bbaabd80624f8a9631c3cb1a1f33bd
Now that the deprecated downtime tracking service is removed
(3fc76f4ffe), we can safely remove
the nodes_offline_times table.
Change-Id: Ia7c6efe32ba104dff5a830af5f2beee3337eefe5
Nodes which are offline_suspended will no longer be considered for new
uploads. The current threshold that enters a node into offline
suspension is 0.6. Disqualification for offline suspension is still
disabled.
Change-Id: I0da9abf47167dd5bf6bb21e0bc2186e003e38d1a
Currently we do not allow anything other than the "paid" status for invoices when
trying to delete a user. However there can be a couple of other states that are
still fine to accept during deletion of a user. This change reverses the order to
check for the status that we do not want to allow.
Change-Id: I78d85af6438015c55100fa201ccffc731c91de1c
this change isn't the real fix. it's just ignoring the problem.
i don't know what the real fix is. is the problem with the test, or
is there actually a problem with the rollup code?
Change-Id: I552bdd947deadc212cc56efc5f818942b9827126
Query nodes table using AS OF SYSTEM TIME '-10s' (by default) when on CRDB to alleviate contention on the nodes table and minimize CRDB retries. Queries for standard uploads are already cached, and node lookups for graceful exit uploads has retry logic so it isn't necessary for the nodes returned to be current.
IterateObjectsAllVersionsWithStatus
We need different implementation for IterateObjectsAllVersions because
we want to iterate over all object without specifying object status.
Existing method will have new name but implementation details are not
changed.
Change-Id: I01b987996772fa7f8fd73da9910d52db2d1aa0d7
Since the Satellite now requires the order encryption functionality (since serial_number table is deprecated) to properly function, we can remove the config flag to turn on/off the feature.
Change-Id: Ie973f72a9a05a81cef9e53dc9c99d22c940c2488
This PR contains the minimum changes needed to stop inserting into the serial_numbers table. This is the first step in completely deprecating that table.
The next step is to create another PR to remove the expiredSerial chore, fix more tests, and remove any other methods on the serial_number table.
Change-Id: I5f12a56ebf3fa4d1a1976141d2911f25a98d2cc3
This fix issues with passing observers between iteration methods.
It's not best implementation but I think we will need to optimize it
soon one way or another.
Change-Id: I574599bfd10822d84e2d2f1800bcd88e176a76ea
The chief segment health models we've come up with are the "immediate
danger" model and the "survivability" model. The former calculates the
chance of losing a segment becoming lost in the next time period (using
the CDF of the binomial distribution to estimate the chance of x nodes
failing in that period), while the latter estimates the number of
iterations for which a segment can be expected to survive (using the
mean of the negative binomial distribution). The immediate danger model
was a promising one for comparing segment health across segments with
different RS parameters, as it is more precisely what we want to
prevent, but it turns out that practically all segments in production
have infinite health, as the chance of losing segments with any
reasonable estimate of node failure rate is smaller than DBL_EPSILON,
the smallest possible difference from 1.0 representable in a float64
(about 1e-16).
Leaving aside the wisdom of worrying about the repair of segments that
have less than a 1e-16 chance of being lost, we want to be extremely
conservative and proactive in our repair efforts, and the health of the
segments we have been repairing thus far also evaluates to infinity
under the immediate danger model. Thus, we find ourselves reaching for
an alternative.
Dr. Ben saves the day: the survivability model is a reasonably close
approximation of the immediate danger model, and even better, it is
far simpler to calculate and yields manageable values for real-world
segments. The downside to it is that it requires as input an estimate
of the total number of active nodes.
This change replaces the segment health calculation to use the
survivability model, and reinstates the call to SegmentHealth() where it
was reverted. It gets estimates for the total number of active nodes by
leveraging the reliability cache.
Change-Id: Ia5d9b9031b9f6cf0fa7b9005a7011609415527dc