Commit Graph

178 Commits

Author SHA1 Message Date
Qweder93
0521435e08 storagenode/gracefulexit: added deletion of all files left in storage/blobs/satellite after successful GE
https://storjlabs.atlassian.net/browse/SG-368

Change-Id: I29a978fe0d0153aedf2be91dc7f45b4ef386d447
2020-07-08 14:38:31 +03:00
JT Olio
78c0d9352d storage: support monkit traces of limit exceeded
errors.New errors will not show up in monkit tracing
as a useful error type. this change fixes a test (!)
and makes it so monkit will tell us what the error
type is, if we have this failure

Change-Id: Ic9933704e4095495c7ee286d9df3eb7eb94b25c9
2020-07-06 15:14:02 +00:00
Ivan Fraixedes
ed9816fd30 storage/filestore: Ignore IsNotExist error walking files
A file piece could be deleted in between walking the list of files read
from a directory and before we actually perform any operation on such
file. When that happens, we don't want to return an error, we want to
just ignore it and carry on.

Change-Id: I8f6986070e5883599a08fccf8b125c075b30fe1b
2020-06-23 19:28:42 +00:00
Rafael Gomes
958ea1b9df satellite/accounting: add download limit cache
Change-Id: I722930cab8bd5d240f4878dc6997e9bc7637311f
2020-06-12 16:33:46 -03:00
Egon Elbre
36c461bd59 private/tagsql: track proper closing of rows and statements
This ensures that rows are closed to avoid leaks.
Also verifies that Err() is called, to ensure that no
error is left behind.

Change-Id: Idd1bec9bf479f40021da67b2c80ce83033149469
2020-06-05 18:25:43 +00:00
Egon Elbre
fca4f43a04 storage/filestore: benchmark diskInfoFromPath
Change-Id: I996057b1c650aec7cec84b49877d1e184a12514e
2020-06-02 17:49:14 +03:00
Natalie Villasana
e79e83b618 storage/cockroachkv: handle retry errors for GetAll and DeleteMultiple
It looks like GetAll and DeleteMultiple are only used in tests for now,
but they didn't have handling for retry errors returned from cockroach.
If they're used in prod in the future, now they will retry.

Change-Id: I0f281454ddebf282789142ff1d66a69bda5727c9
2020-06-01 16:13:43 +00:00
Natalie Villasana
4ad163de2f storage/cockroachkv: rename opi to oci
opi was a carry-over from "ordered postgres iterator".

Change-Id: Id28cb8b8b3dccd119a1d1bbbb7f20206e932e4c5
2020-05-27 17:05:02 -04:00
Qweder93
f2a0c64425 storage/filestore: log potential disk corruption
In walkNamespaceWithPrefix log in case of "lstat" error, because this may indicate an underlying disk corruption.

SG-50

Change-Id: I867c3ffc47cfac325ae90658ec4780d213ff3e63
2020-05-27 12:12:55 +00:00
Natalie Villasana
8bd4d7b43e storage/cockroachkv: add check if retry is needed during iteration
This changeset replaces https://review.dev.storj.io/c/storj/storj/+/1839
which did the same thing but Nat couldn't figure out how to fix conflicting
files the correct gerrity way.

Change-Id: If05a8902aca986ea9f6c9168a90b31beebab839a
2020-05-26 14:32:06 -04:00
Jeff Wendling
f03b23d2dc storage/postgreskv: monitor calls to sql.Next
Change-Id: I32c0c92b347a6d1cadfeb69117de58f4d9b41ad2
2020-05-14 16:35:25 +00:00
Egon Elbre
d98b8f6e23 satellite/metainfo,storage: use different limit for metainfo loop
Change-Id: I5ef7233930679b977b33f7b3e1dda45c907dcfad
2020-05-05 10:37:20 +00:00
Egon Elbre
c630cf2490 storagenode/pieces: implement buffering for writing
Currently uploads can cause a lot of IOPS; reduce this by introducing an
in-memory buffer on top of the file.

Change-Id: I5f4e3e01c0a36258271d180b922107de447bcb59
2020-05-04 06:01:32 +00:00
Qweder93
6c4d3f133f storagenode/dashboard: trash added to available space calculations
Change-Id: Ia6f3af20dc98f569b86796ffa68428065d662c78
2020-05-01 15:26:02 +00:00
Egon Elbre
d225e2de48 all: add missing ctx.Cleanup calls in tests
Change-Id: Iaa65f90b9731d721691322bb92fc3da736aa10fe
2020-04-29 17:58:40 +00:00
Egon Elbre
85c45cd56f private/dbutil/pgtest: support multiple databases for testing
Currently Cockroach isn't performant for concurrent database setup and
tear-down. Instead of a single instance allow setting multiple potential
connection strings and let the tests pick one connection string
randomly.

This improves test duration by ~10 minutes.

While we are at significantly changing how pgtest works, introduce
helper PickPostgres and PickCockroach for selecting the database to
reduce code duplications in multiple places.

Change-Id: I8ad171d5c4c8a4fc081ec2ae9bdd0cc948a80619
2020-04-28 21:55:49 +03:00
Natalie Villasana
6f84be133a satellite/metainfo: add MigrateToLatest to PointerDB
In cases like the segment reaper script connecting to the metainfodb,
we don't want a db migration to happen automatically when we call
metainfo.NewStore. This adds MigrateToLatest method for postgreskv
and cockroachkv, and calls MigrateToLatest in places where NewStore used
to create tables.

Change-Id: I682d0f26d609af0601dfdb32a24866cdf5d32a7e
2020-04-28 17:26:35 +00:00
Egon Elbre
2c0d61b18e satellite/metainfo: avoid temporary list
Currently ListV2 loaded the whole data into memory, even when all the
data wasn't being used, using up more memory than needed.

Change-Id: I5846d979344729b447c108a6cc9f4227229ec981
2020-04-15 08:01:42 +00:00
Egon Elbre
1cad686e9b storage: reduce default lookup limit to 500
Change-Id: Ic0adbf2f519babd780237d34c60636c1a1606762
2020-04-13 19:00:23 +03:00
Jeff Wendling
97e980cd8a private/dbutil: add database name to configure as a tag
storagenodes have like 10 or more databases. without this
tag they all get sent as the same value, stomping on each
other.

Change-Id: Ib12019684d6ea8f2a5b83df584056dfa79e3c4b3
2020-03-26 16:50:15 +00:00
Egon Elbre
326c0cebde storage/boltdb: update to etcd/bbolt v1.3.4
bbolt v1.3.4 has pointer usage fixes.

Change-Id: I5e0fc4782711d01c09ced579f25a4f8fbc8de85c
2020-03-24 12:33:34 +02:00
Egon Elbre
decb2ec69a private/processgroup: moved to storj.io/common/processgroup
Change-Id: I1ec0bb440dda757d8f9a6f564a0084dde2f9cc84
2020-03-03 10:50:33 +00:00
Egon Elbre
8bef560ab9 storage: delete unused code and lower visibility of static iterator
Change-Id: I8ec6ec9a710650611d272b03b2927759a8b02f91
2020-02-17 14:53:54 +00:00
Jeff Wendling
5d6cb68cd7 storage/{cockroachkv,postgreskv}: detailed monitoring for list
Change-Id: Iedba10776367233e59f3a6523efdb303b836b241
2020-02-12 10:55:07 +00:00
Cameron Ayer
33d696b096 storage/redis/redisserver: simplify redisserver creation
Change-Id: I881576a7881db671b5abeeca7120a022987cc47f
2020-02-11 19:11:57 +00:00
Egon Elbre
34f38bf6ce mod: upgrade miniredis to latest
miniredis 2.5.0 had a bug with matching keys with newlines.

Change-Id: I9bcf998459be6d7d4e03bca3589e989e5ed2304d
2020-02-06 13:31:17 +00:00
Jeff Wendling
7999d24f81 all: use monkit v3
this commit updates our monkit dependency to the v3 version where
it outputs in an influx style. this makes discovery much easier
as many tools are built to look at it this way.

graphite and rothko will suffer some due to no longer being a tree
based on dots. hopefully time will exist to update rothko to
index based on the new metric format.

it adds an influx output for the statreceiver so that we can
write to influxdb v1 or v2 directly.

Change-Id: Iae9f9494a6d29cfbd1f932a5e71a891b490415ff
2020-02-05 23:53:17 +00:00
Egon Elbre
81d44f19ee storage/filestore: ensure we bail on deleted folder without error
Change-Id: Iecf5f9ac5bc278489b433923c526d60611d356a4
2020-01-30 16:32:10 -05:00
Egon Elbre
da5e408afe storage: add DeleteMultiple method
DeleteMultiple will allow metainfo to delete multiple segments
and get the old pointers in a single request.

Change-Id: Ic144f30c5453274fa2b80df2895f123f5a9cc48b
2020-01-29 13:13:54 -05:00
Egon Elbre
c6f94ce9e4 satellite/metainfo: remove support for boltdb based pointerDB
By previous changes we can now remove testplanet.New and
also remove metainfo boltdb support.

Change-Id: I5bdfbbbb45967492728e705b34b2fedb4f28c381
2020-01-23 13:54:00 +02:00
Egon Elbre
76fdb5d863 storage: add configurable lookup limits
Currently storage tests were tied to the default lookup limit.
By increasing the limits, the tests will take longer and sometimes
cause a large number of goroutines to be started.

This change adds configurable lookup limit to all storage backends.

Also remove boltdb.NewShared, since it's not used any more.

Change-Id: I1a052f149da471246fac5745da133c3cfc27582e
2020-01-22 21:35:56 +02:00
ccase
38f707c0d2 storage/redis: Limit should not be applied as count.
COUNT on a SCAN does not actually limit the results [1]. It limits the
amount of work a single call to SCAN will perform before returning. By
setting this to limit we can sometimes timeout on the request if limit
is very large.

This restores storage/redis back to its original behavior.

[1]: https://redis.io/commands/scan#the-count-option

Change-Id: Ia75afb5152df909df38c9a7c6feb74d062f49d6a
2020-01-22 13:23:06 -05:00
ccase
818242f452 storage/postgreskv: Reverting back to the venerable PG CAS
This was inadvertently converted to the Cockroach version. This reverts
most of that and keeps the changes since then.

Change-Id: Ia440eeebb01bc89fbfa8ce266668030173061469
2020-01-22 11:36:25 -05:00
Egon Elbre
1279eeae39 private/tagsql,storage: fixes to context cancellation
Replace all the remaining uses of sql.DB with tagsql.DB to
fix issues with context cancellation.

Introduce tagsql.Open which helps to get rid of all tagsql.Wrap-s.
Use tagsql in cockroachkv and postgreskv.

Change-Id: I8946d203341cb85a25976896fc7881e1f704e779
2020-01-20 15:44:39 +02:00
ccase
034f9845b1 storage: Plumb limit through storage backends.
* Plumbs the limit through all backends ensuring they don't do
  unnecessary work.
* Don't arbitrarily limit at the backend with hardcoded defaults. The
  limit will be set by the caller.

Prior to this change, recursive listing in some backends would fetch 10k
results from the database and then return only the first 1k (throwing
out 9k of them).

Prior to this change some backends had no limit at all (e.g. redis).

Change-Id: I1f327eefe095776d123dd11362cd00994c22efdf
2020-01-19 21:23:20 +00:00
Egon Elbre
1abfe42142 satellite: use tagsql
Change-Id: I2170dee409fb0c2fe85913ddd36e7811a3b853ed
2020-01-19 14:39:16 +02:00
ccase
14b43b7e9b storage/postgreskv/schema/data.go: Regenerate migrations that failed to update.
Change-Id: I9fd5a9a5414214faea5f8c476778fccbe022cb6c
2020-01-19 11:22:00 +00:00
Stefan Benten
409d4123bb Add proper Pathdata Index (#3750)
2020-01-17 00:48:59 +01:00
Cameron Ayer
4424697d7f satellite/accounting: refactor live accounting to hold current estimated totals
live accounting used to be a cache to store writes before they are picked up during
the tally iteration, after which the cache is cleared. This created a window in which
users could potentially exceed the storage limit. This PR refactors live accounting to
hold current estimations of space used per project. This should also reduce DB load
since we no longer need to query the satellite DB when checking space used for limiting.

The mechanism by which the new live accounting system works is as follows:

During the upload of any segment, the size of that segment is added to its respective
project total in live accounting. At the beginning of the tally iteration we record
the current values in live accounting as `initialLiveTotals`. At the end of the tally
iteration we again record the current totals in live accounting as `latestLiveTotals`.
The metainfo loop observer in tally allows us to get the project totals from what it
observed in metainfo DB which are stored in `tallyProjectTotals`. However, for any
particular segment uploaded during the metainfo loop, the observer may or may not
have seen it. Thus, we take half of the difference between `latestLiveTotals` and
`initialLiveTotals`, and add that to the total that was found during tally and set that
as the new live accounting total.
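
The update rule described above can be sketched as follows. This is an illustration only: the function name estimateLiveTotals and the use of string project IDs are assumptions, while the three inputs follow the names used in this message.

```go
package main

import "fmt"

// estimateLiveTotals (hypothetical name) computes the new live accounting
// total for each project seen by tally:
//
//	new = tallyProjectTotals + (latestLiveTotals - initialLiveTotals) / 2
//
// Half of the growth during the loop is added because the observer may or
// may not have seen any given segment uploaded while the loop was running.
func estimateLiveTotals(tallyProjectTotals, initialLiveTotals, latestLiveTotals map[string]int64) map[string]int64 {
	out := make(map[string]int64, len(tallyProjectTotals))
	for project, tally := range tallyProjectTotals {
		delta := latestLiveTotals[project] - initialLiveTotals[project]
		out[project] = tally + delta/2
	}
	return out
}

func main() {
	tally := map[string]int64{"project-a": 1000}
	initial := map[string]int64{"project-a": 200}
	latest := map[string]int64{"project-a": 300}

	// 100 bytes were uploaded during the loop, so 50 are added to the tally.
	fmt.Println(estimateLiveTotals(tally, initial, latest)["project-a"]) // 1050
}
```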

Initially, live accounting was storing the total stored amount across all nodes rather than
the segment size, which is inconsistent with how we record amounts stored in the project
accounting DB, so we have refactored live accounting to record segment size

Change-Id: Ie48bfdef453428fcdc180b2d781a69d58fd927fb
2020-01-16 10:26:49 -05:00
Egon Elbre
64fb2d3d2f Revert "dbutil: statically require all databases accesses to use contexts"
This reverts commit 8e242cd012.

Revert because lib/pq has known issues with context cancellation.
These issues need to be resolved before these changes can be merged.

Change-Id: I160af51dbc2d67c5449aafa406a403e5367bb555
2020-01-15 07:28:00 +00:00
JT Olio
8e242cd012 dbutil: statically require all databases accesses to use contexts
this will allow for some nice runtime analysis down the road.
also, this allows for wrapping database handles in a way that
can interact with these contexts

requires https://review.dev.storj.io/c/storj/dbx/+/514

Change-Id: Ib087b7cd73296dd2c1e0331314da34d861f61d2b
2020-01-14 18:20:47 -05:00
JT Olio
86093d0940 postgreskv: drop not null on buckets
Change-Id: I2a2bd7709de211a9d1808248af573f1bb630cfd5
2020-01-14 12:07:53 -07:00
JT Olio
e1ba3931ec postgres2: use cockroachkv impl against postgres
this allows for setting $STORJ_METAINFO_POSTGRESQL_USE_ALT=yes if you
want to use the cockroachkv implementation for metainfo against postgres

Change-Id: I0c9458c83fd67ee63ef4a78351e64a80a0647408
2020-01-13 14:51:56 -06:00
Egon Elbre
b9740f0c0a storage/cockroachkv: add ctx argument
Change-Id: Ib6c29f44722b0354afcd499a0e567f04aef7eb28
2020-01-13 15:57:47 +02:00
Egon Elbre
0835b9024c private/dbutil/pgutil: add ctx argument
Change-Id: Icfd56ca8c1f831ad56c0195a0b883e8f0618daaf
2020-01-13 15:27:06 +02:00
Simon Guindon
5a1b2f49f4 storage/cockroachkv: add application name to the db connection string.
CockroachDB collects query metrics and separates them by application name and we were not setting the correct application name for the cockroachkv client. This PR calls our existing function that appends it to the connection string.
Change-Id: I4a97ed248c31f8b187c680d84b45472f0d50fd7e
2020-01-10 15:11:08 -05:00
Egon Elbre
d3d75a597f satellite,storage: clean global ctx usage in tests
Change-Id: I89ea5c95fc6895518b464f8eb6a4c74c6ae37651
2020-01-09 10:37:21 +00:00
paul cannon
0135852a0e storage/postgreskv: use transactional helper
We may never need this code to work with CockroachDB, but I'm on a
mission to avoid problematic uses of Begin() and BeginTx(), and anywhere
they appear is a possible place for someone to copy-and-paste and do
something wrong. dbutil.WithTx makes this code a little bit simpler too,
so it seems worthwhile.

Change-Id: I9b4ab484db4590cad5ab07de515bbf5d9708daed
2020-01-06 23:24:44 +00:00
Egon Elbre
6615ecc9b6 common: separate repository
Change-Id: Ibb89c42060450e3839481a7e495bbe3ad940610a
2019-12-27 14:11:15 +02:00
Isaac Hess
7d1e28ea30 storagenode: Include trash space when calculating space used
This commit adds functionality to include the space used in the trash
directory when calculating available space on the node.

It also includes this trash value in the space used cache, with methods
to keep the cache up-to-date as files are trashed, restored, and
emptied.

As part of the commit, the RestoreTrash and EmptyTrash methods have
slightly changed signatures. RestoreTrash now also returns the keys that
were restored, while EmptyTrash also returns the total disk space
recovered. Each of these changes makes it possible to keep the cache
up-to-date and know how much space is being used/recovered.

Also changed is the signature of PieceStoreAccess.ContentSize method.
Previously this method returned only the content size of the blob,
removing the size of any header data. This method has been renamed
`Size` and returns both the full disk size and content size of the blob.
This allows us to only stat the file once, and in some instances (i.e.
cache) knowing the full file size is useful.

Note: This commit simply adds the trash size data to the piece size data
we were already collecting. The piece size data is not accurate for all
use-cases (e.g. because it does not contain piece header data); however,
this commit does not fix that problem. Now that the ContentSize (Size)
method returns the full size of the file, it should be easier to fix
this problem in a future commit.

Change-Id: I4a6cae09e262c8452a618116d1dc66b687f59f85
2019-12-23 19:07:03 -07:00