Commit Graph

159 Commits

Author SHA1 Message Date
Jeff Wendling
97e980cd8a private/dbutil: add database name to configure as a tag
storagenodes have like 10 or more databases. without this
tag they all get sent as the same value, stomping on each
other.

Change-Id: Ib12019684d6ea8f2a5b83df584056dfa79e3c4b3
2020-03-26 16:50:15 +00:00
Egon Elbre
326c0cebde storage/boltdb: update to etcd/bbolt v1.3.4
bbolt v1.3.4 has pointer usage fixes.

Change-Id: I5e0fc4782711d01c09ced579f25a4f8fbc8de85c
2020-03-24 12:33:34 +02:00
Egon Elbre
decb2ec69a private/processgroup: moved to storj.io/common/processgroup
Change-Id: I1ec0bb440dda757d8f9a6f564a0084dde2f9cc84
2020-03-03 10:50:33 +00:00
Egon Elbre
8bef560ab9 storage: delete unused code and lower visibility of static iterator
Change-Id: I8ec6ec9a710650611d272b03b2927759a8b02f91
2020-02-17 14:53:54 +00:00
Jeff Wendling
5d6cb68cd7 storage/{cockroachkv,postgreskv}: detailed monitoring for list
Change-Id: Iedba10776367233e59f3a6523efdb303b836b241
2020-02-12 10:55:07 +00:00
Cameron Ayer
33d696b096 storage/redis/redisserver: simplify redisserver creation
Change-Id: I881576a7881db671b5abeeca7120a022987cc47f
2020-02-11 19:11:57 +00:00
Egon Elbre
34f38bf6ce mod: upgrade miniredis to latest
miniredis 2.5.0 had a bug with matching keys with newlines.

Change-Id: I9bcf998459be6d7d4e03bca3589e989e5ed2304d
2020-02-06 13:31:17 +00:00
Jeff Wendling
7999d24f81 all: use monkit v3
this commit updates our monkit dependency to the v3 version where
it outputs in an influx style. this makes discovery much easier
as many tools are built to look at it this way.

graphite and rothko will suffer some due to no longer being a tree
based on dots. hopefully time will exist to update rothko to
index based on the new metric format.

it adds an influx output for the statreceiver so that we can
write to influxdb v1 or v2 directly.

Change-Id: Iae9f9494a6d29cfbd1f932a5e71a891b490415ff
2020-02-05 23:53:17 +00:00
Egon Elbre
81d44f19ee storage/filestore: ensure we bail on deleted folder without error
Change-Id: Iecf5f9ac5bc278489b433923c526d60611d356a4
2020-01-30 16:32:10 -05:00
Egon Elbre
da5e408afe storage: add DeleteMultiple method
DeleteMultiple will allow metainfo to delete multiple segments
and get the old pointers in a single request.

Change-Id: Ic144f30c5453274fa2b80df2895f123f5a9cc48b
2020-01-29 13:13:54 -05:00
Egon Elbre
c6f94ce9e4 satellite/metainfo: remove support for boltdb based pointerDB
By previous changes we can now remove testplanet.New and
also remove metainfo boltdb support.

Change-Id: I5bdfbbbb45967492728e705b34b2fedb4f28c381
2020-01-23 13:54:00 +02:00
Egon Elbre
76fdb5d863 storage: add configurable lookup limits
Currently storage tests were tied to the default lookup limit.
By increasing the limits, the tests will take longer and sometimes
cause a large number of goroutines to be started.

This change adds configurable lookup limit to all storage backends.

Also remove boltdb.NewShared, since it's not used any more.

Change-Id: I1a052f149da471246fac5745da133c3cfc27582e
2020-01-22 21:35:56 +02:00
ccase
38f707c0d2
storage/redis: Limit should not be applied as count.
COUNT on a SCAN does not actually limit the results [1]. It limits the
amount of work a single call to SCAN will perform before returning. By
setting this to limit we can sometimes timeout on the request if limit
is very large.

This restores storage/redis back to it's original behavior.

[1]: https://redis.io/commands/scan#the-count-option

Change-Id: Ia75afb5152df909df38c9a7c6feb74d062f49d6a
2020-01-22 13:23:06 -05:00
ccase
818242f452
storage/postgreskv: Reverting back to the venerable PG CAS
This was inadvertently converted to the Cockroach version. This reverts
most of that and keeps the changes since then.

Change-Id: Ia440eeebb01bc89fbfa8ce266668030173061469
2020-01-22 11:36:25 -05:00
Egon Elbre
1279eeae39 private/tagsql,storage: fixes to context cancellation
Replace all the remaining uses of sql.DB with tagsql.DB to
fix issues with context cancellation.

Introduce tagsql.Open which helps to get rid of all tagsql.Wrap-s.
Use tagsql in cockroachkv and postgreskv.

Change-Id: I8946d203341cb85a25976896fc7881e1f704e779
2020-01-20 15:44:39 +02:00
ccase
034f9845b1 storage: Plumb limit through storage backends.
* Plumbs the limit through all backends ensuring they don't do
  unnecessary work.
* Don't arbitrarily limit at the backend with hardcoded defaults. The
  limit will be set by the caller.

Prior to this change the code on recursive in some backends would do 10k
results from the database and then only return the first 1k (throwing
out 9k of them).

Prior to this change some backends had no limit at all (e.g. redis).

Change-Id: I1f327eefe095776d123dd11362cd00994c22efdf
2020-01-19 21:23:20 +00:00
Egon Elbre
1abfe42142 satellite: use tagsql
Change-Id: I2170dee409fb0c2fe85913ddd36e7811a3b853ed
2020-01-19 14:39:16 +02:00
ccase
14b43b7e9b storage/postgreskv/schema/data.go: Regenerate migrations that failed to update.
Change-Id: I9fd5a9a5414214faea5f8c476778fccbe022cb6c
2020-01-19 11:22:00 +00:00
Stefan Benten
409d4123bb
Add proper Pathdata Index (#3750) 2020-01-17 00:48:59 +01:00
Cameron Ayer
4424697d7f satellite/accounting: refactor live accounting to hold current estimated totals
live accounting used to be a cache to store writes before they are picked up during
the tally iteration, after which the cache is cleared. This created a window in which
users could potentially exceed the storage limit. This PR refactors live accounting to
hold current estimations of space used per project. This should also reduce DB load
since we no longer need to query the satellite DB when checking space used for limiting.

The mechanism by which the new live accounting system works is as follows:

During the upload of any segment, the size of that segment is added to its respective
project total in live accounting. At the beginning of the tally iteration we record
the current values in live accounting as `initialLiveTotals`. At the end of the tally
iteration we again record the current totals in live accounting as `latestLiveTotals`.
The metainfo loop observer in tally allows us to get the project totals from what it
observed in metainfo DB which are stored in `tallyProjectTotals`. However, for any
particular segment uploaded during the metainfo loop, the observer may or may not
have seen it. Thus, we take half of the difference between `latestLiveTotals` and
`initialLiveTotals`, and add that to the total that was found during tally and set that
as the new live accounting total.

Initially, live accounting was storing the total stored amount across all nodes rather than
the segment size, which is inconsistent with how we record amounts stored in the project
accounting DB, so we have refactored live accounting to record segment size

Change-Id: Ie48bfdef453428fcdc180b2d781a69d58fd927fb
2020-01-16 10:26:49 -05:00
Egon Elbre
64fb2d3d2f Revert "dbutil: statically require all databases accesses to use contexts"
This reverts commit 8e242cd012.

Revert because lib/pq has known issues with context cancellation.
These issues need to be resolved before these changes can be merged.

Change-Id: I160af51dbc2d67c5449aafa406a403e5367bb555
2020-01-15 07:28:00 +00:00
JT Olio
8e242cd012 dbutil: statically require all databases accesses to use contexts
this will allow for some nice runtime analysis down the road.
also, this allows for wrapping database handles in a way that
can interact with these contexts

requires https://review.dev.storj.io/c/storj/dbx/+/514

Change-Id: Ib087b7cd73296dd2c1e0331314da34d861f61d2b
2020-01-14 18:20:47 -05:00
JT Olio
86093d0940 postgreskv: drop not null on buckets
Change-Id: I2a2bd7709de211a9d1808248af573f1bb630cfd5
2020-01-14 12:07:53 -07:00
JT Olio
e1ba3931ec postgres2: use cockroachkv impl against postgres
this allows for setting $STORJ_METAINFO_POSTGRESQL_USE_ALT=yes if you
want to use the cockroachkv implementation for metainfo against postgres

Change-Id: I0c9458c83fd67ee63ef4a78351e64a80a0647408
2020-01-13 14:51:56 -06:00
Egon Elbre
b9740f0c0a storage/cockroachkv: add ctx argument
Change-Id: Ib6c29f44722b0354afcd499a0e567f04aef7eb28
2020-01-13 15:57:47 +02:00
Egon Elbre
0835b9024c private/dbutil/pgutil: add ctx argument
Change-Id: Icfd56ca8c1f831ad56c0195a0b883e8f0618daaf
2020-01-13 15:27:06 +02:00
Simon Guindon
5a1b2f49f4 storage/cockroachkv: add application name to the db connection string.
CockroachDB collects query metrics and separates them by application name and we were not setting the correct application name for the cockroachkv client. This PR calls our existing function that appends it to the connection string.
Change-Id: I4a97ed248c31f8b187c680d84b45472f0d50fd7e
2020-01-10 15:11:08 -05:00
Egon Elbre
d3d75a597f satellite,storage: clean global ctx usage in tests
Change-Id: I89ea5c95fc6895518b464f8eb6a4c74c6ae37651
2020-01-09 10:37:21 +00:00
paul cannon
0135852a0e storage/postgreskv: use transactional helper
We may never need this code to work with CockroachDB, but I'm on a
mission to avoid problematic uses of Begin() and BeginTx(), and anywhere
they appear is a possible place for someone to copy-and-paste and do
something wrong. dbutil.WithTx makes this code a little bit simpler too,
so it seems worthwhile.

Change-Id: I9b4ab484db4590cad5ab07de515bbf5d9708daed
2020-01-06 23:24:44 +00:00
Egon Elbre
6615ecc9b6 common: separate repository
Change-Id: Ibb89c42060450e3839481a7e495bbe3ad940610a
2019-12-27 14:11:15 +02:00
Isaac Hess
7d1e28ea30 storagenode: Include trash space when calculating space used
This commit adds functionality to include the space used in the trash
directory when calculating available space on the node.

It also includes this trash value in the space used cache, with methods
to keep the cache up-to-date as files are trashed, restored, and
emptied.

As part of the commit, the RestoreTrash and EmptyTrash methods have
slightly changed signatures. RestoreTrash now also returns the keys that
were restored, while EmptyTrash also returns the total disk space
recovered. Each of these changes makes it possible to keep the cache
up-to-date and know how much space is being used/recovered.

Also changed is the signature of PieceStoreAccess.ContentSize method.
Previously this method returns only the content size of the blob,
removing the size of any header data. This method has been renamed
`Size` and returns both the full disk size and content size of the blob.
This allows us to only stat the file once, and in some instances (i.e.
cache) knowing the full file size is useful.

Note: This commit simply adds the trash size data to the piece size data
we were already collecting. The piece size data is not accurate for all
use-cases (e.g. because it does not contain piece header data); however,
this commit does not fix that problem. Now that the ContentSize (Size)
method returns the full size of the file, it should be easier to fix
this problem in a future commit.

Change-Id: I4a6cae09e262c8452a618116d1dc66b687f59f85
2019-12-23 19:07:03 -07:00
Jeff Wendling
efa08d4081 storage/cockroachkv: use different batch size for recursive iteration
the relatively small batch size of 128 was chosen so that if we have
a set of keys like

    a/1
    a/2
    ...
    a/100000
    b
    c

list operations would not have to walk 100k keys inside of a/ before
skipping to b. unfortunately, iteration is also used by the metainfo
loop. in that case, it's doing a recursive listing, and so there's
no need to skip large prefixes. thus, we can use a bigger batch
size when recursive listing is requested.

Change-Id: I87cf1ba385b6eb2928c5b7cc5e0f7a8c7bd126d9
2019-12-17 18:24:26 +00:00
Egon Elbre
7a36507a0a private/testcontext: ensure we call cleanup everywhere
Change-Id: Icb921144b651611d78f3736629430d05c3b8a7d3
2019-12-17 14:16:09 +00:00
paul cannon
2f7465c294 private/dbutil: register "cockroach" as sql.DB driver
this will allow us to inspect the type of `db.Driver()` on *sql.DB
connections to correctly differentiate between pg and crdb conns.

as a bonus, this moves all concerns about when to replace "cockroach://"
with "postgres://" out of view, letting the thin shim "driver" take care
of that.

Change-Id: Ib24103ab7c508231e681f89a7321b623e4e125e9
2019-12-16 19:10:00 +00:00
paul cannon
94651921c3 storage/testsuite: pass ctx in to bulk setup methods
to make them cancelable. Also,

* rename BulkDelete->BulkDeleteAll

this leaves room for a new method `BulkDelete(items storage.Items)` that
does a bulk deletion of a specified list of items, as opposed to
deleting _everything_. such a method would be used in the
`cleanupItems()` function found in utils.go, because when individual
deletes are fairly slow, that step takes way too long during tests.

* use BulkDelete method if available

nothing currently provides `BulkDelete(items storage.Items) error`,
but we made use of it with the Bigtable testing and code, and may make
use of it again when adding new kv backends.

* and eliminate the global context in test_iterate.go

Change-Id: I171c7a3818beffbad969b131e98b9bbe3f324bf2
2019-12-10 20:22:08 +00:00
Jeff Wendling
48da8baab5 storj-sim: work with cockroach:// urls for satellite databases
for storj-sim to work, we need to avoid schemas in cockroach urls
so we have storj-sim create namespaced databases instead of schemas
and we have the migrate command create the database in the same way
that it would create a schema for postgres. then it works!

a follow up commit will move the creation of the database/schemas
into storj-sim's setup step so that we can avoid doing these icky
creations during normal migration calls. it will also make the
pointerdb have an explicit call to migrate instead of just doing
it every time it's opened.

Change-Id: If69ef5cb96b6866b0438c761bd445afb3597ae5f
2019-12-09 23:44:00 +00:00
Jeff Wendling
1df7b360d7 satellite/metainfo: Use cockroachdb client for metainfo db
Change-Id: I3cf7a00de4f654eacaffbb494f4841c64a2d9ce6
2019-12-05 10:33:54 -07:00
Jeff Wendling
f15192ea40 storage/cockroachkv: initial client implementation
Change-Id: I72ae558739b2ca532f31cb64f480b43437b6b309
2019-12-05 16:11:23 +00:00
paul cannon
378b863b2b private,satellite: unite all the "temp db schema" things
first, so that they all work the same way, because it's getting
complicated, and second, so that we can do the appropriate thing
instead of CREATE SCHEMA for cockroachdb.

Change-Id: I27fbaeeb6223a3e06d97bcf692a2d014b31465f7
2019-12-05 15:36:59 +00:00
Ivan Fraixedes
42c61138e8
storage: Improve doc comments delete methods (#3591)
Improve the documentation of several methods involved in the delete
operation to make clear their behavior without having to inspect their
logic.
2019-12-02 12:18:20 +01:00
Isaac Hess
a6235d3962 storage/filestore: Monitor when we open files in trash
Change-Id: I817bf8349c2e1ba55e1490f06162af1099bebdb0
2019-11-26 14:38:49 -07:00
Isaac Hess
56f8fd2dd7
storagenode/pieces: Add EmptyTrash functionality (#3640)
* storagenode/pieces: Add EmptyTrash functionality

* storagenode/pieces: Fix err

* storagenode/pieces: Fix lint
2019-11-26 09:25:21 -07:00
Matt Robinson
9af97d366a Make sed a little more cross platformable (#3629) 2019-11-22 11:17:02 -07:00
Isaac Hess
2166c2a21b
storage/filestore: Add Trash and RestoreTrash to Blobs (#3529)
* storage/filestore: Add Trash and RestoreTrash to Blobs

* Change restore to be satellite-specific

* Fix comment

* Fix merge rename conflict
2019-11-14 15:19:15 -07:00
Egon Elbre
ee6c1cac8a
private: rename internal to private (#3573) 2019-11-14 21:46:15 +02:00
Egon Elbre
1e64006e32 lint: add staticcheck as a separate step (#3569) 2019-11-14 10:31:30 +02:00
paul cannon
d5963d81d0
storage/postgreskv: fix ultra-slow nonrecursive list (#3564)
This is based on Jeff's most excellent work to identify why
non-recursive listing under postgreskv was phenomenally slow. It turns
out PostgreSQL's query planner was actually using two sequential scans
of the pathdata table to do its job. It's unclear for how long that has
been happening, but obviously it won't scale any further.

The main change is propagating bucket association with pathnames through
the CTE so that the query planner lets itself use the pathdata index on
(bucket, fullpath) for the skipping-forward part.

Jeff also had some changes to the range ends to keep NULL from being
used- I believe with the intent of making sure the query planner was
able to use the pathdata index. My tests on postgres 9.6 and 11
indicate that those changes don't make any appreciable difference in
performance or query plan, so I'm going to leave them off for now to
avoid a careful audit of the semantic differences.

There is a test included here, which only serves to check that the new
version of the function is indeed active. To actually ensure that no
sequential scans are being used in the query plan anymore, our tests
would need to be run against a test db with lots of data already loaded
in it, and that isn't feasible for now.

Change-Id: Iffe9a1f411c54a2f742a4abb8f2df0c64fd662cb
2019-11-13 17:52:14 -06:00
paul cannon
bd89f51c66
Keep v0pieceinfo database isolated (#3364)
* put TestCreateV0 back in StoreForTest
* avoid direct handles to V0 pieceinfo db
* type mismatch fix
* use storage.Blobs interface in store_test.go

..instead of filestore.Store. this will allow filestore.Store to become
unexported.

* unexport filestore.Store

rename it to blobStore. things should use the storage.Blobs interface
instead. changes in this commit are purely mechanical (made through the
"refactor" tool in Gocode followed by search/replace on the word "Store"
within the storage/filestore/ directory).

* kill filestore.StoreForTest

now that filestore.blobStore is unexported, there isn't a need for a
specialized wrapper type. this (not coincidentally) also makes it
possible for the WriterForFormatVersion() method on
storagenode/pieces.StoreForTest to work, without requiring everything to
wrap the store.blobs attribute in a filestore.StoreForTest, which was
impractical.
2019-11-13 13:15:31 -06:00
paul cannon
0d0b5a449b storage/filestore: monkit event for delete queuing (#3507) 2019-11-12 15:56:57 -05:00
paul cannon
0c025fa937 storage/: remove reverse-key-listing feature
We don't use reverse listing in any of our code, outside of tests, and
it is only exposed through libuplink in the
lib/uplink.(*Project).ListBuckets() API. We also don't know of any users
who might have a need for reverse listing through ListBuckets().

Since one of our prospective pointerdb backends can not support
backwards iteration, and because of the above considerations, we are
going to remove the reverse listing feature.

Change-Id: I8d2a1f33d01ee70b79918d584b8c671f57eef2a0
2019-11-12 18:47:51 +00:00