Commit Graph

198 Commits

Author SHA1 Message Date
Michal Niewrzal
7dde184cb5 Merge 'master' branch
Change-Id: I6070089128a150a4dd501bbc62a1f8b394aa643e
2020-11-10 11:58:59 +00:00
Egon Elbre
e0dca4042d all: add pprof labels for debugger
By using pprof.Labels debugger is able to show service/peer names in
goroutine names.

Change-Id: I5f55253470f7cc7e556f8e8b87f746394e41675f
2020-10-29 15:10:07 +00:00
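As an illustration of the pprof labelling mentioned in the commit above, here is a minimal sketch using only the standard `runtime/pprof` package. The helper `labelValue` and the label key and value are hypothetical and are not taken from the repository; the sketch only shows that a label set with `pprof.Do` is visible to code running inside the callback.

```go
package main

import (
	"context"
	"fmt"
	"runtime/pprof"
)

// labelValue runs a callback with a pprof label attached and returns the
// value of that label as observed from inside the callback.
// The function name is hypothetical and exists only for this sketch.
func labelValue(key, value string) string {
	var seen string
	pprof.Do(context.Background(), pprof.Labels(key, value), func(ctx context.Context) {
		// Goroutines started here inherit the label set, so profilers and
		// debuggers that understand pprof labels can group them by it.
		seen, _ = pprof.Label(ctx, key)
	})
	return seen
}

func main() {
	fmt.Println(labelValue("service", "hypothetical-peer"))
}
```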
Egon Elbre
caefde6b32 private/{dbutil,tagsql}: pass ctx to database opening
Opening a database usually dials, so we should pass ctx to it.

Change-Id: Iaa2875981570d83e65be3710f841cf30349f807b
2020-10-29 10:51:29 +00:00
Egon Elbre
e3985799a1 storage/{cockroachkv,postgreskv}: add ctx to opening
Opening a database usually dials, so we should pass ctx to it.

Change-Id: Iecf41241aaa94d54506cbc80b0e53449848d8819
2020-10-29 10:49:08 +00:00
Egon Elbre
22ec940f7e storage/filestore: defer closing
Change-Id: Iccd6d1c64c1b7a6eecaa4c675bb7b554b381d0f5
2020-10-26 21:09:58 +02:00
Egon Elbre
2268cc1df3 all: fix linter complaints
Change-Id: Ia01404dbb6bdd19a146fa10ff7302e08f87a8c95
2020-10-13 15:59:01 +03:00
Egon Elbre
0bdb952269 all: use keyed special comment
Change-Id: I57f6af053382c638026b64c5ff77b169bd3c6c8b
2020-10-13 15:13:41 +03:00
Cameron Ayer
ca0c1a5f0c storagenode/{monitor,pieces}, storage/filestore: add loop to check storage directory writability
periodically create and delete a temp file in the storage directory
to verify writability. If this check fails, shut the node down.

Change-Id: I433e3a8d1d775fc779ae78e7cf3144a05ffd0574
2020-08-31 21:20:49 +00:00
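The writability check described in the commit above could look roughly like the following sketch. It is not the repository's implementation: `checkWritable` and the temp file pattern are hypothetical, and the periodic loop and node shutdown are left out.

```go
package main

import (
	"fmt"
	"os"
)

// checkWritable verifies that dir is writable by creating a temporary file
// in it and removing it again. The name is hypothetical.
func checkWritable(dir string) error {
	f, err := os.CreateTemp(dir, "write-test-*")
	if err != nil {
		return fmt.Errorf("create temp file: %w", err)
	}
	name := f.Name()
	if err := f.Close(); err != nil {
		_ = os.Remove(name)
		return fmt.Errorf("close temp file: %w", err)
	}
	if err := os.Remove(name); err != nil {
		return fmt.Errorf("remove temp file: %w", err)
	}
	return nil
}

func main() {
	// A real node would run this on a timer and shut down on failure.
	if err := checkWritable(os.TempDir()); err != nil {
		fmt.Println("storage directory not writable:", err)
		return
	}
	fmt.Println("storage directory is writable")
}
```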
Cameron Ayer
586e6f2f13 private/testblobs, storage, storage/filestore: add storage dir verification to filestore
Sometimes SNOs fail to properly configure or lose connection to their storage directory
which can result in DQ. This causes unnecessary repair and is unfortunate for all parties.

This change introduces the creation of a special file in the storage directory at runtime
containing the node ID. While the storage node runs, it periodically verifies that it can
find said file with the correct contents in the correct location. If not, the node will
shut down with an error message.

This change will solve the issue of nodes losing access to the storage directory, but it will not
solve the issue of nodes pointing to the wrong directory, as the identifying file is created each
time the node starts up. After this change has been the minimum version for a few releases, we will
remove the creation of the directory-identifying file from the storage node run command and add it
to the setup command.

Change-Id: Ib7b10e96ac07373219835e39239e93957e7667a4
2020-08-19 17:18:14 +00:00
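A rough sketch of the verification file idea from the commit above, under stated assumptions: the file name, the helper names, and the use of a raw byte slice for the node ID are all hypothetical. The sketch writes the ID once and then checks that the file is present with the expected contents.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// verificationFileName is a hypothetical name chosen for this sketch.
const verificationFileName = "storage-dir-verification"

// createVerificationFile writes the node ID into the storage directory.
func createVerificationFile(dir string, nodeID []byte) error {
	return os.WriteFile(filepath.Join(dir, verificationFileName), nodeID, 0o600)
}

// verifyStorageDir checks that the file exists and holds the expected ID.
func verifyStorageDir(dir string, nodeID []byte) error {
	got, err := os.ReadFile(filepath.Join(dir, verificationFileName))
	if err != nil {
		return fmt.Errorf("read verification file: %w", err)
	}
	if !bytes.Equal(got, nodeID) {
		return errors.New("verification file belongs to a different node")
	}
	return nil
}

// demo verifies a temporary directory with the right and the wrong ID.
func demo() (okErr, mismatchErr error) {
	dir, err := os.MkdirTemp("", "storage-*")
	if err != nil {
		return err, err
	}
	defer func() { _ = os.RemoveAll(dir) }()

	if err := createVerificationFile(dir, []byte("node-a")); err != nil {
		return err, err
	}
	return verifyStorageDir(dir, []byte("node-a")), verifyStorageDir(dir, []byte("node-b"))
}

func main() {
	okErr, mismatchErr := demo()
	fmt.Println("matching ID:", okErr)
	fmt.Println("wrong ID:", mismatchErr)
}
```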
Egon Elbre
94a09ce20b all: add missing dots
Change-Id: I93b86c9fb3398c5d3c9121b8859dad1c615fa23a
2020-08-11 17:50:01 +03:00
Yaroslav Vorobiov
4d2a505788 storagenode/db: explicitly open and create dbs
Prevent storagenode from implicitly recreating missing dbs and storage,
as such behaviour leads to audit failures. Do not allow storagenode to
start if any db or the storage is missing or corrupted, or if the dedicated
storage disk is unmounted, so that the node incurs downtime instead.

Change-Id: Ic64e1f0ff4d8ef5b2fddbe7a7e53df4f4bd8652e
2020-07-24 14:08:47 +03:00
Egon Elbre
b67d7ecbc5 cmd/storagenode,storage/cockroachkv: better error handling
Change-Id: I6646aa046dc365c0dee38f23041be4fc2defb759
2020-07-16 20:03:50 +03:00
Egon Elbre
d8dcae3075 all: fix error checking
Change-Id: Ia0da1bbd6ce695139922f94096c2419281905e32
2020-07-16 19:13:14 +03:00
Egon Elbre
e70da5cd4e all: fix comments
Change-Id: I2d2307e3fab87de47a72b3595d051e2c95ff4f8a
2020-07-16 19:13:14 +03:00
Egon Elbre
080ba47a06 all: fix dots
Change-Id: I6a419c62700c568254ff67ae5b73efed2fc98aa2
2020-07-16 14:58:28 +00:00
stefanbenten
257855b5de all: replace == comparison with errors.Is
Change-Id: I05d9a369c7c6f144b94a4c524e8aea18eb9cb714
2020-07-14 15:50:25 +00:00
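To show why the commit above prefers `errors.Is` over `==`, here is a small standard-library sketch. The helper names are hypothetical; the point is that a direct comparison stops matching once an error has been wrapped with `%w`, while `errors.Is` still finds the sentinel.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// wrapped returns io.EOF wrapped with extra context, as callers often do.
func wrapped() error {
	return fmt.Errorf("reading piece: %w", io.EOF)
}

// compare reports how the two comparison styles treat a wrapped error.
func compare() (byEquality, byErrorsIs bool) {
	err := wrapped()
	return err == io.EOF, errors.Is(err, io.EOF)
}

func main() {
	byEquality, byErrorsIs := compare()
	fmt.Println(byEquality, byErrorsIs) // prints "false true"
}
```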
paul cannon
bbdb351e5e all: use jackc/pgx in place of lib/pq
What:

Use the github.com/jackc/pgx postgresql driver in place of
github.com/lib/pq.

Why:

github.com/lib/pq has some problems with error handling and context
cancellations (i.e. it might even issue queries or DML statements more
than once! see https://github.com/lib/pq/issues/939). The
github.com/jackc/pgx library appears not to have these problems, and
also appears to be better engineered and implemented (in particular, it
doesn't use "exceptions by panic"). It should also give us some
performance improvements in some cases, and even more so if we can use
it directly instead of going through the database/sql layer.

Change-Id: Ia696d220f340a097dee9550a312d37de14ed2044
2020-07-13 15:54:41 +00:00
Rafael Gomes
24a1eac16c storage/postgreskv: Sort storage keys before delete (postgres)
Change-Id: I63599e142b387ded25d110458ae10c2c96cd8ea6
2020-07-10 20:43:45 +00:00
Rafael Gomes
569b49768e storage/cockroachkv: Sort storage keys before delete (cockroach)
Change-Id: Iee8e56ff66e2760e933f3860d5fc75230a507558
2020-07-10 16:27:26 -04:00
Egon Elbre
5bdcd86fa7 ci: test benchmarks
This runs each benchmark for one iteration to ensure that they are
valid. Unfortunately, it does not give any useful metrics as output.

Change-Id: I68940398c8dd849aed656bd12656f48d5df10128
2020-07-10 13:26:49 +00:00
Qweder93
0521435e08 storagenode/gracefulexit: added deletion of all files left in storage/blobs/satellite after successful GE
https://storjlabs.atlassian.net/browse/SG-368

Change-Id: I29a978fe0d0153aedf2be91dc7f45b4ef386d447
2020-07-08 14:38:31 +03:00
JT Olio
78c0d9352d storage: support monkit traces of limit exceeded
errors.New errors will not show up in monkit tracing
as a useful error type. this change fixes a test (!)
and makes it so monkit will tell us what the error
type is, if we have this failure

Change-Id: Ic9933704e4095495c7ee286d9df3eb7eb94b25c9
2020-07-06 15:14:02 +00:00
Ivan Fraixedes
ed9816fd30 storage/filestore: Ignore IsNotExist error walking files
A file piece could be deleted in between walking the list of files read
from a directory and before we actually perform any operation on such
file. When that happens, we don't want to return an error, we want to
just ignore it and carry on.

Change-Id: I8f6986070e5883599a08fccf8b125c075b30fe1b
2020-06-23 19:28:42 +00:00
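The race described in the commit above can be sketched as follows. This is not the filestore code: `totalSize` and `demo` are hypothetical, and the check uses `errors.Is(err, fs.ErrNotExist)` where the commit title refers to `IsNotExist`. A file that disappears between listing and inspection is skipped instead of aborting the walk.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// totalSize sums the sizes of the named files in dir. Files that no longer
// exist are skipped; any other error aborts the walk.
func totalSize(dir string, names []string) (int64, error) {
	var total int64
	for _, name := range names {
		info, err := os.Lstat(filepath.Join(dir, name))
		if err != nil {
			if errors.Is(err, fs.ErrNotExist) {
				continue // deleted after listing: ignore and carry on
			}
			return 0, err
		}
		total += info.Size()
	}
	return total, nil
}

// demo lists two files, deletes one, and then walks the stale listing.
func demo() (int64, error) {
	dir, err := os.MkdirTemp("", "blobs-*")
	if err != nil {
		return 0, err
	}
	defer func() { _ = os.RemoveAll(dir) }()

	if err := os.WriteFile(filepath.Join(dir, "a"), []byte("abc"), 0o600); err != nil {
		return 0, err
	}
	if err := os.WriteFile(filepath.Join(dir, "b"), []byte("hello"), 0o600); err != nil {
		return 0, err
	}
	names := []string{"a", "b"}
	// Simulate a concurrent deletion that happens after the listing.
	if err := os.Remove(filepath.Join(dir, "b")); err != nil {
		return 0, err
	}
	return totalSize(dir, names)
}

func main() {
	n, err := demo()
	fmt.Println(n, err)
}
```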
Rafael Gomes
958ea1b9df satellite/accounting: add download limit cache
Change-Id: I722930cab8bd5d240f4878dc6997e9bc7637311f
2020-06-12 16:33:46 -03:00
Egon Elbre
36c461bd59 private/tagsql: track proper closing of rows and statements
This ensures that rows are closed to avoid leaks.
Also verifies that Err() is called, to ensure that no
error is left behind.

Change-Id: Idd1bec9bf479f40021da67b2c80ce83033149469
2020-06-05 18:25:43 +00:00
Egon Elbre
fca4f43a04 storage/filestore: benchmark diskInfoFromPath
Change-Id: I996057b1c650aec7cec84b49877d1e184a12514e
2020-06-02 17:49:14 +03:00
Natalie Villasana
e79e83b618 storage/cockroachkv: handle retry errors for GetAll and DeleteMultiple
It looks like GetAll and DeleteMultiple are only used in tests for now,
but they didn't have handling for retry errors returned from cockroach.
If they're used in prod in the future, now they will retry.

Change-Id: I0f281454ddebf282789142ff1d66a69bda5727c9
2020-06-01 16:13:43 +00:00
Natalie Villasana
4ad163de2f storage/cockroachkv: rename opi to oci
opi was a carry-over from "ordered postgres iterator".

Change-Id: Id28cb8b8b3dccd119a1d1bbbb7f20206e932e4c5
2020-05-27 17:05:02 -04:00
Qweder93
f2a0c64425 storage/filestore: log potential disk corruption
In walkNamespaceWithPrefix log in case of "lstat" error, because this may indicate an underlying disk corruption.

SG-50

Change-Id: I867c3ffc47cfac325ae90658ec4780d213ff3e63
2020-05-27 12:12:55 +00:00
Natalie Villasana
8bd4d7b43e storage/cockroachkv: add check if retry is needed during iteration
This changeset replaces https://review.dev.storj.io/c/storj/storj/+/1839
which did the same thing but Nat couldn't figure out how to fix conflicting
files the correct gerrity way.

Change-Id: If05a8902aca986ea9f6c9168a90b31beebab839a
2020-05-26 14:32:06 -04:00
Jeff Wendling
f03b23d2dc storage/postgreskv: monitor calls to sql.Next
Change-Id: I32c0c92b347a6d1cadfeb69117de58f4d9b41ad2
2020-05-14 16:35:25 +00:00
Egon Elbre
d98b8f6e23 satellite/metainfo,storage: use different limit for metainfo loop
Change-Id: I5ef7233930679b977b33f7b3e1dda45c907dcfad
2020-05-05 10:37:20 +00:00
Egon Elbre
c630cf2490 storagenode/pieces: implement buffering for writing
Currently uploads can cause a lot of IOPS. Reduce this by introducing an
in-memory buffer on top of the file.

Change-Id: I5f4e3e01c0a36258271d180b922107de447bcb59
2020-05-04 06:01:32 +00:00
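The effect of the buffering described in the commit above can be illustrated with the standard `bufio` package. This is only a sketch under assumptions: the chunk size, the buffer size, and the `countingWriter` type are hypothetical, and the actual change may buffer differently. The sketch counts how many writes reach the underlying writer with and without a buffer.

```go
package main

import (
	"bufio"
	"fmt"
)

// countingWriter stands in for a file and counts the Write calls it receives.
type countingWriter struct {
	writes int
	bytes  int
}

func (w *countingWriter) Write(p []byte) (int, error) {
	w.writes++
	w.bytes += len(p)
	return len(p), nil
}

// writeChunks writes 100 chunks of 100 bytes, optionally through a buffer,
// and returns the number of underlying writes and the total bytes written.
func writeChunks(buffered bool) (writes, total int) {
	sink := &countingWriter{}
	chunk := make([]byte, 100)
	if buffered {
		bw := bufio.NewWriterSize(sink, 4096)
		for i := 0; i < 100; i++ {
			_, _ = bw.Write(chunk)
		}
		_ = bw.Flush() // push any remaining buffered bytes to the sink
	} else {
		for i := 0; i < 100; i++ {
			_, _ = sink.Write(chunk)
		}
	}
	return sink.writes, sink.bytes
}

func main() {
	direct, _ := writeChunks(false)
	viaBuffer, _ := writeChunks(true)
	fmt.Println("unbuffered writes:", direct)
	fmt.Println("buffered writes:", viaBuffer)
}
```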
Qweder93
6c4d3f133f storagenode/dashboard: trash added to available space calculations
Change-Id: Ia6f3af20dc98f569b86796ffa68428065d662c78
2020-05-01 15:26:02 +00:00
Egon Elbre
d225e2de48 all: add missing ctx.Cleanup calls in tests
Change-Id: Iaa65f90b9731d721691322bb92fc3da736aa10fe
2020-04-29 17:58:40 +00:00
Egon Elbre
85c45cd56f private/dbutil/pgtest: support multiple databases for testing
Currently Cockroach isn't performant for concurrent database setup and
tear-down. Instead of a single instance allow setting multiple potential
connection strings and let the tests pick one connection string
randomly.

This improves test duration by ~10 minutes.

While we are significantly changing how pgtest works, introduce
helpers PickPostgres and PickCockroach for selecting the database to
reduce code duplication in multiple places.

Change-Id: I8ad171d5c4c8a4fc081ec2ae9bdd0cc948a80619
2020-04-28 21:55:49 +03:00
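Picking one of several connection strings at random, as the commit above describes, might be sketched like this. The function `pickRandom` and the semicolon separator are assumptions made for illustration; they are not the actual pgtest API or its configuration format.

```go
package main

import (
	"fmt"
	"math/rand"
	"strings"
)

// pickRandom splits a semicolon-separated list of connection strings and
// returns one of the non-empty entries at random. The separator and the
// function name are hypothetical.
func pickRandom(list string) (string, bool) {
	var candidates []string
	for _, s := range strings.Split(list, ";") {
		if s = strings.TrimSpace(s); s != "" {
			candidates = append(candidates, s)
		}
	}
	if len(candidates) == 0 {
		return "", false
	}
	return candidates[rand.Intn(len(candidates))], true
}

func main() {
	// Placeholder connection strings, not real endpoints.
	conn, ok := pickRandom("cockroach://host-a/db; cockroach://host-b/db")
	fmt.Println(conn, ok)
}
```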
Natalie Villasana
6f84be133a satellite/metainfo: add MigrateToLatest to PointerDB
In cases like the segment reaper script connecting to the metainfodb,
we don't want a db migration to happen automatically when we call
metainfo.NewStore. This adds MigrateToLatest method for postgreskv
and cockroachkv, and calls MigrateToLatest in places where NewStore used
to create tables.

Change-Id: I682d0f26d609af0601dfdb32a24866cdf5d32a7e
2020-04-28 17:26:35 +00:00
Egon Elbre
2c0d61b18e satellite/metainfo: avoid temporary list
Previously ListV2 loaded all of the data into memory, even when not all of
it was being used, using up more memory than needed.

Change-Id: I5846d979344729b447c108a6cc9f4227229ec981
2020-04-15 08:01:42 +00:00
Egon Elbre
1cad686e9b storage: reduce default lookup limit to 500
Change-Id: Ic0adbf2f519babd780237d34c60636c1a1606762
2020-04-13 19:00:23 +03:00
Jeff Wendling
97e980cd8a private/dbutil: add database name to configure as a tag
storagenodes have like 10 or more databases. without this
tag they all get sent as the same value, stomping on each
other.

Change-Id: Ib12019684d6ea8f2a5b83df584056dfa79e3c4b3
2020-03-26 16:50:15 +00:00
Egon Elbre
326c0cebde storage/boltdb: update to etcd/bbolt v1.3.4
bbolt v1.3.4 has pointer usage fixes.

Change-Id: I5e0fc4782711d01c09ced579f25a4f8fbc8de85c
2020-03-24 12:33:34 +02:00
Egon Elbre
decb2ec69a private/processgroup: moved to storj.io/common/processgroup
Change-Id: I1ec0bb440dda757d8f9a6f564a0084dde2f9cc84
2020-03-03 10:50:33 +00:00
Egon Elbre
8bef560ab9 storage: delete unused code and lower visibility of static iterator
Change-Id: I8ec6ec9a710650611d272b03b2927759a8b02f91
2020-02-17 14:53:54 +00:00
Jeff Wendling
5d6cb68cd7 storage/{cockroachkv,postgreskv}: detailed monitoring for list
Change-Id: Iedba10776367233e59f3a6523efdb303b836b241
2020-02-12 10:55:07 +00:00
Cameron Ayer
33d696b096 storage/redis/redisserver: simplify redisserver creation
Change-Id: I881576a7881db671b5abeeca7120a022987cc47f
2020-02-11 19:11:57 +00:00
Egon Elbre
34f38bf6ce mod: upgrade miniredis to latest
miniredis 2.5.0 had a bug with matching keys with newlines.

Change-Id: I9bcf998459be6d7d4e03bca3589e989e5ed2304d
2020-02-06 13:31:17 +00:00
Jeff Wendling
7999d24f81 all: use monkit v3
this commit updates our monkit dependency to the v3 version where
it outputs in an influx style. this makes discovery much easier
as many tools are built to look at it this way.

graphite and rothko will suffer some due to no longer being a tree
based on dots. hopefully time will exist to update rothko to
index based on the new metric format.

it adds an influx output for the statreceiver so that we can
write to influxdb v1 or v2 directly.

Change-Id: Iae9f9494a6d29cfbd1f932a5e71a891b490415ff
2020-02-05 23:53:17 +00:00
Egon Elbre
81d44f19ee storage/filestore: ensure we bail on deleted folder without error
Change-Id: Iecf5f9ac5bc278489b433923c526d60611d356a4
2020-01-30 16:32:10 -05:00
Egon Elbre
da5e408afe storage: add DeleteMultiple method
DeleteMultiple will allow metainfo to delete multiple segments
and get the old pointers in a single request.

Change-Id: Ic144f30c5453274fa2b80df2895f123f5a9cc48b
2020-01-29 13:13:54 -05:00
Egon Elbre
c6f94ce9e4 satellite/metainfo: remove support for boltdb based pointerDB
With the previous changes in place, we can now remove testplanet.New and
also remove metainfo boltdb support.

Change-Id: I5bdfbbbb45967492728e705b34b2fedb4f28c381
2020-01-23 13:54:00 +02:00