storj/cmd/tools/segment-verify

segment-verify is a tool for verifying segments.

High Level Overview

segment-verify verifies segment status on storage nodes in a few stages:

  1. First it loads a batch of --service.batch-size=10000 segments from the metabase.
  2. These segments are then distributed into queues, one per storage node. Nodes listed in the --service.priority-nodes-path file (one storage node ID per line) are chosen preferentially.
  3. Then it queries each storage node for a single byte of each segment, making up to --service.concurrency=1000 concurrent connections at a time.
  4. Every segment is checked --service.check=3 times. However, any failed attempt (e.g. the node is offline) is retried only once.
  5. When the verification process itself fails, the affected segments are written to the --service.retry-path=segments-retry.csv file.
  6. When a segment is not found on at least one of the nodes, it is written to the --service.not-found-path=segments-not-found.csv file.
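The inputs and outputs from the steps above can be sketched as follows. This is a hypothetical illustration: the node IDs are placeholders, not real storage node IDs, and the file names simply reuse the defaults mentioned above.

```shell
set -eu

# --service.priority-nodes-path expects one storage node ID per line.
# These IDs are placeholders for illustration only.
cat > priority-nodes.txt <<'EOF'
PLACEHOLDER-NODE-ID-1
PLACEHOLDER-NODE-ID-2
EOF

# After a run, failures land in two CSV files (default paths named above):
#   segments-retry.csv      - segments whose verification itself failed
#   segments-not-found.csv  - segments missing from at least one node
wc -l < priority-nodes.txt
```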

There are a few parameters for controlling the verification itself:

# This throttles requests to avoid overloading the storage nodes.
--verify.request-throttle duration         minimum interval for sending out each request (default 150ms)
# When a request fails, the process retries after this duration.
--verify.order-retry-throttle duration     how much to wait before retrying order creation (default 50ms)
# This is the time each storage node has to respond to a request.
--verify.per-piece-timeout duration        duration to wait per piece download (default 800ms)
# Just the regular dialing timeout.
--verify.dial-timeout duration             how long to wait for a successful dial (default 2s)

Running the tool

segment-verify run range --low 00 --high ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff --config-dir ./satellite-config-dir
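A fuller invocation can combine the service and verify flags described above. The sketch below only assembles and prints the command line rather than executing it, since an actual run requires a live satellite deployment; the config directory path is an assumption.

```shell
set -eu

CONFIG_DIR=./satellite-config-dir   # assumed satellite config directory

# Build the argument list from the flags documented above.
args=(
  run range
  --low 00
  --high ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
  --config-dir "$CONFIG_DIR"
  --service.batch-size 10000
  --service.concurrency 1000
  --service.check 3
  --verify.per-piece-timeout 800ms
)
cmd="segment-verify ${args[*]}"
printf '%s\n' "$cmd"
```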