Commit Graph

52 Commits

Author SHA1 Message Date
paul cannon
740cb0d9c7 cmd/tools/segment-verify: fix read-csv subcommand
We were reading in a segment's stream ID and position, and assuming that
was enough for the downloader. But of course, the downloader needs
AliasPieces filled in. So now we request each segment record from the
metabase and fill in the VerifySegment records entirely.
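
The fix described above can be sketched roughly as follows. This is only an illustration: the two-column `stream_id,position` layout, the `segmentKey` and `segmentRecord` types, and the `lookupFunc` callback (standing in for a metabase query) are assumptions, not the tool's actual code.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strconv"
	"strings"
)

// segmentKey holds the two columns assumed to be present in the CSV input.
type segmentKey struct {
	StreamID string
	Position uint64
}

// segmentRecord is a hypothetical full record; Pieces stands in for AliasPieces.
type segmentRecord struct {
	segmentKey
	Pieces []int
}

// lookupFunc is a hypothetical stand-in for a metabase query.
type lookupFunc func(segmentKey) (segmentRecord, error)

// loadSegments parses "stream_id,position" rows and resolves each key to a
// full record, so the piece list is populated before verification starts.
func loadSegments(input string, lookup lookupFunc) ([]segmentRecord, error) {
	rows, err := csv.NewReader(strings.NewReader(input)).ReadAll()
	if err != nil {
		return nil, err
	}
	records := make([]segmentRecord, 0, len(rows))
	for _, row := range rows {
		if len(row) != 2 {
			return nil, fmt.Errorf("expected 2 columns, got %d", len(row))
		}
		position, err := strconv.ParseUint(row[1], 10, 64)
		if err != nil {
			return nil, fmt.Errorf("bad position %q: %w", row[1], err)
		}
		record, err := lookup(segmentKey{StreamID: row[0], Position: position})
		if err != nil {
			return nil, err
		}
		records = append(records, record)
	}
	return records, nil
}

func main() {
	// A fake lookup that fills in a fixed piece list for every key.
	lookup := func(key segmentKey) (segmentRecord, error) {
		return segmentRecord{segmentKey: key, Pieces: []int{1, 2, 3}}, nil
	}
	records, err := loadSegments("aa,1\nbb,2\n", lookup)
	fmt.Println(len(records), err)
}
```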

Change-Id: If85236388eb99a65e2cb739aa976bd49ee2b2c89
2023-01-24 09:08:03 +00:00
paul cannon
c3b5c18d00 cmd/tools/segment-verify: learn to take a CSV list of segments as input
This will allow us to retry some specific segments from
segments-retry.csv with particularly high counts of "retry" pieces.

Change-Id: I48fd419cc0350a3be4c9e77ce8d28871565b7f97
2023-01-18 20:53:27 +00:00
Michal Niewrzal
0185bba90a cmd: cleanup segment verify/repair tools
* use the same DB application name for satellite and metabase
* use noop orders DB implementation to avoid storing allocated bandwidth
in DB

Change-Id: I20e88c694d38240fe1a20c45719e210cfb76402c
2023-01-12 15:27:07 +00:00
paul cannon
246c193145 cmd/tools/segment-verify: add timeout to VerifyWithExists
We have to wait until the slowest node is done being tested before we
can move on to the next batch of segments. Since the slowest node can be
arbitrarily slow, we'll set a timeout and treat too-slow nodes as
temporarily offline.
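
A minimal sketch of the idea, assuming the per-node check honors context cancellation. The `checkNode` helper, the `outcome` values, and the durations are hypothetical and only illustrate treating a timed-out node as offline.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

type outcome string

const (
	outcomeOK      outcome = "ok"
	outcomeOffline outcome = "offline"
	outcomeFailed  outcome = "failed"
)

// checkNode runs check under a deadline. A check that fails because the
// deadline passed is classified as offline rather than as a hard failure.
// This assumes check returns promptly once its context is done.
func checkNode(ctx context.Context, timeout time.Duration, check func(context.Context) error) outcome {
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()

	err := check(ctx)
	switch {
	case err == nil:
		return outcomeOK
	case errors.Is(err, context.DeadlineExceeded):
		return outcomeOffline
	default:
		return outcomeFailed
	}
}

// demo runs one check that answers immediately and one that never answers.
func demo() (fast, slow outcome) {
	fast = checkNode(context.Background(), time.Second, func(context.Context) error {
		return nil
	})
	slow = checkNode(context.Background(), 10*time.Millisecond, func(ctx context.Context) error {
		<-ctx.Done() // simulate a node that stalls until the deadline
		return ctx.Err()
	})
	return fast, slow
}

func main() {
	fmt.Println(demo())
}
```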

Change-Id: I80fe865dd4e8f826700430fb0140c2d3aefca381
2023-01-06 17:25:30 +00:00
paul cannon
23acee2df0 cmd/tools/segment-verify: better handle timeouts
When we are verifying pieces by downloading the first byte, if we
encounter a timeout, treat the node as if we failed to connect to it,
and log the error once instead of twice.

Change-Id: I70602d554183c98f1213f3ffb1bfec41100ea0e7
2023-01-06 16:52:28 +00:00
paul cannon
3a9ad48345 cmd/tools/segment-verify: fix problem-pieces.csv
This csv file was being closed as soon as the service was created.
All subsequent writes to the closed file handle produced errors,
which were logged but otherwise ignored.

Instead, we would like the file to remain open and writable, until
the service is destroyed.
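
The lifetime issue can be illustrated with a small sketch. The `pieceLog` type and its methods are hypothetical; the point is only that the constructor must not close the file, and that closing belongs to the owner's `Close` method.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"os"
	"path/filepath"
)

// pieceLog owns the file handle for as long as the service is alive.
type pieceLog struct {
	file   *os.File
	writer *csv.Writer
}

// openPieceLog creates the file and hands ownership to the returned value.
func openPieceLog(path string) (*pieceLog, error) {
	file, err := os.Create(path)
	if err != nil {
		return nil, err
	}
	// Deliberately no `defer file.Close()` here: that would close the
	// handle as soon as this constructor returns.
	return &pieceLog{file: file, writer: csv.NewWriter(file)}, nil
}

// Write appends one record to the log.
func (l *pieceLog) Write(fields ...string) error {
	return l.writer.Write(fields)
}

// Close flushes buffered records and releases the file handle.
func (l *pieceLog) Close() error {
	l.writer.Flush()
	flushErr := l.writer.Error()
	closeErr := l.file.Close()
	if flushErr != nil {
		return flushErr
	}
	return closeErr
}

// demo writes one record, closes the log, and returns the file contents.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "piece-log")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "problem-pieces.csv")
	log, err := openPieceLog(path)
	if err != nil {
		return "", err
	}
	if err := log.Write("node1", "piece7"); err != nil {
		return "", err
	}
	if err := log.Close(); err != nil {
		return "", err
	}
	data, err := os.ReadFile(path)
	return string(data), err
}

func main() {
	fmt.Println(demo())
}
```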

Change-Id: Ib29944d25b2f5b2d0f90fdbdcde44fea8d769321
2023-01-06 16:23:34 +00:00
paul cannon
6e1554652a cmd/tools/segment-verify: handle dq'd nodes
Previously, if any pieces were still on disqualified nodes, this tool
would treat those pieces as fine (if the disqualified node is still
online) or temporarily unavailable (if the disqualified node is
offline). Instead, we should treat such pieces as lost.
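
As a rough illustration of the new rule, the classification could look like the sketch below. The `pieceStatus` values and `classifyPiece` are hypothetical names, and "ok" here only means the piece is eligible for the normal check, not that it has been verified.

```go
package main

import "fmt"

type pieceStatus string

const (
	statusOK      pieceStatus = "ok"      // proceed with the normal check
	statusOffline pieceStatus = "offline" // temporarily unavailable
	statusLost    pieceStatus = "lost"    // counted as missing
)

// classifyPiece applies the rule from the commit message: disqualification
// takes precedence over the node's online state.
func classifyPiece(disqualified, online bool) pieceStatus {
	switch {
	case disqualified:
		return statusLost
	case !online:
		return statusOffline
	default:
		return statusOK
	}
}

func main() {
	fmt.Println(classifyPiece(true, true))   // disqualified but online
	fmt.Println(classifyPiece(true, false))  // disqualified and offline
	fmt.Println(classifyPiece(false, false)) // healthy node, currently offline
}
```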

This also fixes a slight problem with the code that handles a broken
alias. This is not likely to happen, but if we do see an alias that is
not in the alias map, we return an error instead of nil.

Change-Id: Ib4e2e729ef0535dd7bd9ce2f621680d9f959891c
2023-01-04 17:54:03 +00:00
paul cannon
2feb49afc3 cmd/tools/segment-verify: don't cache offline status forever
Because it was originally intended to work on only a few pieces from
each segment at a time, and would frequently have reset its list of
online nodes, segment-verify has been taking nodes out of its
onlineNodes set and never putting them back. This means that over a long
run in Check=0 mode, we end up treating more and more nodes as offline,
permanently. This trend obfuscates the number of missing pieces that
each segment really has, because we don't check pieces on offline nodes.

This commit changes the onlineNodes set to an "offlineNodes" set, with
an expiration time on the offline-ness quality. So nodes are added to
the offlineNodes set when we see they are offline, and then we only
treat them as offline for the next 30 minutes (configurable). After that
point, we will try connecting to them again.
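
A set with expiring entries, as described above, might be sketched like this. The `offlineSet` type, its method names, and the injected clock are assumptions made for illustration; only the 30-minute default comes from the commit message.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// offlineSet remembers nodes seen offline, but only for a limited time.
type offlineSet struct {
	mu      sync.Mutex
	ttl     time.Duration
	now     func() time.Time // injected so the example can use a fake clock
	expires map[string]time.Time
}

func newOfflineSet(ttl time.Duration, now func() time.Time) *offlineSet {
	return &offlineSet{ttl: ttl, now: now, expires: map[string]time.Time{}}
}

// MarkOffline records that the node was just seen offline.
func (s *offlineSet) MarkOffline(node string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.expires[node] = s.now().Add(s.ttl)
}

// IsOffline reports whether the node is still within its offline window.
// Expired entries are dropped so the node gets dialed again.
func (s *offlineSet) IsOffline(node string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	until, ok := s.expires[node]
	if !ok {
		return false
	}
	if !s.now().Before(until) {
		delete(s.expires, node)
		return false
	}
	return true
}

// demo checks a node 29 and then 31 minutes after it was marked offline.
func demo() (before, after bool) {
	clock := time.Date(2023, 1, 3, 0, 0, 0, 0, time.UTC)
	set := newOfflineSet(30*time.Minute, func() time.Time { return clock })

	set.MarkOffline("node1")
	clock = clock.Add(29 * time.Minute)
	before = set.IsOffline("node1")
	clock = clock.Add(2 * time.Minute)
	after = set.IsOffline("node1")
	return before, after
}

func main() {
	fmt.Println(demo())
}
```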

Change-Id: I14f0332de25cdc6ef655f923739bcb4df71e079e
2023-01-03 23:11:42 +00:00
paul cannon
46d99a06d5 cmd/tools/segment-verify: write to pieces csv from WithExists methods
The WithExists methods previously were not writing problematic pieces to
problem-pieces.csv. With this change, they will.

Change-Id: I51eadd3d8f4299e1efa787c9266a7aacfa525eb3
2022-12-28 01:12:46 +00:00
paul cannon
9544936794 cmd/tools/segment-verify: don't double-count notfound
When this branch is followed, `audit.OutcomeFailure` is returned, and
`MarkNotFound()` is immediately called again (in
`(*NodeVerifier).Verify()`). Calling `MarkNotFound()` twice for the same
piece is not correct.

Change-Id: I1a2764bc32ed015628fcd9353ac3307f269b4bbd
2022-12-28 00:37:14 +00:00
paul cannon
aec596bb39 cmd/tools/segment-verify: monkit-ify WithExists methods
It may help to know how much faster these methods are than the
alternative (asking nodes for each piece in turn).

Change-Id: Ieb7c963f62b662f72c84a49de8a09c065c14f782
2022-12-28 00:00:40 +00:00
paul cannon
42e2a14316 cmd/tools/segment-verify: flush after write to pieces csv
It was ok as it was, but since we want to keep a close eye on progress
while the tool is running, it will help to have results written to the
output file immediately instead of after the buffer is full or the
program exits.
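
The buffering behavior can be seen in a small sketch using the standard `encoding/csv` writer, which buffers records until `Flush` is called. The record contents are made up, and a `bytes.Buffer` stands in for the output file.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
)

// demo writes one record and reports how many bytes have reached the
// underlying writer before and after Flush.
func demo() (before, after int, err error) {
	var out bytes.Buffer // stands in for the output file
	w := csv.NewWriter(&out)

	if err := w.Write([]string{"node1", "piece7"}); err != nil {
		return 0, 0, err
	}
	before = out.Len() // the record is still held in the writer's buffer

	w.Flush()
	after = out.Len() // the record has now reached the underlying writer
	return before, after, w.Error()
}

func main() {
	fmt.Println(demo())
}
```

Flushing after every record trades some throughput for output that can be followed while the tool is still running.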

Change-Id: Ie027f05771a637afb06969ec775cd32b142b7635
2022-12-27 14:58:54 -06:00
paul cannon
b2422caaef cmd/tools/segment-verify: log less retry segments
When Check == 0 (check all pieces), there is nearly always a piece left
in the retry count, so most segments get logged in segments-retry.csv.
This change makes it so we require retry>5 before adding to
segments-retry.csv (only in the check==0 case).
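
The threshold can be sketched as a small predicate. The function and constant names are hypothetical, and the behavior for Check != 0 (log any segment with at least one retry piece) is an assumption, since the commit message only describes the Check == 0 case.

```go
package main

import "fmt"

// retryLogThreshold is the cutoff named in the commit message.
const retryLogThreshold = 5

// shouldLogRetry decides whether a segment goes into segments-retry.csv.
func shouldLogRetry(check, retry int) bool {
	if check == 0 {
		// Checking all pieces: require more than the threshold.
		return retry > retryLogThreshold
	}
	// Assumption: outside the check==0 case, any retry piece is logged.
	return retry > 0
}

func main() {
	fmt.Println(shouldLogRetry(0, 5), shouldLogRetry(0, 6), shouldLogRetry(3, 1))
}
```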

Change-Id: Iaea523c27eb777e3c248c27c7ef5effe77ae54cf
2022-12-23 14:29:25 +00:00
paul cannon
0b790070a3 cmd/tools/segment-verify: pass over bad segments
Change-Id: I1b4dd9da755c6a2028760723e15219f5821f702f
2022-12-22 18:12:12 -06:00
Michal Niewrzal
4851b4e06d cmd/tools/segment-verify: small improvements
* better error handling when Exists method is not available on SN
* more optimal processing of response from Exists method

Change-Id: I6d61c09473e9f5ab76a4601720e8bd520767f4c2
2022-12-22 15:21:33 +00:00
Clement Sam
cda1d67465 cmd/tools/segment-verify: adjust to SN Exists endpoint
Change-Id: I409aeae29aa87996f2a6047f976d215a69e9d7f5
2022-12-21 19:24:31 +00:00
Fadila Khadar
d23e25ce0f cmd/tools/segment-verify: remove unused test code
Some code was accidentally added to a test. As it is unused, this PR removes it.

Change-Id: I7adddc78c5ed747225e365989ab58504a9625ad7
2022-12-19 14:33:08 +00:00
Ethan Adams
1c309a0318 cmd/tools/segment-verify: check for unvetted nodes
This also renames the command from `duplicates` to `node-check`.

Change-Id: Idd303b17ec03f5b55fbbb1f4039a7761da37abe6
2022-12-19 09:59:13 +00:00
Egon Elbre
04f16f8768 cmd/tools/segment-verify: tool for checking duplicate net
Change-Id: Ie47c1282e580ffc418bf3b1f3c8820a48973aefc
2022-12-15 22:58:36 +00:00
paul cannon
727136141a satellite/cmd/tools/segment-verify: check all pieces
This adds the capability to the segment-verify tool of checking all
pieces of every indicated segment.

Pieces which could not be accessed (i.e. we couldn't get a single
byte from them) are recorded in a csv file.

I haven't been able to test this in any very meaningful way, yet, but I
am comforted by the fact that the worst things it could possibly do are
(a) download pieces too many times, and (b) miss downloading some
pieces.

Change-Id: I3aba30921572c974993363eb36d0fd5b3ae97907
2022-12-14 19:06:08 +00:00
Fadila Khadar
995f78d579 satellite/cmd: segment-verify verifies segments in given bucket list
Provides the `segment-verify run buckets` command for verifying segments within a list of buckets.

Bucket list is a csv file with `project_id,bucket_name` to be checked.
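
A bucket list in this shape could be parsed along the following lines. This is a sketch only: the `bucketRef` type and `parseBucketList` are hypothetical, the file is assumed to have no header row, and project IDs are kept as plain strings.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// bucketRef is one `project_id,bucket_name` row.
type bucketRef struct {
	ProjectID  string
	BucketName string
}

// parseBucketList reads rows of exactly two fields each.
func parseBucketList(input string) ([]bucketRef, error) {
	reader := csv.NewReader(strings.NewReader(input))
	reader.FieldsPerRecord = 2 // reject rows with a different field count

	rows, err := reader.ReadAll()
	if err != nil {
		return nil, err
	}
	buckets := make([]bucketRef, 0, len(rows))
	for _, row := range rows {
		buckets = append(buckets, bucketRef{ProjectID: row[0], BucketName: row[1]})
	}
	return buckets, nil
}

func main() {
	buckets, err := parseBucketList("project-a,photos\nproject-b,backups\n")
	fmt.Println(buckets, err)
}
```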

https://github.com/storj/storj-private/issues/101

Change-Id: I3d25c27b56fcab4a6a1aebb6f87514d6c97de3ff
2022-12-13 20:10:00 +00:00
Cameron
f06da25c3d satellite/overlay: add nodeevents.DB to satellite overlay service
Add nodeevents.DB to satellite overlay service so we can insert node
events into the nodeevents DB.

Change-Id: I642c0ccc9941ecdb08cb22d5c8cf701959a55156
2022-11-02 15:56:37 +00:00
Michal Niewrzal
d21bbab2b2 satellite: fix metabase configuration wiring
The new flag 'MultipleVersions' was not correctly passed from metainfo
configuration to metabase configuration. Because the configuration was
set correctly for unit tests, we didn't catch it, and the issue was found
while testing on the QA satellite.

This change reduces the number of places where new metabase flags need
to be propagated from metainfo configuration, to avoid problems with
setting new flags in the future.

Fixes https://github.com/storj/storj/issues/5274

Change-Id: I74bc122649febefd87f665be2fba628f6bfd9044
2022-11-02 15:17:34 +00:00
JT Olio
58a9c55f36 mod: bump dependencies
- storj.io/common

Change-Id: Ib78154acc253a13683495abfdd96d702625fdce8
2022-10-19 17:01:53 +00:00
Egon Elbre
22c0b0ac5c cmd/tools/segment-verify: don't mark node immediately offline
Rather than marking a node offline immediately, wait for more failures
before removing it from the set.
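
One way to express this is a per-node failure counter, sketched below. The `failureTracker` type and the threshold of 3 are hypothetical; the commit message does not say how many failures are tolerated.

```go
package main

import "fmt"

// failureTracker counts consecutive failures per node.
type failureTracker struct {
	threshold int
	failures  map[string]int
}

func newFailureTracker(threshold int) *failureTracker {
	return &failureTracker{threshold: threshold, failures: map[string]int{}}
}

// RecordFailure notes one more failure and reports whether the node has
// now failed often enough to be treated as offline.
func (t *failureTracker) RecordFailure(node string) (offline bool) {
	t.failures[node]++
	return t.failures[node] >= t.threshold
}

// RecordSuccess clears the node's failure count.
func (t *failureTracker) RecordSuccess(node string) {
	delete(t.failures, node)
}

func main() {
	tracker := newFailureTracker(3) // hypothetical threshold
	for attempt := 1; attempt <= 3; attempt++ {
		fmt.Println(attempt, tracker.RecordFailure("node1"))
	}
}
```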

Change-Id: I4363294a75d7d2844afc1f9c0025f664f933c2d7
2022-10-14 08:10:26 +00:00
Egon Elbre
a80a0ebeae cmd/tools/segment-verify: redial once rather than giving up
We are quite likely to hit some time limit. Rather than giving up
immediately, let's retry after the throttle amount.
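
The retry described here amounts to a single extra attempt after a pause. In the sketch below, `dialWithRetry`, `errDial`, and the throttle value are hypothetical names chosen for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errDial is a hypothetical dial error used by the example.
var errDial = errors.New("dial failed")

// dialWithRetry tries dial once and, on failure, waits for the throttle
// duration and tries exactly one more time.
func dialWithRetry(throttle time.Duration, dial func() error) error {
	if err := dial(); err == nil {
		return nil
	}
	time.Sleep(throttle)
	return dial()
}

func main() {
	attempts := 0
	err := dialWithRetry(10*time.Millisecond, func() error {
		attempts++
		if attempts == 1 {
			return errDial // first attempt fails
		}
		return nil
	})
	fmt.Println(attempts, err)
}
```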

Change-Id: I20944b058d771f5d4bfa0eea7a2c26cefcd74739
2022-10-14 01:18:31 +00:00
Cameron
a52f766273 satellite/overlay: add email-sending functionality to overlay service
We want to send emails to SNOs. Node status changes go through the
overlay service, so it's a good place to add the mail service.
Add the mailservice.Service, satellite address, and satellite name to
overlay service. Also add feature flag --overlay.send-node-emails

Change-Id: I3bd2cb3bf22f9724954ce2374f8b651b902b3a24
2022-10-13 18:01:05 +00:00
Egon Elbre
dd60318147 cmd/tools/segment-verify: use resolved ip
Change-Id: I3662aaea3ff8721c415c038b2b5324d165b60975
2022-10-12 12:43:11 +00:00
Egon Elbre
ff22fc7ddd all: fix deprecated ioutil commands
Change-Id: I59db35116ec7215a1b8e2ae7dbd319fa099adfac
2022-10-11 15:27:29 +00:00
Egon Elbre
142a04f208 cmd/tools/segment-verify: add connection pool
Change-Id: If0f85edbf99438ac41c23fc7107fdab926288cc2
2022-10-11 09:06:44 +03:00
Egon Elbre
8916f2ee92 cmd/tools/segment-verify: allow ignoring specific nodes
This adds a new flag that allows ignoring some nodes completely.

Change-Id: I203d25f931262c809037e25e9c37e9a89bf47026
2022-10-10 20:14:38 +03:00
Egon Elbre
9e50d837e3 cmd/tools/segment-verify: add tool for summarizing log
Change-Id: I3177ab71dfd25e11adfedce32a530d83dda63bd6
2022-10-10 20:02:50 +03:00
Egon Elbre
5f01dad3a3 cmd/tools/segment-verify: add total progress indicator
Change-Id: Ib729abf6adbeba8d94e08c7e11497c6d5ddd5ec2
2022-10-10 20:02:30 +03:00
Egon Elbre
ea4b3023d9 cmd/tools/segment-verify: fix piece id derivation
Change-Id: Ib27fd8630e1e5a90060dff2a09c51f488960177f
2022-10-06 13:43:08 +00:00
Egon Elbre
c8506cdda3 satellite/metabase,cmd/tools/segment-verify: simplify interface
Change-Id: Icdd445b1713bc26cee3b3a125b68b0cde0739837
2022-10-06 13:42:00 +00:00
Egon Elbre
c1817ab743 cmd/tools/segment-verify: a few fixes
The flags weren't properly loading from config.

The code assumed that every node that's online for downloading also has
data uploaded to it, which is not true.

Change-Id: Ifd65a47b9eca5b4841231928244fab17acbde6fb
2022-10-05 15:51:38 +00:00
Egon Elbre
4c374a2357 cmd/tools/segment-verify: add tiny readme
Change-Id: Ia314c615f8b7fdb13e2b5f81c1be82ec686ca819
2022-10-03 16:01:24 +00:00
Michal Niewrzal
a97cd97789 satellite/orders: remove unused service dependency
Orders service doesn't need buckets service anymore.

Change-Id: I27853cda87e82b528f53667e4b4866801f7bfb62
2022-09-28 08:56:36 +00:00
Egon Elbre
8069973dee cmd/tools/segment-verify: add failure tests
* Fix an invalid slice index calculation.

Change-Id: I7f1b85edc46df362697aa132b967d5d23f9d5522
2022-09-26 19:38:16 +00:00
Egon Elbre
f98d551c9b cmd/tools/segment-verify: test service
Change-Id: Ibd83960c18123e8f29e22089007dc32c8d532240
2022-09-22 17:23:02 +00:00
Egon Elbre
0bfaadcc6c cmd/tools/segment-verify: fixes and more tests
* Disallow too large listing limit, which would cause a lot of memory to
  be consumed.
* Fix throttling logic and add a test.
* Fix read error handling; depending on the concurrency it can return
  the NotFound status either in the Read or Close.
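
The read error handling in the last item can be sketched as checking both the Read and the Close error for the not-found status. The `errNotFound` sentinel, the `fakeDownload` type, and `pieceExists` are hypothetical; the real tool works with storage node download streams.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// errNotFound is a hypothetical sentinel for a missing piece.
var errNotFound = errors.New("piece not found")

// fakeDownload simulates a download stream with configurable errors.
type fakeDownload struct {
	data     []byte
	readErr  error
	closeErr error
}

func (f *fakeDownload) Read(p []byte) (int, error) {
	if f.readErr != nil {
		return 0, f.readErr
	}
	if len(f.data) == 0 {
		return 0, io.EOF
	}
	n := copy(p, f.data)
	f.data = f.data[n:]
	return n, nil
}

func (f *fakeDownload) Close() error { return f.closeErr }

// pieceExists reads the first byte and treats a not-found error from
// either Read or Close as "piece missing" rather than as a failure.
func pieceExists(download io.ReadCloser) (bool, error) {
	var first [1]byte
	_, readErr := io.ReadFull(download, first[:])
	closeErr := download.Close()

	if errors.Is(readErr, errNotFound) || errors.Is(closeErr, errNotFound) {
		return false, nil
	}
	if readErr != nil {
		return false, readErr
	}
	if closeErr != nil {
		return false, closeErr
	}
	return true, nil
}

func main() {
	fmt.Println(pieceExists(&fakeDownload{data: []byte("x")}))
	fmt.Println(pieceExists(&fakeDownload{readErr: errNotFound}))
	fmt.Println(pieceExists(&fakeDownload{data: []byte("x"), closeErr: errNotFound}))
}
```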

Change-Id: I778f04a5961988b2480df5c7faaa22393fc5d760
2022-09-22 10:32:30 +00:00
Egon Elbre
0e99f7a8cf cmd/tools/segment-verify: add loading of priority nodes
Change-Id: Idcc41469ea5f71eab1b9dccbe0f14da537386a17
2022-09-21 14:56:13 +00:00
Egon Elbre
8b527f2d12 cmd/tools/segment-verify: add throttling
Change-Id: Ia0b4ec255adc90d874f4366b80799414a1a94700
2022-09-21 14:52:51 +00:00
Egon Elbre
cf50696745 cmd/tools/segment-verify: wire up overlay logic
Change-Id: I0a4c737a8b0995a1c3e3adeac728fe833d0ce684
2022-09-19 11:32:18 +03:00
Egon Elbre
0809ae73cf cmd/tools/segment-verify: add main
Change-Id: Ib7161a0f44d447f9ddb9be83f6673587a0bd7712
2022-09-19 10:36:57 +03:00
Jennifer Johnson
8529a169ee cmd/tools/segment-verify: add verifier
Change-Id: I4cc1fbcf964c4a9a37cf80322f6f99dd956f3d7b
2022-09-19 10:36:57 +03:00
Egon Elbre
9b520b2114 satellite/metabase: expose ConvertNodesToAliases and ConvertAliasesToNodes
They are needed for segment-verify tool.

Also rename some of the conversion methods to make clear
which of them have side effects.

Change-Id: Ie9a0952548e9ed5068c7a30c2fd2134b07139bca
2022-09-15 13:56:10 +00:00
Egon Elbre
cd81c5bd58 cmd/tools/segment-verify: add csv writer
Change-Id: I9306d7a6927f4dacca9623d7bd57f8560404db3e
2022-09-15 13:28:21 +00:00
Egon Elbre
507b099d44 cmd/tools/segment-verify: add monitoring / error
Change-Id: I6fd0369719ddf176a98208348560004a4134f810
2022-09-14 18:20:48 +00:00
Egon Elbre
6127f465dc cmd/tools/segment-verify: add logic for iterating over segments
This adds parts for:
1. iterating over the segments
2. using an interface for writing the segments
3. stubs for handling deleted segments

Change-Id: I76a17cac6deb0b6c042a8ab7c4155a890db9da84
2022-09-14 18:20:31 +00:00