Commit Graph

35 Commits

Author SHA1 Message Date
Qweder93
01bb2bd17d satellite/audit: verifier checks if node made sucess GE before auditing
Change-Id: Ia6cde4e9fcf11020a5301d38065f7159f276eb80
2020-08-17 23:37:57 +03:00
Egon Elbre
94a09ce20b all: add missing dots
Change-Id: I93b86c9fb3398c5d3c9121b8859dad1c615fa23a
2020-08-11 17:50:01 +03:00
Egon Elbre
080ba47a06 all: fix dots
Change-Id: I6a419c62700c568254ff67ae5b73efed2fc98aa2
2020-07-16 14:58:28 +00:00
Egon Elbre
ed627144ed all: use DialNodeURL throughout the codebase
Change-Id: Iaf9ae3aeef7305c937f2660c929744db2d88776c
2020-05-20 10:36:30 +00:00
paul cannon
0c8c11b251 satellite/audit: add not_enough_shares_for_audit counter
We have been using the SQL expression `name='(*Verifier).Verify' AND
error_name='not enough shares for successful audit'` thus far to detect
cases of this problem and alert on them. Unfortunately, since this
rarely (hopefully never) happens, influxdb has no data for most of the
auditor instances, and when it has no data for a time series, it returns
no columns either. This makes Redash upset when it tries to perform a
query for an alert and can't find the column whose value it expects to
check.

This change should make it so zero values are reported when the problem
has not happened, and higher values when it has.

Change-Id: I79e5e000f879678b661dac88caae1e2915b39ab1
2020-04-03 17:00:50 +00:00
littleskunk
23e5a0471f
satellite/audit: clean up logging (#3832)
Co-authored-by: Ivan Fraixedes <ivan@fraixed.es>
2020-03-30 12:09:50 -06:00
Bill Thorp
94c11c5212 satellite: remove some unnecessary UTC() calls
Fixes some easy cases of extraneous UTC() calls

Change-Id: I3f4c287ae622a455b9a492a8892a699e0710ca9a
2020-03-13 13:49:44 +00:00
Jennifer Johnson
0d60c1a4b2 satellite/audit: fix checkSegmentAltered to detect segments that have changed during an audit
- Previously, checkSegmentAltered only checked for segments that were replaced
  but we want to detect all changes to a segment that occurred while an audit was being conducted.
- Fixed a bug where nodes failing audits during reverify for non-piece-hash-verified
  segments were not being removed from containment mode.
- Filled in gaps in reverify testing to ensure nodes are properly removed from containment.

Change-Id: Icd96d369278987200fd28581395725438972b292
2020-03-05 19:05:39 +00:00
Moby von Briesen
6043d01c90 satellite/audit/verifier: add metric for number of successfully downloaded shares
Change-Id: Ia4f1dc6e088db802e340aaecf80cc7ef6dc237a4
2020-02-27 14:33:59 +00:00
Egon Elbre
5342dd9fe6 go.mod: update uplink
Change-Id: I867a6a1eef8aa5d60bb676e5112b98c4192ce811
2020-02-21 16:08:12 +02:00
Jeff Wendling
7999d24f81 all: use monkit v3
this commit updates our monkit dependency to the v3 version where
it outputs in an influx style. this makes discovery much easier
as many tools are built to look at it this way.

graphite and rothko will suffer some due to no longer being a tree
based on dots. hopefully time will exist to update rothko to
index based on the new metric format.

it adds an influx output for the statreceiver so that we can
write to influxdb v1 or v2 directly.

Change-Id: Iae9f9494a6d29cfbd1f932a5e71a891b490415ff
2020-02-05 23:53:17 +00:00
Egon Elbre
082ec81714
uplink: move to storj.io/uplink (#3746) 2020-01-08 15:40:19 +02:00
Egon Elbre
6615ecc9b6 common: separate repository
Change-Id: Ibb89c42060450e3839481a7e495bbe3ad940610a
2019-12-27 14:11:15 +02:00
Moby von Briesen
ab777e823e do not update pointer for failed audits
Change-Id: If88dce8928db28d6f53c3dc771e14ea97aae9661
2019-12-16 10:50:54 -05:00
Egon Elbre
72d407559e satellite/metainfo: don't leak error implementation detail (#3722)
* satellite/metainfo: don't leak implementation detail

* add missing wrap
2019-12-10 15:21:30 -05:00
Maximillian von Briesen
8653dda2b1 satellite/audit: do not contain nodes for unknown errors (#3592)
* skip unknown errors (wip)

* add tests to make sure nodes that time out are added to containment

* add bad blobs store

* call "Skipped" "Unknown"

* add tests to ensure unknown errors do not trigger containment

* add monkit stats to lockfile

* typo

* add periods to end of bad blobs comments
2019-11-19 17:30:28 +01:00
Egon Elbre
ee6c1cac8a
private: rename internal to private (#3573) 2019-11-14 21:46:15 +02:00
Egon Elbre
cc032d3151 satellite/metainfo: fix some uses of metainfo.Delete (#3513)
* satellite/metainfo: rename Delete to UnsynchronizedDelete

* fix deletes

* make db private

* fix typos

* also verify on commit object
2019-11-06 18:02:14 +01:00
Maximillian von Briesen
7cdc1b351a satellite/audit: do not audit expired segments (#3497)
* during audit Verify, return error and delete segment if segment is expired

* delete "main" reverify segment and return error if expired

* delete contained nodes and pointers when pointers to audit are expired

* update testplanet.Upload and testplanet.UploadWithConfig to use an expiration time of an hour from now

* Revert "update testplanet.Upload and testplanet.UploadWithConfig to use an expiration time of an hour from now"

This reverts commit e9066151cf84afbff0929a6007e641711a56b6e5.

* do not count ExpirationDate=time.Time{} as expired
2019-11-05 20:41:48 +01:00
littleskunk
eeb38245ff
satellite/audit: improve logging (#3285) 2019-10-16 13:48:05 +02:00
JT Olio
a5d1776539 audits: missing continue
Change-Id: Ifcac8e61ebd8c59407e01c791adc60d9f88ff1b7
2019-10-11 13:55:27 -06:00
Maximillian von Briesen
784ca1582a
satellite/audit: fix audit panic (#3217) 2019-10-09 10:06:58 -04:00
Maximillian von Briesen
3a3d576d9b satellite/audit: add mutex to pieceHashesVerified map (#3214)
* add mutex

* remove double send to ch

* lock mutex inside defer

* import sync
2019-10-08 17:01:32 -04:00
littleskunk
c009543236
satellite/audit: Add piece hash verified to log messages (#3204) 2019-10-08 12:51:57 +02:00
Maximillian von Briesen
e1b7d01160
satellite/audit: do not fail or contain nodes for audited segments that are not piece-hash-verified (#3161) 2019-10-07 16:06:10 -04:00
Maximillian von Briesen
edadf46009 satellite/audit: delete nodes from containment when segment has changed (#3115) 2019-09-29 04:03:15 +02:00
Jeff Wendling
098cbc9c67 all: use pkg/rpc instead of pkg/transport
all of the packages and tests work with both grpc and
drpc. we'll probably need to do some jenkins pipelines
to run the tests with drpc as well.

most of the changes are really due to a bit of cleanup
of the pkg/transport.Client api into an rpc.Dialer in
the spirit of a net.Dialer. now that we don't need
observers, we can pass around stateless configuration
to everything rather than stateful things that issue
observations. it also adds a DialAddressID for the
case where we don't have a pb.Node, but we do have an
address and want to assert some ID. this happened
pretty frequently, and now there's no more weird
contortions creating custom tls options, etc.

a lot of the other changes are being consistent/using
the abstractions in the rpc package to do rpc style
things like finding peer information, or checking
status codes.

Change-Id: Ief62875e21d80a21b3c56a5a37f45887679f9412
2019-09-25 15:37:06 -06:00
Maximillian von Briesen
a4048fd529 satellite/audit: fix containment mode (#3085)
* add test to make sure we will reverify the share in the containment db rather than in the pointer passed into reverify

* use pending audit information only when running reverify
2019-09-19 01:45:15 +02:00
Natalie Villasana
aa3567187e
satellite/audit: worker now verifies and reverifies (#2965) 2019-09-11 18:37:01 -04:00
Egon Elbre
a801fab66a
all: add archview annotations (#2964) 2019-09-10 16:24:16 +03:00
Yingrong Zhao
8eda360ad3
add segment path into logs (#2898) 2019-08-29 08:38:26 -04:00
Egon Elbre
2d69d47655
all: fix Error.New formatting (#2840) 2019-08-21 19:30:29 +03:00
Ivan Fraixedes
87f3b6c708 satellite/audit: Improve comments in verifier (#2829)
Improve some source code comments in the verifier.
2019-08-20 10:23:14 -04:00
Egon Elbre
c8edeb0257
satellite/overlay: rename overlay.Cache to overlay.Service (#2717) 2019-08-06 19:35:59 +03:00
Egon Elbre
5d0816430f
rename all the things (#2531)
* rename pkg/linksharing to linksharing
* rename pkg/httpserver to linksharing/httpserver
* rename pkg/eestream to uplink/eestream
* rename pkg/stream to uplink/stream
* rename pkg/metainfo/kvmetainfo to uplink/metainfo/kvmetainfo
* rename pkg/auth/signing to pkg/signing
* rename pkg/storage to uplink/storage
* rename pkg/accounting to satellite/accounting
* rename pkg/audit to satellite/audit
* rename pkg/certdb to satellite/certdb
* rename pkg/discovery to satellite/discovery
* rename pkg/overlay to satellite/overlay
* rename pkg/datarepair to satellite/repair
2019-07-28 08:55:36 +03:00