storj/docs/blueprints/archive/kademlia-audit-gating.md
JT Olio c81e4fcb9e
design/blueprints (#2927)
we've decided to rename design docs to blueprints to indicate that we don't intend to keep blueprints up to date once completed. instead, the final step of blueprint implementation is to archive the blueprint and update actual documentation elsewhere (potentially copying and pasting most of the blueprint). this PR updates the template and readme
2019-09-09 08:55:05 -06:00

6.0 KiB
Raw Permalink Blame History

Kademlia Audit Gating

Abstract

Storage node B is added to storage node A's routing table only if one of the satellites that A trusts have verified that B has a high enough audit and uptime count (not disqualified). If B was not verified by at least one of these satellites, it is added to A's Routing Table Antechamber.

Background

White paper: 4.6.1 Mitigating Sybil attacks

While weve adopted the proof-of-work scheme S/Kademlia proposes to partially address Sybil attacks, we extend Kademlia with an application specific integration to further defend our network.

Given two storage nodes, A and B, storage node B is not allowed to enter storage node As routing table until storage node B can present a signed message from a Satellite C, a satellite that node A trusts. This signed message must state that B has passed enough audits [and uptime checks] (sections 4.13 and 4.15). This ensures that only nodes with verified disk space have the opportunity to participate in the routing layer.

A node that is allowed to enter routing tables is considered vetted and lookups only progress through vetted nodes. To make sure unvetted nodes can still be found, vetted nodes keep unbounded lists of their unvetted neighbors provided that the XOR distance to all unvetted neighbors is no farther than the farthest of the k-closest vetted neighbors. Unvetted nodes keep their k-nearest vetted nodes up-to-date.

Goals

  1. Add a signed message from Satellites to verify that a Node has been not been disqualified and that it has a high enough audit and uptime count.

  2. Create a trusted Satellite list that contains the IDs of Satellites from which Nodes will accept verification signatures of other Nodes.

  3. Implement a Routing Table Antechamber: An XOR-ordered data structure in which unverified nodes are entered if successfully contacted.

  4. Modify FindNear to return n XOR-closest nodes from the antechamber in addition to those already returned from the routing table.

  5. On deployment, avoid complete erasure of the network's routing tables.

Terminology

Routing Table Antechamber - XOR-ordered temporary holding place for unverified storage nodes

Node Neighborhood - The k-closest nodes to self where distance is measured by XOR. A node is within the node neighborhood if it is closer than the furthest node in the neighborhood. The vetted node neighborhood is the k-closest nodes that are currently in the Routing Table.

node neighborhood

Design

  1. Satellite Signatures for Node Verification

    • Identities can sign messages already
    • Create a Voucher (message) that satellites can sign that a Node has been verified: qualification status, audit successes, and uptime
    • Audit success ratio and uptime count thresholds are per-satellite
    • The vouchers issued by the satellite should have an expiration on them (tunable by satellite)
    • Nodes are expected to get up to date vouchers
  2. Trusted Satellites List

    • Create Whitelist/blacklist with an abstraction layer for trusted/untrusted Satellites
    • These lists will live on each Node
    • NB: We are using config.Storage.WhitelistedSatelliteIDs "a comma-separated list of approved satellite node ids" for now
  3. Routing Table Antechamber

    • XOR ordered data structure (boltDB bucket)
    • A node can be added if it would be within the vetted node neighborhood
    • If a node is successfully connected to, it sends its vouchers to the other node
    • If that node doesn't have any trustworthy vouchers, it cannot enter the main routing table.
    • If the node is already in the routing table, it is removed and added to the antechamber if it fits in the node neighborhood
    • A Node may enter a Routing Table directly if at first contact it is already verified by a trusted Satellite.
    • Since the antechamber may only contain nodes that would be part of the node neighborhood, if the network grows, the space in the neighborhood will shrink so we must remove antechamber nodes that no longer fit in the neighborhood.
  4. FindNear

    • should return n XOR-closest nodes from the antechamber in addition to its current behavior
    • During processes like Bootstrapping or Kademlia Lookups, we should only call FindNear on verified nodes, not on those that are in the antechamber.
  5. Progressively migrate nodes from the current Routing Table into antechamber until verified.

    • Deploy vetting and signing first
    • Deploy antechamber and nodes will naturally shift

Rationale

To help prevent Sybill Attacks where bad Nodes fill the routing tables, push out good Nodes, and propagate themselves across the network into the majority of routing tables, essentially voiding the DHT.

Implementation

  1. Implement Satellite Signatures for Node Verification

  2. Make Trusted Satellites List

  3. Deploy steps 1 and 2 to production

  4. Create Routing Table Antechamber

  5. Update FindNear

Open issues

Q: Should closer buckets to self get refreshed more frequently?

A: Yes, but not as part of this Epic. See this ticket

Q: Should the routing table antechamber have a maximum size?

A: We've decided not for now.

Change antechamber keys to be the xor from local node

kademlia: integrate antechamber and vouchers

Move kbucketdb, nodebucketdb to kademliadb package

Add ability for satellite to discover antechamber nodes