storj/docs/testplan/storagenode-email-notification.md
nadimhq 4b72314e90
docs/testplan: Testplan for Storage-node Email Notification (#5338)
This testplan is going to cover the changes to storage-node email notifications. It will go over the storage-node email notification design doc.

Co-authored-by: Antonio Franco (He/Him) <antonio@storj.io>
2022-12-05 13:48:47 -05:00

Storagenode Email Notification Testplan

Background

This test plan covers the changes to storagenode email notifications. It is based on the design doc: Storagenode Email Notification Design Doc.

| Test Scenario | Test Case | Description | Comments |
|---|---|---|---|
| Email Alerts | Node Suspension Error Alert Email | If a user's node returns too many errors in response to audits, the user should be sent an email alert (following the email shown in the PRD) warning that their node was suspended because of too many errors. | |
| | Node Suspension Offline Alert Email | If a user's node is offline during a set percentage of audits, the user should be sent an email alert (following the email shown in the PRD) warning that their node was suspended because it was offline. | |
| | Node Disqualified Alert Email | If a user's node gets disqualified, the user should be sent an email (following the email shown in the PRD) detailing the process by which their node was disqualified. A node can only be disqualified after several consecutive failed audits or with a high audit failure rate. | |
| | Node Software Update Alert Email | If a user's node can be upgraded to the most recent storage node software, the user should receive an email alert (following the email shown in the PRD) saying so; the user should also be able to see options on how to update their storage node software and possible | |
| | Node Offline Alert Email | If a node goes offline for four hours, the node owner should receive an email alert (following the email shown in the PRD) stating that their node has been offline for the past four hours. The alert should also explain how to check the status of the node in order to take appropriate action, and show a link to download and enable more frequent offline notifications. | |
| | Node Online Alert Email | If a node comes back online after being offline, and the user was previously sent an offline alert, the user should receive an email alert (following the email shown in the PRD) stating that their node is back online. | |
| | Send Alerts from Satellite Service | Alerts should be sent from the satellite service so that node operators receive important node alerts in a timely manner; sending node status alerts from customer.io results in delayed alerts. | |
| | Disqualified Alert Email for Graceful Exit & Stray Node Chore | If a node undergoes graceful exit or gets cleaned up by the stray node chore, this should result in a node disqualified email (currently there is no format for this email in the PRD). | |
| | Dupe Mails | Satellites run multiple instances of the reputation service, which can lead to multiple alert emails stemming from multiple audits/repairs done in a short period. Multiple emails should not be sent in the first place; if this turns out to be an issue, then, as stated in the design doc, a central service can later be implemented to deduplicate emails. | |
| Audit Service | Node Status Change | When the status of a node changes, the audit service should add an email to the queue to be sent by a separate service; if there is no status change, no alert email should be sent. | |
| | Audit Performance | Audit performance should not be reduced by the email queuing described in the Node Status Change case. | |
| DB | Mail Service | Starting from the hourly check-in during which nodes initiate contact with the satellite, the mail service should send alert emails depending on which nodes table column values have changed. | |
| | Node Offline Start Alert | If a node's last_contact_success value in the nodes table is older than a configured duration, a node offline alert should be sent to the email address tied to that node. | |
| | Node Online Start Alert | If a node's last_contact_success value in the nodes table was previously older than the configured duration but no longer is, a node online alert should be sent to the email address tied to that node. | |
| | Node Error Suspended Start Alert | If unknown_audit_suspended_at on the nodes table changes from NULL to not NULL for a node, a node error suspended alert should be sent to the email address tied to that node. | |
| | Node Error Unsuspended Start Alert | If unknown_audit_suspended_at on the nodes table changes from not NULL back to NULL for a node (having previously changed from NULL to not NULL), a node error unsuspended alert should be sent to the email address tied to that node. | |
| | Node Offline Suspended Start Alert | If the offline_suspended column on the nodes table changes from NULL to not NULL for a node, a node offline suspended alert should be sent to the email address tied to that node. | |
| | Node Offline Unsuspended Start Alert | If the offline_suspended column on the nodes table changes from not NULL back to NULL for a node (having previously changed from NULL to not NULL), a node offline unsuspended alert should be sent to the email address tied to that node. | |
| | Node Disqualified Start Alert | If the disqualified column on the nodes table changes from NULL to not NULL for a node, a node disqualification alert should be sent to the email address tied to that node. | |
| | Node Software Update Start Alert | If a node's version in the nodes table is below the configured node version, a node software update alert should be sent to the email address tied to that node. | |
| | DB Restores | When there is a DB outage followed by a restoration, it should not trigger Node Offline and then Node Online Start Alerts, respectively. | |
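
The DB test cases above all compare a node's previous and current values in the nodes table. The following Python sketch shows one way such a comparison could be expressed when writing test fixtures. It is illustrative only: `NodeSnapshot`, `alerts_for_change`, the `observed_at` field, and the alert names are hypothetical and not taken from the satellite code or the design doc. Only the column names come from the test plan.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class NodeSnapshot:
    """One read of a nodes table row (hypothetical helper type)."""
    observed_at: datetime            # when the row was read (assumed bookkeeping)
    last_contact_success: datetime
    unknown_audit_suspended_at: Optional[datetime] = None
    offline_suspended: Optional[datetime] = None
    disqualified: Optional[datetime] = None

    def is_offline(self, offline_after: timedelta) -> bool:
        # Offline means the last successful contact is older than the threshold.
        return self.observed_at - self.last_contact_success > offline_after


# (column, alert when NULL -> not NULL, alert when not NULL -> NULL)
# Alert names are placeholders; disqualification has no "cleared" alert.
TRANSITIONS = [
    ("unknown_audit_suspended_at", "node_error_suspended", "node_error_unsuspended"),
    ("offline_suspended", "node_offline_suspended", "node_offline_unsuspended"),
    ("disqualified", "node_disqualified", None),
]


def alerts_for_change(prev: NodeSnapshot, curr: NodeSnapshot,
                      offline_after: timedelta) -> List[str]:
    """Return the alerts implied by moving from prev to curr."""
    alerts: List[str] = []

    was_offline = prev.is_offline(offline_after)
    now_offline = curr.is_offline(offline_after)
    if not was_offline and now_offline:
        alerts.append("node_offline")
    elif was_offline and not now_offline:
        alerts.append("node_online")

    for column, set_alert, cleared_alert in TRANSITIONS:
        before, after = getattr(prev, column), getattr(curr, column)
        if before is None and after is not None:
            alerts.append(set_alert)
        elif before is not None and after is None and cleared_alert:
            alerts.append(cleared_alert)

    return alerts


t0 = datetime(2022, 12, 5, 12, 0)
prev = NodeSnapshot(observed_at=t0, last_contact_success=t0 - timedelta(hours=1))
curr = NodeSnapshot(
    observed_at=t0 + timedelta(hours=4),
    last_contact_success=t0 - timedelta(hours=1),
    unknown_audit_suspended_at=t0 + timedelta(hours=2),
)
print(alerts_for_change(prev, curr, offline_after=timedelta(hours=4)))
# prints ['node_offline', 'node_error_suspended']
```

An unchanged row yields an empty list, which matches the requirement that no alert is sent without a status change. The sketch does not model the DB Restores case; a real service would need an additional guard so that an outage and restoration do not appear as offline and online transitions.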