Bayes Training¶

Bayes training improves rspamd's detection rate by learning from messages whose classification is already known. The training page shows all archived and quarantined mails in a single combined list, with training actions available inline in each row.

Statistics¶

The top of the page shows current corpus numbers:

Metric	Description
Bayes Ham	Number of ham-trained messages in the Bayes corpus
Bayes Spam	Number of spam-trained messages in the Bayes corpus
Neural Spam Samples	Training data for the neural network (spam)
Neural Ham Samples	Training data for the neural network (ham)
Scanned	Total number of all processed messages
Learned	Sum of all Bayes training actions

The neural network only starts training at 1,000 samples per class (rspamd default behaviour).
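The threshold rule above can be expressed as a small check. This is only an illustrative sketch: the function name and the idea of reading the two sample counters from the statistics panel are assumptions, not part of nmg or rspamd.

```python
NEURAL_MIN_SAMPLES = 1_000  # per-class threshold stated above


def neural_ready(spam_samples: int, ham_samples: int,
                 minimum: int = NEURAL_MIN_SAMPLES) -> bool:
    """Return True once BOTH classes have reached the minimum sample count."""
    return spam_samples >= minimum and ham_samples >= minimum


# Plenty of spam samples do not help while ham is still below the threshold.
print(neural_ready(spam_samples=5_000, ham_samples=999))    # False
print(neural_ready(spam_samples=1_000, ham_samples=1_000))  # True
```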

Bayes Classes¶

nmg supports 6 Bayes classes (not just spam/ham):

Class	Description	Use Case
`spam`	Unwanted advertising mail	Standard spam training
`ham`	Legitimate mail	Standard ham training
`phishing`	Phishing attempt	Train specific phishing patterns
`bec`	Business Email Compromise	CEO fraud, targeted impersonation
`newsletter`	Mass mail / newsletter	Correctly classify legitimate bulk mail
`transactional`	Transactional mails	Order confirmations, system notifications

For spam and ham, quick buttons are available directly in each row. The other 4 classes are accessed via the dropdown menu (⋮) in the action column.

Search¶

The search field above the table searches the mail corpus by sender, recipient, or subject. The search filters all table entries and updates the view immediately. Regular expressions are not supported; the search term is matched as a plain substring.
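The matching behaviour described above can be sketched as follows. The `MailRow` type and both function names are hypothetical, and case-insensitive matching is an assumption; the source only states that the term is matched as a substring.

```python
from dataclasses import dataclass


@dataclass
class MailRow:
    sender: str
    recipient: str
    subject: str


def matches(row: MailRow, term: str) -> bool:
    """Literal substring match over sender, recipient and subject."""
    needle = term.casefold()  # assumption: the search ignores case
    fields = (row.sender, row.recipient, row.subject)
    return any(needle in field.casefold() for field in fields)


def search(rows, term: str):
    """Return the rows that match; an empty term keeps every row."""
    if not term:
        return list(rows)
    return [row for row in rows if matches(row, term)]
```

Because the term is compared literally, a pattern such as `a.*b` only matches rows that contain those exact five characters.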

Mail Corpus¶

The table shows all mails available for training in a single combined list, drawn from two sources:

Archived mails (source: delivered) — mails from the BCC archive
Quarantine mails (source: hold) — mails in the Postfix hold queue

Column	Description
Time	Receipt timestamp
From	Sender (masked depending on role)
To	Recipient (masked depending on role)
Subject	Subject (masked depending on role)
Score	rspamd score at delivery
Bayes Status	`manualSpam` / `manualHam` / `autoSpam` / `autoHam` / `notLearned`
Trained By	Admin account that triggered the training
Node	Cluster node where the mail resides

Per-Mail Actions¶

Ham / Spam (quick buttons) — train directly as ham or spam
Other classes (dropdown) — phishing, bec, newsletter, transactional
Unlearn — undo the training for this mail
Preview — display mail body and headers
Download EML — download the raw file

Bulk Training¶

Select multiple rows and train in one step via Train as Spam or Train as Ham. Errors in individual rows do not interrupt bulk training — they are reported separately.
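The error handling described above (a failing row does not abort the batch) might look like the sketch below. `bulk_train` and the `train_one` callback are hypothetical names used for illustration; they do not describe nmg's internal API.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def bulk_train(mail_ids: Iterable[str], label: str,
               train_one: Callable[[str, str], None]
               ) -> Tuple[List[str], Dict[str, str]]:
    """Train every mail; collect failures instead of stopping at the first."""
    trained: List[str] = []
    errors: Dict[str, str] = {}
    for mail_id in mail_ids:
        try:
            train_one(mail_id, label)
        except Exception as exc:  # report per row, keep going
            errors[mail_id] = str(exc)
        else:
            trained.append(mail_id)
    return trained, errors


def fake_trainer(mail_id: str, label: str) -> None:
    """Stand-in for the real training call; fails for one specific mail."""
    if mail_id == "m2":
        raise RuntimeError("file not found on node")


trained, errors = bulk_train(["m1", "m2", "m3"], "spam", fake_trainer)
print(trained)  # ['m1', 'm3']
print(errors)   # {'m2': 'file not found on node'}
```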

In the mail corpus table, sender, recipient, and subject are masked depending on the user role:

Role	Display
`admin_full` / `admin`	Always shown in plain text
`training_operator`	Masked — unmasking possible via Reveal button (creates audit entry)
other	Always masked, no unmasking
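The role table above can be read as a small access rule. In this sketch the masking scheme (keep the first character, hide the rest) and the shape of the audit entry are assumptions made for illustration; only the role names and the reveal-with-audit behaviour come from the table.

```python
PLAIN_ROLES = {"admin_full", "admin"}  # always see plain text
REVEAL_ROLES = {"training_operator"}   # masked, may unmask on request


def mask(value: str) -> str:
    """Hypothetical masking scheme: keep the first character only."""
    return value[:1] + "*" * max(len(value) - 1, 0)


def display(value: str, role: str) -> str:
    """Value as shown in the table before any Reveal click."""
    return value if role in PLAIN_ROLES else mask(value)


def reveal(value: str, role: str, actor: str, audit_log: list) -> str:
    """Unmask a field; operators leave an audit entry, other roles are refused."""
    if role in PLAIN_ROLES:
        return value
    if role not in REVEAL_ROLES:
        raise PermissionError(f"role {role!r} may not unmask fields")
    audit_log.append({"actor": actor, "action": "reveal"})
    return value
```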

Autolearn¶

When configured in Mail Configuration → Autolearn, nmg automatically trains:

High-score mails as spam
Low-score mails as ham
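The autolearn rule above amounts to two score thresholds with an untrained zone in between. The threshold values in this sketch are placeholders chosen for illustration; the real values are whatever is set under Mail Configuration → Autolearn.

```python
from typing import Optional


def autolearn_decision(score: float,
                       spam_min: float = 15.0,  # placeholder threshold
                       ham_max: float = -1.0    # placeholder threshold
                       ) -> Optional[str]:
    """Return 'spam', 'ham', or None when the score is too ambiguous to learn."""
    if score >= spam_min:
        return "spam"
    if score <= ham_max:
        return "ham"
    return None


for score in (20.0, 5.0, -3.0):
    print(score, autolearn_decision(score))
```

Leaving the middle band untrained keeps borderline mails out of the corpus, so only manual training affects them.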

Spam Bursts¶

The Spam Bursts view lists detected clusters of similar spam mails that arrived within a short time window.
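How such clusters could be found is sketched below. This is not nmg's actual detection logic: treating mails as "similar" when their normalized subjects are equal, the 10-minute window, and the minimum of three mails are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def split_into_windows(items, window):
    """Greedily cut a time-sorted list into clusters no longer than `window`."""
    cluster = []
    for item in items:
        if cluster and item[0] - cluster[0][0] > window:
            yield cluster
            cluster = []
        cluster.append(item)
    if cluster:
        yield cluster


def find_bursts(mails, window=timedelta(minutes=10), min_count=3):
    """mails: iterable of (timestamp, sender, subject) tuples."""
    groups = defaultdict(list)
    for timestamp, sender, subject in mails:
        # Assumption: equal subjects (ignoring case and outer spaces) are "similar".
        groups[subject.strip().casefold()].append((timestamp, sender))
    bursts = []
    for subject, items in groups.items():
        items.sort(key=lambda item: item[0])
        for cluster in split_into_windows(items, window):
            if len(cluster) >= min_count:
                bursts.append({
                    "sample_subject": subject,
                    "start": cluster[0][0],
                    "end": cluster[-1][0],
                    "count": len(cluster),
                    "distinct_senders": len({sender for _, sender in cluster}),
                })
    return bursts


base = datetime(2024, 1, 1, 12, 0)
sample = [
    (base, "a@x.example", "Win a prize"),
    (base + timedelta(minutes=2), "b@x.example", "win a prize "),
    (base + timedelta(minutes=5), "a@x.example", "Win a Prize"),
    (base + timedelta(minutes=1), "c@y.example", "Meeting notes"),
]
print(find_bursts(sample))
```

The returned fields map loosely onto the Time Window, Count, Distinct Senders, and Sample Subject columns of the burst table.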

Burst Table¶

Column	Description
Time Window	Start and end of the burst window
Count	Number of similar mails
Distinct Senders	Number of different sender addresses
Sender Domain	Most frequent sender domain
Sample Subject	Typical subject line of the burst
Sample Recipients	First affected recipients
Avg Score	Average rspamd score
Active	Whether the burst is still actively blocked
Expires	Automatic expiry date of the block

Actions¶

Train as Spam — Add all burst mails to the Bayes corpus
Unblock — Mark burst as handled (without training)
Delete — Remove the burst entry

Enable Show Expired to see burst blocks that have already expired.

Spam Analytics¶

The Spam Analytics view shows which rspamd symbols fire most frequently.

Symbol Table¶

Column	Description
Symbol	rspamd symbol name (e.g. `RCVD_IN_SPAMHAUS_SBL`)
Hits	Total hits in the selected time range
Avg Score	Average score contribution
% of Spam	Share of spam detection traffic
% of Ham	Share of ham traffic (false positive indicator)

Symbols with a high ham percentage are potential sources of false positives; consider reducing their weight in Score Tuning.
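The two percentage columns can be illustrated with a short calculation. The sketch assumes that "% of Spam" means the share of spam mails in which the symbol fired (and likewise for ham); the source does not spell out the exact definition, and the symbol name `SOME_SYMBOL` and the 20 % cut-off are made up for the example.

```python
from collections import Counter


def symbol_shares(messages):
    """messages: iterable of (is_spam, symbols). Returns {symbol: (pct_spam, pct_ham)}."""
    spam_total = ham_total = 0
    spam_hits, ham_hits = Counter(), Counter()
    for is_spam, symbols in messages:
        unique = set(symbols)  # count each symbol once per mail
        if is_spam:
            spam_total += 1
            spam_hits.update(unique)
        else:
            ham_total += 1
            ham_hits.update(unique)
    shares = {}
    for symbol in set(spam_hits) | set(ham_hits):
        pct_spam = 100.0 * spam_hits[symbol] / spam_total if spam_total else 0.0
        pct_ham = 100.0 * ham_hits[symbol] / ham_total if ham_total else 0.0
        shares[symbol] = (pct_spam, pct_ham)
    return shares


def false_positive_candidates(shares, ham_pct_limit=20.0):
    """Symbols whose ham share exceeds a (hypothetical) limit, sorted by name."""
    return sorted(s for s, (_, pct_ham) in shares.items() if pct_ham > ham_pct_limit)


sample = [
    (True, ["RCVD_IN_SPAMHAUS_SBL", "SOME_SYMBOL"]),
    (True, ["RCVD_IN_SPAMHAUS_SBL"]),
    (False, ["SOME_SYMBOL"]),
    (False, []),
]
shares = symbol_shares(sample)
print(false_positive_candidates(shares))  # ['SOME_SYMBOL']
```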

Score Distribution¶

The bar chart shows which score ranges the processed mails fall into:

Bucket	Meaning
`< 0 (Ham)`	Clearly legitimate mail
`0 – 2`, `2 – 4`, `4 – 6`	Grey zones
`6 – 8`, `8 – 10`, `10 – 14`	Probable spam
`≥ 14 (Reject)`	Immediately rejected mail
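Assigning a score to one of the buckets above is a simple range lookup. The sketch assumes each range includes its lower bound and excludes its upper bound (so a score of exactly 2 lands in `2 – 4`); the source does not state how boundary values are handled.

```python
from bisect import bisect_right
from collections import Counter

EDGES = [0, 2, 4, 6, 8, 10, 14]
LABELS = ["< 0 (Ham)", "0 – 2", "2 – 4", "4 – 6",
          "6 – 8", "8 – 10", "10 – 14", "≥ 14 (Reject)"]


def bucket(score: float) -> str:
    """Map an rspamd score to its chart bucket (lower bound inclusive)."""
    return LABELS[bisect_right(EDGES, score)]


def histogram(scores):
    """Count how many scores fall into each bucket, in chart order."""
    counts = Counter(bucket(score) for score in scores)
    return [(label, counts.get(label, 0)) for label in LABELS]


print(bucket(-1.5))  # < 0 (Ham)
print(bucket(7.2))   # 6 – 8
print(bucket(14))    # ≥ 14 (Reject)
```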

Near-Threshold Senders (Top 50)¶

Sender domains whose mails score, on average, close to the quarantine threshold. This list serves as an early warning for spam sources that are gradually getting worse:

Column	Description
Domain	Sender domain
Count	Mails in this time range
Avg Score	Average rspamd score
Max Score	Highest observed score
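The table above can be derived from per-mail scores as in the sketch below. The quarantine threshold of 6.0, the margin of 1.0 that defines "close", and the ordering by average score are assumptions for illustration; the source only fixes the columns and the limit of 50 entries.

```python
from collections import defaultdict


def near_threshold_senders(mails, threshold=6.0, margin=1.0, limit=50):
    """mails: iterable of (sender_domain, score) pairs."""
    scores = defaultdict(list)
    for domain, score in mails:
        scores[domain].append(score)
    rows = []
    for domain, values in scores.items():
        avg = sum(values) / len(values)
        if abs(avg - threshold) <= margin:  # "close" = within the margin
            rows.append({
                "domain": domain,
                "count": len(values),
                "avg": avg,
                "max": max(values),
            })
    rows.sort(key=lambda row: row["avg"], reverse=True)  # assumed ordering
    return rows[:limit]


sample = [("a.example", 5.0), ("a.example", 6.0), ("b.example", 1.0)]
print(near_threshold_senders(sample))
```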

False Negatives¶

Mails that passed through the filter and were subsequently reported as spam by users:

Column	Description
Time	Report time
Source	`delivered` (archived), `hold` (quarantine), `other`
Subject	Subject of the reported mail
Sender	Sender address
Actor	Who reported the mail

The time range filter (24h / 7d / 30d) applies to all three views.

Neural Network¶

rspamd includes a neural network module (neural) that automatically learns from Bayes training. It only starts training once 1,000 spam and 1,000 ham samples are available. It is configured in Mail Configuration → Neural Network.