-
Notifications
You must be signed in to change notification settings - Fork 126
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #763 from rspamd/vstakhov-gpt-announce
Write post about GPT experiments
- Loading branch information
Showing
3 changed files
with
287 additions
and
1 deletion.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,171 @@ | ||
--- | ||
layout: post | ||
title: "Integrating Rspamd with GPT" | ||
categories: misc | ||
--- | ||
|
||
## Preface | ||
|
||
Historically, our only text classification method has been Bayes, a powerful statistical method that performs well with sufficient training. However, Bayes has its limitations: | ||
|
||
* It requires thorough and well-balanced training | ||
* It cannot work with low confidence levels, especially when dealing with a wide variety of spam | ||
|
||
Large Language Models (LLMs) offer promising solutions to these challenges. These models can perform deep introspection with some sort of contextual "understanding". However, their high computational demands (typically requiring GPUs) make scanning all emails impractical. Separating LLM execution from the scanning engine mitigates resource competition. | ||
|
||
## Rspamd GPT plugin | ||
|
||
In Rspamd 3.9, I have tried to integrate the OpenAI GPT API for spam filtering and assess its usefulness. Here are the basic ideas behind this plugin: | ||
|
||
* The selected displayed text part is extracted and submitted to the GPT API for spam probability assessment | ||
* Additional message details such as Subject, displayed From, and URLs are also included in the assessment | ||
* Then, we ask GPT to provide results in JSON format since human-readable GPT output cannot be parsed (in general) | ||
* Some specific symbols (`BAYES_SPAM`, `FUZZY_DENIED`, `REPLY`, etc.) are excluded from the GPT scan | ||
* Obvious spam and ham are also excluded from the GPT evaluation | ||
|
||
The former two points reduce the GPT workload for something that is already known, where GPT cannot add any value in the evaluation. We also use GPT as one of the classifiers, meaning that we do not rely solely on GPT evaluation. | ||
|
||
## Evaluation results | ||
|
||
To evaluate the performance of the GPT-based classifier, we developed the `rspamadm classifier_test` utility, capable of evaluating both supervised and unsupervised classifiers: | ||
|
||
* It divides spam and ham samples into separate training and validation sets | ||
* For supervised classifiers, it uses the training set to train the classifier | ||
* Then both supervised and unsupervised classifiers are evaluated using the validation set to measure their performance | ||
|
||
For example, the Bayes engine, trained on a robust corpus, demonstrates the following results: | ||
|
||
~~~ | ||
$ rspamadm classifier_test --ham /ham --spam /spam --cv-fraction 0.3 | ||
Spam: 348 train files, 815 cv files; ham: 754 train files, 1762 cv files | ||
Start learn spam, 348 messages, 10 connections | ||
Start learn ham, 754 messages, 10 connections | ||
Learning done: 348 spam messages in 1.61 seconds, 754 ham messages in 3.88 seconds | ||
Start cross validation, 2577 messages, 10 connections | ||
Metric Value | ||
------------------------------ | ||
True Positives 735 | ||
False Positives 22 | ||
True Negatives 1717 | ||
False Negatives 49 | ||
Accuracy 0.97 | ||
Precision 0.97 | ||
Recall 0.94 | ||
F1 Score 0.95 | ||
Classified (%) 97.90 | ||
Elapsed time (seconds) 12.71 | ||
~~~ | ||
|
||
These results are impressive but assume the classifier is properly and decently trained. In scenarios involving a fresh system or high variability in emails, gathering reliable statistics might be challenging. | ||
|
||
In contrast, the GPT engine operates as an unsupervised learning algorithm. We assume that LLM models have enough "understanding" of the language to distinguish spam and ham without direct training on emails. Moreover, we provide only text data, not raw email content. | ||
|
||
Below are the results from different GPT models: | ||
|
||
### OpenAI GPT-3.5 Turbo | ||
|
||
~~~ | ||
Metric Value | ||
------------------------------ | ||
True Positives 129 | ||
False Positives 35 | ||
True Negatives 263 | ||
False Negatives 69 | ||
Accuracy 0.79 | ||
Precision 0.79 | ||
Recall 0.65 | ||
F1 Score 0.71 | ||
Classified (%) 95.20 | ||
Elapsed time (seconds) 318.91 | ||
~~~ | ||
|
||
This model is cost-effective and can be used as a baseline. The results were obtained from a low-quality sample corpus, resulting in high false positives and negatives. | ||
|
||
### OpenAI GPT-4o | ||
|
||
~~~ | ||
Metric Value | ||
------------------------------ | ||
True Positives 178 | ||
False Positives 25 | ||
True Negatives 257 | ||
False Negatives 9 | ||
Accuracy 0.93 | ||
Precision 0.88 | ||
Recall 0.95 | ||
F1 Score 0.91 | ||
Classified (%) 90.02 | ||
Elapsed time (seconds) 279.08 | ||
~~~ | ||
|
||
Despite its high cost, this advanced model is suitable, for example, for low-traffic personal email. It demonstrates significantly lower error rates compared to GPT-3.5, even with a similar low-quality sample corpus. | ||
|
||
### Using GPT to train Bayes classifier | ||
|
||
Another interesting approach involves using GPT to supervise Bayes engine training. In this case, we benefit from the best of both worlds: GPT can operate without training, while Bayes can catch up afterward and perform instead of GPT (or at least serve as a cost-saving alternative). | ||
|
||
So we tested GPT training Bayes and compared efficiency using the same methodologies. | ||
|
||
GPT results: | ||
|
||
~~~ | ||
Metric Value | ||
------------------------------ | ||
True Positives 128 | ||
False Positives 13 | ||
True Negatives 301 | ||
False Negatives 68 | ||
Accuracy 0.84 | ||
Precision 0.91 | ||
Recall 0.65 | ||
F1 Score 0.76 | ||
Classified (%) 97.89 | ||
Elapsed time (seconds) 341.77 | ||
~~~ | ||
|
||
Bayes classifier results (trained by GPT in the previous test iteration): | ||
|
||
~~~ | ||
Metric Value | ||
------------------------------ | ||
True Positives 19 | ||
False Positives 43 | ||
True Negatives 269 | ||
False Negatives 9 | ||
Accuracy 0.85 | ||
Precision 0.31 | ||
Recall 0.68 | ||
F1 Score 0.42 | ||
Classified (%) 65.26 | ||
Elapsed time (seconds) 29.18 | ||
~~~ | ||
|
||
Bayes still exhibits uncertainty in classification, with more false positives than GPT. Improvement could be achieved through autolearning and by refining the corpus used for testing (our corpus contains many ham emails that look like spam even for human evaluators). | ||
|
||
## Plugin design | ||
|
||
The GPT plugin operates as follows: | ||
|
||
* It selects messages that qualify several pre-checks: | ||
- they must not contain any symbols from the `excluded` set (e.g. Fuzzy/Bayes spam/Whitelists) | ||
- they must not clearly appear as ham or spam (e.g. with reject action or no action with a high negative score) | ||
- they should have enough text tokens in the meaningful displayed part | ||
* If a message satisfies these checks, Rspamd selects the displayed part (e.g. HTML) and sends the following content to GPT: | ||
- text part content as a single-line string (honoring limits if necessary) | ||
- message subject | ||
- displayed From | ||
- some details about URLs (e.g. domains) | ||
* This data is merged with a prompt to GPT requesting an evaluation of the email's spam probability, with the output returned in JSON format (other output types may sometimes allow GPT to provide human-readable text that is very difficult to parse) | ||
* After these steps, a corresponding symbol with a confidence score is inserted | ||
* With autolearning enabled, Rspamd also trains the supervised classifier (Bayes) | ||
|
||
## Pricing considerations and conclusions | ||
|
||
OpenAI provides an API for these requests, incurring costs (currently no free tier available). However, for personal email usage or automated Bayes training without manual intervention, GPT presents a viable option. For instance, processing a substantial volume of personal emails with GPT-3.5 costs approximately $0.05 daily (for about 100k tokens). | ||
|
||
For large-scale email systems, it may be preferable to use another LLM (e.g. llama) internally on a GPU-powered platform. The current plugin is designed to integrate with different LLM types without significant modifications. This approach also enhances data privacy by avoiding sending email content to a third-party service (though OpenAI claims their models do not learn from API requests). | ||
|
||
Despite not achieving 100% accuracy, the GPT plugin demonstrates efficiency comparable to human-filtered email. Future enhancements will focus on improving accuracy through additional metadata integration into the GPT engine, while optimizing token usage efficiency. There are also plans to better utilize LLM knowledge in Rspamd, particularly for better fine-grained classification. | ||
|
||
The GPT plugin will be available starting from Rspamd 3.9, requiring an OpenAI API key and financial commitment for accessing ChatGPT services. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,105 @@ | ||
--- | ||
layout: doc | ||
title: GPT Plugin | ||
--- | ||
# Rspamd GPT Plugin | ||
|
||
The Rspamd GPT Plugin, introduced in Rspamd 3.9, integrates OpenAI's GPT API to enhance spam filtering capabilities using advanced natural language processing techniques. Here are the basic ideas behind this plugin: | ||
|
||
* The selected displayed text part is extracted and submitted to the GPT API for spam probability assessment | ||
* Additional message details such as Subject, displayed From, and URLs are also included in the assessment | ||
* Then, we ask GPT to provide results in JSON format since human-readable GPT output cannot be parsed (in general) | ||
* Some specific symbols (`BAYES_SPAM`, `FUZZY_DENIED`, `REPLY`, etc.) are excluded from the GPT scan | ||
* Obvious spam and ham are also excluded from the GPT evaluation | ||
|
||
The former two points reduce the GPT workload for something that is already known, where GPT cannot add any value in the evaluation. We also use GPT as one of the classifiers, meaning that we do not rely solely on GPT evaluation. | ||
|
||
For detailed information about this plugin, refer to the [blog post]({{ site.baseurl }}/misc/2024/07/03/gpt.html). | ||
|
||
## Configuration Options | ||
|
||
```hcl | ||
gpt { | ||
# Supported type: openai | ||
type = "openai"; | ||
# Your OpenAI API key | ||
api_key = "xxx"; | ||
# Model name | ||
model = "gpt-3.5-turbo"; | ||
# Maximum tokens to generate | ||
max_tokens = 1000; | ||
# Temperature for sampling | ||
temperature = 0.7; | ||
# Top p for sampling | ||
top_p = 0.9; | ||
# Timeout for requests | ||
timeout = 10s; | ||
# Prompt for the model (use default if not set) | ||
prompt = "xxx"; | ||
# Custom condition (Lua function) | ||
condition = "xxx"; | ||
# Autolearn if GPT classified | ||
autolearn = true; | ||
# Reply conversion (Lua code) | ||
reply_conversion = "xxx"; | ||
# URL for the OpenAI API | ||
url = "https://api.openai.com/v1/chat/completions"; | ||
} | ||
``` | ||
|
||
### Description of Configuration Options | ||
|
||
- **type**: Specifies the GPT model type. Currently, only "openai" is supported. | ||
|
||
- **api_key**: Your API key for accessing OpenAI services. Replace "xxx" with your actual API key. | ||
|
||
- **model**: Specifies the GPT model to use, such as "gpt-3.5-turbo". | ||
|
||
- **max_tokens**: Sets the maximum number of tokens to generate in the GPT response, controlling the length of the generated text. | ||
|
||
- **temperature**: Controls the creativity of the generated text during sampling. Values range from 0 to 1, with higher values resulting in more creative outputs (default: 0.7). | ||
|
||
- **top_p**: Sets the cumulative probability threshold for token selection using nucleus sampling (default: 0.9). | ||
|
||
- **timeout**: Specifies the maximum wait time for API responses in seconds (e.g., 10s). | ||
|
||
- **prompt**: Optional initial text prompt guiding model generation. If not set, a default prompt is used. | ||
|
||
- **condition**: Custom Lua function defining conditions for plugin usage. | ||
|
||
- **autolearn**: Enables automatic learning based on GPT classifications when set to true. | ||
|
||
- **reply_conversion**: Custom Lua code converting the model's reply into a specific format or handling it in a certain way (default: JSON output of the GPT model). | ||
|
||
- **url**: Endpoint for the OpenAI API (default: "https://api.openai.com/v1/chat/completions"). | ||
|
||
## Example Configuration | ||
|
||
Here is an example configuration with the fields filled in: | ||
|
||
```hcl | ||
gpt { | ||
type = "openai"; | ||
api_key = "your_api_key_here"; | ||
model = "gpt-3.5-turbo"; | ||
max_tokens = 500; | ||
temperature = 0.6; | ||
top_p = 0.8; | ||
timeout = 15s; | ||
} | ||
``` | ||
|
||
## Conclusion | ||
|
||
The Rspamd GPT Plugin integrates OpenAI's GPT models into Rspamd, enhancing its spam filtering capabilities with advanced text processing techniques. By configuring the options above, users can customize the plugin to meet specific requirements, thereby enhancing the efficiency and accuracy of spam filtering within Rspamd. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters