PubMed Semantic Search Review is a tool that systematically queries the PubMed API using a single search term and an optional date range. It accepts only single search terms by design, to keep retrieval focused. Boolean operations (`AND`, `OR`) can be applied during the analysis phase, which allows more complex relevance evaluations after the data has been retrieved.
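The post-retrieval Boolean step could look roughly like the sketch below. It assumes that each single-term search yields a set of PubMed IDs judged relevant. `BooleanCombiner` and its methods are hypothetical names used for illustration, not part of the tool.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not part of the tool): combines per-term result sets
// after retrieval, since the search itself only accepts single terms.
public static class BooleanCombiner
{
    // AND: keep only PubMed IDs that appear in every term's result set.
    public static HashSet<string> And(params IEnumerable<string>[] resultSets)
    {
        if (resultSets.Length == 0) return new HashSet<string>();
        var combined = new HashSet<string>(resultSets[0]);
        foreach (var set in resultSets.Skip(1)) combined.IntersectWith(set);
        return combined;
    }

    // OR: keep PubMed IDs that appear in at least one term's result set.
    public static HashSet<string> Or(params IEnumerable<string>[] resultSets)
    {
        var combined = new HashSet<string>();
        foreach (var set in resultSets) combined.UnionWith(set);
        return combined;
    }
}
```

For example, `BooleanCombiner.And(idsForTermA, idsForTermB)` would return only the articles found relevant under both terms.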
The tool processes each article's title and abstract text by combining them into a prompt for OpenAI's GPT models, generating structured outputs defined by the following schema:
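
    using System.Text.Json.Serialization;

    // Structured output returned by the model for each article.
    public class StructuredResponseDto : StructuredResponseBaseDto
    {
        // Whether the model judges the article relevant.
        [JsonPropertyName("is_relevant")]
        public bool IsRelevant { get; set; }
        // The model's estimate of relevance, as a percentage.
        [JsonPropertyName("estimated_percent_relevant")]
        public decimal? EstimatedPercentRelevant { get; set; }
        // Short summary of the abstract.
        [JsonPropertyName("abstract_summary")]
        public string? AbstractSummary { get; set; }
        // The model's explanation for its relevance decision.
        [JsonPropertyName("relevance_reason")]
        public string? RelevanceReason { get; set; }
    }
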
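As a minimal sketch of how a JSON reply matching this schema might be mapped onto such a class with `System.Text.Json`: the type below is a standalone copy named `StructuredResponseSketch` (a hypothetical name), because `StructuredResponseBaseDto` is not shown here, and `StructuredResponseParser` is likewise an assumption rather than the tool's actual code.

```csharp
using System.Text.Json;
using System.Text.Json.Serialization;

// Standalone copy of the schema for illustration only; the real DTO derives
// from StructuredResponseBaseDto, whose members are not shown in this README.
public class StructuredResponseSketch
{
    [JsonPropertyName("is_relevant")]
    public bool IsRelevant { get; set; }

    [JsonPropertyName("estimated_percent_relevant")]
    public decimal? EstimatedPercentRelevant { get; set; }

    [JsonPropertyName("abstract_summary")]
    public string? AbstractSummary { get; set; }

    [JsonPropertyName("relevance_reason")]
    public string? RelevanceReason { get; set; }
}

// Hypothetical helper: turns the model's JSON text into the typed object.
public static class StructuredResponseParser
{
    // Returns null when the JSON text is the literal "null".
    public static StructuredResponseSketch? Parse(string json) =>
        JsonSerializer.Deserialize<StructuredResponseSketch>(json);
}
```

The `JsonPropertyName` attributes are what map the snake_case keys in the model's reply to the PascalCase C# properties.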
- Queries PubMed using single search terms and optional date ranges.
- Processes article titles and abstracts with OpenAI GPT models for structured relevance analysis.
- Outputs results in a structured CSV format for easy review.
- Supports bulk processing with rate limits to prevent API throttling.
- Provides detailed logs for better traceability and debugging.
- .NET 8: Ensure you have the .NET 8 SDK installed on your machine.
  - Download from Microsoft's official site.
- API Keys:
  - PubMed API Key: Obtain a key by registering for NCBI APIs at NCBI Developers.
  - OpenAI API Key: Obtain a key by signing up at OpenAI's platform.
- Clone the Repository

      git clone https://github.com/your-repo/pubmed-semantic-search-review.git
      cd pubmed-semantic-search-review

- Add API Keys to `appsettings.json`
  - Navigate to `appsettings.json` in the root directory and update the following placeholders:

        "PubMedConfig": { "ApiKey": "your-pubmed-api-key" },
        "OpenAiConfig": { "ApiKey": "your-openai-api-key" }

- Set Up User Secrets (Optional)
  - For added security, store API keys in `secrets.json`:

        dotnet user-secrets init
        dotnet user-secrets set "PubMedConfig:ApiKey" "your-pubmed-api-key"
        dotnet user-secrets set "OpenAiConfig:ApiKey" "your-openai-api-key"

- Install Dependencies

      dotnet restore

- Run the Application

      dotnet run

- Temperature Setting:
  - The `Temperature` parameter in the `OpenAiConfig` section controls the randomness of the model's output. Lower values make responses more deterministic, while higher values make them more exploratory, so it is a key parameter to tune.
- System and User Prompts:
  - The system prompt (`SystemPrompt_AbstractAnalysis.txt`) sets the tone and context for the model's behavior (e.g., "You are a helpful and professional medical researcher").
  - The user prompt (`UserPrompt_AbstractAnalysis.txt`) incorporates the article's title and abstract, alongside your specific research question and hypothesis, to guide the model's structured analysis. This file must contain a placeholder for the abstract, which is currently {{abstract}}.
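The placeholder substitution in the user prompt can be pictured with the sketch below. `PromptBuilder` is a hypothetical name, and joining the title and abstract with a blank line is an assumption. The tool's actual formatting may differ.

```csharp
using System;

// Hypothetical helper (not the tool's actual code): fills the {{abstract}}
// placeholder in the user prompt template with an article's text.
public static class PromptBuilder
{
    public const string AbstractPlaceholder = "{{abstract}}";

    public static string BuildUserPrompt(string template, string title, string abstractText)
    {
        // Fail early if the template lacks the placeholder; otherwise the
        // model would never see the article text.
        if (!template.Contains(AbstractPlaceholder, StringComparison.Ordinal))
            throw new ArgumentException(
                $"Template must contain {AbstractPlaceholder}.", nameof(template));

        // Assumption: title and abstract are joined with a blank line.
        var articleText = $"{title}\n\n{abstractText}";
        return template.Replace(AbstractPlaceholder, articleText, StringComparison.Ordinal);
    }
}
```

Checking for the placeholder up front turns a silently useless prompt into an immediate, visible error.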
- OpenAI Costs:
  - This project incurs costs based on token usage.
  - Pricing as of January 1st, 2025:
    - Prompt Tokens: $2.50 per million tokens.
    - Completion Tokens: $10.00 per million tokens.
  - Monitor your API usage and budget carefully.
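To budget a run, the cost can be estimated from token counts using the rates listed above. The sketch below is illustrative only: `CostEstimator` is a hypothetical name, and the token counts in the example are assumed figures, not measurements from the tool.

```csharp
// Hypothetical helper: estimates OpenAI cost from token counts using the
// per-million-token prices quoted in this README (January 1st, 2025).
public static class CostEstimator
{
    public const decimal PromptUsdPerMillion = 2.50m;
    public const decimal CompletionUsdPerMillion = 10.00m;

    public static decimal EstimateUsd(long promptTokens, long completionTokens) =>
        promptTokens / 1_000_000m * PromptUsdPerMillion
        + completionTokens / 1_000_000m * CompletionUsdPerMillion;
}
```

For instance, assuming 1,000 articles at about 500 prompt tokens and 150 completion tokens each, `CostEstimator.EstimateUsd(500_000, 150_000)` gives $1.25 + $1.50 = $2.75.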
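- Rate Limiting:
  - Requests to the PubMed API are limited to 1 per second by default, which stays within NCBI's published E-utilities limits (3 requests per second without an API key, 10 per second with one).
  - The tool implements a delay to adhere to this limit.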
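One way such a delay could be implemented is a small throttle that enforces a minimum interval between calls. This is a sketch under that assumption, not the tool's actual mechanism, and `MinimumIntervalThrottle` is a hypothetical name.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical throttle: guarantees at least `interval` between the moments
// at which successive callers are released.
public sealed class MinimumIntervalThrottle
{
    private readonly TimeSpan _interval;
    private readonly SemaphoreSlim _gate = new SemaphoreSlim(1, 1);
    private readonly Stopwatch _sinceLast = new Stopwatch();

    public MinimumIntervalThrottle(TimeSpan interval) => _interval = interval;

    public async Task WaitAsync(CancellationToken ct = default)
    {
        // Serialize callers so that only one measures and waits at a time.
        await _gate.WaitAsync(ct);
        try
        {
            // The first call passes straight through; later calls wait out
            // whatever remains of the interval since the previous release.
            if (_sinceLast.IsRunning)
            {
                var remaining = _interval - _sinceLast.Elapsed;
                if (remaining > TimeSpan.Zero) await Task.Delay(remaining, ct);
            }
            _sinceLast.Restart();
        }
        finally
        {
            _gate.Release();
        }
    }
}
```

A caller would create one shared instance, for example `new MinimumIntervalThrottle(TimeSpan.FromSeconds(1))`, and `await throttle.WaitAsync()` before each PubMed request.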
- Logging:
  - Logs are saved to the `logs` directory, segmented by hour.
  - Review logs for insights or troubleshooting.
- Integration with free Large Language Models (LLMs) to reduce costs.
- Enhanced analysis and report generation capabilities.
- UI-based configuration for non-technical users.
This project is licensed under the MIT License.
For issues or feature requests, please open a ticket in the GitHub Issues section.