Natural Language Processing (NLP) Services

NLP Services are proposed Artificial Intelligence services within FOSPS, potentially including automatic ePI Preprocessors, Supporting Material topic extractors, and summary Lenses.

Purpose

Apply AI/ML to automate:

  • Semantic annotation of ePI text
  • Entity recognition (medical concepts)
  • Topic extraction from Supporting Material
  • Content summarization
  • Language translation
  • Readability adaptation

NLP Model Types

Named Entity Recognition (NER)

Identify medical entities:

  • Conditions/diseases
  • Medications/substances
  • Dosages and routes
  • Symptoms
  • Procedures
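
As a rough illustration of what entity recognition returns, the sketch below uses a tiny hand-written lexicon and a dosage regular expression in place of a trained model. The `LEXICON` contents, the label names, and the `Entity` type are hypothetical and exist only for this example; a real service would presumably rely on one of the model families listed under Model Architecture.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    text: str   # surface form as it appears in the ePI text
    label: str  # entity type (hypothetical label set)
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive


# Hypothetical toy lexicon; stands in for a trained NER model.
LEXICON = {
    "paracetamol": "MEDICATION",
    "headache": "SYMPTOM",
    "liver disease": "CONDITION",
    "oral": "ROUTE",
}

# Simple dosage pattern: a number followed by a common unit.
DOSAGE = re.compile(r"\b\d+(?:\.\d+)?\s?(?:mg|g|ml|mcg)\b", re.IGNORECASE)


def find_entities(text: str) -> list[Entity]:
    """Return lexicon and dosage matches, ordered by position in the text."""
    found = []
    for term, label in LEXICON.items():
        pattern = rf"\b{re.escape(term)}\b"
        for m in re.finditer(pattern, text, re.IGNORECASE):
            found.append(Entity(m.group(0), label, m.start(), m.end()))
    for m in DOSAGE.finditer(text):
        found.append(Entity(m.group(0), "DOSAGE", m.start(), m.end()))
    return sorted(found, key=lambda e: e.start)


if __name__ == "__main__":
    for entity in find_entities("Take 500 mg paracetamol for headache."):
        print(entity)
```

The character offsets matter more than the labels here: a later step needs them to wrap the matched text in markup.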

Text Classification

Categorize content:

  • ePI section types
  • Target demographics
  • Content complexity level
  • Clinical specialty

Relationship Extraction

Identify connections:

  • Drug-condition relationships
  • Contraindications
  • Drug-drug interactions
  • Causal relationships

Summarization

Generate concise versions:

  • Key safety information
  • Dosing essentials
  • Patient-friendly summaries

Integration as Preprocessors

NLP services can function as Preprocessors:

  1. Receive raw ePI
  2. Run NER/classification models
  3. Map entities to standard terminologies
  4. Generate HtmlElementLink extensions
  5. Return p(ePI)
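
The five steps above can be sketched roughly as follows. This is a simplified illustration, not the FOSPS implementation: the ePI is modelled as a plain dict with a `sections` list rather than a full FHIR Bundle, `recognize` is a keyword stand-in for the NER/classification models, the `TERMINOLOGY` entry uses a placeholder code system, and the extension URL and sub-extension names are assumptions that should be checked against the Gravitate Health FHIR implementation guide.

```python
import copy
import re

# Assumed canonical URL of the HtmlElementLink extension; verify against the IG.
HTML_ELEMENT_LINK_URL = (
    "https://hl7.eu/fhir/ig/gravitate-health/StructureDefinition/HtmlElementLink"
)

# Hypothetical term -> (system, code, display) mapping with a placeholder code system.
TERMINOLOGY = {
    "pregnancy": ("http://example.org/fosps-demo", "preg-001", "Pregnancy"),
}


def recognize(text: str) -> list[str]:
    """Stand-in for step 2: return the known terms that occur in the text."""
    return [
        term for term in TERMINOLOGY
        if re.search(rf"\b{re.escape(term)}\b", text, re.IGNORECASE)
    ]


def make_link(css_class: str, system: str, code: str, display: str) -> dict:
    """Step 4: build one extension tying a CSS class to a coded concept."""
    coding = {"system": system, "code": code, "display": display}
    return {
        "url": HTML_ELEMENT_LINK_URL,
        "extension": [
            {"url": "elementClass", "valueString": css_class},
            {"url": "concept",
             "valueCodeableReference": {"concept": {"coding": [coding]}}},
        ],
    }


def preprocess(epi: dict) -> dict:
    out = copy.deepcopy(epi)                      # step 1: raw ePI, input untouched
    extensions = out.setdefault("extension", [])
    for section in out.get("sections", []):
        for term in recognize(section["text"]):   # step 2: find entities
            system, code, display = TERMINOLOGY[term]  # step 3: map to terminology
            css_class = f"concept-{code}"
            section["text"] = re.sub(
                rf"\b({re.escape(term)})\b",
                rf'<span class="{css_class}">\1</span>',
                section["text"],
                flags=re.IGNORECASE,
            )
            extensions.append(make_link(css_class, system, code, display))  # step 4
    return out                                    # step 5: p(ePI)
```

The class name written into the text and the `elementClass` value in the extension are the same string, which is what would let a Lens later find the annotated span for a given concept.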

Benefits:

  • Automated annotation at scale
  • Consistent terminology application
  • Faster than manual annotation with the Annotation Tool

Challenges:

  • Training data requirements
  • Domain-specific accuracy
  • Regulatory validation needed

Integration as Lenses

NLP can create specialized Lenses:

Summary Lens

  • Extract key information
  • Generate condensed view
  • Maintain regulatory compliance (no content is removed)
  • Present in collapsible sections
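
The "no removal" constraint can be illustrated with a small rendering sketch: every section is kept, and only the initial open or collapsed state differs. The function name, the section dict shape, and the `key_titles` parameter are hypothetical; how key sections are selected (for example by a summarization model) is outside this sketch, and it is written in Python purely for illustration, regardless of the language actual Lenses use.

```python
from html import escape


def render_summary(sections: list[dict], key_titles: set[str]) -> str:
    """Render every section as a collapsible block; key sections start open."""
    parts = []
    for section in sections:
        title = escape(section["title"])
        body = escape(section["text"])
        # Nothing is dropped: non-key sections are collapsed, not removed.
        open_attr = " open" if section["title"] in key_titles else ""
        parts.append(
            f"<details{open_attr}><summary>{title}</summary><p>{body}</p></details>"
        )
    return "\n".join(parts)


if __name__ == "__main__":
    demo = [
        {"title": "Warnings", "text": "Do not exceed the stated dose."},
        {"title": "Storage", "text": "Store below 25 °C."},
    ]
    print(render_summary(demo, key_titles={"Warnings"}))
```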

Translation Lens

  • Detect patient language preference
  • Provide inline translations
  • Link to translated Supporting Material

Simplification Lens

  • Identify complex terminology
  • Add glossary definitions
  • Suggest simpler alternatives (as supplements)

Training Data Sources

Models trained on:

Model Architecture

Potential approaches:

  • Transformer models: BERT, BioBERT, ClinicalBERT
  • Domain adaptation: Fine-tuning on pharmaceutical text
  • Multi-lingual models: XLM-R for language support
  • Ensemble methods: Combine multiple models

Deployment

As FOSPS microservices:

  • Containerized with model files
  • GPU acceleration for performance
  • Versioned models
  • A/B testing capabilities
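
One way versioned models and A/B testing could fit together is deterministic, hash-based routing, so that a given ePI always reaches the same model version. The sketch below assumes this approach; the function name, the version names, and the weights are hypothetical and not taken from FOSPS.

```python
import hashlib


def pick_model_version(epi_id: str, candidates: dict[str, float]) -> str:
    """Map an ePI identifier to a model version according to traffic weights."""
    total = sum(candidates.values())
    if not candidates or total <= 0:
        raise ValueError("candidates must contain at least one positive weight")
    # Hash the identifier to a stable point in [0, 1).
    digest = hashlib.sha256(epi_id.encode("utf-8")).digest()
    point = int.from_bytes(digest[:8], "big") / 2**64
    cumulative = 0.0
    for version, weight in candidates.items():
        cumulative += weight / total
        if point < cumulative:
            return version
    return version  # guards against floating-point rounding at the upper edge


if __name__ == "__main__":
    split = {"ner-v1": 0.9, "ner-v2": 0.1}  # hypothetical 90/10 split
    print(pick_model_version("epi-123", split))
```

Because the choice depends only on the identifier and the weights, results produced by different versions stay comparable across repeated runs.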

Discoverable as Preprocessors through the service label:

eu.gravitate-health.fosps.preprocessing=true
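
A minimal sketch of how such a label might be used for discovery, assuming service metadata is already available as plain dicts with a `labels` mapping. The descriptor shape and service names are hypothetical; how the metadata is obtained from the orchestration platform is not shown.

```python
PREPROCESSOR_LABEL = "eu.gravitate-health.fosps.preprocessing"


def discover_preprocessors(services: list[dict]) -> list[str]:
    """Return the names of services whose preprocessing label is set to "true"."""
    return [
        service["name"]
        for service in services
        if service.get("labels", {}).get(PREPROCESSOR_LABEL) == "true"
    ]


if __name__ == "__main__":
    # Hypothetical service descriptors.
    services = [
        {"name": "nlp-ner-preprocessor", "labels": {PREPROCESSOR_LABEL: "true"}},
        {"name": "summary-lens", "labels": {}},
        {"name": "legacy-service"},
    ]
    print(discover_preprocessors(services))
```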

Quality Assurance

Validation methods:

  • Precision/recall on test datasets
  • Expert review of annotations
  • Comparison with Annotation Tool gold standard
  • Continuous monitoring in production
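
For the precision/recall check, a minimal sketch is shown below. It assumes each annotation is represented as a `(start, end, label)` tuple and that only exact matches against the gold standard count as correct; both are assumptions for illustration, and partial-overlap scoring would need a different comparison.

```python
def precision_recall(predicted: set, gold: set) -> tuple[float, float]:
    """Exact-match precision and recall of predicted annotations against gold."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical annotations as (start, end, label) tuples.
    model_output = {(0, 5, "MEDICATION"), (10, 15, "SYMPTOM")}
    gold_standard = {(0, 5, "MEDICATION"), (20, 25, "CONDITION")}
    p, r = precision_recall(model_output, gold_standard)
    print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.50 recall=0.50
```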

Performance Considerations

NLP services should run as Preprocessors rather than as Lenses:

  • Heavy computation is acceptable in a Preprocessor
  • Each ePI is processed once
  • Results are cached as p(ePI)
  • Lenses stay lightweight
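
The run-once-and-cache idea can be sketched with a content-addressed cache: the expensive NLP step runs only when an ePI with that exact content has not been seen before. The `PreprocessCache` class and its in-memory dict store are hypothetical; a deployed service would more likely persist p(ePI) in a shared store.

```python
import hashlib
import json
from typing import Callable


class PreprocessCache:
    """Cache p(ePI) results keyed by a hash of the raw ePI content."""

    def __init__(self, preprocess: Callable[[dict], dict]):
        self._preprocess = preprocess
        self._store: dict[str, dict] = {}
        self.calls = 0  # how many times the expensive step actually ran

    @staticmethod
    def _key(epi: dict) -> str:
        # sort_keys makes the hash independent of dict key order.
        canonical = json.dumps(epi, sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def get(self, epi: dict) -> dict:
        key = self._key(epi)
        if key not in self._store:
            self.calls += 1
            self._store[key] = self._preprocess(epi)
        return self._store[key]


if __name__ == "__main__":
    cache = PreprocessCache(lambda epi: {**epi, "preprocessed": True})
    epi = {"id": "epi-1", "text": "Take one tablet daily."}
    cache.get(epi)
    cache.get(epi)
    print(cache.calls)  # 1
```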