I built a text classifier that can learn new categories on the fly without full retraining – solving a practical problem I kept hitting in production systems. Try it out:
from adaptive_classifier import AdaptiveClassifier
# Start with basic sentiment classes
classifier = AdaptiveClassifier("distilbert-base-uncased")
classifier.add_examples(
    ["Great product!", "Terrible service"],
    ["positive", "negative"]
)
# Later add a completely new class without retraining
classifier.add_examples(
    ["404 error", "API timeout"],
    ["technical", "technical"]
)
# It maintains accuracy on both old and new classes
print(classifier.predict("The API is down")) # [('technical', 0.82), ...]
print(classifier.predict("Best purchase ever")) # [('positive', 0.89), ...]
Technical implementation:
- Uses Elastic Weight Consolidation to prevent catastrophic forgetting when learning new classes
- Combines prototype memory with neural adaptation for efficient few-shot learning
- Memory-efficient: stores only representative examples using k-means clustering
- Fully deterministic results through careful seed management
- Built on PyTorch/HuggingFace, works with any transformer model
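To make the prototype-memory idea above concrete, here is a minimal sketch of selecting representative examples with k-means. This is illustrative only, not the library's actual code; `select_prototypes` and its parameters are hypothetical names:

```python
import numpy as np

def select_prototypes(embeddings: np.ndarray, k: int, iters: int = 20, seed: int = 0) -> np.ndarray:
    """Keep at most k representative embeddings for a class via plain k-means:
    cluster the class's embeddings, then keep the real example closest to
    each centroid. Returns the indices of the kept examples."""
    rng = np.random.default_rng(seed)
    k = min(k, len(embeddings))
    centroids = embeddings[rng.choice(len(embeddings), k, replace=False)]
    for _ in range(iters):
        # assign each embedding to its nearest centroid
        dists = np.linalg.norm(embeddings[:, None] - centroids[None], axis=-1)
        labels = dists.argmin(axis=1)
        # recompute centroids, keeping the old one if a cluster empties
        for j in range(k):
            members = embeddings[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    # keep the actual stored example nearest each centroid
    dists = np.linalg.norm(embeddings[:, None] - centroids[None], axis=-1)
    return np.unique(dists.argmin(axis=0))

# e.g. 100 examples of one class, keep at most 5 representatives
protos = select_prototypes(np.random.rand(100, 768), k=5)
```

Because only these representatives are stored per class, memory stays roughly constant as examples keep arriving.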
The key insight was combining EWC (a continual-learning technique, originally demonstrated on reinforcement learning tasks) with a prototype memory system. This lets the model learn new patterns while preserving its knowledge of existing categories.
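For readers unfamiliar with EWC: it adds a quadratic penalty that anchors each parameter to its value after the previous task, weighted by that parameter's (diagonal) Fisher information. A minimal sketch in PyTorch (`ewc_penalty` and `lam` are illustrative names, not the library's API):

```python
import torch

def ewc_penalty(model, old_params, fisher, lam=1000.0):
    """EWC regularizer: parameters that were important to the old classes
    (high Fisher weight) resist change, so new classes can be learned
    without erasing existing knowledge."""
    loss = torch.tensor(0.0)
    for name, p in model.named_parameters():
        loss = loss + (fisher[name] * (p - old_params[name]) ** 2).sum()
    return 0.5 * lam * loss

# usage: total_loss = task_loss + ewc_penalty(model, old_params, fisher)
model = torch.nn.Linear(4, 2)
old_params = {n: p.detach().clone() for n, p in model.named_parameters()}
fisher = {n: torch.ones_like(p) for n, p in model.named_parameters()}
penalty = ewc_penalty(model, old_params, fisher)  # 0.0 before any drift
```

In practice the Fisher weights are estimated from gradients on the old classes' data, which is where the stored prototype examples come in.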
Real-world use cases I've tested:
- Customer feedback classification where new issue types emerge over time
- Content moderation systems that need to handle new violation types
- Technical support routing where new product categories get added
pip install adaptive-classifier
GitHub: https://github.com/codelion/adaptive-classifier
Would love feedback from anyone who's dealt with similar challenges in production ML systems!
Adding a new class to a traditional classifier:
- Requires full retraining (~30-60 minutes on a typical dataset)
- Needs all historical data
- Uses 2-3x more memory during training
This approach:
- Adds new class in seconds
- Needs only examples of new class
- Memory usage stays constant
- Maintains 95%+ accuracy on existing classes
The code is well-documented and tested. I've included detailed examples showing:
- Batch processing for large datasets
- Multi-language support
- Model persistence
- Custom transformer models
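As a rough illustration of the batch-processing pattern from the examples, here is how large datasets can be fed in incrementally. `batches` is a hypothetical helper written for this sketch, not part of the library:

```python
def batches(items, size=32):
    """Yield fixed-size chunks so a large dataset can be added incrementally
    instead of held in memory all at once."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

texts = [f"ticket {i}" for i in range(100)]
labels = ["technical"] * 100
for text_batch, label_batch in zip(batches(texts), batches(labels)):
    # classifier.add_examples(text_batch, label_batch)  # feed each chunk
    pass
```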
Happy to share more details about the architecture or specific implementation challenges!