Developed NER models (BiLSTM+CRF, BERT+BiLSTM+CRF) for subway work order analysis, improving average F1 by 5 percentage points and reaching an overall F1 score of 0.85 on 24-category entity recognition.
Improved data annotation quality and consistency through comparative analysis and standardized guidelines, so that model performance improved significantly as data volume increased.
Developed and packaged five core NLP modules (word segmentation, syntactic analysis, keyword extraction, named entity recognition, TF-IDF) with API integration.
Refactored module APIs around the RegisterModel and BaseNLPModel classes, simplifying function calls and providing a unified external interface for parameter setting, training, and prediction.
Packaged the intelligent Q&A system and the NLP platform, including their Python and Java services, as Docker images to simplify integration and deployment across business applications.
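
As an illustration of the RegisterModel / BaseNLPModel refactoring mentioned above, the following is a minimal sketch of how a registry plus an abstract base class can give every module the same interface for parameter setting, training, and prediction. Only the two class names come from the project; the method names (`set_params`, `train`, `predict`, `register`, `create`) and the toy `FrequencyKeywordModel` are assumptions made for this example, not the project's actual code.

```python
from abc import ABC, abstractmethod
from collections import Counter


class BaseNLPModel(ABC):
    """Hypothetical common interface: parameter setting, training, prediction."""

    def __init__(self, **params):
        self.params = dict(params)

    def set_params(self, **params):
        # Merge new parameters into the existing ones; return self for chaining.
        self.params.update(params)
        return self

    @abstractmethod
    def train(self, corpus):
        """Fit the model on a corpus and return self."""

    @abstractmethod
    def predict(self, text):
        """Return the module's output for one input text."""


class RegisterModel:
    """Hypothetical registry mapping a module name to its model class."""

    _registry = {}

    @classmethod
    def register(cls, name):
        # Class decorator: store the decorated class under the given name.
        def decorator(model_cls):
            cls._registry[name] = model_cls
            return model_cls
        return decorator

    @classmethod
    def create(cls, name, **params):
        # Look up the class by name and instantiate it with the parameters.
        return cls._registry[name](**params)


@RegisterModel.register("keyword_extraction")
class FrequencyKeywordModel(BaseNLPModel):
    """Toy stand-in (assumption): ranks whitespace tokens by frequency."""

    def train(self, corpus):
        # Nothing to fit in this toy model.
        return self

    def predict(self, text):
        stopwords = set(self.params.get("stopwords", []))
        top_k = self.params.get("top_k", 3)
        tokens = [t for t in text.lower().split() if t not in stopwords]
        return [word for word, _ in Counter(tokens).most_common(top_k)]


# Callers see one interface regardless of which module is behind the name.
model = RegisterModel.create("keyword_extraction", top_k=2)
model.set_params(stopwords=["the", "a"]).train(corpus=[])
keywords = model.predict("the pump failed and the pump was replaced")
```

With this layout, adding a new module only requires subclassing the base class and registering it under a name; calling code does not change.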