Gensim: Gensim is a Python library for topic modelling, document indexing and similarity retrieval with large corpora. Its target audience is the natural language processing (NLP) and information retrieval (IR) community.
Language: Python
URL: RaRe-Technologies/gensim
License: LGPL-2.1
Activity: more than 5,000 GitHub stars; still being updated as of November 2017
TextBlob: Simple, Pythonic text processing. Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.
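Language: Python
URL: sloria/TextBlob
Features: sentiment analysis, part-of-speech tagging, translation, and more
Activity: more than 4,000 GitHub stars; still being updated as of November 2017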
spaCy: spaCy is a library for advanced Natural Language Processing in Python and Cython. It's built on the very latest research, and was designed from day one to be used in real products. spaCy comes with pre-trained statistical models and word vectors, and currently supports tokenization for 20+ languages. It features the fastest syntactic parser in the world, convolutional neural network models for tagging, parsing and named entity recognition and easy deep learning integration. It's commercial open-source software, released under the MIT license.
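Language: Python
License: MIT
Features: many, including tagging, parsing and named entity recognition
Performance: powerful, with support for more than twenty languages (Chinese is not yet supported at the time of writing; see the official documentation at https://spacy.io/usage/ for details). It bills itself as an industrial-strength Python NLP toolkit, in contrast to the more academically oriented NLTK
Activity: more than 7,000 GitHub stars; still very active as of November 2017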
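For a quick feel of the spaCy API, here is a minimal tokenization sketch. It assumes a spaCy 2.0 or later install and uses a blank English pipeline, so no pre-trained model has to be downloaded; the sample sentence is made up.

```python
import spacy

# A blank pipeline contains only the language-specific tokenizer (no tagger, parser or NER).
nlp = spacy.blank("en")

# Hypothetical sample sentence.
doc = nlp("spaCy is designed for production use.")

# Each Token keeps its original text; trailing punctuation becomes its own token.
tokens = [token.text for token in doc]
print(tokens)
```

Tagging, parsing and named entity recognition need one of the pre-trained models mentioned above. A common route is to download one with `python -m spacy download en_core_web_sm` and load it via `spacy.load("en_core_web_sm")`, after which attributes such as `token.pos_` and `doc.ents` become populated.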