Ecosyste.ms: Packages

An open API service providing package, version, and dependency metadata for many open source software ecosystems and registries.
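Each entry in the listing below ends with a dash-separated summary line (versions, latest release, dependents, downloads, stars, maintainers). As a minimal sketch, assuming that layout holds for every entry, the counts can be pulled into a dictionary as follows. The helper name `parse_summary` and the output key names are hypothetical and are not part of the Ecosyste.ms API.

```python
import re

# Multipliers for the spelled-out magnitudes used in the listing.
_SCALE = {"thousand": 1_000, "million": 1_000_000}

# A count ("1,378" or "65.7 thousand") followed by a label ("stars on GitHub").
_FIELD = re.compile(r"^([\d.,]+)(?:\s+(thousand|million))?\s+(.+)$")

# Label prefixes mapped to output keys (key names are an assumption).
_KEYS = [
    ("version", "versions"),
    ("dependent package", "dependent_packages"),
    ("dependent repositor", "dependent_repositories"),
    ("download", "downloads_last_month"),
    ("star", "github_stars"),
    ("maintainer", "maintainers"),
]


def parse_summary(line: str) -> dict:
    """Parse one dash-separated summary line into a dict (hypothetical helper)."""
    result = {}
    for part in line.split(" - "):
        part = part.strip()
        if part.startswith("Latest release:"):
            # Keep the relative date as text; the listing gives no absolute date.
            result["latest_release"] = part.split(":", 1)[1].strip()
            continue
        match = _FIELD.match(part)
        if not match:
            continue  # ignore anything that is not "<count> <label>"
        number, scale, label = match.groups()
        value = round(float(number.replace(",", "")) * _SCALE.get(scale, 1))
        for prefix, key in _KEYS:
            if label.startswith(prefix):
                result[key] = value
                break
    return result
```

Fields missing from a line (for example, an entry with no dependent packages) are simply absent from the returned dictionary.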

pypi.org "nlp" keyword

Top 1.2% on pypi.org
spacy-lookups-data 1.0.5
Additional lookup tables and data resources for spaCy
18 versions - Latest release: 9 months ago - 7 dependent packages - 118 dependent repositories - 65.7 thousand downloads last month - 93 stars on GitHub - 3 maintainers
Top 1.9% on pypi.org
es-core-news-sm 3.1.0
Spanish pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, senter, ner, attr...
2 versions - Latest release: over 2 years ago - 2 dependent packages - 43 dependent repositories - 3.58 thousand downloads last month - 93 stars on GitHub - 1 maintainer
Top 8.5% on pypi.org
budoux 0.6.2
BudouX is the successor of Budou
17 versions - Latest release: 4 months ago - 2 dependent repositories - 3.77 thousand downloads last month - 1,378 stars on GitHub - 1 maintainer
sequence-align 0.1.0
Efficient implementations of Needleman-Wunsch and other sequence alignment algorithms in Rust wit...
1 version - Latest release: about 1 year ago - 1 dependent package - 1.38 thousand downloads last month - 57 stars on GitHub - 2 maintainers
ontogpt 0.3.11
OntoGPT
24 versions - Latest release: 18 days ago - 1 dependent repository - 397 downloads last month - 501 stars on GitHub - 4 maintainers
rulm 0.0.2
Language modeling and instruction tuning for Russian
2 versions - Latest release: about 5 years ago - 1 dependent repository - 26 downloads last month - 395 stars on GitHub - 1 maintainer
wikiente 0.1.0
Entity extraction using DBPedia through spotlight
1 version - Latest release: almost 5 years ago - 1 dependent repository - 12 downloads last month - 2 stars on GitHub - 2 maintainers
akrikola 0.1
A light-weight NLP library for Finnish language
1 version - Latest release: over 4 years ago - 30 downloads last month - 0 stars on GitHub - 1 maintainer
py-bangla-stemmer 0.5.1
Rule based Bengali Stemmer written in python
5 versions - Latest release: about 5 years ago - 1 dependent repository - 96 downloads last month - 0 stars on GitHub - 2 maintainers
stopwords-tr 2.0.0
Stopwords filter for Turkish languages
18 versions - Latest release: 1 day ago - 1.68 thousand downloads last month - 0 stars on GitHub - 1 maintainer
japanesetokenizer 1.3.7
Aims to make JapaneseTokenizer as easy to use as possible
21 versions - Latest release: about 6 years ago - 1 dependent repository - 168 downloads last month - 136 stars on GitHub - 2 maintainers
phospho 0.3.20
Text Analytics for LLM apps
50 versions - Latest release: 2 days ago - 1 dependent repository - 2.96 thousand downloads last month - 244 stars on GitHub - 2 maintainers
m3tl 0.7.0
BERT for Multi-task Learning
1 version - Latest release: over 2 years ago - 14 downloads last month - 544 stars on GitHub - 2 maintainers
Top 3.3% on pypi.org
deeppavlov 1.6.0
An open source library for building end-to-end dialog systems and training chatbots.
57 versions - Latest release: about 2 months ago - 80 dependent repositories - 9.63 thousand downloads last month - 6,547 stars on GitHub - 1 maintainer
paddle-ernie 0.2.0.dev1
A pretrained NLP model for every NLP task
6 versions - Latest release: almost 3 years ago - 4 dependent repositories - 65 downloads last month - 6,199 stars on GitHub - 2 maintainers
paddle-propeller 0.5.1.dev1
high level paddle-paddle API
3 versions - Latest release: over 3 years ago - 1 dependent repository - 30 downloads last month - 6,199 stars on GitHub - 2 maintainers
jtyoui-ernie 0.0.3
A pretrained NLP model for every NLP task
1 version - Latest release: almost 4 years ago - 1 dependent repository - 11 downloads last month - 6,199 stars on GitHub - 1 maintainer
asapppy 0.2b1
Semantic Textual Similarity and Dialogue System package for Python
14 versions - Latest release: almost 3 years ago - 130 downloads last month - 2 stars on GitHub - 1 maintainer
naturalminer 0.1.0
Mine data for patterns described in natural language
1 version - Latest release: 12 months ago - 11 downloads last month - 2 stars on GitHub - 2 maintainers
kidx-core 0.0.1a4
Machine learning based dialogue engine for conversational software.
4 versions - Latest release: almost 5 years ago - 1 dependent repository - 44 downloads last month - 2,332 stars on GitHub - 2 maintainers
Top 1.7% on pypi.org
rasa-core 0.14.5
Machine learning based dialogue engine for conversational software.
85 versions - Latest release: almost 5 years ago - 1 dependent package - 148 dependent repositories - 14.4 thousand downloads last month - 2,332 stars on GitHub - 2 maintainers
textdescriptives 2.8.0
A library for calculating a variety of features from text using spaCy
34 versions - Latest release: 23 days ago - 1 dependent package - 1 dependent repository - 1.93 thousand downloads last month - 288 stars on GitHub - 4 maintainers
newspaper-scraper 0.2.1
The all-in-one Python package for seamless newspaper article indexing, scraping, and processing –...
3 versions - Latest release: 12 months ago - 30 downloads last month - 12 stars on GitHub - 2 maintainers
Top 4.6% on pypi.org
deepdoctection 0.31
Repository for Document AI
22 versions - Latest release: 23 days ago - 14 dependent repositories - 2.64 thousand downloads last month - 2,198 stars on GitHub - 2 maintainers
Top 6.5% on pypi.org
tweetnlp 0.4.4
NLP library for Twitter.
24 versions - Latest release: 11 months ago - 1 dependent package - 2 dependent repositories - 2.1 thousand downloads last month - 268 stars on GitHub - 1 maintainer
dirtyclean 0.1
Get rid of Unicode punctuation and other garbage from strings
1 version - Latest release: over 6 years ago - 1 dependent repository - 10 downloads last month - 3 stars on GitHub - 2 maintainers
tftokenizers 0.1.8
Use Huggingface Transformer and Tokenizers as Tensorflow Reusable SavedModels.
9 versions - Latest release: about 2 years ago - 1 dependent repository - 122 downloads last month - 5 stars on GitHub - 2 maintainers
ekorpkit 0.1.40
eKorpkit provides a flexible interface for NLP and ML research pipelines such as extraction, tran...
94 versions - Latest release: over 1 year ago - 1 dependent repository - 26 downloads last month - 5 stars on GitHub - 1 maintainer
pysupwsdpocket 0.0.7
Just a Python Version of SupWSD Pocket: A software suite for SUPervised Word Sense Disambiguation
1 version - Latest release: about 4 years ago - 1 dependent repository - 19 downloads last month - 1 star on GitHub - 2 maintainers
nlpblock 0.0.1
Use All NLP models abstracted to block level with Pytorch
1 version - Latest release: about 5 years ago - 1 dependent repository - 13 downloads last month - 3 stars on GitHub - 2 maintainers
langfuse-haystack 0.0.4
Additional packages (components, document stores and the likes) to extend the capabilities of Hay...
4 versions - Latest release: about 2 hours ago - 316 downloads last month - 61 stars on GitHub - 2 maintainers
bahasa 1.0.1 💰
Bahasa is a natural language toolkit for Bahasa Indonesia
7 versions - Latest release: over 1 year ago - 1 dependent package - 1 dependent repository - 59 downloads last month - 19 stars on GitHub - 2 maintainers
Top 2.0% on pypi.org
bardapi 1.0.0 💰
A Python package that returns responses from Google Bard through an API.
39 versions - Latest release: about 1 month ago - 19 dependent packages - 31 dependent repositories - 25.5 thousand downloads last month - 5,181 stars on GitHub - 2 maintainers
googlebardapi 0.1.2 💰
A Python package that returns responses from Google Bard through an API.
3 versions - Latest release: 12 months ago - 1 dependent repository - 45 downloads last month - 5,197 stars on GitHub - 2 maintainers
nlppets 0.1.0
My nlp snippets
1 version - Latest release: 11 months ago - 1 dependent package - 1 dependent repository - 20 downloads last month - 0 stars on GitHub - 2 maintainers
adamix-gpt2 0.0.2
PyTorch implementation of low-rank adaptation (LoRA) and Adamix, a parameter-efficient approach t...
2 versions - Latest release: over 1 year ago - 42 downloads last month - 116 stars on GitHub - 1 maintainer
langua 1.0.12
Faster port of Language detection built by Shuyo in Python
12 versions - Latest release: over 6 years ago - 2 dependent repositories - 687 downloads last month - 4 stars on GitHub - 2 maintainers
Top 6.0% on pypi.org
stanford-openie 1.3.2 💰
Minimalist wrapper around Stanford OpenIE
8 versions - Latest release: 4 months ago - 1 dependent package - 14 dependent repositories - 751 downloads last month - 616 stars on GitHub - 2 maintainers
Top 5.5% on pypi.org
sent2vec 0.3.0 💰
How to encode sentences in a high-dimensional vector space, a.k.a., sentence embedding.
15 versions - Latest release: about 2 years ago - 25 dependent repositories - 2.74 thousand downloads last month - 124 stars on GitHub - 4 maintainers
flair-pos 0.13.1 💰
A very simple framework for state-of-the-art NLP
1 version - Latest release: 3 months ago - 9 downloads last month - 13,578 stars on GitHub - 2 maintainers
pd3f-flair 0.6.0.1 💰
Flair's language models without unnecessary dependencies
1 version - Latest release: over 3 years ago - 4 dependent repositories - 105 downloads last month - 13,578 stars on GitHub - 2 maintainers
faststylometry 1.0.4
Python library for calculating the Burrows Delta.
10 versions - Latest release: 8 months ago - 1 dependent repository - 112 downloads last month - 20 stars on GitHub - 2 maintainers
Top 0.7% on pypi.org
flair 0.13.1 💰
A very simple framework for state-of-the-art NLP
31 versions - Latest release: 5 months ago - 28 dependent packages - 497 dependent repositories - 147 thousand downloads last month - 13,329 stars on GitHub - 1 maintainer
lelesk 0.1
A fast Python implementation of the extended LESK algorithm for Word-Sense Disambiguation (WSD)
2 versions - Latest release: almost 3 years ago - 2 dependent repositories - 24 downloads last month - 4 stars on GitHub - 2 maintainers
spanish-chatbot 0.1.0
Chatbot in Spanish using different models: a Seq2Seq model with Luong attention and a transformer
1 version - Latest release: over 4 years ago - 1 dependent repository - 6 downloads last month - 17 stars on GitHub - 2 maintainers
ba-abydos 0.6.3
Fork of the Abydos NLP/IR library
1 version - Latest release: 7 months ago - 11 downloads last month - 0 stars on GitHub - 2 maintainers
codabl-python 0.1.2
Codabl Community API
2 versions - Latest release: about 4 years ago - 10 downloads last month - 5 stars on GitHub - 1 maintainer
fasttextrank 1.4
Extract abstracts and keywords from Chinese text
5 versions - Latest release: over 5 years ago - 1 dependent repository - 21 downloads last month - 410 stars on GitHub - 1 maintainer
clipbit 1.0.0
Generate concise, meaningful summaries of YouTube videos
1 version - Latest release: about 3 years ago - 1 dependent repository - 16 downloads last month - 5 stars on GitHub - 2 maintainers
ediscovery 0.0.6
nlp tool
8 versions - Latest release: almost 2 years ago - 1 dependent repository - 13 downloads last month - 2 maintainers
Top 0.2% on pypi.org
datasets 2.19.0
HuggingFace community-driven open-source library of datasets
82 versions - Latest release: 14 days ago - 650 dependent packages - 14,962 dependent repositories - 9.35 million downloads last month - 18,426 stars on GitHub - 4 maintainers
fdatasets 1.12.1 removed
HuggingFace/Datasets is an open library of NLP datasets.
1 version - Latest release: about 2 years ago - 14,671 stars on GitHub
Top 2.6% on pypi.org
argostranslate 1.9.6 💰
Open-source neural machine translation library based on OpenNMT's CTranslate2
35 versions - Latest release: 2 days ago - 8 dependent packages - 36 dependent repositories - 50.3 thousand downloads last month - 2,882 stars on GitHub - 1 maintainer
Top 4.5% on pypi.org
argos-translate-files 1.1.4 💰
Translate files with Argos Translate
10 versions - Latest release: 9 months ago - 1 dependent package - 8 dependent repositories - 4.89 thousand downloads last month - 3,245 stars on GitHub - 2 maintainers
Top 4.0% on pypi.org
translatehtml 1.5.2 💰
Translate HTML using Beautiful Soup and Argos Translate
3 versions - Latest release: over 2 years ago - 2 dependent packages - 10 dependent repositories - 4.49 thousand downloads last month - 3,245 stars on GitHub - 2 maintainers
Top 6.5% on pypi.org
mlconjug3 3.11.0 💰
A Python library to conjugate French, English, Spanish, Italian, Portuguese and Romanian verbs us...
34 versions - Latest release: 8 months ago - 1 dependent package - 7 dependent repositories - 1.66 thousand downloads last month - 65 stars on GitHub - 2 maintainers
Top 7.9% on pypi.org
jury 2.2.4
Evaluation toolkit for neural language generation.
22 versions - Latest release: 11 months ago - 1 dependent package - 2 dependent repositories - 604 downloads last month - 178 stars on GitHub - 2 maintainers
pywhisper 1.0.6 💰
openai/whisper speech to text model + extra features
7 versions - Latest release: over 1 year ago - 5 dependent repositories - 245 downloads last month - 92 stars on GitHub - 1 maintainer
openvino-nightly 2024.2.0.dev20240502
OpenVINO(TM) Runtime
111 versions - Latest release: about 4 hours ago - 1 dependent repository - 34 thousand downloads last month - 5,906 stars on GitHub - 2 maintainers
resparserok 0.0.7 💰
A simple resume parser used for extracting information from resumes
7 versions - Latest release: 8 months ago - 38 downloads last month - 749 stars on GitHub - 2 maintainers
Top 3.5% on pypi.org
pyresparser 1.0.6 💰
A simple resume parser used for extracting information from resumes
6 versions - Latest release: over 4 years ago - 90 dependent repositories - 7.49 thousand downloads last month - 749 stars on GitHub - 2 maintainers
Top 3.0% on pypi.org
sparseml 1.7.0
Libraries for applying sparsification recipes to neural networks with a few lines of code, enabli...
39 versions - Latest release: about 2 months ago - 4 dependent packages - 20 dependent repositories - 3.37 thousand downloads last month - 1,976 stars on GitHub - 1 maintainer
Top 8.1% on pypi.org
sparseml-nightly 1.8.0.20240404
Libraries for applying sparsification recipes to neural networks with a few lines of code, enabli...
150 versions - Latest release: 28 days ago - 1 dependent package - 1 dependent repository - 752 downloads last month - 1,976 stars on GitHub - 1 maintainer
Top 8.1% on pypi.org
sparsezoo-nightly 1.8.0.20240425
Neural network model repository for highly sparse and sparse-quantized models with matching spars...
157 versions - Latest release: 7 days ago - 3 dependent packages - 1 dependent repository - 3.31 thousand downloads last month - 357 stars on GitHub - 1 maintainer
Top 4.6% on pypi.org
sparsezoo 1.7.0
Neural network model repository for highly sparse and sparse-quantized models with matching spars...
28 versions - Latest release: about 2 months ago - 5 dependent packages - 6 dependent repositories - 5.44 thousand downloads last month - 357 stars on GitHub - 1 maintainer
Top 6.3% on pypi.org
indra 1.22.0
Integrated Network and Dynamical Reasoning Assembler
28 versions - Latest release: over 1 year ago - 2 dependent packages - 10 dependent repositories - 897 downloads last month - 160 stars on GitHub - 2 maintainers
openodia 0.1.11 💰
Open source Odia language tools
25 versions - Latest release: over 2 years ago - 1 dependent repository - 185 downloads last month - 7 stars on GitHub - 2 maintainers
Top 6.7% on pypi.org
pylangacq 0.19.1 💰
Tools for Language Acquisition Research
19 versions - Latest release: about 1 month ago - 2 dependent packages - 9 dependent repositories - 2.23 thousand downloads last month - 36 stars on GitHub - 1 maintainer
eazymind 0.0.4
AI library providing AI-as-a-service for abstractive text summarization, just your text to be su...
3 versions - Latest release: over 4 years ago - 1 dependent repository - 15 downloads last month - 4 stars on GitHub - 2 maintainers
wikipron 1.3.1
Scraping grapheme-to-phoneme data from Wiktionary
7 versions - Latest release: 2 months ago - 2 dependent repositories - 75 downloads last month - 267 stars on GitHub - 5 maintainers
Top 3.1% on pypi.org
langsmith 0.1.53
Client library to connect to the LangSmith LLM Tracing and Evaluation Platform.
162 versions - Latest release: about 4 hours ago - 46 dependent packages - 2,234 dependent repositories - 8.02 million downloads last month - 217 stars on GitHub - 2 maintainers
txtchat 0.2.0
Retrieval augmented generation (RAG) and language model powered search applications
3 versions - Latest release: 4 months ago - 77 downloads last month - 217 stars on GitHub - 2 maintainers
apple-ocr 1.0.8 💰
An OCR (Optical Character Recognition) utility for text extraction from images.
9 versions - Latest release: 3 months ago - 99 downloads last month - 68 stars on GitHub - 1 maintainer
Top 9.3% on pypi.org
epitator 1.3.5
Annotators for extracting epidemiological information from text.
22 versions - Latest release: over 4 years ago - 2 dependent repositories - 53 downloads last month - 41 stars on GitHub - 2 maintainers
topicmodeltuner 0.3.4
HDBSCAN Tuning for BERTopic Models
4 versions - Latest release: about 1 year ago - 78 downloads last month - 36 stars on GitHub - 1 maintainer
obsei 0.0.15
Obsei is an automation tool for text analysis needs
14 versions - Latest release: 4 months ago - 3 dependent repositories - 317 downloads last month - 1,081 stars on GitHub - 2 maintainers
util-ds 0.5.3 💰
This project is a convenient part of the NLP project, including several already exposed projects ...
22 versions - Latest release: almost 2 years ago - 1 dependent repository - 33 downloads last month - 3,419 stars on GitHub - 2 maintainers
extracteur-de-fou-malade-pour-charles-le-charlo 0.0.1 💰
PDF data parser
1 version - Latest release: over 3 years ago - 1 dependent repository - 16 downloads last month - 3,419 stars on GitHub - 2 maintainers
Top 1.6% on pypi.org
sumy 0.11.0 💰
Module for automatic summarization of text documents and HTML pages.
16 versions - Latest release: over 1 year ago - 7 dependent packages - 413 dependent repositories - 422 thousand downloads last month - 3,419 stars on GitHub - 1 maintainer
tf-notification-callback 0.2 💰
Receive notifications about your model training anywhere you want!
2 versions - Latest release: about 4 years ago - 1 dependent repository - 24 downloads last month - 10 stars on GitHub - 2 maintainers
presidio-evaluator 0.1.3
This package features data-science related tasks for developing new recognizers for Presidio. It ...
4 versions - Latest release: 4 months ago - 831 downloads last month - 149 stars on GitHub - 1 maintainer
grammaregex 0.1.3
grammaregex - a library for matching and finding sentence trees in a regex-like way
4 versions - Latest release: over 7 years ago - 4 dependent repositories - 28 downloads last month - 42 stars on GitHub - 2 maintainers
Top 5.1% on pypi.org
vncorenlp 1.0.3
A Python wrapper for VnCoreNLP using a bidirectional communication channel.
2 versions - Latest release: almost 6 years ago - 4 dependent packages - 31 dependent repositories - 2.15 thousand downloads last month - 55 stars on GitHub - 2 maintainers
nlpatl 0.0.2 💰
Natural language processing active learning library for deep neural networks
2 versions - Latest release: over 2 years ago - 1 dependent repository - 12 downloads last month - 18 stars on GitHub - 2 maintainers
Top 1.4% on pypi.org
nlpaug 1.1.11 💰
Natural language processing augmentation library for deep neural networks
37 versions - Latest release: almost 2 years ago - 16 dependent packages - 141 dependent repositories - 144 thousand downloads last month - 4,303 stars on GitHub - 1 maintainer
fasttext-serving 0.2.0 💰
fasttext-serving gRPC client
3 versions - Latest release: over 3 years ago - 1 dependent repository - 33 downloads last month - 58 stars on GitHub - 1 maintainer
fasttext-serving-server 0.6.2 💰
fastText model serving API server
1 version - Latest release: about 3 years ago - 1 dependent repository - 22 downloads last month - 58 stars on GitHub - 2 maintainers
telegram-anal 0.1.1
A fun tool to play with Telegram chats
2 versions - Latest release: over 1 year ago - 21 downloads last month - 2 stars on GitHub - 1 maintainer
Top 4.4% on pypi.org
keyphrase-vectorizers 0.0.13
Set of vectorizers that extract keyphrases with part-of-speech patterns from a collection of text...
13 versions - Latest release: about 5 hours ago - 2 dependent packages - 6 dependent repositories - 18.4 thousand downloads last month - 241 stars on GitHub - 1 maintainer
mathsolver 0.1.0
Decode natural language to solve mathematical calculations
2 versions - Latest release: about 5 years ago - 1 dependent repository - 11 downloads last month - 1 star on GitHub - 2 maintainers
giotto-deep 0.0.4
Toolbox for Deep Learning and Topological Data Analysis.
4 versions - Latest release: 7 months ago - 61 downloads last month - 68 stars on GitHub - 1 maintainer
ditty 0.5.0
A library for simplifying fine tuning with multi gpu setups in the Huggingface ecosystem.
10 versions - Latest release: about 5 hours ago - 121 downloads last month - 15 stars on GitHub - 2 maintainers
fast-sentence-classify 0.1.7
Generic Sentence Classification Service
8 versions - Latest release: about 1 year ago - 63 downloads last month - 2 maintainers
vagueify 0.0.1
1 version - Latest release: about 2 years ago - 1 dependent repository - 16 downloads last month - 2 maintainers
wake 0.11.0 💰
🍰 Making Wikipedia and Wikidata Processing Easy, Like Eating a Piece of Cake
11 versions - Latest release: almost 4 years ago - 2 dependent repositories - 58 downloads last month - 0 stars on GitHub - 1 maintainer
bert_clear_title 0.0.0.1.2
Terry toolkit Bert_clear_title
3 versions - Latest release: over 3 years ago - 16 downloads last month - 0 stars on GitHub - 2 maintainers
Top 2.1% on pypi.org
rake-nltk 1.0.6
RAKE short for Rapid Automatic Keyword Extraction algorithm, is a domain independent keyword extr...
7 versions - Latest release: over 2 years ago - 5 dependent packages - 255 dependent repositories - 232 thousand downloads last month - 1,034 stars on GitHub - 1 maintainer
semantic-text-splitter 0.12.3
Split text into semantic chunks, up to a desired chunk size. Supports calculating length by chara...
22 versions - Latest release: 1 day ago - 11.3 thousand downloads last month - 135 stars on GitHub - 2 maintainers
Top 5.8% on pypi.org
hanja 0.14.0
Hangul & Hanja library
15 versions - Latest release: 8 days ago - 1 dependent package - 10 dependent repositories - 10.5 thousand downloads last month - 122 stars on GitHub - 2 maintainers
starcc 0.0.5
Python implementation of StarCC
5 versions - Latest release: about 2 years ago - 1 dependent repository - 120 downloads last month - 18 stars on GitHub - 2 maintainers
Related Keywords
python (693), natural-language-processing (593), machine-learning (482), deep-learning (284), pytorch (265), ai (213), bert (198), spacy (195), NLP (193), transformers (174), text (153), natural language processing (141), llm (131), tensorflow (123), ner (118), nlp-library (114), named-entity-recognition (113), text-classification (108), language (105), transformer (97), learning (96), linguistics (95), python3 (92), data-science (90), language-model (84), artificial-intelligence (79), chatbot (69), nlp-machine-learning (69), hacktoberfest (68), sentiment-analysis (68), embeddings (65), machine learning (64), machine (63), tokenizer (61), huggingface (61), classification (59), parser (58), deep (54), nltk (54), text-mining (54), corpus (53), nlu (53), ml (52), computer-vision (52), text-processing (52), openai (52), embedding (49), tokenization (48), large-language-models (48), question-answering (48), gpt (47), word2vec (47), natural-language-understanding (46), information-extraction (45), api (45), bot (43), deep learning (43), neural-network (42), summarization (40), mlops (38), search (38), keras (38), dataset (37), conversational-ai (37), chatgpt (36), data (36), spacy-extension (36), parsing (36), keyword-extraction (36), preprocessing (36), topic-modeling (35), ocr (35), information-retrieval (35), haystack (34), language-models (34), pretrained-models (34), word-embeddings (33), japanese (33), computational-linguistics (33), chinese (32), natural (31), translation (31), morphology (31), sentence (31), speech (31), pos-tagging (31), neural-networks (30), inference (30), BERT (30), langchain (30), text-analysis (29), sentiment (29), machine-learning-library (29), speech-recognition (28), extraction (28), datasets (28), datetime (28), processing (27), analysis (27), python-library (27)