pypi.org "document-intelligence" keyword
paddlenlp 2.8.1
Top 1.0% on pypi.org
Easy-to-use and powerful NLP library with Awesome model zoo, supporting wide-range of NLP tasks f...
96 versions - Latest release: over 1 year ago - 15 dependent packages - 438 dependent repositories - 31.9 thousand downloads last month - 11,867 stars on GitHub - 1 maintainer
kreuzberg 4.4.2
High-performance document intelligence library for Python. Extract text, metadata, and structured...
131 versions - Latest release: 7 days ago - 81.5 thousand downloads last month - 6,130 stars on GitHub - 1 maintainer
byteit 0.1.2
AI-powered document intelligence platform - Turn your data into structured data with a single lin...
3 versions - Latest release: about 1 month ago - 273 downloads last month - 1 maintainer
langchain-kreuzberg 1.0.0
Kreuzberg document loader for LangChain — extract text from 75+ file formats with true async and ...
1 version - Latest release: 14 days ago - 92 downloads last month - 1 maintainer
contextgem 0.21.0
Effortless LLM extraction from documents
44 versions - Latest release: 16 days ago - 2.02 thousand downloads last month - 1,697 stars on GitHub - 1 maintainer
datalab-python-sdk 0.2.3
SDK for the Datalab document intelligence API
20 versions - Latest release: 18 days ago - 44.6 thousand downloads last month - 1 maintainer
servifai 1.1.0
AI-powered PDF parsing and retrieval with multiple subscription tiers for python
8 versions - Latest release: 10 months ago - 12 downloads last month - 7 stars on GitHub - 1 maintainer
faster-tokenizer 0.2.0
PaddleNLP Faster Tokenizer Library written in C++
7 versions - Latest release: over 3 years ago - 1 dependent package - 606 downloads last month - 11,913 stars on GitHub - 1 maintainer
documiner 0.8.2
Advanced tool designed for text analysis and data mining in documents
1 version - Latest release: 8 months ago - 1 maintainer
mseep-kreuzberg 3.13.5
Document intelligence framework for Python - Extract text, metadata, and structured data from div...
4 versions - Latest release: 6 months ago - 43 downloads last month - 2,454 stars on GitHub - 1 maintainer
llm_etl_pipeline 0.1.0
LLM extraction from documents
1 version - Latest release: 9 months ago - 6 downloads last month - 1 star on GitHub - 1 maintainer
paddle-pipelines 0.6.2
Top 8.6% on pypi.org
Paddle-Pipelines: An End to End Natural Language Processing Development Kit Based on PaddleNLP
11 versions - Latest release: over 2 years ago - 1 dependent repository - 74 downloads last month - 11,927 stars on GitHub - 1 maintainer
tool-helpers 0.1.2
Data tool helpers for PaddleNLP pre-training.
3 versions - Latest release: over 1 year ago - 1 dependent package - 9.91 thousand downloads last month - 11,927 stars on GitHub - 1 maintainer
netintel-ocr 0.1.17
Enterprise Document Intelligence Platform with High-Performance C++ Extensions, API v2, MCP, and ...
41 versions - Latest release: 6 months ago - 463 downloads last month - 1 star on GitHub - 2 maintainers
fast-dataindex 0.1.2
Data tool helpers for PaddleNLP pre-training.
1 version - Latest release: over 1 year ago - 1.6 thousand downloads last month - 12,887 stars on GitHub - 1 maintainer
fast-tokenizer-python 1.0.2
Top 3.3% on pypi.org
PaddleNLP Fast Tokenizer Library written in C++
4 versions - Latest release: about 3 years ago - 2 dependent packages - 14 dependent repositories - 199 downloads last month - 11,927 stars on GitHub - 1 maintainer
faster-tokenizers 0.1.1
PaddleNLP Faster Tokenizer Library written in C++
2 versions - Latest release: almost 4 years ago - 1 dependent repository - 110 downloads last month - 11,927 stars on GitHub - 1 maintainer
credeed-pdf-to-markdown 0.1.0
Convert PDF to Markdown using Azure AI Document Intelligence and upload to S3. Provided by the Cr...
1 version - Latest release: 11 months ago - 20 downloads last month - 0 stars on GitHub - 1 maintainer
tikara 0.1.6
The metadata and text content extractor for almost every file type.
6 versions - Latest release: about 1 year ago - 182 downloads last month - 4 stars on GitHub - 1 maintainer
contractex 0.1.1
Top 9.5% on pypi.org
Modern Python library for LLM-powered contract intelligence and legal document analysis
2 versions - Latest release: 26 days ago - 191 downloads last month - 1 star on GitHub - 1 maintainer
omnidoc-sdk 0.3.9
Enterprise-grade SDK for document ingestion, OCR, semantic chunking, and RAG-ready processing
3 versions - Latest release: 2 months ago - 55 downloads last month - 1 maintainer
Related Keywords
llm (13)
information-extraction (12)
nlp (11)
semantic-analysis (9)
question-answering (9)
ocr (8)
document-processing (8)
pdf (7)
uie (7)
transformers (7)
sentiment-analysis (7)
search-engine (7)
pretrained-models (7)
paddlenlp (7)
neural-search (7)
llama (7)
ernie (7)
embedding (7)
distributed-training (7)
compression (7)
bert (7)
rag (5)
data-extraction (5)
document-analysis (5)
document-extraction (5)
document-parsing (5)
document-understanding (4)
docx (4)
ai (4)
generative-ai (4)
machine-learning (4)
structured-data (4)
text-extraction (4)
llm-library (3)
llm-framework (3)
llm-extraction (3)
large-language-models (3)
knowledge-extraction (3)
insights-extraction (3)
artificial-intelligence (3)
automated-prompting (3)
content-extraction (3)
contract-analysis (3)
document (3)
document-qa (3)
document-pipeline (3)
llm-reasoning (3)
multilingual (3)
multimodal (3)
neural-segmentation (3)
metadata-extraction (3)
text-processing (3)
zero-shot (3)
unstructured-data (3)
retrieval-augmented-generation (2)
mcp (2)
document-classification (2)
api (2)
llms (2)
entity-extraction (2)
extraction-justifications (2)
extraction-pipeline (2)
fintech (2)
topic-extraction (2)
text-analysis (2)
structured-data-extraction (2)
reference-mapping (2)
prompt-free (2)
no-prompt-engineering (2)
legaltech (2)
low-code (2)
table-extraction (2)
tesseract (2)
pdf-extraction (2)
office-documents (2)
java (2)
pdfium (2)
markdown (2)
python (2)
langchain (2)
contract-review (2)
contract-parsing (2)
contract-management (2)
contract-intelligence (2)
contract-automation (2)
context-aware (2)
concept-extraction (2)
aspect-extraction (2)
file-parsing (1)
excel (1)
file-analysis (1)
file-conversion (1)
document-text (1)
document-reader (1)
document-ocr (1)
file-format (1)
document-metadata (1)
document-management (1)
file-identification (1)
document-indexing (1)