An open API service providing package, version, and dependency metadata for many open source software ecosystems and registries.

pypi.org "document-intelligence" keyword

Top 1.0% on pypi.org
paddlenlp 2.8.1
Easy-to-use and powerful NLP library with Awesome model zoo, supporting wide-range of NLP tasks f...
96 versions - Latest release: over 1 year ago - 15 dependent packages - 438 dependent repositories - 31.9 thousand downloads last month - 11,867 stars on GitHub - 1 maintainer
kreuzberg 4.4.2
High-performance document intelligence library for Python. Extract text, metadata, and structured...
131 versions - Latest release: 7 days ago - 81.5 thousand downloads last month - 6,130 stars on GitHub - 1 maintainer
byteit 0.1.2
AI-powered document intelligence platform - Turn your data into structured data with a single lin...
3 versions - Latest release: about 1 month ago - 273 downloads last month - 1 maintainer
langchain-kreuzberg 1.0.0
Kreuzberg document loader for LangChain — extract text from 75+ file formats with true async and ...
1 version - Latest release: 14 days ago - 92 downloads last month - 1 maintainer
contextgem 0.21.0
Effortless LLM extraction from documents
44 versions - Latest release: 16 days ago - 2.02 thousand downloads last month - 1,697 stars on GitHub - 1 maintainer
datalab-python-sdk 0.2.3
SDK for the Datalab document intelligence API
20 versions - Latest release: 18 days ago - 44.6 thousand downloads last month - 1 maintainer
servifai 1.1.0
AI-powered PDF parsing and retrieval with multiple subscription tiers for python
8 versions - Latest release: 10 months ago - 12 downloads last month - 7 stars on GitHub - 1 maintainer
faster-tokenizer 0.2.0
PaddleNLP Faster Tokenizer Library written in C++
7 versions - Latest release: over 3 years ago - 1 dependent package - 606 downloads last month - 11,913 stars on GitHub - 1 maintainer
documiner 0.8.2
Advanced tool designed for text analysis and data mining in documents
1 version - Latest release: 8 months ago - 1 maintainer
mseep-kreuzberg 3.13.5
Document intelligence framework for Python - Extract text, metadata, and structured data from div...
4 versions - Latest release: 6 months ago - 43 downloads last month - 2,454 stars on GitHub - 1 maintainer
llm_etl_pipeline 0.1.0
LLM extraction from documents
1 version - Latest release: 9 months ago - 6 downloads last month - 1 star on GitHub - 1 maintainer
Top 8.6% on pypi.org
paddle-pipelines 0.6.2
Paddle-Pipelines: An End to End Natural Language Processing Development Kit Based on PaddleNLP
11 versions - Latest release: over 2 years ago - 1 dependent repository - 74 downloads last month - 11,927 stars on GitHub - 1 maintainer
tool-helpers 0.1.2
Data tool helpers for PaddleNLP pre-training.
3 versions - Latest release: over 1 year ago - 1 dependent package - 9.91 thousand downloads last month - 11,927 stars on GitHub - 1 maintainer
netintel-ocr 0.1.17
Enterprise Document Intelligence Platform with High-Performance C++ Extensions, API v2, MCP, and ...
41 versions - Latest release: 6 months ago - 463 downloads last month - 1 star on GitHub - 2 maintainers
fast-dataindex 0.1.2
Data tool helpers for PaddleNLP pre-training.
1 version - Latest release: over 1 year ago - 1.6 thousand downloads last month - 12,887 stars on GitHub - 1 maintainer
Top 3.3% on pypi.org
fast-tokenizer-python 1.0.2
PaddleNLP Fast Tokenizer Library written in C++
4 versions - Latest release: about 3 years ago - 2 dependent packages - 14 dependent repositories - 199 downloads last month - 11,927 stars on GitHub - 1 maintainer
faster-tokenizers 0.1.1
PaddleNLP Faster Tokenizer Library written in C++
2 versions - Latest release: almost 4 years ago - 1 dependent repository - 110 downloads last month - 11,927 stars on GitHub - 1 maintainer
credeed-pdf-to-markdown 0.1.0
Convert PDF to Markdown using Azure AI Document Intelligence and upload to S3. Provided by the Cr...
1 version - Latest release: 11 months ago - 20 downloads last month - 0 stars on GitHub - 1 maintainer
tikara 0.1.6
The metadata and text content extractor for almost every file type.
6 versions - Latest release: about 1 year ago - 182 downloads last month - 4 stars on GitHub - 1 maintainer
Top 9.5% on pypi.org
contractex 0.1.1
Modern Python library for LLM-powered contract intelligence and legal document analysis
2 versions - Latest release: 26 days ago - 191 downloads last month - 1 star on GitHub - 1 maintainer
omnidoc-sdk 0.3.9
Enterprise-grade SDK for document ingestion, OCR, semantic chunking, and RAG-ready processing
3 versions - Latest release: 2 months ago - 55 downloads last month - 1 maintainer
Related Keywords
llm (13), information-extraction (12), nlp (11), semantic-analysis (9), question-answering (9), ocr (8), document-processing (8), pdf (7), uie (7), transformers (7), sentiment-analysis (7), search-engine (7), pretrained-models (7), paddlenlp (7), neural-search (7), llama (7), ernie (7), embedding (7), distributed-training (7), compression (7), bert (7), rag (5), data-extraction (5), document-analysis (5), document-extraction (5), document-parsing (5), document-understanding (4), docx (4), ai (4), generative-ai (4), machine-learning (4), structured-data (4), text-extraction (4), llm-library (3), llm-framework (3), llm-extraction (3), large-language-models (3), knowledge-extraction (3), insights-extraction (3), artificial-intelligence (3), automated-prompting (3), content-extraction (3), contract-analysis (3), document (3), document-qa (3), document-pipeline (3), llm-reasoning (3), multilingual (3), multimodal (3), neural-segmentation (3), metadata-extraction (3), text-processing (3), zero-shot (3), unstructured-data (3), retrieval-augmented-generation (2), mcp (2), document-classification (2), api (2), llms (2), entity-extraction (2), extraction-justifications (2), extraction-pipeline (2), fintech (2), topic-extraction (2), text-analysis (2), structured-data-extraction (2), reference-mapping (2), prompt-free (2), no-prompt-engineering (2), legaltech (2), low-code (2), table-extraction (2), tesseract (2), pdf-extraction (2), office-documents (2), java (2), pdfium (2), markdown (2), python (2), langchain (2), contract-review (2), contract-parsing (2), contract-management (2), contract-intelligence (2), contract-automation (2), context-aware (2), concept-extraction (2), aspect-extraction (2), file-parsing (1), excel (1), file-analysis (1), file-conversion (1), document-text (1), document-reader (1), document-ocr (1), file-format (1), document-metadata (1), document-management (1), file-identification (1), document-indexing (1)