An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.

pypi.org "html-to-markdown" keyword

View the packages on the pypi.org package registry that are tagged with the "html-to-markdown" keyword.

Top 1.7% on pypi.org
trafilatura 2.0.0 💰
Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction...
50 versions - Latest release: 9 months ago - 71 dependent packages - 63 dependent repositories - 1.37 million downloads last month - 4,627 stars on GitHub - 1 maintainer
confluence-markdown-exporter 3.0.5 💰
A tool to export Confluence pages to Markdown
32 versions - Latest release: 1 day ago - 3.19 thousand downloads last month - 116 stars on GitHub - 1 maintainer
markdown-crawler 0.0.8
A multithreaded 🕸️ web crawler that recursively crawls a website and creates a 🔽 markdown file fo...
7 versions - Latest release: almost 2 years ago - 735 downloads last month - 393 stars on GitHub - 1 maintainer
magicconvert 0.1.3
MagicConvert is a Python library that converts various document formats (PDF, DOCX, XLSX, PPTX, H...
3 versions - Latest release: 4 months ago - 43 downloads last month - 1 star on GitHub - 1 maintainer
pyhtml2md 1.7.0
Transform your HTML into clean, easy-to-read markdown with pyhtml2md.
20 versions - Latest release: 4 months ago - 7.59 thousand downloads last month - 63 stars on GitHub - 1 maintainer
firecrawl-py 4.3.4
Python SDK for Firecrawl API
112 versions - Latest release: 1 day ago - 1 dependent package - 1.09 million downloads last month - 54,038 stars on GitHub - 1 maintainer
firecrawl 4.3.4
Python SDK for Firecrawl API
89 versions - Latest release: 1 day ago - 434 thousand downloads last month - 54,038 stars on GitHub - 1 maintainer
webdown 0.7.0
Convert web pages and HTML files to markdown and Claude XML formats
8 versions - Latest release: 5 months ago - 17 downloads last month - 3 stars on GitHub - 1 maintainer
docstrange 1.1.5
Extract and Convert PDF, Word, PowerPoint, Excel, images, URLs into multiple formats (Markdown, J...
16 versions - Latest release: 5 days ago - 1.93 thousand downloads last month - 493 stars on GitHub - 1 maintainer
spiderclient-py 0.0.1
Python SDK for Spider Cloud API
1 version - Latest release: over 1 year ago - 4 downloads last month - 19 stars on GitHub - 1 maintainer
spider-client 0.1.77
Python SDK for Spider Cloud API
103 versions - Latest release: 9 days ago - 1 dependent package - 164 thousand downloads last month - 19 stars on GitHub - 1 maintainer
spidercloud-py 0.0.1
Python SDK for Spider Cloud API
1 version - Latest release: over 1 year ago - 3 downloads last month - 0 stars on GitHub - 1 maintainer
llm-data-converter 2.2.0
Best open-source document to markdown converter for LLM training data. Convert PDF, Word, PowerPo...
23 versions - Latest release: about 1 month ago - 297 downloads last month - 3 stars on GitHub - 1 maintainer
document-data-extractor 1.0.4
Best open-source document to markdown extractor for LLM training data. Convert PDF, Word, PowerPo...
5 versions - Latest release: about 1 month ago - 78 downloads last month - 3 stars on GitHub - 1 maintainer
bookmarks2markdown 1.0.2
Convert HTML bookmarks to Markdown
3 versions - Latest release: over 6 years ago - 12 dependent repositories - 12 downloads last month - 5 stars on GitHub - 1 maintainer
html2text-rs 0.2.5
Convert HTML to markdown or plain text
7 versions - Latest release: 7 days ago - 5.11 thousand downloads last month - 7 stars on GitHub - 1 maintainer
wizardhtml 1.0.0
WHATWG-compliant HTML5 toolkit: DFA tokenizer, spec-guided tree builder, DOM, configurable serial...
1 version - Latest release: 9 days ago - 1 star on GitHub - 1 maintainer
spiderforce4ai 2.6.9
Python wrapper for SpiderForce4AI HTML-to-Markdown conversion service with LLM post-processing
50 versions - Latest release: 6 months ago - 173 downloads last month - 1 maintainer
html2txt 0.6.0
Convert HTML to markdown
6 versions - Latest release: almost 5 years ago - 2 dependent repositories - 216 downloads last month - 1 star on GitHub - 1 maintainer
bookmarkdown 0.1.1
Parse your browser's exported HTML bookmark file to Markdown.
2 versions - Latest release: about 4 years ago - 1 dependent repository - 19 downloads last month - 17 stars on GitHub - 1 maintainer
webpage2md 1.0.0
Convert HTML files and web pages to Markdown format
1 version - Latest release: 7 months ago - 9 downloads last month - 1 maintainer
url2md4ai 0.1.2
🚀 Lean Python tool for extracting clean, LLM-optimized markdown from web pages. Handles dynamic c...
6 versions - Latest release: 2 months ago - 73 downloads last month - 2 stars on GitHub - 1 maintainer
spiderwebai-py 0.1.4
Python SDK for SpiderWebAI API
10 versions - Latest release: over 1 year ago - 25 downloads last month - 18 stars on GitHub - 1 maintainer
markdown-spider 0.1.1
A configurable web spider that converts online documentation into well-structured Markdown files ...
1 version - Latest release: 6 months ago - 10 downloads last month - 2 stars on GitHub - 1 maintainer
rapid-crawl 0.1.0 💰
A powerful Python SDK for web scraping, crawling, and data extraction - inspired by Firecrawl
1 version - Latest release: about 2 months ago - 1 maintainer
toprint 0.1.32
2print/toprint: Python library for printing and converting between HTML, PDF, ZPL, and image form...
6 versions - Latest release: 3 months ago - 454 downloads last month - 0 stars on GitHub - 1 maintainer
file2txt 1.0.1
file2txt is a Python library that takes common file formats and turns them into plain text (a txt file...
3 versions - Latest release: 2 months ago - 12 stars on GitHub - 1 maintainer
conv-html-to-markdown 0.1.311
Curate scraped HTML for easy interpretation by large language models. Build more robust generativ...
7 versions - Latest release: over 1 year ago - 161 downloads last month - 3 stars on GitHub - 1 maintainer
Related Keywords
markdown 19 llm 11 web-scraping 10 scraper 9 ai 9 rag 8 crawler 7 spider 6 ai-scraping 6 text-extraction 6 pdf-to-markdown 6 document-processing 5 content-extraction 5 document-conversion 5 ocr 5 html 5 converter 5 pdf 4 ai-agents 4 llm-webcrawler 4 html2md 4 scraping 4 web-crawler 4 image-processing 3 document-ai 3 batch-document-processing 3 intelligent-document-processing 3 document-understanding 3 ai-training-data 3 word-to-markdown 3 powerpoint-to-markdown 3 html2markdown 3 excel-to-markdown 3 llm-ready-data 3 layout-detection 3 table-extraction 3 structured-data-extraction 3 webscraping 3 local-document-processing 3 document-to-markdown 3 tesseract-alternative 3 paddleocr-alternative 3 mineru-alternative 3 markitdown-alternative 3 supabase 3 unstructured-alternative 3 docling-alternative 3 marker-alternative 3 html2text 2 web 2 offline-document-converter 2 data 2 ppt-to-markdown 2 bookmarks 2 firecrawl 2 API 2 SDK 2 html5 2 python 2 web-automation 2 offline-document-extractor 2 python3 2 pdf-to-json 2 html-to-md 2 file-conversion 2 image-to-text 2 html-parser 2 image-to-markdown 2 doc2pdf 1 doc-to-print 1 doc2print 1 image-to-json 1 img2json 1 image-to-xml 1 img2xml 1 img2markdown 1 image-to-md 1 img2md 1 image-to-docx 1 img2docx 1 image-to-doc 1 img2doc 1 image-to-zpl 1 img2zpl 1 image-to-html 1 img2html 1 image-to-pdf 1 img2pdf 1 image-to-print 1 zpl-to-print 1 zpl2print 1 pdf2json 1 pdf-to-xml 1 pdf2xml 1 pdf2markdown 1 pdf-to-md 1 pdf2md 1 pdf-to-docx 1 pdf2docx 1 pdf-to-doc 1