An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.

pypi.org "html-to-markdown" keyword

View the packages on the pypi.org package registry that are tagged with the "html-to-markdown" keyword.

Top 1.7% on pypi.org
trafilatura 2.0.0 💰
Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction...
50 versions - Latest release: 11 months ago - 71 dependent packages - 63 dependent repositories - 2.56 million downloads last month - 4,829 stars on GitHub - 1 maintainer
bookmarks2markdown 1.0.2
Convert HTML bookmarks to Markdown
3 versions - Latest release: almost 7 years ago - 12 dependent repositories - 12 downloads last month - 5 stars on GitHub - 1 maintainer
spider-client 0.1.80
Python SDK for Spider Cloud API
105 versions - Latest release: 3 days ago - 1 dependent package - 199 thousand downloads last month - 19 stars on GitHub - 1 maintainer
pyhtml2md 1.7.0
Transform your HTML into clean, easy-to-read markdown with pyhtml2md.
20 versions - Latest release: 6 months ago - 16.8 thousand downloads last month - 63 stars on GitHub - 1 maintainer
confluence-markdown-exporter 3.1.0 💰
A tool to export Confluence pages to Markdown
33 versions - Latest release: 3 days ago - 3.72 thousand downloads last month - 131 stars on GitHub - 1 maintainer
firecrawl 4.6.0
Python SDK for Firecrawl API
95 versions - Latest release: 4 days ago - 919 thousand downloads last month - 65,512 stars on GitHub - 1 maintainer
firecrawl-py 4.6.0
Python SDK for Firecrawl API
118 versions - Latest release: 4 days ago - 1 dependent package - 1.28 million downloads last month - 65,512 stars on GitHub - 1 maintainer
docstrange 1.1.8
Extract and Convert PDF, Word, PowerPoint, Excel, images, URLs into multiple formats (Markdown, J...
19 versions - Latest release: 9 days ago - 3.48 thousand downloads last month - 935 stars on GitHub - 1 maintainer
bookmarkdown 0.1.1
Parse your browser's exported HTML bookmark file to Markdown.
2 versions - Latest release: about 4 years ago - 1 dependent repository - 18 downloads last month - 17 stars on GitHub - 1 maintainer
html2txt 0.6.0
Convert HTML to markdown
6 versions - Latest release: about 5 years ago - 2 dependent repositories - 213 downloads last month - 1 star on GitHub - 1 maintainer
wizardhtml 1.0.1
WHATWG-compliant HTML5 toolkit: DFA tokenizer, spec-guided tree builder, DOM, configurable serial...
2 versions - Latest release: 2 months ago - 19 downloads last month - 1 star on GitHub - 1 maintainer
url2md4ai 0.1.2
🚀 Lean Python tool for extracting clean, LLM-optimized markdown from web pages. Handles dynamic c...
6 versions - Latest release: 4 months ago - 76 downloads last month - 3 stars on GitHub - 1 maintainer
spiderclient-py 0.0.1
Python SDK for Spider Cloud API
1 version - Latest release: over 1 year ago - 8 downloads last month - 19 stars on GitHub - 1 maintainer
rapid-crawl 0.1.0 💰
A powerful Python SDK for web scraping, crawling, and data extraction - inspired by Firecrawl
1 version - Latest release: 4 months ago - 20 downloads last month - 0 stars on GitHub - 1 maintainer
spiderwebai-py 0.1.4
Python SDK for SpiderWebAI API
10 versions - Latest release: over 1 year ago - 16 downloads last month - 19 stars on GitHub - 1 maintainer
llm-data-converter 2.2.0
Best open-source document to markdown converter for LLM training data. Convert PDF, Word, PowerPo...
23 versions - Latest release: 4 months ago - 133 downloads last month - 5 stars on GitHub - 1 maintainer
document-data-extractor 1.0.4
Best open-source document to markdown extractor for LLM training data. Convert PDF, Word, PowerPo...
5 versions - Latest release: 3 months ago - 38 downloads last month - 3 stars on GitHub - 1 maintainer
webpage2md 1.0.0
Convert HTML files and web pages to Markdown format
1 version - Latest release: 9 months ago - 10 downloads last month - 1 maintainer
markdown-spider 0.1.1
A configurable web spider that converts online documentation into well-structured Markdown files ...
1 version - Latest release: 8 months ago - 13 downloads last month - 2 stars on GitHub - 1 maintainer
markdown-crawler 0.0.8
A multithreaded 🕸️ web crawler that recursively crawls a website and creates a 🔽 markdown file fo...
7 versions - Latest release: about 2 years ago - 735 downloads last month - 400 stars on GitHub - 1 maintainer
toprint 0.1.32
2print/toprint: Python library for printing and converting between HTML, PDF, ZPL, and image form...
6 versions - Latest release: 5 months ago - 41 downloads last month - 0 stars on GitHub - 1 maintainer
webdown 0.7.0
Convert web pages and HTML files to markdown and Claude XML formats
8 versions - Latest release: 7 months ago - 43 downloads last month - 4 stars on GitHub - 1 maintainer
spiderforce4ai 2.6.9
Python wrapper for SpiderForce4AI HTML-to-Markdown conversion service with LLM post-processing
50 versions - Latest release: 8 months ago - 173 downloads last month - 1 maintainer
spidercloud-py 0.0.1
Python SDK for Spider Cloud API
1 version - Latest release: over 1 year ago - 7 downloads last month - 0 stars on GitHub - 1 maintainer
magicconvert 0.1.3
MagicConvert is a Python library that converts various document formats (PDF, DOCX, XLSX, PPTX, H...
3 versions - Latest release: 6 months ago - 71 downloads last month - 2 stars on GitHub - 1 maintainer
file2txt 1.0.5
file2txt is a Python library that takes common file formats and turns them into plain text (a txt file...
7 versions - Latest release: about 1 month ago - 885 downloads last month - 12 stars on GitHub - 1 maintainer
html2text-rs 0.2.5
Convert HTML to markdown or plain text
7 versions - Latest release: 2 months ago - 8.07 thousand downloads last month - 7 stars on GitHub - 1 maintainer
conv-html-to-markdown 0.1.311
Curate scraped HTML for easy interpretation by large language models. Build more robust generativ...
7 versions - Latest release: almost 2 years ago - 161 downloads last month - 3 stars on GitHub - 1 maintainer
Related Keywords
markdown (19), web-scraping (12), llm (11), ai (9), scraper (9), crawler (7), spider (6), ai-scraping (6), rag (6), ai-agents (6), pdf-to-markdown (6), text-extraction (6), ocr (5), document-conversion (5), content-extraction (5), html (5), document-processing (5), converter (5), llm-webcrawler (4), html2md (4), scraping (4), web-crawler (4), pdf (4), tesseract-alternative (3), document-to-markdown (3), html2markdown (3), paddleocr-alternative (3), mineru-alternative (3), markitdown-alternative (3), marker-alternative (3), docling-alternative (3), unstructured-alternative (3), ai-training-data (3), document-understanding (3), intelligent-document-processing (3), image-processing (3), supabase (3), local-document-processing (3), data-extraction (3), structured-data-extraction (3), table-extraction (3), layout-detection (3), llm-ready-data (3), document-ai (3), excel-to-markdown (3), powerpoint-to-markdown (3), word-to-markdown (3), batch-document-processing (3), webscraping (3), web-scraper (3), html2text (2), offline-document-extractor (2), image-to-markdown (2), file-conversion (2), pdf-to-json (2), html-parser (2), html-to-md (2), web-automation (2), html-to-markdown-converter (2), offline-document-converter (2), ppt-to-markdown (2), web (2), image-to-text (2), bookmarks (2), python (2), python3 (2), html5 (2), web-search (2), web-data-extraction (2), ai-search (2), ai-crawler (2), firecrawl (2), SDK (2), web-data (2), API (2), image-to-xml (1), img2xml (1), img2markdown (1), doc-to-docx (1), doc2md (1), image-to-md (1), img2md (1), image-to-docx (1), img2docx (1), image-to-doc (1), doc-to-md (1), doc2docx (1), doc-to-image (1), doc2img (1), doc-to-zpl (1), doc2zpl (1), img2json (1), image-to-json (1), doc-to-html (1), doc2html (1), doc2print (1), doc-to-pdf (1), doc2pdf (1), doc-to-print (1), parser (1)