pypi.org "html-to-markdown" keyword
View the packages on the pypi.org package registry that are tagged with the "html-to-markdown" keyword.
Top 1.7% on pypi.org
trafilatura 2.0.0
Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction...
50 versions - Latest release: 11 months ago - 71 dependent packages - 63 dependent repositories - 2.56 million downloads last month - 4,829 stars on GitHub - 1 maintainer
bookmarks2markdown 1.0.2
Convert HTML bookmarks to Markdown
3 versions - Latest release: almost 7 years ago - 12 dependent repositories - 12 downloads last month - 5 stars on GitHub - 1 maintainer
spider-client 0.1.80
Python SDK for Spider Cloud API
105 versions - Latest release: 3 days ago - 1 dependent package - 199 thousand downloads last month - 19 stars on GitHub - 1 maintainer
pyhtml2md 1.7.0
Transform your HTML into clean, easy-to-read markdown with pyhtml2md.
20 versions - Latest release: 6 months ago - 16.8 thousand downloads last month - 63 stars on GitHub - 1 maintainer
confluence-markdown-exporter 3.1.0
A tool to export Confluence pages to Markdown
33 versions - Latest release: 3 days ago - 3.72 thousand downloads last month - 131 stars on GitHub - 1 maintainer
firecrawl 4.6.0
Python SDK for Firecrawl API
95 versions - Latest release: 4 days ago - 919 thousand downloads last month - 65,512 stars on GitHub - 1 maintainer
firecrawl-py 4.6.0
Python SDK for Firecrawl API
118 versions - Latest release: 4 days ago - 1 dependent package - 1.28 million downloads last month - 65,512 stars on GitHub - 1 maintainer
docstrange 1.1.8
Extract and Convert PDF, Word, PowerPoint, Excel, images, URLs into multiple formats (Markdown, J...
19 versions - Latest release: 9 days ago - 3.48 thousand downloads last month - 935 stars on GitHub - 1 maintainer
bookmarkdown 0.1.1
Parse your browser's exported HTML bookmark file to Markdown.
2 versions - Latest release: about 4 years ago - 1 dependent repository - 18 downloads last month - 17 stars on GitHub - 1 maintainer
html2txt 0.6.0
Convert HTML to markdown
6 versions - Latest release: about 5 years ago - 2 dependent repositories - 213 downloads last month - 1 star on GitHub - 1 maintainer
wizardhtml 1.0.1
WHATWG-compliant HTML5 toolkit: DFA tokenizer, spec-guided tree builder, DOM, configurable serial...
2 versions - Latest release: 2 months ago - 19 downloads last month - 1 star on GitHub - 1 maintainer
url2md4ai 0.1.2
Lean Python tool for extracting clean, LLM-optimized markdown from web pages. Handles dynamic c...
6 versions - Latest release: 4 months ago - 76 downloads last month - 3 stars on GitHub - 1 maintainer
spiderclient-py 0.0.1
Python SDK for Spider Cloud API
1 version - Latest release: over 1 year ago - 8 downloads last month - 19 stars on GitHub - 1 maintainer
rapid-crawl 0.1.0
A powerful Python SDK for web scraping, crawling, and data extraction - inspired by Firecrawl
1 version - Latest release: 4 months ago - 20 downloads last month - 0 stars on GitHub - 1 maintainer
spiderwebai-py 0.1.4
Python SDK for SpiderWebAI API
10 versions - Latest release: over 1 year ago - 16 downloads last month - 19 stars on GitHub - 1 maintainer
llm-data-converter 2.2.0
Best open-source document to markdown converter for LLM training data. Convert PDF, Word, PowerPo...
23 versions - Latest release: 4 months ago - 133 downloads last month - 5 stars on GitHub - 1 maintainer
document-data-extractor 1.0.4
Best open-source document to markdown extractor for LLM training data. Convert PDF, Word, PowerPo...
5 versions - Latest release: 3 months ago - 38 downloads last month - 3 stars on GitHub - 1 maintainer
webpage2md 1.0.0
Convert HTML files and web pages to Markdown format
1 version - Latest release: 9 months ago - 10 downloads last month - 1 maintainer
markdown-spider 0.1.1
A configurable web spider that converts online documentation into well-structured Markdown files ...
1 version - Latest release: 8 months ago - 13 downloads last month - 2 stars on GitHub - 1 maintainer
markdown-crawler 0.0.8
A multithreaded 🕸️ web crawler that recursively crawls a website and creates a 🔽 markdown file fo...
7 versions - Latest release: about 2 years ago - 735 downloads last month - 400 stars on GitHub - 1 maintainer
toprint 0.1.32
2print/toprint: Python library for printing and converting between HTML, PDF, ZPL, and image form...
6 versions - Latest release: 5 months ago - 41 downloads last month - 0 stars on GitHub - 1 maintainer
webdown 0.7.0
Convert web pages and HTML files to markdown and Claude XML formats
8 versions - Latest release: 7 months ago - 43 downloads last month - 4 stars on GitHub - 1 maintainer
spiderforce4ai 2.6.9
Python wrapper for SpiderForce4AI HTML-to-Markdown conversion service with LLM post-processing
50 versions - Latest release: 8 months ago - 173 downloads last month - 1 maintainer
spidercloud-py 0.0.1
Python SDK for Spider Cloud API
1 version - Latest release: over 1 year ago - 7 downloads last month - 0 stars on GitHub - 1 maintainer
magicconvert 0.1.3
MagicConvert is a Python library that converts various document formats (PDF, DOCX, XLSX, PPTX, H...
3 versions - Latest release: 6 months ago - 71 downloads last month - 2 stars on GitHub - 1 maintainer
file2txt 1.0.5
file2txt is a Python library that takes common file formats and turns them into plain text (a txt file...
7 versions - Latest release: about 1 month ago - 885 downloads last month - 12 stars on GitHub - 1 maintainer
html2text-rs 0.2.5
Convert HTML to markdown or plain text
7 versions - Latest release: 2 months ago - 8.07 thousand downloads last month - 7 stars on GitHub - 1 maintainer
conv-html-to-markdown 0.1.311
Curate scraped HTML for easy interpretation by large language models. Build more robust generativ...
7 versions - Latest release: almost 2 years ago - 161 downloads last month - 3 stars on GitHub - 1 maintainer
Related Keywords
markdown (19)
web-scraping (12)
llm (11)
ai (9)
scraper (9)
crawler (7)
spider (6)
ai-scraping (6)
rag (6)
ai-agents (6)
pdf-to-markdown (6)
text-extraction (6)
ocr (5)
document-conversion (5)
content-extraction (5)
html (5)
document-processing (5)
converter (5)
llm-webcrawler (4)
html2md (4)
scraping (4)
web-crawler (4)
pdf (4)
tesseract-alternative (3)
document-to-markdown (3)
html2markdown (3)
paddleocr-alternative (3)
mineru-alternative (3)
markitdown-alternative (3)
marker-alternative (3)
docling-alternative (3)
unstructured-alternative (3)
ai-training-data (3)
document-understanding (3)
intelligent-document-processing (3)
image-processing (3)
supabase (3)
local-document-processing (3)
data-extraction (3)
structured-data-extraction (3)
table-extraction (3)
layout-detection (3)
llm-ready-data (3)
document-ai (3)
excel-to-markdown (3)
powerpoint-to-markdown (3)
word-to-markdown (3)
batch-document-processing (3)
webscraping (3)
web-scraper (3)
html2text (2)
offline-document-extractor (2)
image-to-markdown (2)
file-conversion (2)
pdf-to-json (2)
html-parser (2)
html-to-md (2)
web-automation (2)
html-to-markdown-converter (2)
offline-document-converter (2)
ppt-to-markdown (2)
web (2)
image-to-text (2)
bookmarks (2)
python (2)
python3 (2)
html5 (2)
web-search (2)
web-data-extraction (2)
ai-search (2)
ai-crawler (2)
firecrawl (2)
SDK (2)
web-data (2)
API (2)
image-to-xml (1)
img2xml (1)
img2markdown (1)
doc-to-docx (1)
doc2md (1)
image-to-md (1)
img2md (1)
image-to-docx (1)
img2docx (1)
image-to-doc (1)
doc-to-md (1)
doc2docx (1)
doc-to-image (1)
doc2img (1)
doc-to-zpl (1)
doc2zpl (1)
img2json (1)
image-to-json (1)
doc-to-html (1)
doc2html (1)
doc2print (1)
doc-to-pdf (1)
doc2pdf (1)
doc-to-print (1)
parser (1)