pypi.org "extract" keyword
View the packages on the pypi.org package registry that are tagged with the "extract" keyword.
aadhaar-py 2.0.2
Extract embedded information from Aadhaar Secure QR Code.
4 versions - Latest release: about 3 years ago - 1 dependent repository - 285 downloads last month - 13 stars on GitHub - 1 maintainer
scancode-toolkit-mini 32.3.2
ScanCode is a tool to scan code for license, copyright, package and their documented dependencies...
Top 8.7% on pypi.org - 38 versions - Latest release: 3 months ago - 1 dependent package - 1 dependent repository - 1.37 thousand downloads last month - 1,867 stars on GitHub - 3 maintainers
tzar 0.1.5
Manage: Tar, Zip, Anything Really!
6 versions - Latest release: over 3 years ago - 1 dependent repository - 233 downloads last month - 38 stars on GitHub - 1 maintainer
cowpoke 0.0.2
Code in support of the 'Cooking with Python and KBpedia' (CWPK) series
2 versions - Latest release: about 1 month ago - 1 dependent repository - 114 downloads last month - 25 stars on GitHub - 1 maintainer
serpextract 0.7.3
Easy extraction of keywords from search engine results pages (SERPs).
26 versions - Latest release: over 3 years ago - 2 dependent repositories - 546 downloads last month - 90 stars on GitHub - 4 maintainers
irv-autopkg-client 0.4.0
A client for accessing the infra-risk-vis autopackage API
6 versions - Latest release: 1 day ago - 258 downloads last month - 0 stars on GitHub - 2 maintainers
markout-html 0.1.7
A small Python package to extract content from web pages.
8 versions - Latest release: about 5 years ago - 1 dependent repository - 262 downloads last month - 2 stars on GitHub - 1 maintainer
scancode-toolkit 32.3.3
ScanCode is a tool to scan code for license, copyright, package and their documented dependencies...
Top 1.9% on pypi.org - 65 versions - Latest release: about 1 month ago - 12 dependent packages - 68 dependent repositories - 50.2 thousand downloads last month - 2,254 stars on GitHub - 4 maintainers
tracextracturl 0.2.7030
Provides an `extract_url` method to extract the URL from TracWiki links.
3 versions - Latest release: over 15 years ago - 1 dependent repository - 155 downloads last month - 2 maintainers
google-voice-parser 0.1.1
Parse SMS from Google Voice
2 versions - Latest release: almost 5 years ago - 1 dependent repository - 86 downloads last month - 22 stars on GitHub - 1 maintainer
eml-extractor 0.1.1
A CLI tool to extract attachments from .eml files (email messages saved as files)
1 version - Latest release: over 3 years ago - 1 dependent repository - 2.25 thousand downloads last month - 13 stars on GitHub - 1 maintainer
miavisc 1.1.1
Miavisc is a video to slide converter
1 version - Latest release: about 21 hours ago - 0 stars on GitHub - 1 maintainer
baet 1.0.1
A tool to bulk extract audio tracks from video using FFmpeg
13 versions - Latest release: about 1 year ago - 514 downloads last month - 0 stars on GitHub - 1 maintainer
sling-mac-amd64 1.4.5
Sling Binary for Mac (AMD64)
37 versions - Latest release: 1 day ago - 3.45 thousand downloads last month - 47 stars on GitHub - 1 maintainer
sling 1.4.5
Slings data from a source to a target
Top 9.4% on pypi.org - 204 versions - Latest release: 1 day ago - 1 dependent package - 3 dependent repositories - 108 thousand downloads last month - 47 stars on GitHub - 1 maintainer
sling-linux-amd64 1.4.5
Sling Binary for Linux (AMD64)
55 versions - Latest release: 1 day ago - 90.5 thousand downloads last month - 47 stars on GitHub - 1 maintainer
sling-linux-arm64 1.4.5
Sling Binary for Linux (ARM64)
55 versions - Latest release: 1 day ago - 8.15 thousand downloads last month - 47 stars on GitHub - 1 maintainer
sling-windows-amd64 1.4.5
Sling Binary for Windows
55 versions - Latest release: 1 day ago - 4.1 thousand downloads last month - 47 stars on GitHub - 1 maintainer
sling-mac-arm64 1.4.5
Sling Binary for Mac (ARM64)
37 versions - Latest release: 1 day ago - 21.4 thousand downloads last month - 47 stars on GitHub - 1 maintainer
kai 0.0.7
A Python package that gives you the power to extract any compressed file using the same simple sy...
5 versions - Latest release: over 8 years ago - 1 dependent repository - 227 downloads last month - 10 stars on GitHub - 1 maintainer
excalibur-py 1.0.1
A web interface to extract tabular data from PDFs
Top 7.2% on pypi.org - 12 versions - Latest release: 4 months ago - 10 dependent repositories - 1.64 thousand downloads last month - 1,457 stars on GitHub - 1 maintainer
pandleau 0.4.1
A quick and easy way to convert a Pandas DataFrame to a Tableau extract.
3 versions - Latest release: over 5 years ago - 1 dependent repository - 5.16 thousand downloads last month - 62 stars on GitHub - 2 maintainers
sfx7z 0.2.0
py7zr wrapper with SFX options.
3 versions - Latest release: over 3 years ago - 1 dependent repository - 72 downloads last month - 1 maintainer
rollet 0.1.5
Collect data from various sources
16 versions - Latest release: about 3 years ago - 1 dependent repository - 436 downloads last month - 2 maintainers
docx2txt 0.9
A pure python-based utility to extract text and images from docx files.
Top 1.6% on pypi.org - 9 versions - Latest release: 25 days ago - 119 dependent packages - 2,595 dependent repositories - 1.39 million downloads last month - 486 stars on GitHub - 1 maintainer
adharvester 0.0.2
Harvester for any type of online shops and advertisements
2 versions - Latest release: over 3 years ago - 95 downloads last month - 0 stars on GitHub - 1 maintainer
pdfwordify 0.0.1
Tool for extracting text and tables from PDF files and saving this data in docx format
1 version - Latest release: 12 months ago - 66 downloads last month - 1 star on GitHub - 1 maintainer
json-inline 0.1.8
JSON Low Code - Extract data inline
7 versions - Latest release: over 3 years ago - 1 dependent repository - 252 downloads last month - 14 stars on GitHub - 1 maintainer
gesichtfinder 0.10
Extracts faces from an image using different backend detectors and saves the results in a DataFrame.
1 version - Latest release: almost 2 years ago - 57 downloads last month - 0 stars on GitHub - 1 maintainer
rqmexcelimporter 0.1
Converts ".rqms" files to Excel with Python
1 version - Latest release: almost 3 years ago - 1 dependent repository - 27 downloads last month - 0 stars on GitHub - 1 maintainer
etl2osm 0.2.0
Extract, Transform and Load to OpenStreetMap.
4 versions - Latest release: over 8 years ago - 2 dependent repositories - 151 downloads last month - 4 stars on GitHub - 1 maintainer
sukhoi 1.1.0
Minimalist and powerful Web Crawler.
9 versions - Latest release: over 4 years ago - 1 dependent repository - 206 downloads last month - 881 stars on GitHub - 1 maintainer
gimagegrabber 0.1.16
Tools to download images from Google search
30 versions - Latest release: about 5 years ago - 1 dependent repository - 875 downloads last month - 7,333 stars on GitHub - 1 maintainer
extract-6a6f6a6f 1.0.1
Just pull an application from an Android device
2 versions - Latest release: almost 3 years ago - 1 dependent repository - 75 downloads last month - 4 stars on GitHub - 1 maintainer
twittergraph 1.2.5
TwitterGraph: data extractor and graph generator
9 versions - Latest release: almost 7 years ago - 1 dependent repository - 208 downloads last month - 1 maintainer
poplines 2.1.1
Tools to pop/peek lines from the head/tail or known position within a given file, and output to s...
6 versions - Latest release: almost 8 years ago - 1 dependent repository - 140 downloads last month - 5 stars on GitHub - 1 maintainer
sqldatamodel 1.3.0
SQLDataModel is a lightweight dataframe library designed for efficient data extraction, transform...
83 versions - Latest release: 2 months ago - 1.83 thousand downloads last month - 6 stars on GitHub - 1 maintainer
pyisotools 2.4.7
Simple python library for extracting and rebuilding ISOs
42 versions - Latest release: 3 months ago - 1 dependent repository - 1.17 thousand downloads last month - 26 stars on GitHub - 1 maintainer
datatreegrab 1.03.03
Node-Tree based data extraction
13 versions - Latest release: over 1 year ago - 1 dependent repository - 26 downloads last month - 3 stars on GitHub - 1 maintainer
openerp-client-etl 1.1.1
OpenERP ETL Client lets you extract, transform and load data from any data source.
3 versions - Latest release: over 10 years ago - 2 dependent repositories - 142 downloads last month - 1 maintainer
extractc 0.2.2
Extract Vietnamese Card
10 versions - Latest release: about 4 years ago - 1 dependent repository - 275 downloads last month - 5 stars on GitHub - 1 maintainer
property-lister 3.2
Extract and convert property list files from SQLite database files and from other property list f...
19 versions - Latest release: about 1 month ago - 160 downloads last month - 6 stars on GitHub - 1 maintainer
textmater 0.1
Extract Structured Data from text
1 version - Latest release: 11 months ago - 56 downloads last month - 1 maintainer
summarext 0.0.1
Extracts the most important keywords from any website
1 version - Latest release: about 4 years ago - 1 dependent repository - 53 downloads last month - 13 stars on GitHub - 1 maintainer
magnet4c 0.0.2
Corpora operator for multiple file formats.
2 versions - Latest release: over 5 years ago - 1 dependent repository - 96 downloads last month - 1 maintainer
dumptls 0.5.4
A tool to download TLS certificates including intermediate and root CA certs
8 versions - Latest release: 10 months ago - 398 downloads last month - 2 stars on GitHub - 1 maintainer
scancodeio 34.10.1
Automate software composition analysis pipelines
35 versions - Latest release: 24 days ago - 1.56 thousand downloads last month - 130 stars on GitHub - 3 maintainers
auto-unpack 2.12.0
An automatic archive extraction tool that supports many archive formats. By combining plugins and arranging them into a workflow, it can cover everyday extraction needs.
15 versions - Latest release: 4 months ago - 669 downloads last month - 16 stars on GitHub - 1 maintainer
plantextract 0.3.0
Python Plant Extract Document processor
4 versions - Latest release: over 11 years ago - 2 dependent repositories - 156 downloads last month - 1 star on GitHub - 1 maintainer
bulby 0.0.1.dev0
Manages the Philips Hue lightbulbs
1 version - Latest release: about 10 years ago - 2 dependent repositories - 33 downloads last month - 120 stars on GitHub - 1 maintainer
image2gps 1.4.0
Extract time and coords from image
5 versions - Latest release: over 1 year ago - 138 downloads last month - 0 stars on GitHub - 1 maintainer
cherrypicker 0.4.0
Pluck and flatten complex data.
Top 9.6% on pypi.org - 8 versions - Latest release: over 3 years ago - 1 dependent package - 1 dependent repository - 12.1 thousand downloads last month - 1 maintainer
eatiht 0.1.14
A simple tool used to extract an article's text from HTML documents.
14 versions - Latest release: about 10 years ago - 11 dependent repositories - 358 downloads last month - 435 stars on GitHub - 1 maintainer
pyresparser 1.0.6
A simple resume parser used for extracting information from resumes
Top 3.5% on pypi.org - 6 versions - Latest release: over 5 years ago - 90 dependent repositories - 9.4 thousand downloads last month - 854 stars on GitHub - 1 maintainer
resparserok 0.0.7
A simple resume parser used for extracting information from resumes
7 versions - Latest release: over 1 year ago - 190 downloads last month - 854 stars on GitHub - 1 maintainer
biosutilities 25.4.2
Various BIOS Utilities for Modding/Research
13 versions - Latest release: 16 days ago - 870 downloads last month - 878 stars on GitHub - 1 maintainer
docxlatex 0.1.6
Extract text from .docx files with support for inserted equations
7 versions - Latest release: almost 4 years ago - 1 dependent repository - 559 downloads last month - 13 stars on GitHub - 1 maintainer
pdfpages 0.1.0
Extract pages from PDF documents
Top 7.7% on pypi.org - 1 version - Latest release: over 7 years ago - 6 dependent repositories - 1.29 thousand downloads last month - 1 maintainer
ftransc 7.0.3
ftransc is a python library for converting audio files across various formats.
19 versions - Latest release: about 4 years ago - 1 dependent package - 11 dependent repositories - 509 downloads last month - 17 stars on GitHub - 1 maintainer
pdftabextract 0.3.0
A set of tools for data mining (OCR-processed) PDFs
Top 4.7% on pypi.org - 5 versions - Latest release: over 7 years ago - 12 dependent repositories - 579 downloads last month - 2,234 stars on GitHub - 2 maintainers
ppt2txt 0.1.0
A pure python based utility to extract text from PPT files.
1 version - Latest release: over 1 year ago - 691 downloads last month - 1 star on GitHub - 1 maintainer
xextract 0.1.9
Extract structured data from HTML and XML documents like a boss.
18 versions - Latest release: 4 months ago - 1 dependent package - 4 dependent repositories - 615 downloads last month - 50 stars on GitHub - 1 maintainer
dlt-dataops 0.5.4a0
dlt is an open-source python-first scalable data loading library that does not require any backen...
1 version - Latest release: 8 months ago - 64 downloads last month - 3,458 stars on GitHub - 1 maintainer
mongo2pq 0.1.0
data load tool (dlt) is an open source Python library that makes data loading easy
1 version - Latest release: over 1 year ago - 57 downloads last month - 3,458 stars on GitHub - 1 maintainer
dlt 1.9.0
dlt is an open-source python-first scalable data loading library that does not require any backen...
Top 3.6% on pypi.org - 125 versions - Latest release: 23 days ago - 5 dependent packages - 23 dependent repositories - 1.49 million downloads last month - 2,522 stars on GitHub - 2 maintainers
coordinates-extractor 0.4
Extract Coordinates from semi-structured text, like Wikipedia xml
3 versions - Latest release: about 8 years ago - 78 downloads last month - 1 star on GitHub - 1 maintainer
icnsutil 1.1.0
A fully-featured python library to handle reading and writing icns files.
3 versions - Latest release: about 2 years ago - 1 dependent package - 1 dependent repository - 2.34 thousand downloads last month - 34 stars on GitHub - 1 maintainer
pyxurls 0.1.3
A regular expression based URL extractor which extracts URLs from text.
3 versions - Latest release: almost 4 years ago - 1 dependent repository - 5.29 thousand downloads last month - 3 stars on GitHub - 1 maintainer
extract-values 1.0
A Python module for extracting values out of a string using a simple pattern instead of a regular...
1 version - Latest release: over 12 years ago - 2 dependent repositories - 32 downloads last month - 5 stars on GitHub - 1 maintainer
get-spotlights 3.2
Love the Windows 10 Spotlight images that show up on the lock screen? Then here's a simple program to cop...
3 versions - Latest release: about 2 years ago - 1 dependent repository - 134 downloads last month - 0 stars on GitHub - 1 maintainer
skype-chat-history 0.0.3
Skype Chat History Extractor
3 versions - Latest release: over 4 years ago - 1 dependent repository - 72 downloads last month - 1 maintainer
dryjq 1.9.1
Drastically Reduced YAML / JSON Query
23 versions - Latest release: over 1 year ago - 797 downloads last month - 1 star on gitlab.com - 1 maintainer
pdf2doi 1.5.1
A python library/command-line tool to extract the DOI or other identifiers of a scientific paper...
Top 8.1% on pypi.org - 50 versions - Latest release: about 1 year ago - 5 dependent packages - 3 dependent repositories - 3.21 thousand downloads last month - 90 stars on GitHub - 1 maintainer
unitypackage-extractor 1.1.0
Extractor for .unitypackage files
Top 8.3% on pypi.org - 10 versions - Latest release: over 3 years ago - 2 dependent repositories - 754 downloads last month - 614 stars on GitHub - 1 maintainer
pptx2txt2 1.1.0
Extract text from .pptx and .odp files to strings in pure python.
4 versions - Latest release: about 1 year ago - 561 downloads last month - 0 stars on GitHub - 1 maintainer
colorgram.py 1.2.0
A Python module for extracting colors from images. Get a palette of any picture!
Top 3.1% on pypi.org - 3 versions - Latest release: over 6 years ago - 3 dependent packages - 82 dependent repositories - 60.3 thousand downloads last month - 461 stars on GitHub - 1 maintainer
unpy2exe 0.4
Extract pyc files from py2exe executable.
Top 5.6% on pypi.org - 2 versions - Latest release: about 7 years ago - 12 dependent repositories - 7.95 thousand downloads last month - 273 stars on GitHub - 1 maintainer
ubittool 0.7.0
Tool to interface with the BBC micro:bit.
3 versions - Latest release: over 1 year ago - 131 downloads last month - 16 stars on GitHub - 1 maintainer
webfocus 0.13
Extract the content from HTML docs
4 versions - Latest release: almost 8 years ago - 1 dependent repository - 182 downloads last month - 1 maintainer
ts-cc-extractor 0.0.3
Extract EIA-608 captions from Video Transport Stream (*.ts) file and convert them to SubRip (*.srt...
3 versions - Latest release: almost 3 years ago - 1 dependent repository - 118 downloads last month - 2 stars on GitHub - 1 maintainer
googleplaystorescrape 0.0.4
GooglePlayStoreScrape Package for Scraping Play Store Reviews and App Information
4 versions - Latest release: almost 4 years ago - 1 dependent repository - 181 downloads last month - 2 stars on GitHub - 1 maintainer
django_etl 0.1.0
A Django application that provides a management command to make using the petl library easier.
1 version - Latest release: over 9 years ago - 2 dependent repositories - 41 downloads last month - 3 stars on GitHub - 1 maintainer
sling-mac 0.0.dev0
Sling Binary for Mac
1 version - Latest release: about 1 year ago - 15 stars on GitHub - 1 maintainer
preprocess-docs 0.0.6
An open source document preprocessor for AI.
6 versions - Latest release: about 1 year ago - 199 downloads last month - 0 stars on GitHub - 1 maintainer
sling-mac-universal 1.2.2
Sling Binary for Mac (AMD64 & ARM64)
18 versions - Latest release: about 1 year ago - 579 downloads last month - 44 stars on GitHub - 1 maintainer
ulsid 0.0.2
Analyses, generates and extracts student numbers of the University of Limpopo (UL)
2 versions - Latest release: over 2 years ago - 79 downloads last month - 0 stars on GitHub - 1 maintainer
sqlnosql 0.2.0
Push semi-structured data (e.g. JSON documents) into a database with a minimum of fuss. Includes ...
2 versions - Latest release: over 10 years ago - 2 dependent repositories - 37 downloads last month - 1 maintainer
df-extract 0.0.2
DecisionFacts Extraction Library extracts content from PDF, PPTX, Docx, png, jpg., and convert as...
3 versions - Latest release: over 1 year ago - 1 dependent package - 1 dependent repository - 92 downloads last month - 14 stars on GitHub - 1 maintainer
epub-extract-jpeg 0.6.2
Extract comic EPUB pages to Jpeg files.
5 versions - Latest release: about 8 years ago - 2 dependent repositories - 134 downloads last month - 2 stars on GitHub - 1 maintainer
cabarchive 0.2.4
A pure-python library for creating and extracting cab files
Top 9.3% on pypi.org - 6 versions - Latest release: about 3 years ago - 6 dependent repositories - 28.6 thousand downloads last month - 19 stars on GitHub - 2 maintainers
dejacode 5.0.0
Automate open source license compliance and ensure supply chain integrity
1 version - Latest release: over 1 year ago - 38 downloads last month - 19 stars on GitHub - 3 maintainers
serpextract-meiqia 2016.8.16
Easy extraction of keywords from search engine results pages (SERPs).
10 versions - Latest release: over 8 years ago - 1 dependent repository - 264 downloads last month - 1 star on GitHub - 1 maintainer
extrom 0.1.2
ROM Extraction tool from archived ROM set file
2 versions - Latest release: about 8 years ago - 1 dependent repository - 58 downloads last month - 0 stars on GitHub - 1 maintainer
pdf4me 0.8.24
Provides expert functionality to convert, optimize, merge, split, ocr, print documents & PDFs.
24 versions - Latest release: over 4 years ago - 1 dependent repository - 1.73 thousand downloads last month - 1 star on GitHub - 1 maintainer
nyctibius 0.0.13
Nyctibius is a Python package for gathering and consolidating socio-demographic data.
14 versions - Latest release: about 1 year ago - 321 downloads last month - 2 stars on GitHub - 1 maintainer
dirlister 1.0.0
Generates wordlists to use for enumeration and brute-forcing files and directories
2 versions - Latest release: almost 2 years ago - 57 downloads last month - 30 stars on GitHub - 1 maintainer
wikiframe 0.0.6
Convert all CSV files in a folder to a dictionary of DataFrames and more
6 versions - Latest release: over 2 years ago - 1 dependent repository - 202 downloads last month - 2 stars on GitHub - 1 maintainer
mboxattachments 0.5.35
Utility for exporting attachments from mbox files
1 version - Latest release: about 8 years ago - 1 dependent repository - 24 downloads last month - 1 maintainer
urlextract-py2.7 0.0.2
Collects and extracts URLs from given text. Forked from https://pypi.python.org/pypi/urlextract.
2 versions - Latest release: over 7 years ago - 1 dependent package - 58 downloads last month - 0 stars on GitHub - 1 maintainer
ext-util 1.0
Extract things without having to worry about parameters.
1 version - Latest release: over 9 years ago - 2 dependent repositories - 33 downloads last month - 2 stars on GitHub - 1 maintainer
Related Keywords
python: 55
load: 26
etl: 25
text: 22
transform: 19
pdf: 16
archive: 15
images: 15
data: 14
python3: 14
parse: 13
html: 12
convert: 12
extractor: 12
extraction: 11
elt: 11
parser: 11
audio: 9
json: 9
docx: 9
zip: 9
sling: 8
tar: 8
excel: 8
url: 7
content: 7
image: 7
csv: 7
file: 7
video: 7
package: 7
compress: 6
unpack: 6
scraper: 5
pypi-package: 5
wiki: 5
scraping: 5
word: 5
library: 5
merge: 5
find: 5
search: 5
pages: 5
scan: 5
package-url: 4
filter: 4
decompress: 4
author: 4
unzip: 4
web: 4
license: 4
collect: 4
cli: 4
parsing: 4
jpeg: 4
data-engineering: 4
cyclonedx: 4
spdx: 4
sca: 4
converter: 4
licensing: 4
hacktoberfest: 4
open source: 4
article: 4
urls: 4
remove: 4
dependency: 4
xlsx: 4
tld: 4
conversion: 4
copyright: 4
download: 4
pandas: 4
filetype: 4
purl: 4
ffmpeg: 4
table: 4
metadata: 4
rename: 3
comments: 3
title: 3
files: 3
webpage: 3
dataframe: 3
decompression: 3
domain: 3
gzip: 3
bzip2: 3
dictionary: 3
Wiktionary: 3
pptx: 3
pack: 3
utility: 3
python-library: 3
mining: 3
image-processing: 3
tables: 3
ppt: 3
data-lake: 3
data-loading: 3