An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.

pypi.org "preprocessing" keyword

niaarm 0.4.6
A minimalistic framework for numerical association rule mining
33 versions - Latest release: 3 months ago - 2 dependent packages - 1 dependent repository - 494 downloads last month - 22 stars on GitHub - 1 maintainer
langmail 0.7.0
Email preprocessing for LLMs
4 versions - Latest release: 27 days ago - 1 star on GitHub - 1 maintainer
Top 9.7% on pypi.org
xmip 0.7.2
Analysis ready CMIP6 data the easy way
4 versions - Latest release: over 2 years ago - 1 dependent package - 1 dependent repository - 2.18 thousand downloads last month - 198 stars on GitHub - 1 maintainer
fmridata 0.11
A nifti utility
2 versions - Latest release: about 10 years ago - 1 dependent repository - 9 downloads last month - 1 maintainer
hsi-preprocessing-toolkit 2.3.0
HSI Preprocessing Toolkit
13 versions - Latest release: about 1 month ago - 72 downloads last month - 2 stars on GitHub - 1 maintainer
kditransform 1.2.0
Kernel density integral transformation
5 versions - Latest release: 6 months ago - 8.45 thousand downloads last month - 9 stars on GitHub - 1 maintainer
dmriprep 0.5.0
dMRIPrep is a robust and easy-to-use pipeline for preprocessing of diverse dMRI data.
8 versions - Latest release: about 5 years ago - 1 dependent repository - 54 downloads last month - 73 stars on GitHub - 1 maintainer
seqtools 1.4.1
A library for transparent transformation of indexable containers (lists, etc.)
13 versions - Latest release: almost 2 years ago - 2 dependent repositories - 250 downloads last month - 47 stars on GitHub - 1 maintainer
micofam 1.1.0
Micromechanical Composite Fatigue Modeler
2 versions - Latest release: over 1 year ago - 404 downloads last month - 0 stars on gitlab.com - 2 maintainers
Top 9.7% on pypi.org
niaaml 2.2.0
Python automated machine learning framework
34 versions - Latest release: 4 months ago - 2 dependent packages - 5 dependent repositories - 978 downloads last month - 34 stars on GitHub - 2 maintainers
dfclean-pipe 0.1.0
Automatic DataFrame cleaning: nulls, duplicates, types, outliers.
1 version - Latest release: 3 days ago - 95 downloads last month - 1 maintainer
media-preprocessor 10.0
tool for preprocessing media
12 versions - Latest release: over 5 years ago - 1 dependent repository - 11 downloads last month - 0 stars on GitHub - 2 maintainers
scalde-data-factory 0.0.1
Data preparation tools for data science projects
1 version - Latest release: almost 3 years ago - 6 downloads last month - 1 maintainer
autots-mcp 1.0.2
Convenience package to install AutoTS with MCP dependencies
3 versions - Latest release: 2 months ago - 167 downloads last month - 1,374 stars on GitHub - 1 maintainer
toughio 1.15.1
Pre- and post-processing Python library for TOUGH
64 versions - Latest release: over 1 year ago - 1 dependent repository - 372 downloads last month - 67 stars on GitHub - 1 maintainer
turkish-twitter-preprocess 0.0.7
A lightweight Python package to preprocess Turkish Twitter statuses (tweets).
7 versions - Latest release: about 5 years ago - 1 dependent repository - 32 downloads last month - 7 stars on GitHub - 1 maintainer
Top 3.2% on pypi.org
seqio-nightly 0.0.18.dev20250227
SeqIO: Task-based datasets, preprocessing, and evaluation for sequence models.
1,195 versions - Latest release: about 1 year ago - 3 dependent packages - 10 dependent repositories - 160 thousand downloads last month - 594 stars on GitHub - 1 maintainer
Top 3.1% on pypi.org
seqio 0.0.20
SeqIO: Task-based datasets, preprocessing, and evaluation for sequence models.
22 versions - Latest release: 7 months ago - 6 dependent packages - 137 dependent repositories - 287 thousand downloads last month - 594 stars on GitHub - 2 maintainers
utilsaxn 0.3.4
A modular set of data science utilities for EDA, cleaning, and more.
1 version - Latest release: 11 months ago - 15 downloads last month - 2 stars on GitHub - 1 maintainer
vflow 0.1.4
A framework for doing stability analysis with PCS.
7 versions - Latest release: about 2 years ago - 1 dependent repositories - 122 downloads last month - 72 stars on GitHub - 2 maintainers
datadoctor 1.0.15
A Python package for data cleaning and preprocessing.
14 versions - Latest release: almost 3 years ago - 23 downloads last month - 2 stars on GitHub - 1 maintainer
meeg-utils 0.1.7
A Python-based MEEG processing toolkit primarily based on MNE-Python.
6 versions - Latest release: about 1 month ago - 111 downloads last month - 0 stars on GitHub - 1 maintainer
mydatapreprocessing 3.0.3
Library/framework for making predictions.
53 versions - Latest release: over 3 years ago - 1 dependent package - 3 dependent repositories - 310 downloads last month - 1 star on GitHub - 1 maintainer
Top 1.5% on pypi.org
unstructured 0.18.24
A library that prepares raw documents for downstream ML tasks.
197 versions - Latest release: 3 months ago - 113 dependent packages - 3,374 dependent repositories - 3.2 million downloads last month - 13,122 stars on GitHub - 1 maintainer
dataspree-preprocessing 0.2.1
Python package for DS data augmentations and transforms.
1 version - Latest release: about 1 month ago - 205 downloads last month - 1 maintainer
Top 3.9% on pypi.org
neologdn 0.5.6 💰
Japanese text normalizer for mecab-neologd
18 versions - Latest release: 4 months ago - 2 dependent packages - 69 dependent repositories - 30.3 thousand downloads last month - 258 stars on GitHub - 1 maintainer
prepo 0.2.0
A Python package with automated data type detection, KNN imputation, outlier removal, and multipl...
9 versions - Latest release: 9 months ago - 28 downloads last month - 1 star on GitHub - 1 maintainer
ftir-prep 0.1.0
A framework for designing and evaluating optimal preprocessing pipelines for FTIR spectral data u...
1 version - Latest release: 4 months ago - 51 downloads last month - 1 maintainer
stripje 0.1.0
High-performance single-row inference compiler for scikit-learn pipelines with 2-10x speedup
1 version - Latest release: 5 months ago - 9 downloads last month - 0 stars on GitHub - 1 maintainer
morphopretext 0.2.0
A bilingual text preprocessing toolkit for English and Persian.
4 versions - Latest release: 8 months ago - 14 downloads last month - 1 star on GitHub - 1 maintainer
mlguide 1.0.0
Modular ML toolkit with a built-in guide system for students.
1 version - Latest release: about 1 month ago - 21 downloads last month - 1 maintainer
Top 8.3% on pypi.org
germalemma 0.1.3
A lemmatizer for German language text.
4 versions - Latest release: over 6 years ago - 1 dependent package - 9 dependent repositories - 470 downloads last month - 91 stars on GitHub - 2 maintainers
textinsight-nishtha 1.0.0
A lightweight NLP toolkit for text cleaning and basic linguistic insights.
1 version - Latest release: 5 months ago - 12 downloads last month - 1 maintainer
elikopy 0.2
A set of tools for analysing dMRI
1 version - Latest release: over 4 years ago - 1 dependent repository - 10 downloads last month - 19 stars on GitHub - 1 maintainer
wjmdatascience 0.0.1
A very basic data cleaning and preprocessing library
1 version - Latest release: about 1 year ago - 5 downloads last month - 1 maintainer
tno.quantum.optimization.qubo.preprocessors 1.0.0
QUBO preprocessors
1 version - Latest release: 11 months ago - 29 downloads last month - 1 star on GitHub - 1 maintainer
Top 7.5% on pypi.org
update-version 0.2.1 💰
Updates your project's version from a versioning file
8 versions - Latest release: about 1 month ago - 561 downloads last month - 1 maintainer
ez-autoprep 0.1.0
A library for automated data preprocessing
1 version - Latest release: 2 months ago - 96 downloads last month
Top 3.8% on pypi.org
autoreject 0.4.3
Automated rejection and repair of epochs in M/EEG.
10 versions - Latest release: over 2 years ago - 4 dependent packages - 41 dependent repositories - 8 thousand downloads last month - 146 stars on GitHub - 3 maintainers
autocleaneeg-pipeline 3.2.1
A modular framework for automated EEG data processing, built on MNE-Python
29 versions - Latest release: 22 days ago - 876 downloads last month - 2 stars on GitHub - 1 maintainer
medip 0.1.0
DICOM-standard medical image preprocessing toolkit: profile-driven, modality-extensible
1 version - Latest release: 6 days ago - 121 downloads last month - 1 maintainer
data-science-kit 0.0.1
Data Science Basic Functions
1 version - Latest release: almost 5 years ago - 1 dependent repository - 13 downloads last month - 1 star on GitHub - 1 maintainer
hotlib 1.0.33
Utilities for an AI-assisted mapping tool developed for HOT.
33 versions - Latest release: over 3 years ago - 1 dependent repository - 142 downloads last month - 1 maintainer
pydtk 0.3.2
A Python toolkit for managing, retrieving and processing data.
30 versions - Latest release: over 2 years ago - 4 dependent repositories - 529 downloads last month - 14 stars on GitHub - 1 maintainer
Top 4.7% on pypi.org
contextualspellcheck 0.4.4 💰
Contextual spell correction using BERT (bidirectional representations)
18 versions - Latest release: over 2 years ago - 1 dependent package - 4 dependent repositories - 19.1 thousand downloads last month - 419 stars on GitHub - 1 maintainer
missmixed 1.1.0
An Adaptive, Extensible and Configurable Multi-Layer Framework for Iterative Missing Value Imputa...
3 versions - Latest release: 6 months ago - 17 downloads last month - 0 stars on GitHub - 1 maintainer
cleanflo 0.1.0
A beginner-friendly Python package for easy data cleaning and preprocessing.
1 version - Latest release: about 1 year ago - 12 downloads last month - 1 maintainer
fifa-preprocessing 1.1.2
A package providing methods to preprocess data, with the intent to perform Machine Learning.
8 versions - Latest release: almost 6 years ago - 1 dependent repository - 47 downloads last month - 0 stars on GitHub - 1 maintainer
Top 7.4% on pypi.org
tokmor 1.2.10
Dependency-free, fast deterministic tokenizer + morphology splitter for 375 languages (~4.6MB)
11 versions - Latest release: 3 months ago - 114 downloads last month - 1 maintainer
silk-ml 0.1.1
Simple Intelligent Learning Kit (SILK) for Machine learning
5 versions - Latest release: over 6 years ago - 1 dependent repository - 41 downloads last month - 3 stars on GitHub - 1 maintainer
img-ops 0.1.2
Device-aware image operations (CPU/GPU/MPS fallback) with a clean Python API for preprocessing ta...
3 versions - Latest release: 8 months ago - 34 downloads last month - 1 star on GitHub - 1 maintainer
Top 6.3% on pypi.org
unstructured-inference 1.6.2
A library for performing inference using trained models.
117 versions - Latest release: 5 days ago - 8 dependent packages - 16 dependent repositories - 982 thousand downloads last month - 194 stars on GitHub - 1 maintainer
logprep 19.0.0
Logprep lets you collect, process, and forward log messages from various data sources.
56 versions - Latest release: 26 days ago - 1.34 thousand downloads last month - 35 stars on GitHub - 5 maintainers
dataset-doctor 1.0.1
Automatically diagnose and clean messy datasets for machine learning and data science.
2 versions - Latest release: 14 days ago - 202 downloads last month - 1 star on GitHub - 1 maintainer
cordon 1.0.1
Semantic anomaly detection for system log files
12 versions - Latest release: 9 days ago - 735 downloads last month - 149 stars on GitHub - 1 maintainer
Top 9.5% on pypi.org
py-autoclean 1.1.3
AutoClean - Python Package for Automated Preprocessing & Cleaning of Datasets
21 versions - Latest release: over 3 years ago - 1 dependent package - 1 dependent repository - 398 downloads last month - 294 stars on GitHub - 1 maintainer
ferroml 1.0.1
Statistically rigorous AutoML in Rust with Python bindings
2 versions - Latest release: 8 days ago - 338 downloads last month - 1 maintainer
preprocess-corpora 0.1.1
Preprocessing and sentence-aligning for parallel corpora
2 versions - Latest release: almost 6 years ago - 1 dependent repository - 15 downloads last month - 2 stars on GitHub - 1 maintainer
seanox-ai-nlp 1.3.0
Lightweight NLP components for semantic processing of domain-specific content.
5 versions - Latest release: 6 months ago - 34 downloads last month - 0 stars on GitHub - 1 maintainer
knyfe 0.4.2
A utility for rapid exploration and preprocessing of datasets.
1 version - Latest release: almost 14 years ago - 2 dependent repositories - 4 downloads last month - 54 stars on GitHub - 1 maintainer
itu-turkish-nlp-pipeline-caller 2.2.0
A wrapper tool to use ITU Turkish NLP Pipeline API
4 versions - Latest release: about 10 years ago - 1 dependent repository - 23 downloads last month - 45 stars on GitHub - 1 maintainer
table-toolkit 2025.11.9
A Python library for consistent preprocessing of tabular data with automatic type inference, cach...
9 versions - Latest release: 5 months ago - 38 downloads last month - 0 stars on GitHub - 1 maintainer
yosina 1.1.1
Japanese text transliteration library
5 versions - Latest release: 21 days ago - 439 downloads last month - 19 stars on GitHub - 1 maintainer
hot-fair-utilities 2.0.12 💰
Utilities for AI-Assisted Mapping fAIr
35 versions - Latest release: 12 months ago - 947 downloads last month - 11 stars on GitHub - 1 maintainer
poroscleanlit 0.2.0
ๆ”ฏๆŒ Markdown ไปฃ็ ๅ—ไธŽ LaTeX ๅ…ฌๅผไฟๆŠคใ€ๅ‚่€ƒๆ–‡็Œฎ่‡ชๅŠจ่ง„่Œƒใ€ไธญ่‹ฑๆŽ’็‰ˆไผ˜ๅŒ–็š„ไธ“ไธšๆธ…ๆด—ๅทฅๅ…ท
1 version - Latest release: 4 months ago - 11 downloads last month - 1 maintainer
declarativeenum 1.0.0
A declarative and flexible approach to Python enums with preprocessing, validation, and more
1 version - Latest release: over 1 year ago - 13 downloads last month - 1 maintainer
little-data-preprocessor 1.0.4
A pandas dataframe preprocessing python package
4 versions - Latest release: about 1 year ago - 14 downloads last month - 0 stars on GitHub - 1 maintainer
yaml-ml 1.0.0
Your whole ML pipeline in one YAML file.
1 version - Latest release: about 1 year ago - 14 downloads last month - 1 star on GitHub - 1 maintainer
autosweep-preprocessing 0.1.2
Flexible tabular data preprocessing utility with a single AutoSweep API
2 versions - Latest release: about 1 month ago - 55 downloads last month - 1 maintainer
ifcfill 0.1.0
Transformer of tabular data into Integer, Float and Categorical (IFC) variables with missing data...
1 version - Latest release: 12 days ago - 1 maintainer
Top 5.7% on pypi.org
torcharrow 0.1.0
A Pandas inspired, Arrow compatible, Velox supported dataframe library for PyTorch
9 versions - Latest release: over 3 years ago - 13 dependent repositories - 187 downloads last month - 646 stars on GitHub - 2 maintainers
pyspectrakit 1.9.6
Python toolkit for spectral data processing: baseline correction, normalization, smoothing, despi...
11 versions - Latest release: about 1 month ago - 1 maintainer
nlcodec 0.4.0
nlcodec is a collection of encoding schemes for natural language sequences. nlcodec.db is an effi...
10 versions - Latest release: over 4 years ago - 1 dependent package - 2 dependent repositories - 91 downloads last month - 5 stars on GitHub - 1 maintainer
Top 5.5% on pypi.org
nonechucks 0.4.2
nonechucks is a library that provides wrappers for PyTorch's datasets, samplers, and transforms t...
18 versions - Latest release: almost 5 years ago - 26 dependent repositories - 117 downloads last month - 377 stars on GitHub - 1 maintainer
uzpreprocessor 1.0.5
Uzbek text preprocessing library for converting numbers, dates, times, and currency to words
6 versions - Latest release: 4 months ago - 48 downloads last month - 1 maintainer
mercury-imgpprcs 0.0.1
Mercury: Image Pre-processing Open Source API for Artificial Intelligence
1 version - Latest release: about 5 years ago - 1 dependent repository - 8 downloads last month - 0 stars on GitHub - 1 maintainer
one-data-processing 0.0.14
Data Processing is used for data processing through MinIO, databases, Web APIs, etc.
14 versions - Latest release: about 2 years ago - 52 downloads last month - 109 stars on GitHub - 1 maintainer
pybear 0.2.3
Python modules for miscellaneous data analytics applications
4 versions - Latest release: 6 months ago - 39 downloads last month - 0 stars on GitHub
aidatapilot 0.1.1
Lightweight Intelligent Data Automation Engine: plug-and-play pipelines for everyone.
1 version - Latest release: 12 days ago - 188 downloads last month - 1 maintainer
designer2 2.0.12
designerV2
40 versions - Latest release: 12 months ago - 194 downloads last month - 31 stars on GitHub - 3 maintainers
Top 4.2% on pypi.org
nnaudio 0.3.4
A fast GPU audio processing toolbox with 1D convolutional neural network
36 versions - Latest release: 4 months ago - 4 dependent packages - 5 dependent repositories - 327 thousand downloads last month - 1,085 stars on GitHub - 1 maintainer
vibration-analysis 0.1.0
A library for vibration data preprocessing and analysis
1 version - Latest release: 5 months ago - 18 downloads last month - 1 maintainer
eclipsera 1.2.0
A comprehensive machine learning framework with 68 algorithms spanning classical ML, clustering, ...
2 versions - Latest release: 5 months ago - 15 downloads last month - 0 stars on GitHub - 1 maintainer
prepwizard 1.0.1
🧙‍♂️ Magical ML preprocessing that transforms your data with a wave of code
1 version - Latest release: 10 months ago - 18 downloads last month - 1 maintainer
paderborn-bearing 1.1.2
Preprocessed Paderborn Bearing Dataset for analysing multivariate motor current signals combined ...
6 versions - Latest release: 9 months ago - 1 dependent repository - 310 downloads last month - 7 stars on GitHub - 1 maintainer
preprokit 0.1.0.1
preprokit is a comprehensive data preprocessing library designed to streamline the data preparati...
2 versions - Latest release: over 1 year ago - 10 downloads last month - 0 stars on GitHub - 1 maintainer
seq-qc 2.0.4
utilities for performing various preprocessing steps on sequencing reads
10 versions - Latest release: about 8 years ago - 2 dependent repositories - 16 downloads last month - 0 stars on GitHub - 1 maintainer
demv 1.0.2
Debiaser for Multiple Variables (DEMV) is a pre-processing algorithm for binary and multi-class da...
3 versions - Latest release: about 2 years ago - 34 downloads last month - 0 stars on GitHub - 1 maintainer
evopreprocess 0.5.0
Data Preprocessing with Evolutionary and Nature Inspired Algorithms.
15 versions - Latest release: about 3 years ago - 1 dependent repository - 74 downloads last month - 9 stars on GitHub - 1 maintainer
prep-ml 0.1.1
Preprocessing for ML models made easy.
2 versions - Latest release: almost 5 years ago - 1 dependent repository - 15 downloads last month - 1 star on GitHub - 1 maintainer
langchain-addons 0.0.2
...
3 versions - Latest release: almost 3 years ago - 29 downloads last month - 1 maintainer
maldiamrkit 0.9.0
A comprehensive toolkit for MALDI-TOF mass spectrometry data preprocessing for antimicrobial resi...
15 versions - Latest release: 14 days ago - 236 downloads last month - 2 stars on GitHub - 1 maintainer
tweet-nlp-toolkit 1.0.5
NLP toolkit for tweets
6 versions - Latest release: almost 3 years ago - 1 dependent repository - 31 downloads last month - 1 star on GitHub - 1 maintainer
df2onehot 1.2.2 💰
df2onehot is a Python package that converts a pandas DataFrame into a structured DataFrame.
30 versions - Latest release: about 2 months ago - 3 dependent packages - 5 dependent repositories - 9.97 thousand downloads last month - 3 stars on GitHub - 1 maintainer
Top 4.6% on pypi.org
nvtabular 23.8.0
NVTabular is a feature engineering and preprocessing library for tabular data designed to quickly...
34 versions - Latest release: over 2 years ago - 24 dependent repositories - 7.36 thousand downloads last month - 1,031 stars on GitHub - 3 maintainers
mkdocs-mermaid-to-image 1.3.1
MkDocs plugin to preprocess Mermaid diagrams into static images
7 versions - Latest release: 9 months ago - 639 downloads last month - 0 stars on GitHub - 1 maintainer
visionner 0.0.7
Turn a raw image dataset into NumPy arrays, a form more suitable for deep learning tasks
7 versions - Latest release: almost 3 years ago - 28 downloads last month - 10 stars on GitHub - 1 maintainer
pywatts 0.3.0
A python time series pipelining project
1 version - Latest release: about 4 years ago - 1 dependent repository - 1.16 thousand downloads last month - 1 maintainer
adjdatatools 0.4.0
This library contains adjusted tools for data preprocessing and working with mixed data types.
5 versions - Latest release: about 5 years ago - 1 dependent repository - 99 downloads last month - 21 stars on GitHub - 1 maintainer
rbat 0.1.1
Toolkit for preprocessing rat time-spatial data and quantifying their (artificially induced) OCD-...
3 versions - Latest release: 11 months ago - 17 downloads last month - 0 stars on GitHub - 1 maintainer