An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.

pypi.org "data processing" keyword

Top 1.9% on pypi.org
datatable 1.1.0
Python library for fast multi-threaded data manipulation and munging.
15 versions - Latest release: over 2 years ago - 18 dependent packages - 78 dependent repositories - 44.5 thousand downloads last month - 1,882 stars on GitHub - 1 maintainer
checkpointer 2.14.10
checkpointer adds code-aware caching to Python functions, maintaining correctness and speeding up...
50 versions - Latest release: 5 months ago - 2 dependent repositories - 558 downloads last month - 6 stars on GitHub - 1 maintainer
Top 9.0% on pypi.org
csv-detective 0.11.2
Detect tabular files column content
233 versions - Latest release: 1 day ago - 2 dependent packages - 3 dependent repositories - 3.19 thousand downloads last month - 48 stars on GitHub - 1 maintainer
space-packet-parser 6.1.2
A CCSDS telemetry packet decoding library based on the XTCE packet format description standard.
28 versions - Latest release: 2 days ago - 8.55 thousand downloads last month - 31 stars on GitHub - 1 maintainer
chunklet-py 2.2.0
High-fidelity context-aware chunking and interactive visualization for RAG. Advanced segmentation...
8 versions - Latest release: about 1 month ago - 283 downloads last month - 62 stars on GitHub - 1 maintainer
memories-dev 2.0.8
Collective Memory Infrastructure for AGI
10 versions - Latest release: 11 months ago - 48 downloads last month - 11 stars on GitHub - 1 maintainer
aaiclick 0.0.8
A Python framework that translates Python code into ClickHouse operations for big data computing
3 versions - Latest release: 4 days ago - 284 downloads last month - 0 stars on GitHub - 1 maintainer
liwancai-pytools 1.0.0
A practical Python utility library providing helper functions for data processing, dates and times, dictionary and list operations, and more
1 version - Latest release: 5 days ago - 1 maintainer
geniusrise-databases 0.1.4
listeners bolts for geniusrise
4 versions - Latest release: over 2 years ago - 14 downloads last month - 2 stars on GitHub - 1 maintainer
fibphoflow 0.1.8
Python package to process and visualize TDT fiber photometry data
3 versions - Latest release: almost 3 years ago - 24 downloads last month - 1 maintainer
pysetl 1.2.1
A PySpark ETL Framework
14 versions - Latest release: 8 months ago - 79 downloads last month - 5 stars on GitHub - 1 maintainer
scmcallib 0.5.1
Perform calibration for simple climate models
20 versions - Latest release: almost 6 years ago - 1 dependent repository - 91 downloads last month - 0 stars on gitlab.com - 2 maintainers
appligator 0.0.8
An application package bundler
5 versions - Latest release: 5 months ago - 94 downloads last month - 0 stars on GitHub - 1 maintainer
fiiireflyyy 0.3.0
A python package covering miscellaneous uses, from system management to machine learning and imag...
35 versions - Latest release: about 1 year ago - 1 dependent repository - 79 downloads last month - 1 maintainer
procodile 0.0.8
A light-weight processor development framework
5 versions - Latest release: 5 months ago - 81 downloads last month - 3 stars on GitHub - 1 maintainer
conveyor-streaming 1.2.1
A Python library for streamlining asynchronous streaming tasks and pipelines.
5 versions - Latest release: 8 months ago - 24 downloads last month - 2 stars on GitHub - 1 maintainer
gavicore 0.0.8
Pydantic data models and common utilities for other Eozilla packages
6 versions - Latest release: 5 months ago - 87 downloads last month - 3 stars on GitHub - 1 maintainer
sportradar-unofficial 0.1.15
An unofficial python package to access sportradar NFL APIs.
13 versions - Latest release: about 2 years ago - 64 downloads last month - 0 stars on GitHub - 1 maintainer
wraptile 0.0.8
FastAPI server that implements the OGC API - Processes
5 versions - Latest release: 5 months ago - 68 downloads last month - 3 stars on GitHub - 1 maintainer
cuiman 0.0.8
Provides a client Python API, GUI, and CLI for servers compliant with OGC API - Processes
5 versions - Latest release: 5 months ago - 85 downloads last month - 3 stars on GitHub - 1 maintainer
zeef 0.1.3
A Python Framework for Deep Active Learning
4 versions - Latest release: about 4 years ago - 1 dependent repository - 21 downloads last month - 1 maintainer
llmbuilder 2.0.0
A comprehensive toolkit for building, training, and deploying language models
10 versions - Latest release: 5 months ago - 20 downloads last month - 0 stars on GitHub - 1 maintainer
eozilla 0.0.8
Comprises all packages of the Eozilla suite
6 versions - Latest release: 5 months ago - 45 downloads last month - 0 stars on GitHub - 1 maintainer
csvuniondiff 0.0.0.dev1
A package for comparing CSV-like files through union and difference operations.
2 versions - Latest release: over 1 year ago - 36 downloads last month - 7 stars on GitHub - 1 maintainer
rezolve-ai-ingestion 0.1.4
A private package for ingesting and processing SharePoint data with AI capabilities
4 versions - Latest release: over 1 year ago - 19 downloads last month - 6,176 stars on GitHub - 1 maintainer
crystflow 0.0.1
Name reservation for WIP package
1 version - Latest release: 4 months ago - 29 downloads last month - 1 maintainer
cross_ml 2.0.1
⚠️ DEPRECATED: Please use BeaverFE instead (https://pypi.org/project/beaverfe/)
9 versions - Latest release: 9 months ago - 103 downloads last month - 0 stars on GitHub - 1 maintainer
blossom-data 0.5.1
A simple way to synthesize LLM training data.
7 versions - Latest release: 2 months ago - 41 downloads last month - 25 stars on GitHub - 1 maintainer
geniusrise 0.1.7
An LLM framework
50 versions - Latest release: about 2 years ago - 9 dependent packages - 1 dependent repository - 207 downloads last month - 61 stars on GitHub - 1 maintainer
data-preprocessing-library-sevvalcucuk-asudesozcu 1.1.7
A comprehensive toolkit for data processing including handling dates, encoding categorical variab...
2 versions - Latest release: almost 2 years ago - 21 downloads last month - 0 stars on GitHub - 2 maintainers
arus 1.1.22
Activity Recognition with Ubiquitous Sensing
59 versions - Latest release: about 5 years ago - 1 dependent repository - 440 downloads last month - 0 stars on GitHub - 1 maintainer
damast 0.2.4
Package to improve the development of transparent, replicable data processing pipelines
18 versions - Latest release: about 1 month ago - 264 downloads last month - 1 star on GitHub - 1 maintainer
pastaq 0.11.4
Pipelines And Systems for Threshold Avoiding Quantification (PASTAQ): Pre-processing tools for LC...
7 versions - Latest release: 4 months ago - 1 dependent repository - 192 downloads last month - 7 stars on GitHub - 4 maintainers
sandai-operator-sdk 0.4.5
A Python SDK for building high-performance, asynchronous batch processing operators
7 versions - Latest release: 11 days ago - 549 downloads last month - 1 maintainer
tensorneko 0.3.25
Tensor Neural Engine Kompanion. An util library based on PyTorch and PyTorch Lightning.
95 versions - Latest release: 21 days ago - 1 dependent repository - 555 downloads last month - 10 stars on GitHub - 1 maintainer
binary-rain-helper-buddy 0.0.1
Aims to simplify and help with commonly used functions in the cloud and data processing areas.
1 version - Latest release: over 1 year ago - 53 downloads last month - 1 maintainer
bertrand 0.0.1
(in development) Type-safe language bindings for Python/C++
1 version - Latest release: almost 2 years ago - 644 downloads last month - 2 stars on GitHub - 1 maintainer
invoice-generator-pdf 1.0.0
A Python package designed to effortlessly convert Excel-based invoices into professionally format...
1 version - Latest release: over 1 year ago - 32 downloads last month - 1 maintainer
joinem 0.11.1
CLI for fast, flexible concatenation of tabular data using Polars.
21 versions - Latest release: 6 months ago - 64.2 thousand downloads last month - 16 stars on GitHub - 1 maintainer
scalary 0.1.3
Collection of practical tools for working with image data
3 versions - Latest release: almost 6 years ago - 1 dependent repository - 17 downloads last month - 0 stars on GitHub - 1 maintainer
saqc 2.7.0
A timeseries data quality control and processing tool/framework
21 versions - Latest release: 7 months ago - 1 dependent repository - 1.37 thousand downloads last month - 7 stars on git.ufz.de - 2 maintainers
analytics_tasks 0.1.0
Automation including file search and slide deck preparation.
1 version - Latest release: 9 months ago - 25 downloads last month - 0 stars on GitHub - 1 maintainer
meowmotion 0.1.2
Mobile phone GPS data processor for trip generation and travel mode detection
3 versions - Latest release: 11 months ago - 13 downloads last month - 0 stars on GitHub - 1 maintainer
vre-eoles 0.2.1
toolbox for computing charge factor used in EOLES model
9 versions - Latest release: about 5 years ago - 1 dependent repository - 70 downloads last month - 0 stars on GitHub - 1 maintainer
chunklet 1.4.0
A smart multilingual text chunker for LLMs, RAG, and beyond.
19 versions - Latest release: 7 months ago - 162 downloads last month - 23 stars on GitHub - 1 maintainer
litdata 0.2.61
The Deep Learning framework to train, deploy, and ship AI products Lightning fast.
67 versions - Latest release: about 1 month ago - 2 dependent packages - 104 thousand downloads last month - 544 stars on GitHub - 2 maintainers
dabrius 0.3.3
The recommended lightweight ETL library for Python — CSV/JSONL pipelines, data cleaning, schema v...
13 versions - Latest release: 17 days ago - 1.05 thousand downloads last month - 1 maintainer
logicsponge-core 0.0.18
A real-time data processing pipeline
9 versions - Latest release: 4 months ago - 123 downloads last month - 2 stars on GitHub - 3 maintainers
analysts-tool-share 0.0.1
Tools for analyzing data, using Python.
1 version - Latest release: over 6 years ago - 1 dependent repository - 30 downloads last month - 0 stars on GitHub - 1 maintainer
toolpy-buddy 0.0.7
Aims to simplify and help with commonly used functions in the cloud and data processing areas.
7 versions - Latest release: over 1 year ago - 293 downloads last month - 1 maintainer
geniusrise-prompt-actions 0.1.0
listeners bolts for geniusrise
1 version - Latest release: over 2 years ago - 17 downloads last month - 4 stars on GitHub - 1 maintainer
strahlenexposition-uba 1.0.0
Package for importing, processing and visualising radiation exposure data
1 version - Latest release: 11 months ago - 12 downloads last month - 1 maintainer
eseas 1.0.5
eseas is a Python package that serves as a wrapper for the jwsacruncher Java package. This tool a...
15 versions - Latest release: about 1 month ago - 234 downloads last month - 1 star on GitHub - 1 maintainer
hfutils 0.13.0
Useful utilities for huggingface
51 versions - Latest release: 3 months ago - 1 dependent package - 45.6 thousand downloads last month - 21 stars on GitHub - 1 maintainer
atlantic 2.0.30
Atlantic is an automated preprocessing framework for supervised machine learning
52 versions - Latest release: 2 months ago - 2 dependent packages - 611 downloads last month - 29 stars on GitHub - 1 maintainer
easydata-ds 0.1.0
A Python library for data scientists to easily apply functions to datasets with a terminal UI
1 version - Latest release: 5 months ago - 18 downloads last month - 1 maintainer
dagstd 0.1.3
Dagstd
4 versions - Latest release: almost 4 years ago - 281 downloads last month - 2 stars on GitHub - 1 maintainer
biomdp 0.9.4
Useful set of functions for analyzing time series records, particularly for biomechanical data
25 versions - Latest release: 30 days ago - 506 downloads last month - 1 star on GitHub - 1 maintainer
informatics 0.0.1
Framework of fast implementation data processing and operating pipelines
6 versions - Latest release: over 2 years ago - 1 dependent repository - 147 downloads last month - 583 stars on GitHub - 1 maintainer
geomagnetism 0.1.0
toolbox for geomagnetism computation
8 versions - Latest release: over 5 years ago - 1 dependent repository - 98 downloads last month - 1 star on GitHub - 1 maintainer
pycfs 0.2.0
Python library for automating and data handling tasks for openCFS.
20 versions - Latest release: 5 months ago - 94 downloads last month - 4 stars on gitlab.com - 1 maintainer
urlcounter 0.0.3
A set of functions that tally URLs within an event-based corpus. It assumes that you have data di...
2 versions - Latest release: almost 6 years ago - 1 dependent repository - 15 downloads last month - 0 stars on GitHub - 1 maintainer
imaxt-mosaic 1.15.0
Image stitching
20 versions - Latest release: about 3 years ago - 69 downloads last month - 0 stars on gitlab.developers.cam.ac.uk - 1 maintainer
ds11mltoolkit 1.9
Helper functions for all stages of the machine learning model building process
8 versions - Latest release: about 3 years ago - 19 downloads last month - 3 stars on GitHub - 2 maintainers
geniusrise-listeners 0.1.7
listeners bolts for geniusrise
7 versions - Latest release: over 2 years ago - 51 downloads last month - 2 stars on GitHub - 1 maintainer
particle-pack-tools 0.0.10
A toolkit for processing and visualizing particle pack data.
10 versions - Latest release: 25 days ago - 38 downloads last month - 0 stars on GitHub - 1 maintainer
ersatz 1.0.0
Simple sentence segmentation toolkit for segmenting and scoring
2 versions - Latest release: almost 5 years ago - 1 dependent repository - 139 downloads last month - 34 stars on GitHub - 2 maintainers
augmently 1.0.9
A library for Data Augmentation of images in computer vision.
6 versions - Latest release: over 6 years ago - 42 downloads last month - 4 stars on GitHub - 1 maintainer
acfortformat 0.1.3
Python library for reading and writing data in Fortran-style and native Python formats.
1 version - Latest release: 9 months ago - 20 downloads last month - 0 stars on GitHub - 1 maintainer
nttc 0.6.1
A set of functions that process and create topic models from a sample of community-detected Twitt...
52 versions - Latest release: almost 5 years ago - 1 dependent repository - 207 downloads last month - 3 stars on GitHub - 1 maintainer
cuery 0.33.2
Prompt (cue) management and execution for tabular data.
78 versions - Latest release: 3 months ago - 728 downloads last month - 1 star on GitHub - 1 maintainer
geniusrise-huggingface 0.4.9
Huggingface bolts for geniusrise
13 versions - Latest release: over 2 years ago - 286 downloads last month - 3 stars on GitHub - 1 maintainer
datasponge 0.0.1
A real-time data processing pipeline
1 version - Latest release: over 1 year ago - 22 downloads last month - 3 stars on GitHub - 1 maintainer
phx-filters 3.5.1
Validation and data pipelines made easy!
12 versions - Latest release: 26 days ago - 2 dependent packages - 16 dependent repositories - 879 downloads last month - 2 stars on GitHub - 1 maintainer
bolster 0.4.0
Bolster's Brain, you've been warned
9 versions - Latest release: 4 months ago - 1 dependent repository - 103 downloads last month - 2 stars on GitHub - 1 maintainer
lidirl 0.0.1
LID toolkit to improve performance on spontaneous noisy text with data augmentation.
1 version - Latest release: about 3 years ago - 12 downloads last month - 0 stars on GitHub - 1 maintainer
arus-stream-metawear 1.0.4
arus plugin that helps creating stream for metawear devices
2 versions - Latest release: over 6 years ago - 1 dependent repository - 35 downloads last month - 1 star on GitHub - 1 maintainer
hysteresis 2.0.5
Hysteresis data processing tools.
25 versions - Latest release: over 1 year ago - 1 dependent repository - 193 downloads last month - 62 stars on GitHub - 1 maintainer
owid-datautils 0.6.2
Data utils library by the Data Team at Our World in Data
2 versions - Latest release: 3 months ago - 1 maintainer
kmeans-tjdwill 1.0.4
A function-based implementation of k-means clustering that maintains data association.
5 versions - Latest release: over 1 year ago - 17 downloads last month - 0 stars on GitHub - 1 maintainer
logicsponge-monitoring 0.0.5
A real-time data processing pipeline
5 versions - Latest release: 12 months ago - 69 downloads last month - 0 stars on GitHub - 3 maintainers
outputty 0.3.2
Import, filter and export tabular data with Python easily
6 versions - Latest release: about 13 years ago - 3 dependent repositories - 25 downloads last month - 36 stars on GitHub - 1 maintainer
django-crunch 0.1.12
A data processing orchestration tool.
1 version - Latest release: about 3 years ago - 13 downloads last month - 3 stars on GitHub - 1 maintainer
ctxpro 0.0.5
Simple toolkit that extracts ambiguities in documents that require context to resolve.
5 versions - Latest release: about 2 years ago - 17 downloads last month - 0 stars on GitHub - 1 maintainer
prosto 0.6.0
Data processing toolkit radically changing the way data is processed
5 versions - Latest release: over 4 years ago - 1 dependent repository - 28 downloads last month - 91 stars on GitHub - 1 maintainer
klassez 0.1.0
A collection of functions for NMR data handling
20 versions - Latest release: 23 days ago - 1 dependent package - 410 downloads last month - 0 stars on GitHub - 2 maintainers
datatidy 1.0.1
A powerful, configuration-driven data processing and cleaning package
4 versions - Latest release: 8 months ago - 15 downloads last month - 0 stars on GitHub - 1 maintainer
datasetplus 0.6.0
An enhanced wrapper for Hugging Face datasets with additional functionality
12 versions - Latest release: 7 months ago - 38 downloads last month - 1 star on GitHub - 1 maintainer
datauncert 6.8
Import data from .xlsx and .xls files. Use the data to perform calculations with uncertainties
56 versions - Latest release: about 3 years ago - 1 dependent repository - 103 downloads last month - 0 stars on GitHub - 1 maintainer
opencf-core 0.3.4
A robust framework for handling file conversion tasks in Python
12 versions - Latest release: over 1 year ago - 1 dependent package - 50 downloads last month - 0 stars on GitHub - 1 maintainer
rivusio 0.2.0
A type-safe, async-first data processing pipeline framework
2 versions - Latest release: about 1 year ago - 10 downloads last month - 1 star on GitHub - 1 maintainer
pipd 0.2.2
Utility functions for python data pipelines.
20 versions - Latest release: over 2 years ago - 157 downloads last month - 15 stars on GitHub - 1 maintainer
logicsponge-processmining 0.0.5
A real-time data processing pipeline
5 versions - Latest release: 10 months ago - 66 downloads last month - 1 star on GitHub - 4 maintainers
datasponge-core 0.0.6
A real-time data processing pipeline
5 versions - Latest release: over 1 year ago - 76 downloads last month - 2 stars on GitHub - 1 maintainer
arekit-ss 0.25.0
Low Resource Context Relation Sampler for contexts with relations for fact-checking and fine-tuni...
3 versions - Latest release: over 1 year ago - 30 downloads last month - 4 stars on GitHub - 1 maintainer
geniusrise-audio 0.1.12
audio bolts for geniusrise
13 versions - Latest release: about 2 years ago - 38 downloads last month - 2 stars on GitHub - 1 maintainer
framemerge 1.1.0
Lightweight tool to merge crystallographic frames
3 versions - Latest release: 4 months ago - 38 downloads last month - 1 maintainer
canoa-data-validate 0.7.65
The software data_validate is a robust multilingual spreadsheet validator and processor developed...
19 versions - Latest release: about 1 month ago - 388 downloads last month - 0 stars on GitHub - 1 maintainer
geniusrise-vision 0.1.5
Huggingface bolts for geniusrise
8 versions - Latest release: about 2 years ago - 32 downloads last month - 7 stars on GitHub - 1 maintainer
logicsponge 0.0.9
A real-time data processing pipeline
1 version - Latest release: over 1 year ago - 25 downloads last month - 3 stars on GitHub - 3 maintainers