An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.

pypi.org "data processing" keyword

Top 1.9% on pypi.org
datatable 1.1.0
Python library for fast multi-threaded data manipulation and munging.
15 versions - Latest release: over 2 years ago - 18 dependent packages - 78 dependent repositories - 45.4 thousand downloads last month - 1,883 stars on GitHub - 1 maintainer
textmining-module 2.1.2
A Python Module for Comprehensive Text Mining, including Keyword Extraction and Text Analysis.
7 versions - Latest release: over 1 year ago - 25 downloads last month - 0 stars on GitHub - 1 maintainer
sreader 0.0.1
space-reader: Convert any file path into LLM-friendly inputs
1 version - Latest release: almost 2 years ago - 23 downloads last month - 1 maintainer
fast-writer 1.0.3
CLI Tools for writing data
2 versions - Latest release: 9 months ago - 17 downloads last month - 0 stars on GitHub - 1 maintainer
python-datatable 1.1.3
Python library for fast multi-threaded data manipulation and munging.
4 versions - Latest release: about 3 years ago - 3 dependent packages - 352 downloads last month - 1,878 stars on GitHub - 1 maintainer
irt-data-utils 0.0.1a2
Infrared Thermal Data Utils
3 versions - Latest release: almost 3 years ago - 27 downloads last month - 1 maintainer
constelation-astronomer 1.1.4
constelation-astronomer: results processing package for CONSTELATION coupled model
1 version - Latest release: almost 2 years ago - 17 downloads last month - 0 stars on GitHub - 1 maintainer
tensorneko-tool 0.3.24
The CLI Tools for Library TensorNeko.
11 versions - Latest release: 28 days ago - 291 downloads last month - 11 stars on GitHub - 1 maintainer
Top 9.0% on pypi.org
csv-detective 0.10.12674
Detect tabular files column content
220 versions - Latest release: about 2 months ago - 2 dependent packages - 3 dependent repositories - 3.19 thousand downloads last month - 48 stars on GitHub - 1 maintainer
checkpointer 2.14.10
checkpointer adds code-aware caching to Python functions, maintaining correctness and speeding up...
50 versions - Latest release: 4 months ago - 2 dependent repositories - 716 downloads last month - 6 stars on GitHub - 1 maintainer
space-packet-parser 6.1.0
A CCSDS telemetry packet decoding library based on the XTCE packet format description standard.
26 versions - Latest release: about 2 months ago - 6.26 thousand downloads last month - 31 stars on GitHub - 1 maintainer
chunklet-py 2.2.0
High-fidelity context-aware chunking and interactive visualization for RAG. Advanced segmentation...
8 versions - Latest release: 15 days ago - 494 downloads last month - 62 stars on GitHub - 1 maintainer
fibphoflow 0.1.8
Python package to process and visualize TDT fiber photometry data
3 versions - Latest release: almost 3 years ago - 24 downloads last month - 1 maintainer
geniusrise-databases 0.1.4
listeners bolts for geniusrise
4 versions - Latest release: over 2 years ago - 26 downloads last month - 2 stars on GitHub - 1 maintainer
scmcallib 0.5.1
Perform calibration for simple climate models
20 versions - Latest release: almost 6 years ago - 1 dependent repository - 170 downloads last month - 0 stars on gitlab.com - 2 maintainers
llmbuilder 2.0.0
A comprehensive toolkit for building, training, and deploying language models
10 versions - Latest release: 4 months ago - 24 downloads last month - 0 stars on GitHub - 1 maintainer
cuiman 0.0.8
Provides a client Python API, GUI, and CLI for servers compliant with OGC API - Processes
5 versions - Latest release: 4 months ago - 66 downloads last month - 3 stars on GitHub - 1 maintainer
procodile 0.0.8
A light-weight processor development framework
5 versions - Latest release: 4 months ago - 74 downloads last month - 3 stars on GitHub - 1 maintainer
wraptile 0.0.8
FastAPI server that implements the OGC API - Processes
5 versions - Latest release: 4 months ago - 68 downloads last month - 3 stars on GitHub - 1 maintainer
appligator 0.0.8
An application package bundler
5 versions - Latest release: 4 months ago - 50 downloads last month - 0 stars on GitHub - 1 maintainer
eozilla 0.0.8
Comprises all packages of the Eozilla suite
6 versions - Latest release: 4 months ago - 45 downloads last month - 0 stars on GitHub - 1 maintainer
gavicore 0.0.8
Pydantic data models and common utilities for other Eozilla packages
6 versions - Latest release: 4 months ago - 60 downloads last month - 3 stars on GitHub - 1 maintainer
geniusrise 0.1.7
An LLM framework
50 versions - Latest release: about 2 years ago - 9 dependent packages - 1 dependent repository - 97 downloads last month - 61 stars on GitHub - 1 maintainer
invoice-generator-pdf 1.0.0
A Python package designed to effortlessly convert Excel-based invoices into professionally format...
1 version - Latest release: over 1 year ago - 5 downloads last month - 1 maintainer
arus 1.1.22
Activity Recognition with Ubiquitous Sensing
59 versions - Latest release: about 5 years ago - 1 dependent repository - 1.34 thousand downloads last month - 0 stars on GitHub - 1 maintainer
saqc 2.7.0
A timeseries data quality control and processing tool/framework
21 versions - Latest release: 6 months ago - 1 dependent repository - 884 downloads last month - 7 stars on git.ufz.de - 2 maintainers
data-preprocessing-library-sevvalcucuk-asudesozcu 1.1.7
A comprehensive toolkit for data processing including handling dates, encoding categorical variab...
2 versions - Latest release: almost 2 years ago - 27 downloads last month - 0 stars on GitHub - 2 maintainers
conveyor-streaming 1.2.1
A Python library for streamlining asynchronous streaming tasks and pipelines.
5 versions - Latest release: 7 months ago - 52 downloads last month - 2 stars on GitHub - 1 maintainer
chunklet 1.4.0
A smart multilingual text chunker for LLMs, RAG, and beyond.
19 versions - Latest release: 6 months ago - 162 downloads last month - 23 stars on GitHub - 1 maintainer
csvuniondiff 0.0.0.dev1
A package for comparing CSV-like files through union and difference operations.
2 versions - Latest release: over 1 year ago - 36 downloads last month - 7 stars on GitHub - 1 maintainer
crystflow 0.0.1
Name reservation for WIP package
1 version - Latest release: 3 months ago - 29 downloads last month - 1 maintainer
adaptivebridge 1.1.0
Revolutionizing ML adaptive modelling for handling missing features and data. The model can predi...
5 versions - Latest release: about 2 years ago - 211 downloads last month - 1 star on GitHub - 1 maintainer
atlantic 2.0.30
Atlantic is an automated preprocessing framework for supervised machine learning
52 versions - Latest release: about 1 month ago - 2 dependent packages - 611 downloads last month - 29 stars on GitHub - 1 maintainer
strahlenexposition-uba 1.0.0
Package for importing, processing and visualising radiation exposure data
1 version - Latest release: 10 months ago - 8 downloads last month - 1 maintainer
scalary 0.1.3
Collection of practical tools for working with image data
3 versions - Latest release: almost 6 years ago - 1 dependent repository - 8 downloads last month - 0 stars on GitHub - 1 maintainer
meowmotion 0.1.2
Mobile phone GPS data processor for trip generation and travel mode detection
3 versions - Latest release: 10 months ago - 13 downloads last month - 0 stars on GitHub - 1 maintainer
eseas 1.0.4
eseas is a Python package that serves as a wrapper for the jwsacruncher Java package. This tool a...
14 versions - Latest release: 12 months ago - 118 downloads last month - 1 star on GitHub - 1 maintainer
urlcounter 0.0.3
A set of functions that tally URLs within an event-based corpus. It assumes that you have data di...
2 versions - Latest release: over 5 years ago - 1 dependent repository - 32 downloads last month - 0 stars on GitHub - 1 maintainer
rezolve-ai-ingestion 0.1.4
A private package for ingesting and processing SharePoint data with AI capabilities
4 versions - Latest release: over 1 year ago - 28 downloads last month - 6,176 stars on GitHub - 1 maintainer
hfutils 0.13.0
Useful utilities for huggingface
51 versions - Latest release: 2 months ago - 1 dependent package - 45.6 thousand downloads last month - 21 stars on GitHub - 1 maintainer
logicsponge-core 0.0.18
A real-time data processing pipeline
9 versions - Latest release: 3 months ago - 123 downloads last month - 2 stars on GitHub - 3 maintainers
canoa-data-validate 0.7.64
data_validate is a spreadsheet validator and processor that automates the checking of integrity...
18 versions - Latest release: 3 months ago - 388 downloads last month - 0 stars on GitHub - 1 maintainer
pysetl 1.2.1
A PySpark ETL Framework
14 versions - Latest release: 7 months ago - 52 downloads last month - 5 stars on GitHub - 1 maintainer
particle-pack-tools 0.0.7
A toolkit for processing and visualizing particle pack data.
7 versions - Latest release: 3 months ago - 38 downloads last month - 0 stars on GitHub - 1 maintainer
pycfs 0.2.0
Python library for automating and data handling tasks for openCFS.
20 versions - Latest release: 4 months ago - 331 downloads last month - 4 stars on gitlab.com - 1 maintainer
fiiireflyyy 0.3.0
A python package covering miscellaneous uses, from system management to machine learning and imag...
35 versions - Latest release: about 1 year ago - 1 dependent repository - 91 downloads last month - 1 maintainer
biomdp 0.9.0
Useful set of functions for analyzing time series records, particularly for biomechanical data
21 versions - Latest release: 21 days ago - 508 downloads last month - 1 star on GitHub - 1 maintainer
zeef 0.1.3
A Python Framework for Deep Active Learning
4 versions - Latest release: about 4 years ago - 1 dependent repository - 46 downloads last month - 1 maintainer
augmently 1.0.9
A library for Data Augmentation of images in computer vision.
6 versions - Latest release: over 6 years ago - 58 downloads last month - 4 stars on GitHub - 1 maintainer
acfortformat 0.1.3
Python library for reading and writing data in Fortran-style and native Python formats.
1 version - Latest release: 8 months ago - 20 downloads last month - 0 stars on GitHub - 1 maintainer
geniusrise-listeners 0.1.7
listeners bolts for geniusrise
7 versions - Latest release: over 2 years ago - 51 downloads last month - 2 stars on GitHub - 1 maintainer
bertrand 0.0.1
(in development) Type-safe language bindings for Python/C++
1 version - Latest release: almost 2 years ago - 644 downloads last month - 2 stars on GitHub - 1 maintainer
informatics 0.0.1
Framework of fast implementation data processing and operating pipelines
6 versions - Latest release: over 2 years ago - 1 dependent repository - 244 downloads last month - 583 stars on GitHub - 1 maintainer
ersatz 1.0.0
Simple sentence segmentation toolkit for segmenting and scoring
2 versions - Latest release: over 4 years ago - 1 dependent repository - 108 downloads last month - 34 stars on GitHub - 2 maintainers
sportradar-unofficial 0.1.15
An unofficial python package to access sportradar NFL APIs.
13 versions - Latest release: about 2 years ago - 40 downloads last month - 0 stars on GitHub - 1 maintainer
blossom-data 0.5.1
A simple way to synthesize LLM training data.
7 versions - Latest release: about 1 month ago - 41 downloads last month - 25 stars on GitHub - 1 maintainer
arachnea 0.0.5
A Python library for efficient array operations using a fluent API.
4 versions - Latest release: over 1 year ago - 109 downloads last month - 0 stars on GitHub - 1 maintainer
damast 0.2.3
Package to improve the development of transparent, replicable data processing pipelines
17 versions - Latest release: 2 months ago - 33 downloads last month - 1 star on GitHub - 1 maintainer
cross_ml 2.0.1
⚠️ DEPRECATED: Please use BeaverFE instead (https://pypi.org/project/beaverfe/)
9 versions - Latest release: 8 months ago - 103 downloads last month - 0 stars on GitHub - 1 maintainer
litdata 0.2.60
The Deep Learning framework to train, deploy, and ship AI products Lightning fast.
66 versions - Latest release: about 1 month ago - 2 dependent packages - 107 thousand downloads last month - 544 stars on GitHub - 2 maintainers
joinem 0.11.1
CLI for fast, flexible concatenation of tabular data using Polars.
21 versions - Latest release: 5 months ago - 64.2 thousand downloads last month - 16 stars on GitHub - 1 maintainer
pastaq 0.11.4
Pipelines And Systems for Threshold Avoiding Quantification (PASTAQ): Pre-processing tools for LC...
7 versions - Latest release: 4 months ago - 1 dependent repository - 205 downloads last month - 7 stars on GitHub - 4 maintainers
geniusrise-huggingface 0.4.9
Huggingface bolts for geniusrise
13 versions - Latest release: over 2 years ago - 286 downloads last month - 3 stars on GitHub - 1 maintainer
datasponge 0.0.1
A real-time data processing pipeline
1 version - Latest release: over 1 year ago - 22 downloads last month - 3 stars on GitHub - 1 maintainer
analytics_tasks 0.1.0
Automation including file search and slide deck preparation.
1 version - Latest release: 8 months ago - 43 downloads last month - 0 stars on GitHub - 1 maintainer
binary-rain-helper-buddy 0.0.1
Aims to simplify and help with commonly used functions in the cloud and data processing areas.
1 version - Latest release: over 1 year ago - 53 downloads last month - 1 maintainer
phx-filters 3.4.0
Validation and data pipelines made easy!
10 versions - Latest release: over 2 years ago - 2 dependent packages - 16 dependent repositories - 193 downloads last month - 2 stars on GitHub - 1 maintainer
hll 2.4.0
Fast HyperLogLog for Python
20 versions - Latest release: 7 months ago - 25.7 thousand downloads last month - 110 stars on GitHub - 1 maintainer
tensorneko 0.3.24
Tensor Neural Engine Kompanion. A util library based on PyTorch and PyTorch Lightning.
94 versions - Latest release: 28 days ago - 1 dependent repository - 1.26 thousand downloads last month - 10 stars on GitHub - 1 maintainer
bolster 0.4.0
Bolster's Brain, you've been warned
9 versions - Latest release: 3 months ago - 1 dependent repository - 156 downloads last month - 2 stars on GitHub - 1 maintainer
easydata-ds 0.1.0
A Python library for data scientists to easily apply functions to datasets with a terminal UI
1 version - Latest release: 5 months ago - 20 downloads last month - 1 maintainer
cuery 0.33.2
Prompt (cue) management and execution for tabular data.
78 versions - Latest release: 3 months ago - 874 downloads last month - 1 star on GitHub - 1 maintainer
outputty 0.3.2
Import, filter and export tabular data with Python easily
6 versions - Latest release: almost 13 years ago - 3 dependent repositories - 15 downloads last month - 36 stars on GitHub - 1 maintainer
geomagnetism 0.1.0
toolbox for geomagnetism computation
8 versions - Latest release: over 5 years ago - 1 dependent repository - 18 downloads last month - 1 star on GitHub - 1 maintainer
nttc 0.6.1
A set of functions that process and create topic models from a sample of community-detected Twitt...
52 versions - Latest release: almost 5 years ago - 1 dependent repository - 178 downloads last month - 3 stars on GitHub - 1 maintainer
arus-stream-metawear 1.0.4
arus plugin that helps create streams for metawear devices
2 versions - Latest release: over 6 years ago - 1 dependent repository - 24 downloads last month - 1 star on GitHub - 1 maintainer
django-crunch 0.1.12
A data processing orchestration tool.
1 version - Latest release: about 3 years ago - 6 downloads last month - 3 stars on GitHub - 1 maintainer
dagstd 0.1.3
Dagstd
4 versions - Latest release: over 3 years ago - 281 downloads last month - 2 stars on GitHub - 1 maintainer
memories-dev 2.0.8
Collective Memory Infrastructure for AGI
10 versions - Latest release: 10 months ago - 19 downloads last month - 9 stars on GitHub - 1 maintainer
owid-datautils 0.6.2
Data utils library by the Data Team at Our World in Data
2 versions - Latest release: 3 months ago - 1 maintainer
prosto 0.6.0
Data processing toolkit radically changing the way data is processed
5 versions - Latest release: over 4 years ago - 1 dependent repository - 26 downloads last month - 91 stars on GitHub - 1 maintainer
geniusrise-prompt-actions 0.1.0
listeners bolts for geniusrise
1 version - Latest release: over 2 years ago - 17 downloads last month - 4 stars on GitHub - 1 maintainer
hysteresis 2.0.5
Hysteresis data processing tools.
25 versions - Latest release: over 1 year ago - 1 dependent repository - 1.06 thousand downloads last month - 62 stars on GitHub - 1 maintainer
datauncert 6.8
Import data from .xlsx and .xls files. Use the data to perform calculations with uncertainties
56 versions - Latest release: about 3 years ago - 1 dependent repository - 103 downloads last month - 0 stars on GitHub - 1 maintainer
lidirl 0.0.1
LID toolkit to improve performance on spontaneous noisy text with data augmentation.
1 version - Latest release: almost 3 years ago - 24 downloads last month - 0 stars on GitHub - 1 maintainer
rivusio 0.2.0
A type-safe, async-first data processing pipeline framework
2 versions - Latest release: about 1 year ago - 12 downloads last month - 1 star on GitHub - 1 maintainer
vre-eoles 0.2.1
toolbox for computing charge factor used in EOLES model
9 versions - Latest release: about 5 years ago - 1 dependent repository - 70 downloads last month - 0 stars on GitHub - 1 maintainer
geniusrise-audio 0.1.12
audio bolts for geniusrise
13 versions - Latest release: about 2 years ago - 69 downloads last month - 2 stars on GitHub - 1 maintainer
toolpy-buddy 0.0.7
Aims to simplify and help with commonly used functions in the cloud and data processing areas.
7 versions - Latest release: over 1 year ago - 293 downloads last month - 1 maintainer
pipd 0.2.2
Utility functions for python data pipelines.
20 versions - Latest release: over 2 years ago - 272 downloads last month - 15 stars on GitHub - 1 maintainer
logicsponge-monitoring 0.0.5
A real-time data processing pipeline
5 versions - Latest release: 11 months ago - 69 downloads last month - 0 stars on GitHub - 3 maintainers
logicsponge-processmining 0.0.5
A real-time data processing pipeline
5 versions - Latest release: 9 months ago - 66 downloads last month - 1 star on GitHub - 4 maintainers
framemerge 1.1.0
Lightweight tool to merge crystallographic frames
3 versions - Latest release: 3 months ago - 38 downloads last month - 1 maintainer
imaxt-mosaic 1.15.0
Image stitching
20 versions - Latest release: about 3 years ago - 69 downloads last month - 0 stars on gitlab.developers.cam.ac.uk - 1 maintainer
arekit-ss 0.25.0
Low Resource Context Relation Sampler for contexts with relations for fact-checking and fine-tuni...
3 versions - Latest release: over 1 year ago - 42 downloads last month - 4 stars on GitHub - 1 maintainer
analysts-tool-share 0.0.1
Tools for analyzing data, using Python.
1 version - Latest release: about 6 years ago - 1 dependent repository - 30 downloads last month - 0 stars on GitHub - 1 maintainer
datasetplus 0.6.0
An enhanced wrapper for Hugging Face datasets with additional functionality
12 versions - Latest release: 6 months ago - 1.22 thousand downloads last month - 1 star on GitHub - 1 maintainer
logicsponge 0.0.9
A real-time data processing pipeline
1 version - Latest release: over 1 year ago - 25 downloads last month - 3 stars on GitHub - 3 maintainers
datatidy 1.0.1
A powerful, configuration-driven data processing and cleaning package
4 versions - Latest release: 7 months ago - 36 downloads last month - 0 stars on GitHub - 1 maintainer
ctxpro 0.0.5
Simple toolkit that extracts ambiguities in documents that require context to resolve.
5 versions - Latest release: about 2 years ago - 23 downloads last month - 0 stars on GitHub - 1 maintainer