pypi.org "data processing" keyword
Top 1.9% on pypi.org
datatable 1.1.0
Python library for fast multi-threaded data manipulation and munging.
15 versions - Latest release: over 2 years ago - 18 dependent packages - 78 dependent repositories - 44.5 thousand downloads last month - 1,882 stars on GitHub - 1 maintainer
checkpointer 2.14.10
checkpointer adds code-aware caching to Python functions, maintaining correctness and speeding up...
50 versions - Latest release: 5 months ago - 2 dependent repositories - 558 downloads last month - 6 stars on GitHub - 1 maintainer
Top 9.0% on pypi.org
csv-detective 0.11.2
Detect tabular files column content
233 versions - Latest release: 1 day ago - 2 dependent packages - 3 dependent repositories - 3.19 thousand downloads last month - 48 stars on GitHub - 1 maintainer
space-packet-parser 6.1.2
A CCSDS telemetry packet decoding library based on the XTCE packet format description standard.
28 versions - Latest release: 2 days ago - 8.55 thousand downloads last month - 31 stars on GitHub - 1 maintainer
chunklet-py 2.2.0
High-fidelity context-aware chunking and interactive visualization for RAG. Advanced segmentation...
8 versions - Latest release: about 1 month ago - 283 downloads last month - 62 stars on GitHub - 1 maintainer
memories-dev 2.0.8
Collective Memory Infrastructure for AGI
10 versions - Latest release: 11 months ago - 48 downloads last month - 11 stars on GitHub - 1 maintainer
aaiclick 0.0.8
A Python framework that translates Python code into ClickHouse operations for big data computing
3 versions - Latest release: 4 days ago - 284 downloads last month - 0 stars on GitHub - 1 maintainer
liwancai-pytools 1.0.0
A practical Python utility library providing helper functions for data processing, dates and times, dictionary and list operations, and more
1 version - Latest release: 5 days ago - 1 maintainer
geniusrise-databases 0.1.4
listeners bolts for geniusrise
4 versions - Latest release: over 2 years ago - 14 downloads last month - 2 stars on GitHub - 1 maintainer
fibphoflow 0.1.8
Python package to process and visualize TDT fiber photometry data
3 versions - Latest release: almost 3 years ago - 24 downloads last month - 1 maintainer
pysetl 1.2.1
A PySpark ETL Framework
14 versions - Latest release: 8 months ago - 79 downloads last month - 5 stars on GitHub - 1 maintainer
scmcallib 0.5.1
Perform calibration for simple climate models
20 versions - Latest release: almost 6 years ago - 1 dependent repository - 91 downloads last month - 0 stars on gitlab.com - 2 maintainers
appligator 0.0.8
An application package bundler
5 versions - Latest release: 5 months ago - 94 downloads last month - 0 stars on GitHub - 1 maintainer
fiiireflyyy 0.3.0
A Python package covering miscellaneous uses, from system management to machine learning and imag...
35 versions - Latest release: about 1 year ago - 1 dependent repository - 79 downloads last month - 1 maintainer
procodile 0.0.8
A light-weight processor development framework
5 versions - Latest release: 5 months ago - 81 downloads last month - 3 stars on GitHub - 1 maintainer
conveyor-streaming 1.2.1
A Python library for streamlining asynchronous streaming tasks and pipelines.
5 versions - Latest release: 8 months ago - 24 downloads last month - 2 stars on GitHub - 1 maintainer
gavicore 0.0.8
Pydantic data models and common utilities for other Eozilla packages
6 versions - Latest release: 5 months ago - 87 downloads last month - 3 stars on GitHub - 1 maintainer
sportradar-unofficial 0.1.15
An unofficial Python package to access Sportradar NFL APIs.
13 versions - Latest release: about 2 years ago - 64 downloads last month - 0 stars on GitHub - 1 maintainer
wraptile 0.0.8
FastAPI server that implements the OGC API - Processes
5 versions - Latest release: 5 months ago - 68 downloads last month - 3 stars on GitHub - 1 maintainer
cuiman 0.0.8
Provides a client Python API, GUI, and CLI for servers compliant with OGC API - Processes
5 versions - Latest release: 5 months ago - 85 downloads last month - 3 stars on GitHub - 1 maintainer
zeef 0.1.3
A Python Framework for Deep Active Learning
4 versions - Latest release: about 4 years ago - 1 dependent repository - 21 downloads last month - 1 maintainer
llmbuilder 2.0.0
A comprehensive toolkit for building, training, and deploying language models
10 versions - Latest release: 5 months ago - 20 downloads last month - 0 stars on GitHub - 1 maintainer
eozilla 0.0.8
Comprises all packages of the Eozilla suite
6 versions - Latest release: 5 months ago - 45 downloads last month - 0 stars on GitHub - 1 maintainer
csvuniondiff 0.0.0.dev1
A package for comparing CSV-like files through union and difference operations.
2 versions - Latest release: over 1 year ago - 36 downloads last month - 7 stars on GitHub - 1 maintainer
rezolve-ai-ingestion 0.1.4
A private package for ingesting and processing SharePoint data with AI capabilities
4 versions - Latest release: over 1 year ago - 19 downloads last month - 6,176 stars on GitHub - 1 maintainer
crystflow 0.0.1
Name reservation for WIP package
1 version - Latest release: 4 months ago - 29 downloads last month - 1 maintainer
cross_ml 2.0.1
⚠️ DEPRECATED: Please use BeaverFE instead (https://pypi.org/project/beaverfe/)
9 versions - Latest release: 9 months ago - 103 downloads last month - 0 stars on GitHub - 1 maintainer
blossom-data 0.5.1
A simple way to synthesize LLM training data.
7 versions - Latest release: 2 months ago - 41 downloads last month - 25 stars on GitHub - 1 maintainer
geniusrise 0.1.7
An LLM framework
50 versions - Latest release: about 2 years ago - 9 dependent packages - 1 dependent repository - 207 downloads last month - 61 stars on GitHub - 1 maintainer
data-preprocessing-library-sevvalcucuk-asudesozcu 1.1.7
A comprehensive toolkit for data processing including handling dates, encoding categorical variab...
2 versions - Latest release: almost 2 years ago - 21 downloads last month - 0 stars on GitHub - 2 maintainers
arus 1.1.22
Activity Recognition with Ubiquitous Sensing
59 versions - Latest release: about 5 years ago - 1 dependent repository - 440 downloads last month - 0 stars on GitHub - 1 maintainer
damast 0.2.4
Package to improve the development of transparent, replicable data processing pipelines
18 versions - Latest release: about 1 month ago - 264 downloads last month - 1 star on GitHub - 1 maintainer
pastaq 0.11.4
Pipelines And Systems for Threshold Avoiding Quantification (PASTAQ): Pre-processing tools for LC...
7 versions - Latest release: 4 months ago - 1 dependent repository - 192 downloads last month - 7 stars on GitHub - 4 maintainers
sandai-operator-sdk 0.4.5
A Python SDK for building high-performance, asynchronous batch processing operators
7 versions - Latest release: 11 days ago - 549 downloads last month - 1 maintainer
tensorneko 0.3.25
Tensor Neural Engine Kompanion. A util library based on PyTorch and PyTorch Lightning.
95 versions - Latest release: 21 days ago - 1 dependent repository - 555 downloads last month - 10 stars on GitHub - 1 maintainer
binary-rain-helper-buddy 0.0.1
Aims to simplify and help with commonly used functions in the cloud and data processing areas.
1 version - Latest release: over 1 year ago - 53 downloads last month - 1 maintainer
bertrand 0.0.1
(in development) Type-safe language bindings for Python/C++
1 version - Latest release: almost 2 years ago - 644 downloads last month - 2 stars on GitHub - 1 maintainer
invoice-generator-pdf 1.0.0
A Python package designed to effortlessly convert Excel-based invoices into professionally format...
1 version - Latest release: over 1 year ago - 32 downloads last month - 1 maintainer
joinem 0.11.1
CLI for fast, flexible concatenation of tabular data using Polars.
21 versions - Latest release: 6 months ago - 64.2 thousand downloads last month - 16 stars on GitHub - 1 maintainer
scalary 0.1.3
Collection of practical tools for working with image data
3 versions - Latest release: almost 6 years ago - 1 dependent repository - 17 downloads last month - 0 stars on GitHub - 1 maintainer
saqc 2.7.0
A timeseries data quality control and processing tool/framework
21 versions - Latest release: 7 months ago - 1 dependent repository - 1.37 thousand downloads last month - 7 stars on git.ufz.de - 2 maintainers
analytics_tasks 0.1.0
Automation including file search and slide deck preparation.
1 version - Latest release: 9 months ago - 25 downloads last month - 0 stars on GitHub - 1 maintainer
meowmotion 0.1.2
Mobile phone GPS data processor for trip generation and travel mode detection
3 versions - Latest release: 11 months ago - 13 downloads last month - 0 stars on GitHub - 1 maintainer
vre-eoles 0.2.1
Toolbox for computing the charge factor used in the EOLES model
9 versions - Latest release: about 5 years ago - 1 dependent repository - 70 downloads last month - 0 stars on GitHub - 1 maintainer
chunklet 1.4.0
A smart multilingual text chunker for LLMs, RAG, and beyond.
19 versions - Latest release: 7 months ago - 162 downloads last month - 23 stars on GitHub - 1 maintainer
litdata 0.2.61
The Deep Learning framework to train, deploy, and ship AI products Lightning fast.
67 versions - Latest release: about 1 month ago - 2 dependent packages - 104 thousand downloads last month - 544 stars on GitHub - 2 maintainers
dabrius 0.3.3
The recommended lightweight ETL library for Python: CSV/JSONL pipelines, data cleaning, schema v...
13 versions - Latest release: 17 days ago - 1.05 thousand downloads last month - 1 maintainer
logicsponge-core 0.0.18
A real-time data processing pipeline
9 versions - Latest release: 4 months ago - 123 downloads last month - 2 stars on GitHub - 3 maintainers
analysts-tool-share 0.0.1
Tools for analyzing data, using Python.
1 version - Latest release: over 6 years ago - 1 dependent repository - 30 downloads last month - 0 stars on GitHub - 1 maintainer
toolpy-buddy 0.0.7
Aims to simplify and help with commonly used functions in the cloud and data processing areas.
7 versions - Latest release: over 1 year ago - 293 downloads last month - 1 maintainer
geniusrise-prompt-actions 0.1.0
listeners bolts for geniusrise
1 version - Latest release: over 2 years ago - 17 downloads last month - 4 stars on GitHub - 1 maintainer
strahlenexposition-uba 1.0.0
Package for importing, processing and visualising radiation exposure data
1 version - Latest release: 11 months ago - 12 downloads last month - 1 maintainer
eseas 1.0.5
eseas is a Python package that serves as a wrapper for the jwsacruncher Java package. This tool a...
15 versions - Latest release: about 1 month ago - 234 downloads last month - 1 star on GitHub - 1 maintainer
hfutils 0.13.0
Useful utilities for huggingface
51 versions - Latest release: 3 months ago - 1 dependent package - 45.6 thousand downloads last month - 21 stars on GitHub - 1 maintainer
atlantic 2.0.30
Atlantic is an automated preprocessing framework for supervised machine learning
52 versions - Latest release: 2 months ago - 2 dependent packages - 611 downloads last month - 29 stars on GitHub - 1 maintainer
easydata-ds 0.1.0
A Python library for data scientists to easily apply functions to datasets with a terminal UI
1 version - Latest release: 5 months ago - 18 downloads last month - 1 maintainer
dagstd 0.1.3
Dagstd
4 versions - Latest release: almost 4 years ago - 281 downloads last month - 2 stars on GitHub - 1 maintainer
biomdp 0.9.4
Useful set of functions for analyzing time series records, particularly for biomechanical data
25 versions - Latest release: 30 days ago - 506 downloads last month - 1 star on GitHub - 1 maintainer
informatics 0.0.1
Framework for fast implementation of data processing and operating pipelines
6 versions - Latest release: over 2 years ago - 1 dependent repository - 147 downloads last month - 583 stars on GitHub - 1 maintainer
geomagnetism 0.1.0
Toolbox for geomagnetism computation
8 versions - Latest release: over 5 years ago - 1 dependent repository - 98 downloads last month - 1 star on GitHub - 1 maintainer
pycfs 0.2.0
Python library for automation and data handling tasks for openCFS.
20 versions - Latest release: 5 months ago - 94 downloads last month - 4 stars on gitlab.com - 1 maintainer
urlcounter 0.0.3
A set of functions that tally URLs within an event-based corpus. It assumes that you have data di...
2 versions - Latest release: almost 6 years ago - 1 dependent repository - 15 downloads last month - 0 stars on GitHub - 1 maintainer
imaxt-mosaic 1.15.0
Image stitching
20 versions - Latest release: about 3 years ago - 69 downloads last month - 0 stars on gitlab.developers.cam.ac.uk - 1 maintainer
ds11mltoolkit 1.9
Helper functions for all stages of the machine learning model building process
8 versions - Latest release: about 3 years ago - 19 downloads last month - 3 stars on GitHub - 2 maintainers
geniusrise-listeners 0.1.7
listeners bolts for geniusrise
7 versions - Latest release: over 2 years ago - 51 downloads last month - 2 stars on GitHub - 1 maintainer
particle-pack-tools 0.0.10
A toolkit for processing and visualizing particle pack data.
10 versions - Latest release: 25 days ago - 38 downloads last month - 0 stars on GitHub - 1 maintainer
ersatz 1.0.0
Simple sentence segmentation toolkit for segmenting and scoring
2 versions - Latest release: almost 5 years ago - 1 dependent repository - 139 downloads last month - 34 stars on GitHub - 2 maintainers
augmently 1.0.9
A library for Data Augmentation of images in computer vision.
6 versions - Latest release: over 6 years ago - 42 downloads last month - 4 stars on GitHub - 1 maintainer
acfortformat 0.1.3
Python library for reading and writing data in Fortran-style and native Python formats.
1 version - Latest release: 9 months ago - 20 downloads last month - 0 stars on GitHub - 1 maintainer
nttc 0.6.1
A set of functions that process and create topic models from a sample of community-detected Twitt...
52 versions - Latest release: almost 5 years ago - 1 dependent repository - 207 downloads last month - 3 stars on GitHub - 1 maintainer
cuery 0.33.2
Prompt (cue) management and execution for tabular data.
78 versions - Latest release: 3 months ago - 728 downloads last month - 1 star on GitHub - 1 maintainer
geniusrise-huggingface 0.4.9
Huggingface bolts for geniusrise
13 versions - Latest release: over 2 years ago - 286 downloads last month - 3 stars on GitHub - 1 maintainer
datasponge 0.0.1
A real-time data processing pipeline
1 version - Latest release: over 1 year ago - 22 downloads last month - 3 stars on GitHub - 1 maintainer
phx-filters 3.5.1
Validation and data pipelines made easy!
12 versions - Latest release: 26 days ago - 2 dependent packages - 16 dependent repositories - 879 downloads last month - 2 stars on GitHub - 1 maintainer
bolster 0.4.0
Bolster's Brain, you've been warned
9 versions - Latest release: 4 months ago - 1 dependent repository - 103 downloads last month - 2 stars on GitHub - 1 maintainer
lidirl 0.0.1
LID toolkit to improve performance on spontaneous noisy text with data augmentation.
1 version - Latest release: about 3 years ago - 12 downloads last month - 0 stars on GitHub - 1 maintainer
arus-stream-metawear 1.0.4
arus plugin that helps create streams for MetaWear devices
2 versions - Latest release: over 6 years ago - 1 dependent repository - 35 downloads last month - 1 star on GitHub - 1 maintainer
hysteresis 2.0.5
Hysteresis data processing tools.
25 versions - Latest release: over 1 year ago - 1 dependent repository - 193 downloads last month - 62 stars on GitHub - 1 maintainer
owid-datautils 0.6.2
Data utils library by the Data Team at Our World in Data
2 versions - Latest release: 3 months ago - 1 maintainer
kmeans-tjdwill 1.0.4
A function-based implementation of k-means clustering that maintains data association.
5 versions - Latest release: over 1 year ago - 17 downloads last month - 0 stars on GitHub - 1 maintainer
logicsponge-monitoring 0.0.5
A real-time data processing pipeline
5 versions - Latest release: 12 months ago - 69 downloads last month - 0 stars on GitHub - 3 maintainers
outputty 0.3.2
Import, filter and export tabular data with Python easily
6 versions - Latest release: about 13 years ago - 3 dependent repositories - 25 downloads last month - 36 stars on GitHub - 1 maintainer
django-crunch 0.1.12
A data processing orchestration tool.
1 version - Latest release: about 3 years ago - 13 downloads last month - 3 stars on GitHub - 1 maintainer
ctxpro 0.0.5
Simple toolkit that extracts ambiguities in documents that require context to resolve.
5 versions - Latest release: about 2 years ago - 17 downloads last month - 0 stars on GitHub - 1 maintainer
prosto 0.6.0
Data processing toolkit radically changing the way data is processed
5 versions - Latest release: over 4 years ago - 1 dependent repository - 28 downloads last month - 91 stars on GitHub - 1 maintainer
klassez 0.1.0
A collection of functions for NMR data handling
20 versions - Latest release: 23 days ago - 1 dependent package - 410 downloads last month - 0 stars on GitHub - 2 maintainers
datatidy 1.0.1
A powerful, configuration-driven data processing and cleaning package
4 versions - Latest release: 8 months ago - 15 downloads last month - 0 stars on GitHub - 1 maintainer
datasetplus 0.6.0
An enhanced wrapper for Hugging Face datasets with additional functionality
12 versions - Latest release: 7 months ago - 38 downloads last month - 1 star on GitHub - 1 maintainer
datauncert 6.8
Import data from .xlsx and .xls files. Use the data to perform calculations with uncertainties
56 versions - Latest release: about 3 years ago - 1 dependent repository - 103 downloads last month - 0 stars on GitHub - 1 maintainer
opencf-core 0.3.4
A robust framework for handling file conversion tasks in Python
12 versions - Latest release: over 1 year ago - 1 dependent package - 50 downloads last month - 0 stars on GitHub - 1 maintainer
rivusio 0.2.0
A type-safe, async-first data processing pipeline framework
2 versions - Latest release: about 1 year ago - 10 downloads last month - 1 star on GitHub - 1 maintainer
pipd 0.2.2
Utility functions for Python data pipelines.
20 versions - Latest release: over 2 years ago - 157 downloads last month - 15 stars on GitHub - 1 maintainer
logicsponge-processmining 0.0.5
A real-time data processing pipeline
5 versions - Latest release: 10 months ago - 66 downloads last month - 1 star on GitHub - 4 maintainers
datasponge-core 0.0.6
A real-time data processing pipeline
5 versions - Latest release: over 1 year ago - 76 downloads last month - 2 stars on GitHub - 1 maintainer
arekit-ss 0.25.0
Low Resource Context Relation Sampler for contexts with relations for fact-checking and fine-tuni...
3 versions - Latest release: over 1 year ago - 30 downloads last month - 4 stars on GitHub - 1 maintainer
geniusrise-audio 0.1.12
audio bolts for geniusrise
13 versions - Latest release: about 2 years ago - 38 downloads last month - 2 stars on GitHub - 1 maintainer
framemerge 1.1.0
Lightweight tool to merge crystallographic frames
3 versions - Latest release: 4 months ago - 38 downloads last month - 1 maintainer
canoa-data-validate 0.7.65
The software data_validate is a robust multilingual spreadsheet validator and processor developed...
19 versions - Latest release: about 1 month ago - 388 downloads last month - 0 stars on GitHub - 1 maintainer
geniusrise-vision 0.1.5
Huggingface bolts for geniusrise
8 versions - Latest release: about 2 years ago - 32 downloads last month - 7 stars on GitHub - 1 maintainer
logicsponge 0.0.9
A real-time data processing pipeline
1 version - Latest release: over 1 year ago - 25 downloads last month - 3 stars on GitHub - 3 maintainers
Related Keywords
python (34)
machine learning (29)
data science (17)
data analysis (14)
llm (14)
data cleaning (11)
pandas (11)
deep learning (10)
ai (10)
pipeline (9)
time series (9)
mlops (9)
data (9)
geniusrise (9)
analytics (7)
automation (7)
etl (7)
AI (7)
real-time (7)
data-science (7)
data visualization (6)
polars (6)
natural language processing (6)
data-analysis (6)
eo (6)
esa (6)
llmops (6)
llm-framework (6)
agentops (6)
agent-based-framework (6)
ogc (6)
data transformation (5)
feature engineering (5)
nlp (5)
preprocessing (5)
huggingface (5)
data engineering (5)
streaming (5)
async (5)
eo-data (4)
predictive modeling (4)
data preprocessing (4)
eo-datacubes (4)
eo-platform (4)
fastapi (4)
jupyterlab (4)
ogc-api (4)
data validation (4)
typer (4)
visualization (4)
data pipeline (4)
database (4)
inference (4)
api (4)
scikit-learn (4)
dataframe (4)
fine-tuning (4)
pytorch (4)
numpy (4)
cli (4)
ETL (4)
data manipulation (4)
cloud (3)
clustering (3)
artificial intelligence (3)
functions (3)
help (3)
classification (3)
workflow (3)
file conversion (3)
common (3)
csv (3)
sklearn (3)
neuroscience (3)
regression (3)
machine-learning (3)
NLP (3)
data-engineering (3)
data wrangling (3)
inference-server (3)
statistics (3)
validation (3)
data integrity (3)
performance (3)
data handling (3)
missing data (2)
data imputation (2)
missing data detection (2)
data quality (2)
missing data analysis (2)
data cleansing (2)
feature-engineering (2)
data collection (2)
data completeness (2)
impute missing values (2)
data missingness (2)
aws (2)
download (2)
azure (2)
remote sensing (2)