pypi.org : llmlingua-promptflow
Compresses the prompt and KV-Cache to speed up LLM inference and improve the model's perception of key information, achieving up to 20x compression with minimal performance loss.
Registry
- Homepage
- Documentation
- JSON
purl: pkg:pypi/llmlingua-promptflow
Keywords: Prompt Compression, LLMs, Inference Acceleration, Black-box LLMs, Efficient LLMs
License: MIT
Latest release: 12 months ago
First release: 12 months ago
Downloads: 78 last month
Last synced: about 19 hours ago