Semantic Caching on Valkey and AWS ElastiCache
LiteLLM now supports semantic prompt caching on Valkey. If you run a Valkey cluster with the valkey-search module, including AWS ElastiCache for Valkey, you can point LiteLLM at it with type: valkey-semantic and get embedding-based cache hits without standing up Redis Stack or a separate vector database.
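
To make "embedding-based cache hits" concrete, here is a minimal in-memory sketch of the idea in Python. It is not LiteLLM's implementation and does not talk to Valkey: the `SemanticCache` class, the `toy_embed` function, and the 0.8 threshold are all hypothetical names and values chosen for illustration. A real deployment would use an actual embedding model and let valkey-search perform the vector similarity lookup.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class SemanticCache:
    """Illustrative cache keyed by prompt similarity rather than exact text."""

    def __init__(self, threshold: float = 0.8):
        # Assumed cutoff: a lookup is a hit only at or above this similarity.
        self.threshold = threshold
        self.entries = []  # list of (embedding, cached response) pairs

    def set(self, prompt: str, response: str) -> None:
        self.entries.append((toy_embed(prompt), response))

    def get(self, prompt: str):
        query = toy_embed(prompt)
        best_score, best_response = 0.0, None
        # Linear scan for the most similar stored prompt; a vector index
        # would replace this loop in a real system.
        for embedding, response in self.entries:
            score = cosine(query, embedding)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None


cache = SemanticCache(threshold=0.8)
cache.set("what is the capital of France", "Paris")

# A reworded prompt still hits because its vector is close to the stored one.
print(cache.get("What is the capital of France please"))
# An unrelated prompt shares no tokens with the stored one and misses.
print(cache.get("how do I bake bread"))
```

The threshold is the main tuning knob in a design like this: set it too low and unrelated prompts receive stale answers, set it too high and the cache behaves like an exact-match cache.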
