semantic-cache

Here are 122 public repositories matching this topic...

codefuse-ai / ModelCache

An LLM semantic caching system that aims to improve user experience by reducing response time via cached query-result pairs.

llm semantic-cache

Updated Jun 30, 2025
Python

redis / redis-vl-python

Redis Vector Library (RedisVL) -- the AI-native Python client for Redis.

python search redis mcp openai embedding redis-search vector-search huggingface vector-database large-language-models llm anthropic semantic-cache retrieval-augmented-generation llmcache

Updated Jun 25, 2026
Python

ferro-labs / ai-gateway

Unified AI gateway for 30+ LLMs (OpenAI, Anthropic, Bedrock, Azure, etc.) with caching, guardrails, A/B testing, and cost controls. A fast, scalable, Go-native alternative to LiteLLM and Kong AI Gateway.

mcp gateway kong guardrails pii-detection llm llmops semantic-cache prompt-management llm-proxy litellm ai-gateway ai-infrastructure llm-cost llm-strategy

Updated Jun 26, 2026
Go

aqstack / mimir

mimir is a drop-in proxy that caches LLM API responses using semantic similarity, reducing costs and latency for repeated or similar queries.

kubernetes golang caching proxy openai cost-optimization llm semantic-cache

Updated Dec 24, 2025
Go

peva3 / SmarterRouter

SmarterRouter: An intelligent LLM gateway and VRAM-aware router for Ollama, llama.cpp, and OpenAI. Features semantic caching, model profiling, and automatic failover for local AI labs.

docker self-hosted model-serving gpu-monitoring fastapi llm openai-proxy semantic-cache local-llm ollama llm-proxy ollama-api ai-gateway llm-router self-hosted-ai ai-cache

Updated May 10, 2026
Python

sensoris / semcache

Semantic caching layer for your LLM applications. Reuse responses and reduce token usage.

gemini openai llm anthropic semantic-cache genai

Updated Jan 2, 2026
Rust

vcache-project / vCache

Reliable and Efficient Semantic Prompt Caching with vCache

Updated Dec 17, 2025
Python

Rohit-Dnath / RAMen

RAMen is a fast in-memory data store like Redis, but built for AI: drop-in Redis protocol, native vector search, semantic caching, and a built-in MCP server for agents. Single Go binary, BSD-3.

redis golang mcp cache resp ai-agents key-value-store vector-search vector-database llm semantic-cache mcp-server redis-alternative valkey-alternative-in-memory-database

Updated Jun 16, 2026
Go

redis-developer / adk-redis

Redis integration for Google Agent Development Kit (ADK) - Memory, Sessions, Search Tools, MCP

redis memory semantic-search adk long-term-memory vector-search hybrid-search semantic-cache mcp-server agent-memory session-memory google-adk adk-python redis-agent-memory

Updated Jun 24, 2026
Python

ashishpatel26 / omnicache-ai

Unified multi-layer caching library for AI/agent pipelines — LangChain, LangGraph, AutoGen, CrewAI, Agno, A2A

agent embeddings caching-strategies autogen rag caching-memory agno langchain semantic-cache langchain-python crewai langgraph aiagents llm-cache ai-cache gptcache-alternative

Updated Jun 4, 2026
Python

redis / redis-vl-java

Redis Vector Library (RedisVL) -- the AI-native Java client for Redis.

java redis ai embeddings vectors rag vector-search vector-database llm generative-ai semantic-cache llm-cache rag-chatbot semantic-routing agentic-ai

Updated Jun 5, 2026
Java

AlphaBitCore / nexus-gateway

Enterprise AI traffic gateway — unified compliance, routing across 20+ LLM providers, semantic cache, quotas, and audit. SDK / network / OS-layer intercept.

Updated Jun 23, 2026
Go

Harras3 / Enterprise-Grade-RAG

A RAG-based chatbot that incorporates a semantic cache and guardrails.

guardrails semantic-cache retrieval-augmented-generation

Updated Nov 11, 2024
HTML

aws-samples / Reducing-Hallucinations-in-LLM-Agents-with-a-Verified-Semantic-Cache

This repository contains sample code demonstrating how to implement a verified semantic cache using Amazon Bedrock Knowledge Bases to prevent hallucinations in Large Language Model (LLM) responses while improving latency and reducing costs.

agent aws demo bedrock rag aws-blog llm semantic-cache llm-agent amazon-bedrock amazon-bedrock-agents amazon-bedrock-knowledge-bases

Updated Apr 3, 2025
Jupyter Notebook

hedimanai-pro / toolops

ToolOps is a framework-agnostic middleware SDK that treats every tool call as a first-class operation. By wrapping your tools in a single decorator, you instantly upgrade them with industrial-grade caching, resilience, and observability.

Updated May 30, 2026
Python

zakariaf / RAG-Cache

High-performance LLM query cache with semantic search. Reduce API costs by 80% and latency from 8.5s to 1ms using Redis + Qdrant vector DB. Multi-provider support (OpenAI, Anthropic).

redis embeddings openai cost-optimization rag fastapi vector-database qdrant semantic-cache llm-caching

Updated Dec 2, 2025
Python

AEndrix03 / Graft

Local-first semantic cache for AI agents. A small C daemon + CLI that remembers what your agent learned across sessions. Plugs into Claude Code, Codex, Gemini CLI, and Claude Desktop / ChatGPT via MCP. No LLM calls, no SaaS, no API key.

c cli caching daemon sqlite mcp embeddings knowledge-graph command-line-tool semantic-search ai-agents local-first llama-cpp semantic-cache llm-agents bge-m3 sqlite-vec agent-memory claude-code

Updated May 20, 2026
C

Das-rebel / a3m-router

RouterArena #1 among known public baselines: 96.77% accuracy, $0.0768/1K, 1.0000 robustness. OpenAI-compatible LLM router across 47+ providers.

Updated Jun 24, 2026
TypeScript

yastman / rag

AI real-estate automation platform: Telegram bot, RAG, apartment search, CRM workflows, voice agent, Langfuse observability, and Dockerized AI runtime.

Updated Jun 26, 2026
Python

jonathanscholtes / LLM-Performance-with-Azure-Cosmos-DB-Semantic-Cache

Enhance LLM retrieval performance with Azure Cosmos DB Semantic Cache. Learn how to integrate and optimize caching strategies in real-world web applications.

vector-search azurecosmosdb semantic-cache

Updated Mar 22, 2024
Python
