Semantic Caching

A short working note from the thinking workspace.

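Semantic caching embeds natural language queries as vectors. If a new query's embedding falls within the similarity threshold of a cached query, we return the cached response and bypass the LLM. But a loose threshold leads to cache collisions, where queries with different intent match the same entry, and serves stale data. Optimize for safety by raising the similarity gate, accepting a lower hit rate in exchange for fewer false hits.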
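A minimal sketch of the similarity gate, to make the collision risk concrete. Everything named here is an assumption for illustration: `toy_embed` is a hypothetical bag-of-words stand-in for a real embedding model, and `SemanticCache`, `threshold`, `get`, and `put` are invented names, not any library's API.

```python
import math
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for a real embedding model: word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    def __init__(self, threshold=0.9, embed=toy_embed):
        self.threshold = threshold  # the "similarity gate"
        self.embed = embed
        self.entries = []  # list of (embedding, response) pairs

    def put(self, query, response):
        """Store a response under the query's embedding."""
        self.entries.append((self.embed(query), response))

    def get(self, query):
        """Return the closest cached response if it clears the gate, else None."""
        q = self.embed(query)
        best_score, best_response = 0.0, None
        for emb, response in self.entries:
            score = cosine(q, emb)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None


# Two queries that share most words but differ in intent.
loose = SemanticCache(threshold=0.5)
loose.put("how do I reset my password", "Use the reset link.")
collision = loose.get("how do I reset my router")  # hit: wrong answer served

strict = SemanticCache(threshold=0.9)
strict.put("how do I reset my password", "Use the reset link.")
miss = strict.get("how do I reset my router")  # None: falls through to the LLM
```

With the loose gate, the router question collides with the password entry; with the stricter gate it misses and would go to the LLM. The toy embedding exaggerates word overlap, so the specific threshold values here say nothing about what to use with a real model.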
