langchain.cache.CassandraSemanticCache

class langchain.cache.CassandraSemanticCache(session: Optional[CassandraSession], keyspace: Optional[str], embedding: Embeddings, table_name: str = 'langchain_llm_semantic_cache', distance_metric: str = 'dot', score_threshold: float = 0.85, ttl_seconds: Optional[int] = None, skip_provisioning: bool = False)[source]

Cache that uses Cassandra as a vector-store backend for semantic (i.e. similarity-based) lookup.

It uses a single (vector) Cassandra table that can hold cached values from several LLMs, so each LLM’s llm_string is part of the rows’ primary key.

The similarity is based on one of several distance metrics (default: “dot”). If you choose another metric, re-tune the score threshold accordingly.
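To make the similarity-based lookup concrete, here is a toy in-memory sketch. It is not the library's implementation: ToySemanticCache, its methods, and the "score at least equal to the threshold" comparison are assumptions made for illustration. The real class embeds the prompt through the embedding provider, whereas this sketch takes vectors directly.

```python
from typing import Dict, List, Optional, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot-product similarity between two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


class ToySemanticCache:
    """Hypothetical in-memory stand-in; not the Cassandra-backed class."""

    def __init__(self, score_threshold: float = 0.85) -> None:
        self.score_threshold = score_threshold
        # Rows are partitioned by llm_string, mirroring its role in the primary key.
        self._rows: Dict[str, List[Tuple[Vector, str]]] = {}

    def update(self, vector: Vector, llm_string: str, value: str) -> None:
        """Store a value under the given vector for one LLM."""
        self._rows.setdefault(llm_string, []).append((vector, value))

    def lookup(self, vector: Vector, llm_string: str) -> Optional[str]:
        """Return the best-scoring value at or above the threshold, else None."""
        best: Optional[Tuple[float, str]] = None
        for stored, value in self._rows.get(llm_string, []):
            score = dot(vector, stored)
            # Assumed cutoff rule: keep only scores >= score_threshold.
            if score >= self.score_threshold and (best is None or score > best[0]):
                best = (score, value)
        return None if best is None else best[1]


cache = ToySemanticCache()
cache.update([1.0, 0.0], "model-a", "cached answer")

hit = cache.lookup([0.9, 0.1], "model-a")       # similar vector, same LLM
miss_far = cache.lookup([0.0, 1.0], "model-a")  # dissimilar vector
miss_llm = cache.lookup([1.0, 0.0], "model-b")  # same vector, different LLM
```

Only the first lookup returns a value: the second falls below the threshold, and the third searches a different llm_string partition.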

Initialize the cache with all relevant parameters.

Parameters
  • session (cassandra.cluster.Session) – an open Cassandra session

  • keyspace (str) – the keyspace to use for storing the cache

  • embedding (Embeddings) – Embedding provider for semantic encoding and search

  • table_name (str) – name of the Cassandra (vector) table to use as cache

  • distance_metric (str, 'dot') – which measure to adopt for similarity searches

  • score_threshold (optional float) – numeric value to use as cutoff for the similarity searches

  • ttl_seconds (optional int) – time-to-live for cache entries (default: None, i.e. forever)

The default score threshold (0.85) is tuned to the default metric ('dot'). If you switch to another distance metric, re-tune the threshold carefully.
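The following sketch illustrates why a threshold does not carry over between metrics. It compares a plain dot product with cosine similarity on the same pair of vectors. The helper names are hypothetical, and which metric names the class accepts besides 'dot' is not stated here, so cosine is used purely as an example of a second metric.

```python
import math
from typing import List


def dot(a: List[float], b: List[float]) -> float:
    """Unbounded dot-product score."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a: List[float], b: List[float]) -> float:
    """Dot product normalized by both vector lengths; always within [-1, 1]."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


stored = [3.0, 4.0]  # length 5
query = [6.0, 8.0]   # same direction, length 10

dot_score = dot(query, stored)     # grows with the vector lengths
cos_score = cosine(query, stored)  # depends on direction only
```

The two scores live on different scales for the same pair of vectors, so a cutoff such as 0.85 selects a different set of matches under each metric.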

Methods

__init__(session, keyspace, embedding[, ...])

Initialize the cache with all relevant parameters.

clear(**kwargs)

Clear the whole semantic cache.

delete_by_document_id(document_id)

Given this is a "similarity search" cache, an invalidation pattern that makes sense is first a lookup to get an ID, and then deleting with that ID.

lookup(prompt, llm_string)

Look up based on prompt and llm_string.

lookup_with_id(prompt, llm_string)

Look up based on prompt and llm_string; if there are hits, return (document_id, cached_entry).

lookup_with_id_through_llm(prompt, llm[, stop])

update(prompt, llm_string, return_val)

Update cache based on prompt and llm_string.

__init__(session: Optional[CassandraSession], keyspace: Optional[str], embedding: Embeddings, table_name: str = 'langchain_llm_semantic_cache', distance_metric: str = 'dot', score_threshold: float = 0.85, ttl_seconds: Optional[int] = None, skip_provisioning: bool = False)[source]

Initialize the cache with all relevant parameters.

Parameters
  • session (cassandra.cluster.Session) – an open Cassandra session

  • keyspace (str) – the keyspace to use for storing the cache

  • embedding (Embeddings) – Embedding provider for semantic encoding and search

  • table_name (str) – name of the Cassandra (vector) table to use as cache

  • distance_metric (str, 'dot') – which measure to adopt for similarity searches

  • score_threshold (optional float) – numeric value to use as cutoff for the similarity searches

  • ttl_seconds (optional int) – time-to-live for cache entries (default: None, i.e. forever)

The default score threshold (0.85) is tuned to the default metric ('dot'). If you switch to another distance metric, re-tune the threshold carefully.
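A minimal construction sketch, using only the keyword arguments shown in the signature above. The helper name build_semantic_cache and the one-hour TTL are assumptions for illustration. The caller is expected to supply an already open Cassandra session and an embedding provider.

```python
from typing import Any, Optional


def build_semantic_cache(
    session: Any,
    keyspace: str,
    embedding: Any,
    ttl_seconds: Optional[int] = 3600,
) -> Any:
    """Hypothetical helper: build a cache with the documented defaults spelled out."""
    # Imported lazily so this module still loads where langchain is not installed.
    from langchain.cache import CassandraSemanticCache

    return CassandraSemanticCache(
        session=session,            # an open cassandra.cluster.Session
        keyspace=keyspace,          # keyspace that stores the cache table
        embedding=embedding,        # provider used for semantic encoding and search
        table_name="langchain_llm_semantic_cache",
        distance_metric="dot",      # default metric
        score_threshold=0.85,       # default cutoff, tuned to 'dot'
        ttl_seconds=ttl_seconds,    # None would keep entries forever
    )
```

Spelling out the defaults makes it obvious which values need revisiting if distance_metric is changed later.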

clear(**kwargs: Any) → None[source]

Clear the whole semantic cache.

delete_by_document_id(document_id: str) → None[source]

Given this is a “similarity search” cache, an invalidation pattern that makes sense is first a lookup to get an ID, and then deleting with that ID. This is for the second step.
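The two-step invalidation pattern described above can be sketched as follows. The invalidate helper and the FakeCache stand-in are hypothetical. Only lookup_with_id and delete_by_document_id come from this class, and the stand-in uses exact matching rather than similarity search so that the example runs without a cluster.

```python
from typing import Any, Dict, Optional, Tuple


def invalidate(cache: Any, prompt: str, llm_string: str) -> bool:
    """Look up the matching entry's ID, then delete by that ID.

    Returns True if an entry was found and deleted, False otherwise.
    """
    found = cache.lookup_with_id(prompt, llm_string)
    if found is None:
        return False
    document_id, _cached_entry = found
    cache.delete_by_document_id(document_id)
    return True


class FakeCache:
    """Hypothetical exact-match stand-in exposing the same two methods."""

    def __init__(self) -> None:
        # document_id -> (prompt, llm_string, cached value)
        self.rows: Dict[str, Tuple[str, str, str]] = {
            "doc-1": ("hello", "model-a", "hi there"),
        }

    def lookup_with_id(self, prompt: str, llm_string: str) -> Optional[Tuple[str, str]]:
        for doc_id, (stored_prompt, stored_llm, value) in self.rows.items():
            if stored_prompt == prompt and stored_llm == llm_string:
                return doc_id, value
        return None

    def delete_by_document_id(self, document_id: str) -> None:
        self.rows.pop(document_id, None)


fake = FakeCache()
first = invalidate(fake, "hello", "model-a")   # entry found and deleted
second = invalidate(fake, "hello", "model-a")  # nothing left to delete
```

With the real cache, the lookup step matches by similarity, so the deleted entry may belong to a prompt that is close to, but not identical with, the one passed in.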

lookup(prompt: str, llm_string: str) → Optional[Sequence[Generation]][source]

Look up based on prompt and llm_string.

lookup_with_id(prompt: str, llm_string: str) → Optional[Tuple[str, Sequence[Generation]]][source]

Look up based on prompt and llm_string. If there are hits, return (document_id, cached_entry).

lookup_with_id_through_llm(prompt: str, llm: LLM, stop: Optional[List[str]] = None) → Optional[Tuple[str, Sequence[Generation]]][source]

update(prompt: str, llm_string: str, return_val: Sequence[Generation]) → None[source]

Update cache based on prompt and llm_string.