Azure Cosmos DB for MongoDB vCore vector store. To use this, you should have both:

  • the mongodb NPM package installed
  • a connection string associated with a MongoDB vCore cluster

You do not need to create a database or collection; they are created automatically.

You do, however, need to create an index on the collection, which can be done using the createIndex method.

Properties

FilterType: string | object
embeddingKey: string
indexName: string
textKey: string

Methods

  • Closes any newly instantiated Azure Cosmos DB client. If the client was passed in the constructor, it will not be closed.

    Returns Promise<void>

    A promise that resolves when any newly instantiated Azure Cosmos DB client has been closed.

  • Creates an index on the collection, using the index name specified during instance construction.

    Setting the numLists parameter correctly is important for achieving good accuracy and performance. Since the vector store uses IVF as the indexing strategy, you should create the index only after you have loaded a large enough sample of documents to ensure that the centroids for the respective buckets are fairly distributed.

    We recommend that numLists is set to documentCount/1000 for up to 1 million documents and to sqrt(documentCount) for more than 1 million documents. As the number of items in your database grows, you should tune numLists to be larger in order to achieve good latency performance for vector search.

    If you're experimenting with a new scenario or creating a small demo, you can start with numLists set to 1 to perform a brute-force search across all vectors. This should give you the most accurate vector search results, but searches will be slow. After your initial setup, tune the numLists parameter using the guidance above.
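The sizing guidance above can be sketched as a small helper. This is a minimal illustration, not part of the library: the function name recommendedNumLists is hypothetical, and rounding to an integer with a floor of 1 is an assumption made here so that small collections fall back to the brute-force setting.

```typescript
// Hypothetical helper (not part of the library) that applies the
// numLists sizing guidance described above.
function recommendedNumLists(documentCount: number): number {
  if (!Number.isFinite(documentCount) || documentCount < 0) {
    throw new RangeError("documentCount must be a non-negative finite number");
  }
  // documentCount / 1000 up to 1 million documents, sqrt(documentCount) beyond.
  const raw =
    documentCount <= 1000000
      ? documentCount / 1000
      : Math.sqrt(documentCount);
  // Assumption: round to an integer and never go below 1 (brute-force search).
  return Math.max(1, Math.round(raw));
}
```

For example, a collection of 250,000 documents would map to numLists = 250, and the result could then be passed as the first argument to createIndex.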

    Parameters

    • numLists: number = 100

      This integer is the number of clusters that the inverted file (IVF) index uses to group the vector data. We recommend that numLists is set to documentCount/1000 for up to 1 million documents and to sqrt(documentCount) for more than 1 million documents. Using a numLists value of 1 is akin to performing a brute-force search, which has limited performance.

    • dimensions: number = 1536

      Number of dimensions for vector similarity. The maximum number of supported dimensions is 2000.

    • similarity: AzureCosmosDBSimilarityType = AzureCosmosDBSimilarityType.COS

      Similarity metric to use with the IVF index. Possible options are:

      • AzureCosmosDBSimilarityType.COS (cosine distance)
      • AzureCosmosDBSimilarityType.L2 (Euclidean distance)
      • AzureCosmosDBSimilarityType.IP (inner product)

    Returns Promise<void>

    A promise that resolves when the index has been created.
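To make the three similarity options listed for the similarity parameter concrete, the sketch below writes out the underlying formulas on plain number arrays. The function names are hypothetical and the code is only illustrative: the actual computation happens inside the index, and this is not how the library or the service implements it. Cosine is shown as a similarity; the corresponding cosine distance is 1 minus that value.

```typescript
// Illustrative definitions only; the index evaluates these server-side.
function dot(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new RangeError("vectors must have the same length");
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

// IP: inner (dot) product of the two vectors.
function innerProduct(a: number[], b: number[]): number {
  return dot(a, b);
}

// L2: Euclidean distance, the square root of the summed squared differences.
function euclideanDistance(a: number[], b: number[]): number {
  const diff = a.map((x, i) => x - b[i]);
  if (a.length !== b.length) {
    throw new RangeError("vectors must have the same length");
  }
  return Math.sqrt(dot(diff, diff));
}

// COS: cosine similarity, the inner product divided by both vector norms.
function cosineSimilarity(a: number[], b: number[]): number {
  const normA = Math.sqrt(dot(a, a));
  const normB = Math.sqrt(dot(b, b));
  if (normA === 0 || normB === 0) {
    throw new RangeError("cosine similarity is undefined for a zero vector");
  }
  return dot(a, b) / (normA * normB);
}
```

Cosine ignores vector length and compares direction only, whereas inner product and Euclidean distance are both sensitive to magnitude.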

  • Removes specified documents from the AzureCosmosDBVectorStore.

    Parameters

    • Optional ids: string[]

      IDs of the documents to be removed. If no IDs are specified, all documents will be removed.

    Returns Promise<void>

    A promise that resolves when the documents have been removed.

  • Parameters

    • query: string
    • Optional k: number
    • Optional filter: string | object
    • Optional _callbacks: Callbacks

    Returns Promise<DocumentInterface<Record<string, any>>[]>

  • Method that performs a similarity search on the vectors stored in the collection. It returns a list of documents and their corresponding similarity scores.

    Parameters

    • queryVector: number[]

      Query vector for the similarity search.

    • k: number = 4

    Returns Promise<[Document<Record<string, any>>, number][]>

    Promise that resolves to a list of documents and their corresponding similarity scores.
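The result of this method is a list of [document, score] tuples, which can be post-processed in plain TypeScript. The sketch below uses a minimal stand-in for the Document type and a hypothetical helper, keepAbove, that drops low-scoring results and orders the rest best first. It assumes that a higher score means a closer match; if the metric you chose yields a distance instead, the comparison and ordering would need to be reversed.

```typescript
// Minimal stand-in for the library's Document type (assumption for this sketch).
interface Doc {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Shape of one search result: a document paired with its similarity score.
type ScoredDoc = [Doc, number];

// Hypothetical helper: keeps results scoring at least minScore, highest first.
// Assumes a higher score means a closer match.
function keepAbove(results: ScoredDoc[], minScore: number): Doc[] {
  return results
    .filter(([, score]) => score >= minScore) // filter returns a new array
    .sort(([, a], [, b]) => b - a)            // so sorting does not mutate the input
    .map(([doc]) => doc);
}
```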

  • Parameters

    • query: string
    • Optional k: number
    • Optional filter: string | object
    • Optional _callbacks: Callbacks

    Returns Promise<[DocumentInterface<Record<string, any>>, number][]>
