Document Search
ragbits.document_search.DocumentSearch
DocumentSearch(embedder: Embeddings, vector_store: VectorStore, query_rephraser: QueryRephraser | None = None, reranker: Reranker | None = None, document_processor_router: DocumentProcessorRouter | None = None, processing_strategy: ProcessingExecutionStrategy | None = None)
The main entry point to the DocumentSearch functionality.
It provides methods for both ingestion and retrieval.
Retrieval:
1. Uses QueryRephraser to rephrase the query.
2. Uses VectorStore to retrieve the most relevant chunks.
3. Uses Reranker to rerank the chunks.
Source code in packages/ragbits-document-search/src/ragbits/document_search/_main.py
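The three retrieval steps can be sketched as a small async pipeline. This is only an illustration under stated assumptions: `Chunk`, `LowercaseRephraser`, `KeywordStore`, `ShortestFirstReranker`, and their method names are hypothetical stand-ins written for this example, not the ragbits classes or their interfaces.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical chunk type used only in this sketch."""
    text: str
    score: float = 0.0


class LowercaseRephraser:
    """Hypothetical stand-in for a QueryRephraser: normalizes the query."""

    async def rephrase(self, query: str) -> str:
        return query.lower().strip()


class KeywordStore:
    """Hypothetical stand-in for a VectorStore: scores texts by shared words."""

    def __init__(self, texts: list[str]) -> None:
        self.texts = texts

    async def retrieve(self, query: str, k: int = 3) -> list[Chunk]:
        words = set(query.split())
        scored = [Chunk(t, float(len(words & set(t.lower().split())))) for t in self.texts]
        hits = [c for c in scored if c.score > 0]
        return sorted(hits, key=lambda c: c.score, reverse=True)[:k]


class ShortestFirstReranker:
    """Hypothetical stand-in for a Reranker: prefers shorter chunks."""

    async def rerank(self, chunks: list[Chunk]) -> list[Chunk]:
        return sorted(chunks, key=lambda c: len(c.text))


async def search(query: str, rephraser, store, reranker) -> list[Chunk]:
    rephrased = await rephraser.rephrase(query)   # step 1: rephrase the query
    candidates = await store.retrieve(rephrased)  # step 2: retrieve candidate chunks
    return await reranker.rerank(candidates)      # step 3: rerank the candidates


store = KeywordStore([
    "Ragbits ingests documents into a vector store",
    "Vector store retrieval",
    "Unrelated note about cooking",
])
results = asyncio.run(
    search("  Vector STORE ", LowercaseRephraser(), store, ShortestFirstReranker())
)
print([c.text for c in results])
```

Each stage only sees the output of the previous one, which is why any of the three components can be swapped without touching the others.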
query_rephraser
instance-attribute
query_rephraser: QueryRephraser = query_rephraser or NoopQueryRephraser()
document_processor_router
instance-attribute
document_processor_router: DocumentProcessorRouter = document_processor_router or from_config()
processing_strategy
instance-attribute
processing_strategy: ProcessingExecutionStrategy = processing_strategy or SequentialProcessing()
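The attribute definitions above all use the `value or Default()` idiom to fall back to a default component when the argument is omitted. A minimal sketch of that idiom follows; `NoopRephraser`, `UppercaseRephraser`, and `Search` are hypothetical names invented for this example, not ragbits classes.

```python
class NoopRephraser:
    """Hypothetical no-op default: returns the query unchanged."""

    def rephrase(self, query: str) -> str:
        return query


class UppercaseRephraser:
    """Hypothetical custom component."""

    def rephrase(self, query: str) -> str:
        return query.upper()


class Search:
    def __init__(self, rephraser=None) -> None:
        # Same shape as the attribute definitions above:
        # use the caller's component, or build the default when it is missing.
        self.rephraser = rephraser or NoopRephraser()


default_search = Search()
custom_search = Search(UppercaseRephraser())

print(type(default_search.rephraser).__name__)
print(custom_search.rephraser.rephrase("hello"))
```

Note that `or` falls back on any falsy value, not only `None`, so a component whose `__bool__` or `__len__` makes it falsy would be silently replaced by the default.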
from_config
classmethod
from_config(config: dict) -> DocumentSearch
Creates and returns an instance of the DocumentSearch class from the given configuration.
PARAMETER | DESCRIPTION |
---|---|
config | A dictionary containing the configuration for initializing the DocumentSearch instance. TYPE: dict |

RETURNS | DESCRIPTION |
---|---|
DocumentSearch | An initialized instance of the DocumentSearch class. TYPE: DocumentSearch |
Source code in packages/ragbits-document-search/src/ragbits/document_search/_main.py
search
async
search(query: str, config: SearchConfig | None = None) -> Sequence[Element]
Search for the most relevant chunks for a query.
PARAMETER | DESCRIPTION |
---|---|
query | The query to search for. TYPE: str |
config | The search configuration. TYPE: SearchConfig \| None |

RETURNS | DESCRIPTION |
---|---|
Sequence[Element] | A list of chunks. |
Source code in packages/ragbits-document-search/src/ragbits/document_search/_main.py
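Because `search` is a coroutine, it has to be awaited, or driven with `asyncio.run` from synchronous code. The sketch below shows that calling pattern against a stub; `SupportsSearch`, `StubSearch`, and `first_hits` are hypothetical helpers for this example, and the stub returns plain strings where a real DocumentSearch would return `Element` objects.

```python
import asyncio
from collections.abc import Sequence
from typing import Protocol


class SupportsSearch(Protocol):
    """Anything exposing an async search(query) method (assumption for this sketch)."""

    async def search(self, query: str) -> Sequence[str]: ...


class StubSearch:
    """Hypothetical stand-in: substring match over in-memory strings."""

    def __init__(self, chunks: list[str]) -> None:
        self.chunks = chunks

    async def search(self, query: str) -> Sequence[str]:
        return [c for c in self.chunks if query.lower() in c.lower()]


async def first_hits(searcher: SupportsSearch, query: str, limit: int = 2) -> list[str]:
    results = await searcher.search(query)  # coroutine: must be awaited
    return list(results)[:limit]            # the result is a sequence, so it can be sliced


stub = StubSearch(["Alpha doc", "Beta doc", "alpha notes", "Alphabet"])
hits = asyncio.run(first_hits(stub, "alpha"))
print(hits)
```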
ingest
async
ingest(documents: Sequence[DocumentMeta | Document | Source], document_processor: BaseProvider | None = None) -> None
Ingest multiple documents.
PARAMETER | DESCRIPTION |
---|---|
documents | The documents or metadata of the documents to ingest. |
document_processor | The document processor to use. If not provided, the document processor will be determined based on the document metadata. TYPE: BaseProvider \| None |
Source code in packages/ragbits-document-search/src/ragbits/document_search/_main.py
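The fallback described for `document_processor` (use the explicit processor if given, otherwise pick one from the document metadata) can be sketched as follows. Everything here is an assumption made for illustration: `DocMeta`, `Router`, the two provider classes, and their method names are hypothetical, and the real implementation also involves the configured processing strategy.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class DocMeta:
    """Hypothetical stand-in for document metadata: a type tag plus content."""
    doc_type: str
    content: str


class SplitLinesProvider:
    """Hypothetical processor: one element per line."""

    async def process(self, meta: DocMeta) -> list[str]:
        return meta.content.splitlines()


class WholeTextProvider:
    """Hypothetical processor: the whole document as one element."""

    async def process(self, meta: DocMeta) -> list[str]:
        return [meta.content]


class Router:
    """Hypothetical router: picks a processor from the document type."""

    def __init__(self, providers: dict) -> None:
        self.providers = providers

    def get_provider(self, meta: DocMeta):
        return self.providers[meta.doc_type]


async def ingest(documents, router, store: list[str], document_processor=None) -> None:
    for meta in documents:
        # An explicit processor wins; otherwise route on the metadata.
        provider = document_processor or router.get_provider(meta)
        store.extend(await provider.process(meta))


router = Router({"txt": SplitLinesProvider(), "md": WholeTextProvider()})
docs = [DocMeta("txt", "a\nb"), DocMeta("md", "# Title\nbody")]

routed: list[str] = []
asyncio.run(ingest(docs, router, routed))

forced: list[str] = []
asyncio.run(ingest(docs, router, forced, document_processor=WholeTextProvider()))

print(routed)
print(forced)
```

Passing a processor explicitly applies it to every document in the batch, so mixed batches are usually better served by the metadata-based routing.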
insert_elements
async
insert_elements(elements: list[Element]) -> None
Insert Elements into the vector store.
PARAMETER | DESCRIPTION |
---|---|
elements | The list of Elements to insert. TYPE: list[Element] |
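Since the class is constructed with both an embedder and a vector store, inserting elements plausibly amounts to embedding them and storing the results. The sketch below illustrates that flow under this assumption only; `LengthEmbedder`, `InMemoryStore`, and their method names are hypothetical stand-ins, and plain strings take the place of `Element` objects.

```python
import asyncio


class LengthEmbedder:
    """Hypothetical embedder: maps text to [length, vowel count]."""

    async def embed_text(self, texts: list[str]) -> list[list[float]]:
        return [
            [float(len(t)), float(sum(ch in "aeiou" for ch in t.lower()))]
            for t in texts
        ]


class InMemoryStore:
    """Hypothetical vector store: keeps (text, vector) pairs in a list."""

    def __init__(self) -> None:
        self.entries: list[tuple[str, list[float]]] = []

    async def store(self, entries: list[tuple[str, list[float]]]) -> None:
        self.entries.extend(entries)


async def insert_elements(elements: list[str], embedder, store) -> None:
    vectors = await embedder.embed_text(elements)    # embed all elements in one batch
    await store.store(list(zip(elements, vectors)))  # persist each element with its vector


memory = InMemoryStore()
asyncio.run(insert_elements(["hello", "sky"], LengthEmbedder(), memory))
print(memory.entries)
```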