Document Search
ragbits.document_search.DocumentSearch
DocumentSearch(embedder: Embeddings, vector_store: VectorStore, query_rephraser: QueryRephraser | None = None, reranker: Reranker | None = None, document_processor_router: DocumentProcessorRouter | None = None, processing_strategy: ProcessingExecutionStrategy | None = None)
Bases: WithConstructionConfig
The main entry point to the DocumentSearch functionality.
It provides methods for both ingestion and retrieval.
Retrieval:
1. Uses QueryRephraser to rephrase the query.
2. Uses VectorStore to retrieve the most relevant chunks.
3. Uses Reranker to rerank the chunks.
Source code in packages/ragbits-document-search/src/ragbits/document_search/_main.py
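The three retrieval steps listed above can be sketched as a small async pipeline. This is a minimal illustration, not the ragbits implementation: `Chunk`, `NormalizingRephraser`, `KeywordStore`, `ShortestFirstReranker`, and `search_sketch` are hypothetical stand-ins that use no ragbits classes, and the scoring rules are deliberately naive.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical stand-in for a retrieved element."""
    text: str
    score: float = 0.0


class NormalizingRephraser:
    """Stand-in rephraser: lowercases the query and collapses whitespace."""

    async def rephrase(self, query: str) -> str:
        return " ".join(query.lower().split())


class KeywordStore:
    """Stand-in store: scores each text by the number of words shared with the query."""

    def __init__(self, texts: list[str]) -> None:
        self._texts = list(texts)

    async def retrieve(self, query: str, k: int = 3) -> list[Chunk]:
        words = set(query.split())
        scored = [
            Chunk(text, float(len(words & set(text.lower().split()))))
            for text in self._texts
        ]
        hits = [chunk for chunk in scored if chunk.score > 0]
        return sorted(hits, key=lambda chunk: chunk.score, reverse=True)[:k]


class ShortestFirstReranker:
    """Stand-in reranker: orders the retrieved chunks by text length."""

    async def rerank(self, chunks: list[Chunk], query: str) -> list[Chunk]:
        return sorted(chunks, key=lambda chunk: len(chunk.text))


async def search_sketch(query, rephraser, store, reranker) -> list[Chunk]:
    rephrased = await rephraser.rephrase(query)          # 1. rephrase the query
    candidates = await store.retrieve(rephrased)         # 2. retrieve relevant chunks
    return await reranker.rerank(candidates, rephrased)  # 3. rerank the chunks


store = KeywordStore([
    "Ragbits ingests documents into a vector store",
    "Vector store retrieval returns chunks",
    "Unrelated text about cooking",
])
results = asyncio.run(
    search_sketch("  Vector   STORE ", NormalizingRephraser(), store, ShortestFirstReranker())
)
```

Each stage only depends on the output of the previous one, which is why any of the three components can be swapped independently.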
configuration_key
class-attribute
instance-attribute
query_rephraser
instance-attribute
query_rephraser: QueryRephraser = query_rephraser or NoopQueryRephraser()
document_processor_router
instance-attribute
document_processor_router: DocumentProcessorRouter = document_processor_router or from_config()
processing_strategy
instance-attribute
processing_strategy: ProcessingExecutionStrategy = processing_strategy or SequentialProcessing()
subclass_from_config
classmethod
Initializes the class with the provided configuration. May return a subclass of the class, if requested by the configuration.
PARAMETER | DESCRIPTION |
---|---|
config | A model containing configuration details for the class. |
RETURNS | DESCRIPTION |
---|---|
Self | An instance of the class initialized with the provided configuration. |
RAISES | DESCRIPTION |
---|---|
InvalidConfigError | The class can't be found or is not a subclass of the current class. |
Source code in packages/ragbits-core/src/ragbits/core/utils/config_handling.py
subclass_from_factory
classmethod
Creates the class using the provided factory function. May return a subclass of the class, if requested by the factory.
PARAMETER | DESCRIPTION |
---|---|
factory_path | A string representing the path to the factory function in the format of "module.submodule:factory_name". |
RETURNS | DESCRIPTION |
---|---|
Self | An instance of the class initialized with the provided factory function. |
RAISES | DESCRIPTION |
---|---|
InvalidConfigError | The factory can't be found or the object returned is not a subclass of the current class. |
Source code in packages/ragbits-core/src/ragbits/core/utils/config_handling.py
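To make the "module.submodule:factory_name" format concrete, here is a minimal sketch of how such a path could be resolved with the standard library. `load_factory` is a hypothetical helper, not the ragbits implementation; it raises `ValueError` and `TypeError` as an assumption, whereas the library documents `InvalidConfigError`.

```python
import importlib
from typing import Any, Callable


def load_factory(factory_path: str) -> Callable[..., Any]:
    """Resolve a "module.submodule:factory_name" string to a callable."""
    module_name, separator, attribute_name = factory_path.partition(":")
    if not separator or not module_name or not attribute_name:
        raise ValueError(
            f"Expected 'module.submodule:factory_name', got {factory_path!r}"
        )
    module = importlib.import_module(module_name)
    factory = getattr(module, attribute_name)
    if not callable(factory):
        raise TypeError(f"{factory_path!r} does not point to a callable")
    return factory


# A standard-library function is used here only to demonstrate the path format.
dumps = load_factory("json:dumps")
serialized = dumps([1, 2])
```

The part before the colon is an importable module path and the part after it is the attribute to call, so a path written with dots only (for example "json.dumps") is rejected by this sketch.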
from_config
classmethod
Creates and returns an instance of the DocumentSearch class from the given configuration.
PARAMETER | DESCRIPTION |
---|---|
config | A configuration object containing the configuration for initializing the DocumentSearch instance. |
RETURNS | DESCRIPTION |
---|---|
DocumentSearch | An initialized instance of the DocumentSearch class. |
RAISES | DESCRIPTION |
---|---|
ValidationError | If the configuration doesn't follow the expected format. |
InvalidConfigError | If one of the specified classes can't be found or is not the correct type. |
Source code in packages/ragbits-document-search/src/ragbits/document_search/_main.py
subclass_from_defaults
classmethod
subclass_from_defaults(defaults: CoreConfig, factory_path_override: str | None = None, yaml_path_override: Path | None = None) -> Self
Tries to create an instance from the default configuration file and the default factory function. Optional overrides can be passed for both; an override takes precedence over the corresponding default.
PARAMETER | DESCRIPTION |
---|---|
defaults | The CoreConfig instance containing default factory and configuration details. |
factory_path_override | A string representing the path to the factory function in the format of "module.submodule:factory_name". |
yaml_path_override | The path to the YAML file containing the Ragstack instance configuration. The configuration is looked up under the key "document_search"; if the key is not found, the class is instantiated with the default configuration for each component. |
RAISES | DESCRIPTION |
---|---|
InvalidConfigError | If the default factory or configuration can't be found. |
Source code in packages/ragbits-document-search/src/ragbits/document_search/_main.py
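The two rules described for this method (an override wins over its default, and a missing "document_search" key falls back to component defaults) can be sketched in plain Python. `resolve_with_override` and `document_search_section` are hypothetical helpers, the paths and factory name are made up for illustration, and the sketch makes no claim about how the library orders the factory against the YAML file.

```python
from pathlib import Path
from typing import Any, Optional


def resolve_with_override(default: Optional[Any], override: Optional[Any]) -> Optional[Any]:
    """Return the override when one is given, otherwise the default."""
    return override if override is not None else default


def document_search_section(parsed_yaml: dict) -> dict:
    """Return the "document_search" section of an already parsed YAML mapping.

    An empty dict is returned when the key is missing, standing in for
    "use the default configuration for each component".
    """
    return parsed_yaml.get("document_search") or {}


# Hypothetical values, used only to exercise the helpers.
chosen_yaml = resolve_with_override(Path("defaults.yaml"), Path("override.yaml"))
chosen_factory = resolve_with_override("my_project.factories:default_search", None)
missing_section = document_search_section({"other_key": {"x": 1}})
present_section = document_search_section({"document_search": {"reranker": {}}})
```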
search
async
search(query: str, config: SearchConfig | None = None) -> Sequence[Element]
Search for the most relevant chunks for a query.
PARAMETER | DESCRIPTION |
---|---|
query | The query to search for. |
config | The search configuration. |
RETURNS | DESCRIPTION |
---|---|
Sequence[Element] | A list of chunks. |
Source code in packages/ragbits-document-search/src/ragbits/document_search/_main.py
ingest
async
ingest(documents: str | Sequence[DocumentMeta | Document | Source], document_processor: BaseProvider | None = None) -> None
Ingest documents into the search index.
PARAMETER | DESCRIPTION |
---|---|
documents | Either a sequence of DocumentMeta, Document, or Source objects, or a str. |
document_processor | The document processor to use. If not provided, the document processor is determined from the document metadata. |
Source code in packages/ragbits-document-search/src/ragbits/document_search/_main.py
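The fallback described for `document_processor` (use the explicit processor when one is passed, otherwise pick one from the document metadata) can be sketched as follows. Everything here is a hypothetical stand-in: `DocMeta`, `ROUTER`, `choose_processor`, and the two splitting functions are not ragbits names, and real processors do far more than split text.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Processor = Callable[[str], list[str]]


@dataclass
class DocMeta:
    """Hypothetical metadata: a document type plus the raw content."""
    document_type: str
    content: str


def split_lines(content: str) -> list[str]:
    """Treat every non-empty line as one element."""
    return [line for line in content.splitlines() if line.strip()]


def split_paragraphs(content: str) -> list[str]:
    """Treat every blank-line-separated paragraph as one element."""
    return [part.strip() for part in content.split("\n\n") if part.strip()]


# Hypothetical routing table from document type to processor.
ROUTER: dict[str, Processor] = {"txt": split_lines, "md": split_paragraphs}


def choose_processor(meta: DocMeta, document_processor: Optional[Processor] = None) -> Processor:
    """Prefer the explicit processor; otherwise route by the document metadata."""
    if document_processor is not None:
        return document_processor
    return ROUTER[meta.document_type]


doc = DocMeta("md", "Line one\nLine two\n\nSecond paragraph.")
routed = choose_processor(doc)(doc.content)               # chosen from metadata
forced = choose_processor(doc, split_lines)(doc.content)  # explicit override
```

Passing a processor explicitly bypasses the routing table entirely, which is useful when a single batch should be handled differently from the default for its type.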
insert_elements
async
insert_elements(elements: list[Element]) -> None
Insert Elements into the vector store.
PARAMETER | DESCRIPTION |
---|---|
elements | The list of Elements to insert. |