Document Search
ragbits.document_search.DocumentSearch
DocumentSearch(vector_store: VectorStore, query_rephraser: QueryRephraser | None = None, reranker: Reranker | None = None, ingest_strategy: IngestStrategy | None = None, parser_router: DocumentParserRouter | None = None, enricher_router: ElementEnricherRouter | None = None)
Bases: WithConstructionConfig
The main entry point to the DocumentSearch functionality.
It provides methods for both ingestion and retrieval.
Retrieval:
1. Uses QueryRephraser to rephrase the query.
2. Uses VectorStore to retrieve the most relevant chunks.
3. Uses Reranker to rerank the chunks.
Source code in packages/ragbits-document-search/src/ragbits/document_search/_main.py
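The three retrieval steps listed above can be sketched as a small async pipeline. This is a minimal illustration only: `Chunk`, `LowercaseRephraser`, `KeywordStore`, `ScoreReranker`, and their method names are hypothetical stand-ins written for this example, not the ragbits classes or their real interfaces.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical retrieved chunk with a relevance score."""
    text: str
    score: float


class LowercaseRephraser:
    """Hypothetical stand-in for a QueryRephraser."""

    async def rephrase(self, query: str) -> list:
        # Normalize the query; a real rephraser might return several variants.
        return [query.strip().lower()]


class KeywordStore:
    """Hypothetical stand-in for a VectorStore: scores by shared words."""

    def __init__(self, texts: list) -> None:
        self.texts = texts

    async def retrieve(self, query: str) -> list:
        words = set(query.split())
        chunks = [
            Chunk(text, float(len(words & set(text.lower().split()))))
            for text in self.texts
        ]
        # Keep only chunks that share at least one word with the query.
        return [chunk for chunk in chunks if chunk.score > 0]


class ScoreReranker:
    """Hypothetical stand-in for a Reranker: best score first."""

    async def rerank(self, chunks: list) -> list:
        return sorted(chunks, key=lambda chunk: chunk.score, reverse=True)


async def search(query, rephraser, store, reranker):
    queries = await rephraser.rephrase(query)      # 1. rephrase the query
    chunks = []
    for rephrased in queries:                      # 2. retrieve candidate chunks
        chunks.extend(await store.retrieve(rephrased))
    return await reranker.rerank(chunks)           # 3. rerank the chunks


def run_demo():
    store = KeywordStore([
        "Paris is the capital of France",
        "Berlin is in Germany",
        "France borders Spain",
    ])
    return asyncio.run(
        search("  Capital of FRANCE ", LowercaseRephraser(), store, ScoreReranker())
    )
```

Calling `run_demo()` drives the coroutine with `asyncio.run`, which is one way to call an async method from synchronous code.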
query_rephraser
instance-attribute
query_rephraser: QueryRephraser = query_rephraser or NoopQueryRephraser()
ingest_strategy
instance-attribute
ingest_strategy: IngestStrategy = ingest_strategy or SequentialIngestStrategy()
parser_router
instance-attribute
parser_router: DocumentParserRouter = parser_router or DocumentParserRouter()
enricher_router
instance-attribute
enricher_router: ElementEnricherRouter = enricher_router or ElementEnricherRouter()
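The attributes above all follow the `argument or Default()` pattern. A minimal sketch of that pattern, using hypothetical classes (`NoopRephraser`, `UppercaseRephraser`, `Search`) that are not part of ragbits:

```python
class NoopRephraser:
    """Hypothetical default component: returns the query unchanged."""

    def rephrase(self, query: str) -> str:
        return query


class UppercaseRephraser:
    """Hypothetical custom component."""

    def rephrase(self, query: str) -> str:
        return query.upper()


class Search:
    def __init__(self, rephraser=None):
        # Same shape as `query_rephraser or NoopQueryRephraser()`:
        # fall back to the default when no component is passed.
        self.rephraser = rephraser or NoopRephraser()
```

Note that `or` falls back on any falsy argument, not only `None`, so a custom component that defines `__bool__` or `__len__` and evaluates as falsy would be replaced by the default.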
subclass_from_config
classmethod
Initializes the class with the provided configuration. May return a subclass of the class, if requested by the configuration.
PARAMETER | DESCRIPTION |
---|---|
config | A model containing configuration details for the class. |
RETURNS | DESCRIPTION |
---|---|
Self | An instance of the class initialized with the provided configuration. |
RAISES | DESCRIPTION |
---|---|
InvalidConfigError | The class can't be found or is not a subclass of the current class. |
Source code in packages/ragbits-core/src/ragbits/core/utils/config_handling.py
subclass_from_factory
classmethod
Creates the class using the provided factory function. May return a subclass of the class, if requested by the factory.
PARAMETER | DESCRIPTION |
---|---|
factory_path | A string representing the path to the factory function in the format of "module.submodule:factory_name". |
RETURNS | DESCRIPTION |
---|---|
Self | An instance of the class initialized with the provided factory function. |
RAISES | DESCRIPTION |
---|---|
InvalidConfigError | The factory can't be found or the object returned is not a subclass of the current class. |
Source code in packages/ragbits-core/src/ragbits/core/utils/config_handling.py
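One plausible way to resolve a "module.submodule:factory_name" path is with the standard library's `importlib`. The sketch below is an assumption about the general technique, not the ragbits implementation: `load_factory` and `create_from_factory` are hypothetical helpers, and they raise built-in exceptions rather than `InvalidConfigError`.

```python
from importlib import import_module
from typing import Any, Callable


def load_factory(factory_path: str) -> Callable[..., Any]:
    """Hypothetical helper: import the callable named by 'module:attribute'."""
    module_name, sep, attr = factory_path.partition(":")
    if not sep or not module_name or not attr:
        raise ValueError(
            f"expected 'module.submodule:factory_name', got {factory_path!r}"
        )
    factory = getattr(import_module(module_name), attr)
    if not callable(factory):
        raise TypeError(f"{factory_path!r} does not name a callable")
    return factory


def create_from_factory(factory_path: str, expected_type: type) -> Any:
    """Hypothetical helper: call the factory and check the result's type."""
    obj = load_factory(factory_path)()
    if not isinstance(obj, expected_type):
        raise TypeError(
            f"factory returned {type(obj).__name__}, expected {expected_type.__name__}"
        )
    return obj


# Demo with a standard-library callable standing in for a project factory.
ordered = create_from_factory("collections:OrderedDict", dict)
```

The type check after the call mirrors the documented failure case in which the returned object is not a subclass of the current class.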
from_config
classmethod
Creates and returns an instance of the DocumentSearch class from the given configuration.
PARAMETER | DESCRIPTION |
---|---|
config | A configuration object containing the configuration for initializing the DocumentSearch instance. |
RETURNS | DESCRIPTION |
---|---|
DocumentSearch | An initialized instance of the DocumentSearch class. |
RAISES | DESCRIPTION |
---|---|
ValidationError | If the configuration doesn't follow the expected format. |
InvalidConfigError | If one of the specified classes can't be found or is not the correct type. |
Source code in packages/ragbits-document-search/src/ragbits/document_search/_main.py
preferred_subclass
classmethod
preferred_subclass(config: CoreConfig, factory_path_override: str | None = None, yaml_path_override: Path | None = None) -> Self
Tries to create an instance by looking at the project's component preferences, either from YAML or from the factory. Takes optional overrides for both, which take higher precedence.
PARAMETER | DESCRIPTION |
---|---|
config | The CoreConfig instance containing preferred factory and configuration details. |
factory_path_override | A string representing the path to the factory function in the format of "module.submodule:factory_name". |
yaml_path_override | The path to the YAML file containing the Ragstack instance configuration. Looks for the configuration under the key "document_search", and if not found, instantiates the class with the preferred configuration for each component. |
RAISES | DESCRIPTION |
---|---|
InvalidConfigError | If the default factory or configuration can't be found. |
Source code in packages/ragbits-document-search/src/ragbits/document_search/_main.py
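The precedence described above (overrides first, project preferences second) can be sketched as follows. `Preferences` and `pick_source` are hypothetical, and the relative order of YAML and factory within each tier is an assumption made for this example; this section does not state it.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional, Tuple, Union


@dataclass
class Preferences:
    """Hypothetical stand-in for the project-level settings in CoreConfig."""
    factory_path: Optional[str] = None
    yaml_path: Optional[Path] = None


def pick_source(
    prefs: Preferences,
    factory_path_override: Optional[str] = None,
    yaml_path_override: Optional[Path] = None,
) -> Tuple[str, Union[str, Path]]:
    """Return which source would be used to build the instance."""
    # Tier 1: explicit overrides take precedence over project preferences.
    # Assumption: YAML is checked before the factory within a tier.
    if yaml_path_override is not None:
        return ("yaml", yaml_path_override)
    if factory_path_override is not None:
        return ("factory", factory_path_override)
    # Tier 2: project-level preferences.
    if prefs.yaml_path is not None:
        return ("yaml", prefs.yaml_path)
    if prefs.factory_path is not None:
        return ("factory", prefs.factory_path)
    # The documented method raises InvalidConfigError in this situation;
    # a built-in exception is used here to keep the sketch self-contained.
    raise LookupError("no default factory or configuration found")
```

For example, `pick_source(Preferences(factory_path="app.factories:build"), yaml_path_override=Path("override.yaml"))` selects the override even though a project-level factory is configured.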
search
async
search(query: str, config: SearchConfig | None = None) -> Sequence[Element]
Search for the most relevant chunks for a query.
PARAMETER | DESCRIPTION |
---|---|
query | The query to search for. |
config | The search configuration. |
RETURNS | DESCRIPTION |
---|---|
Sequence[Element] | A list of chunks. |
Source code in packages/ragbits-document-search/src/ragbits/document_search/_main.py
ingest
async
ingest(documents: str | Iterable[DocumentMeta | Document | Source]) -> IngestExecutionResult
Ingest documents into the search index.
PARAMETER | DESCRIPTION |
---|---|
documents | Either a str or an iterable of DocumentMeta, Document, or Source objects, as given in the signature above. |
RETURNS | DESCRIPTION |
---|---|
IngestExecutionResult | The ingest execution result. |
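Because `documents` accepts either a `str` or an iterable, code handling such an argument has to test for `str` first, since a string is itself iterable. The sketch below shows only that dispatch; `normalize` and the `resolve` callback are hypothetical, and how ragbits interprets a string argument is not stated in this section.

```python
from collections.abc import Iterable
from typing import Callable, Union


def normalize(
    documents: Union[str, Iterable],
    resolve: Callable[[str], Iterable],
) -> list:
    """Hypothetical helper: turn either accepted input form into a list."""
    # Check for str before treating the argument as a general iterable;
    # otherwise the string would be split into single characters.
    if isinstance(documents, str):
        return list(resolve(documents))
    return list(documents)


# Demo: the resolver here simply splits on commas (an assumption).
from_string = normalize("a.txt,b.txt", lambda text: text.split(","))
from_iterable = normalize(iter(["a.txt", "b.txt"]), lambda text: [])
```

Materializing the input into a list also means a one-shot iterator or generator can be passed safely and reused inside the function.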