Execution Strategies
ragbits.document_search.ingestion.processor_strategies.ProcessingExecutionStrategy
Bases: WithConstructionConfig, ABC
Base class for processing execution strategies that define how documents are processed to become elements.
Processing execution strategies are responsible for processing documents using the appropriate processor. They usually do not determine the business logic of the processing itself, only how the processing is executed.
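To illustrate the split between per-document logic and execution order, here is a minimal, self-contained sketch. It does not import ragbits: `ExecutionStrategy`, `ReversedOrder`, `run_reversed`, and the string-based "elements" are hypothetical stand-ins chosen only for illustration.

```python
import asyncio
from abc import ABC, abstractmethod
from collections.abc import Sequence


class ExecutionStrategy(ABC):
    """Hypothetical stand-in for a processing execution strategy."""

    async def process_document(self, document: str) -> list[str]:
        # Per-document logic lives here (a real strategy would delegate to a processor).
        return [f"element:{document}"]

    @abstractmethod
    async def process_documents(self, documents: Sequence[str]) -> list[str]:
        """Subclasses decide only how the documents are scheduled."""


class ReversedOrder(ExecutionStrategy):
    """Toy strategy: same per-document logic, different execution order."""

    async def process_documents(self, documents: Sequence[str]) -> list[str]:
        elements: list[str] = []
        for document in reversed(documents):
            elements.extend(await self.process_document(document))
        return elements


def run_reversed(documents: Sequence[str]) -> list[str]:
    """Synchronous helper that drives the toy strategy."""
    return asyncio.run(ReversedOrder().process_documents(documents))
```

Because `process_documents` is abstract in this sketch, the base class cannot be instantiated directly; only a subclass that supplies an execution order can.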
subclass_from_config
classmethod
Initializes the class with the provided configuration. May return a subclass of the class, if requested by the configuration.
PARAMETER | DESCRIPTION |
---|---|
config | A model containing configuration details for the class. |
RETURNS | DESCRIPTION |
---|---|
Self | An instance of the class initialized with the provided configuration. |
RAISES | DESCRIPTION |
---|---|
InvalidConfigError | The class can't be found or is not a subclass of the current class. |
Source code in packages/ragbits-core/src/ragbits/core/utils/config_handling.py
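The "may return a subclass" behaviour can be sketched as follows. This is a simplified illustration, not the ragbits implementation: `ConstructionConfig`, its `type` and `config` fields, the `Strategy` and `Batched` classes, and the lookup by class name are all assumptions made for the example, and `InvalidConfigError` is redefined locally as a stand-in.

```python
from dataclasses import dataclass, field


@dataclass
class ConstructionConfig:
    """Hypothetical config model: a class name plus constructor arguments."""

    type: str
    config: dict = field(default_factory=dict)


class InvalidConfigError(Exception):
    """Local stand-in for the error named in the RAISES table."""


class Strategy:
    def __init__(self, **kwargs: object) -> None:
        self.options = kwargs

    @classmethod
    def subclass_from_config(cls, config: ConstructionConfig) -> "Strategy":
        # Collect this class and all of its (transitive) subclasses by name.
        candidates = {cls.__name__: cls}
        stack = list(cls.__subclasses__())
        while stack:
            sub = stack.pop()
            candidates[sub.__name__] = sub
            stack.extend(sub.__subclasses__())
        try:
            target = candidates[config.type]
        except KeyError as exc:
            raise InvalidConfigError(
                f"{config.type} is not a subclass of {cls.__name__}"
            ) from exc
        return target(**config.config)


class Batched(Strategy):
    """Hypothetical subclass selected through the config."""
```

Calling `Strategy.subclass_from_config(ConstructionConfig(type="Batched"))` in this sketch yields a `Batched` instance, while an unknown name raises the stand-in `InvalidConfigError`.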
subclass_from_factory
classmethod
Creates the class using the provided factory function. May return a subclass of the class, if requested by the factory.
PARAMETER | DESCRIPTION |
---|---|
factory_path | A string representing the path to the factory function in the format of "module.submodule:factory_name". |
RETURNS | DESCRIPTION |
---|---|
Self | An instance of the class initialized with the provided factory function. |
RAISES | DESCRIPTION |
---|---|
InvalidConfigError | The factory can't be found or the object returned is not a subclass of the current class. |
Source code in packages/ragbits-core/src/ragbits/core/utils/config_handling.py
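A path in the "module.submodule:factory_name" format can be resolved with the standard library alone. The helper below is a sketch under that assumption; `load_factory` is a hypothetical name, and it raises `ValueError` rather than the library's `InvalidConfigError`.

```python
import importlib
from typing import Any, Callable


def load_factory(factory_path: str) -> Callable[..., Any]:
    """Resolve "module.submodule:factory_name" to the callable it names."""
    module_path, sep, factory_name = factory_path.partition(":")
    if not sep or not module_path or not factory_name:
        raise ValueError(
            f"expected 'module.submodule:factory_name', got {factory_path!r}"
        )
    # Import the module part, then look up the attribute after the colon.
    module = importlib.import_module(module_path)
    return getattr(module, factory_name)
```

For example, `load_factory("json:dumps")` returns the standard library's `json.dumps` function.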
subclass_from_defaults
classmethod
subclass_from_defaults(defaults: CoreConfig, factory_path_override: str | None = None, yaml_path_override: Path | None = None) -> Self
Tries to create an instance by looking at the default configuration file and the default factory function. Takes optional overrides for both, which take precedence over the defaults.
PARAMETER | DESCRIPTION |
---|---|
defaults | The CoreConfig instance containing default factory and configuration details. |
factory_path_override | A string representing the path to the factory function in the format of "module.submodule:factory_name". |
yaml_path_override | The path to the YAML file containing the Ragstack instance configuration. |
RAISES | DESCRIPTION |
---|---|
InvalidConfigError | If the default factory or configuration can't be found. |
Source code in packages/ragbits-core/src/ragbits/core/utils/config_handling.py
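The precedence rule (overrides first, defaults second) can be sketched as a small selection function. The function name `choose_source`, its arguments, and in particular the relative order of factory versus YAML are assumptions for illustration; the source only states that overrides take precedence over defaults. `LookupError` stands in for `InvalidConfigError`.

```python
from __future__ import annotations

from pathlib import Path


def choose_source(
    default_factory: str | None,
    default_yaml: Path | None,
    factory_path_override: str | None = None,
    yaml_path_override: Path | None = None,
) -> tuple[str, str]:
    """Return which source would be used, as a (kind, location) pair."""
    # Overrides are consulted before any default (factory-before-YAML is assumed).
    if factory_path_override is not None:
        return ("factory", factory_path_override)
    if yaml_path_override is not None:
        return ("yaml", str(yaml_path_override))
    if default_factory is not None:
        return ("factory", default_factory)
    if default_yaml is not None:
        return ("yaml", str(default_yaml))
    raise LookupError("no default factory or configuration found")
```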
from_config
classmethod
Initializes the class with the provided configuration.
PARAMETER | DESCRIPTION |
---|---|
config | A dictionary containing configuration details for the class. |
RETURNS | DESCRIPTION |
---|---|
Self | An instance of the class initialized with the provided configuration. |
Source code in packages/ragbits-core/src/ragbits/core/utils/config_handling.py
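One simple way to build an instance from a configuration dictionary is to pass its keys as constructor arguments. The sketch below assumes that one-to-one mapping; `BatchedStub` and its default `batch_size` of 10 are hypothetical.

```python
class BatchedStub:
    """Hypothetical class with a single constructor option."""

    def __init__(self, batch_size: int = 10) -> None:
        self.batch_size = batch_size

    @classmethod
    def from_config(cls, config: dict) -> "BatchedStub":
        # Assumption: dictionary keys map one-to-one onto constructor arguments.
        return cls(**config)
```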
to_document_meta
async
staticmethod
Convert a document, document meta or source to a document meta object.
PARAMETER | DESCRIPTION |
---|---|
document | The document to convert. |
RETURNS | DESCRIPTION |
---|---|
DocumentMeta | The document meta object. |
Source code in packages/ragbits-document-search/src/ragbits/document_search/ingestion/processor_strategies/base.py
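The normalization of three input kinds into a single metadata type can be sketched with `isinstance` checks. The classes `Meta`, `Doc`, and `Src`, the `fetch_meta` method, and the function `to_meta` are hypothetical stand-ins, not the ragbits types.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass


@dataclass
class Meta:
    """Stand-in for a document meta object."""

    location: str


@dataclass
class Doc:
    """Stand-in for a loaded document that carries its metadata."""

    metadata: Meta


@dataclass
class Src:
    """Stand-in for a source that must be resolved asynchronously."""

    path: str

    async def fetch_meta(self) -> Meta:
        return Meta(location=self.path)


async def to_meta(document: Meta | Doc | Src) -> Meta:
    if isinstance(document, Src):
        return await document.fetch_meta()  # resolve the source first
    if isinstance(document, Doc):
        return document.metadata  # unwrap the metadata already attached
    return document  # already a meta object
```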
process_document
async
process_document(document: DocumentMeta | Document | Source, processor_router: DocumentProcessorRouter, processor_overwrite: BaseProvider | None = None) -> list[Element]
Process a single document and return the elements.
PARAMETER | DESCRIPTION |
---|---|
document | The document to process. |
processor_router | The document processor router to use. |
processor_overwrite | Forces the use of a specific processor, instead of the one provided by the router. |
RETURNS | DESCRIPTION |
---|---|
list[Element] | A list of elements. |
Source code in packages/ragbits-document-search/src/ragbits/document_search/ingestion/processor_strategies/base.py
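The relationship between the router and `processor_overwrite` can be sketched as "use the overwrite if given, otherwise ask the router". Everything below is a hypothetical stand-in: `Router`, its `get` method, the two toy processors, and routing by file extension are assumptions for illustration only.

```python
import asyncio


class SplitProcessor:
    """Toy processor: splits the document text on whitespace."""

    async def process(self, document: str) -> list[str]:
        return document.split()


class UpperProcessor:
    """Toy processor: returns the upper-cased text as one element."""

    async def process(self, document: str) -> list[str]:
        return [document.upper()]


class Router:
    """Hypothetical router that picks a processor by file extension."""

    def __init__(self, routes: dict) -> None:
        self.routes = routes

    def get(self, document: str):
        return self.routes[document.rsplit(".", 1)[-1]]


async def process_document(document: str, router: Router, processor_overwrite=None) -> list[str]:
    # The overwrite, when present, bypasses the router entirely.
    processor = processor_overwrite if processor_overwrite is not None else router.get(document)
    return await processor.process(document)
```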
process_documents
abstractmethod
async
process_documents(documents: Sequence[DocumentMeta | Document | Source], processor_router: DocumentProcessorRouter, processor_overwrite: BaseProvider | None = None) -> list[Element]
Process documents using the given processor and return the resulting elements.
PARAMETER | DESCRIPTION |
---|---|
documents | The documents to process. |
processor_router | The document processor router to use. |
processor_overwrite | Forces the use of a specific processor, instead of the one provided by the router. |
RETURNS | DESCRIPTION |
---|---|
list[Element] | A list of elements. |
Source code in packages/ragbits-document-search/src/ragbits/document_search/ingestion/processor_strategies/base.py
ragbits.document_search.ingestion.processor_strategies.SequentialProcessing
Bases: ProcessingExecutionStrategy
A processing execution strategy that processes documents in sequence, one at a time.
subclass_from_config
classmethod
Initializes the class with the provided configuration. May return a subclass of the class, if requested by the configuration.
PARAMETER | DESCRIPTION |
---|---|
config | A model containing configuration details for the class. |
RETURNS | DESCRIPTION |
---|---|
Self | An instance of the class initialized with the provided configuration. |
RAISES | DESCRIPTION |
---|---|
InvalidConfigError | The class can't be found or is not a subclass of the current class. |
Source code in packages/ragbits-core/src/ragbits/core/utils/config_handling.py
subclass_from_factory
classmethod
Creates the class using the provided factory function. May return a subclass of the class, if requested by the factory.
PARAMETER | DESCRIPTION |
---|---|
factory_path | A string representing the path to the factory function in the format of "module.submodule:factory_name". |
RETURNS | DESCRIPTION |
---|---|
Self | An instance of the class initialized with the provided factory function. |
RAISES | DESCRIPTION |
---|---|
InvalidConfigError | The factory can't be found or the object returned is not a subclass of the current class. |
Source code in packages/ragbits-core/src/ragbits/core/utils/config_handling.py
subclass_from_defaults
classmethod
subclass_from_defaults(defaults: CoreConfig, factory_path_override: str | None = None, yaml_path_override: Path | None = None) -> Self
Tries to create an instance by looking at the default configuration file and the default factory function. Takes optional overrides for both, which take precedence over the defaults.
PARAMETER | DESCRIPTION |
---|---|
defaults | The CoreConfig instance containing default factory and configuration details. |
factory_path_override | A string representing the path to the factory function in the format of "module.submodule:factory_name". |
yaml_path_override | The path to the YAML file containing the Ragstack instance configuration. |
RAISES | DESCRIPTION |
---|---|
InvalidConfigError | If the default factory or configuration can't be found. |
Source code in packages/ragbits-core/src/ragbits/core/utils/config_handling.py
from_config
classmethod
Initializes the class with the provided configuration.
PARAMETER | DESCRIPTION |
---|---|
config | A dictionary containing configuration details for the class. |
RETURNS | DESCRIPTION |
---|---|
Self | An instance of the class initialized with the provided configuration. |
Source code in packages/ragbits-core/src/ragbits/core/utils/config_handling.py
to_document_meta
async
staticmethod
Convert a document, document meta or source to a document meta object.
PARAMETER | DESCRIPTION |
---|---|
document | The document to convert. |
RETURNS | DESCRIPTION |
---|---|
DocumentMeta | The document meta object. |
Source code in packages/ragbits-document-search/src/ragbits/document_search/ingestion/processor_strategies/base.py
process_document
async
process_document(document: DocumentMeta | Document | Source, processor_router: DocumentProcessorRouter, processor_overwrite: BaseProvider | None = None) -> list[Element]
Process a single document and return the elements.
PARAMETER | DESCRIPTION |
---|---|
document | The document to process. |
processor_router | The document processor router to use. |
processor_overwrite | Forces the use of a specific processor, instead of the one provided by the router. |
RETURNS | DESCRIPTION |
---|---|
list[Element] | A list of elements. |
Source code in packages/ragbits-document-search/src/ragbits/document_search/ingestion/processor_strategies/base.py
process_documents
async
process_documents(documents: Sequence[DocumentMeta | Document | Source], processor_router: DocumentProcessorRouter, processor_overwrite: BaseProvider | None = None) -> list[Element]
Process documents using the given processor and return the resulting elements.
PARAMETER | DESCRIPTION |
---|---|
documents | The documents to process. |
processor_router | The document processor router to use. |
processor_overwrite | Forces the use of a specific processor, instead of the one provided by the router. |
RETURNS | DESCRIPTION |
---|---|
list[Element] | A list of elements. |
Source code in packages/ragbits-document-search/src/ragbits/document_search/ingestion/processor_strategies/sequential.py
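Processing "in sequence, one at a time" amounts to awaiting each document before starting the next. The sketch below shows that pattern with plain `asyncio`; `process_sequentially`, `demo_sequential`, and the toy per-document coroutine are hypothetical and do not use the ragbits classes.

```python
import asyncio
from collections.abc import Awaitable, Callable, Sequence


async def process_sequentially(
    documents: Sequence[str],
    process_one: Callable[[str], Awaitable[list[str]]],
) -> list[str]:
    elements: list[str] = []
    for document in documents:
        # Each document finishes before the next one starts.
        elements.extend(await process_one(document))
    return elements


async def demo_sequential() -> tuple[list[str], list[str]]:
    log: list[str] = []

    async def process_one(document: str) -> list[str]:
        log.append(f"start {document}")
        await asyncio.sleep(0)  # yield to the event loop, as real I/O would
        log.append(f"end {document}")
        return [document.upper()]

    elements = await process_sequentially(["a", "b"], process_one)
    return elements, log
```

In the demo, the log entries for one document never interleave with those of another, which is the defining property of this strategy.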
ragbits.document_search.ingestion.processor_strategies.BatchedAsyncProcessing
Bases: ProcessingExecutionStrategy
A processing execution strategy that processes documents asynchronously in batches.
Initialize the BatchedAsyncProcessing instance.
PARAMETER | DESCRIPTION |
---|---|
batch_size | The size of the batch to process documents in. |
Source code in packages/ragbits-document-search/src/ragbits/document_search/ingestion/processor_strategies/batched.py
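The effect of `batch_size` can be sketched with `asyncio.gather` over fixed-size slices: documents inside one batch run concurrently, and batches run one after another. This is an assumed reading of "asynchronously in batches", not the library's actual implementation; `process_in_batches` and `demo_batched` are hypothetical names.

```python
import asyncio
from collections.abc import Awaitable, Callable, Sequence


async def process_in_batches(
    documents: Sequence[str],
    process_one: Callable[[str], Awaitable[list[str]]],
    batch_size: int,
) -> list[str]:
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    elements: list[str] = []
    for start in range(0, len(documents), batch_size):
        batch = documents[start:start + batch_size]
        # Run the whole batch concurrently; gather preserves input order.
        results = await asyncio.gather(*(process_one(d) for d in batch))
        for result in results:
            elements.extend(result)
    return elements


async def demo_batched(batch_size: int) -> tuple[list[str], int]:
    running = 0
    peak = 0

    async def process_one(document: str) -> list[str]:
        nonlocal running, peak
        running += 1
        peak = max(peak, running)  # track how many documents overlap
        await asyncio.sleep(0)
        running -= 1
        return [document.upper()]

    elements = await process_in_batches(list("abcde"), process_one, batch_size)
    return elements, peak
```

The demo returns the peak number of documents in flight, which in this sketch never exceeds `batch_size`.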
subclass_from_config
classmethod
Initializes the class with the provided configuration. May return a subclass of the class, if requested by the configuration.
PARAMETER | DESCRIPTION |
---|---|
config | A model containing configuration details for the class. |
RETURNS | DESCRIPTION |
---|---|
Self | An instance of the class initialized with the provided configuration. |
RAISES | DESCRIPTION |
---|---|
InvalidConfigError | The class can't be found or is not a subclass of the current class. |
Source code in packages/ragbits-core/src/ragbits/core/utils/config_handling.py
subclass_from_factory
classmethod
Creates the class using the provided factory function. May return a subclass of the class, if requested by the factory.
PARAMETER | DESCRIPTION |
---|---|
factory_path | A string representing the path to the factory function in the format of "module.submodule:factory_name". |
RETURNS | DESCRIPTION |
---|---|
Self | An instance of the class initialized with the provided factory function. |
RAISES | DESCRIPTION |
---|---|
InvalidConfigError | The factory can't be found or the object returned is not a subclass of the current class. |
Source code in packages/ragbits-core/src/ragbits/core/utils/config_handling.py
subclass_from_defaults
classmethod
subclass_from_defaults(defaults: CoreConfig, factory_path_override: str | None = None, yaml_path_override: Path | None = None) -> Self
Tries to create an instance by looking at the default configuration file and the default factory function. Takes optional overrides for both, which take precedence over the defaults.
PARAMETER | DESCRIPTION |
---|---|
defaults | The CoreConfig instance containing default factory and configuration details. |
factory_path_override | A string representing the path to the factory function in the format of "module.submodule:factory_name". |
yaml_path_override | The path to the YAML file containing the Ragstack instance configuration. |
RAISES | DESCRIPTION |
---|---|
InvalidConfigError | If the default factory or configuration can't be found. |
Source code in packages/ragbits-core/src/ragbits/core/utils/config_handling.py
from_config
classmethod
Initializes the class with the provided configuration.
PARAMETER | DESCRIPTION |
---|---|
config | A dictionary containing configuration details for the class. |
RETURNS | DESCRIPTION |
---|---|
Self | An instance of the class initialized with the provided configuration. |
Source code in packages/ragbits-core/src/ragbits/core/utils/config_handling.py
to_document_meta
async
staticmethod
Convert a document, document meta or source to a document meta object.
PARAMETER | DESCRIPTION |
---|---|
document | The document to convert. |
RETURNS | DESCRIPTION |
---|---|
DocumentMeta | The document meta object. |
Source code in packages/ragbits-document-search/src/ragbits/document_search/ingestion/processor_strategies/base.py
process_document
async
process_document(document: DocumentMeta | Document | Source, processor_router: DocumentProcessorRouter, processor_overwrite: BaseProvider | None = None) -> list[Element]
Process a single document and return the elements.
PARAMETER | DESCRIPTION |
---|---|
document | The document to process. |
processor_router | The document processor router to use. |
processor_overwrite | Forces the use of a specific processor, instead of the one provided by the router. |
RETURNS | DESCRIPTION |
---|---|
list[Element] | A list of elements. |
Source code in packages/ragbits-document-search/src/ragbits/document_search/ingestion/processor_strategies/base.py
process_documents
async
process_documents(documents: Sequence[DocumentMeta | Document | Source], processor_router: DocumentProcessorRouter, processor_overwrite: BaseProvider | None = None) -> list[Element]
Process documents using the given processor and return the resulting elements.
PARAMETER | DESCRIPTION |
---|---|
documents | The documents to process. |
processor_router | The document processor router to use. |
processor_overwrite | Forces the use of a specific processor, instead of the one provided by the router. |
RETURNS | DESCRIPTION |
---|---|
list[Element] | A list of elements. |