Execution Strategies
ragbits.document_search.ingestion.processor_strategies.ProcessingExecutionStrategy
Bases: ABC
Base class for processing execution strategies that define how documents are processed to become elements.
Processing execution strategies are responsible for running documents through the appropriate processor. They usually do not determine the business logic of the processing itself, only how that processing is executed.
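The split between what is processed and how it is executed can be sketched as follows. This is a minimal illustration, not the ragbits implementation: `SketchExecutionStrategy`, `SketchOneAtATime`, `split_words` and `split_lines` are hypothetical names, and plain strings stand in for documents and elements.

```python
import asyncio
from abc import ABC, abstractmethod
from collections.abc import Awaitable, Callable, Sequence

# Hypothetical stand-in for a document processor: the "business logic".
Processor = Callable[[str], Awaitable[list[str]]]


class SketchExecutionStrategy(ABC):
    """Hypothetical base class: decides how documents are run through a processor."""

    @abstractmethod
    async def process_documents(
        self, documents: Sequence[str], processor: Processor
    ) -> list[str]:
        """Run every document through the processor and collect the elements."""


class SketchOneAtATime(SketchExecutionStrategy):
    """Hypothetical strategy: awaits each document before starting the next."""

    async def process_documents(
        self, documents: Sequence[str], processor: Processor
    ) -> list[str]:
        elements: list[str] = []
        for document in documents:
            elements.extend(await processor(document))
        return elements


async def split_words(document: str) -> list[str]:
    """Example business logic: one element per word."""
    return document.split()


async def split_lines(document: str) -> list[str]:
    """Alternative business logic: one element per line."""
    return document.splitlines()


# The same strategy runs two different processors without changing its own code.
strategy = SketchOneAtATime()
docs = ["a b\nc", "d"]
by_word = asyncio.run(strategy.process_documents(docs, split_words))
by_line = asyncio.run(strategy.process_documents(docs, split_lines))
print(by_word)
print(by_line)
```

Swapping the processor changes the elements produced, while swapping the strategy would only change the order and concurrency of the calls.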
from_config (classmethod)
Creates and returns an instance of the ProcessingExecutionStrategy subclass from the given configuration.
PARAMETER | DESCRIPTION |
---|---|
config | A dictionary containing the configuration for initializing the instance. |
RETURNS | DESCRIPTION |
---|---|
Self | An initialized instance of the ProcessingExecutionStrategy subclass. |
Source code in packages/ragbits-document-search/src/ragbits/document_search/ingestion/processor_strategies/base.py
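As a rough illustration of configuration-driven construction, the sketch below assumes that the configuration keys map directly onto constructor keyword arguments. `SketchStrategy` and `SketchBatchedStrategy` are hypothetical stand-ins, the default value is illustrative only, and the real `from_config` may validate or transform the configuration differently.

```python
from typing import Any


class SketchStrategy:
    """Hypothetical stand-in for a strategy base class (not the ragbits class)."""

    @classmethod
    def from_config(cls, config: dict[str, Any]) -> "SketchStrategy":
        # Assumption: config keys are passed straight to the constructor.
        return cls(**config)


class SketchBatchedStrategy(SketchStrategy):
    """Hypothetical subclass that takes a single constructor argument."""

    def __init__(self, batch_size: int = 10) -> None:
        # The default of 10 is illustrative, not taken from the library.
        self.batch_size = batch_size


# Calling from_config on the subclass returns an instance of that subclass.
strategy = SketchBatchedStrategy.from_config({"batch_size": 4})
print(type(strategy).__name__, strategy.batch_size)
```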
to_document_meta (async, staticmethod)
Convert a document, document meta or source to a document meta object.
PARAMETER | DESCRIPTION |
---|---|
document | The document to convert. |
RETURNS | DESCRIPTION |
---|---|
DocumentMeta | The document meta object. |
Source code in packages/ragbits-document-search/src/ragbits/document_search/ingestion/processor_strategies/base.py
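One way to picture this normalization is a type dispatch over the three accepted inputs. The sketch below is an assumption about the general shape only: `SketchSource`, `SketchDocumentMeta`, `SketchDocument` and `to_document_meta_sketch` are hypothetical stand-ins, and how the library actually derives metadata from a source is not shown in this reference.

```python
import asyncio
from dataclasses import dataclass
from typing import Union


@dataclass
class SketchSource:
    """Hypothetical stand-in for a source: where a document can be fetched from."""
    location: str


@dataclass
class SketchDocumentMeta:
    """Hypothetical stand-in for document metadata."""
    location: str


@dataclass
class SketchDocument:
    """Hypothetical stand-in for a loaded document that carries its metadata."""
    metadata: SketchDocumentMeta
    content: str


async def to_document_meta_sketch(
    document: Union[SketchDocumentMeta, SketchDocument, SketchSource],
) -> SketchDocumentMeta:
    """Normalize any of the three input kinds to a metadata object."""
    if isinstance(document, SketchDocumentMeta):
        return document  # already metadata, nothing to do
    if isinstance(document, SketchDocument):
        return document.metadata  # a loaded document carries its metadata
    if isinstance(document, SketchSource):
        await asyncio.sleep(0)  # placeholder for asynchronous I/O on the source
        return SketchDocumentMeta(location=document.location)
    raise TypeError(f"Unsupported input type: {type(document).__name__}")


meta = SketchDocumentMeta(location="a.txt")
inputs = [meta, SketchDocument(metadata=meta, content="hello"), SketchSource("b.txt")]
for item in inputs:
    print(asyncio.run(to_document_meta_sketch(item)))
```

Normalizing at the boundary lets the rest of a strategy work with a single type, whatever the caller passed in.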
process_document (async)
process_document(document: DocumentMeta | Document | Source, processor_router: DocumentProcessorRouter, processor_overwrite: BaseProvider | None = None) -> list[Element]
Process a single document and return the elements.
PARAMETER | DESCRIPTION |
---|---|
document | The document to process. |
processor_router | The document processor router to use. TYPE: DocumentProcessorRouter |
processor_overwrite | Forces the use of a specific processor instead of the one provided by the router. TYPE: BaseProvider or None. DEFAULT: None |
RETURNS | DESCRIPTION |
---|---|
list[Element] | A list of elements. |
Source code in packages/ragbits-document-search/src/ragbits/document_search/ingestion/processor_strategies/base.py
process_documents (abstractmethod, async)
process_documents(documents: Sequence[DocumentMeta | Document | Source], processor_router: DocumentProcessorRouter, processor_overwrite: BaseProvider | None = None) -> list[Element]
Process documents using the given processor and return the resulting elements.
PARAMETER | DESCRIPTION |
---|---|
documents | The documents to process. |
processor_router | The document processor router to use. TYPE: DocumentProcessorRouter |
processor_overwrite | Forces the use of a specific processor instead of the one provided by the router. TYPE: BaseProvider or None. DEFAULT: None |
RETURNS | DESCRIPTION |
---|---|
list[Element] | A list of elements. |
Source code in packages/ragbits-document-search/src/ragbits/document_search/ingestion/processor_strategies/base.py
ragbits.document_search.ingestion.processor_strategies.SequentialProcessing
Bases: ProcessingExecutionStrategy
A processing execution strategy that processes documents in sequence, one at a time.
from_config (classmethod)
Creates and returns an instance of the ProcessingExecutionStrategy subclass from the given configuration.
PARAMETER | DESCRIPTION |
---|---|
config | A dictionary containing the configuration for initializing the instance. |
RETURNS | DESCRIPTION |
---|---|
Self | An initialized instance of the ProcessingExecutionStrategy subclass. |
Source code in packages/ragbits-document-search/src/ragbits/document_search/ingestion/processor_strategies/base.py
to_document_meta (async, staticmethod)
Convert a document, document meta or source to a document meta object.
PARAMETER | DESCRIPTION |
---|---|
document | The document to convert. |
RETURNS | DESCRIPTION |
---|---|
DocumentMeta | The document meta object. |
Source code in packages/ragbits-document-search/src/ragbits/document_search/ingestion/processor_strategies/base.py
process_document (async)
process_document(document: DocumentMeta | Document | Source, processor_router: DocumentProcessorRouter, processor_overwrite: BaseProvider | None = None) -> list[Element]
Process a single document and return the elements.
PARAMETER | DESCRIPTION |
---|---|
document | The document to process. |
processor_router | The document processor router to use. TYPE: DocumentProcessorRouter |
processor_overwrite | Forces the use of a specific processor instead of the one provided by the router. TYPE: BaseProvider or None. DEFAULT: None |
RETURNS | DESCRIPTION |
---|---|
list[Element] | A list of elements. |
Source code in packages/ragbits-document-search/src/ragbits/document_search/ingestion/processor_strategies/base.py
process_documents (async)
process_documents(documents: Sequence[DocumentMeta | Document | Source], processor_router: DocumentProcessorRouter, processor_overwrite: BaseProvider | None = None) -> list[Element]
Process documents using the given processor and return the resulting elements.
PARAMETER | DESCRIPTION |
---|---|
documents | The documents to process. |
processor_router | The document processor router to use. TYPE: DocumentProcessorRouter |
processor_overwrite | Forces the use of a specific processor instead of the one provided by the router. TYPE: BaseProvider or None. DEFAULT: None |
RETURNS | DESCRIPTION |
---|---|
list[Element] | A list of elements. |
Source code in packages/ragbits-document-search/src/ragbits/document_search/ingestion/processor_strategies/sequential.py
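To make "one at a time" concrete, the sketch below records start and end events while documents are processed in sequence. It is a simplified illustration rather than the library's code: `process_one` and `process_sequentially` are hypothetical helpers, and strings stand in for documents and elements.

```python
import asyncio


async def process_one(document: str, log: list[str]) -> list[str]:
    """Hypothetical per-document step: split text into word 'elements'."""
    log.append(f"start {document}")
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    log.append(f"end {document}")
    return document.split()


async def process_sequentially(documents: list[str]) -> tuple[list[str], list[str]]:
    """Process documents strictly in order and return (elements, event log)."""
    elements: list[str] = []
    log: list[str] = []
    for document in documents:
        # Each document is awaited to completion before the next one starts,
        # so start/end events never interleave.
        elements.extend(await process_one(document, log))
    return elements, log


elements, log = asyncio.run(process_sequentially(["a b", "c"]))
print(elements)
print(log)
```

Because every `start` is immediately followed by its own `end`, the log shows that no two documents are ever in flight at once, which keeps resource usage predictable at the cost of throughput.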
ragbits.document_search.ingestion.processor_strategies.BatchedAsyncProcessing
Bases: ProcessingExecutionStrategy
A processing execution strategy that processes documents asynchronously in batches.
Initialize the BatchedAsyncProcessing instance.
PARAMETER | DESCRIPTION |
---|---|
batch_size | The number of documents to process in each batch. |
Source code in packages/ragbits-document-search/src/ragbits/document_search/ingestion/processor_strategies/batched.py
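A minimal sketch of batched asynchronous processing follows, assuming fixed-size slices that are each run concurrently with `asyncio.gather`. `process_one` and `process_in_batches` are hypothetical helpers, and the library may bound concurrency differently, for example with a semaphore.

```python
import asyncio
from collections.abc import Sequence


async def process_one(document: str) -> list[str]:
    """Hypothetical per-document step: split text into word 'elements'."""
    await asyncio.sleep(0)  # placeholder for asynchronous work
    return document.split()


async def process_in_batches(documents: Sequence[str], batch_size: int) -> list[str]:
    """Process documents in consecutive batches of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    elements: list[str] = []
    for start in range(0, len(documents), batch_size):
        batch = documents[start:start + batch_size]
        # Documents within a batch run concurrently; gather returns results
        # in the same order as its inputs, so document order is preserved.
        results = await asyncio.gather(*(process_one(doc) for doc in batch))
        for result in results:
            elements.extend(result)
    return elements


batched = asyncio.run(process_in_batches(["a b", "c", "d e f"], batch_size=2))
print(batched)
```

A larger `batch_size` increases concurrency and peak resource usage; a value of 1 degenerates to sequential processing.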
from_config (classmethod)
Creates and returns an instance of the ProcessingExecutionStrategy subclass from the given configuration.
PARAMETER | DESCRIPTION |
---|---|
config | A dictionary containing the configuration for initializing the instance. |
RETURNS | DESCRIPTION |
---|---|
Self | An initialized instance of the ProcessingExecutionStrategy subclass. |
Source code in packages/ragbits-document-search/src/ragbits/document_search/ingestion/processor_strategies/base.py
to_document_meta (async, staticmethod)
Convert a document, document meta or source to a document meta object.
PARAMETER | DESCRIPTION |
---|---|
document | The document to convert. |
RETURNS | DESCRIPTION |
---|---|
DocumentMeta | The document meta object. |
Source code in packages/ragbits-document-search/src/ragbits/document_search/ingestion/processor_strategies/base.py
process_document (async)
process_document(document: DocumentMeta | Document | Source, processor_router: DocumentProcessorRouter, processor_overwrite: BaseProvider | None = None) -> list[Element]
Process a single document and return the elements.
PARAMETER | DESCRIPTION |
---|---|
document | The document to process. |
processor_router | The document processor router to use. TYPE: DocumentProcessorRouter |
processor_overwrite | Forces the use of a specific processor instead of the one provided by the router. TYPE: BaseProvider or None. DEFAULT: None |
RETURNS | DESCRIPTION |
---|---|
list[Element] | A list of elements. |
Source code in packages/ragbits-document-search/src/ragbits/document_search/ingestion/processor_strategies/base.py
process_documents (async)
process_documents(documents: Sequence[DocumentMeta | Document | Source], processor_router: DocumentProcessorRouter, processor_overwrite: BaseProvider | None = None) -> list[Element]
Process documents using the given processor and return the resulting elements.
PARAMETER | DESCRIPTION |
---|---|
documents | The documents to process. |
processor_router | The document processor router to use. TYPE: DocumentProcessorRouter |
processor_overwrite | Forces the use of a specific processor instead of the one provided by the router. TYPE: BaseProvider or None. DEFAULT: None |
RETURNS | DESCRIPTION |
---|---|
list[Element] | A list of elements. |