InMemoryExecutor - Superlinked

The InMemoryExecutor provides an in-memory implementation for executing queries against indexed data. It creates optimized vector spaces based on the provided indices and allows querying data from in-memory sources, making it ideal for development, testing, and prototyping scenarios.

Constructor

InMemoryExecutor(sources, indices, context_data=None)

Parameters

sources

InMemorySource | Sequence[InMemorySource]

required

One or more in-memory data sources to query against. Can be a single source or sequence of sources. All sources must be InMemorySource instances.

indices

Index | Sequence[Index]

required

One or more indices that define the vector spaces. Can be a single index or sequence of indices. These indices determine how the data will be organized and searched.

context_data

Mapping[str, Mapping[str, ContextValue]] | None

default: None

Additional context data for execution. The outer mapping key represents the context name, inner mapping contains key-value pairs for that context. Provides runtime configuration and environment-specific settings.
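Both sources and indices accept either a single object or a sequence. The following is a minimal sketch of how such an argument could be normalized into a list. `as_list` is a hypothetical helper written for this illustration, not part of Superlinked, and plain strings stand in for real source objects.

```python
from collections.abc import Sequence
from typing import TypeVar, Union

T = TypeVar("T")


def as_list(value: Union[T, Sequence[T]]) -> list[T]:
    """Wrap a single item in a list; copy a sequence into a new list.

    Hypothetical helper: str and bytes are treated as single items,
    even though Python considers them sequences.
    """
    if isinstance(value, Sequence) and not isinstance(value, (str, bytes)):
        return list(value)
    return [value]


single = as_list("source_a")                 # one stand-in source
several = as_list(["source_a", "source_b"])  # a sequence of stand-in sources
print(single)   # ['source_a']
print(several)  # ['source_a', 'source_b']
```

Normalizing once at the boundary lets the rest of the code assume a list, whichever form the caller passed.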

Inheritance

The InMemoryExecutor extends the executor hierarchy for in-memory processing.

Inheritance chain:

InMemoryExecutor
→ InteractiveExecutor
→ Executor
→ ABC
→ Generic

This inheritance provides:

Base executor functionality from Executor
Interactive processing capabilities from InteractiveExecutor
Abstract base class support from ABC
Generic type support for type safety
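The chain above matches what Python reports as a class's method resolution order. The sketch below rebuilds the chain with empty stand-in classes (they share the documented names but are not the library's code) and prints the order in which Python would search them.

```python
from abc import ABC
from typing import Generic, TypeVar

AppT = TypeVar("AppT")


# Stand-in classes that mirror the documented chain; not Superlinked's code.
class Executor(ABC, Generic[AppT]):
    pass


class InteractiveExecutor(Executor[AppT]):
    pass


class InMemoryExecutor(InteractiveExecutor[AppT]):
    pass


# __mro__ lists the class itself first, then each ancestor in lookup order.
chain = [cls.__name__ for cls in InMemoryExecutor.__mro__]
print(" -> ".join(chain))
# InMemoryExecutor -> InteractiveExecutor -> Executor -> ABC -> Generic -> object
```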

Methods

run()

Execute the in-memory pipeline and return a configured InMemoryApp instance.

run() -> InMemoryApp

Returns: An instance of InMemoryApp that accepts queries and returns results from in-memory data.

The run method:

Initializes all provided sources with their data
Creates optimized vector spaces from the indices
Configures the in-memory vector database
Returns a fully configured application ready for querying
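To make the steps above concrete, here is a toy model of the same flow using only the standard library. `ToySource`, `ToyExecutor`, and `ToyApp` are hypothetical stand-ins. They are not Superlinked classes, the vectors are supplied by hand rather than produced by an index, and the real library may order these steps differently.

```python
import math
from dataclasses import dataclass, field


@dataclass
class ToySource:
    """Stand-in for an in-memory source: holds records pushed by the caller."""
    records: list[dict] = field(default_factory=list)

    def put(self, rows: list[dict]) -> None:
        self.records.extend(rows)


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyApp:
    """Answers queries by scanning vectors held in a plain dict."""

    def __init__(self, store: dict[str, list[float]]) -> None:
        self.store = store

    def query(self, vector: list[float], limit: int = 1) -> list[str]:
        ranked = sorted(
            self.store,
            key=lambda item_id: cosine(self.store[item_id], vector),
            reverse=True,
        )
        return ranked[:limit]


class ToyExecutor:
    def __init__(self, sources: list[ToySource]) -> None:
        self.sources = sources

    def run(self) -> ToyApp:
        store: dict[str, list[float]] = {}        # the toy "vector database"
        for source in self.sources:               # read every source's records
            for row in source.records:
                store[row["id"]] = row["vector"]  # vectors are precomputed here
        return ToyApp(store)                      # ready for querying


source = ToySource()
source.put([
    {"id": "a", "vector": [1.0, 0.0]},
    {"id": "b", "vector": [0.0, 1.0]},
])
app = ToyExecutor([source]).run()
print(app.query([0.9, 0.1]))  # ['a']
```

Everything lives in ordinary Python objects, so nothing survives once the process exits.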

Key Features

In-Memory Processing

Fast Access: All data stored in RAM for immediate access
No I/O Overhead: Eliminates disk and network latency
Low-Latency Results: Queries return without waiting on an external database
Simple Setup: No external database configuration required

Development Optimization

Quick Iteration: Rapid development and testing cycles
Easy Debugging: Full control over data and execution flow
Flexible Testing: Easy to modify datasets and test scenarios
Minimal Dependencies: Self-contained execution environment

Use Cases

Development and Testing

Perfect for initial development phases:

Prototype Development: Quick testing of vector search concepts
Unit Testing: Isolated testing of search functionality
Integration Testing: Testing complete pipelines with controlled data
Algorithm Development: Experimenting with different vector spaces and configurations

Educational and Learning

Ideal for learning and demonstration:

Tutorials: Teaching vector search concepts with immediate feedback
Workshops: Hands-on learning with instant results
Proof of Concepts: Demonstrating search capabilities to stakeholders
Research: Academic research with controlled datasets

Small-Scale Applications

Suitable for applications with limited data requirements:

Personal Projects: Individual applications with small datasets
Internal Tools: Company tools with limited search requirements
Demos: Live demonstrations with pre-loaded data
MVPs: Minimum viable products for initial validation

Performance Characteristics

Advantages

Speed: Low query latency, since data access involves no disk or network round trips
Simplicity: No external dependencies or complex configuration
Consistency: Predictable performance without external factors
Development Speed: Rapid iteration and testing capabilities

Limitations

Memory Constraints: Limited by available system RAM
No Persistence: Data lost on application restart
Scalability: Not suitable for large datasets or high concurrency
Single Instance: Cannot distribute across multiple servers

Data Management

Memory Usage

Memory Monitoring: Monitor memory usage carefully as all data, vectors, and indices are stored in RAM. Large datasets can quickly consume available memory.
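A quick back-of-the-envelope estimate helps here. The sketch below computes a lower bound for the raw vector data only, assuming fixed-width floats (4 bytes each, as in float32). The function name and the example figures are illustrative, and actual usage is higher once source records, index structures, and Python object overhead are included.

```python
def estimate_vector_bytes(num_items: int, dimensions: int, bytes_per_value: int = 4) -> int:
    """Lower bound on RAM for raw vectors: items x dimensions x bytes per value."""
    return num_items * dimensions * bytes_per_value


# Example: one million items embedded into 768 dimensions as float32.
size = estimate_vector_bytes(num_items=1_000_000, dimensions=768)
print(f"{size:,} bytes = {size / 1024**3:.2f} GiB")
# 3,072,000,000 bytes = 2.86 GiB
```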

Data Lifecycle

Initialization: Data loaded into memory during startup
Runtime: All operations happen in memory
Shutdown: All data lost when application stops
No Persistence: Data must be reloaded on each restart
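Because nothing is persisted, a common pattern is to keep the records in a file and reload them at startup. The sketch below assumes a JSON Lines file (one JSON object per line). `reload_records` is a hypothetical helper, and `put` can be any callable that accepts a list of dicts, for example the ingestion method of whichever source you use.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def reload_records(path: Path, put: Callable[[list[dict]], None]) -> int:
    """Read one JSON object per non-empty line and hand the batch to `put`."""
    with path.open(encoding="utf-8") as handle:
        records = [json.loads(line) for line in handle if line.strip()]
    put(records)
    return len(records)


# Demonstration with a temporary file and a plain list as the destination.
loaded: list[dict] = []
with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp) / "records.jsonl"
    data_file.write_text(
        '{"id": "1", "body": "hello"}\n{"id": "2", "body": "world"}\n',
        encoding="utf-8",
    )
    count = reload_records(data_file, loaded.extend)

print(count, loaded[0]["id"])  # 2 1
```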

Best Practices

Development Workflow

Start Small: Begin with small datasets to validate your approach before scaling to production solutions.

Memory Management

Dataset Size: Keep datasets reasonably sized (typically under 1GB) for optimal performance and to prevent out-of-memory errors.

Testing Strategy

Isolated Tests: Use separate InMemoryExecutor instances for different test scenarios to ensure test isolation and prevent data contamination.
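One way to get this isolation with the standard `unittest` module is to build a fresh instance in `setUp`, which runs before every test method. In the sketch below, `ToyStore` is a hypothetical stand-in for whatever an executor would return; the pattern, not the class, is the point.

```python
import unittest


class ToyStore:
    """Stand-in for an executor-backed app; holds records in a dict."""

    def __init__(self) -> None:
        self.items: dict[str, str] = {}

    def put(self, item_id: str, body: str) -> None:
        self.items[item_id] = body


class SearchTests(unittest.TestCase):
    def setUp(self) -> None:
        # A fresh instance per test: nothing written by one test reaches another.
        self.store = ToyStore()

    def test_starts_empty(self) -> None:
        self.assertEqual(self.store.items, {})

    def test_put_is_visible(self) -> None:
        self.store.put("1", "hello")
        self.assertEqual(self.store.items, {"1": "hello"})


suite = unittest.defaultTestLoader.loadTestsFromTestCase(SearchTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Both tests pass in either order, because neither depends on state left behind by the other.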

Migration Path

Production Transition: Design your application to easily migrate from InMemoryExecutor to production executors like RestExecutor when ready to deploy.
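One way to keep that migration cheap is to confine the choice of executor to a single function, so the rest of the application never names a concrete class. The sketch below uses hypothetical stand-ins (`DevExecutor`, `ProdExecutor`, `build_executor`). It does not show Superlinked's actual classes or constructor arguments, and a production executor will need its own configuration.

```python
from typing import Callable, Protocol


class SupportsRun(Protocol):
    """Anything with a run() method; the only thing callers rely on."""
    def run(self) -> object: ...


class DevExecutor:
    """Stand-in for an in-memory executor."""
    def run(self) -> str:
        return "in-memory app"


class ProdExecutor:
    """Stand-in for a production executor."""
    def run(self) -> str:
        return "production app"


EXECUTORS: dict[str, Callable[[], SupportsRun]] = {
    "dev": DevExecutor,
    "prod": ProdExecutor,
}


def build_executor(mode: str) -> SupportsRun:
    """The single place that decides which executor the application uses."""
    try:
        return EXECUTORS[mode]()
    except KeyError:
        raise ValueError(f"unknown mode: {mode!r}") from None


print(build_executor("dev").run())   # in-memory app
print(build_executor("prod").run())  # production app
```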

Context Configuration

The context_data parameter allows fine-tuning of execution behavior:

Performance Settings: Configure memory allocation and processing parameters
Debug Information: Enable detailed logging and debugging features
Environment Variables: Set environment-specific configurations
Feature Flags: Enable or disable specific functionality during development
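Structurally, context_data is a mapping of mappings: context name, then key, then value. The sketch below shows that shape and a safe two-level lookup. The context names, keys, and the `ContextValue` alias are placeholders assumed for illustration, so check the library reference for the names it actually reads.

```python
from collections.abc import Mapping
from typing import Optional, Union

# Assumed alias for this sketch; the library defines its own ContextValue.
ContextValue = Union[str, int, float, bool]

context_data: Mapping[str, Mapping[str, ContextValue]] = {
    "common": {"now": 1_700_000_000},  # placeholder context name and key
    "debug": {"verbose": True},        # placeholder context name and key
}


def get_context_value(
    data: Mapping[str, Mapping[str, ContextValue]],
    context: str,
    key: str,
    default: Optional[ContextValue] = None,
) -> Optional[ContextValue]:
    """Return data[context][key], or `default` if either level is missing."""
    return data.get(context, {}).get(key, default)


print(get_context_value(context_data, "common", "now"))      # 1700000000
print(get_context_value(context_data, "missing", "now", 0))  # 0
```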

Integration Pattern

InMemoryExecutor works seamlessly with the Superlinked development stack:

InMemorySource: Provides data input for testing and development
Various Indices: Supports all index types for experimentation
InMemoryVectorDatabase: Automatically configured for in-memory storage
InMemoryApp: Returns the appropriate application instance

Application Lifecycle

The InMemoryExecutor manages a simple lifecycle:

Initialization: Configure sources, indices, and context
Data Loading: Load all data into memory during startup
Vector Processing: Generate and store vectors in memory
Query Execution: Process queries with immediate results
Shutdown: The application stops and all in-memory data is discarded

The InMemoryExecutor provides the foundation for rapid development and testing while maintaining the same interface patterns used in production executors, ensuring smooth transitions as your application evolves.
