The Index class is a core abstraction that groups one or more spaces, together with their associated fields, behind a single querying interface.

Constructor

Create a new index with the specified configuration.
Index(
    spaces,
    fields=None,
    fields_to_exclude=None,
    effects=None,
    max_age=None,
    max_count=None,
    temperature=0.5,
    event_influence=0.5,
    time_decay_floor=1.0
)

Parameters

spaces
Space | list[Space]
required
The space or list of spaces to include in the index. These define the vector representations for your data.
fields
SchemaField | list[SchemaField]
default:"None"
The field or list of fields to be indexed. If not specified, all fields from the spaces will be indexed.
fields_to_exclude
SchemaField | list[SchemaField]
default:"None"
The field or list of fields to exclude from storage and query results.
effects
Effect | list[Effect]
default:"None"
An effect or list of effects. Effects are conditional interactions within a space that modify vector behavior based on events or conditions.
max_age
datetime.timedelta
default:"None"
Maximum age of events to be considered. Older events will be filtered out when specified. Use this to implement time-based data retention.
max_count
int
default:"None"
Batch system only: Restricts how many events should be considered based on their age. Does not apply to real-time systems.
temperature
float
default:"0.5"
Controls event contribution during aggregation. Must be between 0 and 1.
- Values closer to 0: Give more weight to stored event aggregates
- Values closer to 1: Give more weight to new events being added
event_influence
float
default:"0.5"
Controls how much the final aggregated event vector influences the base vector. Must be between 0 and 1.
- Values closer to 0: Keep results closer to the base vector
- Values closer to 1: Make results more influenced by aggregated events
time_decay_floor
float
default:"1.0"
Controls the time decay curve for event weights.
- Higher values: Create more gradual decay over time
- Lower values: Create steeper decay curves
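
As a rough mental model for temperature and event_influence, the sketch below treats both as linear interpolation weights. This is an assumption made for illustration only: the formula Superlinked actually uses is not stated here, and the helper names (lerp, aggregate_events, apply_event_influence) are hypothetical. time_decay_floor is not modeled.

```python
# Illustrative model only: linear interpolation is an ASSUMPTION,
# not Superlinked's documented aggregation formula.

def lerp(a, b, weight):
    """Blend two equal-length vectors: weight=0 returns a, weight=1 returns b."""
    return [(1.0 - weight) * x + weight * y for x, y in zip(a, b)]


def aggregate_events(event_vectors, temperature=0.5):
    """Fold events in order; temperature weights each new event
    against the aggregate stored so far."""
    if not event_vectors:
        raise ValueError("at least one event vector is required")
    aggregate = event_vectors[0]
    for new_event in event_vectors[1:]:
        aggregate = lerp(aggregate, new_event, temperature)
    return aggregate


def apply_event_influence(base_vector, event_aggregate, event_influence=0.5):
    """event_influence=0 keeps the base vector; 1 replaces it with the aggregate."""
    return lerp(base_vector, event_aggregate, event_influence)


base = [1.0, 0.0]
events = [[0.0, 1.0], [0.0, 0.0]]
aggregate = aggregate_events(events, temperature=0.5)
result = apply_event_influence(base, aggregate, event_influence=0.5)
```

In this toy model, raising temperature pulls the aggregate toward the latest event, and raising event_influence pulls the result away from the base vector, which matches the direction of the parameter descriptions above.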

Example

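from datetime import timedelta

from superlinked import Index

# text_space, category_space, price_space, and schema are assumed
# to be defined earlier in your application code.

# Create an index with multiple spaces
product_index = Index(
    spaces=[text_space, category_space, price_space],
    # Keep internal bookkeeping fields out of storage and query results
    fields_to_exclude=[schema.internal_id, schema.created_by],
    # Ignore events older than 30 days
    max_age=timedelta(days=30),
    temperature=0.7,
    event_influence=0.6
)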
The index will raise an InvalidInputException if no spaces are provided.
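
The documented constraints (at least one space, and temperature and event_influence between 0 and 1) can be checked before an index is built. The helper below is a hypothetical pre-flight check in plain Python. It is not part of Superlinked, and it raises ValueError rather than the library's InvalidInputException.

```python
def check_index_params(spaces, temperature=0.5, event_influence=0.5):
    """Hypothetical helper: validate arguments against the documented constraints."""
    # Accept either a single space or a list of spaces, as the constructor does.
    if spaces is None:
        space_list = []
    elif isinstance(spaces, (list, tuple)):
        space_list = list(spaces)
    else:
        space_list = [spaces]

    if not space_list:
        raise ValueError("at least one space is required")

    for name, value in (("temperature", temperature),
                        ("event_influence", event_influence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")

    return space_list


# Stand-in objects take the place of real Space instances in this sketch.
text_space_stub = object()
price_space_stub = object()
checked = check_index_params([text_space_stub, price_space_stub], temperature=0.7)
```

Running a check like this first surfaces configuration mistakes with a message that names the offending parameter.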

Properties

non_nullable_fields

non_nullable_fields: Sequence[SchemaField]
Returns the sequence of schema fields that cannot contain null values in this index.

schemas

schemas: Sequence[SchemaObject]
Returns the sequence of schema objects associated with this index’s spaces.

Methods

has_schema()

Check if a given schema is used as input to any of the index’s spaces.
has_schema(schema: SchemaObject) -> bool
schema
SchemaObject
required
The schema object to check for presence in the index.
Returns: bool - True if the schema is found, False otherwise.

Example

# Check if a schema is part of the index
if product_index.has_schema(product_schema):
    print("Product schema is indexed")

has_space()

Check if a given space is present in the index.
has_space(space: Space) -> bool
space
Space
required
The space object to check for presence in the index.
Returns: bool - True if the space is found, False otherwise.

Example

# Verify a space is included in the index
if product_index.has_space(text_space):
    print("Text space is part of this index")

Best Practices

Index Design: Group related spaces together in a single index to enable efficient multi-dimensional queries. For example, combine text, categorical, and numerical spaces for product data.
Event Parameters: Start with default values for temperature and event_influence (0.5), then adjust based on your specific use case and data patterns.
Batch vs Real-time: The max_count parameter applies only to batch systems; real-time systems ignore it.
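
To make the effect of max_age and max_count concrete, the sketch below filters a list of timestamped events. It assumes that max_count keeps the most recent events, which is only one reading of "based on their age". The select_events function and the (timestamp, label) tuples are hypothetical and do not reflect Superlinked internals.

```python
from datetime import datetime, timedelta


def select_events(events, now, max_age=None, max_count=None):
    """Drop events older than max_age, then keep the max_count newest ones.

    ASSUMPTION: "newest first" is an illustrative interpretation of max_count.
    """
    kept = [e for e in events if max_age is None or now - e[0] <= max_age]
    kept.sort(key=lambda e: e[0], reverse=True)  # newest first
    return kept if max_count is None else kept[:max_count]


now = datetime(2024, 1, 31)
events = [
    (datetime(2024, 1, 30), "view"),
    (datetime(2023, 12, 1), "purchase"),
    (datetime(2024, 1, 15), "click"),
]

# Only events from the last 30 days survive the age filter.
recent = select_events(events, now, max_age=timedelta(days=30))
# Adding max_count=1 keeps just the newest of those.
newest = select_events(events, now, max_age=timedelta(days=30), max_count=1)
```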