Space
The Space class serves as the abstract foundation for all vector space implementations in Superlinked. It defines the interface for transforming data into vector representations that enable similarity search and retrieval operations.
Constructor
Create a new space with the specified fields and type configuration.
Parameters
The sequence of schema fields that this space will process. These fields define what data will be transformed into vectors.
The type specification for the space, defining the expected input and output data types for the vector transformation.
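The constructor contract described above can be illustrated with a small, self-contained sketch. It does not use Superlinked itself: `AbstractSpaceSketch`, `PriceSpaceSketch`, and the parameter names `fields` and `type_` are hypothetical stand-ins chosen for illustration, and the real class may differ in its signature and validation.

```python
from abc import ABC, abstractmethod
from typing import Sequence, Tuple


class AbstractSpaceSketch(ABC):
    """Hypothetical stand-in for an abstract space (not Superlinked's class)."""

    def __init__(self, fields: Sequence[str], type_: type) -> None:
        if not fields:
            raise ValueError("a space needs at least one field")
        self.fields: Tuple[str, ...] = tuple(fields)  # fields this space processes
        self.type_ = type_  # input type the space expects

    @property
    @abstractmethod
    def length(self) -> int:
        """Dimensionality of the vectors this space produces."""


class PriceSpaceSketch(AbstractSpaceSketch):
    """Concrete subclass: one numeric field in, a one-dimensional vector out."""

    def __init__(self, field: str) -> None:
        super().__init__([field], float)

    @property
    def length(self) -> int:
        return 1


def try_instantiate_base() -> str:
    # Python refuses to instantiate a class that still has abstract members.
    try:
        AbstractSpaceSketch(["price"], float)
    except TypeError:
        return "TypeError"
    return "instantiated"
```

In this sketch, only the concrete subclass can be constructed; calling the abstract base directly raises a `TypeError`.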
The Space class is abstract and cannot be instantiated directly. Use concrete implementations like TextSimilaritySpace, CategoricalSimilaritySpace, or NumberSpace.
Properties
allow_similar_clause
When True, the space can be used in similarity searches and “looks like” queries.
annotation
length
Space Types
Superlinked provides several specialized space implementations for different data types:
Text Processing
- TextSimilaritySpace - For semantic text similarity using embedding models
Categorical Data
- CategoricalSimilaritySpace - For discrete categories
Numerical Data
- NumberSpace - For numerical data with min-max or similarity-based transformations
Time-Based Data
- RecencySpace - For time-based decay and recency scoring
Images
- ImageSpace - For image similarity using vision models
Custom Transformations
- CustomSpace - For custom vector transformations
Vector Transformation Pipeline
Data Flow
- Input Processing: Raw data from schema fields is ingested
- Type Validation: Data types are validated against the space configuration
- Transformation: Data is transformed into vector representations
- Normalization: Vectors are normalized according to space requirements
- Output: Standardized vectors ready for indexing and similarity search
Example Workflow
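The five data-flow steps can be sketched in plain Python as follows. This is a toy illustration under stated assumptions, not Superlinked's implementation: the helper names (`validate`, `transform`, `l2_normalize`, `run_pipeline`), the two-dimensional min-max encoding, and the choice of L2 normalization are all hypothetical.

```python
import math
from typing import Any, Dict, List


def validate(value: Any) -> float:
    # Type validation: accept int/float only (bool is excluded on purpose).
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError(f"unsupported value type: {type(value).__name__}")
    return float(value)


def transform(value: float, min_value: float, max_value: float) -> List[float]:
    # Transformation: min-max scale into [0, 1], clamp, and emit a 2-D vector.
    if max_value <= min_value:
        raise ValueError("max_value must be greater than min_value")
    scaled = (value - min_value) / (max_value - min_value)
    scaled = min(1.0, max(0.0, scaled))
    return [scaled, 1.0 - scaled]


def l2_normalize(vector: List[float]) -> List[float]:
    # Normalization: scale the vector to unit length (zero vectors pass through).
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


def run_pipeline(record: Dict[str, Any], field: str,
                 min_value: float, max_value: float) -> List[float]:
    raw = record[field]                                # 1. input processing
    checked = validate(raw)                            # 2. type validation
    vector = transform(checked, min_value, max_value)  # 3. transformation
    return l2_normalize(vector)                        # 4. normalization, 5. output
```

Passing a string where a number is expected raises a `TypeError` at the validation step in this sketch, which mirrors the kind of type mismatch the pipeline is meant to catch before transformation.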
Design Patterns
Composition Pattern
Spaces can be combined in indexes for multi-dimensional similarity.
Strategy Pattern
Different space types implement the same interface with different strategies:
- TextSimilaritySpace: Uses embedding models for semantic similarity
- CategoricalSimilaritySpace: Uses one-hot encoding or learned embeddings
- NumberSpace: Uses normalization and binning strategies
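Both patterns can be sketched together in a few lines of plain Python. The classes below (`ToySpace`, `ToyCategorySpace`, `ToyNumberSpace`, `ToyIndex`) are hypothetical and do not come from Superlinked. The sketch assumes that an index simply concatenates the vectors of its spaces, which may not match how the library actually combines them.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Sequence


class ToySpace(ABC):
    """Hypothetical shared interface: a field, a vector length, an encoder."""

    def __init__(self, field: str) -> None:
        self.field = field

    @property
    @abstractmethod
    def length(self) -> int:
        """Number of dimensions this space contributes."""

    @abstractmethod
    def encode(self, value: Any) -> List[float]:
        """Turn one field value into a vector of `length` floats."""


class ToyCategorySpace(ToySpace):
    """Strategy 1: one-hot encoding over a fixed category list."""

    def __init__(self, field: str, categories: Sequence[str]) -> None:
        super().__init__(field)
        self.categories = list(categories)

    @property
    def length(self) -> int:
        return len(self.categories)

    def encode(self, value: Any) -> List[float]:
        return [1.0 if value == c else 0.0 for c in self.categories]


class ToyNumberSpace(ToySpace):
    """Strategy 2: min-max scaling into a single dimension."""

    def __init__(self, field: str, min_value: float, max_value: float) -> None:
        super().__init__(field)
        self.min_value = min_value
        self.max_value = max_value

    @property
    def length(self) -> int:
        return 1

    def encode(self, value: Any) -> List[float]:
        return [(value - self.min_value) / (self.max_value - self.min_value)]


class ToyIndex:
    """Composition: concatenates the vectors of all member spaces."""

    def __init__(self, spaces: Sequence[ToySpace]) -> None:
        self.spaces = list(spaces)

    @property
    def length(self) -> int:
        return sum(space.length for space in self.spaces)

    def encode(self, record: Dict[str, Any]) -> List[float]:
        vector: List[float] = []
        for space in self.spaces:
            vector.extend(space.encode(record[space.field]))
        return vector
```

Because every strategy exposes the same `length` and `encode` members, the index in this sketch can combine spaces without knowing which encoding each one uses.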
Interface Contracts
HasTransformationConfig
Provides configuration for how data is transformed into vectors.
HasLength
Defines the dimensionality of the resulting vectors.
HasSpaceFieldSet (for some implementations)
Manages the fields and their processing within the space.
Use Cases
Semantic Search
Create spaces for finding semantically similar content.
Multi-Modal Search
Combine different data types for comprehensive search.
Recommendation Systems
Build recommendation engines using multiple signal types.
Best Practices
- Space Selection: Choose the appropriate space type based on your data characteristics. Use TextSimilaritySpace for unstructured text, CategoricalSimilaritySpace for discrete categories, and NumberSpace for continuous numerical values.
- Dimensionality: Consider the trade-off between vector dimensionality and performance. Higher dimensions can capture more nuanced relationships but require more computational resources.
- Type Consistency: Ensure your schema field types match the expected input types for your chosen space. Type mismatches will cause runtime errors.
- Combination Strategy: When using multiple spaces in an index, consider how they will be combined. Different space types may require different weighting strategies for optimal results.
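To make the point about weighting concrete, the sketch below scores items as a weighted sum of per-space cosine similarities and shows how changing the weights changes the ranking. It is an illustration only: the function names, the sample vectors, and the weighted-sum scoring rule are assumptions, not a description of how Superlinked weights spaces.

```python
import math
from typing import Dict, List

Vectors = Dict[str, List[float]]  # one vector per space name


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def weighted_score(query: Vectors, item: Vectors, weights: Dict[str, float]) -> float:
    # Weighted sum of the per-space cosine similarities.
    return sum(w * cosine(query[name], item[name]) for name, w in weights.items())


def rank(query: Vectors, items: Dict[str, Vectors], weights: Dict[str, float]) -> List[str]:
    # Item names ordered from best to worst match.
    return sorted(items, key=lambda n: weighted_score(query, items[n], weights), reverse=True)


# Hypothetical data: item "a" matches the query on text, item "b" on price.
query = {"text": [1.0, 0.0], "price": [1.0, 0.0]}
items = {
    "a": {"text": [1.0, 0.0], "price": [0.0, 1.0]},
    "b": {"text": [0.0, 1.0], "price": [1.0, 0.0]},
}
text_heavy = {"text": 1.0, "price": 0.2}
price_heavy = {"text": 0.2, "price": 1.0}
```

With `text_heavy`, item "a" ranks first; with `price_heavy`, item "b" does. The same index contents can therefore serve different ranking goals depending on the weights applied.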