June 20, 2025
Framework v28.0.0 - v28.8.2 • Batch v3.1.0 - v3.11.0 • Server v1.29.5 - v1.34.1

Added

  • Custom OpenAI Base URL: Support for setting a custom base URL for OpenAI client configuration. This allows integration with custom OpenAI deployments. [Version: 28.8.0]
  • Model Dimension Cache Update: Update to the model dimension cache, including changes to the model name cleaning logic. [Version: 28.7.0]
  • Aiohttp Version Pinning: Pinned the version of aiohttp to ensure compatibility and stability in network requests. [Version: 28.6.0]
  • Modal OOM Safety: Enhanced modal with Out-Of-Memory (OOM) safety measures to handle GPU overflows efficiently. [Version: 28.5.0]
  • Redis Connection Pooling: Implementation of Redis connection pooling to enhance efficiency and performance. [Version: 28.4.0]
  • Asynchronous Query Operations: Transitioned query evaluations for Redis and NLQ to asynchronous operations, enhancing query handling efficiency. [Version: 28.3.0]
  • Configurable Redis Search: Introduced configurable search settings for Redis, including options for adjusting hybrid policy and batch sizes. [Version: 28.1.0]
  • Redis Hybrid Policy Support (Batch): Updates to support users changing the Redis hybrid policy.
  • Data Clustering (Batch): Added data clustering by join id during the embed stage.
  • Batch Processing Configuration: Configuration updates to enhance batch processing.
  • Async Query Handling (Server): Implemented async query handling to support batched queries in the server for improved performance.
  • Redis Hybrid Policy Headers (Server): Modified framework to include options for changing Redis hybrid policies with two new headers.

Changed

  • Redis Health Check Removal: Removed Redis’s health_check_interval to prevent timeout issues. [Version: 28.8.2]
  • Async Redis Client Update: Removed refresh from async Redis client to avoid recursion errors. [Version: 28.8.1]
  • Enhanced Async Handling: Added missing async for read_node_result, enhanced resilient async handling and Redis improvements. [Version: 28.5.1]
  • Coroutine Reuse Fix: Resolved RuntimeError preventing coroutine reuse. [Version: 28.4.1]
  • Model Download Stability: Mitigated model download issues impacting connection stability. [Version: 28.1.5]
  • Batch Worker Consistency: Addressed model partial download issues for batch workers to ensure consistency. [Version: 28.1.4]
  • Model Name Prefix Removal: Removed model name prefix from downloader to assist in accurate model retrieval. [Version: 28.1.3]
  • Embedding Engine Manager: Adjusted Embedding Engine Manager implementation to function as a true singleton. [Version: 28.1.2]
  • Half Precision Embedding: Only disable half precision embedding in CI and macOS environments. For migration, set SUPERLINKED_DISABLE_HALF_PRECISION_EMBEDDING to “True” if needed. [Version: 28.1.1]
  • Error Message Guidance: Rephrased error messages to guide users in adjusting their data ingestion configurations.
  • Performance Improvements (Batch): Critical updates to jobs.py for performance improvements.
  • Error Logging Enhancement (Batch): Improved error logging for non-recoverable job failures.
  • Dataframe and Vector Handling (Batch): Bugs fixed in dataframe joining and vector handling.
  • Performance Analysis (Batch): Removed unnecessary vector count from logs for better performance analysis.
  • Blob Loading Fix (Batch): Resolved Blob loading and bucket configuration issues.
  • Asyncio Loop Enforcement (Server): Enforced the use of asyncio type loop for better compatibility and performance.
  • Secret Detection Resolution (Server): Resolved secret detection issues to ensure secure handling of sensitive information. Updated to version 28.8.1.
  • GZip Middleware (Server): Added GZip middleware to enhance server performance by compressing HTTP responses.
June 6, 2025
Framework v25.1.0 - v28.0.0 • Batch v2.29.1 - v3.1.0 • Server v1.26.1 - v1.29.5

Breaking Changes

  • Framework (v27.0.0): Replaced Text type with Tag type for string fields in Redis backend, for optimal performance and query accuracy. See migration instructions below.
  • Framework (v26.0.0): All metadata fields (admin fields) are now indexed as UNF TAGs in Redis for improved latency—origin_id is now optionally returned, controlled via configuration. See migration instructions below.

Added

  • Boolean Field Type: Introduced Boolean field type to schema; now fully supported in Redis. [Version: 25.1.0]
  • Boolean Query Syntax: Introduced boolean.field.is_() and boolean.field.is_not_() to enable more ergonomic boolean query syntax. [Version: 27.5.0]
  • Boolean Type Demonstration: Added Boolean type demonstration in hard filter notebook; extended hard filtering capabilities in examples. [Version: 27.6.0]
  • EngineManager Abstraction: Introduced EngineManager abstraction to decouple model caching and embedding (enables multiple embedding handler configs, less memory usage, reduced serialization). [Version: 28.0.0]
  • Boolean Field Support (Batch): Added Boolean field support across schemas, improving integration with Redis databases. [Version: 25.1.0]
  • Error Code Filtering (Batch): Added error code filtering to reliably alert on non-recoverable Spark job failures, integrating with alerting workflows.

Changed

  • RedisVL Update: Updated RedisVL to resolve compatibility and build issues. [Version: 27.2.2]
  • Redis Hybrid Policy: Set sensible default values for Redis hybrid policy and batch size for improved search performance.
  • Dependency Management: Several Python dependencies (e.g., async-timeout, attrs, ml-dtypes) now marked as optional for improved redis/colab support; conditional dependencies fine-tuned for redis environment.
  • Batch Executor Refactor: Refactored batch executor to create embedding engine manager at the executor level rather than driver/worker, minimizing unnecessary serialization and memory usage. [Version: 28.0.0]

Fixed

  • Spark and Dataproc Settings (Batch): Added Spark and Dataproc settings to enable easier persistence and retrieval of executor logs for debugging and alerting.
  • Job Submission Logic (Batch): Improved job submission logic to exit and cancel Dataproc jobs on unrecoverable errors, minimizing cluster resource waste.
  • RedisVL Update (Server): Updated redisvl to the latest compatible version.

Migration Guide

Text-to-Tag migration in Redis

Re-ingest your data to use the new performance optimized types.

Admin field indexing change

Metadata and admin fields are now indexed as TAGs, not returned by default. If you previously relied on origin_id being returned in query responses, you must set QUERY_TO_RETURN_ORIGIN_ID=True in your settings to retain previous behavior.
May 25, 2025
Framework v23.2.0 - v25.1.0 • Batch v2.22.0 - v2.29.0 • Server v1.25.0 - v1.26.0

Breaking Changes

  • Framework (v24.0.0): Fixed embedding node node_id generation.
  • Framework (v25.0.0): Introduced full text field for Redis performance improvements.

Added

  • Boolean Field Type: Introduced boolean field type to schema for both Framework and Batch.
  • Qdrant gRPC Config: Added gRPC configurability for Qdrant.
  • Qdrant Search Algorithm Config: Implemented search algorithm configuration for Qdrant vector search.
  • Query Result Weight Explanation: Extended weight mechanics and explanation in query results.
  • Batch Data Sorting: Added sorting of parsed data by ID in Batch.

Changed

  • Redis Query Performance: Improved Redis query performance by using __schema__ as a tag.
  • Modal Embedding Latency: Enhanced Modal embeddings with a minimum latency setting.
  • Concurrent Processing Optimization: Optimized concurrent processing for blob loading, Hugging Face embeddings, and query DAG evaluation.
  • Log Argument Naming: Updated argument naming in framwork logs for clarity.

Fixed

  • Embedding Node ID Generation: Fixed embedding node with specific node_id generation.
  • Vector Operation Negative Filter: Resolved negative filter handling in vector operations.
  • Batch GPU/CPU OOM: Addressed batch GPU/CPU Out-Of-Memory issues.
  • Batch Negative Filter Indices: Ensured no None value for negative_filter_indices in Batch.
  • Batch Node ID Calculation: Adjusted batch to node_id calculation change.
  • Raw Schema Reading (Batch): Corrected raw schema reading in Batch.
  • Model Loading Memory Leak (Batch): Prevented memory leaks in model loading in Batch.
  • Parallel Download Connection Pooling (Batch): Ensured proper connection pooling for parallel downloads in Batch.
May 9, 2025
Framework v22.16.1 - v23.2.0 • Batch v2.18.1 - v2.22.0 • Server v1.24.2 - v1.25.0

Breaking Changes

  • Framework (v23.0.0):
    • Changed default of SUPERLINKED_CONCURRENT_BLOB_LOADING setting to True.
    • Introduced new setting: SUPERLINKED_CONCURRENT_EFFECT_EVALUATION which enables concurrent event processing (defaults to True).

Added

  • Concurrent Event Processing: Introduced concurrent event processing for improved throughput.
  • Image Resizing: Enabled resizing images for more flexible input handling.
  • Configurable Redis Search: Added support for configurable search algorithm (FLAT or HNSW) for Redis.

Changed

  • Modal Performance: Improved modal performance.
  • Silver Dataset Efficiency: Optimized silver dataset updates by using deltas instead of full recreation, improving efficiency.

Fixed

  • HuggingFace Error Handling: Improved error handling by catching HfHubHTTPError when interfacing with HuggingFace.
  • Batch Metadata Storage: Reworked metadata handling to enable storing (partial) node results in batch.
April 25, 2025
Framework v22.13.0 - v22.16.1 • Batch v2.16.2 - v2.18.1 • Server v1.23.0 - v1.24.2

Added

  • DescribedBlob Shorthand: DescribedBlob can now be imported using sl.DescribedBlob.
  • Infinity Image Embedding: Added support for image embedding with Infinity models.
  • Modal Handler: Introduced modal handler capabilities.

Changed

  • Result Loading Performance: Optimized query result loading to happen only once per query.
  • Text Embedding Performance: Optimized text embedding to process each unique text value only once.
  • Batch Job Timestamp Performance: Improved created_at timestamp generation in batch jobs by using Spark metadata instead of file scanning.

Fixed

  • OnlineAggregationNode Traversal: Improved parent traversal logic for OnlineAggregationNode.
  • Dataframe Nullable Support: Added support for nullable fields and columns in the dataframe parser.
  • NumberSpace Default Weight: Ensured NumberSpace with_vector queries default to the correct weight.
  • Server Blob Handling: Fixed issue where blob file names were incorrectly treated as directories.
  • Batch Empty Message Bus Handling: Prevented creation of dummy dataframes when the external message bus provides no data.
April 8, 2025
Framework v19.13.0 - v22.7.0 • Server v1.5.0 - v1.19.1 • Batch v1.59.1 - v2.9.0

Breaking Changes

  • HuggingFace API Re-ingestion (v22.0.0): Using external HuggingFace API for TextSimilaritySpace requires re-ingestion. Internal logic revisited to minimize this need.
  • Select Syntax Change (v21.0.0): .select(foo.bar, foo.baz) syntax is deprecated. Use .select([foo.bar, foo.baz]) to support additional parameters like metadata.
  • Non-deterministic Node ID Fix (v20.0.0): Resolved intermittent issue causing index reference problems with previously ingested data.

Added

  • Batch Settings: Introduced configuration settings in superlinked-batch.
  • VDB Timeouts: Added support for increasing timeouts for Redis and Qdrant.
  • FP16 Precision Support: Enabled FP16 precision for VDBs (SUPERLINKED_DISABLE_HALF_PRECISION_EMBEDDING=false), requiring data re-ingestion.
  • GCS Error Handling: Improved error handling for GCS file loading.
  • FP16 Model Embedding: Added support for FP16 precision in model embedding for performance.
  • Parallel Image Downloads: Implemented parallel image downloading for faster embedding.
  • Cached Model Size Limit: Added limit to cached model sizes to prevent out-of-memory errors in Spark.
  • Batch Cluster Configuration: Added support for different cluster configurations (GPU/CPU, memory) per batch job stage.
  • Multi-worker Server: Enabled server execution with multiple workers.
  • Redis Connection Persistence: Prevented Redis connection drops during inactivity.
  • Sentence-Transformers Timeout Handling: Added timeout catching for model downloads.
  • OR Operator in Hard-Filters: Added support for OR operator.
  • Per-Space Weights with Vector: Enabled per-space weights in .with_vector(...).
  • Vector Parts Metadata: Return vector parts as metadata for CustomSpace calculations.
  • Hard-Filters on ID Field: Allowed hard-filters on IdField for retrieving single entries (e.g., for CustomSpace interactions).
  • Batch File Timestamps: Use file creation time for created_at timestamp during batch ingestion.
  • Batched VDB Writes: Improved VDB write performance through batching.
  • Datetime Input Support: Added support for datetime input for created_at property.
  • Float/Int Ingestion: Allowed ingestion of floats and ints with FloatField type.

Changed

  • Image Loading Performance: Improved image loading time in batch by 7.5x.
  • Image Pre-processing Performance: Decreased image pre-processing time by 50% for image embeddings.
  • Framework Performance: Enhanced framework performance through parallel execution.
  • Redis Client Update: Switched to Redis’ official Python client for improved security and maintainability.
  • VectorSampler Weights: Removed weights from VectorSampler.

Fixed

  • Qdrant Input Sanitization: Sanitized input data before Qdrant ingestion.
  • Qdrant Partial Ingestion: Prevented partial result ingestion on errors with Qdrant.
  • Nullable Value Hard-Filters: Fixed hard-filters for nullable values.
  • .with_vector() with Qdrant: Fixed .with_vector() functionality with Qdrant.
  • Batch Nullable Values: Fixed support for nullable values in batch.
  • Categorical Space Uncategorized: Fixed CategoricalSpace behavior with uncategorized_as_category=True.
  • Non-deterministic Node IDs: Resolved non-deterministic node ID issues across the framework (See Breaking Changes v20.0.0).
  • Custom ID Fields: Fixed support for custom ID fields.
February 24, 2025
Framework v17.8.0 - v19.13.0 • Server v0.7.3 - v1.5.0 • Batch v1.33.0 - v1.59.1

Breaking Changes

  • Query Results Restructure (v19.0.0): Query results have been restructured to provide more detailed information, including partial scores per space. Check this notebook for example query structure.
  • Text Similarity Space Updates (v18.0.0): TextSimilaritySpace now defaults to query mode for supported models.
  • Server Query Structure (v1.0.0): Updated to align with new framework query result structure from v18.0.0.

Added

  • Embedding Cache: Implemented recalculation cache providing 30x speed-up on 1M datasets with optimized LLM usage.
  • Partial Scores: Added per-space partial scores to query results for improved explainability.
  • Field Selection: Introduced options to select specific return fields (select_all(), select(...), select_fields=...) for improved query latency.
  • Acceptance Testing: Added internal tool for validating end-to-end user flows.

Changed

  • Storage Optimization: Removed unnecessary object storage in VDB to improve latency and storage efficiency.
  • Example Updates: Updated all examples to use simplified superlinked imports.

Fixed

  • Various improvements to the storage layer of the framework supporting pilot project requirements.
December 20, 2024
Framework v12.28.1 - v17.8.0 • Server v14.7.1 - v0.7.3 • Batch v1.17.0 - v1.33.0

Added

  • Simplified Server Startup: Added support for single command server startup using python -m superlinked.server.
  • NLQ Prompt Suggestions: Enhanced NLQ with suggestions for better prompting.
  • Debugging Feature Flag: Added option to return debugging data for users.

Changed

  • Event System Redesign: Simplified and reshaped the event system across online and batch components.
  • Batch Index Creation: Moved index creation to dedicated component instead of server instances.
  • Dynamic Model Cache: Implemented dynamic cache directory setting for Hugging Face models.
  • Query Performance: Improved query performance from O(n) to O(1) using transactional reading.
  • NLQ Refactor: Restructured NLQ implementation for improved performance and quality.

Fixed

  • Parameter Validation: Added exception throwing for invalid query parameters.

Misc

  • Code Cleanup: Removed obsolete server code from repository.
November 20, 2024
Framework v12.23.0 - v12.28.1 • Server v12.23.0 - v14.7.1 • Batch v1.15.2 - v1.17.0

Added

  • Query Arithmetics: Completed foundation for multi-modal data representation in single vectors, enabling cohesive interaction between vectors, aggregations, and normalizations.
  • Qdrant Connector: Added support for Qdrant as an alternative to MongoDB and Redis.
  • System Prompts in NLQ: Added capability for users to customize NLQ translation prompts.
  • Contains All Filter: Implemented new filter operation.
  • Query Vector Exposure: Added ability to view query vectors in results for improved observability.
  • NLQ Parameter Constraints: Added support to constrain Parameters to fixed lists.

Changed

  • DSL Improvements: Added support for single items where lists were previously required.
  • Parameter Validation: Enhanced validation for space fields, schema references, and parameter names.
  • String Parameter Support: Added support for string parameters in filter, contains, and in operators.

Fixed

  • Executor Validation: Added guardrails for data ingestion without running executor.
  • User ID Validation: Improved error handling for invalid user IDs.
November 6, 2024
Framework v12.2.0 - v12.23.0 • Server v12.2.0 - v12.23.0 • Batch v1.13.1 - v1.15.2

Added

  • Simplified Superlinked Imports: Users can now import Superlinked using a single import statement import superlinked as sl, eliminating the need to remember the import path of each object.
  • Support for OpenCLIP Models: Added support for OpenCLIP models in image embeddings, extending the supported models to include OpenCLIP and sentence-transformers supported vision encoders.

Fixed

  • File Discovery Issues: Resolved file discovery issues experienced by Yassine.
  • Dynamic Cache Directory for Sentence-Transformers: Made the sentence-transformers cache directory dynamic to ensure compatibility with batch operations.
  • Zero Division in Events: Fixed an issue where events with virtually no age resulted in a division by zero error.
  • Storing Unreferenced Fields: Corrected the storage of fields that were not referenced in the index.
  • Default Similarity Weights: Adjusted similarity weights to default to one when a parameter is not provided.
  • StringLists Support in NLQ: Added support for StringLists to be filled by NLQ.
  • Categorical Similarity Node Bug: Fixed a bug affecting events in the categorical similarity node.
  • Error on Unknown ID: Now throws an error if an unknown ID is used, instead of returning similarities for the given scenario.

Misc

  • NLQ Feature Improvements: Made further improvements to the NLQ feature.
  • New NLQ Examples: Added new NLQ examples showcasing product search instead of product reviews.
  • Enhanced Logging: Added further logging for the external message bus.
  • Test Performance Improvement: Improved test performance, reducing execution time from 4.6 seconds to 0.6 seconds.
October 23, 2024
Framework v10.1.0 - v12.2.0 • Server v10.1.0 - v12.2.0 • Batch v1.11.1 - v1.13.1

Added

  • Stack Trace in JSON Logging: Enhanced JSON logging by adding stack traces to facilitate debugging.

Fixed

  • Support SchemaA and SchemaB both affecting SchemaC: Added support for scenarios where both SchemaA and SchemaB can affect SchemaC, such as users being influenced by paragraphs they read and comments they like.
  • Registry Bug: Resolved an internal registry bug.
October 9, 2024
Framework v9.43.0 - v10.1.0 • Server v9.42.1 - 10.1.0 • Batch v1.8.0 - v1.11.1

Added

  • Optional Params in Query: Params in Query are now optional. If not filled during a query, the clause will not affect the query results.
  • Support for VDBs in Notebooks: Introduced the InteractiveExecutor to support connecting to different VDBs from a notebook environment.
  • Simplified Query Definition: Users can now provide the space as a parameter when only one field is suitable, e.g., .similar(number_space, 3) instead of .similar(number_space.number, 3).
  • Stacktrace in JSON Logs: Added stacktrace to JSON logs for improved debugging.

Fixed

  • Plot Rendering in Notebooks: Added support to render notebooks properly across different environments.
  • Redis Incorrect Results: Fixed a naming issue that caused Redis to return incorrect results. Continuous testing of VDB integrations is planned to prevent such issues.
  • Constrain Category Choices in NLQ: NLQ can no longer “hallucinate” categories that do not exist for a given query.
September 25, 2024
Framework v9.33.0 - v9.43.0 • Server v9.33.0 - v9.42.1 • Batch v1.4.0 - v1.8.0

Added

  • NLQ support for IN and NOT_IN operators: Added support for NLQ to translate natural language to IN and NOT_IN operators.
  • Logging for superlinked components: Logging was added for unified debugging and observability, led by Krisztian.
  • Basic caching for text-embeddings: Implemented caching for 10k items to handle repeating inputs efficiently.

Fixed

  • Chunking with hard-filters: Fixed an issue where chunking was breaking hard-filters.
  • Limit NLQ to respect given categories: Ensured NLQ does not “hallucinate” categories not present in the application.

Changed

  • Image embedding model update: Replaced the image embedding model in the use-case notebook with the latest model.

Misc

  • Batch infrastructure risk mitigation: Verified that the GCP hosted solution works as intended.
  • Hard-filter compatibility with batch: Ensured compatibility of IN, NOT_IN, LT, LTE, GT, GTE operators with batch-encoded datasets.
September 11, 2024
Framework v9.22.1 - v9.33.0 • Server v9.33.0 - v9.33.0 • Batch v1.1.2 - v1.4.0

Added

  • Separated the user code into multiple files: We have separated the previous app.py into dedicated index.py/query.py/api.py, which promotes reusability and explainability of the system behavior.
  • Added “or” and “contains” to the notebooks: Added further examples to notebooks based on user feedback.

Fixed

  • Fixed skewed vectors, when recency was 0: A known limitation of the system is that, if full zero vectors make it to the index or query as an output of a space, then the weighting logic.
  • Added NLQ support for in and not_in operators: Newly introduced hard-filters were not properly handled by NLQ before.
  • Fixed hard-filters with chunking: Hard-filters were not tested with chunking, this is now fixed.

Changed

  • Improved the system message at server startup: Users were experiencing confusing when starting up server as the messages were not clear on the status.
August 28, 2024
Framework v9.12.1 - v9.22.1 • Server v9.12.1 - v9.22.0 • Batch v1.1.0 - v1.1.2

Added

  • Event support in batch: Users can now batch calculate with events as well, rendering the online and batch systems in feature parity.
  • Support for more hard filter operations: Added support for LT, LTE, GT, GTE, AND, OR, CONTAINS, NOT_CONTAINS, IN, NOT_IN, unlocking more tabular data heavy use-cases.
  • Optimized GPU usage: Optimized GPU utilization for best performance, which below a certain size (~10k embeddings) is more optimal with CPU.
  • Integration tests of online executor with GPU: Added tests to measure if the frameworks correctly recognize the underlying GPU, if any.
  • Added scores to example notebooks: Expanded all notebooks to show how scores work, allowing users to better understand the underlying vector similarity scores.
  • Further NLQ support: Extended NLQ features with filter support and temperature tuning based on feedback.

Fixed

  • Results were not ordered by score with Redis: Fixed issue where results were not properly ordered by their scores in ascending order.
  • Long categorical embeddings were dominating the results: Fixed an issue that resulted in all 0 vectors that skewed the results by breaking the normalization.
  • Source.put was behaving differently with different inputs: Fixed to work as expected with arrays and data frames alike.
  • Index temperature was not accepting integers: Fixed the index to accept integer temperatures.

Changed

  • Added better formatted logs in tests: Improved logging support for easier issue identification during development.
  • (breaking) Removed status endpoints for initial data loader: Necessary step to allow the executor to be stateless, preparing it for high availability hosting.
August 14, 2024
Framework v9.7.0 - v9.12.1 • Server v9.6.0 - v9.7.0 • Batch v1.0.2 - v1.1.0

Added

  • Return similarity scores: Now returning similarity scores along with results, allowing clients to better understand the distribution.
  • Improved feature notebooks: Extended with examples containing similarity scores, querying of recency space, and event parameters (max_count, max_age, temperature).
  • Code quality checks for server: Added automations to ensure server testing and unified code format.

Fixed

  • Recency was not always using the same NOW from the context: Fixed to use the correct reference point.
August 7, 2024
Framework v9.7.0 • Server v9.6.0 • Batch v1.0.2

Added

  • Bumped sentence-transformers to 3.0.1: Allows experimentation with highest scoring models from mteb leaderboard.
  • Support for empty list embedding: Enables ingestion of data with lists where not all rows have an item.
  • Default limits to vector database connectors: Set default return of 10 items for both Redis and Mongo.
  • Natural Language Queries: Users can describe parameters for prompting the underlying model.
  • Support logarithmic number embeddings: Captures non-linear preferences with large values.

Fixed

  • Negative weights boosting recommendations: Fixed issues within the event handling system.
  • Negative weight now applied even for the first received event: Addressed bug where negative weights only worked after a positively weighted event.