Text Similarity Space

Functions

`chunk(text: superlinked.framework.common.schema.schema_object.String, chunk_size: int | None = None, chunk_overlap: int | None = None, split_chars_keep: list[str] | None = None, split_chars_remove: list[str] | None = None) -> superlinked.framework.common.dag.chunking_node.ChunkingNode`

Create smaller chunks from the given text, a String SchemaFieldObject. Chunking helps when you search for more granular information in your text corpus. Try different chunk_size values to find what best fits your use case. Chunking respects word boundaries.

    Args:
        text (String): The String field the text of which is to be chunked.
        chunk_size (int | None, optional): The maximum size of each chunk in characters. Defaults to None, which means
        effectively using 250.
        chunk_overlap (int | None, optional): The maximum overlap between chunks in characters. Defaults to None, which
        means effectively using {}.
        split_chars_keep: Characters to split at, but also keep in the text. Should be characters that can signal
        meaningful breakpoints in the text. Effectively defaults to ["!", "?", "."].
        split_chars_remove: Characters to split at and remove from the text. Should be characters that can signal
        meaningful breakpoints in the text. Effectively defaults to ["\n"].

    Returns:
        ChunkingNode: The chunking node.
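
To make the parameters above concrete, here is a minimal pure-Python sketch of how such chunking could behave. It is not Superlinked's implementation: the helper names `chunk_text` and `_pack` are hypothetical, overlap handling is left out, and the greedy packing strategy is an assumption. It only illustrates splitting at `split_chars_keep` and `split_chars_remove`, and keeping chunks within `chunk_size` characters without cutting inside a word.

```python
def _pack(units, chunk_size):
    """Greedily join units with single spaces while staying within chunk_size."""
    chunks, buf = [], ""
    for unit in units:
        candidate = unit if not buf else f"{buf} {unit}"
        if len(candidate) <= chunk_size:
            buf = candidate
        else:
            if buf:
                chunks.append(buf)
            # A unit longer than chunk_size (one very long word) stays whole.
            buf = unit
    if buf:
        chunks.append(buf)
    return chunks


def chunk_text(text, chunk_size=250,
               split_chars_keep=("!", "?", "."),
               split_chars_remove=("\n",)):
    """Hypothetical illustration of chunking that respects word boundaries."""
    # Step 1: cut the text into segments at the breakpoint characters.
    segments, current = [], ""
    for ch in text:
        if ch in split_chars_remove:
            # The breakpoint character is dropped.
            if current.strip():
                segments.append(current.strip())
            current = ""
        elif ch in split_chars_keep:
            # The breakpoint character stays at the end of the segment.
            segments.append((current + ch).strip())
            current = ""
        else:
            current += ch
    if current.strip():
        segments.append(current.strip())

    # Step 2: segments that are too long are broken up at word boundaries.
    units = []
    for segment in segments:
        if len(segment) > chunk_size:
            units.extend(_pack(segment.split(), chunk_size))
        else:
            units.append(segment)

    # Step 3: pack neighbouring units together up to chunk_size characters.
    return _pack(units, chunk_size)


sample = "Hello world. This is a test!\nNew line here"
small_chunks = chunk_text(sample, chunk_size=20)
one_chunk = chunk_text(sample, chunk_size=250)
```

With a small `chunk_size` each sentence ends up in its own chunk, while the default-sized budget keeps the whole sample together. Comparing both results on your own data is one way to pick a chunk size.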

Classes

`TextSimilaritySpace(text: superlinked.framework.common.schema.schema_object.String | superlinked.framework.common.dag.chunking_node.ChunkingNode | None | collections.abc.Sequence[superlinked.framework.common.schema.schema_object.String | superlinked.framework.common.dag.chunking_node.ChunkingNode | None], model: str, cache_size: int = 10000, model_cache_dir: pathlib.Path | None = None, model_handler: superlinked.framework.common.space.config.embedding.text_similarity_embedding_config.TextModelHandler = TextModelHandler.SENTENCE_TRANSFORMERS)`

A text similarity space is used to create vectors from documents so that they can be searched later. We only support [SentenceTransformers](https://www.sbert.net/) models, as they have fine-tuned pooling to encode longer text sequences most efficiently.

Initialize the TextSimilaritySpace.

Args:
    text (TextInput | list[TextInput]): The text input or a list of text inputs.
        Each input is a SchemaFieldObject (String) or a ChunkingNode, not a regular Python string.
    model (str): The model used for text similarity.
    cache_size (int): The number of embeddings to store in an in-memory LRU cache.
        Set it to 0 to disable caching. Defaults to 10000.
    model_cache_dir (Path | None, optional): Directory to cache downloaded models.
        If None, uses the default cache directory. Defaults to None.
    model_handler (TextModelHandler, optional): The handler for the model.
        Defaults to TextModelHandler.SENTENCE_TRANSFORMERS.
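
The `cache_size` behaviour described above can be sketched with a small LRU cache. This is an illustration only, not the library's code: `EmbeddingCache` and `toy_embed` are hypothetical names, and `toy_embed` stands in for a real embedding model. It shows the assumed semantics: repeated texts are served from memory, the least recently used entry is evicted once the cache is full, and a size of 0 disables caching.

```python
from collections import OrderedDict


def toy_embed(text):
    """Stand-in for a real model: returns a tiny fake vector."""
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


class EmbeddingCache:
    """Hypothetical in-memory LRU cache in front of an embedding function."""

    def __init__(self, embed_fn, cache_size=10000):
        self._embed_fn = embed_fn
        self._cache_size = cache_size
        self._store = OrderedDict()
        self.misses = 0  # how many times the embedding function was called

    def embed(self, text):
        if self._cache_size > 0 and text in self._store:
            # Cache hit: mark the entry as most recently used.
            self._store.move_to_end(text)
            return self._store[text]
        self.misses += 1
        vector = self._embed_fn(text)
        if self._cache_size > 0:
            self._store[text] = vector
            if len(self._store) > self._cache_size:
                # Evict the least recently used entry.
                self._store.popitem(last=False)
        return vector


cache = EmbeddingCache(toy_embed, cache_size=2)
cache.embed("a")
cache.embed("a")   # served from the cache
cache.embed("b")
cache.embed("c")   # cache is full, so "a" is evicted
cache.embed("a")   # computed again
```

A larger cache trades memory for fewer model calls, which matters most when the same texts are embedded repeatedly.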

### Ancestors (in MRO)

* superlinked.framework.dsl.space.space.Space
* superlinked.framework.common.space.interface.has_transformation_config.HasTransformationConfig
* superlinked.framework.common.interface.has_length.HasLength
* typing.Generic
* superlinked.framework.dsl.space.has_space_field_set.HasSpaceFieldSet
* abc.ABC

### Instance variables

`space_field_set: superlinked.framework.dsl.space.space_field_set.SpaceFieldSet`

`transformation_config: superlinked.framework.common.space.config.transformation_config.TransformationConfig[superlinked.framework.common.data_types.Vector, str]`