Categorical Similarity Space

Classes

CategoricalSimilaritySpace(category_input: superlinked.framework.common.schema.schema_object.String | superlinked.framework.common.schema.schema_object.StringList | None | list[superlinked.framework.common.schema.schema_object.String | superlinked.framework.common.schema.schema_object.StringList | None], categories: list[str], negative_filter: float = 0.0, uncategorized_as_category: bool = True) : Represents a space for encoding categorical similarity.

A CategoricalSimilaritySpace measures the similarity between items that are grouped
into a finite number of textual categories. The categories are represented in an
n-hot encoded vector, with the option to apply a negative filter for unmatched
categories, which sharpens the distinction between matching and non-matching items.
Setting `negative_filter` to a large negative value effectively filters out unmatched
categories, because items with non-matching categories then get a large negative
similarity. A category input not present in `categories` is encoded as the `other`
category, and by default all such inputs are similar to each other. Set
`uncategorized_as_category` to False to suppress this behavior: inputs outside
`categories` are then not similar to each other in any case, not even when they share
the same unlisted value. To make a specific unlisted value similar only to items with
that same value, add it to `categories`.
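
The behavior described above can be illustrated with a small, self-contained sketch. This is not the library's implementation: the helper names `encode` and `similarity`, the extra trailing `other` slot, the choice to fill unmatched query slots with 0, and the plain dot product without normalization are all assumptions made for illustration.

```python
def encode(values, categories, negative_filter=0.0,
           uncategorized_as_category=True, is_query=False):
    """Toy n-hot encoder (hypothetical helper, not part of superlinked).

    The vector has one slot per entry in `categories` plus a final `other` slot.
    Item vectors use `negative_filter` for unmatched slots; query vectors use 0.
    """
    fill = 0.0 if is_query else negative_filter
    vector = [fill] * (len(categories) + 1)
    for value in values:
        if value in categories:
            vector[categories.index(value)] = 1.0
        elif is_query or uncategorized_as_category:
            # Unlisted values share the single `other` slot.
            vector[-1] = 1.0
        # Otherwise an unlisted item value sets no slot at all.
    return vector


def similarity(query_vector, item_vector):
    """Plain dot product (assumed scoring, no normalization)."""
    return sum(q * i for q, i in zip(query_vector, item_vector))


categories = ["bike", "car"]
query = encode(["bike"], categories, is_query=True)

# Matching category scores 1, non-matching category scores negative_filter.
match = similarity(query, encode(["bike"], categories, negative_filter=-5.0))
mismatch = similarity(query, encode(["car"], categories, negative_filter=-5.0))

# Two different unlisted values land in `other` and are similar by default.
other_query = encode(["plane"], categories, is_query=True)
other_default = similarity(
    other_query, encode(["boat"], categories, negative_filter=-5.0))

# With uncategorized_as_category=False, even identical unlisted values do not match.
same_query = encode(["boat"], categories, is_query=True)
other_disabled = similarity(
    same_query,
    encode(["boat"], categories, negative_filter=-5.0,
           uncategorized_as_category=False))
```

In this toy model a large negative `negative_filter` pushes non-matching items far down the ranking, which is the filtering effect the description refers to. Adding `"boat"` to `categories` would make it match only other `"boat"` items.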

Attributes:
    category_input (StringList | String | list[String | StringList]):
        The schema field containing input categories to be considered in the similarity space.
        Input contains one or more categories in a list if `StringList` is provided.
        If `String` is provided, then the input must be a single value.
    categories (List[str]): A list of categories that defines the dimensionality of the
        n-hot encoded vector. Any category not listed is considered as 'other'.
    negative_filter (float): A value to represent unmatched categories in the n-hot vector.
        Instead of using 0, which typically represents the absence of a category, this allows
        for a different representation - effectively filtering out items that have
        non-matching categories.
    uncategorized_as_category (bool): If set to False, the similarity between inputs that fall
        outside `categories` is set to 0, or to negative_filter if it is set. This controls whether
        a category_input not in categories is similar to other category_inputs not in categories.
        Note that with False, even identical category_inputs not in categories are not similar
        to each other.
Raises:
    InvalidSchemaException: If a schema object does not have a corresponding node in the
        similarity space.

Initializes a new instance of the CategoricalSimilaritySpace.

This constructor sets up the space with the necessary configurations to encode and measure categorical
similarity based on the provided parameters.

Args:
    category_input (StringList | String | list[String | StringList]):
        The schema field containing input categories to be considered in the similarity space.
        Input contains one or more categories in a list if `StringList` is provided.
        If `String` is provided, then the input must be a single value.
    categories (list[str]): A list of all the recognized categories. Categories not included in this list will
        be treated as 'other', unless `uncategorized_as_category` is False.
    negative_filter (float, optional): A value used to represent unmatched categories in the encoding process.
        This allows penalizing non-matching categories - instead of contributing 0 to similarity,
        they can influence the similarity score negatively. Defaults to 0.0.
    uncategorized_as_category (bool, optional): Determines whether categories not listed in `categories` should
        be treated as a distinct 'other' category. Defaults to True.

Raises:
    InvalidSchemaException: If a schema object does not have a corresponding node in the similarity space,
    indicating a configuration or implementation error.

### Ancestors (in MRO)

* superlinked.framework.dsl.space.space.Space
* superlinked.framework.common.space.interface.has_transformation_config.HasTransformationConfig
* superlinked.framework.common.interface.has_length.HasLength
* typing.Generic
* superlinked.framework.dsl.space.has_space_field_set.HasSpaceFieldSet
* abc.ABC

### Instance variables

`category: superlinked.framework.dsl.space.space_field_set.SpaceFieldSet`
:

`space_field_set: superlinked.framework.dsl.space.space_field_set.SpaceFieldSet`
:

`transformation_config: superlinked.framework.common.space.config.transformation_config.TransformationConfig[superlinked.framework.common.data_types.Vector, list[str]]`
:

`uncategorized_as_category: bool`
:

Last updated 2 months ago
