Intro
Superlinked’s framework is built on six key components: @schema, Source, Spaces, Index, Query, and Executor. These building blocks let you create a modular system tailored to your specific use cases. You begin by defining your desired endpoints, that is, how you want your embeddings to represent your data. This guides your system setup and lets you customize your modules before running queries. You can adjust query weights for different scenarios, such as a user’s interests or recent items.
This modular approach separates query description from execution, so you can run the same query across different environments without reimplementing it. You build your Query using descriptive elements like @schema, Source, Spaces, Index, or Event, which can be reused with different Executors. Superlinked’s focus on connectors makes it easy to move between deployments, from in-memory to batch or real-time data pipelines. This flexibility allows for rapid experimentation in embedding and retrieval while maximizing control over index creation. Let’s explore these building blocks in more detail.
Turning classes into Schemas
Once you’ve parsed data into your notebook via JSON or a pandas DataFrame, it’s time to create a Schema describing your data. To do this, use the @schema decorator to annotate your class as a schema representing your structured data. Schemas translate to searchable entities in the embedding space. To get started, type @schema, and then define the field types to match the different types of data you’ve imported.
Declaring how to embed your data using Spaces
Spaces are declarative classes for specifying how your data is embedded. The Space module encapsulates the vector creation logic that is used at ingestion time and again at query time. Spaces let you tailor how you embed different attributes of your data, and they can be categorized along two key dimensions:
- what input types the Space permits, e.g., text, timestamp, numeric, categorical
- whether the Space represents similarity (e.g., TextSimilaritySpace) or scale (e.g., a numeric space)
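To make these two dimensions concrete, here is a minimal, library-free sketch in plain Python. It does not use Superlinked's API: `Paragraph`, `ToyTextSimilaritySpace`, `ToyNumberSpace`, and the `rating` field are hypothetical names invented for illustration, and the bag-of-words and min-max encodings stand in for whatever model a real Space would wrap. The only point it tries to show is a schema-like class whose fields are embedded by Space-like objects, with the same embedding logic reused at ingestion time and at query time.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Paragraph:
    """Schema-like record (hypothetical): one typed field per attribute."""
    id: str
    body: str
    rating: float  # made-up numeric field, used only to show a scale space


class ToyTextSimilaritySpace:
    """Similarity space: accepts text; vectors are compared by cosine."""

    def __init__(self, vocabulary):
        self.vocabulary = list(vocabulary)

    def embed(self, text):
        counts = Counter(text.lower().split())
        vector = [float(counts[word]) for word in self.vocabulary]
        norm = math.sqrt(sum(v * v for v in vector))
        # Unit-normalize so that a dot product equals cosine similarity
        return [v / norm for v in vector] if norm else vector


class ToyNumberSpace:
    """Scale space: accepts a number; encodes its position within a range."""

    def __init__(self, min_value, max_value):
        self.min_value = min_value
        self.max_value = max_value

    def embed(self, value):
        clipped = min(max(value, self.min_value), self.max_value)
        return [(clipped - self.min_value) / (self.max_value - self.min_value)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


text_space = ToyTextSimilaritySpace(["happy", "dog", "person", "sunny", "day"])
rating_space = ToyNumberSpace(min_value=0.0, max_value=5.0)

item = Paragraph(id="happy_dog", body="That is a happy dog", rating=4.0)

# Ingestion time: each space embeds its own attribute; the parts are concatenated
stored_vector = text_space.embed(item.body) + rating_space.embed(item.rating)

# Query time: the same space object embeds the query text
query_vector = text_space.embed("happy dog")
text_part = stored_vector[: len(text_space.vocabulary)]
text_similarity = dot(query_vector, text_part)
```

Because the query goes through the same `embed` method as the stored item, the two vectors live in the same coordinate system, which is the property the Space abstraction is described as providing.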
Indexing
Superlinked’s Index module lets you group Spaces into indices that make your queries more efficient.
Executing your query to your chosen endpoints
Before running your code, you need to structure your query using the following clauses:
- Query: defines the index you want to search; you can also add Params here (details in our notebook).
- .find: tells it what to look for.
- .similar: tells it how to identify relevant results (details in notebook).
- .select_all: returns all the stored fields; without this clause, the query returns only the id(s) (details in notebook).
- .with_vector: searches with the embedded vector of a specific element of your data (details in notebook).
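The separation between describing a query and executing it can be sketched without any library. The snippet below is a plain-Python illustration, not Superlinked's implementation: `ToyQuery`, `ToyInMemoryExecutor`, and the word-overlap scoring are hypothetical stand-ins. The method names `find`, `similar`, and `select_all` only loosely mirror the clauses listed above, and their signatures here are assumptions.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ToyQuery:
    """Immutable description of a query; it holds no data and runs nothing."""
    index_name: str
    schema_name: str = ""
    query_text: str = ""
    return_all_fields: bool = False

    def find(self, schema_name):
        return replace(self, schema_name=schema_name)

    def similar(self, query_text):
        return replace(self, query_text=query_text)

    def select_all(self):
        return replace(self, return_all_fields=True)


class ToyInMemoryExecutor:
    """One possible executor: keeps rows in a list, scores by word overlap."""

    def __init__(self):
        self.rows = []

    def put(self, rows):
        self.rows.extend(rows)

    def run(self, query):
        wanted = set(query.query_text.lower().split())

        def score(row):
            return len(wanted & set(row["body"].lower().split()))

        ranked = sorted(self.rows, key=score, reverse=True)
        if query.return_all_fields:
            return [dict(row) for row in ranked]
        # Without select_all, only ids are returned
        return [{"id": row["id"]} for row in ranked]


# Describe the query once...
base = ToyQuery(index_name="paragraph_index").find("paragraph").similar("sunny day")
with_fields = base.select_all()

# ...then hand it to whichever executor is available
executor = ToyInMemoryExecutor()
executor.put([
    {"id": "happy_dog", "body": "That is a happy dog"},
    {"id": "sunny_day", "body": "Today is a sunny day"},
])

ids_result = executor.run(base)
full_result = executor.run(with_fields)
```

Since `ToyQuery` never touches the stored rows, the same object could in principle be passed to a different executor (for example a batch one) without changing how the query is written. That is the design idea the text describes.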
Experimenting with some sample data
Now we can insert some sample data…
|   | body | id |
|---|---|---|
| 0 | That is a very happy person | happy_person |
| 1 | That is a happy dog | happy_dog |
| 2 | Today is a sunny day | sunny_day |

|   | body | id |
|---|---|---|
| 0 | That is a happy dog | happy_dog |
| 1 | That is a very happy person | happy_person |
| 2 | Today is a sunny day | sunny_day |
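If the second table is read as a similarity-ranked result, an ordering like it can be reproduced with a small stdlib-only sketch. This is an illustration under stated assumptions, not the method that produced the table: the query text `"happy dog"` is hypothetical (the source does not state the query), and bag-of-words cosine similarity stands in for a real text embedding model.

```python
import math
from collections import Counter

# The three sample rows, in the order of the first table
rows = [
    {"id": "happy_person", "body": "That is a very happy person"},
    {"id": "happy_dog", "body": "That is a happy dog"},
    {"id": "sunny_day", "body": "Today is a sunny day"},
]


def bag_of_words(text):
    return Counter(text.lower().split())


def cosine(a, b):
    # Counter returns 0 for missing words, so unmatched terms contribute nothing
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


query_text = "happy dog"  # hypothetical query, not taken from the source
query_vector = bag_of_words(query_text)

ranked = sorted(
    rows,
    key=lambda row: cosine(query_vector, bag_of_words(row["body"])),
    reverse=True,
)
ranked_ids = [row["id"] for row in ranked]
print(ranked_ids)  # ['happy_dog', 'happy_person', 'sunny_day']
```

With this toy scoring, "That is a happy dog" shares two words with the query, "That is a very happy person" shares one, and "Today is a sunny day" shares none. That gives the same order as the second table.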