Splore API (0.24.0)

Ingestion and search API for the Splore search platform.

Search Documents

Search documents with a query and filters

Search for documents in a collection

Search for documents in a collection that match the search criteria.

Authorizations:
None
path Parameters
collectionName
required
string

The name of the collection to search in

query Parameters
required
object (SearchParameters)

Responses

Response samples

Content type
application/json
{
  "facet_counts": [ ],
  "found": 0,
  "search_time_ms": 0,
  "out_of": 0,
  "search_cutoff": true,
  "page": 0,
  "grouped_hits": [ ],
  "hits": [ ],
  "request_params": { }
}
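
As a rough client-side sketch, the required q and query_by parameters and any optional ones could be assembled into a query string as shown below. This assumes the SearchParameters object is sent as individual query-string fields and that booleans are written as lowercase true/false; build_search_query is a hypothetical helper, not part of the Splore API, and the field names in the example are made up.

```python
from urllib.parse import urlencode


def build_search_query(q, query_by, **optional):
    """Return a URL-encoded query string for a search request.

    q and query_by are the two required search parameters. Optional
    parameters whose value is None are left out.
    """
    params = {"q": q, "query_by": query_by}
    for name, value in optional.items():
        if value is None:
            continue
        # Assumption: booleans are sent as lowercase true/false.
        params[name] = str(value).lower() if isinstance(value, bool) else value
    return urlencode(params)


# Hypothetical field names, for illustration only.
query = build_search_query("shoe", "title,description", per_page=10, use_cache=True)
print(query)
```

The resulting string would be appended to the search endpoint URL of the target collection.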

Send multiple search requests in a single HTTP request

This is especially useful for avoiding the round-trip network latency that would otherwise be incurred if each request were sent in a separate HTTP request. You can also use this feature to run a federated search across multiple collections in a single HTTP request.

Authorizations:
None
query Parameters
required
object (MultiSearchParameters)

Parameters for the multi search API.

Request Body schema: application/json
required
Array of objects (MultiSearchCollectionParameters)
Array
q
string

The query text to search for in the collection. Use * as the search string to return all documents. This is typically useful when used in conjunction with filter_by.

query_by
string

A list of string fields that should be queried against. Multiple fields are separated with a comma.

query_by_weights
string

The relative weight to give each query_by field when ranking results. This can be used to boost fields in priority, when looking for matches. Multiple fields are separated with a comma.

prefix
string

Boolean field to indicate that the last word in the query should be treated as a prefix, and not as a whole word. This is used for building autocomplete and instant search interfaces. Defaults to true.

infix
string

If an infix index is enabled for a field, infix searching can be done on a per-field basis by sending a comma-separated string parameter called infix with the search query. This parameter accepts three values: off (infix search is disabled; this is the default), always (infix search is performed along with regular search), and fallback (infix search is performed if regular search does not produce results).

max_extra_prefix
integer

The maximum number of symbols that may appear before the query within a token for it to count as an infix match. Together with max_extra_suffix, this parameter controls the extent of infix searching. For example, the query "K2100" has 2 extra symbols before it in "6PK2100". By default, any number of extra symbols can be present for a match.

max_extra_suffix
integer

The maximum number of symbols that may appear after the query within a token for it to count as an infix match. Together with max_extra_prefix, this parameter controls the extent of infix searching. For example, the query "K2100" has 2 extra symbols before it and none after it in "6PK2100". By default, any number of extra symbols can be present for a match.
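
To make the counting concrete, the sketch below computes the extra prefix and suffix symbols for a query inside a token and checks them against the two limits. It is a client-side illustration of the description above, not Splore's matching code; extra_symbols and infix_allowed are hypothetical names.

```python
def extra_symbols(query, token):
    """Return (extra_prefix, extra_suffix) for the first occurrence of
    query inside token, or None if token does not contain query."""
    start = token.find(query)
    if start == -1:
        return None
    return start, len(token) - start - len(query)


def infix_allowed(query, token, max_extra_prefix=None, max_extra_suffix=None):
    """Check an infix match against the limits; None means no limit."""
    counts = extra_symbols(query, token)
    if counts is None:
        return False
    prefix, suffix = counts
    if max_extra_prefix is not None and prefix > max_extra_prefix:
        return False
    if max_extra_suffix is not None and suffix > max_extra_suffix:
        return False
    return True


print(extra_symbols("K2100", "6PK2100"))                      # (2, 0)
print(infix_allowed("K2100", "6PK2100", max_extra_prefix=1))  # False
```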

filter_by
string

Filter conditions for refining your search results. Separate multiple conditions with &&.
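
A minimal sketch of combining several conditions into one filter_by value. The two conditions are taken from the SearchParameters sample on this page; join_filters is a hypothetical helper and performs no validation of the condition syntax.

```python
def join_filters(*conditions):
    """Join non-empty filter conditions with the && separator."""
    return " && ".join(c.strip() for c in conditions if c and c.strip())


filter_by = join_filters("num_employees:>100", "country: [USA, UK]")
print(filter_by)  # num_employees:>100 && country: [USA, UK]
```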

sort_by
string

A list of numerical fields and their corresponding sort orders that will be used for ordering your results. Up to 3 sort fields can be specified. The text similarity score is exposed as a special _text_match field that you can use in the list of sorting fields. If no sort_by parameter is specified, results are sorted by _text_match:desc,default_sorting_field:desc
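
The sort_by string could be assembled as in the sketch below, which enforces the limit of 3 sort fields on the client side. build_sort_by is a hypothetical helper, and the asc order is an assumption, since only desc appears on this page.

```python
def build_sort_by(*fields):
    """Build a sort_by value from (field_name, order) pairs."""
    if len(fields) > 3:
        raise ValueError("sort_by accepts at most 3 sort fields")
    parts = []
    for name, order in fields:
        # Assumption: 'asc' is accepted alongside the documented 'desc'.
        if order not in ("asc", "desc"):
            raise ValueError(f"unknown sort order: {order!r}")
        parts.append(f"{name}:{order}")
    return ",".join(parts)


sort_by = build_sort_by(("_text_match", "desc"), ("num_employees", "desc"))
print(sort_by)  # _text_match:desc,num_employees:desc
```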

facet_by
string

A list of fields that will be used for faceting your results on. Separate multiple fields with a comma.

max_facet_values
integer

Maximum number of facet values to be returned.

facet_query
string

Facet values that are returned can now be filtered via this parameter. The matching facet text is also highlighted. For example, when faceting by category, you can set facet_query=category:shoe to return only facet values that contain the prefix "shoe".

num_typos
string

The number of typographical errors (1 or 2) that would be tolerated. Default: 2

page
integer

Results from this specific page number would be fetched.

per_page
integer

Number of results to fetch per page. Default: 10

group_by
string

You can aggregate search results into groups or buckets by specifying one or more group_by fields. Separate multiple fields with a comma. To group on a particular field, it must be a faceted field.

group_limit
integer

Maximum number of hits to be returned for every group. If the group_limit is set as K then only the top K hits in each group are returned in the response. Default: 3
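
The effect of group_limit can be pictured with the following sketch, which groups an already ranked list of hits and keeps only the top K in each group. This only illustrates the semantics described above; the server performs the grouping itself, and group_hits and the brand field are hypothetical.

```python
from collections import defaultdict


def group_hits(hits, group_by, group_limit=3):
    """Group ranked hits by a field, keeping the first group_limit per group."""
    groups = defaultdict(list)
    for hit in hits:  # hits are assumed to be in rank order already
        bucket = groups[hit[group_by]]
        if len(bucket) < group_limit:
            bucket.append(hit)
    return dict(groups)


hits = [
    {"id": "1", "brand": "a"},
    {"id": "2", "brand": "b"},
    {"id": "3", "brand": "a"},
    {"id": "4", "brand": "a"},
]
grouped = group_hits(hits, "brand", group_limit=2)
print([h["id"] for h in grouped["a"]])  # ['1', '3']
```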

include_fields
string

List of fields from the document to include in the search result

exclude_fields
string

List of fields from the document to exclude in the search result

highlight_full_fields
string

List of fields which should be highlighted fully without snippeting

highlight_affix_num_tokens
integer

The number of tokens that should surround the highlighted text on each side. Default: 4

highlight_start_tag
string

The start tag used for the highlighted snippets. Default: <mark>

highlight_end_tag
string

The end tag used for the highlighted snippets. Default: </mark>

snippet_threshold
integer

Field values under this length will be fully highlighted, instead of showing a snippet of the relevant portion. Default: 30

drop_tokens_threshold
integer

If the number of results found for a specific query is less than this number, Splore will attempt to drop the tokens in the query until enough results are found. Tokens that have the least individual hits are dropped first. Set to 0 to disable. Default: 10

typo_tokens_threshold
integer

If the number of results found for a specific query is less than this number, Splore will attempt to look for tokens with more typos until enough results are found. Default: 100

pinned_hits
string

A comma-separated list of record_id:hit_position pairs identifying records to unconditionally include in the search results at specific positions, for example to feature or promote certain items at the top of the results. To include a record with ID 123 at position 1 and another record with ID 456 at position 5, specify 123:1,456:5. You can also use the Overrides feature to override search results based on rules. Overrides are applied first, followed by pinned_hits and finally hidden_hits.

hidden_hits
string

A comma-separated list of record IDs to unconditionally hide from the search results. For example, to hide records with IDs 123 and 456, specify 123,456. You can also use the Overrides feature to override search results based on rules. Overrides are applied first, followed by pinned_hits and finally hidden_hits.
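
A small sketch of producing the pinned_hits and hidden_hits strings from Python data, using the IDs from the descriptions above. format_pinned_hits and format_hidden_hits are hypothetical helpers.

```python
def format_pinned_hits(pins):
    """pins maps a record ID to the position it should be pinned at."""
    return ",".join(f"{record_id}:{position}" for record_id, position in pins.items())


def format_hidden_hits(record_ids):
    """record_ids is an iterable of record IDs to hide."""
    return ",".join(str(record_id) for record_id in record_ids)


print(format_pinned_hits({"123": 1, "456": 5}))  # 123:1,456:5
print(format_hidden_hits(["123", "456"]))        # 123,456
```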

highlight_fields
string

A list of custom fields that must be highlighted even if you don't query for them

pre_segmented_query
boolean

You can index content from any logographic language into Splore if you are able to segment (split) the text into space-separated words yourself before indexing and querying. Set this parameter to true if the query has been pre-segmented in this way.

preset
string

The name of an existing preset. The search parameters stored in that preset are applied to this search.

enable_overrides
boolean

If you have some overrides defined but want to disable all of them during query time, you can do that by setting this parameter to false

prioritize_exact_match
boolean

Set this parameter to true to ensure that an exact match is ranked above the others

exhaustive_search
boolean

Setting this to true will make Splore consider all prefixes and typo corrections of the words in the query without stopping early when enough results are found (drop_tokens_threshold and typo_tokens_threshold configurations are ignored).

search_cutoff_ms
integer

Splore will attempt to return results early if the cutoff time has elapsed. This is not a strict guarantee and facet computation is not bound by this parameter.

use_cache
boolean

Enable server side caching of search query results. By default, caching is disabled.

cache_ttl
integer

The duration (in seconds) that determines how long the search query is cached. This value can be set on a per-query basis. Default: 60.

min_len_1typo
integer

Minimum word length for 1-typo correction to be applied. The value of num_typos is still treated as the maximum allowed typos.

min_len_2typo
integer

Minimum word length for 2-typo correction to be applied. The value of num_typos is still treated as the maximum allowed typos.

vector_query
string

Vector query expression for fetching documents "closest" to a given query/document vector.

collection
required
string

The collection to search in.

Responses

Request samples

Content type
application/json
{
  "searches": [ ]
}

Response samples

Content type
application/json
{
  "results": [ ]
}

Index Document(s)

Add document to search index

Index a document

A document to be indexed in a given collection must conform to the schema of the collection.

Authorizations:
None
path Parameters
collectionName
required
string

The name of the collection to add the document to

query Parameters
action
string
Value: "upsert"
Example: action=upsert

Additional action to perform

Request Body schema: application/json

The document object to be indexed

object

Can be any key-value pair

Responses

Request samples

Content type
application/json
{ }

Response samples

Content type
application/json
{ }

Import documents into a collection

The documents to be imported must be formatted as newline-delimited JSON (JSONL), with one document per line. The output file of a Splore export operation can be used directly as import input.

Authorizations:
None
path Parameters
collectionName
required
string

The name of the collection

query Parameters
object
Request Body schema: application/octet-stream

The json array of documents or the JSONL file to import

string

The JSONL file to import

Responses

Response samples

Content type
application/json
{
  "message": "string"
}
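
As a sketch of preparing an import payload, the snippet below serializes a list of documents as newline-delimited JSON, one document per line. to_jsonl is a hypothetical helper and the documents are made up; they would still need to conform to the schema of the target collection.

```python
import json


def to_jsonl(documents):
    """Serialize documents as newline-delimited JSON, one per line."""
    # json.dumps escapes newlines inside strings, so each document
    # always occupies exactly one line.
    return "\n".join(json.dumps(doc, ensure_ascii=False) for doc in documents)


payload = to_jsonl([
    {"id": "1", "title": "Running shoe"},
    {"id": "2", "title": "Trail\nshoe"},
])
print(payload)
```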

Search Parameters

q
required
string

The query text to search for in the collection. Use * as the search string to return all documents. This is typically useful when used in conjunction with filter_by.

query_by
required
string

A list of string fields that should be queried against. Multiple fields are separated with a comma.

query_by_weights
string

The relative weight to give each query_by field when ranking results. This can be used to boost fields in priority, when looking for matches. Multiple fields are separated with a comma.

prefix
string

Boolean field to indicate that the last word in the query should be treated as a prefix, and not as a whole word. This is used for building autocomplete and instant search interfaces. Defaults to true.

infix
string

If an infix index is enabled for a field, infix searching can be done on a per-field basis by sending a comma-separated string parameter called infix with the search query. This parameter accepts three values: off (infix search is disabled; this is the default), always (infix search is performed along with regular search), and fallback (infix search is performed if regular search does not produce results).

max_extra_prefix
integer

The maximum number of symbols that may appear before the query within a token for it to count as an infix match. Together with max_extra_suffix, this parameter controls the extent of infix searching. For example, the query "K2100" has 2 extra symbols before it in "6PK2100". By default, any number of extra symbols can be present for a match.

max_extra_suffix
integer

The maximum number of symbols that may appear after the query within a token for it to count as an infix match. Together with max_extra_prefix, this parameter controls the extent of infix searching. For example, the query "K2100" has 2 extra symbols before it and none after it in "6PK2100". By default, any number of extra symbols can be present for a match.

filter_by
string

Filter conditions for refining your search results. Separate multiple conditions with &&.

sort_by
string

A list of numerical fields and their corresponding sort orders that will be used for ordering your results. Up to 3 sort fields can be specified. The text similarity score is exposed as a special _text_match field that you can use in the list of sorting fields. If no sort_by parameter is specified, results are sorted by _text_match:desc,default_sorting_field:desc

facet_by
string

A list of fields that will be used for faceting your results on. Separate multiple fields with a comma.

max_facet_values
integer

Maximum number of facet values to be returned.

facet_query
string

Facet values that are returned can now be filtered via this parameter. The matching facet text is also highlighted. For example, when faceting by category, you can set facet_query=category:shoe to return only facet values that contain the prefix "shoe".

num_typos
string

The number of typographical errors (1 or 2) that would be tolerated. Default: 2

page
integer

Results from this specific page number would be fetched.

per_page
integer

Number of results to fetch per page. Default: 10

group_by
string

You can aggregate search results into groups or buckets by specifying one or more group_by fields. Separate multiple fields with a comma. To group on a particular field, it must be a faceted field.

group_limit
integer

Maximum number of hits to be returned for every group. If the group_limit is set as K then only the top K hits in each group are returned in the response. Default: 3

include_fields
string

List of fields from the document to include in the search result

exclude_fields
string

List of fields from the document to exclude in the search result

highlight_full_fields
string

List of fields which should be highlighted fully without snippeting

highlight_affix_num_tokens
integer

The number of tokens that should surround the highlighted text on each side. Default: 4

highlight_start_tag
string

The start tag used for the highlighted snippets. Default: <mark>

highlight_end_tag
string

The end tag used for the highlighted snippets. Default: </mark>

enable_highlight_v1
boolean
Default: true

Flag for enabling/disabling the deprecated, old highlight structure in the response. Default: true

snippet_threshold
integer

Field values under this length will be fully highlighted, instead of showing a snippet of the relevant portion. Default: 30

drop_tokens_threshold
integer

If the number of results found for a specific query is less than this number, Splore will attempt to drop the tokens in the query until enough results are found. Tokens that have the least individual hits are dropped first. Set to 0 to disable. Default: 10

typo_tokens_threshold
integer

If the number of results found for a specific query is less than this number, Splore will attempt to look for tokens with more typos until enough results are found. Default: 100

pinned_hits
string

A comma-separated list of record_id:hit_position pairs identifying records to unconditionally include in the search results at specific positions, for example to feature or promote certain items at the top of the results. To include a record with ID 123 at position 1 and another record with ID 456 at position 5, specify 123:1,456:5. You can also use the Overrides feature to override search results based on rules. Overrides are applied first, followed by pinned_hits and finally hidden_hits.

hidden_hits
string

A comma-separated list of record IDs to unconditionally hide from the search results. For example, to hide records with IDs 123 and 456, specify 123,456. You can also use the Overrides feature to override search results based on rules. Overrides are applied first, followed by pinned_hits and finally hidden_hits.

highlight_fields
string

A list of custom fields that must be highlighted even if you don't query for them

split_join_tokens
string

Treat a space as a typo: search for q=basket ball if q=basketball is not found, or vice versa. Splitting or joining of tokens is only attempted if the original query produces no results. To always trigger this behavior, set the value to always. To disable it, set the value to off. Default: fallback.
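
The idea of treating a space as a typo can be illustrated by enumerating the alternative queries obtained by joining adjacent tokens or splitting a single token. This is only a sketch of the concept under simple assumptions, not Splore's implementation; a real engine would presumably keep only candidates whose words exist in the index. split_join_candidates is a hypothetical name.

```python
def split_join_candidates(query):
    """List alternative queries made by joining or splitting tokens."""
    tokens = query.split()
    candidates = []
    # Join each pair of adjacent tokens: "basket ball" -> "basketball".
    for i in range(len(tokens) - 1):
        joined = tokens[:i] + [tokens[i] + tokens[i + 1]] + tokens[i + 2:]
        candidates.append(" ".join(joined))
    # Split each token at every inner position: "basketball" -> "basket ball".
    for i, token in enumerate(tokens):
        for cut in range(1, len(token)):
            split = tokens[:i] + [token[:cut], token[cut:]] + tokens[i + 1:]
            candidates.append(" ".join(split))
    return candidates


print("basketball" in split_join_candidates("basket ball"))  # True
print("basket ball" in split_join_candidates("basketball"))  # True
```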

pre_segmented_query
boolean

You can index content from any logographic language into Splore if you are able to segment (split) the text into space-separated words yourself before indexing and querying. Set this parameter to true if the query has been pre-segmented in this way.

preset
string

The name of an existing preset. The search parameters stored in that preset are applied to this search.

enable_overrides
boolean

If you have some overrides defined but want to disable all of them during query time, you can do that by setting this parameter to false

prioritize_exact_match
boolean

Set this parameter to true to ensure that an exact match is ranked above the others

max_candidates
integer

Control the number of words that Splore considers for typo and prefix searching.

prioritize_token_position
boolean

Make Splore prioritize documents where the query words appear earlier in the text.

exhaustive_search
boolean

Setting this to true will make Splore consider all prefixes and typo corrections of the words in the query without stopping early when enough results are found (drop_tokens_threshold and typo_tokens_threshold configurations are ignored).

search_cutoff_ms
integer

Splore will attempt to return results early if the cutoff time has elapsed. This is not a strict guarantee and facet computation is not bound by this parameter.

use_cache
boolean

Enable server side caching of search query results. By default, caching is disabled.

cache_ttl
integer

The duration (in seconds) that determines how long the search query is cached. This value can be set on a per-query basis. Default: 60.

min_len_1typo
integer

Minimum word length for 1-typo correction to be applied. The value of num_typos is still treated as the maximum allowed typos.

min_len_2typo
integer

Minimum word length for 2-typo correction to be applied. The value of num_typos is still treated as the maximum allowed typos.
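
The interaction between num_typos, min_len_1typo and min_len_2typo can be sketched as below: word length decides how many typos are eligible, and num_typos caps the result. The thresholds 4 and 7 in the example are illustrative values, not documented defaults, and allowed_typos is a hypothetical helper.

```python
def allowed_typos(word, num_typos, min_len_1typo, min_len_2typo):
    """Return how many typos would be tolerated for a single query word."""
    if len(word) >= min_len_2typo:
        by_length = 2
    elif len(word) >= min_len_1typo:
        by_length = 1
    else:
        by_length = 0
    # num_typos remains the upper bound regardless of word length.
    return min(num_typos, by_length)


# Illustrative thresholds only; the actual defaults are not stated here.
for word in ("cat", "shoes", "sneakers"):
    print(word, allowed_typos(word, num_typos=2, min_len_1typo=4, min_len_2typo=7))
```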

vector_query
string

Vector query expression for fetching documents "closest" to a given query/document vector.

{
  "q": "string",
  "query_by": "string",
  "query_by_weights": "string",
  "prefix": "string",
  "infix": "string",
  "max_extra_prefix": 0,
  "max_extra_suffix": 0,
  "filter_by": "num_employees:>100 && country: [USA, UK]",
  "sort_by": "num_employees:desc",
  "facet_by": "string",
  "max_facet_values": 0,
  "facet_query": "string",
  "num_typos": "string",
  "page": 0,
  "per_page": 0,
  "group_by": "string",
  "group_limit": 0,
  "include_fields": "string",
  "exclude_fields": "string",
  "highlight_full_fields": "string",
  "highlight_affix_num_tokens": 0,
  "highlight_start_tag": "string",
  "highlight_end_tag": "string",
  "enable_highlight_v1": true,
  "snippet_threshold": 0,
  "drop_tokens_threshold": 0,
  "typo_tokens_threshold": 0,
  "pinned_hits": "string",
  "hidden_hits": "string",
  "highlight_fields": "string",
  "split_join_tokens": "string",
  "pre_segmented_query": true,
  "preset": "string",
  "enable_overrides": true,
  "prioritize_exact_match": true,
  "max_candidates": 0,
  "prioritize_token_position": true,
  "exhaustive_search": true,
  "search_cutoff_ms": 0,
  "use_cache": true,
  "cache_ttl": 0,
  "min_len_1typo": 0,
  "min_len_2typo": 0,
  "vector_query": "string"
}

MultiSearch Result

results
required
Array of objects (SearchResult)
Array
facet_counts
Array of objects (FacetCounts)
found
integer

The number of documents found

search_time_ms
integer

The number of milliseconds the search took

out_of
integer

The total number of documents in the collection

search_cutoff
boolean

Whether the search was cut off

page
integer

The search result page number

grouped_hits
Array of objects (SearchGroupedHit)
hits
Array of objects (SearchResultHit)

The documents that matched the search query

request_params
object
{
  "results": [ ]
}

Search Result Hit

Array of objects (SearchHighlight)

(Deprecated) Contains highlighted portions of the search fields

object

Highlighted version of the matching document

object

Can be any key-value pair

text_match
integer <int64>
object

Can be any key-value pair

vector_distance
number <float>

Distance between the query vector and matching document's vector value

{
  "highlights": { },
  "document": { },
  "text_match": 1234556
}
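
A short sketch of reading hits from a parsed search result: it orders the hits by text_match, highest first, and returns the matching documents. The result dictionary is hand-written for illustration, and top_documents is a hypothetical helper.

```python
def top_documents(result, limit=3):
    """Return the documents of the best-scoring hits in a search result."""
    hits = sorted(
        result.get("hits", []),
        key=lambda hit: hit.get("text_match", 0),
        reverse=True,
    )
    return [hit["document"] for hit in hits[:limit]]


# Hand-written sample data, not actual server output.
result = {
    "found": 3,
    "hits": [
        {"document": {"id": "a"}, "text_match": 10},
        {"document": {"id": "b"}, "text_match": 30},
        {"document": {"id": "c"}, "text_match": 20},
    ],
}
print(top_documents(result, limit=2))  # [{'id': 'b'}, {'id': 'c'}]
```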