Splore API (0.24.0)


Ingestion and search API for the Splore search platform.

Search Documents

Search documents with a query and filters

Search for documents in a collection

Search for documents in a collection that match the search criteria.

Authorizations:

None

path Parameters

collectionName

required

string

The name of the collection to search in

query Parameters

required

object (SearchParameters)

Responses

Response samples

Content type

application/json

{"facet_counts": [{"counts": [{"count": 0,
"highlighted": "string",
"value": "string"
}
],
"field_name": "string",
"stats": {"max": 0,
"min": 0,
"sum": 0,
"total_values": 0,
"avg": 0
}
}
],
"found": 0,
"search_time_ms": 0,
"out_of": 0,
"search_cutoff": true,
"page": 0,
"grouped_hits": [{"group_key": [{ }
],
"hits": [{"highlights": {"company_name": {"field": "company_name",
"snippet": "<mark>Stark</mark> Industries"
}
},
"document": {"id": "124",
"company_name": "Stark Industries",
"num_employees": 5215,
"country": "USA"
},
"text_match": 1234556
}
]
}
],
"hits": [{"highlights": {"company_name": {"field": "company_name",
"snippet": "<mark>Stark</mark> Industries"
}
},
"document": {"id": "124",
"company_name": "Stark Industries",
"num_employees": 5215,
"country": "USA"
},
"text_match": 1234556
}
],
"request_params": {"collection_name": "string",
"q": "string",
"per_page": 0
}
}
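The response sample above can be consumed as in the following minimal sketch. It assumes the shape shown in the sample, where `highlights` is an object keyed by field name. The `summarize_hits` helper is hypothetical, and the `found` and `page` values are set to 1 for illustration only.

```python
import json

# Trimmed copy of the response sample above; "found" and "page" are
# illustrative values, not output from a real Splore server.
sample = json.loads("""
{
  "found": 1,
  "page": 1,
  "hits": [
    {
      "highlights": {
        "company_name": {
          "field": "company_name",
          "snippet": "<mark>Stark</mark> Industries"
        }
      },
      "document": {
        "id": "124",
        "company_name": "Stark Industries",
        "num_employees": 5215,
        "country": "USA"
      },
      "text_match": 1234556
    }
  ]
}
""")


def summarize_hits(response):
    """Return (document id, text_match, {field: snippet}) for every hit."""
    rows = []
    for hit in response.get("hits", []):
        # Assumes "highlights" is keyed by field name, as in the sample.
        snippets = {
            name: info.get("snippet", "")
            for name, info in hit.get("highlights", {}).items()
        }
        rows.append((hit["document"]["id"], hit["text_match"], snippets))
    return rows


rows = summarize_hits(sample)
print(rows)
```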

Send multiple search requests in a single HTTP request

This is especially useful for avoiding the round-trip network latency incurred when each search is sent as a separate HTTP request. You can also use this feature to run a federated search across multiple collections in a single HTTP request.

Authorizations:

None

query Parameters

required

object (MultiSearchParameters)

Parameters for the multi search API.

Request Body schema: application/json

required

Array of objects (MultiSearchCollectionParameters)

Array

q	string The query text to search for in the collection. Use * as the search string to return all documents. This is typically useful when used in conjunction with filter_by.
query_by	string A list of `string` fields that should be queried against. Multiple fields are separated with a comma.
query_by_weights	string The relative weight to give each `query_by` field when ranking results. This can be used to boost fields in priority, when looking for matches. Multiple fields are separated with a comma.
prefix	string Boolean field to indicate that the last word in the query should be treated as a prefix, and not as a whole word. This is used for building autocomplete and instant search interfaces. Defaults to true.
infix	string If the infix index is enabled for a field, infix search can be controlled on a per-field basis by passing a comma-separated string in the `infix` parameter. Each entry takes one of 3 values: `off` (infix search is disabled; this is the default), `always` (infix search is performed along with regular search), or `fallback` (infix search is performed only if regular search does not produce results).
max_extra_prefix	integer Together with `max_extra_suffix`, controls the extent of infix searching: the maximum number of symbols that can be present before the query in a matching token. For example, the query "K2100" has 2 extra symbols before it in "6PK2100". By default, any number of extra prefix symbols can be present for a match.
max_extra_suffix	integer Together with `max_extra_prefix`, controls the extent of infix searching: the maximum number of symbols that can be present after the query in a matching token. For example, the query "K2100" has 2 extra symbols before it and none after it in "6PK2100". By default, any number of extra suffix symbols can be present for a match.
filter_by	string Filter conditions for refining your search results. Separate multiple conditions with `&&`.
sort_by	string A list of numerical fields and their corresponding sort orders that will be used for ordering your results. Up to 3 sort fields can be specified. The text similarity score is exposed as a special `_text_match` field that you can use in the list of sorting fields. If no `sort_by` parameter is specified, results are sorted by `_text_match:desc,default_sorting_field:desc`.
facet_by	string A list of fields that will be used for faceting your results on. Separate multiple fields with a comma.
max_facet_values	integer Maximum number of facet values to be returned.
facet_query	string Facet values that are returned can now be filtered via this parameter. The matching facet text is also highlighted. For example, when faceting by `category`, you can set `facet_query=category:shoe` to return only facet values that contain the prefix "shoe".
num_typos	string The number of typographical errors (1 or 2) that will be tolerated. Default: 2
page	integer The page number of the results to fetch.
per_page	integer Number of results to fetch per page. Default: 10
group_by	string You can aggregate search results into groups or buckets by specifying one or more `group_by` fields. Separate multiple fields with a comma. To group on a particular field, it must be a faceted field.
group_limit	integer Maximum number of hits to be returned for every group. If the `group_limit` is set as `K` then only the top K hits in each group are returned in the response. Default: 3
include_fields	string List of fields from the document to include in the search result
exclude_fields	string List of fields from the document to exclude in the search result
highlight_full_fields	string List of fields which should be highlighted fully without snippeting
highlight_affix_num_tokens	integer The number of tokens that should surround the highlighted text on each side. Default: 4
highlight_start_tag	string The start tag used for the highlighted snippets. Default: `<mark>`
highlight_end_tag	string The end tag used for the highlighted snippets. Default: `</mark>`
snippet_threshold	integer Field values under this length will be fully highlighted, instead of showing a snippet of relevant portion. Default: 30
drop_tokens_threshold	integer If the number of results found for a specific query is less than this number, Splore will attempt to drop the tokens in the query until enough results are found. Tokens that have the least individual hits are dropped first. Set to 0 to disable. Default: 10
typo_tokens_threshold	integer If the number of results found for a specific query is less than this number, Splore will attempt to look for tokens with more typos until enough results are found. Default: 100
pinned_hits	string A comma-separated list of `record_id:hit_position` pairs naming records to unconditionally include in the search results at specific positions, typically to feature or promote certain items at the top of the results. For example, to include the record with ID 123 at position 1 and the record with ID 456 at position 5, specify `123:1,456:5`. You can also use the Overrides feature to override search results based on rules. Overrides are applied first, followed by `pinned_hits` and finally `hidden_hits`.
hidden_hits	string A comma-separated list of `record_id`s to unconditionally hide from the search results. For example, to hide the records with IDs 123 and 456, specify `123,456`. You can also use the Overrides feature to override search results based on rules. Overrides are applied first, followed by `pinned_hits` and finally `hidden_hits`.
highlight_fields	string A list of custom fields that must be highlighted even if you don't query for them
pre_segmented_query	boolean You can index content from any logographic language into Splore if you are able to segment (split) the text into space-separated words yourself before indexing and querying. Set this parameter to true to indicate that the query is already segmented.
preset	string Apply a saved set of search parameters by setting this parameter to the name of an existing preset.
enable_overrides	boolean If you have overrides defined but want to disable all of them at query time, set this parameter to false.
prioritize_exact_match	boolean Set this parameter to true to ensure that an exact match is ranked above the others
exhaustive_search	boolean Setting this to true will make Splore consider all prefixes and typo corrections of the words in the query without stopping early when enough results are found (drop_tokens_threshold and typo_tokens_threshold configurations are ignored).
search_cutoff_ms	integer Splore will attempt to return results early if the cutoff time has elapsed. This is not a strict guarantee and facet computation is not bound by this parameter.
use_cache	boolean Enable server side caching of search query results. By default, caching is disabled.
cache_ttl	integer The duration (in seconds) that determines how long the search query is cached. This value can be set on a per-query basis. Default: 60.
min_len_1typo	integer Minimum word length for 1-typo correction to be applied. The value of num_typos is still treated as the maximum allowed typos.
min_len_2typo	integer Minimum word length for 2-typo correction to be applied. The value of num_typos is still treated as the maximum allowed typos.
vector_query	string Vector query expression for fetching documents "closest" to a given query/document vector.
collection required	string The collection to search in.

Responses

Request samples

Payload

Content type

application/json

{"searches": [{"q": "string",
"query_by": "string",
"query_by_weights": "string",
"prefix": "string",
"infix": "string",
"max_extra_prefix": 0,
"max_extra_suffix": 0,
"filter_by": "num_employees:>100 && country: [USA, UK]",
"sort_by": "string",
"facet_by": "string",
"max_facet_values": 0,
"facet_query": "string",
"num_typos": "string",
"page": 0,
"per_page": 0,
"group_by": "string",
"group_limit": 0,
"include_fields": "string",
"exclude_fields": "string",
"highlight_full_fields": "string",
"highlight_affix_num_tokens": 0,
"highlight_start_tag": "string",
"highlight_end_tag": "string",
"snippet_threshold": 0,
"drop_tokens_threshold": 0,
"typo_tokens_threshold": 0,
"pinned_hits": "string",
"hidden_hits": "string",
"highlight_fields": "string",
"pre_segmented_query": true,
"preset": "string",
"enable_overrides": true,
"prioritize_exact_match": true,
"exhaustive_search": true,
"search_cutoff_ms": 0,
"use_cache": true,
"cache_ttl": 0,
"min_len_1typo": 0,
"min_len_2typo": 0,
"vector_query": "string",
"collection": "string"
}
]
}
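A request body like the sample above can be assembled programmatically. The sketch below is one possible way to do it, not an official client: the `build_multi_search` helper and the collection names `companies` and `products` are hypothetical. It only enforces the one rule the table states, that each search must name its `collection`.

```python
import json


def build_multi_search(searches):
    """Wrap per-collection search dicts in the {"searches": [...]} envelope."""
    searches = list(searches)
    for search in searches:
        # "collection" is the only field marked required in the table above.
        if "collection" not in search:
            raise ValueError("each search needs a 'collection' field")
    return json.dumps({"searches": searches})


# Federated search: the same query against two (hypothetical) collections.
payload = build_multi_search([
    {
        "collection": "companies",
        "q": "stark",
        "query_by": "company_name",
        "filter_by": "num_employees:>100 && country: [USA, UK]",
    },
    {"collection": "products", "q": "stark", "query_by": "name"},
])
print(payload)
```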

Response samples

Content type

application/json

{"results": [{"facet_counts": [{"counts": [{"count": 0,
"highlighted": "string",
"value": "string"
}
],
"field_name": "string",
"stats": {"max": 0,
"min": 0,
"sum": 0,
"total_values": 0,
"avg": 0
}
}
],
"found": 0,
"search_time_ms": 0,
"out_of": 0,
"search_cutoff": true,
"page": 0,
"grouped_hits": [{"group_key": [{ }
],
"hits": [{"highlights": {"company_name": {"field": "company_name",
"snippet": "<mark>Stark</mark> Industries"
}
},
"document": {"id": "124",
"company_name": "Stark Industries",
"num_employees": 5215,
"country": "USA"
},
"text_match": 1234556
}
]
}
],
"hits": [{"highlights": {"company_name": {"field": "company_name",
"snippet": "<mark>Stark</mark> Industries"
}
},
"document": {"id": "124",
"company_name": "Stark Industries",
"num_employees": 5215,
"country": "USA"
},
"text_match": 1234556
}
],
"request_params": {"collection_name": "string",
"q": "string",
"per_page": 0
}
}
]
}

Index Document(s)

Add document to search index

Index a document

A document to be indexed in a given collection must conform to the schema of the collection.

Authorizations:

None

path Parameters

collectionName

required

string

The name of the collection to add the document to

query Parameters

action

string

Value: "upsert"

Example: action=upsert

Additional action to perform

Request Body schema: application/json

The document object to be indexed

object

Can be any key-value pair

Responses

Request samples

Payload

Content type

application/json

{ }

Response samples

Content type

application/json

{ }
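As a rough illustration of the `action=upsert` query parameter, the sketch below prepares an indexing request without sending it. The base URL, the path layout, and the POST method are assumptions made for this example; this reference does not state them. `build_index_request` is a hypothetical helper.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request

# ASSUMPTION: placeholder host; replace with your own deployment.
BASE_URL = "https://splore.example.com"


def build_index_request(collection_name, document, upsert=False):
    """Prepare (but do not send) a request that indexes one document."""
    # ASSUMPTION: the path layout below is illustrative only.
    url = f"{BASE_URL}/collections/{quote(collection_name, safe='')}/documents"
    if upsert:
        url += "?" + urlencode({"action": "upsert"})
    return Request(
        url,
        data=json.dumps(document).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",  # ASSUMPTION: the HTTP method is not given above.
    )


req = build_index_request(
    "companies",
    {"id": "124", "company_name": "Stark Industries",
     "num_employees": 5215, "country": "USA"},
    upsert=True,
)
print(req.get_method(), req.full_url)
```

The document must still conform to the schema of the target collection; this sketch does not validate it.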

Import documents into a collection

The documents to be imported must be formatted as newline-delimited JSON (JSONL), with one document per line. You can feed the output file of a Splore export operation directly into an import.

Authorizations:

None

path Parameters

collectionName

required

string

The name of the collection

query Parameters

object

Request Body schema: application/octet-stream

The JSON array of documents or the JSONL file to import

string

The JSONL file to import
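The newline-delimited format described above can be produced with the standard library. This is a minimal sketch: `to_jsonl` is a hypothetical helper, and the second document is invented to show that newlines inside values do not break the one-document-per-line rule.

```python
import json


def to_jsonl(documents):
    """Serialize documents as newline-delimited JSON, one object per line."""
    # With the default indent=None, json.dumps escapes newlines inside
    # strings, so each document stays on a single line.
    return "\n".join(json.dumps(doc) for doc in documents)


body = to_jsonl([
    {"id": "124", "company_name": "Stark Industries",
     "num_employees": 5215, "country": "USA"},
    # Illustrative document with an embedded newline in a value.
    {"id": "125", "company_name": "Acme\nCorp",
     "num_employees": 42, "country": "UK"},
]).encode("utf-8")

print(body.decode("utf-8"))
```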

Responses

Response samples

Content type

application/json

{"message": "string"
}

Search Parameters

q required	string The query text to search for in the collection. Use * as the search string to return all documents. This is typically useful when used in conjunction with filter_by.
query_by required	string A list of `string` fields that should be queried against. Multiple fields are separated with a comma.
query_by_weights	string The relative weight to give each `query_by` field when ranking results. This can be used to boost fields in priority, when looking for matches. Multiple fields are separated with a comma.
prefix	string Boolean field to indicate that the last word in the query should be treated as a prefix, and not as a whole word. This is used for building autocomplete and instant search interfaces. Defaults to true.
infix	string If the infix index is enabled for a field, infix search can be controlled on a per-field basis by passing a comma-separated string in the `infix` parameter. Each entry takes one of 3 values: `off` (infix search is disabled; this is the default), `always` (infix search is performed along with regular search), or `fallback` (infix search is performed only if regular search does not produce results).
max_extra_prefix	integer Together with `max_extra_suffix`, controls the extent of infix searching: the maximum number of symbols that can be present before the query in a matching token. For example, the query "K2100" has 2 extra symbols before it in "6PK2100". By default, any number of extra prefix symbols can be present for a match.
max_extra_suffix	integer Together with `max_extra_prefix`, controls the extent of infix searching: the maximum number of symbols that can be present after the query in a matching token. For example, the query "K2100" has 2 extra symbols before it and none after it in "6PK2100". By default, any number of extra suffix symbols can be present for a match.
filter_by	string Filter conditions for refining your search results. Separate multiple conditions with `&&`.
sort_by	string A list of numerical fields and their corresponding sort orders that will be used for ordering your results. Up to 3 sort fields can be specified. The text similarity score is exposed as a special `_text_match` field that you can use in the list of sorting fields. If no `sort_by` parameter is specified, results are sorted by `_text_match:desc,default_sorting_field:desc`.
facet_by	string A list of fields that will be used for faceting your results on. Separate multiple fields with a comma.
max_facet_values	integer Maximum number of facet values to be returned.
facet_query	string Facet values that are returned can now be filtered via this parameter. The matching facet text is also highlighted. For example, when faceting by `category`, you can set `facet_query=category:shoe` to return only facet values that contain the prefix "shoe".
num_typos	string The number of typographical errors (1 or 2) that will be tolerated. Default: 2
page	integer The page number of the results to fetch.
per_page	integer Number of results to fetch per page. Default: 10
group_by	string You can aggregate search results into groups or buckets by specifying one or more `group_by` fields. Separate multiple fields with a comma. To group on a particular field, it must be a faceted field.
group_limit	integer Maximum number of hits to be returned for every group. If the `group_limit` is set as `K` then only the top K hits in each group are returned in the response. Default: 3
include_fields	string List of fields from the document to include in the search result
exclude_fields	string List of fields from the document to exclude in the search result
highlight_full_fields	string List of fields which should be highlighted fully without snippeting
highlight_affix_num_tokens	integer The number of tokens that should surround the highlighted text on each side. Default: 4
highlight_start_tag	string The start tag used for the highlighted snippets. Default: `<mark>`
highlight_end_tag	string The end tag used for the highlighted snippets. Default: `</mark>`
enable_highlight_v1	boolean Flag for enabling or disabling the deprecated, old highlight structure in the response. Default: true
snippet_threshold	integer Field values under this length will be fully highlighted, instead of showing a snippet of relevant portion. Default: 30
drop_tokens_threshold	integer If the number of results found for a specific query is less than this number, Splore will attempt to drop the tokens in the query until enough results are found. Tokens that have the least individual hits are dropped first. Set to 0 to disable. Default: 10
typo_tokens_threshold	integer If the number of results found for a specific query is less than this number, Splore will attempt to look for tokens with more typos until enough results are found. Default: 100
pinned_hits	string A comma-separated list of `record_id:hit_position` pairs naming records to unconditionally include in the search results at specific positions, typically to feature or promote certain items at the top of the results. For example, to include the record with ID 123 at position 1 and the record with ID 456 at position 5, specify `123:1,456:5`. You can also use the Overrides feature to override search results based on rules. Overrides are applied first, followed by `pinned_hits` and finally `hidden_hits`.
hidden_hits	string A comma-separated list of `record_id`s to unconditionally hide from the search results. For example, to hide the records with IDs 123 and 456, specify `123,456`. You can also use the Overrides feature to override search results based on rules. Overrides are applied first, followed by `pinned_hits` and finally `hidden_hits`.
highlight_fields	string A list of custom fields that must be highlighted even if you don't query for them
split_join_tokens	string Treat a space as a typo: search for `q=basket ball` if `q=basketball` is not found, or vice versa. Splitting or joining of tokens is attempted only if the original query produces no results. To always trigger this behavior, set the value to `always`. To disable it, set the value to `off`. Default: `fallback`.
pre_segmented_query	boolean You can index content from any logographic language into Splore if you are able to segment (split) the text into space-separated words yourself before indexing and querying. Set this parameter to true to indicate that the query is already segmented.
preset	string Apply a saved set of search parameters by setting this parameter to the name of an existing preset.
enable_overrides	boolean If you have overrides defined but want to disable all of them at query time, set this parameter to false.
prioritize_exact_match	boolean Set this parameter to true to ensure that an exact match is ranked above the others
max_candidates	integer Control the number of words that Splore considers for typo and prefix searching.
prioritize_token_position	boolean Make Splore prioritize documents where the query words appear earlier in the text.
exhaustive_search	boolean Setting this to true will make Splore consider all prefixes and typo corrections of the words in the query without stopping early when enough results are found (drop_tokens_threshold and typo_tokens_threshold configurations are ignored).
search_cutoff_ms	integer Splore will attempt to return results early if the cutoff time has elapsed. This is not a strict guarantee and facet computation is not bound by this parameter.
use_cache	boolean Enable server side caching of search query results. By default, caching is disabled.
cache_ttl	integer The duration (in seconds) that determines how long the search query is cached. This value can be set on a per-query basis. Default: 60.
min_len_1typo	integer Minimum word length for 1-typo correction to be applied. The value of num_typos is still treated as the maximum allowed typos.
min_len_2typo	integer Minimum word length for 2-typo correction to be applied. The value of num_typos is still treated as the maximum allowed typos.
vector_query	string Vector query expression for fetching documents "closest" to a given query/document vector.

{"q": "string",
"query_by": "string",
"query_by_weights": "string",
"prefix": "string",
"infix": "string",
"max_extra_prefix": 0,
"max_extra_suffix": 0,
"filter_by": "num_employees:>100 && country: [USA, UK]",
"sort_by": "num_employees:desc",
"facet_by": "string",
"max_facet_values": 0,
"facet_query": "string",
"num_typos": "string",
"page": 0,
"per_page": 0,
"group_by": "string",
"group_limit": 0,
"include_fields": "string",
"exclude_fields": "string",
"highlight_full_fields": "string",
"highlight_affix_num_tokens": 0,
"highlight_start_tag": "string",
"highlight_end_tag": "string",
"enable_highlight_v1": true,
"snippet_threshold": 0,
"drop_tokens_threshold": 0,
"typo_tokens_threshold": 0,
"pinned_hits": "string",
"hidden_hits": "string",
"highlight_fields": "string",
"split_join_tokens": "string",
"pre_segmented_query": true,
"preset": "string",
"enable_overrides": true,
"prioritize_exact_match": true,
"max_candidates": 0,
"prioritize_token_position": true,
"exhaustive_search": true,
"search_cutoff_ms": 0,
"use_cache": true,
"cache_ttl": 0,
"min_len_1typo": 0,
"min_len_2typo": 0,
"vector_query": "string"
}
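Several of the parameters above are lists packed into a single string: comma-separated fields, `&&`-joined filters, and `record_id:hit_position` pairs. The sketch below shows one way to assemble them. `build_search_params` is a hypothetical helper, and sending the result as a flat query string is an assumption about how the `SearchParameters` object is serialized.

```python
from urllib.parse import urlencode


def build_search_params(q, query_by, filters=(), sort_by=(), pinned=None, hidden=()):
    """Assemble the string-encoded search parameters described above."""
    if len(sort_by) > 3:
        raise ValueError("up to 3 sort fields can be specified")
    params = {"q": q, "query_by": ",".join(query_by)}
    if filters:
        # Multiple filter conditions are separated with &&.
        params["filter_by"] = " && ".join(filters)
    if sort_by:
        params["sort_by"] = ",".join(f"{field}:{order}" for field, order in sort_by)
    if pinned:
        # record_id:hit_position pairs, e.g. 123:1,456:5
        params["pinned_hits"] = ",".join(f"{rid}:{pos}" for rid, pos in pinned.items())
    if hidden:
        params["hidden_hits"] = ",".join(hidden)
    return params


params = build_search_params(
    q="stark",
    query_by=["company_name"],
    filters=["num_employees:>100", "country: [USA, UK]"],
    sort_by=[("num_employees", "desc")],
    pinned={"123": 1, "456": 5},
)
# ASSUMPTION: parameters are sent as a flat, URL-encoded query string.
query_string = urlencode(params)
print(query_string)
```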

MultiSearch Result

required

Array of objects (SearchResult)

Array

facet_counts	Array of objects (FacetCounts)
found	integer The number of documents found
search_time_ms	integer The number of milliseconds the search took
out_of	integer The total number of documents in the collection
search_cutoff	boolean Whether the search was cut off
page	integer The search result page number
grouped_hits	Array of objects (SearchGroupedHit)
hits	Array of objects (SearchResultHit) The documents that matched the search query
request_params	object

{"results": [{"facet_counts": [{"counts": [{"count": 0,
"highlighted": "string",
"value": "string"
}
],
"field_name": "string",
"stats": {"max": 0,
"min": 0,
"sum": 0,
"total_values": 0,
"avg": 0
}
}
],
"found": 0,
"search_time_ms": 0,
"out_of": 0,
"search_cutoff": true,
"page": 0,
"grouped_hits": [{"group_key": [{ }
],
"hits": [{"highlights": {"company_name": {"field": "company_name",
"snippet": "<mark>Stark</mark> Industries"
}
},
"document": {"id": "124",
"company_name": "Stark Industries",
"num_employees": 5215,
"country": "USA"
},
"text_match": 1234556
}
]
}
],
"hits": [{"highlights": {"company_name": {"field": "company_name",
"snippet": "<mark>Stark</mark> Industries"
}
},
"document": {"id": "124",
"company_name": "Stark Industries",
"num_employees": 5215,
"country": "USA"
},
"text_match": 1234556
}
],
"request_params": {"collection_name": "string",
"q": "string",
"per_page": 0
}
}
]
}
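The `found` count and the `per_page` value echoed in `request_params` are enough to work out how many pages each search in a multi-search result spans. The following sketch uses invented counts and hypothetical collection names; `total_pages` is not part of the API.

```python
import math


def total_pages(result):
    """Number of pages needed to show every found document."""
    per_page = result["request_params"]["per_page"]
    if per_page <= 0:
        return 0
    return math.ceil(result["found"] / per_page)


# Illustrative multi-search result with made-up counts.
multi = {
    "results": [
        {"found": 23, "page": 1,
         "request_params": {"collection_name": "companies", "q": "stark", "per_page": 10}},
        {"found": 0, "page": 1,
         "request_params": {"collection_name": "products", "q": "stark", "per_page": 10}},
    ]
}

pages = {
    result["request_params"]["collection_name"]: total_pages(result)
    for result in multi["results"]
}
print(pages)
```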

Search Result Hit

highlights	Array of objects (SearchHighlight) (Deprecated) Contains highlighted portions of the search fields
	object Highlighted version of the matching document
document	object Can be any key-value pair
text_match	integer <int64>
	object Can be any key-value pair
vector_distance	number <float> Distance between the query vector and matching document's vector value

{"highlights": {"company_name": {"field": "company_name",
"snippet": "<mark>Stark</mark> Industries"
}
},
"document": {"id": "124",
"company_name": "Stark Industries",
"num_employees": 5215,
"country": "USA"
},
"text_match": 1234556
}