JSDoc: Class: TokenSet

new TokenSet()

A token set is used to store the unique list of all tokens within an index. Token sets are also used to represent an incoming query to the index. The query token set and the index token set are then intersected to find which tokens to look up in the inverted index.

A token set can hold multiple tokens, as in the case of the index token set, or it can hold a single token as in the case of a simple query token set.

Additionally, token sets are used to perform wildcard matching. Leading, contained and trailing wildcards are supported, and edit distance matching is built on top of this wildcard support.

Token sets are implemented as a minimal finite state automaton, where both common prefixes and suffixes are shared between tokens. This helps to reduce the space used for storing the token set.
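The space saving from sharing suffixes as well as prefixes can be illustrated with a small sketch. It is not lunr's implementation: `buildTrie`, `minimise` and `countNodes` are hypothetical helpers, and the node shape (`final` plus an `edges` object) is an assumption made for the example. A plain trie shares only prefixes; merging structurally identical subtrees then shares suffixes too.

```javascript
// Build a plain trie: only common prefixes are shared.
function buildTrie(words) {
  const root = { final: false, edges: {} };
  for (const word of words) {
    let node = root;
    for (const ch of word) {
      if (!node.edges[ch]) node.edges[ch] = { final: false, edges: {} };
      node = node.edges[ch];
    }
    node.final = true;
  }
  return root;
}

// Merge structurally identical subtrees bottom-up so that common
// suffixes are shared as well. Returns the canonical node for `node`.
// `registry` maps a signature to its canonical node, `ids` gives each
// canonical node a number that parent signatures can refer to.
function minimise(node, registry, ids) {
  const labels = Object.keys(node.edges).sort();
  const children = [];
  for (const label of labels) {
    node.edges[label] = minimise(node.edges[label], registry, ids);
    children.push([label, ids.get(node.edges[label])]);
  }
  const signature = JSON.stringify([node.final, children]);
  if (!registry.has(signature)) {
    registry.set(signature, node);
    ids.set(node, ids.size);
  }
  return registry.get(signature);
}

// Count the distinct nodes reachable from `root`.
function countNodes(root) {
  const seen = new Set();
  const stack = [root];
  while (stack.length > 0) {
    const node = stack.pop();
    if (seen.has(node)) continue;
    seen.add(node);
    for (const label of Object.keys(node.edges)) stack.push(node.edges[label]);
  }
  return seen.size;
}

const words = ["cat", "cats", "rat", "rats"];
const trie = buildTrie(words);
const trieSize = countNodes(trie); // measured before merging mutates the trie
const minimalSize = countNodes(minimise(trie, new Map(), new Map()));
console.log(trieSize, minimalSize); // 9 5
```

In this example the "at" and "ats" endings are stored once rather than once per starting letter, which is where the reduction from 9 nodes to 5 comes from.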

Source:

token_set.js, line 27

Methods

(static) fromArray(arr) → {lunr.TokenSet}

Creates a TokenSet instance from the given sorted array of words.

Parameters:

Name	Type	Description
`arr`	Array.<string>	A sorted array of strings to create the set from.

Source:

token_set.js, line 51

Throws:

Will throw an error if the input array is not sorted.

Returns:

Type: lunr.TokenSet
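Because fromArray throws on unsorted input, a caller may want to check or sort the words first. The following is a minimal sketch of such a check; `isSorted` is a hypothetical helper, and it assumes the expected order is JavaScript's default string comparison, which is also what `Array.prototype.sort` uses when called without a comparator.

```javascript
// Returns true when every word is >= the word before it under
// JavaScript's default string comparison.
function isSorted(words) {
  for (let i = 1; i < words.length; i++) {
    if (words[i] < words[i - 1]) return false;
  }
  return true;
}

const tokens = ["rat", "cat", "cats"];
// Copy before sorting so the caller's array is left untouched.
const sortedTokens = [...tokens].sort();
console.log(isSorted(tokens), isSorted(sortedTokens)); // false true
```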

(static) fromFuzzyString(str, editDistance) → {lunr.TokenSet}

Creates a token set representing a single string with a specified edit distance.

Insertions, deletions, substitutions and transpositions are each treated as an edit distance of 1.

Increasing the allowed edit distance will have a dramatic impact on the performance of both creating and intersecting these TokenSets. It is advised to keep the edit distance less than 3.
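To make the cost model concrete, the sketch below computes a distance in which insertions, deletions, substitutions and adjacent transpositions each count as 1 (the optimal string alignment variant). It only illustrates what "within edit distance 1" means for a pair of strings; it does not reflect how the token set itself is built, and `editDistance` is a hypothetical helper name.

```javascript
// Optimal string alignment distance between strings a and b.
function editDistance(a, b) {
  const rows = a.length + 1;
  const cols = b.length + 1;
  const d = [];
  for (let i = 0; i < rows; i++) {
    d.push(new Array(cols).fill(0));
    d[i][0] = i; // delete all of a's first i characters
  }
  for (let j = 0; j < cols; j++) d[0][j] = j; // insert b's first j characters

  for (let i = 1; i < rows; i++) {
    for (let j = 1; j < cols; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      d[i][j] = Math.min(
        d[i - 1][j] + 1,        // deletion
        d[i][j - 1] + 1,        // insertion
        d[i - 1][j - 1] + cost  // substitution, or a free match
      );
      // Two adjacent characters swapped counts as a single edit.
      if (i > 1 && j > 1 && a[i - 1] === b[j - 2] && a[i - 2] === b[j - 1]) {
        d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + 1);
      }
    }
  }
  return d[a.length][b.length];
}

console.log(editDistance("plant", "plants")); // 1 (insertion)
console.log(editDistance("plant", "plan"));   // 1 (deletion)
console.log(editDistance("plant", "plint"));  // 1 (substitution)
console.log(editDistance("plant", "palnt"));  // 1 (transposition)
```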

Parameters:

Name	Type	Description
`str`	string	The string to create the token set from.
`editDistance`	number	The allowed edit distance to match.

Source:

token_set.js, line 94

Returns:

Type: lunr.TokenSet

(static) fromString(str) → {lunr.TokenSet}

Creates a TokenSet from a string.

The string may contain one or more wildcard characters (*) that will allow wildcard matching when intersecting with another TokenSet.

Parameters:

Name	Type	Description
`str`	string	The string to create a TokenSet from.

Source:

token_set.js, line 230

Returns:

Type: lunr.TokenSet

intersect(b) → {lunr.TokenSet}

Returns a new TokenSet that is the intersection of this TokenSet and the passed TokenSet.

This intersection will take into account any wildcards contained within the TokenSet.
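The idea of intersecting a wildcard query with an index can be sketched as a walk over a trie that follows the query one character at a time. This is a simplified stand-in, not lunr's algorithm: `buildTrie` and `matchQuery` are hypothetical helpers, the index is a plain trie rather than a minimal automaton, the query is a plain string rather than a second token set, and the result is a set of strings.

```javascript
// Hypothetical stand-in for an index token set: a plain trie.
function buildTrie(words) {
  const root = { final: false, edges: {} };
  for (const word of words) {
    let node = root;
    for (const ch of word) {
      if (!node.edges[ch]) node.edges[ch] = { final: false, edges: {} };
      node = node.edges[ch];
    }
    node.final = true;
  }
  return root;
}

// Collect into `out` every indexed word that matches `query` from
// position `pos`, where "*" matches zero or more characters.
function matchQuery(node, query, pos, prefix, out) {
  if (pos === query.length) {
    if (node.final) out.add(prefix);
    return;
  }
  const ch = query[pos];
  if (ch === "*") {
    // The wildcard may match nothing and hand over to the next character...
    matchQuery(node, query, pos + 1, prefix, out);
    // ...or consume one more character and stay on the wildcard.
    for (const label of Object.keys(node.edges)) {
      matchQuery(node.edges[label], query, pos, prefix + label, out);
    }
  } else if (node.edges[ch]) {
    matchQuery(node.edges[ch], query, pos + 1, prefix + ch, out);
  }
}

const index = buildTrie(["bar", "foo", "foobar", "food"]);

const trailing = new Set();
matchQuery(index, "foo*", 0, "", trailing);
console.log([...trailing].sort()); // [ 'foo', 'foobar', 'food' ]

const leading = new Set();
matchQuery(index, "*ar", 0, "", leading);
console.log([...leading].sort()); // [ 'bar', 'foobar' ]
```

A `Set` is used for the results so that queries with more than one wildcard do not report the same word twice.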

Parameters:

Name	Type	Description
`b`	lunr.TokenSet	Another TokenSet to intersect with.

Source:

token_set.js, line 354

Returns:

Type: lunr.TokenSet

toArray() → {Array.<string>}

Converts this TokenSet into an array of strings contained within the TokenSet.

This is not intended to be used on a TokenSet that contains wildcards. In these cases the results are undefined and are likely to cause an infinite loop.
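The following sketch shows one way such a conversion could work, and why wildcards are a problem for it. It assumes a hypothetical node shape (`final` plus an `edges` object) and a hypothetical `toArraySketch` helper; it is not the library's code. The traversal visits every path from the root, so if a wildcard were represented as an edge that loops back to a node (an assumption about the representation), the walk would never run out of paths.

```javascript
// Enumerate every string stored in an acyclic graph of nodes.
function toArraySketch(root) {
  const words = [];
  const stack = [{ prefix: "", node: root }];
  while (stack.length > 0) {
    const { prefix, node } = stack.pop();
    if (node.final) words.push(prefix);
    // A cycle in the graph would keep pushing frames here forever.
    for (const label of Object.keys(node.edges)) {
      stack.push({ prefix: prefix + label, node: node.edges[label] });
    }
  }
  return words;
}

// "cat" and "cats" share one path; its last two nodes are final.
const s = { final: true, edges: {} };
const t = { final: true, edges: { s: s } };
const a = { final: false, edges: { t: t } };
const c = { final: false, edges: { a: a } };
const root = { final: false, edges: { c: c } };
console.log(toArraySketch(root)); // [ 'cat', 'cats' ]
```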

Source:

token_set.js, line 272

Returns:

Type: Array.<string>

toString() → {string}

Generates a string representation of a TokenSet.

This is intended to allow TokenSets to be used as keys in objects, largely to aid the construction and minimisation of a TokenSet. As such it is not designed to be a human friendly representation of the TokenSet.
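The sketch below illustrates the object-key use described above. It relies on a standard JavaScript behaviour: an object used as a property key is coerced to a string through its `toString()` method. The `Node` class and its string format are hypothetical and chosen for the example, so they should not be read as the format the library produces.

```javascript
class Node {
  constructor(final) {
    this.final = final;
    this.edges = {};
  }

  // Not human friendly on purpose: the only requirement here is that
  // structurally equal nodes produce equal strings.
  toString() {
    let str = this.final ? "1" : "0";
    for (const label of Object.keys(this.edges).sort()) {
      str += label + "(" + this.edges[label].toString() + ")";
    }
    return str;
  }
}

// Two separate but structurally identical nodes.
const first = new Node(false);
first.edges.s = new Node(true);
const second = new Node(false);
second.edges.s = new Node(true);

const registry = {};
registry[first] = first;          // key is first.toString()
const reused = registry[second];  // same key, so `first` is found again
console.log(first !== second, reused === first); // true true
```

During minimisation, a lookup like this lets an equivalent node that already exists be reused instead of storing a duplicate.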

Source:

token_set.js, line 317

Returns:

Type: string