Types in Makaira
Types in Makaira matter if you want to use your own fields as a basis for searching and filtering. For this to work, the data has to be delivered with the correct type. Type suffixes ensure this: they give hints for the conversion, which is also called casting or mapping.
For example, if you want to be able to set filters with a slider, you should choose the numeric type float or int.
String handling is a bit more complex, because Makaira offers multiple string types, such as short fields, long fields, and fields with proper names. Only the right choice of type will deliver the desired result.
The table further down this page lists all types with a short explanation of the use case.
Custom String Field Suffixes in Makaira
This guide defines the meaning and behavior of suffixes used for string fields in Makaira mappings. Each suffix specifies how a field should be analyzed at index and search time, optimized for different content types such as product titles, descriptions, names, or IDs.
_str_short – Short, Structured Text with High Precision
- Purpose: For short, structured fields like product names or compact attributes.
- Behavior:
  - Index time: Performs compound decomposition, lowercasing, normalization (e.g. umlauts), and stemming.
  - Search time: Applies a lighter analysis without decomposition. This makes searches more exact and controlled.
- Strength: Optimized for precision; ideal when accurate matching is more important than broad recall.
- Use Case: Product titles, short attribute values.
- Example
{ "productName_str_short": "Kaffeemaschine mit Milchaufschäumer" }
_str_long – Descriptive Text for Full-Text Search
- Purpose: For longer free-text fields like product descriptions or marketing text.
- Behavior: Handles stopwords (e.g. "und", "mit"), lowercasing, normalization, stemming.
- Strength: Good for rich text search with long inputs and detailed queries.
- Use Case: Product details, specifications, descriptions.
- Example
{ "description_str_long": "Diese Kaffeemaschine hat WLAN und eine 4K-Anzeige." }
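The stopword handling and lowercasing described for this suffix can be illustrated with a minimal Python sketch. It is not Makaira's analyzer: the function `analyze_long` and the set `STOPWORDS` are hypothetical names, the stopword list contains only the two words mentioned above, and normalization and stemming are left out.

```python
import re

# Hypothetical stopword set; only "und" and "mit" are named in the text above.
STOPWORDS = {"und", "mit"}

def analyze_long(text: str) -> list[str]:
    """Lowercase the text, split it into word tokens, and drop stopwords."""
    tokens = re.findall(r"\w+", text.lower())
    return [token for token in tokens if token not in STOPWORDS]

tokens = analyze_long("Diese Kaffeemaschine hat WLAN und eine 4K-Anzeige.")
print(tokens)
# ['diese', 'kaffeemaschine', 'hat', 'wlan', 'eine', '4k', 'anzeige']
```

A real stopword list would be considerably longer, so words such as "diese" or "eine" would likely be dropped as well.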
_str_short_key – Proper Names, Unaltered Structure
- Purpose: For structured names or keys such as brands, manufacturers, or model series.
- Behavior: Normalized (e.g. lowercased, folded), but no stemming or decomposition. Original structure is preserved.
- Strength: Maintains integrity of case-sensitive or compound names. Ideal for fields where structure must be retained.
- Use Case: Brand names, model families, collections.
- Example
{ "manufacturer_str_short_key": "Bosch Professional" }
_str_keyword – Exact Matching Fields
- Purpose: For fields requiring exact matching, such as SKUs, part numbers, or internal codes.
- Behavior: No tokenization. Entire value is kept as a single token, lowercased and trimmed.
- Strength: Ensures exact-match behavior, not affected by token splitting or stemming.
- Use Case: Product identifiers, SKUs, internal references.
- Example
{ "sku_str_keyword": "ABC-12345-X" }
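The single-token behavior can be sketched in a few lines of Python. The helpers `analyze_keyword` and `exact_match` are hypothetical and only mirror the description above (no tokenization, lowercased, trimmed); they are not part of any Makaira API.

```python
def analyze_keyword(value: str) -> str:
    """Keep the entire value as a single token, trimmed and lowercased."""
    return value.strip().lower()

def exact_match(indexed_value: str, query: str) -> bool:
    """A query hits only if its single token equals the indexed token."""
    return analyze_keyword(indexed_value) == analyze_keyword(query)

indexed = "ABC-12345-X"
print(exact_match(indexed, " abc-12345-x "))  # True: same token after trimming and lowercasing
print(exact_match(indexed, "ABC-12345"))      # False: a partial value is a different token
```

Because the value is never split, a partial SKU does not match under this model.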
_split_number_string – Alphanumeric Split for Model
- Purpose: Designed for fields that mix letters and numbers (e.g., serials, model codes).
- Behavior: Splits on letter/number boundaries (e.g., "X1A45Z" → "X", "1", "A", "45", "Z"), applies normalization and stemming.
- Strength: Improves matchability of complex alphanumerics by enabling partial component search.
- Use Case: Serial numbers, model numbers, version strings.
- Example
{ "model_split_number_string": "X1A45Z" }
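The split on letter/number boundaries can be approximated with a regular expression. This is a sketch of the splitting step only; the function name `split_letters_digits` is hypothetical, and the normalization and stemming that Makaira applies afterwards are omitted.

```python
import re

def split_letters_digits(value: str) -> list[str]:
    """Split a string into alternating runs of letters and runs of digits."""
    # [^\W\d_]+ matches one or more letters (including umlauts), \d+ matches digits.
    return re.findall(r"[^\W\d_]+|\d+", value)

parts = split_letters_digits("X1A45Z")
print(parts)  # ['X', '1', 'A', '45', 'Z']
```

With tokens like these in the index, a query for a single component such as "45" can match the full model code.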
_str_decomposed – High Recall for Search Fields
- Purpose: For search-relevant fields where broad matching is important.
- Behavior: Performs full compound decomposition and stemming at both index and search time, ensuring flexible matching.
- Strength: Optimized for recall; increases the likelihood that various user inputs will hit relevant content.
- Use Case: Search-focused content fields (e.g., search index, query expansions).
- Example
{ "searchable_str_decomposed": "Milchaufschäumer für Espressomaschinen" }
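To make the effect of compound decomposition on both sides more concrete, here is a toy Python sketch. It assumes a greedy, dictionary-based decompounder with a tiny hypothetical vocabulary (`VOCABULARY`); Makaira's actual decomposition and stemming are not documented here and are likely more sophisticated.

```python
# Hypothetical mini-vocabulary; a real decompounder relies on a full dictionary.
VOCABULARY = {"milch", "aufschäumer", "espresso", "maschinen", "kaffee", "maschine"}

def decompose(word: str) -> list[str]:
    """Split a compound into known sub-words, or return it unchanged."""
    word = word.lower()
    parts = []
    start = 0
    while start < len(word):
        # Try the longest known sub-word that starts at this position.
        for end in range(len(word), start, -1):
            if word[start:end] in VOCABULARY:
                parts.append(word[start:end])
                start = end
                break
        else:
            return [word]  # no complete decomposition found
    return parts

def analyze_decomposed(text: str) -> list[str]:
    """Apply the same decomposition to every whitespace-separated word."""
    tokens = []
    for word in text.split():
        tokens.extend(decompose(word))
    return tokens

index_tokens = analyze_decomposed("Milchaufschäumer für Espressomaschinen")
query_tokens = analyze_decomposed("Espresso")
print(index_tokens)  # ['milch', 'aufschäumer', 'für', 'espresso', 'maschinen']
print(set(query_tokens) & set(index_tokens))  # {'espresso'}
```

Because the same analysis runs on the indexed text and on the query, a search for a single sub-word can hit a document that only contains the compound.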
Datatype | Type description | Suffix | Description |
---|---|---|---|
date | Fields with date formatting | _date | Used for values that contain a date. |
float | Fields with floating-point numbers | _float | Represents numbers with decimal places. |
int | Fields with integer values | _int | Represents whole numbers that can be positive, negative, or zero, but not fractions. |
bool | Fields with a boolean value | _bool | Used when one of two possible values needs to be chosen, for example true or false. |
str | Fields with text content | _str_short | Short description of a product. |
str | Fields with text content | _str_long | Long description of a product. |
str | Fields with text content | _str_short_key | Proper names such as a manufacturer name. |
str | Fields with text content | _str_keyword | Useful for special applications where case-sensitive proper names are important. |
str | Fields with text content | _split_number_string | Makes sure that new tokens are generated if a word contains both numbers and letters. |
str | Fields with text content | _str_decomposed | Decomposes the search term and the indexed document into sub-words. |
data storage only | Deactivate typing | _data_only | Useful for large objects that are not to be searched and are, for example, only used for output in the frontend. |
{
  ...
  // type: *_date
  "marketRelease_date": "2015-01-01T12:10:30Z",
  // type: *_float
  "specialRate_float": 1.258,
  // type: *_int
  "usageScore_int": 150,
  // type: *_bool
  "hasConfiguration_bool": true,
  // type: *_str_short
  "shortDesc_str_short": "A shorter text for the record",
  // type: *_str_long
  "longDesc_str_long": "A very long text can be written",
  // type: *_str_short_key
  "bundleManufacturer_str_short_key": "Bertrand",
  // type: *_str_keyword
  "bundle_str_keyword": "Probierpaket für Einsteiger",
  // type: *_data_only
  "bundle_data_only": {
    "bundleId": 49625,
    "options": {
      "101": "Glas",
      "102": "PVC",
      "103": "Wood"
    }
  },
  ...
}
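Since the data has to be delivered with the correct type, it can help to check documents before sending them. The following Python sketch is one possible pre-flight check, not a Makaira feature: the function `suffix_matches` is hypothetical, the date format is assumed from the example value above, and accepting integers for `_float` fields is an assumption as well.

```python
from datetime import datetime

STRING_SUFFIXES = (
    "_str_short", "_str_long", "_str_short_key",
    "_str_keyword", "_split_number_string", "_str_decomposed",
)

def suffix_matches(field: str, value) -> bool:
    """Check whether a value's Python type fits the field's type suffix."""
    if field.endswith("_data_only"):
        return True  # typing is deactivated, so any value is accepted
    if field.endswith("_bool"):
        return isinstance(value, bool)
    if field.endswith("_int"):
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, int) and not isinstance(value, bool)
    if field.endswith("_float"):
        # Assumption: whole numbers are acceptable for float fields.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if field.endswith("_date"):
        if not isinstance(value, str):
            return False
        try:
            # Assumed format, taken from the example "2015-01-01T12:10:30Z".
            datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
            return True
        except ValueError:
            return False
    if field.endswith(STRING_SUFFIXES):
        return isinstance(value, str)
    return True  # fields without a known suffix are not checked

document = {
    "marketRelease_date": "2015-01-01T12:10:30Z",
    "specialRate_float": 1.258,
    "usageScore_int": "150",  # wrong on purpose: a string instead of an integer
    "hasConfiguration_bool": True,
}
invalid = [name for name, value in document.items() if not suffix_matches(name, value)]
print(invalid)  # ['usageScore_int']
```

A check like this catches values such as a quoted number in an `_int` field before they reach the index.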
Please be aware that some functionality will not work if you do not use attributes, attributeStr, attributeInt, attributeFloat (see benefits of attributes ... )