groonga - An open-source fulltext search engine and column store.

8.11.6. query

8.11.6.1. Summary

query enables you to specify --match_columns and --query parameters as function arguments.

query is one of the groonga builtin functions, so you can specify multiple query functions in the --filter option.

This flexibility lets you control full text search behavior by combining multiple query functions.

query can be used only in the --filter parameter of select.

8.11.6.2. Syntax

query requires two arguments: match_column and query_string.

query(match_column, query_string)

8.11.6.3. Usage

Here are a schema definition and sample data to show usage.

Sample schema:

Execution example:

table_create Documents TABLE_NO_KEY
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Documents content COLUMN_SCALAR Text
# [[0, 1337566253.89858, 0.000355720520019531], true]
table_create Terms TABLE_PAT_KEY|KEY_NORMALIZE ShortText --default_tokenizer TokenBigram
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Terms documents_content_index COLUMN_INDEX|WITH_POSITION Documents content
# [[0, 1337566253.89858, 0.000355720520019531], true]
table_create Users TABLE_NO_KEY
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Users name COLUMN_SCALAR ShortText
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Users memo COLUMN_SCALAR ShortText
# [[0, 1337566253.89858, 0.000355720520019531], true]
table_create Lexicon TABLE_HASH_KEY ShortText \
  --default_tokenizer TokenBigramSplitSymbolAlphaDigit \
  --normalizer NormalizerAuto
column_create Lexicon users_name COLUMN_INDEX|WITH_POSITION Users name
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Lexicon users_memo COLUMN_INDEX|WITH_POSITION Users memo
# [[0, 1337566253.89858, 0.000355720520019531], true]

Sample data:

Execution example:

load --table Users
[
{"name": "Alice", "memo": "groonga user"},
{"name": "Alisa", "memo": "mroonga user"},
{"name": "Bob",   "memo": "rroonga user"},
{"name": "Tom",   "memo": "nroonga user"},
{"name": "Tobby", "memo": "groonga and mroonga user. mroonga is ..."},
]
# [[0, 1337566253.89858, 0.000355720520019531], true]
# [[0, 1337566253.89858, 0.000355720520019531], 5]

Here is a simple usage of the query function. It executes a full text search for the keyword 'alice' from within --filter, without using the --match_columns and --query parameters.

Execution example:

select Users --output_columns name,_score --filter 'query("name * 10", "alice")'
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         1
#       ],
#       [
#         [
#           "name",
#           "ShortText"
#         ],
#         [
#           "_score",
#           "Int32"
#         ]
#       ],
#       [
#         "Alice",
#         10
#       ]
#     ]
#   ]
# ]

In the query above, matches for the keyword 'alice' are weighted by 10, so the _score of the matching record is 10.

Here are contrasting examples with and without query.

Execution example:

select Users --output_columns name,memo,_score --match_columns "memo * 10" --query "memo:@groonga OR memo:@mroonga OR memo:@user" --sortby -_score
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "name",
#           "ShortText"
#         ],
#         [
#           "memo",
#           "ShortText"
#         ],
#         [
#           "_score",
#           "Int32"
#         ]
#       ],
#       [
#         "Tobby",
#         "groonga and mroonga user. mroonga is ...",
#         4
#       ],
#       [
#         "Alice",
#         "groonga user",
#         2
#       ],
#       [
#         "Alisa",
#         "mroonga user",
#         2
#       ],
#       [
#         "Bob",
#         "rroonga user",
#         1
#       ],
#       [
#         "Tom",
#         "nroonga user",
#         1
#       ]
#     ]
#   ]
# ]

In this case, the keywords 'groonga', 'mroonga' and 'user' are all given the same weight. You can't give each keyword a different weight this way.

Execution example:

select Users --output_columns name,memo,_score --filter 'query("memo * 10", "groonga") || query("memo * 20", "mroonga") || query("memo * 1", "user")' --sortby -_score
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "name",
#           "ShortText"
#         ],
#         [
#           "memo",
#           "ShortText"
#         ],
#         [
#           "_score",
#           "Int32"
#         ]
#       ],
#       [
#         "Tobby",
#         "groonga and mroonga user. mroonga is ...",
#         51
#       ],
#       [
#         "Alisa",
#         "mroonga user",
#         21
#       ],
#       [
#         "Alice",
#         "groonga user",
#         11
#       ],
#       [
#         "Tom",
#         "nroonga user",
#         1
#       ],
#       [
#         "Bob",
#         "rroonga user",
#         1
#       ]
#     ]
#   ]
# ]

On the other hand, by specifying multiple query functions, the keywords 'groonga', 'mroonga' and 'user' are given different weights.
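The _score values in the output above can be reproduced with a small Python sketch. It assumes a simplified scoring model, weight multiplied by the number of whole-word occurrences, summed over the keywords. This is not Groonga's actual implementation (the index in the sample schema uses a bigram tokenizer); toy_score and the constant names are hypothetical helpers used only for illustration.

```python
import re

# Sample name/memo values from the Users table above.
USERS = [
    ("Alice", "groonga user"),
    ("Alisa", "mroonga user"),
    ("Bob", "rroonga user"),
    ("Tom", "nroonga user"),
    ("Tobby", "groonga and mroonga user. mroonga is ..."),
]

# (keyword, weight) pairs mirroring the three query() calls in the filter.
WEIGHTED_KEYWORDS = [("groonga", 10), ("mroonga", 20), ("user", 1)]


def toy_score(memo, weighted_keywords):
    """Sum weight * occurrence count per keyword (assumed model, whole words only)."""
    words = re.findall(r"[a-z0-9]+", memo.lower())
    return sum(weight * words.count(keyword) for keyword, weight in weighted_keywords)


scores = {name: toy_score(memo, WEIGHTED_KEYWORDS) for name, memo in USERS}
for name, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(name, score)
```

Under this assumption, Tobby's memo contains 'groonga' once, 'mroonga' twice and 'user' once, which gives 10 * 1 + 20 * 2 + 1 * 1 = 51, the same value as in the _score column above.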

As a result, you can control the full text search result by weighting each keyword to suit your purpose.
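When the keywords and weights come from application code, the --filter expression can be assembled as a string. The following is a minimal Python sketch of that idea; build_query_filter is a hypothetical helper, not part of groonga, and the escaping shown (backslash and double quote only) is an assumption about what the string literals need. Quoting for the surrounding command line is not handled here.

```python
def build_query_filter(column, keyword_weights):
    """Join one query() call per keyword with || (logical OR)."""
    calls = []
    for keyword, weight in keyword_weights.items():
        # Assumed escaping: keep backslashes and double quotes inside the string literal.
        escaped = keyword.replace("\\", "\\\\").replace('"', '\\"')
        calls.append(f'query("{column} * {weight}", "{escaped}")')
    return " || ".join(calls)


filter_expression = build_query_filter(
    "memo", {"groonga": 10, "mroonga": 20, "user": 1}
)
print(filter_expression)
```

For the keywords and weights used in this section, the printed expression is the same string that is passed to --filter in the example above.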

8.11.6.4. Parameters

There are two required parameters, match_column and query_string.

8.11.6.4.1. match_column

It specifies the equivalent of the match_columns parameter.

See match_columns for details about match_column.

8.11.6.4.2. query_string

It specifies the equivalent of the query parameter.

See query for details about query_string.

8.11.6.5. Return value

query returns a boolean value (true or false).

8.11.6.6. TODO

  • Support query_expansion
  • Support query_flags

8.11.6.7. See also