ElasticSearch Cookbook

Managing nested objects

There is a special type of embedded object: the nested one. It resolves a problem related to the Lucene indexing architecture, in which all the fields of the embedded objects in a multi-valued array are flattened into the fields of a single document. As a result, a search cannot distinguish which values belong to which embedded object in the same array.

If we consider the previous order example, it's not possible to match an item name and its quantity within the same item using a single query. We need to index the items as separate elements and then join them with their order at search time. This entire round trip is managed by nested objects and nested queries.
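The problem can be illustrated without a running cluster. The following is a minimal Python sketch, not ElasticSearch code: the sample order, the `flatten` helper, and the two matching functions are hypothetical names introduced only to show how flattening an array of embedded objects loses the link between a name and its quantity.

```python
# Hypothetical order with two items, used only for illustration.
order = {
    "id": "1234",
    "item": [
        {"name": "tshirt", "quantity": 10},
        {"name": "socks", "quantity": 2},
    ],
}


def flatten(doc, array_field):
    """Collapse an array of embedded objects into multi-valued fields,
    roughly how a plain (non-nested) object array ends up in the index."""
    flat = {}
    for obj in doc[array_field]:
        for key, value in obj.items():
            flat.setdefault(f"{array_field}.{key}", []).append(value)
    return flat


def matches_flat(flat, name, quantity):
    # Each condition is checked against the whole value list independently.
    return name in flat["item.name"] and quantity in flat["item.quantity"]


def matches_nested(doc, name, quantity):
    # Both conditions must hold inside the same item.
    return any(
        item["name"] == name and item["quantity"] == quantity
        for item in doc["item"]
    )


flat = flatten(order, "item")
print(flat)
print(matches_flat(flat, "socks", 10))     # True: a false positive
print(matches_nested(order, "socks", 10))  # False: no single item matches
```

No item in the order is "socks" with a quantity of 10, yet the flattened view reports a match because the two values come from different items.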

Getting ready

You need a working ElasticSearch cluster.

How to do it...

A nested object is defined like a standard object, but with its type set to nested.

Starting from the example in the Mapping an object recipe, we can change the type of item from object to nested, as shown in the following code:

{
  "order" : {
    "properties" : {
      "id" : {"type" : "string", "store" : "yes", "index" : "not_analyzed"},
      "date" : {"type" : "date", "store" : "no", "index" : "not_analyzed"},
      "customer_id" : {"type" : "string", "store" : "yes", "index" : "not_analyzed"},
      "sent" : {"type" : "boolean", "store" : "no", "index" : "not_analyzed"},
      "item" : {
        "type" : "nested",
        "properties" : {
          "name" : {"type" : "string", "store" : "no", "index" : "analyzed"},
          "quantity" : {"type" : "integer", "store" : "no", "index" : "not_analyzed"},
          "vat" : {"type" : "double", "store" : "no", "index" : "not_analyzed"}
        }
      }
    }
  }
}

How it works...

When a document is indexed, every embedded object marked as nested is extracted from the original document and indexed as a separate, hidden document.

In the preceding example, we have reused the mapping from the Mapping an object recipe and changed the type of item from object to nested. No other action is required to convert an embedded object to a nested object.

Nested objects are special Lucene documents that are saved in the same block of data as their parent. This approach allows fast joining with the parent document.
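The extraction step can be pictured with a small simulation. This is a sketch under stated assumptions, not ElasticSearch internals: the `to_block` function and the `_parent_id` and `_nested_path` keys are hypothetical, and placing the parent last in the block follows the convention used by Lucene's block join support.

```python
def to_block(doc_id, source, nested_field):
    """Split a source document into one contiguous block:
    one hidden document per nested object, followed by the parent."""
    children = [
        {
            "_parent_id": doc_id,          # hypothetical bookkeeping key
            "_nested_path": nested_field,  # hypothetical bookkeeping key
            **{f"{nested_field}.{key}": value for key, value in obj.items()},
        }
        for obj in source.get(nested_field, [])
    ]
    # The parent keeps every field except the nested array itself.
    parent = {
        "_id": doc_id,
        **{key: value for key, value in source.items() if key != nested_field},
    }
    return children + [parent]


block = to_block(
    "1234",
    {
        "customer_id": "c1",
        "sent": True,
        "item": [
            {"name": "tshirt", "quantity": 10, "vat": 20.0},
            {"name": "socks", "quantity": 2, "vat": 20.0},
        ],
    },
    "item",
)
for doc in block:
    print(doc)
```

Because each item is its own document, a condition on name and a condition on quantity can be evaluated against the same item, and the adjacent parent is cheap to locate afterwards.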

Nested objects are not searchable with standard queries, only with the nested query. They are also not returned as separate hits in standard query results.
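As an illustration, a nested query against the item field of the preceding mapping might look like the following sketch. The body is built as a Python dictionary and printed as JSON so that it can be sent with any HTTP client. The `nested_item_query` helper and the search values are assumptions made for this example.

```python
import json


def nested_item_query(name, min_quantity):
    """Build a search body that matches orders containing a single item
    with the given name and at least the given quantity."""
    return {
        "query": {
            "nested": {
                "path": "item",  # the nested field defined in the mapping
                "query": {
                    "bool": {
                        "must": [
                            {"match": {"item.name": name}},
                            {"range": {"item.quantity": {"gte": min_quantity}}},
                        ]
                    }
                },
            }
        }
    }


body = nested_item_query("tshirt", 5)
print(json.dumps(body, indent=2))
```

Both clauses are evaluated inside the same nested item, so an order is returned only if one of its items satisfies the name and the quantity conditions together.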

The life cycle of a nested object is tied to its parent: deleting or updating a parent automatically deletes or updates all of its nested children. Moving a nested object from one parent to another means ElasticSearch will do the following:

  • Delete the old document
  • Reindex the old document with less nested data
  • Delete the new document
  • Reindex the new document with the new nested data
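Read as moving a nested item from one parent order to another, the four steps above can be sketched with an in-memory stand-in for an index. Everything here is hypothetical (the `index` dictionary, the `log` list, and the `delete`, `reindex`, and `move_item` helpers). The sketch only shows that both parents are rewritten in full.

```python
index = {}  # hypothetical stand-in for an index: document id -> source
log = []    # records the operations in the order they happen


def delete(doc_id):
    index.pop(doc_id, None)
    log.append(("delete", doc_id))


def reindex(doc_id, source):
    index[doc_id] = source
    log.append(("index", doc_id))


def move_item(item_name, old_id, new_id):
    """Move one nested item between parents by rewriting both parents."""
    old_doc, new_doc = index[old_id], index[new_id]
    moved = [i for i in old_doc["item"] if i["name"] == item_name]
    kept = [i for i in old_doc["item"] if i["name"] != item_name]
    old_updated = {**old_doc, "item": kept}
    new_updated = {**new_doc, "item": new_doc["item"] + moved}
    delete(old_id)
    reindex(old_id, old_updated)  # old parent, with less nested data
    delete(new_id)
    reindex(new_id, new_updated)  # new parent, with the new nested data


# Seed the stand-in index directly so the log holds only the move itself.
index["A"] = {"item": [{"name": "tshirt"}, {"name": "socks"}]}
index["B"] = {"item": [{"name": "hat"}]}
move_item("socks", "A", "B")
print(log)
print(index)
```

A single nested object cannot be changed in isolation in this model: any change costs a full rewrite of every parent involved.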

There's more...

Sometimes the information in nested objects must be propagated to their parent or root objects, mainly to allow simpler queries on the parents. To achieve this, nested objects have two special properties, as follows:

  • include_in_parent: This allows automatic addition of the nested fields to the immediate parent
  • include_in_root: This adds the nested object fields to the root object

These settings duplicate data in the index, but they reduce the complexity of some queries, which can improve search performance.
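A mapping that enables one of these properties might be built as in the following sketch. The `nested_mapping` helper is a hypothetical convenience and the field definitions are copied from the mapping shown earlier in this recipe. Only the `include_in_parent` and `include_in_root` keys come from ElasticSearch itself.

```python
import json


def nested_mapping(properties, include_in_parent=False, include_in_root=False):
    """Return a nested field mapping, optionally copying the nested
    fields to the immediate parent and/or the root document."""
    mapping = {"type": "nested", "properties": properties}
    if include_in_parent:
        mapping["include_in_parent"] = True
    if include_in_root:
        mapping["include_in_root"] = True
    return mapping


item = nested_mapping(
    {
        "name": {"type": "string", "store": "no", "index": "analyzed"},
        "quantity": {"type": "integer", "store": "no", "index": "not_analyzed"},
        "vat": {"type": "double", "store": "no", "index": "not_analyzed"},
    },
    include_in_parent=True,
)
order_mapping = {"order": {"properties": {"item": item}}}
print(json.dumps(order_mapping, indent=2))
```

In this mapping, item is nested directly under the order, so its immediate parent is also the root document and the two properties would have the same effect. They differ only when nested objects contain further nested objects.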

See also

  • Managing a child document