Development Quick Tips

How to create nested fields inside Elasticsearch when importing SQL data using Logstash

The nested type is a specialized version of the object datatype that allows arrays of objects to be indexed in a way that they can be queried independently of each other.

In order to achieve this we will follow these steps: 

1. Sort the results returned by the initial input jdbc statement using ORDER BY 

input {  

    jdbc {  

        jdbc_connection_string => "jdbc:<connection_string>" 

        jdbc_user => "<user>" 

        jdbc_password => "<password>" 

        jdbc_driver_library => "<path_to_sqljdbc>" 

        jdbc_driver_class => "Java::com.microsoft.sqlserver.jdbc.SQLServerDriver" 

        statement => "<SQL_statement> ORDER BY <field> <order>"     

} 

2. Use the mutate filter. 

This Logstash filter plugin allows you to force fields into specific data types and add, copy, and update specific fields to make them compatible across the environment.  

Example: 

mutate { 

    rename => { 

        "[objectName][PropertyName1]" => "[objectName][PropertyName1]" 

        "[objectName][PropertyName2]" => "[objectName][PropertyName2]"             

    } 

} 

3. Create a mapping file that will mark the object as nested.  

{
   "template" : "template_name",
   "version" : template_version, 
    "mappings" : {
      "properties" : {	  
        "field_name" : {
          "type" : "field_type"
        },
	  "field_name" : {
          "type" : "field_type"
        },    
        "nested_field_name" : {
          "type" : "nested",
          "properties" : {
            "propertyName1" : {
              "type" : "field_type"
            },
            "propertyName2" : {
              "type" : "date"
            }           
          }
        },        
      }
    }
}

4. Use the following options inside the output plugin 

action => "update" 
doc_as_upsert => true 
manage_template => true 
template => "<path_to_file>" 
template_name => "<template_name>" 
template_overwrite => true 
retry_on_conflict => 5 

5. This is an optional step. If the “document_type” tag is used inside the output filter, we will remove this.