The JsonTreeReader Controller Service reads a JSON Object and creates a Record object for the entire JSON Object tree. The Controller Service must be configured with a Schema that describes the structure of the JSON data. If any field exists in the JSON that is not in the schema, that field will be skipped. If the schema contains a field for which no JSON field exists, a null value will be used in the Record (or the default value defined in the schema, if applicable).
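
The skip-and-fill behavior described above can be sketched in a few lines of Python. This is an illustration only, not NiFi's implementation: the `project` function and the simplified schema layout (a dict mapping each field name to its default, or `None` when no default is defined) are hypothetical.

```python
# Hypothetical sketch: the schema is modeled as {field name: default value or None}.
def project(json_object, schema_defaults):
    """Keep only the fields named in the schema.

    Fields present in the JSON but absent from the schema are skipped.
    Fields in the schema but absent from the JSON get the default (or None).
    """
    record = {}
    for name, default in schema_defaults.items():
        record[name] = json_object.get(name, default)
    return record


schema_defaults = {"id": None, "name": None, "gender": None}
record = project({"id": 98, "name": "Jane", "siblingIds": []}, schema_defaults)
print(record)  # "siblingIds" is dropped; "gender" is filled with None
```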

If the root element of the JSON is a JSON Array, each JSON Object within that array will be treated as its own separate Record. If the root element is a JSON Object, the entire JSON will be treated as a single Record.
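
As a rough sketch of the rule above, the following Python splits parsed JSON into one or more records depending on the root element. The `to_records` helper is a hypothetical name used for illustration and does not correspond to any NiFi API.

```python
import json


def to_records(json_text):
    """Return a list of record-like dicts, one per top-level JSON Object."""
    root = json.loads(json_text)
    if isinstance(root, list):
        # Root is a JSON Array: each element becomes its own record.
        return list(root)
    # Root is a JSON Object: the whole document is a single record.
    return [root]


print(len(to_records('[{"id": 17}, {"id": 98}]')))
print(len(to_records('{"id": 17}')))
```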

Schemas and Type Coercion

When a record is parsed from incoming data, it is separated into fields. Each of these fields is then looked up against the configured schema (by field name) in order to determine what the type of the data should be. If the field is not present in the schema, that field is omitted from the Record. If the field is found in the schema, the data type of the received data is compared against the data type specified in the schema. If the types match, the value of that field is used as-is. If the schema indicates that the field should be of a different type, then the Controller Service will attempt to coerce the data into the type specified by the schema. If the field cannot be coerced into the specified type, an Exception will be thrown.

The following rules apply when attempting to coerce a field value from one data type to another:

If none of the above rules apply when attempting to coerce a value from one data type to another, the coercion will fail and an Exception will be thrown.
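
The overall flow (use the value as-is when the types match, otherwise attempt a coercion, otherwise fail) might look like the sketch below. The two coercions shown (any value to string, and a numeric string to int) are assumptions chosen for illustration and are not the Controller Service's actual rule set; `coerce`, `CoercionError`, and `PYTHON_TYPES` are hypothetical names.

```python
class CoercionError(Exception):
    """Raised when a value cannot be coerced to the type named in the schema."""


# Hypothetical mapping from a schema type name to a Python type.
PYTHON_TYPES = {"int": int, "string": str}


def coerce(value, schema_type):
    target = PYTHON_TYPES[schema_type]
    if type(value) is target:
        return value  # types already match: use the value as-is
    if target is str:
        return str(value)  # assumed rule: any value can become a string
    if target is int and isinstance(value, str):
        try:
            return int(value)  # assumed rule: numeric strings can become ints
        except ValueError as exc:
            raise CoercionError(f"cannot coerce {value!r} to int") from exc
    # No rule applies: the coercion fails.
    raise CoercionError(f"cannot coerce {value!r} to {schema_type}")


print(coerce("1", "int"), coerce(17, "string"))
```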

Examples

As an example, consider that the following JSON is read:

[{
    "id": 17,
    "name": "John",
    "child": {
        "id": "1"
    },
    "dob": "10-29-1982",
    "siblings": [
        { "name": "Jeremy", "id": 4 },
        { "name": "Julia", "id": 8}
    ]
  },
  {
    "id": 98,
    "name": "Jane",
    "child": {
        "id": 2
    },
    "dob": "08-30-1984",
    "gender": "F",
    "siblingIds": [],
    "siblings": []
  }]

Also, consider that the schema configured for this JSON is as follows (assuming that the AvroSchemaRegistry Controller Service is chosen to denote the Schema):

{
	"namespace": "nifi",
	"name": "person",
	"type": "record",
	"fields": [
		{ "name": "id", "type": "int" },
		{ "name": "name", "type": "string" },
		{ "name": "gender", "type": "string" },
		{ "name": "dob", "type": {
			"type": "int",
			"logicalType": "date"
		}},
		{ "name": "siblings", "type": {
			"type": "array",
			"items": {
				"type": "record",
				"fields": [
					{ "name": "name", "type": "string" }
				]
			}
		}}
	]
}

Let us also assume that this Controller Service is configured with the "Date Format" property set to "MM-dd-yyyy", as this matches the date format used for our JSON data. Reading this JSON will produce two separate Records, because the root element is a JSON Array with two elements.
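
To make the date handling concrete, here is a small Python sketch that parses a "dob" string in this format and converts it to the integer form used by the Avro "date" logical type (days since January 1, 1970). The `%m-%d-%Y` pattern is an assumed Python equivalent of "MM-dd-yyyy", and the helper names are hypothetical; NiFi's own parsing is done in Java and may differ in edge cases.

```python
from datetime import date, datetime

# Assumed Python equivalent of the "MM-dd-yyyy" Date Format property.
DATE_FORMAT = "%m-%d-%Y"


def parse_dob(text):
    """Parse a month-day-year string into a date; raises ValueError on mismatch."""
    return datetime.strptime(text, DATE_FORMAT).date()


def days_since_epoch(d):
    """Avro's 'date' logical type stores an int: days since 1970-01-01."""
    return (d - date(1970, 1, 1)).days


dob = parse_dob("10-29-1982")
print(dob.isoformat(), days_since_epoch(dob))
```

A value that does not match the configured format, such as "1982-10-29", raises `ValueError` in this sketch, which mirrors the failed-coercion case described earlier.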

The first Record will consist of the following values:

Field Name    Field Value
----------    -----------
id            17
name          John
gender        null
dob           10-29-1982
siblings      array with two elements, each of which is itself a Record:

    Field Name    Field Value
    ----------    -----------
    name          Jeremy

and:

    Field Name    Field Value
    ----------    -----------
    name          Julia

The second Record will consist of the following values:

Field Name    Field Value
----------    -----------
id            98
name          Jane
gender        F
dob           08-30-1984
siblings      empty array