PutBigQueryStreaming

Description:

Loads data into a Google BigQuery table using the streaming API. This processor is not intended for large flow files, because it loads the full content into memory. To insert large flow files, consider using PutBigQueryBatch instead.

Tags:

google, google cloud, bq, gcp, bigquery, record

Properties:

In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values, whether a property supports the NiFi Expression Language, and whether a property is considered "sensitive", meaning that its value will be encrypted. Before entering a value in a sensitive property, ensure that the nifi.properties file has an entry for the property nifi.sensitive.props.key.

Name | Default Value | Allowable Values | Description
Project ID | | | Google Cloud Project ID. Supports Expression Language: true (will be evaluated using variable registry only)
GCP Credentials Provider Service | | Controller Service API: GCPCredentialsService. Implementation: GCPCredentialsControllerService | The Controller Service used to obtain Google Cloud Platform credentials.
Number of retries | 6 | | How many retry attempts should be made before routing to the failure relationship.
Proxy host | | | IP or hostname of the proxy to be used. You might need to set the following properties in bootstrap for https proxy usage: -Djdk.http.auth.tunneling.disabledSchemes= -Djdk.http.auth.proxying.disabledSchemes= Supports Expression Language: true (will be evaluated using variable registry only)
Proxy port | | | Proxy port number. Supports Expression Language: true (will be evaluated using variable registry only)
HTTP Proxy Username | | | HTTP Proxy Username. Supports Expression Language: true (will be evaluated using variable registry only)
HTTP Proxy Password | | | HTTP Proxy Password. Sensitive Property: true. Supports Expression Language: true (will be evaluated using variable registry only)
Proxy Configuration Service | | Controller Service API: ProxyConfigurationService. Implementation: StandardProxyConfigurationService | Specifies the Proxy Configuration Controller Service to proxy network requests. If set, it supersedes proxy settings configured per component. Supported proxies: HTTP + AuthN
Dataset | ${bq.dataset} | | BigQuery dataset name (Note - The dataset must exist in GCP). Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)
Table Name | ${bq.table.name} | | BigQuery table name. Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)
Ignore Unknown Values | false | | Sets whether BigQuery should allow extra values that are not represented in the table schema. If true, the extra values are ignored. If false, records with extra columns are treated as bad records, and if there are too many bad records, an invalid error is returned in the job result. By default unknown values are not allowed. Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)
Record Reader | | Controller Service API: RecordReaderFactory. Implementations: JsonPathReader, AvroReader, XMLReader, WindowsEventLogReader, ReaderLookup, Syslog5424Reader, GrokReader, ScriptedReader, CSVReader, SyslogReader, ParquetReader, JsonTreeReader, CEFReader | Specifies the Controller Service to use for parsing incoming data.
Skip Invalid Rows | false | | Sets whether to insert all valid rows of a request, even if invalid rows exist. If not set the entire insert request will fail if it contains an invalid row. Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)
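
The interaction between Ignore Unknown Values and Skip Invalid Rows can be easier to follow with a small model. The sketch below is a simplified simulation in plain Python, not the processor's or BigQuery's actual implementation: the function name simulate_streaming_insert and the treatment of "a row with a field missing from the schema" as the only kind of invalid row are assumptions made for illustration.

```python
from typing import Any, Dict, List, Set, Tuple

Row = Dict[str, Any]


def simulate_streaming_insert(
    rows: List[Row],
    schema_fields: Set[str],
    ignore_unknown_values: bool = False,
    skip_invalid_rows: bool = False,
) -> Tuple[List[Row], List[int]]:
    """Return (rows that would be inserted, indexes of rows judged invalid).

    Simplified model: a row is invalid only when it carries a field that is
    not in the schema and ignore_unknown_values is False.
    """
    accepted: List[Row] = []
    invalid_indexes: List[int] = []

    for index, row in enumerate(rows):
        unknown = set(row) - schema_fields
        if unknown and not ignore_unknown_values:
            invalid_indexes.append(index)
            continue
        # Unknown fields are dropped; known fields are kept as-is.
        accepted.append({k: v for k, v in row.items() if k in schema_fields})

    # Without skip_invalid_rows, a single invalid row fails the whole request.
    if invalid_indexes and not skip_invalid_rows:
        return [], invalid_indexes
    return accepted, invalid_indexes


rows = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b", "extra": "x"},  # "extra" is not in the schema
]
schema = {"id", "name"}

default_result = simulate_streaming_insert(rows, schema)
skip_result = simulate_streaming_insert(rows, schema, skip_invalid_rows=True)
ignore_result = simulate_streaming_insert(rows, schema, ignore_unknown_values=True)
```

With both properties left at false, the second row causes the whole request to be rejected. Setting Skip Invalid Rows keeps the first row only, and setting Ignore Unknown Values keeps both rows with the extra field dropped.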

Relationships:

Name | Description
success | FlowFiles are routed to this relationship after a successful Google BigQuery operation.
failure | FlowFiles are routed to this relationship if the Google BigQuery operation fails.
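
How the Number of retries property relates to the two relationships can be sketched as a retry loop. This is a minimal illustration, not the processor's code: route_with_retries, TransientInsertError, and make_flaky are hypothetical names, and the sketch assumes the first attempt does not count as a retry, which may differ from how the processor counts attempts.

```python
from typing import Callable, Tuple


class TransientInsertError(Exception):
    """Hypothetical error type standing in for a retryable failure."""


def route_with_retries(
    operation: Callable[[], None],
    number_of_retries: int = 6,
) -> Tuple[str, int]:
    """Run operation, retrying on failure; return (relationship, attempts)."""
    attempts = 0
    while True:
        attempts += 1
        try:
            operation()
            return "success", attempts
        except TransientInsertError:
            # One initial attempt plus number_of_retries retries (assumption).
            if attempts > number_of_retries:
                return "failure", attempts


def make_flaky(failures_before_success: int) -> Callable[[], None]:
    """Build an operation that fails a fixed number of times, then succeeds."""
    state = {"calls": 0}

    def operation() -> None:
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise TransientInsertError("simulated failure")

    return operation


recovered = route_with_retries(make_flaky(2), number_of_retries=6)
exhausted = route_with_retries(make_flaky(10), number_of_retries=6)
```

In this model an operation that fails twice is routed to success on the third attempt, while one that keeps failing is routed to failure once the retries are used up.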

Reads Attributes:

None specified.
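
Although no attributes are declared here, the default values of the Dataset and Table Name properties are Expression Language references to the flow file attributes bq.dataset and bq.table.name. The sketch below shows how such simple references resolve against a set of attributes. It is a deliberately reduced stand-in, not the NiFi Expression Language evaluator: resolve_simple_references is a hypothetical helper that handles only plain ${name} references (no functions, no variable registry), and it assumes a missing attribute resolves to an empty string.

```python
import re
from typing import Dict

# Matches plain ${name} references only; nested or function syntax is ignored.
_SIMPLE_REFERENCE = re.compile(r"\$\{([^${}]+)\}")


def resolve_simple_references(value: str, attributes: Dict[str, str]) -> str:
    """Replace each ${name} in value with the matching attribute value."""

    def replace(match: "re.Match[str]") -> str:
        name = match.group(1).strip()
        # Assumption: an attribute that is not present resolves to "".
        return attributes.get(name, "")

    return _SIMPLE_REFERENCE.sub(replace, value)


# Example attribute values are made up for illustration.
attributes = {"bq.dataset": "analytics", "bq.table.name": "events"}

dataset = resolve_simple_references("${bq.dataset}", attributes)
table = resolve_simple_references("${bq.table.name}", attributes)
missing = resolve_simple_references("${not.set}", attributes)
```

With the defaults in place, a flow file that lacks these attributes would yield empty dataset and table names in this model, so an upstream step usually needs to set them or the properties need explicit values.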

Writes Attributes:

Name | Description
bq.records.count | Number of records successfully inserted

State management:

This component does not store state.

Restricted:

This component is not restricted.

Input requirement:

This component requires an incoming relationship.

System Resource Considerations:

Resource | Description
MEMORY | An instance of this component can cause high usage of this system resource. Multiple instances or high concurrency settings may result in a degradation of performance.
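
Because the full flow file content is loaded into memory, one way to limit the footprint is to keep each flow file small before it reaches this processor. The sketch below illustrates the idea of splitting a record stream into bounded chunks. The helper name chunked and the chunk size are hypothetical, and in a NiFi flow this splitting would normally be handled upstream in the flow rather than in custom code.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(records: Iterable[T], max_records: int) -> Iterator[List[T]]:
    """Yield lists of at most max_records items, reading the input lazily."""
    if max_records < 1:
        raise ValueError("max_records must be at least 1")
    iterator = iter(records)
    while True:
        # Only one chunk is materialized in memory at a time.
        chunk = list(islice(iterator, max_records))
        if not chunk:
            return
        yield chunk


chunks = list(chunked(range(7), 3))
empty = list(chunked([], 3))
```

Seven records with a limit of three produce two full chunks and one partial chunk, so peak memory per insert is bounded by the chunk size rather than by the size of the original input.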

See Also:

PutBigQueryBatch