Snowflake
Connects to Snowflake to read data from, or write data to, a Snowflake table.
Code repository: https://github.com/AmadeusITGroup/dataio-framework/tree/main/src/main/scala/com/amadeus/dataio/pipes/snowflake
Common
The following fields are available for all Snowflake components:
Name | Mandatory | Description | Example | Default |
---|---|---|---|---|
Options | No | Snowflake options, specified as key = value pairs. They are passed as options to the Spark connector for Snowflake. | Options { Database = "example_database" Schema = "example_schema" ... } | |
Batch
Input
Type: com.amadeus.dataio.pipes.snowflake.batch.SnowflakeInput
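As a rough illustration, a batch input could be declared along the following lines. This is only a sketch: the enclosing `Input` node and the `Name` field are assumptions about the surrounding Data I/O configuration layout, and the option keys are copied from the example in the common fields table rather than from a verified list of connector options.

```hocon
Input {
  # Hypothetical name, assumed to identify this input in the pipeline
  Name = "my-snowflake-input"
  Type = "com.amadeus.dataio.pipes.snowflake.batch.SnowflakeInput"

  # Key = value pairs handed to the Spark connector for Snowflake
  Options {
    Database = "example_database"
    Schema = "example_schema"
    # Any further connector options would be listed here
  }
}
```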
Output
Type: com.amadeus.dataio.pipes.snowflake.batch.SnowflakeOutput
Name | Mandatory | Description | Example | Default |
---|---|---|---|---|
Mode | Yes | Write mode for the Snowflake table. | Mode = "append" | |
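A batch output combining the common `Options` field with `Mode` might look like the sketch below. The enclosing `Output` node and the `Name` field are assumptions about the surrounding Data I/O configuration layout, and the option keys simply reuse the example values from the common fields table.

```hocon
Output {
  # Hypothetical name, assumed to identify this output in the pipeline
  Name = "my-snowflake-output"
  Type = "com.amadeus.dataio.pipes.snowflake.batch.SnowflakeOutput"

  # Mandatory: writing mode on the Snowflake table
  Mode = "append"

  # Key = value pairs handed to the Spark connector for Snowflake
  Options {
    Database = "example_database"
    Schema = "example_schema"
  }
}
```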
Streaming
No streaming input is currently available for Snowflake in Data I/O.
Output
Type: com.amadeus.dataio.pipes.snowflake.streaming.SnowflakeOutput
Name | Mandatory | Description | Example | Default |
---|---|---|---|---|
Duration | No | Sets the trigger interval of the streaming query. Controls Spark's trigger() function. | Duration = "60 seconds" | |
Timeout | Yes | Time to wait, in hours, before returning from the streaming query. Accepts a String or an Int. | Timeout = 24 | |
Mode | Yes | The Spark Structured Streaming output mode. | Mode = "complete" | append |
AddTimestampOnInsert | No | Adds a column named "timestamp" containing the current timestamp at the start of query evaluation. | AddTimestampOnInsert = "true" | false |
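Putting the streaming fields together, a streaming output could be sketched as follows. The field values are the examples from the table above; the enclosing `Output` node, the `Name` field, and the option keys are assumptions for illustration and may differ from an actual pipeline configuration.

```hocon
Output {
  # Hypothetical name, assumed to identify this output in the pipeline
  Name = "my-snowflake-stream-output"
  Type = "com.amadeus.dataio.pipes.snowflake.streaming.SnowflakeOutput"

  # Trigger for the stream query
  Duration = "60 seconds"

  # Hours to wait before returning from the streaming query
  Timeout = 24

  # Spark Structured Streaming output mode
  Mode = "append"

  # Adds a "timestamp" column on insert (defaults to false)
  AddTimestampOnInsert = "true"

  # Key = value pairs handed to the Spark connector for Snowflake
  Options {
    Database = "example_database"
    Schema = "example_schema"
  }
}
```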