Your applications may need to filter input datasets by a specific date range. Data I/O makes this possible
directly in the configuration file, via the date_filter input field.
The availability of date_filter is decided at pipe level. Please refer to each pipe’s documentation to know whether it
is available.
date_filter requires a reference and an offset in order to define a date range, as well as a column field in
order to specify where to apply the filter.
| Name | Mandatory | Description | Example | Default |
|---|---|---|---|---|
If the upper limit of the date range has a time past midnight, that day is included in the range (e.g. if the upper limit is 2022-09-28 at 03:00, the 28th of September is included). The lower limit of the date range is always included.
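The inclusion rules above can be illustrated with a small standalone sketch. It is not Data I/O code: the function names are hypothetical, the offset syntax (an optional sign, a number, then D for days or H for hours) is assumed from the "-7D" form, and the sketch assumes that an upper limit falling exactly at midnight does not include that day.

```python
import re
from datetime import date, datetime, time, timedelta


def parse_offset(offset: str) -> timedelta:
    """Parse a signed offset such as "-7D" or "+3H" (assumed syntax)."""
    match = re.fullmatch(r"([+-]?)(\d+)([DH])", offset)
    if match is None:
        raise ValueError(f"unsupported offset: {offset!r}")
    sign, amount, unit = match.groups()
    value = int(amount) * (-1 if sign == "-" else 1)
    return timedelta(days=value) if unit == "D" else timedelta(hours=value)


def date_range(reference: datetime, offset: str) -> tuple[datetime, datetime]:
    """Return (lower, upper) limits, whatever the sign of the offset."""
    other = reference + parse_offset(offset)
    return min(reference, other), max(reference, other)


def included_days(reference: datetime, offset: str) -> list[date]:
    """List the calendar days covered by the range.

    The day of the lower limit is always included. The day of the upper
    limit is included only if its time is past midnight (assumption: an
    upper limit at exactly 00:00 excludes that day).
    """
    lower, upper = date_range(reference, offset)
    last = upper.date()
    if upper.time() == time(0):
        last -= timedelta(days=1)

    days = []
    current = lower.date()
    while current <= last:
        days.append(current)
        current += timedelta(days=1)
    return days
```

Under these assumptions, a reference of 2022-09-28 at 03:00 with an offset of "-2D" covers the 26th, 27th and 28th of September, while a reference of 2023-07-01 at midnight with an offset of "-7D" covers the seven days from 2023-06-24 to 2023-06-30.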
Here’s an example of input using the date_filter feature:
(...)
input {
    name = "my-input"
    type = "com.amadeus.dataio.pipes.spark.batch.SparkInput"
    format = "delta"
    path = "hdfs://path/to/data"
    date_filter {
        reference = "2023-07-01"
        offset = "-7D"
        column = "date"
    }
}
(...)