Date Filters

Your applications may need to filter input datasets by a specific date range. Data I/O supports this directly in the configuration file, via the date_filter input field.

Whether date_filter is available is decided at pipe level. Refer to each pipe's documentation to check whether it is supported.

Fields

date_filter requires a reference and an offset to define the date range, as well as a column field to specify which column the filter applies to.

Name Mandatory Description Example Default

If the upper limit of the date range falls at a time past midnight, that day is included in the range (e.g. if the upper limit is 2022-09-28 at 03:00, the 28th of September is included). The lower limit of the date range is always included.
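To make the inclusion rule concrete, here is a minimal Python sketch of how the covered days could be worked out from a reference and an offset. It is an illustration only, not Data I/O's implementation: the helper name included_days is hypothetical, the offset parsing handles only whole-day values such as "+7D" or "-7D", and it assumes that an upper limit falling exactly at midnight leaves that day out, which is how we read the rule above.

```python
from datetime import date, datetime, time, timedelta
from typing import Tuple
import re


def included_days(reference: str, offset: str) -> Tuple[date, date]:
    """Return the first and last calendar days covered by the range.

    Hypothetical helper, not part of Data I/O. Only whole-day offsets
    of the form "+7D" / "-7D" are handled in this sketch.
    """
    match = re.fullmatch(r"([+-]?)(\d+)D", offset)
    if match is None:
        raise ValueError(f"unsupported offset: {offset!r}")
    days = int(match.group(2)) * (-1 if match.group(1) == "-" else 1)

    ref = datetime.fromisoformat(reference)
    other = ref + timedelta(days=days)
    lower, upper = min(ref, other), max(ref, other)

    # The lower limit is always included.
    first_day = lower.date()
    # The upper limit's day is included only if its time is past midnight
    # (assumption: exactly midnight means the day is left out).
    if upper.time() > time(0, 0):
        last_day = upper.date()
    else:
        last_day = upper.date() - timedelta(days=1)
    return first_day, last_day
```

Under these assumptions, included_days("2023-07-01", "-7D") covers 2023-06-24 through 2023-06-30, while included_days("2022-09-28T03:00", "-7D") covers 2022-09-21 through 2022-09-28, because the upper limit falls at 03:00.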

Example

Here’s an example of an input that uses the date_filter feature:

(...)

input {
  name = "my-input"
  type = "com.amadeus.dataio.pipes.spark.batch.SparkInput"
  format = "delta"
  path = "hdfs://path/to/data"
  date_filter {
    reference = "2023-07-01"
    offset = "-7D"
    column = "date"
  }
}

(...)