Data I/O

Automated handling of inputs, outputs and file distribution through a unified interface.



Welcome to the Data I/O documentation!

Data I/O is a Spark SQL Scala framework that automates reading, writing and distributing data. It gives you the tools to write ETL pipelines that focus on transforming the data, regardless of its origin or destination.

Check out the getting started page for an overview of Data I/O's features, or visit the main concepts page for a deeper dive.

If you know your way around Data I/O and want to learn how to create your own pipes and distributors, head to the advanced section.