Writing tests

Table of contents
  1. Writing tests
    1. Installation
    2. Overview
      1. Interacting with the file system
      2. Interacting with a SparkSession
      3. Interacting with a Streaming context
      4. Implicitly converting Scala Maps and Lists to Java equivalents

Data I/O offers a separate library with utility traits and methods designed to facilitate testing Scala/Spark SQL applications.

Installation

Published releases are available on GitHub Packages, in the AmadeusITGroup repository.

Using Maven:

<dependency>
    <groupId>com.amadeus.dataio</groupId>
    <artifactId>dataio-test</artifactId>
    <version>x.x.x</version>
</dependency>
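
If you use sbt, the equivalent dependency would look roughly like the following (a sketch: it assumes the artifact is published without a Scala version suffix; if it is published per Scala version, use %% instead of %):

// build.sbt: same coordinates as the Maven dependency above
libraryDependencies += "com.amadeus.dataio" % "dataio-test" % "x.x.x"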

Overview

Interacting with the file system

The FileSystemSpec trait provides the Hadoop LocalFileSystem for tests needing direct access to an instance of FileSystem.

Example:


import com.amadeus.dataio.test._
import org.apache.hadoop.fs.Path
import org.scalatest.flatspec.AnyFlatSpec

class MyAppTest extends AnyFlatSpec with FileSystemSpec {
  "MyAppTest" should "do something" in {
    assert(fs.exists(new Path("file:///my_file.txt")))
  }
}

Interacting with a SparkSession

The SparkSpec trait provides a local Spark session and helper functions for Spark tests:

  • getTestName: String: Returns the test suite's name.
  • collectData(path: String, format: String, schema: Option[String] = None): Array[String]: Collects data from the file system.

Note that when extending this trait, you will have to override the getTestName: String function.

Example:


import com.amadeus.dataio.test._
import org.scalatest.flatspec.AnyFlatSpec

class MyAppTest extends AnyFlatSpec with SparkSpec {
  override def getTestName = "MyAppTest"

  "MyAppTest" should "do something" in {
    spark.read.format("csv").load("my_data.csv")
    collectData("my_data.csv", "csv")
  }
}

Interacting with a Streaming context

The SparkStreamingSpec trait provides a local Spark session and helper functions for Spark Streaming tests:

  • enableSparkStreamingSchemaInference(): Unit: Enables Spark streaming schema inference.
  • collectDataStream(dataFrame: DataFrame): Array[String]: Collects data from a DataFrame read from a stream using an in-memory sink.
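
The example below is a minimal sketch in the spirit of the previous sections: it assumes that SparkStreamingSpec, like SparkSpec, exposes a local spark session and requires overriding getTestName, and it reads from a hypothetical my_data_dir/ directory.

Example:


import com.amadeus.dataio.test._
import org.scalatest.flatspec.AnyFlatSpec

class MyAppTest extends AnyFlatSpec with SparkStreamingSpec {
  // Assumption: getTestName must be overridden, as with SparkSpec
  override def getTestName = "MyAppTest"

  "MyAppTest" should "do something" in {
    // Let Spark infer the schema of the streaming source
    enableSparkStreamingSchemaInference()

    // Read a hypothetical directory of CSV files as a stream
    val dataFrame = spark.readStream.format("csv").load("my_data_dir/")

    // Collect the streamed data through an in-memory sink
    collectDataStream(dataFrame)
  }
}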

Implicitly converting Scala Maps and Lists to Java equivalents

It is sometimes necessary to build complex map structures when creating Typesafe Config objects, which normally requires redundant Scala-to-Java conversions.

To simplify this, you may extend the JavaImplicitConverters trait.

Example:


import com.amadeus.dataio.test._
import com.typesafe.config.ConfigFactory
import org.scalatest.flatspec.AnyFlatSpec

class MyAppTest extends AnyFlatSpec with JavaImplicitConverters {
  "MyAppTest" should "do something" in {
    ConfigFactory.parseMap(
      Map("NodeName" -> Seq(Map("Type" -> "com.Entity"), Map("Type" -> "com.Entity")))
    )
  }
}