Other complex datatypes

Spark supports several other complex data types:

MapType, ArrayType, DecimalType and DayTimeIntervalType

These can be used in typedspark as follows:

[1]:
from typing import Literal
from pyspark.sql.types import StringType
from typedspark import (
    ArrayType,
    DayTimeIntervalType,
    DecimalType,
    IntervalType,
    MapType,
    Schema,
    Column,
)


class Values(Schema):
    array: Column[ArrayType[StringType]]
    map: Column[MapType[StringType, StringType]]
    decimal: Column[DecimalType[Literal[38], Literal[18]]]
    interval: Column[DayTimeIntervalType[IntervalType.HOUR, IntervalType.SECOND]]
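
The two Literal arguments of DecimalType correspond to Spark's precision (total number of digits) and scale (digits after the decimal point). As a rough illustration of what a scale of 18 means for a value, here is a minimal sketch using only Python's standard library, with no Spark involved. The helper to_scale is hypothetical and not part of typedspark or pyspark:

```python
from decimal import Decimal

PRECISION = 38  # total number of digits, as in Literal[38]
SCALE = 18      # digits after the decimal point, as in Literal[18]


def to_scale(value: Decimal, scale: int = SCALE) -> Decimal:
    """Hypothetical helper: pad (or round) a Decimal to `scale` fractional digits."""
    # Decimal(1).scaleb(-18) is 1E-18, which quantize() uses as the target exponent.
    return value.quantize(Decimal(1).scaleb(-scale))


padded = to_scale(Decimal(32))
print(padded)  # 32 followed by 18 zeros after the decimal point

# Number of digits left over for the integer part of the value.
print(PRECISION - SCALE)
```

With precision 38 and scale 18, a column of this type leaves 20 digits for the integer part of each value.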

Generating DataSets

You can generate DataSets using complex data types in the following way:

[2]:
from pyspark.sql import SparkSession

spark = SparkSession.builder.config("spark.ui.showConsoleProgress", "false").getOrCreate()
spark.sparkContext.setLogLevel("ERROR")

[3]:
from datetime import date, datetime, timedelta
from decimal import Decimal
from pyspark.sql.types import DateType, TimestampType
from typedspark import create_partially_filled_dataset


class MoreValues(Values):
    date: Column[DateType]
    timestamp: Column[TimestampType]


create_partially_filled_dataset(
    spark,
    MoreValues,
    {
        MoreValues.array: [["a", "b", "c"]],
        MoreValues.map: [{"a": "b"}],
        MoreValues.decimal: [Decimal(32)],
        MoreValues.interval: [timedelta(days=1, hours=2, minutes=3, seconds=4)],
        MoreValues.date: [date(2020, 1, 1)],
        MoreValues.timestamp: [datetime(2020, 1, 1, 10, 15)],
    },
).show()
+---------+--------+--------------------+--------------------+----------+-------------------+
|    array|     map|             decimal|            interval|      date|          timestamp|
+---------+--------+--------------------+--------------------+----------+-------------------+
|[a, b, c]|{a -> b}|32.00000000000000...|INTERVAL '26:03:0...|2020-01-01|2020-01-01 10:15:00|
+---------+--------+--------------------+--------------------+----------+-------------------+
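
The decimal and interval columns are cut off in the output above because show() truncates long values by default; passing truncate=False to show() prints them in full. The interval value starts with 26 because the column spans hours to seconds, so the one day in the timedelta is folded into the hour count. A minimal sketch of that arithmetic, using only the standard library (hour_to_second is a hypothetical helper, and Spark's exact rendering may differ, for example in how it prints fractional seconds):

```python
from datetime import timedelta


def hour_to_second(delta: timedelta) -> str:
    """Hypothetical helper: format a non-negative timedelta as HH:MM:SS,
    folding whole days into the hour count."""
    total = int(delta.total_seconds())
    hours, remainder = divmod(total, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


formatted = hour_to_second(timedelta(days=1, hours=2, minutes=3, seconds=4))
print(formatted)  # 26:03:04
```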

Did we miss a data type?

Feel free to open an issue! We can extend the list of supported data types.