Documentation of tables
You can document a schema by giving the class a docstring and attaching a ColumnMeta comment to each column:
from typing import Annotated

from pyspark.sql.types import DateType, StringType
from typedspark import Column, ColumnMeta, Schema


class Person(Schema):
    """Dimension table that contains information about a person."""

    person_id: Annotated[
        Column[StringType],
        ColumnMeta(comment="Unique person id"),
    ]
    gender: Annotated[Column[StringType], ColumnMeta(comment="Gender of the person")]
    birthdate: Annotated[Column[DateType], ColumnMeta(comment="Date of birth of the person")]
    job_id: Annotated[Column[StringType], ColumnMeta(comment="Id of the job")]
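To illustrate how comments attached through Annotated can be read back at runtime, here is a minimal standard-library sketch. It does not use typedspark and is not a description of typedspark's internals: Meta, PersonSketch and column_comments are hypothetical stand-ins invented for this example, and plain str replaces Column[StringType] so the snippet runs without Spark.

```python
from dataclasses import dataclass
from typing import Annotated, Optional, get_type_hints


@dataclass(frozen=True)
class Meta:
    """Hypothetical stand-in for ColumnMeta (assumption, not typedspark code)."""

    comment: Optional[str] = None


class PersonSketch:
    """Dimension table that contains information about a person."""

    # Plain `str` stands in for Column[StringType] to avoid a Spark dependency.
    person_id: Annotated[str, Meta(comment="Unique person id")]
    gender: Annotated[str, Meta(comment="Gender of the person")]


def column_comments(schema: type) -> dict:
    """Collect the comment attached to each annotated attribute of `schema`."""
    # include_extras=True keeps the Annotated wrapper instead of stripping it.
    hints = get_type_hints(schema, include_extras=True)
    comments = {}
    for name, hint in hints.items():
        # Annotated types expose their extra arguments via __metadata__.
        for extra in getattr(hint, "__metadata__", ()):
            if isinstance(extra, Meta):
                comments[name] = extra.comment
    return comments


comments = column_comments(PersonSketch)  # per-column documentation
table_doc = PersonSketch.__doc__          # table-level documentation
```

The class docstring and the per-column comments are ordinary Python metadata, which is why a library can collect them from the schema class without any extra configuration.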
If you use Databricks and Delta Live Tables, you can make the documentation appear in the Databricks UI by using the following Delta Live Table definition:
import dlt
from typedspark import DataSet

@dlt.table(**Person.get_dlt_kwargs())
def table_definition() -> DataSet[Person]:
    ...  # your table definition here