{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "c3322756",
   "metadata": {},
   "source": [
    "# Autocomplete in Databricks & Jupyter notebooks\n",
    "When we use `Catalogs`, `Databases`, `Database`, `load_table()` or `create_schema()` in a Databricks or Jupyter notebook, we also get autocomplete on the column names. No more looking at `df.columns` every minute to remember the column names!\n",
    "\n",
    "## The basics\n",
    "\n",
    "To illustrate this, let us first generate a table that we'll write to the table `person_table`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "87752202",
   "metadata": {},
   "outputs": [],
   "source": [
    "from pyspark.sql import SparkSession\n",
    "\n",
    "spark = SparkSession.Builder().config(\"spark.ui.showConsoleProgress\", \"false\").getOrCreate()\n",
    "spark.sparkContext.setLogLevel(\"ERROR\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "6c1e5acc",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "(\n",
    "    spark.createDataFrame(\n",
    "        pd.DataFrame(\n",
    "            dict(\n",
    "                name=[\"Jack\", \"John\", \"Jane\"],\n",
    "                age=[20, 30, 40],\n",
    "            )\n",
    "        )\n",
    "    ).createOrReplaceTempView(\"person_table\")\n",
    ")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "4bd96763",
   "metadata": {},
   "source": [
    "We can now load these data using `load_table()`. Note that the `Schema` is inferred: it doesn't need to have been serialized using `typedspark`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "3003dea9",
   "metadata": {},
   "outputs": [],
   "source": [
    "from typedspark import load_table\n",
    "\n",
    "df, Person = load_table(spark, \"person_table\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "65c183c1",
   "metadata": {},
   "source": [
    "You can now use `df` and `Person` just like you would in your IDE, including autocomplete!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "f38e0e20",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "+----+---+\n",
      "|name|age|\n",
      "+----+---+\n",
      "|John| 30|\n",
      "|Jane| 40|\n",
      "+----+---+\n",
      "\n"
     ]
    }
   ],
   "source": [
    "df.filter(Person.age > 25).show()"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "eb3f8c9c",
   "metadata": {},
   "source": [
    "## Other notebook types\n",
    "\n",
    "Auto-complete of dynamically loaded schemas (e.g. through `load_table()` or `create_schema()`) has been verified to work on Databricks, JupyterLab and Jupyter Notebook. At the time of writing, it doesn't work in VSCode and PyCharm notebooks."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "c342aec1",
   "metadata": {},
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}