From the course: Complete Guide to Databricks for Data Engineering


Write a DataFrame as an external table in PySpark


- [Instructor] You can also write your DataFrame as an external table, but the question that comes to mind is: what is an external table? So let's first understand what an external table is. An external table in Databricks is a table where the metadata is managed by the Databricks metastore, while the actual data it points to lives in an external location, such as Azure Data Lake, AWS, DBFS, or Google Cloud Storage. Databricks does not move or manage your data, so if you drop the table, your data is safe. Long story short, an external table manages only the metadata, not the data. If we talk about some of the important features of external tables: an external table always refers to data available in an external location, and the metadata is kept within Databricks, for example, the name of the table, the column names, the column data types, all of this…
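For reference, a minimal sketch of what this looks like in PySpark follows. The storage path and the table name demo_db.people_external are hypothetical placeholders, and Delta is assumed as the table format; it is the explicit "path" option that makes the resulting table external rather than managed.

# Minimal sketch: writing a DataFrame as an external table (assumed names and path).
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Small example DataFrame (placeholder data).
df = spark.createDataFrame([(1, "Alice"), (2, "Bob")], ["id", "name"])

# Hypothetical external location, e.g. an ADLS Gen2 container.
external_path = "abfss://container@storageaccount.dfs.core.windows.net/tables/people"

# Supplying an explicit "path" creates an external (unmanaged) table:
# the metastore keeps only the metadata, and the files stay at external_path.
(df.write
   .format("delta")
   .mode("overwrite")
   .option("path", external_path)
   .saveAsTable("demo_db.people_external"))

If you later run DROP TABLE demo_db.people_external, only the metastore entry is removed; the files at the external location are left untouched, which is the behavior described above.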
