From the course: Complete Guide to Databricks for Data Engineering
Write a DataFrame as external table in PySpark - Databricks Tutorial
- [Instructor] You can also write your DataFrame as an external table, but the question that comes to mind is: what is an external table? So, let's first understand what an external table is. An external table in Databricks is a table where the metadata is managed by the Databricks metastore, while the actual data, the data you are pointing to, lives in an external location, like Azure Data Lake, AWS, DBFS, or Google Cloud Storage, anywhere. Databricks is not going to move or manage your data. So, if you drop the table, your data is safe. Long story short, an external table is something that manages only the metadata, not the data. If we talk about some of the important features of external tables, an external table always refers to external data, that is, data available in an external location, and the metadata is kept within Databricks. For example, the name of the table, the column names, the column data types, all this…
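The instructor's full example is not shown in this excerpt, but a minimal sketch of the idea looks like the following. The key point is that supplying an explicit "path" option when calling saveAsTable makes Spark register an external table: the metastore holds only the metadata, and the files stay at the location you chose. The database name, table name, and storage path below are placeholders, not values from the course.

```python
from pyspark.sql import SparkSession

# On Databricks the SparkSession already exists as `spark`;
# creating it here just keeps the sketch self-contained.
spark = SparkSession.builder.getOrCreate()

# A small sample DataFrame (hypothetical data, not from the course).
df = spark.createDataFrame(
    [(1, "Alice"), (2, "Bob")],
    ["id", "name"],
)

# Writing with an explicit "path" option creates an EXTERNAL table:
# the metastore keeps the schema and table name, while the data files
# live at the external location (this abfss:// path is a placeholder
# for an Azure Data Lake container).
(df.write
   .format("delta")          # Delta is the default table format on Databricks
   .mode("overwrite")
   .option("path", "abfss://container@account.dfs.core.windows.net/tables/people")
   .saveAsTable("demo_db.people_external"))

# Dropping the table removes only the metastore entry; the Delta files
# at the external path are left untouched.
# spark.sql("DROP TABLE demo_db.people_external")
```

By contrast, calling saveAsTable without a path produces a managed table, where Databricks controls both the metadata and the underlying files, and dropping the table deletes the data as well.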
Contents
- What is Spark SQL? (5m 47s)
- Create temporary views in Databricks (10m 17s)
- Create global temp views in Databricks (7m 25s)
- Use Spark SQL transformations (7m)
- Write DataFrames as managed tables in PySpark (9m 26s)
- Write a DataFrame as external table in PySpark (8m 31s)