recover_partitions(): Recovers all the partitions in the directory of a
table and updates the catalog. This only works for partitioned tables, not
for unpartitioned tables or views.
refresh_by_path(): Invalidates and refreshes all the cached data (and the
associated metadata) for any Dataset that contains the given data source
path. Path matching is by prefix, i.e. "/" would invalidate everything that
is cached.
refresh_table(): Invalidates and refreshes all the cached data and
metadata of the given table. For performance reasons, Spark SQL or the
external data source library it uses might cache certain metadata about a
table, such as the location of blocks. When those change outside of Spark
SQL, users should call this function to invalidate the cache. If this table
is cached as an InMemoryRelation, the original cached version is dropped and
the new version is cached lazily.
recover_partitions(sc, table)
refresh_by_path(sc, path)
refresh_table(sc, table)
NULL, invisibly. These functions are mostly called for their side effects.
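As a rough illustration of how the three functions might be combined after files change outside of Spark SQL, the sketch below wraps them in one helper. It is not part of the documented API. The helper name refresh_after_external_write is hypothetical, and the sketch assumes these functions are exported by a package named catalog and that sc is an open sparklyr connection.

```r
# Hypothetical helper: re-sync Spark's view of a partitioned table after
# its files were changed by a process outside Spark SQL.
# Assumptions: the functions live in a package called `catalog`, `sc` is an
# open Spark connection, and `table` is an existing partitioned table.
refresh_after_external_write <- function(sc, table, path = NULL) {
  # Register partition directories that the catalog does not know about yet.
  catalog::recover_partitions(sc, table)

  # Invalidate cached data and metadata for the table itself.
  catalog::refresh_table(sc, table)

  # Optionally invalidate any cached Dataset read from under `path`.
  # Matching is by prefix, so a short path such as "/" affects everything.
  if (!is.null(path)) {
    catalog::refresh_by_path(sc, path)
  }

  # Mirror the underlying functions, which are called for side effects.
  invisible(NULL)
}
```

A call could look like refresh_after_external_write(sc, "sales_by_day", "/data/sales_by_day"), where both the table name and the path are placeholders. Because path matching is by prefix, passing the narrowest path that covers the changed files avoids invalidating unrelated cached data.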