pyspark.sql.DataFrameWriter

class pyspark.sql.DataFrameWriter(df)

Interface used to write a DataFrame to external storage systems (e.g. file systems, key-value stores, etc.). Use DataFrame.write to access this.

New in version 1.4.0.

Changed in version 3.4.0: Supports Spark Connect.
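
The methods listed below are typically chained on the writer returned by DataFrame.write. The following is a minimal sketch of that pattern, not a definitive recipe: the helper name write_csv_with_header and its path argument are hypothetical, and the "header" option is specific to the CSV data source.

```python
def write_csv_with_header(df, path):
    """Write `df` (assumed to be a pyspark.sql.DataFrame) as CSV.

    Hypothetical helper for illustration; existing output at `path` is replaced.
    """
    (
        df.write                   # DataFrame.write returns a DataFrameWriter
        .format("csv")             # choose the underlying output data source
        .mode("overwrite")         # behavior when data already exists at `path`
        .option("header", "true")  # data-source option: emit a header row
        .save(path)                # perform the write
    )
```

With an active SparkSession bound to a variable such as spark, a call might look like write_csv_with_header(spark.range(3), "/tmp/demo_csv"), where the path is a placeholder.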

Methods

bucketBy(numBuckets, col, *cols)

Buckets the output by the given columns.

clusterBy(*cols)

Clusters the data by the given columns to optimize query performance.

csv(path[, mode, compression, sep, quote, ...])

Saves the content of the DataFrame in CSV format at the specified path.

format(source)

Specifies the underlying output data source.

insertInto(tableName[, overwrite])

Inserts the content of the DataFrame into the specified table.

jdbc(url, table[, mode, properties])

Saves the content of the DataFrame to an external database table via JDBC.

json(path[, mode, compression, dateFormat, ...])

Saves the content of the DataFrame in JSON format (JSON Lines text format or newline-delimited JSON) at the specified path.

mode(saveMode)

Specifies the behavior when data or table already exists.

option(key, value)

Adds an output option for the underlying data source.

options(**options)

Adds output options for the underlying data source.

orc(path[, mode, partitionBy, compression])

Saves the content of the DataFrame in ORC format at the specified path.

parquet(path[, mode, partitionBy, compression])

Saves the content of the DataFrame in Parquet format at the specified path.

partitionBy(*cols)

Partitions the output by the given columns on the file system.

save([path, format, mode, partitionBy])

Saves the contents of the DataFrame to a data source.

saveAsTable(name[, format, mode, partitionBy])

Saves the content of the DataFrame as the specified table.

sortBy(col, *cols)

Sorts the output in each bucket by the given columns on the file system.

text(path[, compression, lineSep])

Saves the content of the DataFrame in a text file at the specified path.

xml(path[, rowTag, mode, attributePrefix, ...])

Saves the content of the DataFrame in XML format at the specified path.
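
As a rough illustration of how the layout methods above combine, the sketch below pairs partitionBy with parquet and pairs bucketBy and sortBy with saveAsTable. Both helper names and all of their arguments are hypothetical; the bucketed variant assumes a session with a catalog that can hold the table.

```python
def write_partitioned_parquet(df, path, partition_cols):
    """Append `df` to `path` as Parquet, partitioned by `partition_cols`.

    Hypothetical helper; `df` is assumed to be a pyspark.sql.DataFrame.
    """
    df.write.mode("append").partitionBy(*partition_cols).parquet(path)


def save_bucketed_table(df, table_name, num_buckets, bucket_col):
    """Save `df` as a table bucketed, and sorted within buckets, by `bucket_col`.

    Hypothetical helper; an existing table of the same name is overwritten.
    """
    (
        df.write
        .bucketBy(num_buckets, bucket_col)  # split output into a fixed number of buckets
        .sortBy(bucket_col)                 # sort rows inside each bucket
        .mode("overwrite")
        .saveAsTable(table_name)            # bucketed output is written as a table
    )
```

A call might look like write_partitioned_parquet(df, "/tmp/events", ["year", "month"]) or save_bucketed_table(df, "events_bucketed", 8, "user_id"), where the path, table name, and column names are placeholders.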