pyspark.sql.DataFrameWriter
class pyspark.sql.DataFrameWriter(df)
Interface used to write a DataFrame to external storage systems (e.g. file systems, key-value stores, etc.). Use DataFrame.write to access this.
New in version 1.4.0.
Changed in version 3.4.0: Supports Spark Connect.
Methods
bucketBy(numBuckets, col, *cols)
Buckets the output by the given columns.
clusterBy(*cols)
Clusters the data by the given columns to optimize query performance.
csv(path[, mode, compression, sep, quote, ...])
Saves the content of the DataFrame in CSV format at the specified path.
format(source)
Specifies the underlying output data source.
insertInto(tableName[, overwrite])
Inserts the content of the DataFrame to the specified table.
jdbc(url, table[, mode, properties])
Saves the content of the DataFrame to an external database table via JDBC.
json(path[, mode, compression, dateFormat, ...])
Saves the content of the DataFrame in JSON format (JSON Lines text format or newline-delimited JSON) at the specified path.
mode(saveMode)
Specifies the behavior when data or table already exists.
option(key, value)
Adds an output option for the underlying data source.
options(**options)
Adds output options for the underlying data source.
orc(path[, mode, partitionBy, compression])
Saves the content of the DataFrame in ORC format at the specified path.
parquet(path[, mode, partitionBy, compression])
Saves the content of the DataFrame in Parquet format at the specified path.
partitionBy(*cols)
Partitions the output by the given columns on the file system.
save([path, format, mode, partitionBy])
Saves the contents of the DataFrame to a data source.
saveAsTable(name[, format, mode, partitionBy])
Saves the content of the DataFrame as the specified table.
sortBy(col, *cols)
Sorts the output in each bucket by the given columns on the file system.
text(path[, compression, lineSep])
Saves the content of the DataFrame in a text file at the specified path.
xml(path[, rowTag, mode, attributePrefix, ...])
Saves the content of the DataFrame in XML format at the specified path.