pyspark.sql.DataFrameWriter

class pyspark.sql.DataFrameWriter(df)

Interface used to write a DataFrame to external storage systems (e.g. file systems and key-value stores). Use DataFrame.write to access this.

New in version 1.4.0.

Changed in version 3.4.0: Supports Spark Connect.
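As a minimal sketch of the access pattern described above, the helper below obtains the writer from `DataFrame.write` and uses it to save Parquet output. The function name `write_parquet_overwrite` and the choice of the `"overwrite"` save mode are illustrative assumptions, not part of the API.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:  # imported for type hints only, not needed at runtime
    from pyspark.sql import DataFrame


def write_parquet_overwrite(df: "DataFrame", path: str) -> None:
    """Hypothetical helper: save df as Parquet at path, replacing existing data."""
    writer = df.write                  # DataFrame.write returns a DataFrameWriter
    writer = writer.mode("overwrite")  # mode() returns the writer, so calls can chain
    writer.parquet(path)               # performs the write
```

With an active `SparkSession` bound to `spark`, a call such as `write_parquet_overwrite(spark.range(5), "/tmp/ids_parquet")` would exercise it; the path is a placeholder.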

Methods

`bucketBy`(numBuckets, col, *cols)	Buckets the output by the given columns.
`clusterBy`(*cols)	Clusters the data by the given columns to optimize query performance.
`csv`(path[, mode, compression, sep, quote, ...])	Saves the content of the `DataFrame` in CSV format at the specified path.
`format`(source)	Specifies the underlying output data source.
`insertInto`(tableName[, overwrite])	Inserts the content of the `DataFrame` to the specified table.
`jdbc`(url, table[, mode, properties])	Saves the content of the `DataFrame` to an external database table via JDBC.
`json`(path[, mode, compression, dateFormat, ...])	Saves the content of the `DataFrame` in JSON format (JSON Lines text format or newline-delimited JSON) at the specified path.
`mode`(saveMode)	Specifies the behavior when data or table already exists.
`option`(key, value)	Adds an output option for the underlying data source.
`options`(**options)	Adds output options for the underlying data source.
`orc`(path[, mode, partitionBy, compression])	Saves the content of the `DataFrame` in ORC format at the specified path.
`parquet`(path[, mode, partitionBy, compression])	Saves the content of the `DataFrame` in Parquet format at the specified path.
`partitionBy`(*cols)	Partitions the output by the given columns on the file system.
`save`([path, format, mode, partitionBy])	Saves the contents of the `DataFrame` to a data source.
`saveAsTable`(name[, format, mode, partitionBy])	Saves the content of the `DataFrame` as the specified table.
`sortBy`(col, *cols)	Sorts the output in each bucket by the given columns on the file system.
`text`(path[, compression, lineSep])	Saves the content of the `DataFrame` in a text file at the specified path.
`xml`(path[, rowTag, mode, attributePrefix, ...])	Saves the content of the `DataFrame` in XML format at the specified path.
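The methods in the table are typically chained, with a terminal call such as `save` or `saveAsTable` performing the write. The sketch below shows two such chains; the helper names and the column names `country`, `user_id`, and `event_time` are hypothetical, chosen only for illustration.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:  # imported for type hints only, not needed at runtime
    from pyspark.sql import DataFrame


def append_partitioned_csv(df: "DataFrame", path: str) -> None:
    """Hypothetical helper: append df as CSV files, one directory per country."""
    (
        df.write.format("csv")       # choose the output data source
        .option("header", True)      # data-source option: write a header row
        .mode("append")              # add to existing data instead of failing
        .partitionBy("country")      # assumed column; splits output on disk
        .save(path)                  # terminal call: performs the write
    )


def save_bucketed_table(df: "DataFrame", table_name: str) -> None:
    """Hypothetical helper: save df as a bucketed, sorted table."""
    (
        df.write.bucketBy(4, "user_id")  # 4 buckets on an assumed column
        .sortBy("event_time")            # sort rows within each bucket
        .mode("overwrite")               # replace the table if it exists
        .saveAsTable(table_name)         # terminal call: performs the write
    )
```

Both helpers expect a `DataFrame` that contains the assumed columns. The bucketed variant ends in `saveAsTable` because bucketing metadata is recorded with the table, so it is not combined here with a path-based `save`.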