final class DataFrameNaFunctions extends sql.api.DataFrameNaFunctions
Functionality for working with missing data in DataFrames.
- Annotations
- @Stable()
- Since
1.3.1
Value Members
- final def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def ##: Int
- Definition Classes
- AnyRef → Any
- final def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def asInstanceOf[T0]: T0
- Definition Classes
- Any
- def clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.CloneNotSupportedException]) @IntrinsicCandidate() @native()
- def drop(minNonNulls: Int, cols: Seq[String]): DataFrame
(Scala-specific) Returns a new DataFrame that drops rows containing fewer than minNonNulls non-null and non-NaN values in the specified columns.
- Definition Classes
- DataFrameNaFunctions → DataFrameNaFunctions
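The threshold behaviour of this overload can be sketched as follows. This is a minimal illustration, assuming Spark is on the classpath and the snippet is run as a script or pasted into spark-shell; the DataFrame contents and the column names `id`, `x` and `y` are made up for the example.

```scala
import org.apache.spark.sql.SparkSession

// Local session for illustration only; in spark-shell a session already exists.
val spark = SparkSession.builder().master("local[1]").appName("na-drop-min").getOrCreate()
import spark.implicits._

// Hypothetical data: None becomes null, and NaN also counts as missing.
val df = Seq[(String, Option[Double], Option[Double])](
  ("a", Some(1.0), Some(2.0)),       // 2 usable values in (x, y)
  ("b", Some(1.0), None),            // 1 usable value
  ("c", None, Some(Double.NaN))      // 0 usable values
).toDF("id", "x", "y")

// Keep only rows with at least 2 non-null, non-NaN values among x and y.
val kept = df.na.drop(2, Seq("x", "y"))
kept.show() // only row "a" remains
```

Lowering the threshold to 1 would also keep row "b", since only the columns listed in `cols` are counted.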
- def drop(minNonNulls: Int): DataFrame
Returns a new DataFrame that drops rows containing fewer than minNonNulls non-null and non-NaN values.
- Definition Classes
- DataFrameNaFunctions → DataFrameNaFunctions
- def drop(how: String, cols: Seq[String]): DataFrame
(Scala-specific) Returns a new DataFrame that drops rows containing null or NaN values in the specified columns.
If how is "any", then drop rows containing any null or NaN values in the specified columns. If how is "all", then drop rows only if every specified column is null or NaN for that row.
- Definition Classes
- DataFrameNaFunctions → DataFrameNaFunctions
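The difference between "any" and "all" is easiest to see side by side. The sketch below assumes Spark is on the classpath and is run as a script or in spark-shell; the data and the column names `id`, `score` and `label` are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

// Local session for illustration only; in spark-shell a session already exists.
val spark = SparkSession.builder().master("local[1]").appName("na-drop-how").getOrCreate()
import spark.implicits._

// Hypothetical data: None becomes null in the resulting DataFrame.
val df = Seq[(String, Option[Double], Option[String])](
  ("a", Some(1.0), Some("u")),  // nothing missing
  ("b", None, Some("v")),       // score missing
  ("c", None, None)             // score and label both missing
).toDF("id", "score", "label")

// "any": a row is dropped as soon as one of the listed columns is missing.
val anyDropped = df.na.drop("any", Seq("score", "label")) // keeps "a"

// "all": a row is dropped only when every listed column is missing.
val allDropped = df.na.drop("all", Seq("score", "label")) // keeps "a" and "b"
```

The `id` column is never inspected here because it is not listed in `cols`.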
- def drop(how: String): DataFrame
Returns a new DataFrame that drops rows containing null or NaN values.
If how is "any", then drop rows containing any null or NaN values. If how is "all", then drop rows only if every column is null or NaN for that row.
- Definition Classes
- DataFrameNaFunctions → DataFrameNaFunctions
- def drop(minNonNulls: Int, cols: Array[String]): DataFrame
Returns a new DataFrame that drops rows containing fewer than minNonNulls non-null and non-NaN values in the specified columns.
- Definition Classes
- DataFrameNaFunctions → DataFrameNaFunctions
- def drop(how: String, cols: Array[String]): DataFrame
Returns a new DataFrame that drops rows containing null or NaN values in the specified columns.
If how is "any", then drop rows containing any null or NaN values in the specified columns. If how is "all", then drop rows only if every specified column is null or NaN for that row.
- Definition Classes
- DataFrameNaFunctions → DataFrameNaFunctions
- def drop(cols: Seq[String]): DataFrame
(Scala-specific) Returns a new DataFrame that drops rows containing any null or NaN values in the specified columns.
- Definition Classes
- DataFrameNaFunctions → DataFrameNaFunctions
- def drop(cols: Array[String]): DataFrame
Returns a new DataFrame that drops rows containing any null or NaN values in the specified columns.
- Definition Classes
- DataFrameNaFunctions → DataFrameNaFunctions
- def drop(): DataFrame
Returns a new DataFrame that drops rows containing any null or NaN values.
- Definition Classes
- DataFrameNaFunctions → DataFrameNaFunctions
- def drop(minNonNulls: Option[Int], cols: Seq[String]): Dataset[Row]
- Attributes
- protected
- Definition Classes
- DataFrameNaFunctions → DataFrameNaFunctions
- def drop(minNonNulls: Option[Int]): Dataset[Row]
- Attributes
- protected
- Definition Classes
- DataFrameNaFunctions → DataFrameNaFunctions
- final def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def equals(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef → Any
- def fill(valueMap: Map[String, Any]): DataFrame
(Scala-specific) Returns a new DataFrame that replaces null values.
The key of the map is the column name, and the value of the map is the replacement value. The value must be of one of the following types: Int, Long, Float, Double, String, Boolean. Replacement values are cast to the column data type.
For example, the following replaces null values in column "A" with string "unknown", and null values in column "B" with numeric value 1.0.
df.na.fill(Map( "A" -> "unknown", "B" -> 1.0 ))
- Definition Classes
- DataFrameNaFunctions → DataFrameNaFunctions
- def fill(valueMap: Map[String, Any]): DataFrame
Returns a new DataFrame that replaces null values.
The key of the map is the column name, and the value of the map is the replacement value. The value must be of one of the following types: Integer, Long, Float, Double, String, Boolean. Replacement values are cast to the column data type.
For example, the following replaces null values in column "A" with string "unknown", and null values in column "B" with numeric value 1.0.
import com.google.common.collect.ImmutableMap;
df.na().fill(ImmutableMap.of("A", "unknown", "B", 1.0));
- Definition Classes
- DataFrameNaFunctions → DataFrameNaFunctions
- def fill(value: Boolean, cols: Array[String]): DataFrame
Returns a new DataFrame that replaces null values in specified boolean columns. If a specified column is not a boolean column, it is ignored.
- Definition Classes
- DataFrameNaFunctions → DataFrameNaFunctions
- def fill(value: String, cols: Array[String]): DataFrame
Returns a new DataFrame that replaces null values in specified string columns. If a specified column is not a string column, it is ignored.
- Definition Classes
- DataFrameNaFunctions → DataFrameNaFunctions
- def fill(value: Double, cols: Array[String]): DataFrame
Returns a new DataFrame that replaces null or NaN values in specified numeric columns. If a specified column is not a numeric column, it is ignored.
- Definition Classes
- DataFrameNaFunctions → DataFrameNaFunctions
- def fill(value: Long, cols: Array[String]): DataFrame
Returns a new DataFrame that replaces null or NaN values in specified numeric columns. If a specified column is not a numeric column, it is ignored.
- Definition Classes
- DataFrameNaFunctions → DataFrameNaFunctions
- def fill(value: Boolean, cols: Seq[String]): DataFrame
(Scala-specific) Returns a new DataFrame that replaces null values in specified boolean columns. If a specified column is not a boolean column, it is ignored.
- Definition Classes
- DataFrameNaFunctions → DataFrameNaFunctions
- def fill(value: Boolean): DataFrame
Returns a new DataFrame that replaces null values in boolean columns with value.
- Definition Classes
- DataFrameNaFunctions → DataFrameNaFunctions
- def fill(value: String, cols: Seq[String]): DataFrame
(Scala-specific) Returns a new DataFrame that replaces null values in specified string columns. If a specified column is not a string column, it is ignored.
- Definition Classes
- DataFrameNaFunctions → DataFrameNaFunctions
- def fill(value: Double, cols: Seq[String]): DataFrame
(Scala-specific) Returns a new DataFrame that replaces null or NaN values in specified numeric columns. If a specified column is not a numeric column, it is ignored.
- Definition Classes
- DataFrameNaFunctions → DataFrameNaFunctions
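The "ignored if not numeric" rule can be sketched as below. The example assumes Spark is on the classpath and is run as a script or in spark-shell; the data and the column names `height` and `name` are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

// Local session for illustration only; in spark-shell a session already exists.
val spark = SparkSession.builder().master("local[1]").appName("na-fill-double").getOrCreate()
import spark.implicits._

// Hypothetical data with a null and a NaN in the numeric column.
val df = Seq[(Option[Double], Option[String])](
  (Some(1.5), Some("x")),
  (None, None),
  (Some(Double.NaN), Some("y"))
).toDF("height", "name")

// "name" is listed but is a string column, so the Double fill leaves it untouched.
val filled = df.na.fill(0.0, Seq("height", "name"))
filled.show() // height: 1.5, 0.0, 0.0; name still contains one null
```

Filling the string column as well would take a second call with a String value, for example `filled.na.fill("unknown", Seq("name"))`.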
- def fill(value: Long, cols: Seq[String]): DataFrame
(Scala-specific) Returns a new DataFrame that replaces null or NaN values in specified numeric columns. If a specified column is not a numeric column, it is ignored.
- Definition Classes
- DataFrameNaFunctions → DataFrameNaFunctions
- def fill(value: String): DataFrame
Returns a new DataFrame that replaces null values in string columns with value.
- Definition Classes
- DataFrameNaFunctions → DataFrameNaFunctions
- def fill(value: Double): DataFrame
Returns a new DataFrame that replaces null or NaN values in numeric columns with value.
- Definition Classes
- DataFrameNaFunctions → DataFrameNaFunctions
- def fill(value: Long): DataFrame
Returns a new DataFrame that replaces null or NaN values in numeric columns with value.
- Definition Classes
- DataFrameNaFunctions → DataFrameNaFunctions
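Because each single-value overload only touches columns of a matching kind, cleaning a mixed-type DataFrame usually takes one call per kind. The sketch below chains three of them; it assumes Spark is on the classpath and is run as a script or in spark-shell, and the data and the column names `visits`, `city` and `active` are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

// Local session for illustration only; in spark-shell a session already exists.
val spark = SparkSession.builder().master("local[1]").appName("na-fill-chain").getOrCreate()
import spark.implicits._

// Hypothetical data: the second row is entirely null.
val df = Seq[(Option[Int], Option[String], Option[Boolean])](
  (Some(3), Some("Oslo"), Some(true)),
  (None, None, None)
).toDF("visits", "city", "active")

val cleaned = df
  .na.fill(0L)         // numeric columns only
  .na.fill("unknown")  // string columns only
  .na.fill(false)      // boolean columns only

cleaned.show() // second row becomes 0, "unknown", false
```

The same result could be expressed in one call with the `Map`-based `fill` overload when per-column replacement values are needed.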
- def fillMap(values: Seq[(String, Any)]): DataFrame
- Attributes
- protected
- Definition Classes
- DataFrameNaFunctions → DataFrameNaFunctions
- final def getClass(): Class[_ <: AnyRef]
- Definition Classes
- AnyRef → Any
- Annotations
- @IntrinsicCandidate() @native()
- def hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @IntrinsicCandidate() @native()
- final def isInstanceOf[T0]: Boolean
- Definition Classes
- Any
- final def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- final def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @IntrinsicCandidate() @native()
- final def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @IntrinsicCandidate() @native()
- def replace[T](cols: Array[String], replacement: Map[T, T]): DataFrame
Replaces values matching keys in replacement map with the corresponding values.
import com.google.common.collect.ImmutableMap;
// Replaces all occurrences of 1.0 with 2.0 in column "height" and "weight".
df.na().replace(new String[] {"height", "weight"}, ImmutableMap.of(1.0, 2.0));
// Replaces all occurrences of "UNKNOWN" with "unnamed" in column "firstname" and "lastname".
df.na().replace(new String[] {"firstname", "lastname"}, ImmutableMap.of("UNKNOWN", "unnamed"));
- cols
list of columns to apply the value replacement. If cols is "*", replacement is applied on all string, numeric or boolean columns.
- replacement
value replacement map. Key and value of replacement map must have the same type, and can only be doubles, strings or booleans. The map value can have nulls.
- Definition Classes
- DataFrameNaFunctions → DataFrameNaFunctions
- def replace[T](col: String, replacement: Map[T, T]): DataFrame
Replaces values matching keys in replacement map with the corresponding values.
import com.google.common.collect.ImmutableMap;
// Replaces all occurrences of 1.0 with 2.0 in column "height".
df.na().replace("height", ImmutableMap.of(1.0, 2.0));
// Replaces all occurrences of "UNKNOWN" with "unnamed" in column "name".
df.na().replace("name", ImmutableMap.of("UNKNOWN", "unnamed"));
// Replaces all occurrences of "UNKNOWN" with "unnamed" in all string columns.
df.na().replace("*", ImmutableMap.of("UNKNOWN", "unnamed"));
- col
name of the column to apply the value replacement. If col is "*", replacement is applied on all string, numeric or boolean columns.
- replacement
value replacement map. Key and value of replacement map must have the same type, and can only be doubles, strings or booleans. The map value can have nulls.
- Definition Classes
- DataFrameNaFunctions → DataFrameNaFunctions
- def replace[T](cols: Seq[String], replacement: Map[T, T]): DataFrame
(Scala-specific) Replaces values matching keys in replacement map.
// Replaces all occurrences of 1.0 with 2.0 in column "height" and "weight".
df.na.replace("height" :: "weight" :: Nil, Map(1.0 -> 2.0))
// Replaces all occurrences of "UNKNOWN" with "unnamed" in column "firstname" and "lastname".
df.na.replace("firstname" :: "lastname" :: Nil, Map("UNKNOWN" -> "unnamed"))
- cols
list of columns to apply the value replacement. If cols is "*", replacement is applied on all string, numeric or boolean columns.
- replacement
value replacement map. Key and value of replacement map must have the same type, and can only be doubles, strings or booleans. The map value can have nulls.
- Definition Classes
- DataFrameNaFunctions → DataFrameNaFunctions
- def replace[T](col: String, replacement: Map[T, T]): DataFrame
(Scala-specific) Replaces values matching keys in replacement map.
// Replaces all occurrences of 1.0 with 2.0 in column "height".
df.na.replace("height", Map(1.0 -> 2.0))
// Replaces all occurrences of "UNKNOWN" with "unnamed" in column "name".
df.na.replace("name", Map("UNKNOWN" -> "unnamed"))
// Replaces all occurrences of "UNKNOWN" with "unnamed" in all string columns.
df.na.replace("*", Map("UNKNOWN" -> "unnamed"))
- col
name of the column to apply the value replacement. If col is "*", replacement is applied on all string, numeric or boolean columns.
- replacement
value replacement map. Key and value of replacement map must have the same type, and can only be doubles, strings or booleans. The map value can have nulls.
- Definition Classes
- DataFrameNaFunctions → DataFrameNaFunctions
- final def synchronized[T0](arg0: => T0): T0
- Definition Classes
- AnyRef
- def toString(): String
- Definition Classes
- AnyRef → Any
- final def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException]) @native()
- final def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
Deprecated Value Members
- def finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.Throwable]) @Deprecated
- Deprecated
(Since version 9)