Drop Missing Values - Maple Help

DataFrame/FillMissing

fill missing values in a DataFrame

DataFrame/DropMissing

drop missing values in a DataFrame

Calling Sequence

 FillMissing(df, fill, missopt, methopt)
 DropMissing(df, missopt)

Parameters

 df - a DataFrame object
 fill - (optional) a value to replace missing values
 missopt - (optional) equation of the form missing = m1, where m1 can be any expression
 methopt - (optional) equation of the form method = m2, where m2 can be forward or "forward" or backward or "backward"

Description

 • The FillMissing command creates a copy of a DataFrame where missing values are replaced with a value of your choice.
 • The DropMissing command creates a copy of a DataFrame where columns that contain missing values are removed.
 • For both commands, by default, a missing value is determined by the data type of the column:
 – For floating point data types, the missing value is the appropriate version of undefined. For example, columns with data type float[8] use Float(undefined).
 – For hardware integer data types integer[$k$], where $k$ is 1, 2, 4, or 8, the missing value is 0. (Such columns cannot store non-integer values, so one cannot use a version of undefined here.)
 – For string columns, the empty string is the missing value.
 – For columns of type truefalseFAIL and boolean_constant, the missing value is FAIL.
 – For all other data types, the missing value is undefined.
 • In order to use a different value as the missing value, you can use the option missing = m1. If you supply this option, then any occurrence of m1 will be considered missing.
 • For the FillMissing command, by default, the value used to replace a missing value depends on the data type of the column it occurs in:
 – For all numeric data types, including floating point and integer, the default value is 0.
 – For the data type string, the default value is the empty string.
 – For the data types truefalse, truefalseFAIL, boolean, and boolean_constant, the default value is false.
 • The fill argument, if it is specified, overrides the value used to replace a missing value.
 • If the method option is used, then each missing value is replaced either by the last non-missing value before it (with method = forward or method = "forward"), or by the first non-missing value after it (with method = backward or method = "backward"). If no such value is available (for example, if the first value is missing and method = forward is specified), then the replacement value is determined in the same way as if the method option were not specified.
 • For columns of hardware integer type and string columns, the default missing value and the default fill value are the same. Using the FillMissing command on such columns has no effect, unless one or more of the fill, missopt, and methopt arguments are specified.
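
The interaction between the method option and the default fill values can be sketched as follows. This is a minimal illustration rather than output taken from Maple: the DataFrame gaps and its labels S, F, r1, r2, and r3 are hypothetical names chosen for this sketch, and the comments describe what the rules above imply, not verified results.

```maple
# Hypothetical two-column frame, built in the same style as the example on this page.
# Column S holds strings (default missing value: ""); column F holds floats
# (default missing value: Float(undefined)).
gaps := DataFrame([["x", "", "z"], [Float(undefined), 2.5, Float(undefined)]],
                  columns = [S, F], rows = [r1, r2, r3],
                  datatypes = [string, float[8]]);

# No fill, missing, or method argument: by the rules above, column S should come
# back unchanged, because "" is both its missing value and its default fill value.
FillMissing(gaps);

# Forward fill: the entry of F in row r3 has an earlier non-missing value (2.5).
# The entry in row r1 has none, so it falls back to the default fill value
# for its data type.
FillMissing(gaps, method = forward);

# Backward fill mirrors this: row r1 can take 2.5 from row r2, while row r3
# has no later value and falls back to the default.
FillMissing(gaps, method = backward);
```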

Examples

 > $\mathrm{df}≔\mathrm{DataFrame}\left(\left[\left[8,9,0\right],\left[8,9,0\right],\left[8.,9.,Float\left(\mathrm{undefined}\right)\right]\right],\mathrm{columns}=\left[A,B,C\right],\mathrm{rows}=\left[a,b,c\right],\mathrm{datatypes}=\left[\mathrm{anything},\mathrm{integer}\left[4\right],\mathrm{float}\left[8\right]\right]\right)$
 ${\mathrm{df}}{≔}\left[\begin{array}{cccc}{}& {A}& {B}& {C}\\ {a}& {8}& {8}& {8.}\\ {b}& {9}& {9}& {9.}\\ {c}& {0}& {0}& {Float}{}\left({\mathrm{undefined}}\right)\end{array}\right]$ (1)

Column $A$ has declared type anything, and $B$ has type integer[4]. This means that, for a value in column $A$ to be considered missing, it would have to be undefined; in $B$, the value considered missing is 0. Consequently, the DropMissing command will remove column $B$, but not column $A$. Column $C$ also contains the default missing value for its data type, float[8], and consequently it is also removed.

 > $\mathrm{DropMissing}\left(\mathrm{df}\right)$
 $\left[\begin{array}{cc}{}& {A}\\ {a}& {8}\\ {b}& {9}\\ {c}& {0}\end{array}\right]$ (2)

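With FillMissing, only column $C$ changes visibly by default. The 0 in column $B$ is missing but is replaced with the default fill value 0, and the 0 in column $A$ is not considered missing.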

 > $\mathrm{FillMissing}\left(\mathrm{df}\right)$
 $\left[\begin{array}{cccc}{}& {A}& {B}& {C}\\ {a}& {8}& {8}& {8.}\\ {b}& {9}& {9}& {9.}\\ {c}& {0}& {0}& {0.}\end{array}\right]$ (3)

If we specify a missing value manually, for example, 8, then DropMissing removes columns containing that exact value. This applies to columns $A$ and $B$, but not to $C$, which contains the floating point value 8. rather than the integer 8.

 > $\mathrm{DropMissing}\left(\mathrm{df},\mathrm{missing}=8\right)$
 $\left[\begin{array}{cc}{}& {C}\\ {a}& {8.}\\ {b}& {9.}\\ {c}& {Float}{}\left({\mathrm{undefined}}\right)\end{array}\right]$ (4)

For FillMissing, we can specify the value to be used for replacing missing values. This is the fill argument. In the following example, we specify the value 6. This is stored in the last entries of columns $B$ and $C$; in column $C$, it is automatically changed to the floating point value 6., because of the data type of that column.

 > $\mathrm{FillMissing}\left(\mathrm{df},6\right)$
 $\left[\begin{array}{cccc}{}& {A}& {B}& {C}\\ {a}& {8}& {8}& {8.}\\ {b}& {9}& {9}& {9.}\\ {c}& {0}& {6}& {6.}\end{array}\right]$ (5)
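
The fill argument can also be combined with the missing option. The call below is a sketch that assumes df is the DataFrame defined at the start of this section; it is not one of the original examples, and no output is shown because it has not been checked against Maple.

```maple
# Assumes df is the DataFrame defined at the top of this Examples section.
# Here 0 is declared to be the missing value, and 6 is the replacement.
# By the rule for the missing option, every occurrence of 0 counts as missing,
# so one would expect the 0 in column A (declared type anything) to be
# replaced as well as the 0 in column B.
FillMissing(df, 6, missing = 0);
```

Results (4) and (6) suggest that once missing = m1 is supplied, the default missing value of a column is no longer treated as missing, so the Float(undefined) entry in column $C$ may well be left in place by this call.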

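If we specify the option method = backward, then missing values are replaced with later values. In the following example, 9 is designated as the missing value, so the entries in row $b$ are replaced with the entries from row $c$.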

 > $\mathrm{FillMissing}\left(\mathrm{df},'\mathrm{missing}'=9,'\mathrm{method}'='\mathrm{backward}'\right)$
 $\left[\begin{array}{cccc}{}& {A}& {B}& {C}\\ {a}& {8}& {8}& {8.}\\ {b}& {0}& {0}& {Float}{}\left({\mathrm{undefined}}\right)\\ {c}& {0}& {0}& {Float}{}\left({\mathrm{undefined}}\right)\end{array}\right]$ (6)
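
For comparison, the same call can be made with method = forward. This sketch assumes df is the DataFrame defined at the start of this section; following the description of the method option, each 9 in row $b$ should then be replaced with the value above it in row $a$, although that result is not reproduced here.

```maple
# Assumes df is the DataFrame defined at the top of this Examples section.
# The quotes around 'missing', 'method', and 'forward' are unevaluation quotes;
# they keep these names intact even if they have been assigned values elsewhere
# in the session.
FillMissing(df, 'missing' = 9, 'method' = 'forward');

# Per the Parameters section, the method may also be given as a string,
# which needs no protection against assignment.
FillMissing(df, 'missing' = 9, 'method' = "forward");
```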

Compatibility

 • The DataFrame/FillMissing and DataFrame/DropMissing commands were introduced in Maple 2016.