handle_missing
Functions
| Name | Description |
|---|---|
| handle_missing | Handles missing data in a pandas DataFrame. |
handle_missing
handle_missing.handle_missing(df, strategy='drop', columns=None)
Handles missing data in a pandas DataFrame.
Returns a pandas DataFrame in which missing values have been handled according to the chosen strategy.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| df | pandas.DataFrame | Input DataFrame | required |
| strategy | str | The strategy to use for handling missing values. Permitted for numeric columns: mean, median, max, min, mode, drop. Permitted for all other columns: mode, drop. | 'drop' |
| columns | list | Columns in which missing values are to be handled. If None, all columns are handled. | None |
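
The strategies in the table above can be illustrated with plain pandas calls. The following is a minimal sketch, not the library's implementation: `fill_missing_sketch` is a hypothetical stand-in, and it assumes that drop removes rows with a missing value in any selected column and that mode uses the first value returned by `Series.mode()`.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical stand-in, NOT the library's implementation: it only illustrates
# what each documented strategy could correspond to in plain pandas.
NUMERIC_ONLY = {"mean", "median", "max", "min"}


def fill_missing_sketch(df, strategy="drop", columns=None):
    cols = list(df.columns) if columns is None else columns
    if strategy == "drop":
        # Assumption: "drop" removes rows with a missing value in any selected column.
        return df.dropna(subset=cols)
    out = df.copy()
    for col in cols:
        if strategy in NUMERIC_ONLY and not is_numeric_dtype(out[col]):
            raise TypeError(f"strategy {strategy!r} needs a numeric column, got {col!r}")
        if strategy == "mode":
            # Series.mode() can return several values; take the first one.
            fill_value = out[col].mode().iloc[0]
        else:
            # Series.mean/median/max/min skip NaN by default.
            fill_value = getattr(out[col], strategy)()
        out[col] = out[col].fillna(fill_value)
    return out


df = pd.DataFrame({"A": [1, 1, 2], "B": [np.nan, 3, 4], "C": ["x", None, "x"]})
filled = fill_missing_sketch(df, strategy="mean", columns=["B"])
dropped = fill_missing_sketch(df, strategy="drop", columns=["B"])
mode_filled = fill_missing_sketch(df, strategy="mode", columns=["C"])
```

In this sketch the numeric-only strategies are rejected for the string column C, which mirrors the split between numeric and other columns described for the strategy parameter.
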
Returns
| Name | Type | Description |
|---|---|---|
| | pandas.DataFrame | DataFrame in which missing values have been handled. |
Raises
| Name | Type | Description |
|---|---|---|
| | TypeError | If df is not a pandas DataFrame. If strategy is not a string. If columns is not a list or None. If strategy cannot be used for the dtype of a column. If the dtype of a column is not supported. |
| | ValueError | If strategy is not a permitted value. If a column is not in df.columns. If a column contains only NaN. |
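
As a rough illustration of the conditions listed above, the checks might be ordered as in the sketch below. `validate_args_sketch` is a hypothetical helper, not part of the documented API. The check for unsupported dtypes is omitted because this page does not say which dtypes are supported, and the order in which the real function performs its checks is an assumption.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical helper, not part of the documented API: one way the documented
# error conditions could be checked before any values are filled or dropped.
NUMERIC_STRATEGIES = {"mean", "median", "max", "min", "mode", "drop"}
OTHER_STRATEGIES = {"mode", "drop"}


def validate_args_sketch(df, strategy="drop", columns=None):
    if not isinstance(df, pd.DataFrame):
        raise TypeError("df must be a pandas DataFrame")
    if not isinstance(strategy, str):
        raise TypeError("strategy must be a string")
    if columns is not None and not isinstance(columns, list):
        raise TypeError("columns must be a list or None")
    if strategy not in NUMERIC_STRATEGIES:
        raise ValueError(f"strategy {strategy!r} is not permitted")
    for col in (list(df.columns) if columns is None else columns):
        if col not in df.columns:
            raise ValueError(f"column {col!r} is not in df.columns")
        if df[col].isna().all():
            raise ValueError(f"column {col!r} only contains NaN")
        if not is_numeric_dtype(df[col]) and strategy not in OTHER_STRATEGIES:
            raise TypeError(f"strategy {strategy!r} cannot be used for column {col!r}")


df = pd.DataFrame({"A": [1, 2], "B": ["x", None]})
try:
    validate_args_sketch(df, strategy="mean")  # column B is not numeric
except TypeError as err:
    message = str(err)
```
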
Examples
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({
...     "A": [1, 1, 2],
...     "B": [np.nan, 3, 4]
... })
>>> handle_missing(df)
   A  B
1  1  3
2  2  4
>>> handle_missing(df, strategy='mean')
   A    B
0  1  3.5
1  1  3.0
2  2  4.0