eda.eda

eda.eda(X, y)

Perform exploratory data analysis on a single numeric column of a DataFrame.

This function computes descriptive statistics for a specified column and generates a histogram of its distribution. It validates its inputs and raises an informative error when given an invalid argument or a non-numeric column.

Parameters

Name	Type	Description	Default
X	pandas.DataFrame	Input DataFrame containing the column to analyze, typically the target column.	required
y	str	Name of the column in `X` for which summary statistics and a histogram will be generated. The column must exist in `X` and contain numeric values.	required

Returns

Name	Type	Description
summary_stats	pandas.Series	Descriptive statistics for column `y`, as returned by pandas.Series.describe.
histogram	matplotlib.axes.Axes	Matplotlib Axes object containing the histogram of column `y`.
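Because `summary_stats` is described as the output of pandas.Series.describe, its entries can be read by label. The sketch below calls `describe` directly on the sample column as a stand-in for the first return value; the variable names are illustrative only.

```python
import pandas as pd

data = pd.DataFrame({"val": [1, 2, 2, 3, 3, 3, 4, 4, 5]})

# Stand-in for the first return value of eda(data, "val"), which this
# reference describes as the output of pandas.Series.describe.
summary_stats = data["val"].describe()

# Individual statistics are looked up by their index label.
n_obs = summary_stats["count"]
mean = summary_stats["mean"]
median = summary_stats["50%"]
value_range = summary_stats["max"] - summary_stats["min"]
```

For a numeric column the index labels are `count`, `mean`, `std`, `min`, `25%`, `50%`, `75%`, and `max`.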

Raises

Name	Type	Description
	TypeError	If `X` is not a pandas DataFrame, if `y` is not a string, or if column `y` is not numeric.
	KeyError	If column `y` does not exist in `X`.
	ValueError	If column `y` is empty or contains only missing values (NaNs).
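The checks listed above could be implemented roughly as follows. This is a minimal sketch, not the function's actual source: the helper name `validate_eda_inputs`, the error messages, and the order of the checks are all assumptions made for illustration.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def validate_eda_inputs(X, y):
    """Hypothetical helper mirroring the documented errors (not the real code)."""
    if not isinstance(X, pd.DataFrame):
        raise TypeError("X must be a pandas DataFrame")
    if not isinstance(y, str):
        raise TypeError("y must be a string column name")
    if y not in X.columns:
        raise KeyError(f"column {y!r} not found in X")

    column = X[y]
    # Assumed order: emptiness and all-NaN checks run before the dtype check.
    if column.empty:
        raise ValueError(f"column {y!r} is empty")
    if column.isna().all():
        raise ValueError(f"column {y!r} contains only missing values")
    if not is_numeric_dtype(column):
        raise TypeError(f"column {y!r} must be numeric")
    return column
```

A caller that wants to handle bad input gracefully can catch `TypeError`, `KeyError`, and `ValueError` separately, since each one signals a different kind of problem.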

Notes

This function creates a matplotlib plot but does not display it. To render the histogram, call matplotlib.pyplot.show() after invoking this function.
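When no display is available (for example in a script or on a server), the returned Axes can be written to a file or buffer instead of shown. The sketch below builds a stand-in Axes with `ax.hist`, since the plotting calls used inside `eda` are not documented here; the Agg backend and the in-memory buffer are choices made for this example.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend: nothing is displayed
import matplotlib.pyplot as plt
import pandas as pd

data = pd.DataFrame({"val": [1, 2, 2, 3, 3, 3, 4, 4, 5]})

# Stand-in for the Axes returned by eda(); assumed, not the real plotting code.
fig, ax = plt.subplots()
ax.hist(data["val"].dropna())
ax.set_title("Target Distribution")

# Save through the Axes' parent figure instead of calling plt.show().
buffer = io.BytesIO()
ax.figure.savefig(buffer, format="png")
png_bytes = buffer.getvalue()

# Release the figure once it has been saved.
plt.close(ax.figure)
```

Passing a file path such as `"histogram.png"` to `savefig` in place of the buffer writes the image to disk.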

Examples

>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> from eda import eda  # assumes the function is importable from a module named eda
>>> # Create sample data
>>> data = pd.DataFrame({'val': [1, 2, 2, 3, 3, 3, 4, 4, 5]})
>>> # Run EDA
>>> stats, ax = eda(data, 'val')
>>> # The returned Axes (ax) can be used to customize the plot
>>> ax.set_title("Target Distribution")
>>> # Uncomment the next line to display the plot
>>> # plt.show()