assign()
|
Assigns data points to k clusters |
avg_sil_score()
|
Takes a vector of cluster labels and a corresponding array of points and
returns the average silhouette score across all clusters. |
calc_centers()
|
Calculates center coordinates of each cluster |
find_elbow()
|
Takes unlabeled, scaled data and runs the KMeans clustering algorithm for values of K up to min(10, n_samples - 1).
Returns the value of K that maximizes the mean silhouette score across all clusters. |
fit()
|
Finds k clusters in data points. |
fit_assign()
|
Finds k clusters in data points and assigns each point to a cluster. |
init_centers()
|
Chooses initial cluster center locations using k-means++ |
measure_dist()
|
Measures distance from data points to cluster centers |
preprocess()
|
Takes training data and applies preprocessing steps
such as scaling and imputation |
show_clusters()
|
Reduces a data set to 2 dimensions using principal component
analysis (PCA) and colours the points by cluster. |
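
The silhouette averaging that avg_sil_score() is described as performing can be sketched roughly as below. This is a minimal illustration, not the package's implementation: the name silhouette_sketch is hypothetical, Euclidean distance is assumed, and the score is averaged over all points (the usual convention), which may differ from how the package aggregates it.

```python
from math import dist


def silhouette_sketch(points, labels):
    """Mean silhouette coefficient over all points (hypothetical helper)."""
    # Group point indices by their cluster label.
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            # Common convention: a point alone in its cluster scores 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        # Guard against identical points, where both distances are 0.
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Two well-separated pairs of points: a good labelling scores close to 1,
# while a labelling that mixes the pairs scores below 0.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_sketch(points, [0, 0, 1, 1])
bad = silhouette_sketch(points, [0, 1, 0, 1])
```

A selection routine like the one described for find_elbow() would presumably compute such a score once per candidate K and keep the K with the highest value.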