All functions

assign()

Assigns data points to k clusters.
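
As a rough illustration of what assignment usually means in k-means, the sketch below labels each point with the index of its nearest center. The helper name `assign_points`, its signature, and the choice of Euclidean distance are assumptions made for this example, not the package's documented interface.

```python
import math

def assign_points(points, centers):
    """Return, for each point, the index of its nearest center (Euclidean)."""
    labels = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        labels.append(dists.index(min(dists)))  # ties go to the lowest index
    return labels

points = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0)]
centers = [(0.0, 0.0), (5.0, 5.0)]
labels = assign_points(points, centers)
```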

avg_sil_score()

Takes a vector of cluster labels and a corresponding array of points, and returns the average silhouette score across all clusters.
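
For reference, a minimal sketch of a mean silhouette computation follows. It assumes Euclidean distance, averages the per-point scores (the usual definition), and scores singleton clusters as 0; the name `mean_silhouette` is hypothetical and the package's own function may differ in these details.

```python
import math

def mean_silhouette(points, labels):
    """Mean of per-point silhouette scores; needs at least two clusters."""
    n = len(points)
    scores = []
    for i in range(n):
        own = [j for j in range(n) if labels[j] == labels[i] and j != i]
        if not own:
            scores.append(0.0)  # assumed convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in range(n) if labels[j] == k)
            / sum(1 for j in range(n) if labels[j] == k)
            for k in set(labels)
            if k != labels[i]
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / n

points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = mean_silhouette(points, [0, 0, 1, 1])  # labels match the two groups
bad = mean_silhouette(points, [0, 1, 0, 1])   # labels cut across the groups
```

Scores close to 1 indicate compact, well-separated clusters; negative scores suggest points sit closer to another cluster than to their own.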

calc_centers()

Calculates center coordinates of each cluster.
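
In standard k-means the center of a cluster is the arithmetic mean of its members. The sketch below assumes that definition; `cluster_means` and its arguments are illustrative names only.

```python
def cluster_means(points, labels, k):
    """Return the coordinate-wise mean of each cluster (None if a cluster is empty)."""
    dim = len(points[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p, lab in zip(points, labels):
        counts[lab] += 1
        for d in range(dim):
            sums[lab][d] += p[d]
    return [
        [s / counts[c] for s in sums[c]] if counts[c] else None
        for c in range(k)
    ]

centers = cluster_means([(0, 0), (2, 2), (10, 10)], [0, 0, 1], k=2)
```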

find_elbow()

Takes unlabeled, scaled data and performs clustering using the KMeans clustering algorithm for values of K up to min(10, n_samples - 1). It returns the value of K that maximizes the mean silhouette score across all clusters.
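
The selection rule described here can be sketched as a loop over candidate values of K. Everything in this example is an assumption made to keep it self-contained: the search starts at K = 2 (the silhouette is undefined for a single cluster), centers are seeded with a deterministic farthest-first rule rather than the package's own initialisation, and all function names are hypothetical.

```python
import math

def farthest_first_centers(points, k):
    """Deterministic seeding: each new center is the point farthest from the current ones."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers

def kmeans_labels(points, k, n_iter=20):
    """Plain Lloyd iterations; returns the cluster label of each point."""
    centers = farthest_first_centers(points, k)
    labels = []
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels

def mean_silhouette(points, labels):
    """Mean per-point silhouette score; singleton clusters score 0."""
    n = len(points)
    total = 0.0
    for i in range(n):
        own = [j for j in range(n) if labels[j] == labels[i] and j != i]
        if not own:
            continue
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(math.dist(points[i], points[j]) for j in range(n) if labels[j] == k)
            / labels.count(k)
            for k in set(labels)
            if k != labels[i]
        )
        total += (b - a) / max(a, b)
    return total / n

def best_k(points, k_cap=10):
    """Try K = 2 .. min(k_cap, n_samples - 1) and keep the best mean silhouette."""
    k_max = min(k_cap, len(points) - 1)
    scores = {k: mean_silhouette(points, kmeans_labels(points, k)) for k in range(2, k_max + 1)}
    return max(scores, key=scores.get), scores

# Three tight, well-separated groups of three points each.
data = [(0, 0), (0, 1), (1, 0), (10, 0), (10, 1), (11, 0), (0, 10), (0, 11), (1, 10)]
k, scores = best_k(data)
```

On this toy data the three-cluster solution scores highest, because merging or splitting the groups lowers the silhouette of the affected points.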

fit()

Finds k clusters in data points.

fit_assign()

Finds k clusters in data points and assigns each point to a cluster.

init_centers()

Chooses initial cluster locations using k-means++.
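
As background, k-means++ picks the first center uniformly at random and each later center with probability proportional to its squared distance from the nearest center chosen so far. The sketch below follows that rule; the name `kmeanspp_centers`, the `seed` parameter, and the use of Euclidean distance are assumptions for this example.

```python
import math
import random

def kmeanspp_centers(points, k, seed=0):
    """Pick k initial centers from `points` using k-means++ style seeding."""
    rng = random.Random(seed)
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Squared distance from each point to its nearest chosen center.
        d2 = [min(math.dist(p, c) ** 2 for c in centers) for p in points]
        # Already-chosen points have weight 0, so they are never drawn again.
        centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers

points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0), (5.0, 9.0)]
centers = kmeanspp_centers(points, k=3)
```

Because far-away points are favoured, the initial centers tend to be spread out, which usually helps the subsequent iterations converge to a better solution than uniform random seeding.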

measure_dist()

Measures distance from data points to cluster centers.
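
One plausible shape for such a result is a matrix with one row per point and one column per center. The sketch below assumes Euclidean distance; the name `distance_matrix` is illustrative only.

```python
import math

def distance_matrix(points, centers):
    """Row i, column j holds the Euclidean distance from point i to center j."""
    return [[math.dist(p, c) for c in centers] for p in points]

dists = distance_matrix([(0, 0), (3, 4)], [(0, 0), (3, 0)])
```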

preprocess()

Takes training data and applies preprocessing steps such as scaling and imputation.
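
The description does not say which imputation or scaling method is used. Purely as an illustration, the sketch below assumes mean imputation of missing values (represented as `None`) followed by standardisation of each column to zero mean and unit variance; `impute_and_scale` is a hypothetical name.

```python
import math

def impute_and_scale(rows):
    """Fill None with the column mean, then standardise each column."""
    out_cols = []
    for col in zip(*rows):
        present = [v for v in col if v is not None]
        mean = sum(present) / len(present)
        filled = [mean if v is None else v for v in col]
        std = math.sqrt(sum((v - mean) ** 2 for v in filled) / len(filled))
        # Constant columns carry no information, so map them to 0.
        out_cols.append([(v - mean) / std if std > 0 else 0.0 for v in filled])
    return [list(r) for r in zip(*out_cols)]

raw = [[1.0, 10.0], [None, 20.0], [3.0, 30.0]]
scaled = impute_and_scale(raw)
```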

show_clusters()

Reduces a data set to 2 dimensions using principal component analysis (PCA) and colours the points by cluster.
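
To show the dimensionality-reduction step only (plotting and colouring are left out), here is a small sketch of PCA using power iteration with deflation on the covariance matrix. This is one simple way to obtain the leading components and is not necessarily how the package computes them; `top_components` and its parameters are hypothetical.

```python
import math

def top_components(rows, n_components=2, n_iter=500):
    """Project rows onto their leading principal components."""
    n, dim = len(rows), len(rows[0])
    means = [sum(r[d] for r in rows) / n for d in range(dim)]
    centered = [[r[d] - means[d] for d in range(dim)] for r in rows]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dim)]
           for i in range(dim)]
    components = []
    for _comp in range(n_components):
        v = [1.0 / math.sqrt(dim)] * dim  # arbitrary unit start vector
        for _step in range(n_iter):
            w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0:
                break
            v = [x / norm for x in w]
        eigval = sum(v[i] * cov[i][j] * v[j] for i in range(dim) for j in range(dim))
        components.append(v)
        # Deflate: remove the component just found before searching for the next.
        cov = [[cov[i][j] - eigval * v[i] * v[j] for j in range(dim)]
               for i in range(dim)]
    projected = [[sum(r[d] * comp[d] for d in range(dim)) for comp in components]
                 for r in centered]
    return projected, components

rows = [(0, 0, 1), (1, 2, 0), (2, 4, 1), (3, 6, 0), (4, 8, 1)]
projected, components = top_components(rows)
```

The resulting two-column `projected` data can then be passed to any plotting library, with cluster labels mapped to colours.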