
K-Means Clustering

Cluster 2D points with MATLAB's kmeans, the simplest entry point into the Statistics and Machine Learning Toolbox.

MATLAB's `kmeans(X, K)` returns `[idx, C]` — the cluster assignment per row and the centroid matrix. scikit-learn's `KMeans(n_clusters=K).fit(X)` returns an object with `.labels_` and `.cluster_centers_`. Same data, different idioms. The converter flags this one for manual review because the output shape isn't a 1:1 swap.

MATLAB source (10 lines)
% K-means on 2D Gaussian blobs
rng(42);
X = [randn(100, 2) + 2; randn(100, 2) - 2; randn(100, 2) + [4 -2]];

[idx, C] = kmeans(X, 3);

figure
gscatter(X(:, 1), X(:, 2), idx); hold on;
plot(C(:, 1), C(:, 2), 'kx', 'MarkerSize', 15, 'LineWidth', 3);
title('K-means clusters with centroids');
Python output (converter-generated) · 15 lines · 2 flags
import numpy as np
import matplotlib.pyplot as plt

# K-means on 2D Gaussian blobs
np.random.seed(42)
X = np.array([[np.random.randn(100, 2) + 2], [np.random.randn(100, 2) - 2], [np.random.randn(100, 2) + [4, -2]]])

idx, C = kmeans(X, 3)

plt.figure()
gscatter(X[:, 0], X[:, 1], idx)
# hold on removed — matplotlib accumulates plots by default
plt.plot(C[:, 0], C[:, 1], 'kx', markersize=15, linewidth=3)
plt.title('K-means clusters with centroids')
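Both flags listed for this example concern `gscatter`, so one further issue is easy to miss: the generated `X = np.array([[...], [...], [...]])` line wraps each block in its own list, which gives an array of shape `(3, 1, 100, 2)` rather than the `(300, 2)` matrix that MATLAB's semicolon concatenation builds. A minimal hand-corrected sketch follows, assuming `np.vstack` as the stand-in for vertical concatenation. The `rng` name is illustrative, and NumPy's seeded stream should not be expected to reproduce the values MATLAB draws after `rng(42)`.

```python
import numpy as np

# Seeded generator (illustrative name); draws will differ from MATLAB's rng(42).
rng = np.random.default_rng(42)

# MATLAB's [A; B; C] stacks blocks row-wise, which np.vstack mirrors.
X = np.vstack([
    rng.standard_normal((100, 2)) + 2,
    rng.standard_normal((100, 2)) - 2,
    rng.standard_normal((100, 2)) + np.array([4, -2]),  # broadcasts across rows
])

print(X.shape)  # (300, 2)
```

With `X` in this shape, the later `X[:, 0]` and `X[:, 1]` slices select the two coordinate columns as the MATLAB source intends.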
Converter flags (2)
  • TOOLBOX · Line 8: gscatter → gscatter (Statistics Toolbox): check that arguments and return values match. Some functions have different default parameters or output formats.
  • TODO · Line 8: gscatter: use plt.scatter() with a loop per group, or seaborn.scatterplot(x, y, hue=labels) for grouped scatter plots.
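The loop-per-group option from the TODO flag can be sketched roughly as below. This is one possible replacement for `gscatter`, not the converter's own output: `grouped_scatter` is a hypothetical helper name, and the four sample points are made up purely to exercise it.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np


def grouped_scatter(x, y, groups, ax=None):
    """Hypothetical helper: draw one scatter series per unique group label."""
    x, y, groups = np.asarray(x), np.asarray(y), np.asarray(groups)
    if ax is None:
        _, ax = plt.subplots()
    for g in np.unique(groups):
        mask = groups == g  # boolean mask selecting this group's points
        ax.scatter(x[mask], y[mask], label=str(g))
    ax.legend(title="group")
    return ax


# Tiny illustrative input: four points in two groups.
x = np.array([0.0, 0.2, 3.0, 3.1])
y = np.array([0.1, 0.0, 2.9, 3.2])
labels = np.array([0, 0, 1, 1])
ax = grouped_scatter(x, y, labels)
```

Each call to `ax.scatter` picks the next color in matplotlib's default cycle, which gives a per-group coloring similar in spirit to `gscatter`. Marker and color choices will not match MATLAB's defaults.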
Implementation notes
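The converter flags kmeans with a TODO because scikit-learn's KMeans API doesn't match MATLAB's call signature: MATLAB returns two outputs from one call, while scikit-learn returns a fitted object whose attributes hold the results. The output suggests `from sklearn.cluster import KMeans; km = KMeans(n_clusters=3).fit(X); idx = km.labels_; C = km.cluster_centers_`. Note that `labels_` is 0-based, whereas MATLAB's `idx` runs from 1 to K.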
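To keep the rest of a translated script unchanged, the suggestion above can be wrapped so that it returns a pair, as MATLAB does. The sketch below makes several assumptions: `kmeans_matlab_style` and its `seed` parameter are hypothetical names, and fixing `n_init` and `random_state` is a choice made here for repeatable runs rather than something the converter emits.

```python
import numpy as np
from sklearn.cluster import KMeans


def kmeans_matlab_style(X, k, seed=0):
    """Hypothetical wrapper returning (idx, C), like [idx, C] = kmeans(X, k)."""
    km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
    idx = km.labels_         # shape (n_samples,), values 0..k-1 (MATLAB uses 1..k)
    C = km.cluster_centers_  # shape (k, n_features)
    return idx, C


# Three Gaussian blobs stacked into a single (300, 2) matrix.
rng = np.random.default_rng(42)
centers = ([2, 2], [-2, -2], [4, -2])
X = np.vstack([rng.standard_normal((100, 2)) + c for c in centers])

idx, C = kmeans_matlab_style(X, 3)
print(idx.shape, C.shape)  # (300,) (3, 2)
```

Cluster numbering is arbitrary in both tools, so the labels from this sketch and MATLAB's `idx` can describe the same partition under different numbers. Add 1 to `idx` only if downstream code depends on 1-based labels.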