K-means clustering shows artefact after StandardScaler
df_RT holds a lot of data, and the K-means clustering is performed on this data set. Then I load a new data set, df_data, corresponding to the image you see. If I scale the data, I obtain incorrect clustering (the diagonal belongs to a specific cluster).
What am I doing wrong?
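    from sklearn.decomposition import PCA
    from sklearn.preprocessing import StandardScaler

    # Fit the scaler on the reference data only, then apply it to both sets
    scaler = StandardScaler().fit(df_RT)
    df_RT_scaled = scaler.transform(df_RT)
    df_data_scaled = scaler.transform(df_data)

    # Fit PCA on the scaled reference data, then project both sets
    pca = PCA(n_components=3).fit(df_RT_scaled)
    pca_df = pca.transform(df_RT_scaled)
    pca_data = pca.transform(df_data_scaled)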
The first image is without scaling (looks good). The second one is with scaling: the diagonal comes up as a category of its own. I'm not sure what I did wrong here.
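The snippet above stops before the clustering step, so here is a minimal, self-contained sketch of how the whole chain might look end to end. It is only an illustration under assumptions: the synthetic arrays standing in for df_RT and df_data, the column names, n_clusters=4 and random_state=0 are all placeholders, not values from the real data. It chains the scaler, PCA and K-means in one scikit-learn pipeline, so the new data always goes through exactly the transforms that were fitted on the reference data.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
cols = ["f1", "f2", "f3", "f4"]  # hypothetical feature names
spread = [1.0, 10.0, 100.0, 0.1]  # assumed: features on very different scales

# Synthetic stand-ins for the reference set (df_RT) and the new set (df_data)
df_RT = pd.DataFrame(rng.normal(size=(500, 4)) * spread, columns=cols)
df_data = pd.DataFrame(rng.normal(size=(50, 4)) * spread, columns=cols)

# Scaler -> PCA -> K-means, all fitted on the reference data only
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=3),
    KMeans(n_clusters=4, n_init=10, random_state=0),  # n_clusters is a placeholder
)
model.fit(df_RT)

# Cluster labels for both sets, using the same fitted transforms
labels_RT = model.predict(df_RT)
labels_data = model.predict(df_data)

# Compare the per-feature spread before and after scaling
print(df_RT.std().round(2).to_dict())
scaled = model.named_steps["standardscaler"].transform(df_RT)
print(scaled.std(axis=0).round(2))
```

After scaling, every feature has unit variance, so a feature that barely contributed to the Euclidean distance in the raw data can end up weighing as much as the dominant one. That alone can change which points K-means groups together, so comparing the two printed spreads on the real df_RT may be a useful first check.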