K-means clustering shows artifact after StandardScaler

df_RT is a large data set, and I fit the K-means clustering on it. I then load a new data set, df_data, which corresponds to the image you see. If I scale the data with StandardScaler first, the clustering comes out wrong: the points along the diagonal are assigned to a cluster of their own.

What am I doing wrong?

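from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Fit the scaler on the reference data only, then apply it to both sets
scaler = StandardScaler().fit(df_RT)
df_RT_scaled = scaler.transform(df_RT)
df_data_scaled = scaler.transform(df_data)

# Fit PCA on the scaled reference data, then project both sets
pca = PCA(n_components=3).fit(df_RT_scaled)
pca_df = pca.transform(df_RT_scaled)
pca_data = pca.transform(df_data_scaled)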
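For anyone trying to reproduce this, here is a minimal self-contained sketch of the workflow described above. It is not the original code: the K-means call is not shown in the post, so n_clusters=3 is an assumption, and X_ref and X_new are synthetic stand-ins for df_RT and df_data. Chaining the steps in a Pipeline is one way to make sure the new data goes through exactly the same fitted scaler and PCA as the reference data.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-ins (assumption): features with very different scales
scales = np.array([1.0, 10.0, 100.0, 1.0, 5.0])
X_ref = rng.normal(size=(500, 5)) * scales  # plays the role of df_RT
X_new = rng.normal(size=(50, 5)) * scales   # plays the role of df_data

# n_clusters=3 is a hypothetical choice; the post does not state it
pipe = make_pipeline(
    StandardScaler(),
    PCA(n_components=3),
    KMeans(n_clusters=3, n_init=10, random_state=0),
)

# Fit every step on the reference data only
pipe.fit(X_ref)

# New data is scaled, projected and labelled with the fitted steps
labels_ref = pipe.predict(X_ref)
labels_new = pipe.predict(X_new)
```

Comparing labels_new from a pipeline with and without the StandardScaler step on the same data should show whether the scaling alone changes the cluster assignment.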

The first image is without scaling and looks good. The second one is with scaling: the diagonal comes up as a category of its own, and I'm not sure what I did wrong here.

Without any scaling: everything is good

With scaling: artifact
