RandomForestClassifier – Odd error with trying to identify feature importance in sklearn?
I’m trying to retrieve the feature importances from a RandomForestClassifier model, that is, the importance score for each feature in the model.
I’m running the following code:
random_forest = SelectFromModel(RandomForestClassifier(n_estimators=200, random_state=123))
random_forest.fit(X_train, y_train)
print(random_forest.estimator.feature_importances_)
However, I receive the following error:
NotFittedError: This RandomForestClassifier instance is not fitted yet. Call 'fit' with appropriate arguments before using this method.
What exactly am I doing wrong? I fit the model right before trying to read the feature importances, but it doesn’t work as expected.
Similarly, I have the code below with a LogisticRegression model, and it works fine:
log_reg = SelectFromModel(LogisticRegression(class_weight="balanced", random_state=123))
log_reg.fit(X_train, y_train)
print(log_reg.estimator_.coef_)
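You have to use the attribute estimator_ to access the fitted estimator (see the docs). Note that you forgot the trailing underscore: estimator is the unfitted estimator you passed to SelectFromModel, whereas estimator_ is the clone that fit actually trains. So it should be: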
print(random_forest.estimator_.feature_importances_)
Interestingly, you did it correctly in your example with the LogisticRegression model.
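
To make the difference between the two attributes concrete, here is a minimal sketch. It assumes synthetic data from make_classification in place of your X_train and y_train (those values, and the variable names selector and template_is_fitted, are made up for illustration), so the printed numbers will not match yours.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.exceptions import NotFittedError
from sklearn.feature_selection import SelectFromModel

# Synthetic stand-in for the question's X_train / y_train (an assumption).
X_train, y_train = make_classification(
    n_samples=200, n_features=8, n_informative=3, random_state=123
)

selector = SelectFromModel(
    RandomForestClassifier(n_estimators=200, random_state=123)
)
selector.fit(X_train, y_train)

# `estimator` is the object passed to the constructor; fit() does not
# train it, so reading its importances raises NotFittedError.
try:
    selector.estimator.feature_importances_
    template_is_fitted = True
except NotFittedError:
    template_is_fitted = False
print("constructor estimator fitted:", template_is_fitted)

# `estimator_` (trailing underscore) is the clone that fit() trained.
importances = selector.estimator_.feature_importances_
print(importances)

# Boolean mask of the features SelectFromModel keeps.
print(selector.get_support())
```

The same pattern applies to the LogisticRegression case, where the fitted clone exposes coef_ instead of feature_importances_.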