String categorical features

There seem to be only two other discussions on this forum about this topic, and neither feels fully answered.

I would like to know the best way to encode categorical string features. Are there other options aside from one-hot encoding or mapping the strings to integer values?

Does anyone have recommendations for doing this when working with GNNs?

You could try the category_encoders package, which has a lot of options:
https://contrib.scikit-learn.org/category_encoders/

In general, though, encoding them as one-hot vectors should be enough, unless the feature has very high cardinality, in which case the one-hot vectors become very wide and sparse.
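
To make the two options from the question concrete, here is a minimal sketch in plain Python with no external packages. The `node_types` values and the helper names (`build_vocab`, `to_int_codes`, `to_one_hot`) are made up for illustration and are not part of any library. It shows integer encoding and one-hot encoding side by side for a small string feature:

```python
def build_vocab(values):
    """Map each distinct string to an integer index (sorted for determinism)."""
    return {cat: idx for idx, cat in enumerate(sorted(set(values)))}


def to_int_codes(values, vocab):
    """Integer encoding: one index per value."""
    return [vocab[v] for v in values]


def to_one_hot(values, vocab):
    """One-hot encoding: one row per value, one column per category."""
    rows = []
    for v in values:
        row = [0] * len(vocab)
        row[vocab[v]] = 1
        rows.append(row)
    return rows


# Hypothetical string feature, one entry per node.
node_types = ["author", "paper", "venue", "paper"]

vocab = build_vocab(node_types)
int_codes = to_int_codes(node_types, vocab)
one_hot = to_one_hot(node_types, vocab)

print(vocab)      # category -> index mapping
print(int_codes)  # one integer per node
print(one_hot)    # one row per node, width = number of categories
```

The one-hot rows can be used directly as (part of) the node feature matrix. Raw integer codes imply an ordering between categories that usually does not exist, so they are typically only used as indices into a learned embedding table, which is the usual fallback when the cardinality is too high for one-hot vectors.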

