Hey,
I have two questions about how FLAML handles categorical variables in new data that differs from the initial training dataset (for example, at inference time after model deployment).
- Does it handle new categories in categorical features (unseen during training)?
- The SKLearn and XGBoost estimators use ordinal encodings of categorical features, but it seems the categorical codes are extracted from the data at inference time (code). Doesn't that mean the encodings will differ when running on a different dataset, so the categories passed to the model get mixed up? If so, sklearn's OrdinalEncoder would be a better choice here, since it persists the category codes learned during training.
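
To illustrate the concern in the second question, here is a minimal sketch. It is not FLAML's actual code: the toy `color` column and all variable names are made up for illustration. It only shows how codes derived independently from each dataset can disagree, while an `OrdinalEncoder` fitted on the training data keeps the mapping fixed.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical toy data: "purple" never appears in training.
train = pd.DataFrame({"color": ["blue", "green", "red"]})
new = pd.DataFrame({"color": ["red", "purple", "green"]})

# Codes derived independently from each dataset: the mapping depends on
# which categories happen to be present in that dataset.
train_codes = train["color"].astype("category").cat.codes.tolist()
new_codes = new["color"].astype("category").cat.codes.tolist()
print(train_codes)  # [0, 1, 2]  blue=0, green=1, red=2
print(new_codes)    # [2, 1, 0]  green=0, purple=1, red=2

# Encoder fitted once on training data: the mapping is persisted, and
# unseen categories are mapped to an explicit sentinel value.
encoder = OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1)
encoder.fit(train[["color"]])
persisted_codes = encoder.transform(new[["color"]]).ravel().tolist()
print(persisted_codes)  # [2.0, -1.0, 1.0]  red=2, purple=-1, green=1
```

In the first approach "green" is encoded as 1 in training but 0 in the new data, and the unseen "purple" silently takes the code that meant "green" during training. With the fitted encoder, known categories keep their training codes and the unseen one is flagged as -1.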