feature-crossing
Parent: data-representation
Source: google-ml-course
Synthetic features (feature crossing)
- e.g. generate a synthetic feature $x_3$ by combining $x_1$ and $x_2$:
$$x_3 = x_1 x_2$$
- crossing boolean (one-hot encoded) features can result in a very sparse feature set
- a more sophisticated version of feature crossing is a neural network
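A minimal NumPy sketch of crossing two one-hot (boolean) feature vectors; the feature names and sizes are made up for illustration. The cross of a 3-valued and a 4-valued feature yields 12 crossed features, of which only one is nonzero, showing the sparsity:

```python
import numpy as np

# One-hot encodings of two hypothetical categorical features:
# "country" with 3 values and "language" with 4 values.
country = np.array([0, 1, 0])      # one-hot: second country
language = np.array([0, 0, 0, 1])  # one-hot: fourth language

# Feature cross: every pairwise product cross[i, j] = country[i] * language[j],
# flattened into a single vector of length 3 * 4 = 12.
cross = np.outer(country, language).ravel()

print(cross)  # exactly one of the 12 entries is nonzero -> very sparse
```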
Advantage
Enables learning nonlinear relationships while still using a linear model
–> linear models train efficiently and scale well to large data sets
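The advantage can be illustrated on XOR-like data, which a linear model cannot fit from $x_1, x_2$ alone but fits exactly once the cross $x_3 = x_1 x_2$ is added. A sketch using plain least squares (the data set is a made-up toy example):

```python
import numpy as np

# XOR-like data: not linearly separable in (x1, x2) alone.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0], dtype=float)

def fit_linear(features, targets):
    """Least-squares fit of a linear model with a bias term; returns predictions."""
    A = np.hstack([features, np.ones((len(features), 1))])
    w, *_ = np.linalg.lstsq(A, targets, rcond=None)
    return A @ w

# Linear model on the raw features: best it can do is predict 0.5 everywhere.
pred_raw = fit_linear(X, y)

# Add the synthetic feature x3 = x1 * x2: the linear model now fits exactly.
X_crossed = np.hstack([X, (X[:, 0] * X[:, 1]).reshape(-1, 1)])
pred_crossed = fit_linear(X_crossed, y)

print(np.round(pred_raw, 2))      # all 0.5 -> no better than guessing
print(np.round(pred_crossed, 2))  # ≈ [0, 1, 1, 0] -> exact fit
```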
Disadvantage
Crossing sparse features may significantly increase the size of the feature space.
May lead to
- Increased model size (more RAM usage)
- Noisy coefficients (on ‘redundant’ feature subsets created by the cross) –> overfitting
Solution
- try to ‘zero out’ some of those noise coefficients/weights
- must not lose the useful features, only the ‘noise’ ones
- –> $L_1$ regularisation drives the weights of uninformative features to exactly zero
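One way to sketch this: lasso-style $L_1$ regularisation solved by iterative soft-thresholding (ISTA, i.e. proximal gradient descent) on synthetic data where most features are pure noise. The data, seed, and hyperparameters here are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 10 features, but only the first 2 carry signal;
# the rest play the role of 'noise' features from a sparse cross.
n, d = 200, 10
X = rng.normal(size=(n, d))
true_w = np.zeros(d)
true_w[:2] = [3.0, -2.0]
y = X @ true_w + 0.01 * rng.normal(size=n)

def lasso_ista(X, y, lam, steps=2000):
    """Minimise 0.5/n * ||Xw - y||^2 + lam * ||w||_1 by
    gradient steps followed by soft-thresholding."""
    n = len(y)
    w = np.zeros(X.shape[1])
    step = n / np.linalg.norm(X, 2) ** 2  # safe step size (1 / Lipschitz const.)
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / n
        w = w - step * grad
        # Soft-threshold: small weights are set to exactly zero.
        w = np.sign(w) * np.maximum(np.abs(w) - step * lam, 0.0)
    return w

w = lasso_ista(X, y, lam=0.1)
print(np.round(w, 2))  # the 8 noise weights are driven to exactly 0
```

The soft-thresholding step is what distinguishes $L_1$ from $L_2$: it subtracts a fixed amount from each weight's magnitude, so weights of uninformative features land on exactly zero rather than merely shrinking towards it.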