As enterprises strive to increase their market share and stay ahead in a competitive business environment, they increasingly need to turn to technology. Predictive analytics is a technology that studies the behavior of existing and potential customers through their social media channels and website activity to predict future scenarios and the probability of a purchase.
Processing the gathered information to extract key features, including essential patterns and attributes, and then selecting only the most appropriate of them helps to create a simple yet powerful predictive model. Predictive analytics has helped Amazon increase its sales by 30%. According to a report from McKinsey, predictive maintenance helps reduce call centre costs by approximately 20-50%.
Predictive analytics encompasses a variety of statistical techniques from data mining, predictive modelling, and machine learning, which analyze current and historical facts to make predictions about future or otherwise unknown events.
A highly effective model is built from a minimal set of features that still contains enough information to train the model to make accurate predictions or classifications.
Selecting the useful features out of N given features is quite complex. Mathematically, there are 2^N possible subsets. For example, a problem with 20 features has 2^20 = 1,048,576 possible subsets of features, so one can easily imagine the complexity when there are 40 features to deal with.
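To make the growth concrete, here is a minimal Python sketch of the subset count. The helper names count_subsets and enumerate_subsets are illustrative only and do not come from any particular library; the count includes the empty subset.

```python
from itertools import combinations


def count_subsets(n_features: int) -> int:
    """Number of possible feature subsets (including the empty one)."""
    return 2 ** n_features


def enumerate_subsets(features):
    """Yield every subset of `features`, smallest subsets first."""
    for size in range(len(features) + 1):
        yield from combinations(features, size)


if __name__ == "__main__":
    print(count_subsets(20))  # 1048576
    print(count_subsets(40))  # 1099511627776
    # Brute-force enumeration is only practical for very small N.
    print(len(list(enumerate_subsets(["a", "b", "c"]))))  # 8
```

Going from 20 to 40 features multiplies the search space by roughly a million, which is why an exhaustive search over all subsets quickly becomes impractical.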
To avoid the computational burden, the number of selected input variables should not be too high. Equally, it should not be too low, as the input variables would then not provide the essential information.
To build an efficient model, the number of features should be optimal, so that the behavior of the given phenomenon can be described with a minimal set of informative, non-redundant variables.
Feature selection methods fall into three categories: filters, wrappers, and embedded methods.
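Filter methods score features using statistical measures computed from the data, independently of any model. Embedded methods, by contrast, perform selection as part of model training and require iterative updates, with the model parameters adjusted according to model performance. The wrapper method trains the model on candidate subsets and considers only the model performance of the selected set of features.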
Suppose we have 26 features, a, b, c, d, ..., z. Each individual feature may not be informative by itself, but a combination of them may be. For example, b and c may carry no information separately, while (b + c) or b*c might. A filter approach may miss this because it evaluates features in isolation rather than in combination, whereas a wrapper approach can exploit it because it uses an actual prediction or classification algorithm for evaluation.
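The b and c case can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the toy data (b and c take the values -1 or +1, and the label is 1 exactly when b*c is positive), the use of absolute Pearson correlation as the filter score, and a simple majority-vote lookup table standing in for the classifier the wrapper evaluates.

```python
from collections import Counter, defaultdict
from itertools import combinations, product
from math import sqrt


def make_rows(repeats):
    """Toy data (assumed): label is 1 exactly when b * c is positive."""
    rows = []
    for _ in range(repeats):
        for b, c in product([-1, 1], repeat=2):
            rows.append(({"b": b, "c": c}, 1 if b * c > 0 else 0))
    return rows


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def filter_scores(rows, names):
    """Filter view: score each feature on its own against the label."""
    labels = [y for _, y in rows]
    return {n: abs(pearson([x[n] for x, _ in rows], labels)) for n in names}


def wrapper_score(train, test, subset):
    """Wrapper view: fit a lookup-table classifier on `subset` and
    return its accuracy on the held-out rows."""
    votes = defaultdict(Counter)
    for x, y in train:
        votes[tuple(x[n] for n in subset)][y] += 1
    fallback = Counter(y for _, y in train).most_common(1)[0][0]
    correct = 0
    for x, y in test:
        key = tuple(x[n] for n in subset)
        pred = votes[key].most_common(1)[0][0] if key in votes else fallback
        correct += pred == y
    return correct / len(test)


names = ["b", "c"]
train, test = make_rows(3), make_rows(2)
f_scores = filter_scores(train, names)
w_scores = {
    subset: wrapper_score(train, test, subset)
    for size in (1, 2)
    for subset in combinations(names, size)
}
print(f_scores)  # both individual correlations are zero
print(w_scores)  # only the pair ("b", "c") classifies every row correctly
```

On this toy data the filter scores for b and c are both zero, so a filter would discard them, while the wrapper finds that the pair predicts the label perfectly. A real wrapper would normally use cross-validation and the actual model of interest rather than a lookup table.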
To sum up, the optimal set of features should contain the minimum number of input variables required to describe the behavior of the considered system or phenomenon, with as little redundancy and as much information as possible. A more accurate, efficient, simple and easily interpretable model can be built once the optimal set of input variables is identified.
The rule of thumb followed for feature selection across analytics is that the filter method is used when the dataset has a large number of features, while the wrapper method is used when the number of features is moderate. In practice, however, it is usually a better idea to use the wrapper method for key feature selection, as it takes into account the performance of the actual classifier you want to use, and different classifiers vary widely in how they use information.
By using these simple yet powerful techniques to build a predictive model, enterprises can improve their sales and OpEx figures and stay ahead in a globally competitive business environment.