Two well-known families of statistical models in machine learning are discriminative models and generative models. Discriminative models capture the dependence of an unobserved target variable *y* on observed variables *x*. In particular, they aim to estimate the conditional probability P(y|x) of the target variable given the observed variables, and so they predict the target outputs from the inputs. Generative models, on the other hand, describe the whole phenomenon and can generate values for both the observed inputs and the target outputs, often based on hidden parameters. In other words, discriminative models specify outputs based on inputs (for example, logistic regression, neural networks, and random forests), while generative models generate both inputs and outputs (for example, hidden Markov models, Bayesian networks, and Gaussian mixture models).
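In symbols, the distinction can be sketched as follows. These are standard probability identities rather than notation taken from the original work. A discriminative model estimates the first quantity directly, while a generative model estimates the joint distribution and can recover the conditional through Bayes' rule:

```latex
\begin{aligned}
\text{Discriminative:}\quad & P(y \mid x) \\
\text{Generative:}\quad & P(x, y) = P(x \mid y)\, P(y),
\qquad
P(y \mid x) = \frac{P(x \mid y)\, P(y)}{\sum_{y'} P(x \mid y')\, P(y')}
\end{aligned}
```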

One retail analytics example that is relevant to both types of models involves competing (cannibalism) or complementary product offerings. A competing, or cannibalism, effect happens when a retailer markets a product that is similar to other products already sold, so that an increase in the market share or sales of the new product results in a market share decrease for the existing products. A clear example is the introduction of a product (for example, a chocolate ice-cream box) that is very similar to an existing product from another brand.

A complementary effect happens when the introduction of a product actually results in a market share increase for other products. This happens when the products are complementary to one another. For example, launching a ready-to-go sushi pack can increase the sales of a Japanese beer located close enough to the sushi stand.

A good understanding of cannibalism and complementary effects often requires a very sophisticated analysis, since the cross effects among products are complex, non-linear, and often hidden.

Using a generative model to analyze this problem is appealing, since such a model can be used to analyze the sales of all the products involved in the phenomenon. Moreover, one key feature of generative models is their ability to represent complex interactions and dependencies among input variables and output variables, which lets them reflect the complexity of real-life scenarios. Thus, generative models are typically more flexible than discriminative models in expressing dependencies in complex learning tasks.

On the other hand, precisely because cannibalism and complementary analyses are often very complex and involve chain-reaction effects among products, discriminative models might obtain superior performance. These models have fewer variables to estimate and analyze. For example, the analysis can focus on one target product and all the products that might affect it, using (for example) a logistic regression model to measure the direction and magnitude of their effects. Such a simpler model can better balance the bias-variance tradeoff and is less prone to overfitting.
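As a minimal sketch of this discriminative setup, the example below fits a logistic regression for a single target product with plain gradient descent. Everything in it is an illustrative assumption: the toy counts, the two binary features (`competitor`, `complement`), and the function names are hypothetical and do not come from real retail data or from the original study. The sign of each fitted weight indicates the direction of the effect, and its size indicates the magnitude on the log-odds scale.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, lr=0.5, steps=5000):
    """Fit logistic regression by batch gradient descent.

    Returns [intercept, w_1, ..., w_k] for k features.
    """
    n = len(rows)
    weights = [0.0] * (len(rows[0]) + 1)  # weights[0] is the intercept
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            z = weights[0] + sum(w * v for w, v in zip(weights[1:], x))
            err = sigmoid(z) - y  # gradient of the log-loss w.r.t. z
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        weights = [w - lr * g / n for w, g in zip(weights, grad)]
    return weights


def expand(counts):
    """Turn {(competitor, complement): (n_up, n_down)} into rows and labels."""
    rows, labels = [], []
    for features, (n_up, n_down) in counts.items():
        rows += [features] * (n_up + n_down)
        labels += [1] * n_up + [0] * n_down
    return rows, labels


# Hypothetical weekly observations for one target product.
# Key: (competitor promoted?, complement promoted?)
# Value: (weeks with above-baseline sales, weeks without)
COUNTS = {
    (0, 0): (10, 10),
    (1, 0): (4, 16),   # competitor promotion: target sales rise less often
    (0, 1): (16, 4),   # complement promotion: target sales rise more often
    (1, 1): (10, 10),
}

rows, labels = expand(COUNTS)
intercept, w_competitor, w_complement = fit_logistic(rows, labels)
print(f"competitor effect: {w_competitor:+.3f}")  # negative: cannibalism
print(f"complement effect: {w_complement:+.3f}")  # positive: complementary
```

The model has only three parameters regardless of how the affecting products interact with each other, which is the simplicity the paragraph above refers to.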

These observations led us to look for a model that, on the one hand, focuses on efficiently modeling the target variables, as discriminative models do, while on the other hand can represent complex dependencies, as generative models do, when such complexity is needed. The research, performed jointly with Aviv Gruber, resulted in the Targeted Bayesian Network Learning (TBNL) model.

The Targeted Bayesian Network Learning (TBNL) model is a specific Bayesian network classifier that enables control over the complexity-accuracy tradeoff in the network during its construction. It is a supervised learning model that accounts for the classification objective during the construction stage of the network by modeling the effects of attributes on the class variable. Essentially, the model approximates the expected conditional probability distribution of the class variable by applying a discriminative learning approach that is constrained by information-theoretic measures.

Similar to other target-oriented Bayesian network (BN) classifiers, the TBNL constructs a model from the variables that are closely associated with the predetermined target variable. The TBNL can be considered an interpretable Bayesian classifier, similar to the simple and well-known Naive Bayes (NB) classifier or the Tree Augmented Network (TAN). These classifiers are well-known methods in the realm of target-oriented Bayesian classifiers. They are constructed for classification from the outset, focusing the learning effort on modeling the dependence of unobserved (target) variables on observed variables. However, unlike in the TBNL, the structural complexity of these models is often fixed and cannot be controlled directly.

The TBNL method combines the discriminative and generative approaches to learning. It exploits the advantages of both the target-based approach, which focuses on the class variable, and the traditional canonical approach of general BN learning (GBN) methods, which encode the joint probability distribution of the domain variables in a more general manner and allow the model complexity to be increased to gain further insights. In particular, the TBNL controls the model complexity during the learning stage by using information-theoretic measures, while focusing the modeling effort on the class variable. This dual approach makes it possible to manage the accuracy-complexity trade-off with respect to the expected conditional probability distributions of the class variable.
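To make the idea of an information-theoretic constraint more concrete, the sketch below scores each attribute by its empirical mutual information with the class variable and keeps only the attributes that clear a threshold. This is not the TBNL algorithm itself, whose construction procedure is not detailed here. It is a generic filter, and the attribute names, the toy data, and the `min_bits` parameter are all hypothetical. It only illustrates how a single information threshold can trade model complexity against information about the target.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y), in bits, from paired samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi


def select_attributes(attributes, target, min_bits):
    """Keep attributes sharing at least `min_bits` of information with the
    target, ordered from most to least informative."""
    scores = {name: mutual_information(values, target)
              for name, values in attributes.items()}
    kept = [name for name, score in scores.items() if score >= min_bits]
    return sorted(kept, key=lambda name: -scores[name])


# Hypothetical data: 1 = target product sold above baseline that week.
target = [0, 0, 0, 0, 1, 1, 1, 1]
attributes = {
    "complement_in_stock": [0, 0, 0, 0, 1, 1, 1, 1],  # fully informative
    "rival_on_promo":      [1, 1, 1, 0, 0, 0, 0, 1],  # partly informative
    "is_weekend":          [0, 1, 0, 1, 0, 1, 0, 1],  # independent of target
}

# A loose threshold admits more attributes (a more complex model);
# a strict threshold admits fewer (a simpler model).
loose = select_attributes(attributes, target, min_bits=0.1)
strict = select_attributes(attributes, target, min_bits=0.5)
print(loose)
print(strict)
```

Raising `min_bits` shrinks the set of attributes that would be connected to the class variable, which is one simple way to picture complexity being controlled while the learning effort stays focused on the target.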

Other mechanisms for controlling the accuracy-complexity trade-off in generative models include the Minimum Description Length (MDL) principle and other regularization methods. Although the TBNL is competitive in accuracy, recall, and precision, in complex scenarios it may require more computational resources than the simpler NB and TAN classifiers. An important feature of the TBNL, shared with Bayesian networks in general, is its ability to handle missing data values.

The TBNL can be applied to various real-world problems. One example is a condition-based maintenance (CBM) framework. In that study, the TBNL models the interactions between the mechanical components of a freight rail wagon, along with maintenance-related variables, and their effects on failure prediction. Another application is suspect identification in cellular networks, and the model has also been applied to other real-world datasets.
