Table 1 The algorithm of DID-Class

From: Supervised classification with interdependent variables to support targeted energy efficiency measures in the residential sector

Input: training data set \( \left\{ {\left( {x_{1} , s_{1} , y_{1} } \right), \ldots ,\left( {x_{n} ,s_{n} ,y_{n} } \right)} \right\} \) and test examples \( \left\{ {\left( {x_{n + 1} , s_{n + 1} } \right), \ldots , \left( {x_{n + m} ,s_{n + m} } \right)} \right\} \)

Output: probability distributions \( \left\{ {P^{n + 1} , \ldots , P^{n + m} } \right\} \) over the classes

1. Show that a correlation exists between the independent and dependent measurements

2. Show that the independent measurements are not affected by the class labels

3. Compute the influence (\( f_{j} \)) of the independent measurements for each class label \( j \) by solving the generalized linear model:

4. \( x_{i} = \sum_{j \in J} d_{ij} f_{j} \left( s_{i} \right) + \varepsilon_{i} \)

5. For each \( i \in \{ 1, \ldots , n\} \) compute the normalized measurements:

6. \( x_{i}^{{\prime }} = x_{i} - f_{{y_{i} }} \left( {s_{i} } \right) + f_{{y_{i} }} \left( {s_{1} } \right) \)

7. Create the probabilistic classifier \( C \) with the training set \( \left\{ {\left( {x_{1}^{{\prime }} ,y_{1} } \right), \ldots ,\left( {x_{n}^{{\prime }} ,y_{n} } \right)} \right\} \)

8. For each \( i \in \{ n + 1, \ldots ,n + m\} \):

9. For each \( j \in J \) compute the normalized measurements of the unlabelled example, treating it as if it belonged to class \( j \):

10. \( x_{i}^{{j^{{\prime }} }} = x_{i} - f_{j} \left( {s_{i} } \right) + f_{j} \left( {s_{1} } \right) \)

11. Apply the classifier \( C \) to the normalized measurements

12. For all \( k \in J \), let \( p_{jk}^{i} := \) probability of \( x_{i}^{j^{\prime}} \) belonging to class \( k \)

13. Let \( P_{l}^{i} = \frac{\sum_{k} p_{kl}^{i}}{a} \) be the probability of \( x_{i} \) belonging to class \( l \)

14. Return \( \left\{ {P^{n + 1} , \ldots , P^{n + m} } \right\} \)
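
The steps above can be sketched in Python as follows. This is an illustrative sketch under several assumptions that the table does not state: \( x_{i} \) and \( s_{i} \) are scalars, each \( f_{j} \) is a straight line fitted by least squares on the examples of class \( j \) (which treats \( d_{ij} \) as a class indicator), the classifier \( C \) is a simple one-dimensional Gaussian model, and \( a \) is taken to be the constant that makes each \( P^{i} \) sum to 1. Steps 1 and 2 are precondition checks and are omitted. All function and class names are hypothetical.

```python
import numpy as np


def fit_influence(x, s, y, classes):
    """Steps 3-4: per-class fit x ~ a_j + b_j * s (assumed linear form of f_j)."""
    coef = {}
    for j in classes:
        mask = y == j
        slope, intercept = np.polyfit(s[mask], x[mask], deg=1)
        coef[j] = (intercept, slope)
    return coef


def f(coef, j, s):
    """Evaluate the fitted influence function f_j at s."""
    intercept, slope = coef[j]
    return intercept + slope * s


def normalize(x, s, j, coef, s_ref):
    """Steps 6 and 10: x' = x - f_j(s) + f_j(s_ref), with s_ref = s_1."""
    return x - f(coef, j, s) + f(coef, j, s_ref)


class GaussianClassifier:
    """Step 7 stand-in (assumption): 1-D Gaussian per class, with class priors."""

    def fit(self, x, y, classes):
        self.mu = np.array([x[y == k].mean() for k in classes])
        self.sd = np.array([x[y == k].std() + 1e-9 for k in classes])
        self.prior = np.array([(y == k).mean() for k in classes])
        return self

    def predict_proba(self, x):
        x = np.atleast_1d(x)[:, None]
        log_post = (-0.5 * ((x - self.mu) / self.sd) ** 2
                    - np.log(self.sd) + np.log(self.prior))
        log_post -= log_post.max(axis=1, keepdims=True)  # numerical stability
        post = np.exp(log_post)
        return post / post.sum(axis=1, keepdims=True)


def did_class(x_train, s_train, y_train, x_test, s_test):
    classes = np.unique(y_train)
    s_ref = s_train[0]  # s_1 in the table
    coef = fit_influence(x_train, s_train, y_train, classes)

    # Steps 5-6: normalize each training example with its own class's f.
    x_norm = np.array([normalize(x, s, y, coef, s_ref)
                       for x, s, y in zip(x_train, s_train, y_train)])
    clf = GaussianClassifier().fit(x_norm, y_train, classes)  # step 7

    P = np.zeros((len(x_test), len(classes)))
    for i, (x, s) in enumerate(zip(x_test, s_test)):  # step 8
        # Steps 9-12: row j holds the class probabilities of the example
        # normalized as if it belonged to class j.
        p = np.vstack([clf.predict_proba(normalize(x, s, j, coef, s_ref))[0]
                       for j in classes])
        # Step 13: sum over the hypothesized classes, then divide by a
        # (here the total, so that the row sums to 1; an assumption).
        P[i] = p.sum(axis=0) / p.sum()
    return classes, P  # step 14


# Toy data (hypothetical): two classes with different dependence on s.
s_half = np.linspace(0.0, 10.0, 20)
noise = 0.1 * np.sin(np.arange(20))
s_train = np.concatenate([s_half, s_half])
y_train = np.array([0] * 20 + [1] * 20)
x_train = np.concatenate([1 + 0.5 * s_half + noise, 4 + 1.5 * s_half + noise])
x_test = np.array([5.0, 16.0])  # generated as class 0 and class 1 at s = 8
s_test = np.array([8.0, 8.0])

classes, P = did_class(x_train, s_train, y_train, x_test, s_test)
print(classes, P.round(3))
```

Because every row of the per-hypothesis probability matrix sums to 1, dividing by the total is the same as dividing by the number of classes in this sketch. Any probabilistic classifier could replace the Gaussian stand-in, as long as it returns one probability per class.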