Create and evaluate handcrafted classification rules in decision-rules

In this tutorial we will manually create decision rules and evaluate them.

We begin by loading the iris dataset into a DataFrame.

[1]:
import pandas as pd
IRIS_PATH = 'resources/iris.csv'
iris_df = pd.read_csv(IRIS_PATH)
display(iris_df)
print('Columns: ', iris_df.columns.values)
print('Class names:', iris_df['class'].unique())
sepallength sepalwidth petallength petalwidth class
0 5.1 3.5 1.4 0.2 Iris-setosa
1 4.9 3.0 1.4 0.2 Iris-setosa
2 4.7 3.2 1.3 0.2 Iris-setosa
3 4.6 3.1 1.5 0.2 Iris-setosa
4 5.0 3.6 1.4 0.2 Iris-setosa
... ... ... ... ... ...
145 6.7 3.0 5.2 2.3 Iris-virginica
146 6.3 2.5 5.0 1.9 Iris-virginica
147 6.5 3.0 5.2 2.0 Iris-virginica
148 6.2 3.4 5.4 2.3 Iris-virginica
149 5.9 3.0 5.1 1.8 Iris-virginica

150 rows × 5 columns

Columns:  ['sepallength' 'sepalwidth' 'petallength' 'petalwidth' 'class']
Class names: ['Iris-setosa' 'Iris-versicolor' 'Iris-virginica']

The task is to predict the class of an example (Iris-setosa, Iris-versicolor or Iris-virginica) using the values in the other columns (‘sepallength’, ‘sepalwidth’, ‘petallength’, ‘petalwidth’). We will store the predictors in the X variable and the target in y.

[2]:
X = iris_df.drop(columns=['class'])
y = iris_df['class']

Someone suggested the following simple rules:

  1. If \(petallength < 2.5\), then it’s Iris-setosa.

  2. If \(petallength \geq 2.5\) and \(petalwidth < 1.65\), then it’s Iris-versicolor.

  3. Otherwise, it’s Iris-virginica.
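
Before encoding these rules with decision_rules, their logic can be sketched in plain pandas/NumPy as a cross-check of what the rule set should predict (a hand-rolled sketch, not the library API; `predict_by_hand` is a hypothetical helper):

```python
import numpy as np
import pandas as pd

def predict_by_hand(df: pd.DataFrame) -> np.ndarray:
    """Apply the three handcrafted rules in order."""
    return np.select(
        [
            df['petallength'] < 2.5,                                 # rule 1
            (df['petallength'] >= 2.5) & (df['petalwidth'] < 1.65),  # rule 2
        ],
        ['Iris-setosa', 'Iris-versicolor'],
        default='Iris-virginica',                                    # rule 3
    )

# A few representative rows as a quick smoke test
sample = pd.DataFrame({
    'petallength': [1.4, 4.7, 6.0],
    'petalwidth':  [0.2, 1.4, 2.5],
})
print(predict_by_hand(sample))
```

The first row triggers rule 1, the second rule 2, and the third falls through to the default.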

Let’s implement them, starting with the first one.

[3]:
from decision_rules.classification.rule import ClassificationConclusion
from decision_rules.classification.rule import ClassificationRule
from decision_rules.conditions import ElementaryCondition, CompoundCondition

rule_1 = ClassificationRule(
    premise=CompoundCondition(
        subconditions=[
            ElementaryCondition(
                column_index=X.columns.get_loc('petallength'),
                right = 2.5
            )
        ]
    ),
    conclusion=ClassificationConclusion(
        value='Iris-setosa',
        column_name='class',
    ),
    column_names=X.columns,
)

We use the ClassificationRule class to create the rule. Every rule has two parts: a premise (e.g. \(petallength < 2.5\)) and a conclusion (e.g. Iris-setosa).

You can create a premise using the conditions from decision_rules.conditions:

  • NominalCondition: checks if a value of the attribute is equal to a value, e.g. \(x = 1\). Useful for nominal attributes.

  • ElementaryCondition: checks if a value is inside an interval, e.g. \(x \in [2.3, 3.1)\).

  • CompoundCondition: a conjunction or alternative of ElementaryConditions, e.g. \(x \in [2.3, 3.1)\) and \(y \in (8, +\infty)\).

  • AttributesCondition: checks if a relationship between two attributes is met, e.g. \(x < y\).

In the case of this rule, the premise \(petallength < 2.5\) can be written as \(petallength \in (-\infty, 2.5)\), so we can use an ElementaryCondition. The column_index argument is the index of the column with the relevant attribute (petallength). The left and right arguments are the boundaries of the interval; their default values are minus and plus infinity respectively. The interval is open by default; we can change that by setting the left_closed or right_closed argument to True.
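
The interval semantics can be illustrated with a small hand-rolled helper (hypothetical, not part of the library, mirroring the arguments described above):

```python
import math

def in_interval(x, left=-math.inf, right=math.inf,
                left_closed=False, right_closed=False):
    """Membership test mirroring ElementaryCondition's interval arguments."""
    lower_ok = x >= left if left_closed else x > left
    upper_ok = x <= right if right_closed else x < right
    return lower_ok and upper_ok

# petallength < 2.5 is membership in the interval (-inf, 2.5)
print(in_interval(1.4, right=2.5))   # True
print(in_interval(2.5, right=2.5))   # False: the interval is open on the right
```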

Currently, the premise needs to be a CompoundCondition, so we put that ElementaryCondition as a subcondition inside a CompoundCondition. CompoundCondition has one required argument named subconditions, which must be a list of ElementaryCondition objects.

The conclusion is a ClassificationConclusion object. It accepts two arguments: the predicted value (Iris-setosa) and the column name (class).

Now, let’s create rule 2. The premise is a conjunction (two parts joined by “and”), so we will again use a CompoundCondition. This time the subconditions list will contain two elementary conditions.

[4]:
rule_2 = ClassificationRule(
    premise=CompoundCondition(
        subconditions=[
            ElementaryCondition(
                column_index=X.columns.get_loc('petallength'),
                left = 2.5,
                left_closed=True,
            ),
            ElementaryCondition(
                column_index=X.columns.get_loc('petalwidth'),
                right = 1.65,
            ),
        ]
    ),
    conclusion=ClassificationConclusion(
        value='Iris-versicolor',
        column_name='class',
    ),
    column_names=X.columns,
)

Now that we have all the rules, we can create the rule set. To do this, we use the ClassificationRuleSet class. It has one mandatory argument, rules, which is a list of ClassificationRule objects.

[5]:
from decision_rules.classification.ruleset import ClassificationRuleSet

ruleset = ClassificationRuleSet(rules=[rule_1, rule_2])

We still need to add the third rule (otherwise, it’s Iris-virginica). We implement such a rule using the default_conclusion property of the ruleset.

[6]:
ruleset.default_conclusion = ClassificationConclusion(
    value='Iris-virginica',
    column_name='class',
)

Now the rule set is almost ready. After defining the rules, you should call the update function. It accepts 3 arguments:

  • a DataFrame of predictors (X)

  • a Series of target values (y)

  • a measure of rule quality

This function also calculates the coverage matrix, which shows which rules cover each row.
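
Conceptually, the coverage matrix is just each rule’s boolean mask stacked column-wise. A plain-pandas sketch on a few toy rows (not the library’s internals):

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    'petallength': [1.4, 4.7, 6.0, 5.0],
    'petalwidth':  [0.2, 1.4, 2.5, 1.5],
})

rule_masks = [
    (toy['petallength'] < 2.5).to_numpy(),                                  # rule 1
    ((toy['petallength'] >= 2.5) & (toy['petalwidth'] < 1.65)).to_numpy(),  # rule 2
]
coverage = np.column_stack(rule_masks)   # one column per rule, one row per example
print(coverage)
```

A row of all False (here the third row) means no rule fires, so the default conclusion applies.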

[7]:
from decision_rules.measures import accuracy
coverage_matrix = ruleset.update(X, y, accuracy)
display(coverage_matrix)
array([[ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False, False],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False, False],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False,  True],
       [False, False],
       [False, False],
       [False, False],
       [False, False],
       [False, False],
       [False, False],
       [False, False],
       [False, False],
       [False, False],
       [False, False],
       [False, False],
       [False, False],
       [False, False],
       [False, False],
       [False, False],
       [False, False],
       [False, False],
       [False, False],
       [False, False],
       [False,  True],
       [False, False],
       [False, False],
       [False, False],
       [False, False],
       [False, False],
       [False, False],
       [False, False],
       [False, False],
       [False, False],
       [False,  True],
       [False, False],
       [False, False],
       [False, False],
       [False,  True],
       [False,  True],
       [False, False],
       [False, False],
       [False, False],
       [False, False],
       [False, False],
       [False, False],
       [False, False],
       [False, False],
       [False, False],
       [False, False],
       [False, False],
       [False, False],
       [False, False],
       [False, False],
       [False, False]])

The ruleset is now ready. Let’s see what it predicts for the data stored in the X variable.

[8]:
# The predictions for each row in X will be stored in the y_pred array.
y_pred = ruleset.predict(X)
# The compare_df will show us the true class and the prediction for each example.
compare_df = iris_df.copy()
compare_df['predictions'] = y_pred
with pd.option_context('display.max_rows', 150):
    display(compare_df)
sepallength sepalwidth petallength petalwidth class predictions
0 5.1 3.5 1.4 0.2 Iris-setosa Iris-setosa
1 4.9 3.0 1.4 0.2 Iris-setosa Iris-setosa
2 4.7 3.2 1.3 0.2 Iris-setosa Iris-setosa
3 4.6 3.1 1.5 0.2 Iris-setosa Iris-setosa
4 5.0 3.6 1.4 0.2 Iris-setosa Iris-setosa
5 5.4 3.9 1.7 0.4 Iris-setosa Iris-setosa
6 4.6 3.4 1.4 0.3 Iris-setosa Iris-setosa
7 5.0 3.4 1.5 0.2 Iris-setosa Iris-setosa
8 4.4 2.9 1.4 0.2 Iris-setosa Iris-setosa
9 4.9 3.1 1.5 0.1 Iris-setosa Iris-setosa
10 5.4 3.7 1.5 0.2 Iris-setosa Iris-setosa
11 4.8 3.4 1.6 0.2 Iris-setosa Iris-setosa
12 4.8 3.0 1.4 0.1 Iris-setosa Iris-setosa
13 4.3 3.0 1.1 0.1 Iris-setosa Iris-setosa
14 5.8 4.0 1.2 0.2 Iris-setosa Iris-setosa
15 5.7 4.4 1.5 0.4 Iris-setosa Iris-setosa
16 5.4 3.9 1.3 0.4 Iris-setosa Iris-setosa
17 5.1 3.5 1.4 0.3 Iris-setosa Iris-setosa
18 5.7 3.8 1.7 0.3 Iris-setosa Iris-setosa
19 5.1 3.8 1.5 0.3 Iris-setosa Iris-setosa
20 5.4 3.4 1.7 0.2 Iris-setosa Iris-setosa
21 5.1 3.7 1.5 0.4 Iris-setosa Iris-setosa
22 4.6 3.6 1.0 0.2 Iris-setosa Iris-setosa
23 5.1 3.3 1.7 0.5 Iris-setosa Iris-setosa
24 4.8 3.4 1.9 0.2 Iris-setosa Iris-setosa
25 5.0 3.0 1.6 0.2 Iris-setosa Iris-setosa
26 5.0 3.4 1.6 0.4 Iris-setosa Iris-setosa
27 5.2 3.5 1.5 0.2 Iris-setosa Iris-setosa
28 5.2 3.4 1.4 0.2 Iris-setosa Iris-setosa
29 4.7 3.2 1.6 0.2 Iris-setosa Iris-setosa
30 4.8 3.1 1.6 0.2 Iris-setosa Iris-setosa
31 5.4 3.4 1.5 0.4 Iris-setosa Iris-setosa
32 5.2 4.1 1.5 0.1 Iris-setosa Iris-setosa
33 5.5 4.2 1.4 0.2 Iris-setosa Iris-setosa
34 4.9 3.1 1.5 0.1 Iris-setosa Iris-setosa
35 5.0 3.2 1.2 0.2 Iris-setosa Iris-setosa
36 5.5 3.5 1.3 0.2 Iris-setosa Iris-setosa
37 4.9 3.1 1.5 0.1 Iris-setosa Iris-setosa
38 4.4 3.0 1.3 0.2 Iris-setosa Iris-setosa
39 5.1 3.4 1.5 0.2 Iris-setosa Iris-setosa
40 5.0 3.5 1.3 0.3 Iris-setosa Iris-setosa
41 4.5 2.3 1.3 0.3 Iris-setosa Iris-setosa
42 4.4 3.2 1.3 0.2 Iris-setosa Iris-setosa
43 5.0 3.5 1.6 0.6 Iris-setosa Iris-setosa
44 5.1 3.8 1.9 0.4 Iris-setosa Iris-setosa
45 4.8 3.0 1.4 0.3 Iris-setosa Iris-setosa
46 5.1 3.8 1.6 0.2 Iris-setosa Iris-setosa
47 4.6 3.2 1.4 0.2 Iris-setosa Iris-setosa
48 5.3 3.7 1.5 0.2 Iris-setosa Iris-setosa
49 5.0 3.3 1.4 0.2 Iris-setosa Iris-setosa
50 7.0 3.2 4.7 1.4 Iris-versicolor Iris-versicolor
51 6.4 3.2 4.5 1.5 Iris-versicolor Iris-versicolor
52 6.9 3.1 4.9 1.5 Iris-versicolor Iris-versicolor
53 5.5 2.3 4.0 1.3 Iris-versicolor Iris-versicolor
54 6.5 2.8 4.6 1.5 Iris-versicolor Iris-versicolor
55 5.7 2.8 4.5 1.3 Iris-versicolor Iris-versicolor
56 6.3 3.3 4.7 1.6 Iris-versicolor Iris-versicolor
57 4.9 2.4 3.3 1.0 Iris-versicolor Iris-versicolor
58 6.6 2.9 4.6 1.3 Iris-versicolor Iris-versicolor
59 5.2 2.7 3.9 1.4 Iris-versicolor Iris-versicolor
60 5.0 2.0 3.5 1.0 Iris-versicolor Iris-versicolor
61 5.9 3.0 4.2 1.5 Iris-versicolor Iris-versicolor
62 6.0 2.2 4.0 1.0 Iris-versicolor Iris-versicolor
63 6.1 2.9 4.7 1.4 Iris-versicolor Iris-versicolor
64 5.6 2.9 3.6 1.3 Iris-versicolor Iris-versicolor
65 6.7 3.1 4.4 1.4 Iris-versicolor Iris-versicolor
66 5.6 3.0 4.5 1.5 Iris-versicolor Iris-versicolor
67 5.8 2.7 4.1 1.0 Iris-versicolor Iris-versicolor
68 6.2 2.2 4.5 1.5 Iris-versicolor Iris-versicolor
69 5.6 2.5 3.9 1.1 Iris-versicolor Iris-versicolor
70 5.9 3.2 4.8 1.8 Iris-versicolor Iris-virginica
71 6.1 2.8 4.0 1.3 Iris-versicolor Iris-versicolor
72 6.3 2.5 4.9 1.5 Iris-versicolor Iris-versicolor
73 6.1 2.8 4.7 1.2 Iris-versicolor Iris-versicolor
74 6.4 2.9 4.3 1.3 Iris-versicolor Iris-versicolor
75 6.6 3.0 4.4 1.4 Iris-versicolor Iris-versicolor
76 6.8 2.8 4.8 1.4 Iris-versicolor Iris-versicolor
77 6.7 3.0 5.0 1.7 Iris-versicolor Iris-virginica
78 6.0 2.9 4.5 1.5 Iris-versicolor Iris-versicolor
79 5.7 2.6 3.5 1.0 Iris-versicolor Iris-versicolor
80 5.5 2.4 3.8 1.1 Iris-versicolor Iris-versicolor
81 5.5 2.4 3.7 1.0 Iris-versicolor Iris-versicolor
82 5.8 2.7 3.9 1.2 Iris-versicolor Iris-versicolor
83 6.0 2.7 5.1 1.6 Iris-versicolor Iris-versicolor
84 5.4 3.0 4.5 1.5 Iris-versicolor Iris-versicolor
85 6.0 3.4 4.5 1.6 Iris-versicolor Iris-versicolor
86 6.7 3.1 4.7 1.5 Iris-versicolor Iris-versicolor
87 6.3 2.3 4.4 1.3 Iris-versicolor Iris-versicolor
88 5.6 3.0 4.1 1.3 Iris-versicolor Iris-versicolor
89 5.5 2.5 4.0 1.3 Iris-versicolor Iris-versicolor
90 5.5 2.6 4.4 1.2 Iris-versicolor Iris-versicolor
91 6.1 3.0 4.6 1.4 Iris-versicolor Iris-versicolor
92 5.8 2.6 4.0 1.2 Iris-versicolor Iris-versicolor
93 5.0 2.3 3.3 1.0 Iris-versicolor Iris-versicolor
94 5.6 2.7 4.2 1.3 Iris-versicolor Iris-versicolor
95 5.7 3.0 4.2 1.2 Iris-versicolor Iris-versicolor
96 5.7 2.9 4.2 1.3 Iris-versicolor Iris-versicolor
97 6.2 2.9 4.3 1.3 Iris-versicolor Iris-versicolor
98 5.1 2.5 3.0 1.1 Iris-versicolor Iris-versicolor
99 5.7 2.8 4.1 1.3 Iris-versicolor Iris-versicolor
100 6.3 3.3 6.0 2.5 Iris-virginica Iris-virginica
101 5.8 2.7 5.1 1.9 Iris-virginica Iris-virginica
102 7.1 3.0 5.9 2.1 Iris-virginica Iris-virginica
103 6.3 2.9 5.6 1.8 Iris-virginica Iris-virginica
104 6.5 3.0 5.8 2.2 Iris-virginica Iris-virginica
105 7.6 3.0 6.6 2.1 Iris-virginica Iris-virginica
106 4.9 2.5 4.5 1.7 Iris-virginica Iris-virginica
107 7.3 2.9 6.3 1.8 Iris-virginica Iris-virginica
108 6.7 2.5 5.8 1.8 Iris-virginica Iris-virginica
109 7.2 3.6 6.1 2.5 Iris-virginica Iris-virginica
110 6.5 3.2 5.1 2.0 Iris-virginica Iris-virginica
111 6.4 2.7 5.3 1.9 Iris-virginica Iris-virginica
112 6.8 3.0 5.5 2.1 Iris-virginica Iris-virginica
113 5.7 2.5 5.0 2.0 Iris-virginica Iris-virginica
114 5.8 2.8 5.1 2.4 Iris-virginica Iris-virginica
115 6.4 3.2 5.3 2.3 Iris-virginica Iris-virginica
116 6.5 3.0 5.5 1.8 Iris-virginica Iris-virginica
117 7.7 3.8 6.7 2.2 Iris-virginica Iris-virginica
118 7.7 2.6 6.9 2.3 Iris-virginica Iris-virginica
119 6.0 2.2 5.0 1.5 Iris-virginica Iris-versicolor
120 6.9 3.2 5.7 2.3 Iris-virginica Iris-virginica
121 5.6 2.8 4.9 2.0 Iris-virginica Iris-virginica
122 7.7 2.8 6.7 2.0 Iris-virginica Iris-virginica
123 6.3 2.7 4.9 1.8 Iris-virginica Iris-virginica
124 6.7 3.3 5.7 2.1 Iris-virginica Iris-virginica
125 7.2 3.2 6.0 1.8 Iris-virginica Iris-virginica
126 6.2 2.8 4.8 1.8 Iris-virginica Iris-virginica
127 6.1 3.0 4.9 1.8 Iris-virginica Iris-virginica
128 6.4 2.8 5.6 2.1 Iris-virginica Iris-virginica
129 7.2 3.0 5.8 1.6 Iris-virginica Iris-versicolor
130 7.4 2.8 6.1 1.9 Iris-virginica Iris-virginica
131 7.9 3.8 6.4 2.0 Iris-virginica Iris-virginica
132 6.4 2.8 5.6 2.2 Iris-virginica Iris-virginica
133 6.3 2.8 5.1 1.5 Iris-virginica Iris-versicolor
134 6.1 2.6 5.6 1.4 Iris-virginica Iris-versicolor
135 7.7 3.0 6.1 2.3 Iris-virginica Iris-virginica
136 6.3 3.4 5.6 2.4 Iris-virginica Iris-virginica
137 6.4 3.1 5.5 1.8 Iris-virginica Iris-virginica
138 6.0 3.0 4.8 1.8 Iris-virginica Iris-virginica
139 6.9 3.1 5.4 2.1 Iris-virginica Iris-virginica
140 6.7 3.1 5.6 2.4 Iris-virginica Iris-virginica
141 6.9 3.1 5.1 2.3 Iris-virginica Iris-virginica
142 5.8 2.7 5.1 1.9 Iris-virginica Iris-virginica
143 6.8 3.2 5.9 2.3 Iris-virginica Iris-virginica
144 6.7 3.3 5.7 2.5 Iris-virginica Iris-virginica
145 6.7 3.0 5.2 2.3 Iris-virginica Iris-virginica
146 6.3 2.5 5.0 1.9 Iris-virginica Iris-virginica
147 6.5 3.0 5.2 2.0 Iris-virginica Iris-virginica
148 6.2 3.4 5.4 2.3 Iris-virginica Iris-virginica
149 5.9 3.0 5.1 1.8 Iris-virginica Iris-virginica

Now let’s check how well our rules perform on the data using various metrics. The calculate_for_classification function computes typical classification metrics, such as accuracy, F1, or Cohen’s kappa.

[9]:
from decision_rules.classification.prediction_indicators import calculate_for_classification
metrics = calculate_for_classification(y, y_pred)
display(metrics)
{'type_of_problem': 'classification',
 'general': {'Balanced_accuracy': 0.96,
  'Accuracy': 0.96,
  'Cohen_kappa': 0.94,
  'F1_micro': 0.96,
  'F1_macro': 0.9599839935974389,
  'F1_weighted': 0.9599839935974391,
  'G_mean_micro': 0.9699484522385713,
  'G_mean_macro': 0.9699484522385713,
  'G_mean_weighted': 0.9699484522385713,
  'Recall_micro': 0.96,
  'Recall_macro': 0.96,
  'Recall_weighted': 0.96,
  'Specificity': 1.0,
  'Confusion_matrix': {'classes': ['Iris-setosa',
    'Iris-versicolor',
    'Iris-virginica'],
   'Iris-setosa': [50, 0, 0],
   'Iris-versicolor': [0, 48, 2],
   'Iris-virginica': [0, 4, 46]}},
 'for_classes': {'Iris-setosa': {'TP': 50,
   'FP': 0,
   'TN': 100,
   'FN': 0,
   'Recall': 1.0,
   'Specificity': 1.0,
   'F1_score': 1.0,
   'G_mean': 1.0,
   'MCC': 1.0,
   'PPV': 1.0,
   'NPV': 1.0,
   'LR_plus': 0,
   'LR_minus': 0.0,
   'Odd_ratio': 0,
   'Relative_risk': 0,
   'Confusion_matrix': {'classes': ['Iris-setosa', 'other'],
    'Iris-setosa': [50, 0],
    'other': [0, 100]}},
  'Iris-versicolor': {'TP': 48,
   'FP': 4,
   'TN': 96,
   'FN': 2,
   'Recall': 0.96,
   'Specificity': 0.96,
   'F1_score': 0.9411764705882353,
   'G_mean': 0.96,
   'MCC': 0.9112931795128765,
   'PPV': 0.9230769230769231,
   'NPV': 0.9795918367346939,
   'LR_plus': 23.99999999999998,
   'LR_minus': 0.041666666666666706,
   'Odd_ratio': 576.0,
   'Relative_risk': 45.23076923076923,
   'Confusion_matrix': {'classes': ['Iris-versicolor', 'other'],
    'Iris-versicolor': [48, 2],
    'other': [4, 96]}},
  'Iris-virginica': {'TP': 46,
   'FP': 2,
   'TN': 98,
   'FN': 4,
   'Recall': 0.92,
   'Specificity': 0.98,
   'F1_score': 0.9387755102040817,
   'G_mean': 0.9495261976375375,
   'MCC': 0.9095085938862487,
   'PPV': 0.9583333333333334,
   'NPV': 0.9607843137254902,
   'LR_plus': 45.999999999999964,
   'LR_minus': 0.08163265306122446,
   'Odd_ratio': 563.5,
   'Relative_risk': 24.4375,
   'Confusion_matrix': {'classes': ['Iris-virginica', 'other'],
    'Iris-virginica': [46, 4],
    'other': [2, 98]}}}}
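
A few of the per-class numbers above can be re-derived by hand from the confusion-matrix counts, which makes a useful sanity check:

```python
# Counts for Iris-versicolor reported above
tp, fp, tn, fn = 48, 4, 96, 2

recall = tp / (tp + fn)                           # 48/50
precision = tp / (tp + fp)                        # PPV, 48/52
specificity = tn / (tn + fp)                      # 96/100
f1 = 2 * precision * recall / (precision + recall)
accuracy = (50 + 48 + 46) / 150                   # diagonal of the 3-class matrix

print(round(recall, 2), round(precision, 4), round(f1, 4), accuracy)
```

These reproduce the Recall, PPV, Specificity, F1_score, and overall Accuracy values printed above.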

The calculate_rules_metrics function of a ruleset object computes metrics describing each rule in the rule set. A full description of the metrics can be found in the ruleminer documentation.

[10]:
metrics = ruleset.calculate_rules_metrics(X, y)
for rule_id, metrics in metrics.items():
    print('Rule', rule_id)
    print(metrics)
Rule fd0d78eb-aea8-4ab0-9704-6999d87573c2
{'p': 50, 'n': 0, 'P': 50, 'N': 100, 'p_unique': 50, 'n_unique': 50, 'support': 50, 'conditions_count': 1, 'precision': 1.0, 'coverage': 1.0, 'C2': 1.0, 'RSS': 1.0, 'correlation': 1.0, 'lift': 3.0, 'p_value': 4.968040370318492e-41, 'TP': 50, 'FP': 0, 'TN': 100, 'FN': 0, 'sensitivity': 1.0, 'specificity': 1.0, 'negative_predictive_value': 1.0, 'odds_ratio': inf, 'relative_risk': inf, 'lr+': inf, 'lr-': 0.0}
Rule 1cd50058-f754-4d69-ab42-5690bdccfa1b
{'p': 48, 'n': 4, 'P': 50, 'N': 100, 'p_unique': 48, 'n_unique': 48, 'support': 52, 'conditions_count': 2, 'precision': 0.9230769230769231, 'coverage': 0.96, 'C2': 0.8669230769230769, 'RSS': 0.9199999999999999, 'correlation': 0.9112931795128765, 'lift': 2.769230769230769, 'p_value': 6.403421751602081e-32, 'TP': 48, 'FP': 4, 'TN': 96, 'FN': 2, 'sensitivity': 0.96, 'specificity': 0.96, 'negative_predictive_value': 0.9795918367346939, 'odds_ratio': 576.0, 'relative_risk': 45.23076923076923, 'lr+': 23.99999999999998, 'lr-': 0.041666666666666706}
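
Several per-rule metrics follow directly from the coverage counts (p covered positives, n covered negatives, P and N class totals); the second rule’s values can be cross-checked by hand:

```python
p, n, P, N = 48, 4, 50, 100    # rule 2's coverage counts from the output above

precision = p / (p + n)        # fraction of covered examples with the rule's class
coverage = p / P               # fraction of the positive class the rule covers
lift = precision / (P / (P + N))
print(precision, coverage, lift)
```

The results match the precision, coverage, and lift printed for the second rule.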

The calculate_ruleset_stats function returns some general statistics regarding the rules present in the rule set.

[11]:
general_stats = ruleset.calculate_ruleset_stats()
print(general_stats)
{'rules_count': 2, 'avg_conditions_count': 1.5, 'avg_precision': 0.96, 'avg_coverage': 0.98, 'total_conditions_count': 3}
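
These statistics are simple aggregates of the per-rule values; a quick hand check using the counts printed by calculate_rules_metrics:

```python
# Per-rule values from the calculate_rules_metrics output above
conditions = [1, 2]
coverages = [1.0, 0.96]

avg_conditions = sum(conditions) / len(conditions)   # 1.5
total_conditions = sum(conditions)                   # 3
avg_coverage = sum(coverages) / len(coverages)       # 0.98
print(avg_conditions, total_conditions, avg_coverage)
```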

We can serialize the ruleset into a dict using JSONSerializer.serialize. The dict can later be stored in a string or in a text file.

[12]:
import os
import json
from decision_rules.serialization import JSONSerializer

OUTPUT_DIR = 'output'
RULESET_FILENAME = 'manual_iris.json'
os.makedirs(OUTPUT_DIR, exist_ok=True)
ruleset_path = os.path.join(OUTPUT_DIR, RULESET_FILENAME)
# Serialize the ruleset
ruleset_dict = JSONSerializer.serialize(ruleset)
display(ruleset_dict)
# Save to JSON
with open(ruleset_path, 'w') as fp:
    json.dump(ruleset_dict, fp)
{'meta': {'attributes': ['sepallength',
   'sepalwidth',
   'petallength',
   'petalwidth'],
  'decision_attribute': 'class',
  'decision_attribute_distribution': {'Iris-setosa': 50,
   'Iris-versicolor': 50,
   'Iris-virginica': 50}},
 'rules': [{'uuid': 'fd0d78eb-aea8-4ab0-9704-6999d87573c2',
   'string': 'IF petallength < 2.50 THEN class = Iris-setosa',
   'premise': {'type': 'compound',
    'operator': 'CONJUNCTION',
    'subconditions': [{'type': 'elementary_numerical',
      'attributes': [2],
      'negated': False,
      'left': None,
      'right': 2.5,
      'left_closed': False,
      'right_closed': False}]},
   'conclusion': {'value': 'Iris-setosa'},
   'coverage': {'p': 50, 'n': 0, 'P': 50, 'N': 100}},
  {'uuid': '1cd50058-f754-4d69-ab42-5690bdccfa1b',
   'string': 'IF petallength >= 2.50 AND petalwidth < 1.65 THEN class = Iris-versicolor',
   'premise': {'type': 'compound',
    'operator': 'CONJUNCTION',
    'subconditions': [{'type': 'elementary_numerical',
      'attributes': [2],
      'negated': False,
      'left': 2.5,
      'right': None,
      'left_closed': True,
      'right_closed': False},
     {'type': 'elementary_numerical',
      'attributes': [3],
      'negated': False,
      'left': None,
      'right': 1.65,
      'left_closed': False,
      'right_closed': False}]},
   'conclusion': {'value': 'Iris-versicolor'},
   'coverage': {'p': 48, 'n': 4, 'P': 50, 'N': 100}}]}

The ruleset can be loaded back using JSONSerializer.deserialize.

[13]:
with open(ruleset_path) as fp:
    reloaded_json = json.load(fp)
reloaded_ruleset = JSONSerializer.deserialize(reloaded_json, ClassificationRuleSet)
assert reloaded_ruleset == ruleset

Before using some of the functions of the deserialized ruleset, it may be necessary to call the update function. After that, the object is ready for prediction.

[14]:
reloaded_ruleset.update(X, y, accuracy)
y_pred = reloaded_ruleset.predict(X)
display(y_pred)
array(['Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',
       'Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',
       'Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',
       'Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',
       'Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',
       'Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',
       'Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',
       'Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',
       'Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',
       'Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',
       'Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',
       'Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',
       'Iris-setosa', 'Iris-setosa', 'Iris-versicolor', 'Iris-versicolor',
       'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',
       'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',
       'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',
       'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',
       'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',
       'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',
       'Iris-virginica', 'Iris-versicolor', 'Iris-versicolor',
       'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',
       'Iris-versicolor', 'Iris-virginica', 'Iris-versicolor',
       'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',
       'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',
       'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',
       'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',
       'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',
       'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',
       'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',
       'Iris-virginica', 'Iris-virginica', 'Iris-virginica',
       'Iris-virginica', 'Iris-virginica', 'Iris-virginica',
       'Iris-virginica', 'Iris-virginica', 'Iris-virginica',
       'Iris-virginica', 'Iris-virginica', 'Iris-virginica',
       'Iris-virginica', 'Iris-virginica', 'Iris-virginica',
       'Iris-virginica', 'Iris-virginica', 'Iris-virginica',
       'Iris-virginica', 'Iris-versicolor', 'Iris-virginica',
       'Iris-virginica', 'Iris-virginica', 'Iris-virginica',
       'Iris-virginica', 'Iris-virginica', 'Iris-virginica',
       'Iris-virginica', 'Iris-virginica', 'Iris-versicolor',
       'Iris-virginica', 'Iris-virginica', 'Iris-virginica',
       'Iris-versicolor', 'Iris-versicolor', 'Iris-virginica',
       'Iris-virginica', 'Iris-virginica', 'Iris-virginica',
       'Iris-virginica', 'Iris-virginica', 'Iris-virginica',
       'Iris-virginica', 'Iris-virginica', 'Iris-virginica',
       'Iris-virginica', 'Iris-virginica', 'Iris-virginica',
       'Iris-virginica', 'Iris-virginica'], dtype='<U15')