Create and evaluate handcrafted classification rules in decision-rules
In this tutorial we will manually create decision rules and evaluate them.
We begin by loading the iris dataset into a DataFrame.
[1]:
import pandas as pd
IRIS_PATH = 'resources/iris.csv'
iris_df = pd.read_csv(IRIS_PATH)
display(iris_df)
print('Columns: ', iris_df.columns.values)
print('Class names:', iris_df['class'].unique())
|   | sepallength | sepalwidth | petallength | petalwidth | class |
|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | Iris-setosa |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | Iris-setosa |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | Iris-setosa |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | Iris-setosa |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | Iris-setosa |
| ... | ... | ... | ... | ... | ... |
| 145 | 6.7 | 3.0 | 5.2 | 2.3 | Iris-virginica |
| 146 | 6.3 | 2.5 | 5.0 | 1.9 | Iris-virginica |
| 147 | 6.5 | 3.0 | 5.2 | 2.0 | Iris-virginica |
| 148 | 6.2 | 3.4 | 5.4 | 2.3 | Iris-virginica |
| 149 | 5.9 | 3.0 | 5.1 | 1.8 | Iris-virginica |
150 rows × 5 columns
Columns: ['sepallength' 'sepalwidth' 'petallength' 'petalwidth' 'class']
Class names: ['Iris-setosa' 'Iris-versicolor' 'Iris-virginica']
The task is to predict the class of an example (Iris-setosa, Iris-versicolor or Iris-virginica) using the values in the other columns (‘sepallength’, ‘sepalwidth’, ‘petallength’, ‘petalwidth’). We will store the predictors in the X variable and the target in y.
[2]:
X = iris_df.drop(columns=['class'])
y = iris_df['class']
Someone suggested the following simple rules:
If \(petallength < 2.5\), then it’s Iris-setosa.
If \(petallength \geq 2.5\) and \(petalwidth < 1.65\), then it’s Iris-versicolor.
Otherwise, it’s Iris-virginica.
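Before turning to the library, the three rules can be sketched as an ordinary Python function (a hypothetical helper, not part of decision_rules), applying the rules in order so the first match wins:

```python
def classify_iris(petallength: float, petalwidth: float) -> str:
    """Apply the three handcrafted rules in order; the first match wins."""
    if petallength < 2.5:
        return 'Iris-setosa'
    if petallength >= 2.5 and petalwidth < 1.65:
        return 'Iris-versicolor'
    # The "otherwise" case acts as a default conclusion.
    return 'Iris-virginica'

print(classify_iris(1.4, 0.2))   # Iris-setosa
print(classify_iris(4.5, 1.5))   # Iris-versicolor
print(classify_iris(5.8, 2.2))   # Iris-virginica
```

The rest of the tutorial expresses exactly this logic as rule objects, which additionally gives us coverage statistics, quality metrics, and serialization for free.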
Let’s implement them, starting with the first one.
[3]:
from decision_rules.classification.rule import ClassificationConclusion
from decision_rules.classification.rule import ClassificationRule
from decision_rules.conditions import ElementaryCondition, CompoundCondition
rule_1 = ClassificationRule(
premise=CompoundCondition(
subconditions=[
ElementaryCondition(
column_index=X.columns.get_loc('petallength'),
right = 2.5
)
]
),
conclusion=ClassificationConclusion(
value='Iris-setosa',
column_name='class',
),
column_names=X.columns,
)
We use the ClassificationRule class to create the rule. Every rule has two parts: a premise (e.g. \(petallength < 2.5\)) and a conclusion (e.g. Iris-setosa).
You can create a premise using the conditions from decision_rules.conditions:
- NominalCondition: checks whether the value of an attribute equals a given value, e.g. \(x = 1\). Useful for nominal attributes.
- ElementaryCondition: checks whether a value lies inside an interval, e.g. \(x \in [2.3, 3.1)\).
- CompoundCondition: a conjunction or disjunction of conditions, e.g. \(x \in [2.3, 3.1)\) and \(y \in (8, +\infty)\).
- AttributesCondition: checks whether a relationship between two attributes holds, e.g. \(x < y\).
In the case of this rule, the premise \(petallength < 2.5\) can be written as \(petallength \in (-\infty, 2.5)\), so we can use an ElementaryCondition. The column_index argument is the index of the column with the relevant attribute (petallength). The left and right arguments are the boundaries of the interval; their default values are minus and plus infinity, respectively. The interval is open by default; we can change that by setting the left_closed or right_closed argument to True.
Currently, the premise needs to be a CompoundCondition, so we put that ElementaryCondition as a subcondition inside a CompoundCondition. CompoundCondition has one required argument named subconditions which must be a list of ElementaryCondition objects.
The conclusion is a ClassificationConclusion object. It accepts two arguments: the predicted value (Iris-setosa) and the column name (class).
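The open/closed interval semantics described above can be illustrated with a small standalone sketch (an illustration only, not the library's actual implementation):

```python
import math

def in_interval(x, left=-math.inf, right=math.inf,
                left_closed=False, right_closed=False):
    """Check whether x lies in an interval with the given open/closed bounds."""
    lower_ok = x >= left if left_closed else x > left
    upper_ok = x <= right if right_closed else x < right
    return lower_ok and upper_ok

# petallength < 2.5 corresponds to the interval (-inf, 2.5)
print(in_interval(1.4, right=2.5))                    # True
print(in_interval(2.5, right=2.5))                    # False: right bound is open
# petallength >= 2.5 corresponds to [2.5, +inf)
print(in_interval(2.5, left=2.5, left_closed=True))   # True
```

Note that the boundary value 2.5 is covered by the second interval but not the first, so the two premises never overlap.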
Now, let’s create rule 2. The premise is a conjunction (two parts joined by “and”), so we will again use CompoundCondition. This time the subconditions list will contain two elementary conditions.
[4]:
rule_2 = ClassificationRule(
premise=CompoundCondition(
subconditions=[
ElementaryCondition(
column_index=X.columns.get_loc('petallength'),
left = 2.5,
left_closed=True,
),
ElementaryCondition(
column_index=X.columns.get_loc('petalwidth'),
right = 1.65,
),
]
),
conclusion=ClassificationConclusion(
value='Iris-versicolor',
column_name='class',
),
column_names=X.columns,
)
Now that we have all the rules, we can create the rule set. To do this, we use the ClassificationRuleSet class. It has one mandatory argument, rules, which is a list of ClassificationRule objects.
[5]:
from decision_rules.classification.ruleset import ClassificationRuleSet
ruleset = ClassificationRuleSet(rules=[rule_1, rule_2])
We still need to add the third rule (otherwise, it’s Iris-virginica). We implement it using the default_conclusion property of the ruleset.
[6]:
ruleset.default_conclusion = ClassificationConclusion(
value='Iris-virginica',
column_name='class',
)
Now the rule set is almost ready. After defining the rules, you should call the update function. It accepts three arguments:

- a DataFrame of predictors (X)
- a Series of target values (y)
- a measure of rule quality

This function also calculates the coverage matrix, which shows which rules cover each row.
[7]:
from decision_rules.measures import accuracy
coverage_matrix = ruleset.update(X, y, accuracy)
display(coverage_matrix)
array([[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[ True, False],
[False, True],
[False, True],
[False, True],
[False, True],
[False, True],
[False, True],
[False, True],
[False, True],
[False, True],
[False, True],
[False, True],
[False, True],
[False, True],
[False, True],
[False, True],
[False, True],
[False, True],
[False, True],
[False, True],
[False, True],
[False, False],
[False, True],
[False, True],
[False, True],
[False, True],
[False, True],
[False, True],
[False, False],
[False, True],
[False, True],
[False, True],
[False, True],
[False, True],
[False, True],
[False, True],
[False, True],
[False, True],
[False, True],
[False, True],
[False, True],
[False, True],
[False, True],
[False, True],
[False, True],
[False, True],
[False, True],
[False, True],
[False, True],
[False, True],
[False, True],
[False, False],
[False, False],
[False, False],
[False, False],
[False, False],
[False, False],
[False, False],
[False, False],
[False, False],
[False, False],
[False, False],
[False, False],
[False, False],
[False, False],
[False, False],
[False, False],
[False, False],
[False, False],
[False, False],
[False, True],
[False, False],
[False, False],
[False, False],
[False, False],
[False, False],
[False, False],
[False, False],
[False, False],
[False, False],
[False, True],
[False, False],
[False, False],
[False, False],
[False, True],
[False, True],
[False, False],
[False, False],
[False, False],
[False, False],
[False, False],
[False, False],
[False, False],
[False, False],
[False, False],
[False, False],
[False, False],
[False, False],
[False, False],
[False, False],
[False, False]])
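For intuition, a coverage matrix of this kind can be recomputed directly with pandas boolean masks, one column per rule premise. The sketch below uses a few made-up rows instead of the full dataset:

```python
import pandas as pd

# A few hypothetical rows with the same predictor columns as X
X_demo = pd.DataFrame({
    'petallength': [1.4, 4.5, 5.0, 6.0],
    'petalwidth':  [0.2, 1.5, 1.7, 2.5],
})

# One boolean column per rule: does the rule's premise cover the row?
covers_rule_1 = X_demo['petallength'] < 2.5
covers_rule_2 = (X_demo['petallength'] >= 2.5) & (X_demo['petalwidth'] < 1.65)

# Each row: [rule 1 covers?, rule 2 covers?]
coverage = pd.concat([covers_rule_1, covers_rule_2], axis=1).to_numpy()
print(coverage)
```

Rows covered by neither rule (both entries False) are the ones that fall through to the default conclusion, Iris-virginica.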
The ruleset is now ready. Let’s see what it predicts for the data stored in the X variable.
[8]:
# The predictions for each row in X will be stored in the y_pred array.
y_pred = ruleset.predict(X)
# The compare_df will show us the true class and the prediction for each example.
compare_df = iris_df.copy()
compare_df['predictions'] = y_pred
with pd.option_context('display.max_rows', 150):
display(compare_df)
|   | sepallength | sepalwidth | petallength | petalwidth | class | predictions |
|---|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | Iris-setosa | Iris-setosa |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | Iris-setosa | Iris-setosa |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | Iris-setosa | Iris-setosa |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | Iris-setosa | Iris-setosa |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | Iris-setosa | Iris-setosa |
| 5 | 5.4 | 3.9 | 1.7 | 0.4 | Iris-setosa | Iris-setosa |
| 6 | 4.6 | 3.4 | 1.4 | 0.3 | Iris-setosa | Iris-setosa |
| 7 | 5.0 | 3.4 | 1.5 | 0.2 | Iris-setosa | Iris-setosa |
| 8 | 4.4 | 2.9 | 1.4 | 0.2 | Iris-setosa | Iris-setosa |
| 9 | 4.9 | 3.1 | 1.5 | 0.1 | Iris-setosa | Iris-setosa |
| 10 | 5.4 | 3.7 | 1.5 | 0.2 | Iris-setosa | Iris-setosa |
| 11 | 4.8 | 3.4 | 1.6 | 0.2 | Iris-setosa | Iris-setosa |
| 12 | 4.8 | 3.0 | 1.4 | 0.1 | Iris-setosa | Iris-setosa |
| 13 | 4.3 | 3.0 | 1.1 | 0.1 | Iris-setosa | Iris-setosa |
| 14 | 5.8 | 4.0 | 1.2 | 0.2 | Iris-setosa | Iris-setosa |
| 15 | 5.7 | 4.4 | 1.5 | 0.4 | Iris-setosa | Iris-setosa |
| 16 | 5.4 | 3.9 | 1.3 | 0.4 | Iris-setosa | Iris-setosa |
| 17 | 5.1 | 3.5 | 1.4 | 0.3 | Iris-setosa | Iris-setosa |
| 18 | 5.7 | 3.8 | 1.7 | 0.3 | Iris-setosa | Iris-setosa |
| 19 | 5.1 | 3.8 | 1.5 | 0.3 | Iris-setosa | Iris-setosa |
| 20 | 5.4 | 3.4 | 1.7 | 0.2 | Iris-setosa | Iris-setosa |
| 21 | 5.1 | 3.7 | 1.5 | 0.4 | Iris-setosa | Iris-setosa |
| 22 | 4.6 | 3.6 | 1.0 | 0.2 | Iris-setosa | Iris-setosa |
| 23 | 5.1 | 3.3 | 1.7 | 0.5 | Iris-setosa | Iris-setosa |
| 24 | 4.8 | 3.4 | 1.9 | 0.2 | Iris-setosa | Iris-setosa |
| 25 | 5.0 | 3.0 | 1.6 | 0.2 | Iris-setosa | Iris-setosa |
| 26 | 5.0 | 3.4 | 1.6 | 0.4 | Iris-setosa | Iris-setosa |
| 27 | 5.2 | 3.5 | 1.5 | 0.2 | Iris-setosa | Iris-setosa |
| 28 | 5.2 | 3.4 | 1.4 | 0.2 | Iris-setosa | Iris-setosa |
| 29 | 4.7 | 3.2 | 1.6 | 0.2 | Iris-setosa | Iris-setosa |
| 30 | 4.8 | 3.1 | 1.6 | 0.2 | Iris-setosa | Iris-setosa |
| 31 | 5.4 | 3.4 | 1.5 | 0.4 | Iris-setosa | Iris-setosa |
| 32 | 5.2 | 4.1 | 1.5 | 0.1 | Iris-setosa | Iris-setosa |
| 33 | 5.5 | 4.2 | 1.4 | 0.2 | Iris-setosa | Iris-setosa |
| 34 | 4.9 | 3.1 | 1.5 | 0.1 | Iris-setosa | Iris-setosa |
| 35 | 5.0 | 3.2 | 1.2 | 0.2 | Iris-setosa | Iris-setosa |
| 36 | 5.5 | 3.5 | 1.3 | 0.2 | Iris-setosa | Iris-setosa |
| 37 | 4.9 | 3.1 | 1.5 | 0.1 | Iris-setosa | Iris-setosa |
| 38 | 4.4 | 3.0 | 1.3 | 0.2 | Iris-setosa | Iris-setosa |
| 39 | 5.1 | 3.4 | 1.5 | 0.2 | Iris-setosa | Iris-setosa |
| 40 | 5.0 | 3.5 | 1.3 | 0.3 | Iris-setosa | Iris-setosa |
| 41 | 4.5 | 2.3 | 1.3 | 0.3 | Iris-setosa | Iris-setosa |
| 42 | 4.4 | 3.2 | 1.3 | 0.2 | Iris-setosa | Iris-setosa |
| 43 | 5.0 | 3.5 | 1.6 | 0.6 | Iris-setosa | Iris-setosa |
| 44 | 5.1 | 3.8 | 1.9 | 0.4 | Iris-setosa | Iris-setosa |
| 45 | 4.8 | 3.0 | 1.4 | 0.3 | Iris-setosa | Iris-setosa |
| 46 | 5.1 | 3.8 | 1.6 | 0.2 | Iris-setosa | Iris-setosa |
| 47 | 4.6 | 3.2 | 1.4 | 0.2 | Iris-setosa | Iris-setosa |
| 48 | 5.3 | 3.7 | 1.5 | 0.2 | Iris-setosa | Iris-setosa |
| 49 | 5.0 | 3.3 | 1.4 | 0.2 | Iris-setosa | Iris-setosa |
| 50 | 7.0 | 3.2 | 4.7 | 1.4 | Iris-versicolor | Iris-versicolor |
| 51 | 6.4 | 3.2 | 4.5 | 1.5 | Iris-versicolor | Iris-versicolor |
| 52 | 6.9 | 3.1 | 4.9 | 1.5 | Iris-versicolor | Iris-versicolor |
| 53 | 5.5 | 2.3 | 4.0 | 1.3 | Iris-versicolor | Iris-versicolor |
| 54 | 6.5 | 2.8 | 4.6 | 1.5 | Iris-versicolor | Iris-versicolor |
| 55 | 5.7 | 2.8 | 4.5 | 1.3 | Iris-versicolor | Iris-versicolor |
| 56 | 6.3 | 3.3 | 4.7 | 1.6 | Iris-versicolor | Iris-versicolor |
| 57 | 4.9 | 2.4 | 3.3 | 1.0 | Iris-versicolor | Iris-versicolor |
| 58 | 6.6 | 2.9 | 4.6 | 1.3 | Iris-versicolor | Iris-versicolor |
| 59 | 5.2 | 2.7 | 3.9 | 1.4 | Iris-versicolor | Iris-versicolor |
| 60 | 5.0 | 2.0 | 3.5 | 1.0 | Iris-versicolor | Iris-versicolor |
| 61 | 5.9 | 3.0 | 4.2 | 1.5 | Iris-versicolor | Iris-versicolor |
| 62 | 6.0 | 2.2 | 4.0 | 1.0 | Iris-versicolor | Iris-versicolor |
| 63 | 6.1 | 2.9 | 4.7 | 1.4 | Iris-versicolor | Iris-versicolor |
| 64 | 5.6 | 2.9 | 3.6 | 1.3 | Iris-versicolor | Iris-versicolor |
| 65 | 6.7 | 3.1 | 4.4 | 1.4 | Iris-versicolor | Iris-versicolor |
| 66 | 5.6 | 3.0 | 4.5 | 1.5 | Iris-versicolor | Iris-versicolor |
| 67 | 5.8 | 2.7 | 4.1 | 1.0 | Iris-versicolor | Iris-versicolor |
| 68 | 6.2 | 2.2 | 4.5 | 1.5 | Iris-versicolor | Iris-versicolor |
| 69 | 5.6 | 2.5 | 3.9 | 1.1 | Iris-versicolor | Iris-versicolor |
| 70 | 5.9 | 3.2 | 4.8 | 1.8 | Iris-versicolor | Iris-virginica |
| 71 | 6.1 | 2.8 | 4.0 | 1.3 | Iris-versicolor | Iris-versicolor |
| 72 | 6.3 | 2.5 | 4.9 | 1.5 | Iris-versicolor | Iris-versicolor |
| 73 | 6.1 | 2.8 | 4.7 | 1.2 | Iris-versicolor | Iris-versicolor |
| 74 | 6.4 | 2.9 | 4.3 | 1.3 | Iris-versicolor | Iris-versicolor |
| 75 | 6.6 | 3.0 | 4.4 | 1.4 | Iris-versicolor | Iris-versicolor |
| 76 | 6.8 | 2.8 | 4.8 | 1.4 | Iris-versicolor | Iris-versicolor |
| 77 | 6.7 | 3.0 | 5.0 | 1.7 | Iris-versicolor | Iris-virginica |
| 78 | 6.0 | 2.9 | 4.5 | 1.5 | Iris-versicolor | Iris-versicolor |
| 79 | 5.7 | 2.6 | 3.5 | 1.0 | Iris-versicolor | Iris-versicolor |
| 80 | 5.5 | 2.4 | 3.8 | 1.1 | Iris-versicolor | Iris-versicolor |
| 81 | 5.5 | 2.4 | 3.7 | 1.0 | Iris-versicolor | Iris-versicolor |
| 82 | 5.8 | 2.7 | 3.9 | 1.2 | Iris-versicolor | Iris-versicolor |
| 83 | 6.0 | 2.7 | 5.1 | 1.6 | Iris-versicolor | Iris-versicolor |
| 84 | 5.4 | 3.0 | 4.5 | 1.5 | Iris-versicolor | Iris-versicolor |
| 85 | 6.0 | 3.4 | 4.5 | 1.6 | Iris-versicolor | Iris-versicolor |
| 86 | 6.7 | 3.1 | 4.7 | 1.5 | Iris-versicolor | Iris-versicolor |
| 87 | 6.3 | 2.3 | 4.4 | 1.3 | Iris-versicolor | Iris-versicolor |
| 88 | 5.6 | 3.0 | 4.1 | 1.3 | Iris-versicolor | Iris-versicolor |
| 89 | 5.5 | 2.5 | 4.0 | 1.3 | Iris-versicolor | Iris-versicolor |
| 90 | 5.5 | 2.6 | 4.4 | 1.2 | Iris-versicolor | Iris-versicolor |
| 91 | 6.1 | 3.0 | 4.6 | 1.4 | Iris-versicolor | Iris-versicolor |
| 92 | 5.8 | 2.6 | 4.0 | 1.2 | Iris-versicolor | Iris-versicolor |
| 93 | 5.0 | 2.3 | 3.3 | 1.0 | Iris-versicolor | Iris-versicolor |
| 94 | 5.6 | 2.7 | 4.2 | 1.3 | Iris-versicolor | Iris-versicolor |
| 95 | 5.7 | 3.0 | 4.2 | 1.2 | Iris-versicolor | Iris-versicolor |
| 96 | 5.7 | 2.9 | 4.2 | 1.3 | Iris-versicolor | Iris-versicolor |
| 97 | 6.2 | 2.9 | 4.3 | 1.3 | Iris-versicolor | Iris-versicolor |
| 98 | 5.1 | 2.5 | 3.0 | 1.1 | Iris-versicolor | Iris-versicolor |
| 99 | 5.7 | 2.8 | 4.1 | 1.3 | Iris-versicolor | Iris-versicolor |
| 100 | 6.3 | 3.3 | 6.0 | 2.5 | Iris-virginica | Iris-virginica |
| 101 | 5.8 | 2.7 | 5.1 | 1.9 | Iris-virginica | Iris-virginica |
| 102 | 7.1 | 3.0 | 5.9 | 2.1 | Iris-virginica | Iris-virginica |
| 103 | 6.3 | 2.9 | 5.6 | 1.8 | Iris-virginica | Iris-virginica |
| 104 | 6.5 | 3.0 | 5.8 | 2.2 | Iris-virginica | Iris-virginica |
| 105 | 7.6 | 3.0 | 6.6 | 2.1 | Iris-virginica | Iris-virginica |
| 106 | 4.9 | 2.5 | 4.5 | 1.7 | Iris-virginica | Iris-virginica |
| 107 | 7.3 | 2.9 | 6.3 | 1.8 | Iris-virginica | Iris-virginica |
| 108 | 6.7 | 2.5 | 5.8 | 1.8 | Iris-virginica | Iris-virginica |
| 109 | 7.2 | 3.6 | 6.1 | 2.5 | Iris-virginica | Iris-virginica |
| 110 | 6.5 | 3.2 | 5.1 | 2.0 | Iris-virginica | Iris-virginica |
| 111 | 6.4 | 2.7 | 5.3 | 1.9 | Iris-virginica | Iris-virginica |
| 112 | 6.8 | 3.0 | 5.5 | 2.1 | Iris-virginica | Iris-virginica |
| 113 | 5.7 | 2.5 | 5.0 | 2.0 | Iris-virginica | Iris-virginica |
| 114 | 5.8 | 2.8 | 5.1 | 2.4 | Iris-virginica | Iris-virginica |
| 115 | 6.4 | 3.2 | 5.3 | 2.3 | Iris-virginica | Iris-virginica |
| 116 | 6.5 | 3.0 | 5.5 | 1.8 | Iris-virginica | Iris-virginica |
| 117 | 7.7 | 3.8 | 6.7 | 2.2 | Iris-virginica | Iris-virginica |
| 118 | 7.7 | 2.6 | 6.9 | 2.3 | Iris-virginica | Iris-virginica |
| 119 | 6.0 | 2.2 | 5.0 | 1.5 | Iris-virginica | Iris-versicolor |
| 120 | 6.9 | 3.2 | 5.7 | 2.3 | Iris-virginica | Iris-virginica |
| 121 | 5.6 | 2.8 | 4.9 | 2.0 | Iris-virginica | Iris-virginica |
| 122 | 7.7 | 2.8 | 6.7 | 2.0 | Iris-virginica | Iris-virginica |
| 123 | 6.3 | 2.7 | 4.9 | 1.8 | Iris-virginica | Iris-virginica |
| 124 | 6.7 | 3.3 | 5.7 | 2.1 | Iris-virginica | Iris-virginica |
| 125 | 7.2 | 3.2 | 6.0 | 1.8 | Iris-virginica | Iris-virginica |
| 126 | 6.2 | 2.8 | 4.8 | 1.8 | Iris-virginica | Iris-virginica |
| 127 | 6.1 | 3.0 | 4.9 | 1.8 | Iris-virginica | Iris-virginica |
| 128 | 6.4 | 2.8 | 5.6 | 2.1 | Iris-virginica | Iris-virginica |
| 129 | 7.2 | 3.0 | 5.8 | 1.6 | Iris-virginica | Iris-versicolor |
| 130 | 7.4 | 2.8 | 6.1 | 1.9 | Iris-virginica | Iris-virginica |
| 131 | 7.9 | 3.8 | 6.4 | 2.0 | Iris-virginica | Iris-virginica |
| 132 | 6.4 | 2.8 | 5.6 | 2.2 | Iris-virginica | Iris-virginica |
| 133 | 6.3 | 2.8 | 5.1 | 1.5 | Iris-virginica | Iris-versicolor |
| 134 | 6.1 | 2.6 | 5.6 | 1.4 | Iris-virginica | Iris-versicolor |
| 135 | 7.7 | 3.0 | 6.1 | 2.3 | Iris-virginica | Iris-virginica |
| 136 | 6.3 | 3.4 | 5.6 | 2.4 | Iris-virginica | Iris-virginica |
| 137 | 6.4 | 3.1 | 5.5 | 1.8 | Iris-virginica | Iris-virginica |
| 138 | 6.0 | 3.0 | 4.8 | 1.8 | Iris-virginica | Iris-virginica |
| 139 | 6.9 | 3.1 | 5.4 | 2.1 | Iris-virginica | Iris-virginica |
| 140 | 6.7 | 3.1 | 5.6 | 2.4 | Iris-virginica | Iris-virginica |
| 141 | 6.9 | 3.1 | 5.1 | 2.3 | Iris-virginica | Iris-virginica |
| 142 | 5.8 | 2.7 | 5.1 | 1.9 | Iris-virginica | Iris-virginica |
| 143 | 6.8 | 3.2 | 5.9 | 2.3 | Iris-virginica | Iris-virginica |
| 144 | 6.7 | 3.3 | 5.7 | 2.5 | Iris-virginica | Iris-virginica |
| 145 | 6.7 | 3.0 | 5.2 | 2.3 | Iris-virginica | Iris-virginica |
| 146 | 6.3 | 2.5 | 5.0 | 1.9 | Iris-virginica | Iris-virginica |
| 147 | 6.5 | 3.0 | 5.2 | 2.0 | Iris-virginica | Iris-virginica |
| 148 | 6.2 | 3.4 | 5.4 | 2.3 | Iris-virginica | Iris-virginica |
| 149 | 5.9 | 3.0 | 5.1 | 1.8 | Iris-virginica | Iris-virginica |
Now we will check how well our rules perform on the data using various metrics. The calculate_for_classification function computes typical classification metrics, such as accuracy, F1, and Cohen’s kappa.
[9]:
from decision_rules.classification.prediction_indicators import calculate_for_classification
metrics = calculate_for_classification(y, y_pred)
display(metrics)
{'type_of_problem': 'classification',
'general': {'Balanced_accuracy': 0.96,
'Accuracy': 0.96,
'Cohen_kappa': 0.94,
'F1_micro': 0.96,
'F1_macro': 0.9599839935974389,
'F1_weighted': 0.9599839935974391,
'G_mean_micro': 0.9699484522385713,
'G_mean_macro': 0.9699484522385713,
'G_mean_weighted': 0.9699484522385713,
'Recall_micro': 0.96,
'Recall_macro': 0.96,
'Recall_weighted': 0.96,
'Specificity': 1.0,
'Confusion_matrix': {'classes': ['Iris-setosa',
'Iris-versicolor',
'Iris-virginica'],
'Iris-setosa': [50, 0, 0],
'Iris-versicolor': [0, 48, 2],
'Iris-virginica': [0, 4, 46]}},
'for_classes': {'Iris-setosa': {'TP': 50,
'FP': 0,
'TN': 100,
'FN': 0,
'Recall': 1.0,
'Specificity': 1.0,
'F1_score': 1.0,
'G_mean': 1.0,
'MCC': 1.0,
'PPV': 1.0,
'NPV': 1.0,
'LR_plus': 0,
'LR_minus': 0.0,
'Odd_ratio': 0,
'Relative_risk': 0,
'Confusion_matrix': {'classes': ['Iris-setosa', 'other'],
'Iris-setosa': [50, 0],
'other': [0, 100]}},
'Iris-versicolor': {'TP': 48,
'FP': 4,
'TN': 96,
'FN': 2,
'Recall': 0.96,
'Specificity': 0.96,
'F1_score': 0.9411764705882353,
'G_mean': 0.96,
'MCC': 0.9112931795128765,
'PPV': 0.9230769230769231,
'NPV': 0.9795918367346939,
'LR_plus': 23.99999999999998,
'LR_minus': 0.041666666666666706,
'Odd_ratio': 576.0,
'Relative_risk': 45.23076923076923,
'Confusion_matrix': {'classes': ['Iris-versicolor', 'other'],
'Iris-versicolor': [48, 2],
'other': [4, 96]}},
'Iris-virginica': {'TP': 46,
'FP': 2,
'TN': 98,
'FN': 4,
'Recall': 0.92,
'Specificity': 0.98,
'F1_score': 0.9387755102040817,
'G_mean': 0.9495261976375375,
'MCC': 0.9095085938862487,
'PPV': 0.9583333333333334,
'NPV': 0.9607843137254902,
'LR_plus': 45.999999999999964,
'LR_minus': 0.08163265306122446,
'Odd_ratio': 563.5,
'Relative_risk': 24.4375,
'Confusion_matrix': {'classes': ['Iris-virginica', 'other'],
'Iris-virginica': [46, 4],
'other': [2, 98]}}}}
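As a quick sanity check, the overall accuracy reported above can be recomputed by hand from the confusion matrix: correct predictions sit on the diagonal, so accuracy is the diagonal sum divided by the total count.

```python
# Confusion matrix copied from the output above: rows = true class
confusion = {
    'Iris-setosa':     [50, 0, 0],
    'Iris-versicolor': [0, 48, 2],
    'Iris-virginica':  [0, 4, 46],
}
classes = list(confusion)
correct = sum(confusion[c][i] for i, c in enumerate(classes))  # diagonal: 50+48+46
total = sum(sum(row) for row in confusion.values())            # all 150 examples
print(correct / total)  # 0.96
```

The 6 misclassified examples (2 versicolor predicted as virginica, 4 virginica predicted as versicolor) match the mismatched rows visible in the comparison table earlier.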
The calculate_rules_metrics function of a ruleset object computes metrics describing each of the rules in the rule set. The full description of the metrics can be found in the documentation of ruleminer.
[10]:
rules_metrics = ruleset.calculate_rules_metrics(X, y)
for rule_id, rule_metrics in rules_metrics.items():
    print('Rule', rule_id)
    print(rule_metrics)
Rule fd0d78eb-aea8-4ab0-9704-6999d87573c2
{'p': 50, 'n': 0, 'P': 50, 'N': 100, 'p_unique': 50, 'n_unique': 50, 'support': 50, 'conditions_count': 1, 'precision': 1.0, 'coverage': 1.0, 'C2': 1.0, 'RSS': 1.0, 'correlation': 1.0, 'lift': 3.0, 'p_value': 4.968040370318492e-41, 'TP': 50, 'FP': 0, 'TN': 100, 'FN': 0, 'sensitivity': 1.0, 'specificity': 1.0, 'negative_predictive_value': 1.0, 'odds_ratio': inf, 'relative_risk': inf, 'lr+': inf, 'lr-': 0.0}
Rule 1cd50058-f754-4d69-ab42-5690bdccfa1b
{'p': 48, 'n': 4, 'P': 50, 'N': 100, 'p_unique': 48, 'n_unique': 48, 'support': 52, 'conditions_count': 2, 'precision': 0.9230769230769231, 'coverage': 0.96, 'C2': 0.8669230769230769, 'RSS': 0.9199999999999999, 'correlation': 0.9112931795128765, 'lift': 2.769230769230769, 'p_value': 6.403421751602081e-32, 'TP': 48, 'FP': 4, 'TN': 96, 'FN': 2, 'sensitivity': 0.96, 'specificity': 0.96, 'negative_predictive_value': 0.9795918367346939, 'odds_ratio': 576.0, 'relative_risk': 45.23076923076923, 'lr+': 23.99999999999998, 'lr-': 0.041666666666666706}
The calculate_ruleset_stats function returns some general statistics regarding the rules present in the rule set.
[11]:
general_stats = ruleset.calculate_ruleset_stats()
print(general_stats)
{'rules_count': 2, 'avg_conditions_count': 1.5, 'avg_precision': 0.96, 'avg_coverage': 0.98, 'total_conditions_count': 3}
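Most of these aggregates follow directly from the per-rule metrics shown earlier (rule 1 has one condition and coverage 1.0; rule 2 has two conditions and coverage 0.96), as this small check illustrates:

```python
conditions_per_rule = [1, 2]   # rules 1 and 2, as defined above
coverages = [1.0, 0.96]        # per-rule coverage from the earlier metrics

print(sum(conditions_per_rule) / len(conditions_per_rule))  # avg_conditions_count: 1.5
print(sum(conditions_per_rule))                             # total_conditions_count: 3
print(sum(coverages) / len(coverages))                      # avg_coverage: approx. 0.98
```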
We can serialize the ruleset into a dict using JSONSerializer.serialize. The dict can later be stored as a string or in a text file.
[12]:
import os
import json
from decision_rules.serialization import JSONSerializer
OUTPUT_DIR = 'output'
RULESET_FILENAME = 'manual_iris.json'
os.makedirs(OUTPUT_DIR, exist_ok=True)
ruleset_path = os.path.join(OUTPUT_DIR, RULESET_FILENAME)
# Serialize the ruleset
ruleset_dict = JSONSerializer.serialize(ruleset)
display(ruleset_dict)
# Save to JSON
with open(ruleset_path, 'w') as fp:
json.dump(ruleset_dict, fp)
{'meta': {'attributes': ['sepallength',
'sepalwidth',
'petallength',
'petalwidth'],
'decision_attribute': 'class',
'decision_attribute_distribution': {'Iris-setosa': 50,
'Iris-versicolor': 50,
'Iris-virginica': 50}},
'rules': [{'uuid': 'fd0d78eb-aea8-4ab0-9704-6999d87573c2',
'string': 'IF petallength < 2.50 THEN class = Iris-setosa',
'premise': {'type': 'compound',
'operator': 'CONJUNCTION',
'subconditions': [{'type': 'elementary_numerical',
'attributes': [2],
'negated': False,
'left': None,
'right': 2.5,
'left_closed': False,
'right_closed': False}]},
'conclusion': {'value': 'Iris-setosa'},
'coverage': {'p': 50, 'n': 0, 'P': 50, 'N': 100}},
{'uuid': '1cd50058-f754-4d69-ab42-5690bdccfa1b',
'string': 'IF petallength >= 2.50 AND petalwidth < 1.65 THEN class = Iris-versicolor',
'premise': {'type': 'compound',
'operator': 'CONJUNCTION',
'subconditions': [{'type': 'elementary_numerical',
'attributes': [2],
'negated': False,
'left': 2.5,
'right': None,
'left_closed': True,
'right_closed': False},
{'type': 'elementary_numerical',
'attributes': [3],
'negated': False,
'left': None,
'right': 1.65,
'left_closed': False,
'right_closed': False}]},
'conclusion': {'value': 'Iris-versicolor'},
'coverage': {'p': 48, 'n': 4, 'P': 50, 'N': 100}}]}
The ruleset can be loaded back using JSONSerializer.deserialize.
[13]:
with open(ruleset_path) as fp:
reloaded_json = json.load(fp)
reloaded_ruleset = JSONSerializer.deserialize(reloaded_json, ClassificationRuleSet)
assert reloaded_ruleset == ruleset
Before using some functions of the deserialized ruleset, it may be necessary to call the update function. After that, the object is ready for prediction.
[14]:
reloaded_ruleset.update(X, y, accuracy)
y_pred = reloaded_ruleset.predict(X)
display(y_pred)
array(['Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',
'Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',
'Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',
'Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',
'Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',
'Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',
'Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',
'Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',
'Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',
'Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',
'Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',
'Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',
'Iris-setosa', 'Iris-setosa', 'Iris-versicolor', 'Iris-versicolor',
'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',
'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',
'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',
'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',
'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',
'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',
'Iris-virginica', 'Iris-versicolor', 'Iris-versicolor',
'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',
'Iris-versicolor', 'Iris-virginica', 'Iris-versicolor',
'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',
'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',
'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',
'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',
'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',
'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',
'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',
'Iris-virginica', 'Iris-virginica', 'Iris-virginica',
'Iris-virginica', 'Iris-virginica', 'Iris-virginica',
'Iris-virginica', 'Iris-virginica', 'Iris-virginica',
'Iris-virginica', 'Iris-virginica', 'Iris-virginica',
'Iris-virginica', 'Iris-virginica', 'Iris-virginica',
'Iris-virginica', 'Iris-virginica', 'Iris-virginica',
'Iris-virginica', 'Iris-versicolor', 'Iris-virginica',
'Iris-virginica', 'Iris-virginica', 'Iris-virginica',
'Iris-virginica', 'Iris-virginica', 'Iris-virginica',
'Iris-virginica', 'Iris-virginica', 'Iris-versicolor',
'Iris-virginica', 'Iris-virginica', 'Iris-virginica',
'Iris-versicolor', 'Iris-versicolor', 'Iris-virginica',
'Iris-virginica', 'Iris-virginica', 'Iris-virginica',
'Iris-virginica', 'Iris-virginica', 'Iris-virginica',
'Iris-virginica', 'Iris-virginica', 'Iris-virginica',
'Iris-virginica', 'Iris-virginica', 'Iris-virginica',
'Iris-virginica', 'Iris-virginica'], dtype='<U15')