Generating rules with RuleKit
Currently, decision-rules does not generate rules itself. You can, however, generate rules using RuleKit and convert the resulting model into a decision-rules rule set.
We will start by importing the RuleKit package.
[1]:
import pandas as pd
from rulekit import RuleKit
from rulekit.classification import RuleClassifier
from rulekit.params import Measures
RuleKit.init()
We will use the following zoo dataset:
[2]:
df = pd.read_csv("resources/zoo.csv")
display(df)
| | hair | feathers | eggs | milk | airborne | aquatic | predator | toothed | backbone | breathes | venomous | fins | legs | tail | domestic | catsize | class |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | True | False | False | True | False | False | True | True | True | True | False | False | 4.0 | False | False | True | mammal |
| 1 | True | False | False | True | False | False | False | True | True | True | False | False | 4.0 | True | False | True | mammal |
| 2 | False | False | True | False | False | True | True | True | True | False | False | True | 0.0 | True | False | False | fish |
| 3 | True | False | False | True | False | False | True | True | True | True | False | False | 4.0 | False | False | True | mammal |
| 4 | True | False | False | True | False | False | True | True | True | True | False | False | 4.0 | True | False | True | mammal |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 96 | True | False | False | True | False | False | False | True | True | True | False | False | 2.0 | True | False | True | mammal |
| 97 | True | False | True | False | True | False | False | False | False | True | True | False | 6.0 | False | False | False | insect |
| 98 | True | False | False | True | False | False | True | True | True | True | False | False | 4.0 | True | False | True | mammal |
| 99 | False | False | True | False | False | False | False | False | False | True | False | False | 0.0 | False | False | False | invertebrate |
| 100 | False | True | True | False | True | False | False | False | True | True | False | False | 2.0 | True | False | False | bird |
101 rows × 17 columns
The class column will be our target y and the other columns will be predictors X.
[3]:
X = df.drop("class", axis=1)
y = df["class"]
Now we will generate the rules using RuleKit. We just need to create a RuleClassifier object and train it using the fit function.
[4]:
rulekit_model = RuleClassifier()
rulekit_model.fit(X, y)
[4]:
<rulekit.classification.RuleClassifier at 0x7f5dcda44490>
Let’s see the generated rules.
[5]:
for rule in rulekit_model.model.rules:
    print(rule)
IF aquatic = <0.50, inf) AND legs = <3, inf) AND toothed = <0.50, inf) AND hair = (-inf, 0.50) THEN class = {amphibian}
IF feathers = <0.50, inf) THEN class = {bird}
IF fins = <0.50, inf) AND eggs = <0.50, inf) THEN class = {fish}
IF legs = <5.50, inf) AND aquatic = (-inf, 0.50) AND eggs = <0.50, inf) THEN class = {insect}
IF backbone = (-inf, 0.50) AND airborne = (-inf, 0.50) THEN class = {invertebrate}
IF milk = <0.50, inf) THEN class = {mammal}
IF toothed = <0.50, inf) AND fins = (-inf, 0.50) AND legs = (-inf, 2) THEN class = {reptile}
IF hair = (-inf, 0.50) AND toothed = <0.50, inf) AND aquatic = (-inf, 0.50) THEN class = {reptile}
IF hair = (-inf, 0.50) AND feathers = (-inf, 0.50) AND aquatic = (-inf, 0.50) AND backbone = <0.50, inf) THEN class = {reptile}
The RuleKitRuleSetFactory from ruleset_factories converts the RuleKit model to a decision-rules rule set.
[6]:
from ruleset_factories._factories.classification import RuleKitRuleSetFactory
factory = RuleKitRuleSetFactory()
decision_rules_ruleset = factory.make(rulekit_model, X, y)
Let’s check if the rules in decision_rules_ruleset are the same as in rulekit_model.
[7]:
for rule in decision_rules_ruleset.rules:
    print(rule)
IF aquatic >= 0.50 AND legs >= 3.00 AND toothed >= 0.50 AND hair < 0.50 THEN class = amphibian (p=4, n=0, P=4, N=97)
IF feathers >= 0.50 THEN class = bird (p=20, n=0, P=20, N=81)
IF fins >= 0.50 AND eggs >= 0.50 THEN class = fish (p=13, n=0, P=13, N=88)
IF legs >= 5.50 AND aquatic < 0.50 AND eggs >= 0.50 THEN class = insect (p=8, n=0, P=8, N=93)
IF backbone < 0.50 AND airborne < 0.50 THEN class = invertebrate (p=10, n=2, P=10, N=91)
IF milk >= 0.50 THEN class = mammal (p=41, n=0, P=41, N=60)
IF toothed >= 0.50 AND fins < 0.50 AND legs < 2.00 THEN class = reptile (p=3, n=0, P=5, N=96)
IF hair < 0.50 AND toothed >= 0.50 AND aquatic < 0.50 THEN class = reptile (p=3, n=0, P=5, N=96)
IF hair < 0.50 AND feathers < 0.50 AND aquatic < 0.50 AND backbone >= 0.50 THEN class = reptile (p=4, n=0, P=5, N=96)
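The two listings describe the same conditions in different notation: RuleKit prints one-sided intervals such as `<0.50, inf)`, while decision-rules prints comparisons such as `>= 0.50`. The sketch below shows the correspondence. The helper `interval_to_comparison` is hypothetical and not part of either library, and it assumes that `<` and `>` mark closed interval ends while `(` and `)` mark open ones.

```python
import re


def interval_to_comparison(attribute: str, interval: str) -> str:
    """Render a one-sided interval such as '<0.50, inf)' as a comparison.

    Hypothetical helper for illustration; assumes '<' / '>' denote closed
    ends and '(' / ')' denote open ends.
    """
    match = re.fullmatch(r"([<(])\s*(\S+),\s*(\S+?)\s*([>)])", interval.strip())
    if match is None:
        raise ValueError(f"unrecognised interval: {interval!r}")
    left_bracket, left, right, right_bracket = match.groups()
    if left == "-inf" and right != "inf":
        # Only an upper bound is present.
        operator = "<=" if right_bracket == ">" else "<"
        return f"{attribute} {operator} {float(right):.2f}"
    if right == "inf" and left != "-inf":
        # Only a lower bound is present.
        operator = ">=" if left_bracket == "<" else ">"
        return f"{attribute} {operator} {float(left):.2f}"
    raise ValueError("only one-sided intervals are handled in this sketch")


print(interval_to_comparison("aquatic", "<0.50, inf)"))
print(interval_to_comparison("hair", "(-inf, 0.50)"))
print(interval_to_comparison("legs", "(-inf, 2)"))
```

Because Python booleans compare as 0 and 1, a condition such as `milk >= 0.50` simply selects the rows where `milk` is True.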
We can now predict values using decision_rules_ruleset and calculate various statistics describing the rules.
[8]:
y_pred = decision_rules_ruleset.predict(X)
display(y_pred)
array(['mammal', 'mammal', 'fish', 'mammal', 'mammal', 'mammal', 'mammal',
'fish', 'fish', 'mammal', 'mammal', 'bird', 'fish', 'invertebrate',
'invertebrate', 'invertebrate', 'bird', 'mammal', 'fish', 'mammal',
'bird', 'bird', 'mammal', 'bird', 'insect', 'amphibian',
'amphibian', 'mammal', 'mammal', 'mammal', 'insect', 'mammal',
'mammal', 'bird', 'fish', 'mammal', 'mammal', 'bird', 'fish',
'insect', 'insect', 'bird', 'insect', 'bird', 'mammal', 'mammal',
'invertebrate', 'mammal', 'mammal', 'mammal', 'mammal', 'insect',
'amphibian', 'invertebrate', 'mammal', 'mammal', 'bird', 'bird',
'bird', 'bird', 'fish', 'fish', 'reptile', 'mammal', 'mammal',
'mammal', 'mammal', 'mammal', 'mammal', 'mammal', 'mammal', 'bird',
'invertebrate', 'fish', 'mammal', 'mammal', 'reptile',
'invertebrate', 'bird', 'bird', 'reptile', 'invertebrate', 'fish',
'bird', 'mammal', 'invertebrate', 'fish', 'bird', 'insect',
'amphibian', 'reptile', 'reptile', 'fish', 'mammal', 'mammal',
'bird', 'mammal', 'insect', 'mammal', 'invertebrate', 'bird'],
dtype='<U12')
The calculate_for_classification function computes the usual classification metrics, such as accuracy or F1.
[9]:
from decision_rules.classification.prediction_indicators import calculate_for_classification
metrics = calculate_for_classification(y, y_pred)
display(metrics)
{'type_of_problem': 'classification',
'general': {'Balanced_accuracy': 1.0,
'Accuracy': 1.0,
'Cohen_kappa': 1.0,
'F1_micro': 1.0,
'F1_macro': 1.0,
'F1_weighted': 1.0,
'G_mean_micro': 1.0,
'G_mean_macro': 1.0,
'G_mean_weighted': 1.0,
'Recall_micro': 1.0,
'Recall_macro': 1.0,
'Recall_weighted': 1.0,
'Specificity': 1.0,
'Confusion_matrix': {'classes': ['amphibian',
'bird',
'fish',
'insect',
'invertebrate',
'mammal',
'reptile'],
'amphibian': [4, 0, 0, 0, 0, 0, 0],
'bird': [0, 20, 0, 0, 0, 0, 0],
'fish': [0, 0, 13, 0, 0, 0, 0],
'insect': [0, 0, 0, 8, 0, 0, 0],
'invertebrate': [0, 0, 0, 0, 10, 0, 0],
'mammal': [0, 0, 0, 0, 0, 41, 0],
'reptile': [0, 0, 0, 0, 0, 0, 5]}},
'for_classes': {'amphibian': {'TP': 4,
'FP': 0,
'TN': 97,
'FN': 0,
'Recall': 1.0,
'Specificity': 1.0,
'F1_score': 1.0,
'G_mean': 1.0,
'MCC': 1.0,
'PPV': 1.0,
'NPV': 1.0,
'LR_plus': 0,
'LR_minus': 0.0,
'Odd_ratio': 0,
'Relative_risk': 0,
'Confusion_matrix': {'classes': ['amphibian', 'other'],
'amphibian': [4, 0],
'other': [0, 97]}},
'bird': {'TP': 20,
'FP': 0,
'TN': 81,
'FN': 0,
'Recall': 1.0,
'Specificity': 1.0,
'F1_score': 1.0,
'G_mean': 1.0,
'MCC': 1.0,
'PPV': 1.0,
'NPV': 1.0,
'LR_plus': 0,
'LR_minus': 0.0,
'Odd_ratio': 0,
'Relative_risk': 0,
'Confusion_matrix': {'classes': ['bird', 'other'],
'bird': [20, 0],
'other': [0, 81]}},
'fish': {'TP': 13,
'FP': 0,
'TN': 88,
'FN': 0,
'Recall': 1.0,
'Specificity': 1.0,
'F1_score': 1.0,
'G_mean': 1.0,
'MCC': 1.0,
'PPV': 1.0,
'NPV': 1.0,
'LR_plus': 0,
'LR_minus': 0.0,
'Odd_ratio': 0,
'Relative_risk': 0,
'Confusion_matrix': {'classes': ['fish', 'other'],
'fish': [13, 0],
'other': [0, 88]}},
'insect': {'TP': 8,
'FP': 0,
'TN': 93,
'FN': 0,
'Recall': 1.0,
'Specificity': 1.0,
'F1_score': 1.0,
'G_mean': 1.0,
'MCC': 1.0,
'PPV': 1.0,
'NPV': 1.0,
'LR_plus': 0,
'LR_minus': 0.0,
'Odd_ratio': 0,
'Relative_risk': 0,
'Confusion_matrix': {'classes': ['insect', 'other'],
'insect': [8, 0],
'other': [0, 93]}},
'invertebrate': {'TP': 10,
'FP': 0,
'TN': 91,
'FN': 0,
'Recall': 1.0,
'Specificity': 1.0,
'F1_score': 1.0,
'G_mean': 1.0,
'MCC': 1.0,
'PPV': 1.0,
'NPV': 1.0,
'LR_plus': 0,
'LR_minus': 0.0,
'Odd_ratio': 0,
'Relative_risk': 0,
'Confusion_matrix': {'classes': ['invertebrate', 'other'],
'invertebrate': [10, 0],
'other': [0, 91]}},
'mammal': {'TP': 41,
'FP': 0,
'TN': 60,
'FN': 0,
'Recall': 1.0,
'Specificity': 1.0,
'F1_score': 1.0,
'G_mean': 1.0,
'MCC': 1.0,
'PPV': 1.0,
'NPV': 1.0,
'LR_plus': 0,
'LR_minus': 0.0,
'Odd_ratio': 0,
'Relative_risk': 0,
'Confusion_matrix': {'classes': ['mammal', 'other'],
'mammal': [41, 0],
'other': [0, 60]}},
'reptile': {'TP': 5,
'FP': 0,
'TN': 96,
'FN': 0,
'Recall': 1.0,
'Specificity': 1.0,
'F1_score': 1.0,
'G_mean': 1.0,
'MCC': 1.0,
'PPV': 1.0,
'NPV': 1.0,
'LR_plus': 0,
'LR_minus': 0.0,
'Odd_ratio': 0,
'Relative_risk': 0,
'Confusion_matrix': {'classes': ['reptile', 'other'],
'reptile': [5, 0],
'other': [0, 96]}}}}
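The per-class entries above treat each class as "this class versus everything else". As a rough illustration of where TP, FP, TN and FN come from, the sketch below counts them for one class. The function `one_vs_rest_counts` and the five toy labels are made up for this example and are not taken from the zoo dataset or from decision-rules.

```python
def one_vs_rest_counts(y_true, y_pred, positive_class):
    """Count TP, FP, TN and FN, treating one class as positive."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive_class:
            if actual == positive_class:
                tp += 1
            else:
                fp += 1
        else:
            if actual == positive_class:
                fn += 1
            else:
                tn += 1
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn}


# Toy labels, invented purely for illustration.
y_true_toy = ["bird", "fish", "bird", "mammal", "fish"]
y_pred_toy = ["bird", "fish", "fish", "mammal", "fish"]

counts = one_vs_rest_counts(y_true_toy, y_pred_toy, "fish")
recall = counts["TP"] / (counts["TP"] + counts["FN"])
ppv = counts["TP"] / (counts["TP"] + counts["FP"])
print(counts, recall, ppv)
```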
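calculate_ruleset_stats returns summary statistics for the whole rule set: the number of rules, the total and average number of conditions, and the average precision and coverage.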
[10]:
general_stats = decision_rules_ruleset.calculate_ruleset_stats()
print(general_stats)
{'rules_count': 9, 'avg_conditions_count': 2.56, 'avg_precision': 0.98, 'avg_coverage': 0.89, 'total_conditions_count': 23}
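These figures can be cross-checked against the rule listing printed earlier. The sketch below parses the nine rule strings, assuming that precision is p / (p + n), that coverage is p / P, and that the averages are unweighted means rounded to two decimals. The parsing helper is hypothetical and is not how decision-rules computes the statistics.

```python
import re

# Rule strings copied from the decision-rules listing above.
rules = [
    "IF aquatic >= 0.50 AND legs >= 3.00 AND toothed >= 0.50 AND hair < 0.50 THEN class = amphibian (p=4, n=0, P=4, N=97)",
    "IF feathers >= 0.50 THEN class = bird (p=20, n=0, P=20, N=81)",
    "IF fins >= 0.50 AND eggs >= 0.50 THEN class = fish (p=13, n=0, P=13, N=88)",
    "IF legs >= 5.50 AND aquatic < 0.50 AND eggs >= 0.50 THEN class = insect (p=8, n=0, P=8, N=93)",
    "IF backbone < 0.50 AND airborne < 0.50 THEN class = invertebrate (p=10, n=2, P=10, N=91)",
    "IF milk >= 0.50 THEN class = mammal (p=41, n=0, P=41, N=60)",
    "IF toothed >= 0.50 AND fins < 0.50 AND legs < 2.00 THEN class = reptile (p=3, n=0, P=5, N=96)",
    "IF hair < 0.50 AND toothed >= 0.50 AND aquatic < 0.50 THEN class = reptile (p=3, n=0, P=5, N=96)",
    "IF hair < 0.50 AND feathers < 0.50 AND aquatic < 0.50 AND backbone >= 0.50 THEN class = reptile (p=4, n=0, P=5, N=96)",
]


def parse_rule(rule):
    """Return (number of conditions, coverage counts) for one rule string."""
    premise = rule.split(" THEN ")[0][len("IF "):]
    conditions = premise.split(" AND ")
    counts = {key: int(value) for key, value in re.findall(r"\b([pnPN])=(\d+)", rule)}
    return len(conditions), counts


parsed = [parse_rule(rule) for rule in rules]
total_conditions = sum(n_conditions for n_conditions, _ in parsed)
stats = {
    "rules_count": len(parsed),
    "avg_conditions_count": round(total_conditions / len(parsed), 2),
    "avg_precision": round(sum(c["p"] / (c["p"] + c["n"]) for _, c in parsed) / len(parsed), 2),
    "avg_coverage": round(sum(c["p"] / c["P"] for _, c in parsed) / len(parsed), 2),
    "total_conditions_count": total_conditions,
}
print(stats)
```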
You can compute metrics describing each rule using calculate_rules_metrics.
[11]:
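rules_metrics = decision_rules_ruleset.calculate_rules_metrics(X, y)
# Use a separate loop variable so the dictionary of all metrics is not overwritten.
for rule_id, rule_metrics in rules_metrics.items():
    print('Rule', rule_id)
    print(rule_metrics)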
Rule 3357850a-4701-41ee-92f5-17d524564033
{'p': 4, 'n': 0, 'P': 4, 'N': 97, 'p_unique': 4, 'n_unique': 4, 'support': 4, 'conditions_count': 4, 'precision': 1.0, 'coverage': 1.0, 'C2': 1.0, 'RSS': 1.0, 'correlation': 1.0, 'lift': 25.25, 'p_value': 2.4492245142881635e-07, 'TP': 4, 'FP': 0, 'TN': 97, 'FN': 0, 'sensitivity': 1.0, 'specificity': 1.0, 'negative_predictive_value': 1.0, 'odds_ratio': inf, 'relative_risk': inf, 'lr+': inf, 'lr-': 0.0}
Rule 0d28724c-56c4-47bb-847e-96fc60e191d9
{'p': 20, 'n': 0, 'P': 20, 'N': 81, 'p_unique': 20, 'n_unique': 20, 'support': 20, 'conditions_count': 1, 'precision': 1.0, 'coverage': 1.0, 'C2': 1.0, 'RSS': 1.0, 'correlation': 1.0, 'lift': 5.05, 'p_value': 1.4962781353003363e-21, 'TP': 20, 'FP': 0, 'TN': 81, 'FN': 0, 'sensitivity': 1.0, 'specificity': 1.0, 'negative_predictive_value': 1.0, 'odds_ratio': inf, 'relative_risk': inf, 'lr+': inf, 'lr-': 0.0}
Rule 9804e9ab-07d5-4954-8a32-22cb0033196a
{'p': 13, 'n': 0, 'P': 13, 'N': 88, 'p_unique': 13, 'n_unique': 13, 'support': 13, 'conditions_count': 2, 'precision': 1.0, 'coverage': 1.0, 'C2': 1.0, 'RSS': 1.0, 'correlation': 1.0, 'lift': 7.769230769230769, 'p_value': 1.225345504562382e-16, 'TP': 13, 'FP': 0, 'TN': 88, 'FN': 0, 'sensitivity': 1.0, 'specificity': 1.0, 'negative_predictive_value': 1.0, 'odds_ratio': inf, 'relative_risk': inf, 'lr+': inf, 'lr-': 0.0}
Rule d7aa0f70-1492-4114-99f7-2a26a000c366
{'p': 8, 'n': 0, 'P': 8, 'N': 93, 'p_unique': 8, 'n_unique': 8, 'support': 8, 'conditions_count': 3, 'precision': 1.0, 'coverage': 1.0, 'C2': 1.0, 'RSS': 1.0, 'correlation': 1.0, 'lift': 12.625, 'p_value': 4.948156798010051e-12, 'TP': 8, 'FP': 0, 'TN': 93, 'FN': 0, 'sensitivity': 1.0, 'specificity': 1.0, 'negative_predictive_value': 1.0, 'odds_ratio': inf, 'relative_risk': inf, 'lr+': inf, 'lr-': 0.0}
Rule 787b7170-b084-4830-a995-dd39cfe2ca15
{'p': 10, 'n': 2, 'P': 10, 'N': 91, 'p_unique': 10, 'n_unique': 10, 'support': 12, 'conditions_count': 2, 'precision': 0.8333333333333334, 'coverage': 1.0, 'C2': 0.8150183150183151, 'RSS': 0.978021978021978, 'correlation': 0.9027836479568707, 'lift': 8.416666666666666, 'p_value': 3.4352561220406376e-12, 'TP': 10, 'FP': 2, 'TN': 89, 'FN': 0, 'sensitivity': 1.0, 'specificity': 0.978021978021978, 'negative_predictive_value': 1.0, 'odds_ratio': inf, 'relative_risk': inf, 'lr+': 45.49999999999993, 'lr-': 0.0}
Rule 67f18dcb-1a70-43ad-9e16-713daedc5eca
{'p': 41, 'n': 0, 'P': 41, 'N': 60, 'p_unique': 41, 'n_unique': 41, 'support': 41, 'conditions_count': 1, 'precision': 1.0, 'coverage': 1.0, 'C2': 1.0, 'RSS': 1.0, 'correlation': 1.0, 'lift': 2.4634146341463414, 'p_value': 2.95310402655518e-29, 'TP': 41, 'FP': 0, 'TN': 60, 'FN': 0, 'sensitivity': 1.0, 'specificity': 1.0, 'negative_predictive_value': 1.0, 'odds_ratio': inf, 'relative_risk': inf, 'lr+': inf, 'lr-': 0.0}
Rule d122f503-9a1c-40b4-a879-a8452235ef9f
{'p': 3, 'n': 0, 'P': 5, 'N': 96, 'p_unique': 3, 'n_unique': 3, 'support': 3, 'conditions_count': 3, 'precision': 1.0, 'coverage': 0.6, 'C2': 0.8, 'RSS': 0.6, 'correlation': 0.766651877999928, 'lift': 20.2, 'p_value': 6.000600060006002e-05, 'TP': 3, 'FP': 0, 'TN': 96, 'FN': 2, 'sensitivity': 0.6, 'specificity': 1.0, 'negative_predictive_value': 0.9795918367346939, 'odds_ratio': inf, 'relative_risk': 49.0, 'lr+': inf, 'lr-': 0.4}
Rule cce9a48a-ffd9-4636-9330-c9c991d0a15d
{'p': 3, 'n': 0, 'P': 5, 'N': 96, 'p_unique': 3, 'n_unique': 3, 'support': 3, 'conditions_count': 3, 'precision': 1.0, 'coverage': 0.6, 'C2': 0.8, 'RSS': 0.6, 'correlation': 0.766651877999928, 'lift': 20.2, 'p_value': 6.000600060006002e-05, 'TP': 3, 'FP': 0, 'TN': 96, 'FN': 2, 'sensitivity': 0.6, 'specificity': 1.0, 'negative_predictive_value': 0.9795918367346939, 'odds_ratio': inf, 'relative_risk': 49.0, 'lr+': inf, 'lr-': 0.4}
Rule 0485665e-f92e-42f5-86e5-d05625076cd8
{'p': 4, 'n': 0, 'P': 5, 'N': 96, 'p_unique': 4, 'n_unique': 4, 'support': 4, 'conditions_count': 4, 'precision': 1.0, 'coverage': 0.8, 'C2': 0.9, 'RSS': 0.8, 'correlation': 0.8898047973120777, 'lift': 20.2, 'p_value': 1.2246122571440818e-06, 'TP': 4, 'FP': 0, 'TN': 96, 'FN': 1, 'sensitivity': 0.8, 'specificity': 1.0, 'negative_predictive_value': 0.9896907216494846, 'odds_ratio': inf, 'relative_risk': 97.0, 'lr+': inf, 'lr-': 0.19999999999999996}
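Several of the values above follow directly from the four coverage counts p, n, P and N. The sketch below recomputes a few of them for the invertebrate rule (p=10, n=2, P=10, N=91) using the usual textbook definitions. These formulas are assumptions on my part rather than the library's source code, although they agree with the printed values.

```python
def rule_metrics_from_counts(p, n, P, N):
    """Recompute selected rule metrics from coverage counts.

    p, n: positive / negative examples covered by the rule
    P, N: all positive / negative examples in the dataset
    """
    precision = p / (p + n)
    coverage = p / P
    # Lift compares rule precision with the prior probability of the class.
    lift = precision / (P / (P + N))
    specificity = (N - n) / N
    # C2 is taken here as the Coleman measure scaled by (1 + coverage) / 2.
    coleman = ((P + N) * precision - P) / N
    c2_value = coleman * (1 + coverage) / 2
    return {
        "precision": precision,
        "coverage": coverage,
        "lift": lift,
        "specificity": specificity,
        "C2": c2_value,
    }


invertebrate = rule_metrics_from_counts(p=10, n=2, P=10, N=91)
for name, value in invertebrate.items():
    print(f"{name}: {value:.4f}")
```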
The calculate_condition_importances method computes the importance of each condition in the rule set with respect to a chosen quality measure (here c2). Similarly, calculate_attribute_importances derives the importance of each attribute from the condition importances.
[12]:
from decision_rules.measures import c2
condition_importances = decision_rules_ruleset.calculate_condition_importances(X, y, measure=c2)
print('Condition importances:')
display(condition_importances)
attribute_importances = decision_rules_ruleset.calculate_attribute_importances(condition_importances)
print('Attribute importances:')
display(attribute_importances)
Condition importances:
{'amphibian': [{'condition': 'legs >= 3.00',
'attributes': ['legs'],
'importance': 0.21835455831817263},
{'condition': 'toothed >= 0.50',
'attributes': ['toothed'],
'importance': 0.15137644827521457},
{'condition': 'aquatic >= 0.50',
'attributes': ['aquatic'],
'importance': 0.0706758304696449},
{'condition': 'hair < 0.50',
'attributes': ['hair'],
'importance': 0.05970494134376112}],
'bird': [{'condition': 'feathers >= 0.50',
'attributes': ['feathers'],
'importance': 1.0}],
'fish': [{'condition': 'fins >= 0.50',
'attributes': ['fins'],
'importance': 0.812392368349497},
{'condition': 'eggs >= 0.50',
'attributes': ['eggs'],
'importance': 0.187607631650503}],
'insect': [{'condition': 'legs >= 5.50',
'attributes': ['legs'],
'importance': 0.47480739916779957},
{'condition': 'aquatic < 0.50',
'attributes': ['aquatic'],
'importance': 0.11461012106173396},
{'condition': 'eggs >= 0.50',
'attributes': ['eggs'],
'importance': 0.060634901349317274}],
'invertebrate': [{'condition': 'backbone < 0.50',
'attributes': ['backbone'],
'importance': 0.6437411794554653},
{'condition': 'airborne < 0.50',
'attributes': ['airborne'],
'importance': 0.17127713556284985}],
'mammal': [{'condition': 'milk >= 0.50',
'attributes': ['milk'],
'importance': 1.0}],
'reptile': [{'condition': 'hair < 0.50',
'attributes': ['hair'],
'importance': 0.49337088161384896},
{'condition': 'aquatic < 0.50',
'attributes': ['aquatic'],
'importance': 0.4172718233231141},
{'condition': 'toothed >= 0.50',
'attributes': ['toothed'],
'importance': 0.4041948772993855},
{'condition': 'legs < 2.00',
'attributes': ['legs'],
'importance': 0.2763037988581467},
{'condition': 'fins < 0.50',
'attributes': ['fins'],
'importance': 0.23977086814257867},
{'condition': 'feathers < 0.50',
'attributes': ['feathers'],
'importance': 0.18732960390946504},
{'condition': 'backbone >= 0.50',
'attributes': ['backbone'],
'importance': 0.16063629518072292}]}
Attribute importances:
{'amphibian': {'legs': 0.21835455831817263,
'toothed': 0.15137644827521457,
'aquatic': 0.0706758304696449,
'hair': 0.05970494134376112},
'bird': {'feathers': 1.0},
'fish': {'fins': 0.812392368349497, 'eggs': 0.187607631650503},
'insect': {'legs': 0.47480739916779957,
'aquatic': 0.11461012106173396,
'eggs': 0.060634901349317274},
'invertebrate': {'backbone': 0.6437411794554653,
'airborne': 0.17127713556284985},
'mammal': {'milk': 1.0},
'reptile': {'hair': 0.49337088161384896,
'aquatic': 0.4172718233231141,
'toothed': 0.4041948772993855,
'legs': 0.2763037988581467,
'fins': 0.23977086814257867,
'feathers': 0.18732960390946504,
'backbone': 0.16063629518072292}}
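In this output every attribute appears in exactly one condition per class, so the attribute importances equal the condition importances. The sketch below shows one plausible aggregation, summing the importances of all conditions that reference an attribute. That this is how decision-rules aggregates is an assumption; it is only known to be consistent with the values printed above. The function `aggregate_by_attribute` is hypothetical.

```python
from collections import defaultdict


def aggregate_by_attribute(condition_importances):
    """Sum condition importances per attribute, separately for each class."""
    result = {}
    for class_name, conditions in condition_importances.items():
        totals = defaultdict(float)
        for condition in conditions:
            for attribute in condition["attributes"]:
                totals[attribute] += condition["importance"]
        result[class_name] = dict(totals)
    return result


# Subset of the condition importances printed above.
sample = {
    "fish": [
        {"condition": "fins >= 0.50", "attributes": ["fins"], "importance": 0.812392368349497},
        {"condition": "eggs >= 0.50", "attributes": ["eggs"], "importance": 0.187607631650503},
    ],
    "mammal": [
        {"condition": "milk >= 0.50", "attributes": ["milk"], "importance": 1.0},
    ],
}

aggregated = aggregate_by_attribute(sample)
print(aggregated)
```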
We can serialize the decision-rules rule set to a Python dict, which can later be stored in JSON format. We do this using JSONSerializer.serialize.
[13]:
from decision_rules.serialization.utils import JSONSerializer
from decision_rules.classification.ruleset import ClassificationRuleSet
[14]:
ruleset_json = JSONSerializer.serialize(decision_rules_ruleset)
print(ruleset_json)
{'meta': {'attributes': ['hair', 'feathers', 'eggs', 'milk', 'airborne', 'aquatic', 'predator', 'toothed', 'backbone', 'breathes', 'venomous', 'fins', 'legs', 'tail', 'domestic', 'catsize'], 'decision_attribute': 'class', 'decision_attribute_distribution': {'amphibian': 4, 'bird': 20, 'fish': 13, 'insect': 8, 'invertebrate': 10, 'mammal': 41, 'reptile': 5}}, 'rules': [{'uuid': '3357850a-4701-41ee-92f5-17d524564033', 'string': 'IF aquatic >= 0.50 AND legs >= 3.00 AND toothed >= 0.50 AND hair < 0.50 THEN class = amphibian', 'premise': {'type': 'compound', 'operator': 'CONJUNCTION', 'subconditions': [{'type': 'elementary_numerical', 'attributes': [5], 'negated': False, 'left': 0.5, 'right': None, 'left_closed': True, 'right_closed': False}, {'type': 'elementary_numerical', 'attributes': [12], 'negated': False, 'left': 3.0, 'right': None, 'left_closed': True, 'right_closed': False}, {'type': 'elementary_numerical', 'attributes': [7], 'negated': False, 'left': 0.5, 'right': None, 'left_closed': True, 'right_closed': False}, {'type': 'elementary_numerical', 'attributes': [0], 'negated': False, 'left': None, 'right': 0.5, 'left_closed': False, 'right_closed': False}]}, 'conclusion': {'value': 'amphibian'}, 'coverage': {'p': 4, 'n': 0, 'P': 4, 'N': 97}}, {'uuid': '0d28724c-56c4-47bb-847e-96fc60e191d9', 'string': 'IF feathers >= 0.50 THEN class = bird', 'premise': {'type': 'compound', 'operator': 'CONJUNCTION', 'subconditions': [{'type': 'elementary_numerical', 'attributes': [1], 'negated': False, 'left': 0.5, 'right': None, 'left_closed': True, 'right_closed': False}]}, 'conclusion': {'value': 'bird'}, 'coverage': {'p': 20, 'n': 0, 'P': 20, 'N': 81}}, {'uuid': '9804e9ab-07d5-4954-8a32-22cb0033196a', 'string': 'IF fins >= 0.50 AND eggs >= 0.50 THEN class = fish', 'premise': {'type': 'compound', 'operator': 'CONJUNCTION', 'subconditions': [{'type': 'elementary_numerical', 'attributes': [11], 'negated': False, 'left': 0.5, 'right': None, 'left_closed': True, 'right_closed': 
False}, {'type': 'elementary_numerical', 'attributes': [2], 'negated': False, 'left': 0.5, 'right': None, 'left_closed': True, 'right_closed': False}]}, 'conclusion': {'value': 'fish'}, 'coverage': {'p': 13, 'n': 0, 'P': 13, 'N': 88}}, {'uuid': 'd7aa0f70-1492-4114-99f7-2a26a000c366', 'string': 'IF legs >= 5.50 AND aquatic < 0.50 AND eggs >= 0.50 THEN class = insect', 'premise': {'type': 'compound', 'operator': 'CONJUNCTION', 'subconditions': [{'type': 'elementary_numerical', 'attributes': [12], 'negated': False, 'left': 5.5, 'right': None, 'left_closed': True, 'right_closed': False}, {'type': 'elementary_numerical', 'attributes': [5], 'negated': False, 'left': None, 'right': 0.5, 'left_closed': False, 'right_closed': False}, {'type': 'elementary_numerical', 'attributes': [2], 'negated': False, 'left': 0.5, 'right': None, 'left_closed': True, 'right_closed': False}]}, 'conclusion': {'value': 'insect'}, 'coverage': {'p': 8, 'n': 0, 'P': 8, 'N': 93}}, {'uuid': '787b7170-b084-4830-a995-dd39cfe2ca15', 'string': 'IF backbone < 0.50 AND airborne < 0.50 THEN class = invertebrate', 'premise': {'type': 'compound', 'operator': 'CONJUNCTION', 'subconditions': [{'type': 'elementary_numerical', 'attributes': [8], 'negated': False, 'left': None, 'right': 0.5, 'left_closed': False, 'right_closed': False}, {'type': 'elementary_numerical', 'attributes': [4], 'negated': False, 'left': None, 'right': 0.5, 'left_closed': False, 'right_closed': False}]}, 'conclusion': {'value': 'invertebrate'}, 'coverage': {'p': 10, 'n': 2, 'P': 10, 'N': 91}}, {'uuid': '67f18dcb-1a70-43ad-9e16-713daedc5eca', 'string': 'IF milk >= 0.50 THEN class = mammal', 'premise': {'type': 'compound', 'operator': 'CONJUNCTION', 'subconditions': [{'type': 'elementary_numerical', 'attributes': [3], 'negated': False, 'left': 0.5, 'right': None, 'left_closed': True, 'right_closed': False}]}, 'conclusion': {'value': 'mammal'}, 'coverage': {'p': 41, 'n': 0, 'P': 41, 'N': 60}}, {'uuid': 
'd122f503-9a1c-40b4-a879-a8452235ef9f', 'string': 'IF toothed >= 0.50 AND fins < 0.50 AND legs < 2.00 THEN class = reptile', 'premise': {'type': 'compound', 'operator': 'CONJUNCTION', 'subconditions': [{'type': 'elementary_numerical', 'attributes': [7], 'negated': False, 'left': 0.5, 'right': None, 'left_closed': True, 'right_closed': False}, {'type': 'elementary_numerical', 'attributes': [11], 'negated': False, 'left': None, 'right': 0.5, 'left_closed': False, 'right_closed': False}, {'type': 'elementary_numerical', 'attributes': [12], 'negated': False, 'left': None, 'right': 2.0, 'left_closed': False, 'right_closed': False}]}, 'conclusion': {'value': 'reptile'}, 'coverage': {'p': 3, 'n': 0, 'P': 5, 'N': 96}}, {'uuid': 'cce9a48a-ffd9-4636-9330-c9c991d0a15d', 'string': 'IF hair < 0.50 AND toothed >= 0.50 AND aquatic < 0.50 THEN class = reptile', 'premise': {'type': 'compound', 'operator': 'CONJUNCTION', 'subconditions': [{'type': 'elementary_numerical', 'attributes': [0], 'negated': False, 'left': None, 'right': 0.5, 'left_closed': False, 'right_closed': False}, {'type': 'elementary_numerical', 'attributes': [7], 'negated': False, 'left': 0.5, 'right': None, 'left_closed': True, 'right_closed': False}, {'type': 'elementary_numerical', 'attributes': [5], 'negated': False, 'left': None, 'right': 0.5, 'left_closed': False, 'right_closed': False}]}, 'conclusion': {'value': 'reptile'}, 'coverage': {'p': 3, 'n': 0, 'P': 5, 'N': 96}}, {'uuid': '0485665e-f92e-42f5-86e5-d05625076cd8', 'string': 'IF hair < 0.50 AND feathers < 0.50 AND aquatic < 0.50 AND backbone >= 0.50 THEN class = reptile', 'premise': {'type': 'compound', 'operator': 'CONJUNCTION', 'subconditions': [{'type': 'elementary_numerical', 'attributes': [0], 'negated': False, 'left': None, 'right': 0.5, 'left_closed': False, 'right_closed': False}, {'type': 'elementary_numerical', 'attributes': [1], 'negated': False, 'left': None, 'right': 0.5, 'left_closed': False, 'right_closed': False}, {'type': 
'elementary_numerical', 'attributes': [5], 'negated': False, 'left': None, 'right': 0.5, 'left_closed': False, 'right_closed': False}, {'type': 'elementary_numerical', 'attributes': [8], 'negated': False, 'left': 0.5, 'right': None, 'left_closed': True, 'right_closed': False}]}, 'conclusion': {'value': 'reptile'}, 'coverage': {'p': 4, 'n': 0, 'P': 5, 'N': 96}}]}
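The serialized dictionary is plain data: each elementary condition stores the index of its attribute in meta['attributes'] together with interval bounds. As a reading aid, the sketch below turns such a condition back into text. The helper `render_condition` is hypothetical, handles only one-sided numerical conditions, and ignores the negated flag.

```python
def render_condition(condition, attribute_names):
    """Render a one-sided 'elementary_numerical' condition as text."""
    name = attribute_names[condition["attributes"][0]]
    left, right = condition["left"], condition["right"]
    if left is not None and right is None:
        operator = ">=" if condition["left_closed"] else ">"
        return f"{name} {operator} {left:.2f}"
    if right is not None and left is None:
        operator = "<=" if condition["right_closed"] else "<"
        return f"{name} {operator} {right:.2f}"
    raise ValueError("only one-sided conditions are handled in this sketch")


# Attribute list and first rule's subconditions, copied from the output above.
attributes = ['hair', 'feathers', 'eggs', 'milk', 'airborne', 'aquatic',
              'predator', 'toothed', 'backbone', 'breathes', 'venomous',
              'fins', 'legs', 'tail', 'domestic', 'catsize']
subconditions = [
    {'type': 'elementary_numerical', 'attributes': [5], 'negated': False,
     'left': 0.5, 'right': None, 'left_closed': True, 'right_closed': False},
    {'type': 'elementary_numerical', 'attributes': [12], 'negated': False,
     'left': 3.0, 'right': None, 'left_closed': True, 'right_closed': False},
    {'type': 'elementary_numerical', 'attributes': [0], 'negated': False,
     'left': None, 'right': 0.5, 'left_closed': False, 'right_closed': False},
]

premise = " AND ".join(render_condition(c, attributes) for c in subconditions)
print(premise)
```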
[15]:
import json
with open('output/zoo.json', 'w') as f:
    json.dump(ruleset_json, f)
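Writing to output/zoo.json fails if the output directory does not exist. The sketch below creates the directory first and confirms that a dictionary of this shape survives a JSON round trip unchanged. It uses a temporary directory and a small hand-written fragment in place of the full serialized rule set, both of which are assumptions made to keep the example self-contained.

```python
import json
import tempfile
from pathlib import Path

# Small stand-in for the serialized rule set (not the full structure).
ruleset_fragment = {
    "meta": {"attributes": ["hair", "feathers"], "decision_attribute": "class"},
    "rules": [
        {
            "premise": {
                "type": "compound",
                "subconditions": [
                    {"attributes": [1], "negated": False, "left": 0.5, "right": None}
                ],
            },
            "conclusion": {"value": "bird"},
        }
    ],
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "output" / "zoo.json"
    # Create the target directory if it is missing.
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w") as f:
        json.dump(ruleset_fragment, f)
    with path.open() as f:
        restored = json.load(f)

# None and False map to JSON null and false and back again.
print(restored == ruleset_fragment)
```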
The serialized ruleset can be reloaded using JSONSerializer.deserialize.
[16]:
with open('output/zoo.json') as f:
    deserialized_ruleset_json = json.load(f)
deserialized_ruleset = JSONSerializer.deserialize(deserialized_ruleset_json, target_class=ClassificationRuleSet)
Let’s check if the rules are the same.
[17]:
for rule in deserialized_ruleset.rules:
    print(rule)
IF aquatic >= 0.50 AND legs >= 3.00 AND toothed >= 0.50 AND hair < 0.50 THEN class = amphibian (p=4, n=0, P=4, N=97)
IF feathers >= 0.50 THEN class = bird (p=20, n=0, P=20, N=81)
IF fins >= 0.50 AND eggs >= 0.50 THEN class = fish (p=13, n=0, P=13, N=88)
IF legs >= 5.50 AND aquatic < 0.50 AND eggs >= 0.50 THEN class = insect (p=8, n=0, P=8, N=93)
IF backbone < 0.50 AND airborne < 0.50 THEN class = invertebrate (p=10, n=2, P=10, N=91)
IF milk >= 0.50 THEN class = mammal (p=41, n=0, P=41, N=60)
IF toothed >= 0.50 AND fins < 0.50 AND legs < 2.00 THEN class = reptile (p=3, n=0, P=5, N=96)
IF hair < 0.50 AND toothed >= 0.50 AND aquatic < 0.50 THEN class = reptile (p=3, n=0, P=5, N=96)
IF hair < 0.50 AND feathers < 0.50 AND aquatic < 0.50 AND backbone >= 0.50 THEN class = reptile (p=4, n=0, P=5, N=96)
Some methods of a deserialized rule set may require calling update first. Here we call it with the data and the c2 measure; after that, the object is ready for prediction.
[18]:
deserialized_ruleset.update(X, y, c2)
y_pred = deserialized_ruleset.predict(X)
display(y_pred)
array(['mammal', 'mammal', 'fish', 'mammal', 'mammal', 'mammal', 'mammal',
'fish', 'fish', 'mammal', 'mammal', 'bird', 'fish', 'invertebrate',
'invertebrate', 'invertebrate', 'bird', 'mammal', 'fish', 'mammal',
'bird', 'bird', 'mammal', 'bird', 'insect', 'amphibian',
'amphibian', 'mammal', 'mammal', 'mammal', 'insect', 'mammal',
'mammal', 'bird', 'fish', 'mammal', 'mammal', 'bird', 'fish',
'insect', 'insect', 'bird', 'insect', 'bird', 'mammal', 'mammal',
'invertebrate', 'mammal', 'mammal', 'mammal', 'mammal', 'insect',
'amphibian', 'invertebrate', 'mammal', 'mammal', 'bird', 'bird',
'bird', 'bird', 'fish', 'fish', 'reptile', 'mammal', 'mammal',
'mammal', 'mammal', 'mammal', 'mammal', 'mammal', 'mammal', 'bird',
'invertebrate', 'fish', 'mammal', 'mammal', 'reptile',
'invertebrate', 'bird', 'bird', 'reptile', 'invertebrate', 'fish',
'bird', 'mammal', 'invertebrate', 'fish', 'bird', 'insect',
'amphibian', 'reptile', 'reptile', 'fish', 'mammal', 'mammal',
'bird', 'mammal', 'insect', 'mammal', 'invertebrate', 'bird'],
dtype='<U12')
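One way to see why a quality measure is passed to update is to think of prediction as weighted voting: every rule that covers an example votes for its conclusion, and the measure supplies the weight of the vote. The sketch below illustrates that idea only. Whether decision-rules resolves conflicts in exactly this way is an assumption, and the rule encoding, weights and example animal are hypothetical (the weights are rounded C2 values from the metrics printed earlier).

```python
from collections import defaultdict


def predict_by_voting(example, rules, default):
    """Return the conclusion with the highest total weight among covering rules."""
    votes = defaultdict(float)
    for rule in rules:
        if rule["covers"](example):
            votes[rule["conclusion"]] += rule["weight"]
    if not votes:
        # No rule covers the example, so fall back to a default class.
        return default
    return max(votes, key=votes.get)


# Two rules from the rule set, encoded by hand for illustration.
voting_rules = [
    {
        "covers": lambda e: e["backbone"] < 0.5 and e["airborne"] < 0.5,
        "conclusion": "invertebrate",
        "weight": 0.815,
    },
    {
        "covers": lambda e: e["legs"] >= 5.5 and e["aquatic"] < 0.5 and e["eggs"] >= 0.5,
        "conclusion": "insect",
        "weight": 1.0,
    },
]

# A hypothetical six-legged, egg-laying, flightless animal covered by both rules.
animal = {"backbone": False, "airborne": False, "legs": 6.0, "aquatic": False, "eggs": True}
prediction = predict_by_voting(animal, voting_rules, default="mammal")
print(prediction)
```

Under this scheme an example covered by both rules is assigned to the class whose rule carries the larger weight, which is one way a rule with n > 0 can coexist with error-free predictions.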