decision_rules.similarity

decision_rules.similarity.calculate

class decision_rules.similarity.calculate.SimilarityType(value, names=None, *values, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: str, Enum

SEMANTIC = 'semantic'
SYNTACTIC = 'syntactic'
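
Because SimilarityType derives from both str and Enum, its members can be looked up by value or by name and compare equal to plain strings. The sketch below illustrates this with a stand-in class, SimilarityTypeSketch, which is a hypothetical name that mirrors the two members listed above; it is not imported from the library.

```python
from enum import Enum


class SimilarityTypeSketch(str, Enum):
    """Stand-in (hypothetical) with the same bases and members as SimilarityType."""

    SEMANTIC = "semantic"
    SYNTACTIC = "syntactic"


# Lookup by value, e.g. when the type comes from a config file or CLI argument.
by_value = SimilarityTypeSketch("semantic")

# Lookup by member name.
by_name = SimilarityTypeSketch["SYNTACTIC"]

# The str mixin makes members compare equal to their string values.
is_plain_string_equal = by_value == "semantic"
```

An unknown value, such as SimilarityTypeSketch("other"), raises ValueError, which is standard Enum behaviour.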
decision_rules.similarity.calculate.calculate_rule_similarity(ruleset1: AbstractRuleSet, ruleset2: AbstractRuleSet, dataset, similarity_type: SimilarityType, measure: SimilarityMeasure = None) → ndarray

Calculates the similarity between the rules of two rulesets, using the specified similarity type and, for semantic similarity, the specified measure.

Parameters:
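  • ruleset1 (AbstractRuleSet) – the first ruleset

  • ruleset2 (AbstractRuleSet) – the second ruleset

  • dataset (pd.DataFrame) – the dataset used for the calculation

  • similarity_type (SimilarityType) – SEMANTIC or SYNTACTIC

  • measure (SimilarityMeasure) – the similarity measure, used for semantic similarity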

Returns:

Matrix (np.ndarray) of rule similarities.

decision_rules.similarity.ruleset

decision_rules.similarity.ruleset.calculate_ruleset_similarity(ruleset1: AbstractRuleSet, ruleset2: AbstractRuleSet, dataset: DataFrame) → float

Calculates the similarity between two rulesets based on the number of pairs of examples that are covered by the same rules in both rulesets.

Parameters:
  • ruleset1 – first ruleset

  • ruleset2 – second ruleset

  • dataset – dataset to calculate the similarity for

Returns:

similarity between the rulesets
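
The pair-counting idea described above can be sketched in plain Python. This is one possible reading of the description, not the library's implementation: it assumes that two examples count as grouped together by a ruleset when at least one rule covers both, and that the score is the fraction of example pairs on which the two rulesets agree. The function name pairwise_agreement and the cover_a / cover_b inputs are hypothetical.

```python
from itertools import combinations


def pairwise_agreement(cover_a, cover_b):
    """Fraction of example pairs that both rulesets treat the same way.

    cover_a[i] and cover_b[i] are the sets of rule ids covering example i
    in the first and second ruleset (hypothetical input format).
    """
    if len(cover_a) != len(cover_b):
        raise ValueError("both coverings must describe the same examples")
    n = len(cover_a)
    if n < 2:
        return 1.0  # assumption: no pairs to disagree on

    agree = 0
    total = 0
    for i, j in combinations(range(n), 2):
        # Assumption: a pair is "together" if some rule covers both examples.
        together_a = bool(cover_a[i] & cover_a[j])
        together_b = bool(cover_b[i] & cover_b[j])
        agree += together_a == together_b
        total += 1
    return agree / total


# Four examples; each set lists the rules that cover that example.
cover_a = [{0}, {0}, {1}, {1}]
cover_b = [{0}, {0}, {0}, {1}]

same = pairwise_agreement(cover_a, cover_a)
different = pairwise_agreement(cover_a, cover_b)
```

In this toy data the rulesets agree on three of the six example pairs, so the second score is 0.5, while a ruleset compared with itself scores 1.0.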

decision_rules.similarity.semantic

class decision_rules.similarity.semantic.SimilarityMeasure(value, names=None, *values, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: str, Enum

CORRELATION = 'Correlation'
JACCARD = 'Jaccard'
KULCZYNSKI = 'Kulczynski'
decision_rules.similarity.semantic.calculate_semantic_similarity_matrix(measure: SimilarityMeasure, ruleset1: AbstractRuleSet, ruleset2: AbstractRuleSet, dataset: DataFrame) → ndarray

Calculates the similarity matrix based on the specified similarity measure.

Parameters:
  • measure (SimilarityMeasure) – The similarity measure to use

  • ruleset1 (AbstractRuleSet) – The first ruleset

  • ruleset2 (AbstractRuleSet) – The second ruleset

  • dataset (pd.DataFrame) – The dataset to use for calculations

Raises:

ValueError – If an invalid measure is specified.

Returns:

The similarity matrix.

Return type:

np.ndarray
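
As an illustration of what a semantic similarity matrix can look like, the sketch below computes Jaccard similarity between the coverage sets of rules. It assumes that semantic similarity compares which examples each rule covers; the rules are modelled as plain predicates over row dictionaries, and the helper names (coverage, jaccard, jaccard_matrix) are hypothetical, not part of the library. Returning 0.0 when neither rule covers any example is also an assumption.

```python
def coverage(rule, rows):
    """Indices of the rows for which the rule's premise holds."""
    return {i for i, row in enumerate(rows) if rule(row)}


def jaccard(a, b):
    """|a ∩ b| / |a ∪ b|; 0.0 by convention when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def jaccard_matrix(rules1, rules2, rows):
    """Entry [i][j] compares rule i of the first set with rule j of the second."""
    cov1 = [coverage(rule, rows) for rule in rules1]
    cov2 = [coverage(rule, rows) for rule in rules2]
    return [[jaccard(a, b) for b in cov2] for a in cov1]


# Toy dataset and rules (hypothetical).
rows = [
    {"age": 25, "income": 30},
    {"age": 40, "income": 60},
    {"age": 55, "income": 80},
    {"age": 35, "income": 20},
]
rules1 = [lambda r: r["age"] >= 35, lambda r: r["income"] < 50]
rules2 = [lambda r: r["age"] >= 40]

matrix = jaccard_matrix(rules1, rules2, rows)
```

Here the first rule of rules1 covers rows 1, 2 and 3, and the rule in rules2 covers rows 1 and 2, so their entry is 2/3. The second rule of rules1 covers rows 0 and 3, which shares nothing with rows 1 and 2, so its entry is 0.0.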

decision_rules.similarity.syntactic

class decision_rules.similarity.syntactic.SyntacticRuleSimilarityCalculator(ruleset1: AbstractRuleSet, ruleset2: AbstractRuleSet, dataset: DataFrame)

Bases: object

Calculates the syntactic similarity between rules. Caveat: it assumes that the conditions within each rule are joined only by conjunction (AND) operators.

calculate() → ndarray