decision_rules.regression

decision_rules.regression.metrics

Contains the class for calculating metrics of regression rules.

class decision_rules.regression.metrics.RegressionRulesMetrics(rules: list[RegressionRule])

Bases: AbstractRulesMetrics

Returns metrics object instance.

calculate_p_value(coverage: Coverage | None = None, rule: RegressionRule | None = None, y: ndarray | None = None) float

Calculates the rule p-value on the given dataset, based on a χ² (chi-squared) test comparing the label variance of covered vs. uncovered examples.

Returns:

rule p-value

Return type:

float

get_metrics_calculator(rule: RegressionRule, X: ndarray, y: ndarray) dict[str, Callable[[], Any]]

Returns a metrics calculator object in the form of a dictionary, where the keys are metric names and the values are non-parameterized callables that calculate those metrics.

Examples

>>> {
>>>  'p': lambda: rule.coverage.p,
>>>  'n': lambda: rule.coverage.n,
>>>  'P': lambda: rule.coverage.P,
>>>  'N': lambda: rule.coverage.N,
>>>  'coverage': lambda: measures.coverage(rule.coverage),
>>>  ...
>>> }

Parameters:
  • rule (AbstractRule) – rule

  • X (pd.DataFrame) – data

  • y (pd.Series) – labels

Returns:

metrics calculator object

Return type:

dict[str, Callable[[], Any]]
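The dictionary-of-callables pattern described above can be sketched in plain Python. This is a minimal illustration and not the library's implementation: `ToyCoverage` and `build_metrics_calculator` are hypothetical names, and the `coverage` and `precision` formulas (p / P and p / (p + n)) are assumptions chosen only for the example.

```python
from typing import Any, Callable, NamedTuple


class ToyCoverage(NamedTuple):
    """Hypothetical stand-in for a rule coverage (not the library's Coverage)."""
    p: int  # covered positive examples
    n: int  # covered negative examples
    P: int  # all positive examples
    N: int  # all negative examples


def build_metrics_calculator(cov: ToyCoverage) -> dict[str, Callable[[], Any]]:
    # Each value is a zero-argument callable, so a metric is computed
    # only when the caller actually asks for it.
    return {
        "p": lambda: cov.p,
        "n": lambda: cov.n,
        "P": lambda: cov.P,
        "N": lambda: cov.N,
        # Assumed definitions, for illustration only.
        "coverage": lambda: cov.p / cov.P,
        "precision": lambda: cov.p / (cov.p + cov.n),
    }


calculator = build_metrics_calculator(ToyCoverage(p=8, n=2, P=10, N=20))
# Evaluate only the metrics of interest.
selected = {name: calculator[name]() for name in ("p", "coverage", "precision")}
```

Because nothing is computed until a callable is invoked, a caller can request a subset of metrics without paying for the rest.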

property supported_metrics: list[str]

Returns:

list of names of all supported metrics

Return type:

list[str]

decision_rules.regression.prediction

class decision_rules.regression.prediction.VotingPredictionStrategy(rules: list[AbstractRule], default_conclusion: AbstractConclusion)

Bases: PredictionStrategy

Voting prediction strategy for regression prediction.

decision_rules.regression.prediction_indicators

class decision_rules.regression.prediction_indicators.RegressionGeneralPredictionIndicators

Bases: TypedDict

Covered_by_prediction: int
MAE: float
MAPE: float
Not_covered_by_prediction: int
RMSE: float
maxError: float
rMAE: float
rRMSE: float

class decision_rules.regression.prediction_indicators.RegressionPredictionHistogram

Bases: TypedDict

bin_edges: list[float]
histogram: list[int]
max: float
min: float

class decision_rules.regression.prediction_indicators.RegressionPredictionIndicators

Bases: TypedDict

general: RegressionGeneralPredictionIndicators
histogram: RegressionPredictionHistogram
type_of_problem: str

decision_rules.regression.prediction_indicators.calculate_for_regression(y_true: ndarray, y_pred: ndarray, calculate_only_for_covered_examples: bool = False) RegressionPredictionIndicators

Calculate prediction indicators for regression problem.

Parameters:
  • y_true (np.ndarray) – Array containing the actual labels.

  • y_pred (np.ndarray) – Array containing the predicted labels.

  • calculate_only_for_covered_examples (bool, optional) – If true, it will calculate indicators only for the examples where prediction was not empty. Otherwise, it will calculate indicators for all the examples. Defaults to False.

Returns:

A dictionary containing prediction indicators

Return type:

RegressionPredictionIndicators
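As a rough illustration of the "covered examples only" case, the sketch below computes a few of the general indicators with the standard library. It is not the library's code: `indicators_for_covered` is a hypothetical helper, it assumes that an empty prediction is encoded as NaN, and it uses the textbook definitions of MAE, RMSE and maximum absolute error. MAPE, rMAE and rRMSE are left out because their exact definitions are not stated here.

```python
import math


def indicators_for_covered(y_true, y_pred):
    """Illustrative sketch; not the library's calculate_for_regression."""
    # Assumption: an empty prediction is encoded as NaN.
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if not math.isnan(p)]
    if not pairs:
        raise ValueError("no covered examples")
    errors = [t - p for t, p in pairs]
    n = len(errors)
    return {
        "Covered_by_prediction": n,
        "Not_covered_by_prediction": len(y_pred) - n,
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        "maxError": max(abs(e) for e in errors),
    }


result = indicators_for_covered(
    y_true=[3.0, 5.0, 2.0, 7.0],
    y_pred=[2.0, 5.0, float("nan"), 9.0],
)
```

In this toy input the third example has no prediction, so the errors are computed over the remaining three examples.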

decision_rules.regression.rule

Contains regression rule and conclusion classes.

class decision_rules.regression.rule.RegressionConclusion(value: float, column_name: str, fixed: bool = False, low: float | None = None, high: float | None = None)

Bases: AbstractConclusion

Conclusion part of the regression rule

calculate_low_high()

Automatically calculate low and high values based on standard deviation of covered examples label attribute where:

low = value - train_covered_y_std
high = value + train_covered_y_std

Parameters:

train_covered_y_std (float) – standard deviation of covered examples label attribute

Raises:

ValueError – When conclusion value is None
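The rule above amounts to a symmetric interval around the conclusion value. A minimal sketch, assuming the standard deviation is supplied by the caller; `low_high` is a hypothetical free function, not the method itself:

```python
from typing import Optional, Tuple


def low_high(value: Optional[float], train_covered_y_std: float) -> Tuple[float, float]:
    # Mirrors the documented behaviour: fail when there is no value to centre on.
    if value is None:
        raise ValueError("conclusion value is None")
    return value - train_covered_y_std, value + train_covered_y_std


low, high = low_high(value=10.0, train_covered_y_std=2.5)
```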

is_empty() bool

Returns whether conclusion is empty or not.

static make_empty(column_name: str) RegressionConclusion

Creates empty conclusion. Use it when you don’t want to use default conclusion during prediction.

Parameters:

column_name (str) – decision column name

Returns:

empty conclusion

Return type:

AbstractConclusion

positives_mask(y: ndarray) ndarray

Calculates positive examples mask

Parameters:

y (np.ndarray)

Returns:

1-dimensional numpy array of booleans specifying whether given examples are consistent with the conclusion.

Return type:

np.ndarray
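One plausible reading of "consistent with the conclusion" is that a label falls inside the conclusion's [low, high] interval. That reading is an assumption, not something stated above, and `toy_positives_mask` is a hypothetical helper that returns a plain list rather than a numpy array:

```python
def toy_positives_mask(y, low, high):
    # Assumption: an example is "consistent" when its label lies in [low, high].
    return [low <= label <= high for label in y]


mask = toy_positives_mask([6.0, 7.5, 10.0, 12.5, 13.0], low=7.5, high=12.5)
```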

property value: float

Returns:

Conclusion’s value

Return type:

float

class decision_rules.regression.rule.RegressionRule(premise: AbstractCondition, conclusion: RegressionConclusion, column_names: list[str])

Bases: AbstractRule

Regression rule.

calculate_coverage(X: ndarray, y: ndarray = None, P: int = None, N: int = None) Coverage

Parameters:
  • X (np.ndarray)

  • y (np.ndarray, optional) – if None then P and N params should be passed. Defaults to None.

  • P (int, optional) – optional number of all examples from rule decision class. Defaults to None.

  • N (int, optional) – optional number of all examples not from rule decision class. Defaults to None.

Raises:

ValueError – if y is None and either P or N is None too

Returns:

rule coverage

Return type:

Coverage
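The argument contract above (either `y`, or both `P` and `N`) can be sketched as follows. `resolve_totals` is a hypothetical helper; deriving P and N from a boolean positives mask when labels are available is an assumption about how the totals would be obtained:

```python
from typing import Optional, Sequence, Tuple


def resolve_totals(
    positives_mask: Optional[Sequence[bool]],
    P: Optional[int] = None,
    N: Optional[int] = None,
) -> Tuple[int, int]:
    if positives_mask is None:
        # Without labels, both totals must be supplied by the caller.
        if P is None or N is None:
            raise ValueError("pass either y or both P and N")
        return P, N
    # Assumption: with labels available, the totals are counted from the mask.
    positives = sum(1 for flag in positives_mask if flag)
    return positives, len(positives_mask) - positives


from_mask = resolve_totals([True, False, True])
from_counts = resolve_totals(None, P=4, N=6)
```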

get_coverage_dict() dict

decision_rules.regression.ruleset

Contains regression ruleset class.

class decision_rules.regression.ruleset.RegressionRuleSet(rules: list[RegressionRule])

Bases: AbstractRuleSet

Regression ruleset allowing to perform prediction on data

calculate_attribute_importances(condition_importances: dict[str, float]) dict[str, float]

Calculate importances of attributes in RuleSet based on condition importances

Parameters:
  • condition_importances (Union[dict[str, float], dict[str, dict[str, float]]]) – condition importances

Returns:

attribute importances; in the case of classification, additionally returns information about class: dict[str, dict[str, float]]

Return type:

dict[str, float]

calculate_condition_importances(X: DataFrame, y: Series, measure: Callable[[Coverage], float]) dict[str, dict[str, float]]

Calculate importances of conditions in RuleSet

Parameters:
  • X (pd.DataFrame)

  • y (pd.Series)

  • measure (Callable[[Coverage], float]) – measure used to count importance

Returns:

condition importances; in the case of classification, additionally returns information about class: dict[str, dict[str, float]]

Return type:

dict[str, float]

calculate_p_values(y: ndarray) list

get_default_prediction_strategy_class() Type[PredictionStrategy]

Returns default prediction strategy class used when user doesn’t specify any.

Returns:

class implementing PredictionStrategy interface

Return type:

Type[PredictionStrategy]

get_metrics_object_instance() AbstractRulesMetrics

Returns metrics object instance.

property prediction_strategies_choice: dict[str, Type[PredictionStrategy]]

Specifies prediction strategies available for this model.

Returns:

Dictionary containing available prediction strategies. Keys are prediction strategy names and values are classes implementing the PredictionStrategy interface for this model.

Return type:

dict[str, Type[PredictionStrategy]]

update(X_train: DataFrame, y_train: Series, measure: Callable[[Coverage], float]) ndarray

Updates the ruleset using the training dataset. This method should be called both after creating a new ruleset and after manipulating any of its rules or internal conditions. It recalculates rule coverages and voting weights, making the ruleset ready for prediction.

Parameters:
  • X_train (pd.DataFrame)

  • y_train (pd.Series)

  • measure (Callable[[Coverage], float]) – voting measure function

Raises:

ValueError – if called on empty ruleset with no rules

update_using_coverages(coverages_info: dict[str, RegressionCoverageInfodict], measure: Callable[[Coverage], float], columns_names: list[str] = None)

property y_train_median: float

Returns:

label’s median value on train dataset

Return type:

float