decision_rules.regression
decision_rules.regression.metrics
Contains class for calculating rule metrics for regression rules
- class decision_rules.regression.metrics.RegressionRulesMetrics(rules: list[RegressionRule])
Bases:
AbstractRulesMetrics
Returns metrics object instance.
- calculate_p_value(coverage: Coverage | None = None, rule: RegressionRule | None = None, y: ndarray | None = None) float
Calculates the rule p-value on the given dataset based on a χ² test comparing the label variance of covered vs. uncovered examples.
- Parameters:
rule (RegressionRule) – rule
y (np.ndarray) – labels
- Returns:
rule p-value
- Return type:
float
- get_metrics_calculator(rule: RegressionRule, X: ndarray, y: ndarray) dict[str, Callable[[], Any]]
Returns a metrics calculator object in the form of a dictionary whose values are non-parametrized callables calculating the specified metrics and whose keys are the names of those metrics.
Examples
>>> {
>>>     'p': lambda: rule.coverage.p,
>>>     'n': lambda: rule.coverage.n,
>>>     'P': lambda: rule.coverage.P,
>>>     'N': lambda: rule.coverage.N,
>>>     'coverage': lambda: measures.coverage(rule.coverage),
>>>     ...
>>> }
- Parameters:
rule (AbstractRule) – rule
X (pd.DataFrame) – data
y (pd.Series) – labels
- Returns:
metrics calculator object
- Return type:
dict[str, Callable[[], Any]]
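The dictionary-of-callables pattern described above can be illustrated with a minimal, self-contained sketch. It does not use the library: `Coverage` and `build_metrics_calculator` are hypothetical stand-ins, and the coverage formula `p / P` is an assumption made for illustration only.

```python
from typing import Any, Callable, NamedTuple


class Coverage(NamedTuple):
    """Hypothetical stand-in for the library's coverage object."""
    p: int  # covered examples consistent with the conclusion
    n: int  # covered examples not consistent with the conclusion
    P: int  # all examples consistent with the conclusion
    N: int  # all examples not consistent with the conclusion


def build_metrics_calculator(cov: Coverage) -> dict[str, Callable[[], Any]]:
    # Each value is a zero-argument callable, so a metric is computed
    # only when the caller actually invokes it.
    return {
        "p": lambda: cov.p,
        "n": lambda: cov.n,
        "P": lambda: cov.P,
        "N": lambda: cov.N,
        # Assumed definition: share of positive examples that are covered.
        "coverage": lambda: cov.p / cov.P if cov.P else 0.0,
    }


calculator = build_metrics_calculator(Coverage(p=30, n=5, P=40, N=60))
# Evaluate only the metrics of interest.
selected = {name: calculator[name]() for name in ("p", "coverage")}
print(selected)
```

Because every value is a callable, a caller can pick a subset of metric names and pay only for the metrics it evaluates.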
- property supported_metrics: list[str]
Returns: list[str]: list of names of all supported metrics
decision_rules.regression.prediction
- class decision_rules.regression.prediction.VotingPredictionStrategy(rules: list[AbstractRule], default_conclusion: AbstractConclusion)
Bases:
PredictionStrategy
Voting prediction strategy for regression prediction.
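One common way to implement voting for regression is a weighted mean of the conclusions of the rules covering an example, falling back to a default conclusion when no rule covers it. The sketch below shows that idea under stated assumptions; it is not taken from the library's source, and the function `vote` and its parameters are hypothetical.

```python
from typing import Optional, Sequence


def vote(
    values: Sequence[float],
    weights: Sequence[float],
    covers: Sequence[bool],
    default: Optional[float],
) -> Optional[float]:
    """Weighted mean of the conclusions of the rules covering one example."""
    covering = [(v, w) for v, w, c in zip(values, weights, covers) if c]
    total_weight = sum(w for _, w in covering)
    if not covering or total_weight == 0:
        # No rule covers the example: fall back to the default conclusion.
        return default
    return sum(v * w for v, w in covering) / total_weight


# Three rules with conclusions 10, 20, 40 and voting weights 1, 3, 2.
prediction = vote([10.0, 20.0, 40.0], [1.0, 3.0, 2.0], [True, True, False], default=15.0)
fallback = vote([10.0, 20.0, 40.0], [1.0, 3.0, 2.0], [False, False, False], default=15.0)
print(prediction, fallback)
```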
decision_rules.regression.prediction_indicators
- class decision_rules.regression.prediction_indicators.RegressionGeneralPredictionIndicators
Bases:
TypedDict
- Covered_by_prediction: int
- MAE: float
- MAPE: float
- Not_covered_by_prediction: int
- RMSE: float
- maxError: float
- rMAE: float
- rRMSE: float
- class decision_rules.regression.prediction_indicators.RegressionPredictionHistogram
Bases:
TypedDict
- bin_edges: list[float]
- histogram: list[int]
- max: float
- min: float
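To make the shape of this dictionary concrete, the following sketch builds one from a list of predictions using equal-width bins. The function name `prediction_histogram`, the `bins` parameter and the binning scheme are assumptions for illustration; the source does not state how the library chooses its bins.

```python
def prediction_histogram(y_pred, bins=10):
    """Build a dict with the four fields listed above (equal-width bins assumed)."""
    lowest, highest = min(y_pred), max(y_pred)
    width = (highest - lowest) / bins
    bin_edges = [lowest + i * width for i in range(bins + 1)]
    counts = [0] * bins
    for value in y_pred:
        if width == 0:
            # All predictions are identical: put everything in the first bin.
            index = 0
        else:
            # The maximum value belongs to the last bin, not a new one.
            index = min(int((value - lowest) / width), bins - 1)
        counts[index] += 1
    return {"bin_edges": bin_edges, "histogram": counts, "min": lowest, "max": highest}


hist = prediction_histogram([1.0, 2.0, 2.5, 4.0, 5.0], bins=4)
print(hist)
```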
- class decision_rules.regression.prediction_indicators.RegressionPredictionIndicators
Bases:
TypedDict
- histogram: RegressionPredictionHistogram
- type_of_problem: str
- decision_rules.regression.prediction_indicators.calculate_for_regression(y_true: ndarray, y_pred: ndarray, calculate_only_for_covered_examples: bool = False) RegressionPredictionIndicators
Calculate prediction indicators for regression problem.
- Parameters:
y_true (np.ndarray) – Array containing the actual labels.
y_pred (np.ndarray) – Array containing the predicted labels.
calculate_only_for_covered_examples (bool, optional) – If True, indicators are calculated only for the examples whose prediction was not empty. Otherwise, indicators are calculated for all examples. Defaults to False.
- Returns:
A dictionary containing prediction indicators
- Return type:
RegressionPredictionIndicators
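As a rough illustration of the indicators named above, the sketch below computes MAE, RMSE and maxError with the standard library only. It is not the library's implementation: it assumes that an empty prediction is represented as NaN, it assumes at least one example remains after filtering, and it omits MAPE, rMAE and rRMSE because their exact definitions are not given here. The function name and `only_covered` parameter are hypothetical.

```python
import math


def regression_indicators(y_true, y_pred, only_covered=False):
    # Assumption: NaN marks an example for which no prediction was produced.
    covered = [not math.isnan(p) for p in y_pred]
    pairs = list(zip(y_true, y_pred))
    if only_covered:
        pairs = [pair for pair, is_covered in zip(pairs, covered) if is_covered]
    errors = [true - pred for true, pred in pairs]
    abs_errors = [abs(e) for e in errors]
    return {
        "Covered_by_prediction": sum(covered),
        "Not_covered_by_prediction": len(covered) - sum(covered),
        "MAE": sum(abs_errors) / len(abs_errors),
        "RMSE": math.sqrt(sum(e * e for e in errors) / len(errors)),
        "maxError": max(abs_errors),
    }


indicators = regression_indicators(
    [3.0, 5.0, 2.0, 7.0],
    [2.0, 5.0, float("nan"), 9.0],
    only_covered=True,
)
print(indicators)
```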
decision_rules.regression.rule
Contains regression rule and conclusion classes.
- class decision_rules.regression.rule.RegressionConclusion(value: float, column_name: str, fixed: bool = False, low: float | None = None, high: float | None = None)
Bases:
AbstractConclusion
Conclusion part of the regression rule.
- calculate_low_high()
Automatically calculates low and high values based on the standard deviation of the label attribute of covered examples, where:
low = value - train_covered_y_std
high = value + train_covered_y_std
- Parameters:
train_covered_y_std (float) – standard deviation of covered examples label attribute
- Raises:
ValueError – When conclusion value is None
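The two formulas above amount to a symmetric interval around the conclusion value. A minimal sketch, using a hypothetical free function `low_high` rather than the library's method:

```python
from typing import Optional


def low_high(value: Optional[float], covered_y_std: float) -> tuple[float, float]:
    """Return (low, high) as value -/+ one standard deviation."""
    if value is None:
        # Mirrors the documented behaviour: no value, no interval.
        raise ValueError("Conclusion value is None")
    return value - covered_y_std, value + covered_y_std


low, high = low_high(12.0, 2.5)
print(low, high)
```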
- is_empty() bool
Returns whether conclusion is empty or not.
- static make_empty(column_name: str) RegressionConclusion
Creates an empty conclusion. Use it when you don’t want to use the default conclusion during prediction.
- Parameters:
column_name (str) – decision column name
- Returns:
empty conclusion
- Return type:
RegressionConclusion
- positives_mask(y: ndarray) ndarray
Calculates positive examples mask
- Parameters:
y (np.ndarray)
- Returns:
1-dimensional numpy array of booleans specifying whether given examples are consistent with the conclusion.
- Return type:
np.ndarray
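What "consistent with the conclusion" means for a regression rule is not spelled out here. One plausible reading, shown in the sketch below as an explicit assumption, is that a label counts as positive when it falls inside the conclusion's [low, high] interval, bounds included. The free function is hypothetical and uses plain lists instead of numpy arrays.

```python
def positives_mask(y, low, high):
    # Assumption: a label is "positive" when low <= label <= high.
    return [low <= label <= high for label in y]


mask = positives_mask([8.0, 9.5, 12.0, 14.5, 20.0], low=9.5, high=14.5)
print(mask)
```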
- property value: float
Returns: float: Conclusion’s value
- class decision_rules.regression.rule.RegressionRule(premise: AbstractCondition, conclusion: RegressionConclusion, column_names: list[str])
Bases:
AbstractRule
Regression rule.
- calculate_coverage(X: ndarray, y: ndarray = None, P: int = None, N: int = None) Coverage
- Parameters:
X (np.ndarray)
y (np.ndarray, optional) – if None then P and N params should be passed. Defaults to None.
P (int, optional) – optional number of all examples from the rule's decision class. Defaults to None.
N (int, optional) – optional number of all examples not from the rule's decision class. Defaults to None.
- Raises:
ValueError – if y is None and either P or N is None too
- Returns:
rule coverage
- Return type:
Coverage
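The four coverage counts can be derived from two boolean masks: which examples the premise covers, and which examples are positive. The sketch below illustrates that bookkeeping with plain lists; `Coverage` and `count_coverage` are hypothetical stand-ins, not the library's classes.

```python
from typing import NamedTuple, Sequence


class Coverage(NamedTuple):
    """Hypothetical stand-in for the library's coverage object."""
    p: int  # covered and positive
    n: int  # covered and negative
    P: int  # all positive examples
    N: int  # all negative examples


def count_coverage(covered: Sequence[bool], positive: Sequence[bool]) -> Coverage:
    p = sum(1 for c, pos in zip(covered, positive) if c and pos)
    n = sum(1 for c, pos in zip(covered, positive) if c and not pos)
    P = sum(1 for pos in positive if pos)
    N = len(positive) - P
    return Coverage(p=p, n=n, P=P, N=N)


cov = count_coverage(
    covered=[True, True, False, True, False],
    positive=[True, False, True, True, False],
)
print(cov)
```

When P and N are already known, only p and n need to be counted, which is presumably why the method accepts them as optional arguments in place of y.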
- get_coverage_dict() dict
decision_rules.regression.ruleset
Contains regression ruleset class.
- class decision_rules.regression.ruleset.RegressionRuleSet(rules: list[RegressionRule])
Bases:
AbstractRuleSet
Regression ruleset that allows performing prediction on data.
- calculate_attribute_importances(condition_importances: dict[str, float]) dict[str, float]
Calculates importances of attributes in the RuleSet based on condition importances.
- Parameters:
condition_importances (Union[dict[str, float], dict[str, dict[str, float]]]) – condition importances
- Returns:
attribute importances; in the case of classification, additionally returns information about class (dict[str, dict[str, float]])
- Return type:
dict[str, float]
- calculate_condition_importances(X: DataFrame, y: Series, measure: Callable[[Coverage], float]) dict[str, dict[str, float]]
Calculate importances of conditions in RuleSet
- Parameters:
X (pd.DataFrame)
y (pd.Series)
measure (Callable[[Coverage], float]) – measure used to count importance
- Returns:
condition importances; in the case of classification, additionally returns information about class (dict[str, dict[str, float]])
- Return type:
dict[str, float]
- calculate_p_values(y: ndarray) list
- get_default_prediction_strategy_class() Type[PredictionStrategy]
Returns default prediction strategy class used when user doesn’t specify any.
- Returns:
class implementing the PredictionStrategy interface
- Return type:
Type[PredictionStrategy]
- get_metrics_object_instance() AbstractRulesMetrics
Returns metrics object instance.
- property prediction_strategies_choice: dict[str, Type[PredictionStrategy]]
Specifies prediction strategies available for this model.
- Returns:
Dictionary containing available prediction strategies. Keys are prediction strategy names and values are classes implementing the PredictionStrategy interface for this model.
- Return type:
dict[str, Type[PredictionStrategy]]
- update(X_train: DataFrame, y_train: Series, measure: Callable[[Coverage], float]) ndarray
Updates the ruleset using the training dataset. Call this method both after creating a new ruleset and after manipulating any of its rules or internal conditions. It recalculates rule coverages and voting weights, making the ruleset ready for prediction.
- Parameters:
X_train (pd.DataFrame)
y_train (pd.Series)
measure (Callable[[Coverage], float]) – voting measure function
- Raises:
ValueError – if called on empty ruleset with no rules
- update_using_coverages(coverages_info: dict[str, RegressionCoverageInfodict], measure: Callable[[Coverage], float], columns_names: list[str] = None)
- property y_train_median: float
Returns: float: label’s median value on train dataset