Documentation for Evaluator

luminator.evaluation.evaluator.EvaluatorBase

Bases: BaseModel

model instance-attribute

model: PreTrainedModel

tokenizer instance-attribute

tokenizer: PreTrainedTokenizerBase

luminator.evaluation.evaluator.AttributionEvaluator

Bases: EvaluatorBase

The AttributionEvaluator evaluates attributions by computing a set of quality metrics.

predict_fn instance-attribute

predict_fn: Callable[Concatenate[Tuple[Tensor, ...], P], Generator[SequenceClassifierOutput, None, None]]
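
A conforming predict_fn accepts a tuple of input tensors and yields SequenceClassifierOutput objects. The sketch below is one way to satisfy the annotated signature; the checkpoint name and the interpretation of the tuple as (input_ids, attention_mask) are assumptions, not requirements of the library.

    from typing import Generator, Tuple

    import torch
    from torch import Tensor
    from transformers import AutoModelForSequenceClassification
    from transformers.modeling_outputs import SequenceClassifierOutput

    # Any sequence-classification checkpoint works; this one is only an example.
    model = AutoModelForSequenceClassification.from_pretrained(
        "distilbert-base-uncased-finetuned-sst-2-english"
    )

    def predict_fn(
        inputs: Tuple[Tensor, ...],
    ) -> Generator[SequenceClassifierOutput, None, None]:
        # Assumption: the tuple holds (input_ids, attention_mask) batches.
        input_ids, attention_mask = inputs
        with torch.no_grad():
            yield model(input_ids=input_ids, attention_mask=attention_mask)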

non_zero_weights

non_zero_weights(explanations: List[SequenceExplanation], threshold: float = 1e-09) -> List[Tensor]

Computes the non-zero-weights metric: the number of attribution values greater than the threshold.

Parameters:

    explanations (List[SequenceExplanation], required):
        A list of SequenceExplanation objects.
    threshold (float, default 1e-09):
        All values greater than the threshold increase the metric.

Returns:

    List[Tensor]: One score for each explanation.
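
As a rough sketch of the counting rule (not the library's exact implementation), assuming each explanation exposes a flat tensor of per-token attributions:

    import torch

    def non_zero_weights(attributions: torch.Tensor, threshold: float = 1e-9) -> torch.Tensor:
        # Count the attribution values strictly greater than the threshold.
        return (attributions > threshold).sum()

    print(non_zero_weights(torch.tensor([0.4, 0.0, 1e-12, 0.2])))  # tensor(2)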

faithfulness

faithfulness(explanations: List[SequenceExplanation]) -> List[Tensor]

Replaces the highest-attributed token with an UNK token and re-predicts the example. The score is the difference between the predictions for the base and the permuted example. A higher score indicates a larger difference, i.e. removing the most important token has a large impact on the prediction.

Parameters:

    explanations (List[SequenceExplanation], required):
        A list of SequenceExplanation objects.

Returns:

    List[Tensor]: One score for each explanation.
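
A minimal sketch of the idea, using a hypothetical predict_proba callable that maps a token list to the probability of the predicted class (the real method goes through predict_fn and the tokenizer):

    import torch

    def faithfulness_score(tokens, attributions, predict_proba, unk="[UNK]"):
        base = predict_proba(tokens)
        top = int(torch.as_tensor(attributions).argmax())  # most important token
        permuted = list(tokens)
        permuted[top] = unk                                # mask it out
        return torch.tensor(base - predict_proba(permuted))

    # Toy predictor: the probability drops when "great" is masked.
    toy = lambda toks: 0.9 if "great" in toks else 0.4
    print(faithfulness_score(["the", "movie", "was", "great"],
                             [0.0, 0.1, 0.0, 0.8], toy))   # tensor(0.5000)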

truthfulness

truthfulness(explanations: List[SequenceExplanation]) -> List[Tensor]

For each token in the example, a new example is created in which the token is replaced by UNK. All permuted examples are then predicted. For each permuted token, the score is increased by 1 if removing the token decreases the prediction probability (for positively attributed tokens) or increases it (for negatively attributed tokens). The score is averaged over all tokens. The higher the score, the more truthfully the tokens have been attributed.

Parameters:

    explanations (List[SequenceExplanation], required):
        A list of SequenceExplanation objects.

Returns:

    List[Tensor]: One score for each explanation.
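
Sketched with the same hypothetical predict_proba as above; the sign convention (a positive attribution should cause a probability drop when its token is masked) follows the description:

    import torch

    def truthfulness_score(tokens, attributions, predict_proba, unk="[UNK]"):
        base = predict_proba(tokens)
        hits = 0
        for i, attr in enumerate(attributions):
            permuted = list(tokens)
            permuted[i] = unk
            delta = base - predict_proba(permuted)  # drop caused by masking token i
            # A positive attribution should cause a drop, a negative one a rise.
            if (attr > 0 and delta > 0) or (attr < 0 and delta < 0):
                hits += 1
        return torch.tensor(hits / len(tokens))

    toy = lambda toks: 0.9 if "great" in toks else 0.4
    print(truthfulness_score(["the", "movie", "was", "great"],
                             [0.0, 0.1, 0.0, 0.8], toy))  # tensor(0.2500)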

faithful_truthfulness

faithful_truthfulness(explanations: List[SequenceExplanation]) -> List[Tensor]

For each token in the example, a new example is created in which the token is replaced by UNK. All permuted examples are then predicted. For each permuted token, the difference in prediction probability is added to the score. The score is averaged over all tokens. The higher the score, the more influence each token had and the more precisely the attributions predicted the influence on the model output.

Parameters:

    explanations (List[SequenceExplanation], required):
        A list of SequenceExplanation objects.

Returns:

    List[Tensor]: One score for each explanation.
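
Note that, as described, the score depends only on the prediction differences, not on the attribution values themselves. A sketch with the same hypothetical predict_proba:

    import torch

    def faithful_truthfulness_score(tokens, predict_proba, unk="[UNK]"):
        base = predict_proba(tokens)
        deltas = []
        for i in range(len(tokens)):
            permuted = list(tokens)
            permuted[i] = unk
            deltas.append(base - predict_proba(permuted))  # influence of token i
        return torch.tensor(deltas).mean()                 # averaged over all tokens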

ranked_faithful_truthfulness

ranked_faithful_truthfulness(explanations: List[SequenceExplanation]) -> List[Tensor]

For each token in the example, a new example is created in which the token is replaced by UNK. All permuted examples are then predicted. The attributed tokens are sorted by their attribution value, so that the highest attribution receives the highest rank and vice versa. For each permuted token, the difference in prediction probability divided by the token's rank is added to the score. The higher the score, the more influence each token had and the more precisely the attributions predicted the influence on the model output.

Parameters:

    explanations (List[SequenceExplanation], required):
        A list of SequenceExplanation objects.

Returns:

    List[Tensor]: One score for each explanation.
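
A sketch under the assumption that rank 1 goes to the highest attribution, so the most important token's probability difference receives the largest weight:

    import torch

    def ranked_faithful_truthfulness_score(tokens, attributions, predict_proba, unk="[UNK]"):
        base = predict_proba(tokens)
        attrs = torch.as_tensor(attributions)
        order = torch.argsort(attrs, descending=True)    # highest attribution first
        ranks = torch.empty(len(tokens), dtype=torch.long)
        ranks[order] = torch.arange(1, len(tokens) + 1)  # assumed: rank 1 = highest
        score = 0.0
        for i in range(len(tokens)):
            permuted = list(tokens)
            permuted[i] = unk
            score += (base - predict_proba(permuted)) / ranks[i].item()
        return torch.tensor(score)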

robustness

robustness(explanation: SequenceExplanation, tweaked_explanations: List[SequenceExplanation]) -> Tensor

Robustness measures the degree of change between the interpretations for the initial and modified instances.

Parameters:

    explanation (SequenceExplanation, required):
        A SequenceExplanation.
    tweaked_explanations (List[SequenceExplanation], required):
        A list of tweaked SequenceExplanation objects.

Returns:

    Tensor: One score for each explanation.
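
The aggregation over the tweaked instances is not spelled out above, so the sketch below makes an assumption: it reports the maximum L2 distance between the original and the tweaked attribution vectors as a worst-case summary.

    import torch

    def robustness_score(attributions, tweaked) -> torch.Tensor:
        # Worst-case change across all tweaked instances (assumed aggregation).
        return torch.stack([torch.norm(attributions - t) for t in tweaked]).max()

    base = torch.tensor([0.0, 0.1, 0.0, 0.8])
    print(robustness_score(base, [base + 0.05, base - 0.1]))  # tensor(0.2000)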

rationale_f1

rationale_f1(explanation: SequenceExplanation, rationales: List[int], top_k: Optional[int] = None) -> float

Maps all token attributions to their corresponding words and classifies whether each word is a rationale. Then measures the F1 score against the given rationales.

Parameters:

    explanation (SequenceExplanation, required):
        A SequenceExplanation.
    rationales (List[int], required):
        A list of rationale labels (0 or 1), one for each word in the explanation.
    top_k (Optional[int], default None):
        Specifies how the attributions are used to classify rationales. If top_k is 0 or None, each token whose attribution is >= mean + std of all attributions is classified as positive. Otherwise the top_k highest attributions are classified as positive.

Returns:

    float: The F1 score for the explanation.
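
A sketch of the classification rule and the resulting F1 score, assuming the attributions have already been mapped to word level (the token-to-word mapping is omitted here):

    import torch

    def classify_rationales(attributions: torch.Tensor, top_k=None) -> torch.Tensor:
        if not top_k:  # top_k is 0 or None: threshold at mean + std
            cutoff = attributions.mean() + attributions.std()
            return (attributions >= cutoff).long()
        pred = torch.zeros(len(attributions), dtype=torch.long)
        pred[attributions.topk(top_k).indices] = 1  # top_k highest are positive
        return pred

    def rationale_f1_score(attributions, rationales, top_k=None) -> float:
        pred = classify_rationales(torch.as_tensor(attributions), top_k)
        gold = torch.tensor(rationales)
        tp = int(((pred == 1) & (gold == 1)).sum())
        precision = tp / max(int(pred.sum()), 1)
        recall = tp / max(int(gold.sum()), 1)
        return 2 * precision * recall / max(precision + recall, 1e-12)

    print(rationale_f1_score([0.9, 0.1, 0.7, 0.0], [1, 0, 1, 0], top_k=2))  # 1.0

The accuracy, recall, and precision variants below share this classification step and differ only in the final statistic.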

rationale_accuracy

rationale_accuracy(explanation: SequenceExplanation, rationales: List[int], top_k: Optional[int] = None) -> float

Maps all token attributions to their corresponding words and classifies whether each word is a rationale. Then measures the accuracy against the given rationales.

Parameters:

    explanation (SequenceExplanation, required):
        A SequenceExplanation.
    rationales (List[int], required):
        A list of rationale labels (0 or 1), one for each word in the explanation.
    top_k (Optional[int], default None):
        Specifies how the attributions are used to classify rationales. If top_k is 0 or None, each token whose attribution is >= mean + std of all attributions is classified as positive. Otherwise the top_k highest attributions are classified as positive.

Returns:

    float: The accuracy for the explanation.

rationale_recall

rationale_recall(explanation: SequenceExplanation, rationales: List[int], top_k: Optional[int] = None) -> float

Maps all token attributions to their corresponding words and classifies whether each word is a rationale. Then measures the recall against the given rationales.

Parameters:

    explanation (SequenceExplanation, required):
        A SequenceExplanation.
    rationales (List[int], required):
        A list of rationale labels (0 or 1), one for each word in the explanation.
    top_k (Optional[int], default None):
        Specifies how the attributions are used to classify rationales. If top_k is 0 or None, each token whose attribution is >= mean + std of all attributions is classified as positive. Otherwise the top_k highest attributions are classified as positive.

Returns:

    float: The recall for the explanation.

rationale_precision

rationale_precision(explanation: SequenceExplanation, rationales: List[int], top_k: Optional[int] = None) -> float

Maps all token attributions to their corresponding words and classifies whether each word is a rationale. Then measures the precision against the given rationales.

Parameters:

    explanation (SequenceExplanation, required):
        A SequenceExplanation.
    rationales (List[int], required):
        A list of rationale labels (0 or 1), one for each word in the explanation.
    top_k (Optional[int], default None):
        Specifies how the attributions are used to classify rationales. If top_k is 0 or None, each token whose attribution is >= mean + std of all attributions is classified as positive. Otherwise the top_k highest attributions are classified as positive.

Returns:

    float: The precision for the explanation.
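
Building on the classification step sketched under rationale_f1, the three remaining rationale metrics reduce to standard binary statistics, roughly:

    import torch

    def rationale_prf(pred: torch.Tensor, gold: torch.Tensor):
        # Accuracy, recall, and precision over binary rationale labels.
        tp = int(((pred == 1) & (gold == 1)).sum())
        accuracy = float((pred == gold).float().mean())
        recall = tp / max(int(gold.sum()), 1)
        precision = tp / max(int(pred.sum()), 1)
        return accuracy, recall, precision

    print(rationale_prf(torch.tensor([1, 0, 1, 0]), torch.tensor([1, 1, 0, 0])))
    # (0.5, 0.5, 0.5)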