langchain.evaluation.parsing.base.JsonEqualityEvaluator

class langchain.evaluation.parsing.base.JsonEqualityEvaluator(operator: Optional[Callable] = None, **kwargs: Any)
Evaluates whether the prediction is equal to the reference after parsing both as JSON.

This evaluator checks if the prediction, after parsing as JSON, is equal to the reference, which is also parsed as JSON. It does not require an input string.

requires_input

Whether this evaluator requires an input string. Always False.

Type

bool

requires_reference

Whether this evaluator requires a reference string. Always True.

Type

bool

evaluation_name

The name of the evaluation metric. Always “parsed_equality”.

Type

str

Examples

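>>> from langchain.evaluation.parsing.base import JsonEqualityEvaluator
>>> # Default comparison: the parsed objects must be equal.
>>> evaluator = JsonEqualityEvaluator()
>>> evaluator.evaluate_strings(prediction='{"a": 1}', reference='{"a": 1}')
{'score': True}
>>> evaluator.evaluate_strings(prediction='{"a": 1}', reference='{"a": 2}')
{'score': False}
>>> # Custom comparison: only the value under key "a" is compared.
>>> evaluator = JsonEqualityEvaluator(operator=lambda x, y: x['a'] == y['a'])
>>> evaluator.evaluate_strings(prediction='{"a": 1}', reference='{"a": 1}')
{'score': True}
>>> evaluator.evaluate_strings(prediction='{"a": 1}', reference='{"a": 2}')
{'score': False}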
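
To show why comparing parsed JSON differs from comparing raw strings, here is a minimal standalone sketch of the parse-then-compare idea. It is not the library's implementation: `json_equality_score` is a hypothetical helper, it uses plain `json.loads` for parsing, and it assumes the default comparison is ordinary equality.

```python
import json
from operator import eq
from typing import Any, Callable, Optional


def json_equality_score(
    prediction: str,
    reference: str,
    operator: Optional[Callable[[Any, Any], bool]] = None,
) -> dict:
    """Hypothetical stand-in: parse both strings as JSON, then compare."""
    compare = operator or eq  # assumption: plain equality when no operator is given
    return {"score": compare(json.loads(prediction), json.loads(reference))}


# Key order and whitespace differ, but the parsed objects are equal.
same = json_equality_score('{"a": 1, "b": [1, 2]}', '{ "b": [1, 2], "a": 1 }')

# List order is significant under plain equality.
reordered = json_equality_score("[1, 2]", "[2, 1]")

# A custom operator can relax the comparison, e.g. compare only key "a".
only_a = json_equality_score(
    '{"a": 1, "b": 2}',
    '{"a": 1, "b": 3}',
    operator=lambda x, y: x["a"] == y["a"],
)
```

In this sketch `same` and `only_a` score `True`, while `reordered` scores `False`.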

Attributes

evaluation_name

The name of the evaluation.

requires_input

Whether this evaluator requires an input string.

requires_reference

Whether this evaluator requires a reference label.

Methods

__init__([operator])

aevaluate_strings(*, prediction[, ...])

Asynchronously evaluate Chain or LLM output, based on optional input and label.

evaluate_strings(*, prediction[, reference, ...])

Evaluate Chain or LLM output, based on optional input and label.

__init__(operator: Optional[Callable] = None, **kwargs: Any) → None

async aevaluate_strings(*, prediction: str, reference: Optional[str] = None, input: Optional[str] = None, **kwargs: Any) → dict

Asynchronously evaluate Chain or LLM output, based on optional input and label.

Parameters
  • prediction (str) – The LLM or chain prediction to evaluate.

  • reference (Optional[str], optional) – The reference label to evaluate against.

  • input (Optional[str], optional) – The input to consider during evaluation.

  • **kwargs – Additional keyword arguments, including callbacks, tags, etc.

Returns

The evaluation results containing the score or value.

Return type

dict
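
A rough sketch of awaiting several evaluations concurrently with the keyword-only signature above. `StandInEvaluator` is a hypothetical placeholder that mimics only that signature so the snippet runs without LangChain installed; with the real class you would presumably construct `JsonEqualityEvaluator()` in its place.

```python
import asyncio
import json
from typing import Any, Optional


class StandInEvaluator:
    """Hypothetical placeholder mirroring the keyword-only signature."""

    async def aevaluate_strings(
        self,
        *,
        prediction: str,
        reference: Optional[str] = None,
        input: Optional[str] = None,
        **kwargs: Any,
    ) -> dict:
        # Assumption: a missing reference is treated as an error here.
        if reference is None:
            raise ValueError("a reference string is required")
        return {"score": json.loads(prediction) == json.loads(reference)}


async def main() -> list:
    evaluator = StandInEvaluator()
    pairs = [('{"a": 1}', '{"a": 1}'), ('{"a": 1}', '{"a": 2}')]
    # Schedule both evaluations and wait for them together;
    # gather returns results in the order the awaitables were passed.
    return await asyncio.gather(
        *(evaluator.aevaluate_strings(prediction=p, reference=r) for p, r in pairs)
    )


results = asyncio.run(main())
```

Note that `prediction` must be passed by keyword; a positional call raises `TypeError` for a signature declared with a leading `*`.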

evaluate_strings(*, prediction: str, reference: Optional[str] = None, input: Optional[str] = None, **kwargs: Any) → dict

Evaluate Chain or LLM output, based on optional input and label.

Parameters
  • prediction (str) – The LLM or chain prediction to evaluate.

  • reference (Optional[str], optional) – The reference label to evaluate against.

  • input (Optional[str], optional) – The input to consider during evaluation.

  • **kwargs – Additional keyword arguments, including callbacks, tags, etc.

Returns

The evaluation results containing the score or value.

Return type

dict