langchain_experimental.data_anonymizer.presidio.PresidioReversibleAnonymizer
- class langchain_experimental.data_anonymizer.presidio.PresidioReversibleAnonymizer(analyzed_fields: Optional[List[str]] = None, operators: Optional[Dict[str, OperatorConfig]] = None, languages_config: Optional[Dict] = None, add_default_faker_operators: bool = True, faker_seed: Optional[int] = None)
Reversible Anonymizer using Microsoft Presidio.
- Parameters
analyzed_fields (Optional[List[str]]) – List of fields to detect and then anonymize. Defaults to all entities supported by Microsoft Presidio.
operators (Optional[Dict[str, OperatorConfig]]) – Operators to use for anonymization. Operators allow for custom anonymization of detected PII. Learn more: https://microsoft.github.io/presidio/tutorial/10_simple_anonymization/
languages_config (Optional[Dict]) – Configuration for the NLP engine. First language in the list will be used as the main language in self.anonymize(…) when no language is specified. Learn more: https://microsoft.github.io/presidio/analyzer/customizing_nlp_models/
faker_seed (Optional[int]) – Seed used to initialize faker. Defaults to None, in which case faker will be seeded randomly and provide random values.
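add_default_faker_operators (bool) – Whether to register the default Faker-based operators, which replace detected entities with generated fake values. Defaults to True.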
Attributes
anonymizer_mapping
Return the anonymizer mapping. This is just the reverse version of the deanonymizer mapping.
deanonymizer_mapping
Return the deanonymizer mapping
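The relationship between the two mappings can be sketched in plain Python. This is an illustration, not the library's implementation: it assumes the mapping is nested by entity type, as the `Dict[str, Dict[str, str]]` type in the `deanonymize` signature suggests, and that each inner dictionary maps a fake value to its original value. The helper name `invert_mapping` and the sample values are hypothetical.

```python
from typing import Dict

# Assumed shape: entity type -> {fake value: original value}.
Mapping = Dict[str, Dict[str, str]]


def invert_mapping(deanonymizer_mapping: Mapping) -> Mapping:
    """Swap keys and values inside each entity type (hypothetical helper)."""
    return {
        entity_type: {original: fake for fake, original in pairs.items()}
        for entity_type, pairs in deanonymizer_mapping.items()
    }


# Sample data invented for illustration only.
deanonymizer_mapping = {"PERSON": {"Maria Lynch": "Jane Doe"}}
anonymizer_mapping = invert_mapping(deanonymizer_mapping)
```

Inverting twice returns the original dictionary as long as the fake values are unique within each entity type.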
Methods
__init__
([analyzed_fields, operators, ...])Create the anonymizer; analyzed_fields is the list of fields to detect and then anonymize.
add_operators
(operators)Add operators to the anonymizer
add_recognizer
(recognizer)Add a recognizer to the analyzer
anonymize
(text[, language, allow_list])Anonymize text.
deanonymize
(text_to_deanonymize[, ...])Deanonymize text
load_deanonymizer_mapping
(file_path)Load the deanonymizer mapping from a JSON or YAML file.
reset_deanonymizer_mapping
()Reset the deanonymizer mapping
save_deanonymizer_mapping
(file_path)Save the deanonymizer mapping to a JSON or YAML file.
- __init__(analyzed_fields: Optional[List[str]] = None, operators: Optional[Dict[str, OperatorConfig]] = None, languages_config: Optional[Dict] = None, add_default_faker_operators: bool = True, faker_seed: Optional[int] = None)
- Parameters
analyzed_fields (Optional[List[str]]) – List of fields to detect and then anonymize. Defaults to all entities supported by Microsoft Presidio.
operators (Optional[Dict[str, OperatorConfig]]) – Operators to use for anonymization. Operators allow for custom anonymization of detected PII. Learn more: https://microsoft.github.io/presidio/tutorial/10_simple_anonymization/
languages_config (Optional[Dict]) – Configuration for the NLP engine. First language in the list will be used as the main language in self.anonymize(…) when no language is specified. Learn more: https://microsoft.github.io/presidio/analyzer/customizing_nlp_models/
faker_seed (Optional[int]) – Seed used to initialize faker. Defaults to None, in which case faker will be seeded randomly and provide random values.
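add_default_faker_operators (bool) – Whether to register the default Faker-based operators, which replace detected entities with generated fake values. Defaults to True.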
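A minimal sketch of constructor arguments, assuming that languages_config follows the dictionary layout described on the linked Presidio NLP-engine page (an `nlp_engine_name` plus a `models` list). The spaCy model names and the chosen entity names are illustrative assumptions; the models would need to be installed separately. English is listed first, so it would act as the main language.

```python
# Assumed layout: the Presidio NLP engine configuration format
# ("nlp_engine_name" plus a list of per-language models).
languages_config = {
    "nlp_engine_name": "spacy",
    "models": [
        {"lang_code": "en", "model_name": "en_core_web_lg"},
        {"lang_code": "es", "model_name": "es_core_news_md"},
    ],
}

# Keyword arguments matching the documented constructor signature.
anonymizer_kwargs = {
    "analyzed_fields": ["PERSON", "EMAIL_ADDRESS"],
    "languages_config": languages_config,
    "faker_seed": 42,
}

# The first entry in the models list determines the main language.
main_language = languages_config["models"][0]["lang_code"]
```

These arguments could then be passed as `PresidioReversibleAnonymizer(**anonymizer_kwargs)`.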
- add_operators(operators: Dict[str, OperatorConfig]) → None
Add operators to the anonymizer
- Parameters
operators (Dict[str, OperatorConfig]) – Operators to add to the anonymizer.
- Return type
None
- add_recognizer(recognizer: EntityRecognizer) → None
Add a recognizer to the analyzer
- Parameters
recognizer (EntityRecognizer) – Recognizer to add to the analyzer.
- Return type
None
- anonymize(text: str, language: Optional[str] = None, allow_list: Optional[List[str]] = None) → str
Anonymize text.
- Parameters
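text (str) – Text to anonymize.
language (Optional[str]) – Language to use when analyzing the text for PII. If None, the first (main) language from languages_config is used.
allow_list (Optional[List[str]]) – Strings that should be left unchanged even if they are detected as PII.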
- Return type
str
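A hedged usage sketch of anonymize followed by deanonymize on the same instance. It assumes that the class can be imported from the `langchain_experimental.data_anonymizer` package and that Presidio, Faker, and the required spaCy model are installed; the function name `anonymize_round_trip` is hypothetical. The import is deferred so the snippet loads even when those packages are missing.

```python
from typing import List, Optional, Tuple


def anonymize_round_trip(
    text: str, allow_list: Optional[List[str]] = None
) -> Tuple[str, str]:
    """Anonymize `text`, then restore it with the same anonymizer instance."""
    # Deferred import: requires langchain_experimental and its Presidio extras.
    from langchain_experimental.data_anonymizer import (
        PresidioReversibleAnonymizer,
    )

    anonymizer = PresidioReversibleAnonymizer(
        analyzed_fields=["PERSON", "EMAIL_ADDRESS"],
        faker_seed=42,  # fixed seed instead of random seeding
    )
    # No language is passed, so the main configured language is used.
    anonymized = anonymizer.anonymize(text, allow_list=allow_list)
    restored = anonymizer.deanonymize(anonymized)
    return anonymized, restored
```

Calling `anonymize_round_trip("Jane Doe wrote from jane@example.com")` would return the anonymized text together with the restored text; the fake values depend on the installed Faker version, so no output is shown here.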
- deanonymize(text_to_deanonymize: str, deanonymizer_matching_strategy: Callable[[str, Dict[str, Dict[str, str]]], str] = exact_matching_strategy) → str
Deanonymize text
- Parameters
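text_to_deanonymize (str) – Text to deanonymize.
deanonymizer_matching_strategy (Callable[[str, Dict[str, Dict[str, str]]], str]) – Function that takes the text and the deanonymizer mapping and returns the deanonymized text. Defaults to exact_matching_strategy.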
- Return type
str
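A custom matching strategy only needs to follow the documented callable type: it receives the text and the deanonymizer mapping and returns a string. The sketch below is a hypothetical case-insensitive variant, not the library's exact_matching_strategy; it assumes each inner dictionary maps a fake value to its original value, and the sample data is invented.

```python
import re
from typing import Dict


def case_insensitive_strategy(
    text: str, deanonymizer_mapping: Dict[str, Dict[str, str]]
) -> str:
    """Replace each fake value with its original, ignoring letter case."""
    for pairs in deanonymizer_mapping.values():
        for fake, original in pairs.items():
            pattern = re.compile(re.escape(fake), flags=re.IGNORECASE)
            # A function replacement keeps backslashes in `original` literal.
            text = pattern.sub(lambda _match, value=original: value, text)
    return text


# Sample data invented for illustration only.
sample_mapping = {"PERSON": {"Maria Lynch": "Jane Doe"}}
restored = case_insensitive_strategy("MARIA LYNCH called", sample_mapping)
```

Such a function could be passed as `deanonymizer_matching_strategy=case_insensitive_strategy` when the text being restored may have changed casing, for example after passing through a language model.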
- load_deanonymizer_mapping(file_path: Union[Path, str]) → None
Load the deanonymizer mapping from a JSON or YAML file.
- Parameters
file_path (Union[Path, str]) – Path to file to load the mapping from.
- Return type
None
Example:

.. code-block:: python

    anonymizer.load_deanonymizer_mapping(file_path="path/mapping.json")
- save_deanonymizer_mapping(file_path: Union[Path, str]) → None
Save the deanonymizer mapping to a JSON or YAML file.
- Parameters
file_path (Union[Path, str]) – Path to file to save the mapping to.
- Return type
None
Example:

.. code-block:: python

    anonymizer.save_deanonymizer_mapping(file_path="path/mapping.json")
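Saving and loading together make it possible to hand a mapping from one anonymizer instance to another, for example when anonymization and deanonymization run in separate processes. The sketch below is duck-typed over the two documented methods. `StubAnonymizer` is a hypothetical stand-in that uses JSON so the example runs without Presidio; its on-disk layout is an assumption and may differ from what the library writes.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, Union


class StubAnonymizer:
    """Hypothetical stand-in exposing the two documented persistence methods."""

    def __init__(self) -> None:
        self.deanonymizer_mapping: Dict[str, Dict[str, str]] = {}

    def save_deanonymizer_mapping(self, file_path: Union[Path, str]) -> None:
        Path(file_path).write_text(json.dumps(self.deanonymizer_mapping, indent=2))

    def load_deanonymizer_mapping(self, file_path: Union[Path, str]) -> None:
        loaded = json.loads(Path(file_path).read_text())
        for entity_type, pairs in loaded.items():
            self.deanonymizer_mapping.setdefault(entity_type, {}).update(pairs)


def transfer_mapping(source, target, file_path: Union[Path, str]) -> None:
    """Copy the mapping from `source` to `target` through a file."""
    source.save_deanonymizer_mapping(file_path)
    target.load_deanonymizer_mapping(file_path)


producer, consumer = StubAnonymizer(), StubAnonymizer()
producer.deanonymizer_mapping = {"PERSON": {"Maria Lynch": "Jane Doe"}}
with tempfile.TemporaryDirectory() as tmp_dir:
    transfer_mapping(producer, consumer, Path(tmp_dir) / "mapping.json")
```

Because `transfer_mapping` relies only on the two method names, it should also accept real `PresidioReversibleAnonymizer` instances in place of the stubs.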