bugfinder.processing.tokenizers.replace_functions

class bugfinder.processing.tokenizers.replace_functions.ReplaceFunctions(dataset, deprecation_warning=None)

Bases: AbstractTokenizer

Processing step that replaces user-created functions in a dataset with a generic token.

execute()

Run the processing.

static process_file(filepath)

Process a single file, looking for user-created functions and replacing each one with the token FUN to reduce the number of unique tokens in the corpus.

Parameters

filepath (str) – Path of the file to be processed

Returns

Number of functions replaced

Return type

int
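
The replacement described for process_file can be illustrated with a minimal sketch. This is not the library's implementation: the names RESERVED, CALL_PATTERN, replace_functions and process_file_sketch are hypothetical, the input is assumed to be C-like source text, and the sketch assumes that the file is rewritten in place and that every replaced occurrence is counted. A real tokenizer would likely need a proper lexer to skip comments and string literals.

```python
import re
from pathlib import Path

# Hypothetical list of names to keep: language keywords and a few library calls.
RESERVED = {"if", "for", "while", "switch", "return", "sizeof",
            "printf", "malloc", "free", "main"}

# An identifier followed (after optional whitespace) by an opening parenthesis.
CALL_PATTERN = re.compile(r"\b([A-Za-z_]\w*)(\s*)(?=\()")


def replace_functions(source):
    """Return (new_source, count) with non-reserved function names set to FUN."""
    count = 0

    def substitute(match):
        nonlocal count
        if match.group(1) in RESERVED:
            return match.group(0)  # leave keywords and known calls untouched
        count += 1
        return "FUN" + match.group(2)  # keep the original whitespace

    return CALL_PATTERN.sub(substitute, source), count


def process_file_sketch(filepath):
    """Rewrite the file in place (an assumption) and return the replacement count."""
    path = Path(filepath)
    new_source, count = replace_functions(path.read_text())
    path.write_text(new_source)
    return count
```

Under these assumptions, calling replace_functions on the text `int helper(int x) { return compute(x); }` would replace both helper and compute with FUN, while a call such as printf would be left unchanged because it appears in the reserved set.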