agatha.ml.abstract_generator.tokenizer module
class agatha.ml.abstract_generator.tokenizer.AbstractGeneratorTokenizer(tokenizer_model_path, extra_data_path, lowercase)
   Bases: object

   decode_dep(idx)
      Return type: str

   decode_entity_label(idx)
      Return type: str

   decode_mesh(idx)
      Return type: str

   decode_pos(idx)
      Return type: str

   decode_text(ids)
      Return type: str

   decode_year(idx)
      Return type: int

   encode_dep(dep)
      Return type: int

   encode_entity_label(entity_label)
      Return type: int

   encode_for_generation(initial_text=None, year=None, mesh_terms=None, allow_unknown_terms=False)
      Given initial text and condition data, produce model_in. Intended use:

      Return type: Dict[str, LongTensor]

   encode_mesh(mesh)
      Return type: int

   encode_pos(pos)
      Return type: int

   encode_sentence(sentence, is_first=False, is_last=False)
      Return type: Dict[str, List[int]]

   encode_year(year)
      Return type: int

   len_dep()
      Return type: int

   len_entity_label()
      Return type: int

   len_mesh()
      Return type: int

   len_pos()
      Return type: int

   len_text()
      Return type: int

   len_year()
      Return type: int

   simple_encode_text(text)
      Return type: List[int]

agatha.ml.abstract_generator.tokenizer.get_current_year()
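
The class above exposes an encode_*/decode_*/len_* triple for each categorical field (dep, entity_label, mesh, pos, year). The page does not say how these relate to each other, so the following is only a minimal sketch of the round-trip contract such a triple usually implies. `ToyVocab`, the `[UNK]` entry, and the choice of index 0 for unknown labels are hypothetical and are not part of agatha; the sketch assumes that decode inverts encode and that the length is the vocabulary size.

```python
from typing import Dict, List


class ToyVocab:
    """Hypothetical stand-in for one categorical field (e.g. POS tags).

    Not part of agatha; it only mirrors the encode/decode/len naming pattern.
    """

    def __init__(self, labels: List[str]) -> None:
        # Assumption: index 0 is reserved for labels never seen at build time.
        self._idx2label: List[str] = ["[UNK]"] + sorted(set(labels))
        self._label2idx: Dict[str, int] = {
            label: idx for idx, label in enumerate(self._idx2label)
        }

    def encode(self, label: str) -> int:
        # Unknown labels fall back to the reserved index 0.
        return self._label2idx.get(label, 0)

    def decode(self, idx: int) -> str:
        # Inverse of encode for every known label.
        return self._idx2label[idx]

    def __len__(self) -> int:
        # Vocabulary size, including the reserved unknown entry.
        return len(self._idx2label)


pos_vocab = ToyVocab(["NOUN", "VERB", "ADJ"])
noun_idx = pos_vocab.encode("NOUN")
round_trip = pos_vocab.decode(noun_idx)
```

With the real tokenizer, the analogous calls would be `encode_pos(pos)`, `decode_pos(idx)`, and `len_pos()`. Whether unknown labels map to a reserved index or raise an error is not stated on this page.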