agatha.ml.abstract_generator.tokenizer module
- class agatha.ml.abstract_generator.tokenizer.AbstractGeneratorTokenizer(tokenizer_model_path, extra_data_path, lowercase)
  Bases: object
- decode_dep(idx)
  Return type: str
- decode_entity_label(idx)
  Return type: str
- decode_mesh(idx)
  Return type: str
- decode_pos(idx)
  Return type: str
- decode_text(ids)
  Return type: str
- decode_year(idx)
  Return type: int
- encode_dep(dep)
  Return type: int
- encode_entity_label(entity_label)
  Return type: int
- encode_for_generation(initial_text=None, year=None, mesh_terms=None, allow_unknown_terms=False)
  Given initial text and condition data, produce model_in.
  Return type: Dict[str, LongTensor]
- encode_mesh(mesh)
  Return type: int
- encode_pos(pos)
  Return type: int
- encode_sentence(sentence, is_first=False, is_last=False)
  Return type: Dict[str, List[int]]
- encode_year(year)
  Return type: int
- len_dep()
  Return type: int
- len_entity_label()
  Return type: int
- len_mesh()
  Return type: int
- len_pos()
  Return type: int
- len_text()
  Return type: int
- len_year()
  Return type: int
- simple_encode_text(text)
  Return type: List[int]
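Most of the methods listed for this class come in encode/decode/len triples, one per vocabulary (dep, entity_label, mesh, pos, year). The sketch below illustrates that general pattern with a hypothetical ToyVocab class. It is not agatha's implementation: the class name, the reserved unknown index, and the sorted ordering are assumptions made only for illustration.

```python
from typing import Dict, List


class ToyVocab:
    """Hypothetical stand-in for a single vocabulary; not agatha code."""

    def __init__(self, items: List[str], unknown: str = "[UNK]") -> None:
        # Assumption of this sketch: index 0 is reserved for unknown values.
        self._idx2item: List[str] = [unknown] + sorted(set(items))
        self._item2idx: Dict[str, int] = {
            item: idx for idx, item in enumerate(self._idx2item)
        }

    def encode(self, item: str) -> int:
        # Map a label to its integer index, falling back to the unknown index.
        return self._item2idx.get(item, 0)

    def decode(self, idx: int) -> str:
        # Map an integer index back to its label.
        return self._idx2item[idx]

    def __len__(self) -> int:
        # Vocabulary size, including the unknown entry.
        return len(self._idx2item)


pos_vocab = ToyVocab(["NOUN", "VERB", "ADJ"])
noun_idx = pos_vocab.encode("NOUN")
round_trip = pos_vocab.decode(noun_idx)
```

In this toy version, encoding and then decoding a known label returns the original label, and the length is the value one would typically use to size an embedding table for that vocabulary.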
- agatha.ml.abstract_generator.tokenizer.get_current_year()
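The listing gives no description or return type for get_current_year. As a minimal sketch, assuming the helper simply reports the calendar year of the local clock, an equivalent function could look like the following. The name get_current_year_sketch is hypothetical and chosen to avoid implying this is the library's code.

```python
from datetime import datetime


def get_current_year_sketch() -> int:
    # Assumption: "current year" means the year of the local system clock.
    return datetime.now().year


current_year = get_current_year_sketch()
```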