Common Properties
Properties have 'upward' accessibility: a property defined at a lower level is also accessible at every higher level. The levels, from highest to lowest, are:
NLP > Document > Sentence >= Subsentence > Token
For example, 'str', which is a token property, can be accessed at all analysis levels (NLP/Document/Sentence/Subsentence/Token).
Conversely, 'emotion_doc' is only available at the Document level and at the NLP level, which groups multiple documents.
All properties have a _flat variant (for example, token_flat) that recursively flattens the return value.
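
To illustrate what "recursively flattens" means, here is a minimal sketch in plain Python. It does not call the library: the `flatten` helper and the `nested_tokens` data are hypothetical stand-ins for a nested result (documents > sentences > tokens), and the choice to flatten only lists while leaving tuples intact is an assumption, not documented behavior.

```python
from typing import Any, List


def flatten(value: Any) -> List[Any]:
    """Recursively flatten nested lists into a single flat list.

    Assumption: only lists are flattened; tuples such as an
    (emotion type, score) pair are kept as single items.
    """
    if not isinstance(value, list):
        return [value]
    flat: List[Any] = []
    for item in value:
        flat.extend(flatten(item))
    return flat


# Hypothetical nested result at the NLP level:
# a list of documents, each a list of sentences, each a list of token strings.
nested_tokens = [
    [["The", "cat", "sleeps", "."], ["It", "purrs", "."]],
    [["Hello", "!"]],
]

flat_tokens = flatten(nested_tokens)
print(flat_tokens)
```

A non-flat property keeps one nesting level per analysis level, whereas the flat form is convenient when the grouping by document or sentence does not matter, for instance when counting tokens.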
Document properties
Name | Type | Description |
---|---|---|
str_doc | String | Returns the text of the whole document after tokenization |
original_text_doc | List | Returns the original text of the whole document |
emotion_doc | String | Returns emotions detected on the whole document at once |
sentiment_doc | String | Returns sentiments detected on the whole document at once |
domain_doc | String | Returns classification by domain to allow loading certain dictionary / specific entities (Healthcare, Legal, Finance etc) |
type_doc | String | Returns type of input (contract, article, tweet, reviews, conversation, etc) |
emoticon_doc | Emoticon Object | Returns global text emoticons |
spans | List | Returns spans in the document |
clusters | List | Returns clusters in the document |
Sentence/Subsentence properties
Name | Type | Description |
---|---|---|
str | String | Returns sentence as a string |
original_text | String | Returns the original sentence in the input text before modification from the tokenization |
tokens | List | Returns a list of Token Instances from the sentence |
synthesis | List | Returns synthesis of sentence data |
detail | List | Returns a list of sentence details |
emotion | Tuple | Returns emotion as tuple (Type, score) |
emotion_ml | List | Returns emotion of ml_model without further fine tuning |
sentiment | Dictionary | Returns sentiment with positive, negative and total values |
sentiment_ml | Dictionary | Returns sentiment of ml_model without further fine tuning |
sentence_type | String | Returns the type of sentence |
language | String | Returns detected language |
spans | List | Returns spans in the sentence |
clusters | List | Returns clusters in the sentence |
Token properties
Name | Type | Description |
---|---|---|
str | String | Returns string of token |
source | String | Returns string of token |
original_text | String | Returns the original token in the input text before modification from the tokenization |
pos | String | Returns POS (Part-of-Speech) |
ner | NER Object | Returns the actual named entity (combines NER & ML_NER) |
lemma | String | Returns the lemma |
lemma_detail | List | Returns details of lemmatizer |
auxiliary | String | Returns unmerged lemma |
gender | String | Returns gender of token |
plural | String | Returns whether token is singular or plural |
infinitive | String | Returns infinitive of token if verb |
mode | String | Returns mode of token if verb |
conjugate | String | Returns conjugate of token if verb |
morphology | String | Returns morphological features |
dep | String | Returns dependency relations |
ref | Integer | Returns the index of the token's parent in the dependency tree: -1 for the root, otherwise >= 0 |
meaning | List | Returns meanings as tuples (SUPER, SUB) |
spans | List | Returns spans that contain the token |
clusters | List | Returns clusters that contain the token |
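
As a rough illustration of how the 'ref' values can be used, the sketch below rebuilds a parent-to-children map from a list of parent indices. It is plain Python and does not call the library: the `children_by_parent` helper and the sample `tokens` and `refs` values are hypothetical, and it assumes that 'ref' is a 0-based index into the tokens of the same sentence.

```python
from collections import defaultdict
from typing import Dict, List


def children_by_parent(refs: List[int]) -> Dict[int, List[int]]:
    """Group token indices by the index of their parent.

    The key -1 holds the root token(s), following the convention
    that 'ref' is -1 for the root and >= 0 otherwise.
    """
    children: Dict[int, List[int]] = defaultdict(list)
    for index, parent in enumerate(refs):
        children[parent].append(index)
    return dict(children)


# Hypothetical values, as they might be read from 'str' and 'ref'
# for a three-token sentence (assumed 0-based, sentence-local indices).
tokens = ["The", "cat", "sleeps"]
refs = [1, 2, -1]

tree = children_by_parent(refs)
root_index = tree[-1][0]
print(tokens[root_index], tree)
```

Here "The" depends on "cat" and "cat" depends on "sleeps", which is the root because its parent index is -1.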
For a demo of common properties, check out our tutorial.