CoreNLP NER
CoreNLP NER tagging is performed by a CoreNLP Server running the official pre-trained model, invoked through nltk.parse.corenlp.CoreNLPParser.
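As a minimal sketch of how such an invocation might look, the snippet below wraps CoreNLPParser with tagtype="ner" and merges consecutive tokens that carry the same tag into entity spans. The server URL http://localhost:9000 and the helper names tag_with_corenlp and group_entities are assumptions made for illustration; they are not taken from this project's code.

```python
from itertools import groupby


def tag_with_corenlp(tokens, url="http://localhost:9000"):
    """Send a pre-tokenized sentence to a running CoreNLP Server.

    Returns a list of (token, ner_tag) pairs. The URL is an assumed default;
    point it at wherever your server is actually listening.
    """
    # Imported lazily so group_entities() can be used without NLTK installed.
    from nltk.parse.corenlp import CoreNLPParser

    ner_tagger = CoreNLPParser(url=url, tagtype="ner")
    return list(ner_tagger.tag(tokens))


def group_entities(tagged_tokens, outside_tag="O"):
    """Merge runs of consecutive tokens sharing a non-O tag into (text, tag) spans.

    Limitation: two distinct adjacent entities with the same tag are merged
    into one span, because the tagger output carries no boundary markers.
    """
    spans = []
    for tag, group in groupby(tagged_tokens, key=lambda pair: pair[1]):
        if tag != outside_tag:
            spans.append((" ".join(token for token, _ in group), tag))
    return spans
```

With a server running, group_entities(tag_with_corenlp(sentence.split())) would yield the entity spans for a whitespace-tokenized sentence. The exact tags returned depend on the model and annotators configured on the server.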
Tagset
Quoted from https://stanfordnlp.github.io/CoreNLP/ner.html
For English, by default, this annotator recognizes named (PERSON, LOCATION, ORGANIZATION, MISC), numerical (MONEY, NUMBER, ORDINAL, PERCENT), and temporal (DATE, TIME, DURATION, SET) entities (12 classes). Adding the regexner annotator and using the supplied RegexNER pattern files adds support for the fine-grained and additional entity classes EMAIL, URL, CITY, STATE_OR_PROVINCE, COUNTRY, NATIONALITY, RELIGION, (job) TITLE, IDEOLOGY, CRIMINAL_CHARGE, CAUSE_OF_DEATH, (Twitter, etc.) HANDLE (12 classes) for a total of 24 classes. Named entities are recognized using a combination of three CRF sequence taggers trained on various corpora, including CoNLL, ACE, MUC, and ERE corpora. Numerical entities are recognized using a rule-based system.
PERSON, LOCATION, ORGANIZATION, MISC, MONEY, NUMBER, ORDINAL, PERCENT, DATE, TIME, DURATION, SET, EMAIL, URL, CITY, STATE_OR_PROVINCE, COUNTRY, NATIONALITY, RELIGION, TITLE, IDEOLOGY, CRIMINAL_CHARGE, CAUSE_OF_DEATH, HANDLE
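For downstream code that needs to validate or bucket tags, the tagset can be written out as constants. This is only a sketch: the group names ("named", "numerical", "temporal", "regexner") follow the wording of the quoted documentation, while the constant names and the tag_category helper are hypothetical and not part of CoreNLP or NLTK.

```python
# The 12 classes recognized by default, grouped as in the quoted description.
NAMED_TAGS = ("PERSON", "LOCATION", "ORGANIZATION", "MISC")
NUMERICAL_TAGS = ("MONEY", "NUMBER", "ORDINAL", "PERCENT")
TEMPORAL_TAGS = ("DATE", "TIME", "DURATION", "SET")

# The 12 additional classes contributed by the regexner annotator.
REGEXNER_TAGS = (
    "EMAIL", "URL", "CITY", "STATE_OR_PROVINCE", "COUNTRY", "NATIONALITY",
    "RELIGION", "TITLE", "IDEOLOGY", "CRIMINAL_CHARGE", "CAUSE_OF_DEATH",
    "HANDLE",
)

DEFAULT_TAGS = NAMED_TAGS + NUMERICAL_TAGS + TEMPORAL_TAGS
ALL_TAGS = DEFAULT_TAGS + REGEXNER_TAGS

# Lookup table from tag to its (hypothetical) category label.
_CATEGORY_BY_TAG = {}
for _label, _tags in (
    ("named", NAMED_TAGS),
    ("numerical", NUMERICAL_TAGS),
    ("temporal", TEMPORAL_TAGS),
    ("regexner", REGEXNER_TAGS),
):
    for _tag in _tags:
        _CATEGORY_BY_TAG[_tag] = _label


def tag_category(tag):
    """Return the category label for a tag, or None for 'O' and unknown tags."""
    return _CATEGORY_BY_TAG.get(tag)
```

A check such as tag_category(tag) is None can then be used to skip non-entity tokens or to flag tags outside the 24 documented classes.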