30-04-2021



NLTK has a class called RegexpParser for parsing part-of-speech tagged sentences. I could not find a good, short explanation of its regex pattern syntax, so here is one.

  • 1. Part-of-Speech tagging
  • 2. Regular expression (regex) grammar

1. Part-of-Speech tagging

This step simply tags each of your tokenized words with its word type, for example verb, noun, or adjective.

1.1. The tags and explanations

The full list of tags can be shown when running the command
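
The command itself did not survive in this copy of the post. Since the tag list at the bottom matches NLTK's Penn Treebank help text, a likely candidate is `nltk.help.upenn_tagset()`; treat that as an assumption. The `show_tags` wrapper below is a hypothetical helper that only reports when the help data has not been downloaded yet.

```python
import nltk

def show_tags(tag_pattern=None):
    """Print Penn Treebank tag explanations; return False if the data is missing.

    tag_pattern is a regex over tag names, e.g. "NN.*"; None prints every tag.
    """
    try:
        nltk.help.upenn_tagset(tag_pattern)
    except LookupError:
        # The help text ships as a separate NLTK data package.
        print("Tagset help data not found; fetch it with nltk.download() first.")
        return False
    return True

# Print only the noun tags (NN, NNP, NNPS, NNS) if the data is available.
ok = show_tags("NN.*")
```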



For the full list of explanations, scroll to the bottom of this post.

1.2. Tagging them


For example, when you have the sentence:

The process should be something like this
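
The example sentence and its code are missing here. Judging by the later reference to "The" and "fox", the sentence was probably the classic pangram used below, but that is a guess. A minimal sketch of the process, assuming the NLTK tokenizer and tagger data are installed (`tag_sentence` and the hand-tagged fallback are illustrative, not from the original post):

```python
import nltk

# Assumed example sentence; the original one is not preserved.
sentence = "The quick brown fox jumps over the lazy dog"

# Hand-tagged fallback for the assumed sentence above, so the example
# still runs when the NLTK models have not been downloaded.
FALLBACK = [("The", "DT"), ("quick", "JJ"), ("brown", "JJ"), ("fox", "NN"),
            ("jumps", "VBZ"), ("over", "IN"), ("the", "DT"), ("lazy", "JJ"),
            ("dog", "NN")]

def tag_sentence(text):
    """Tokenize text, then attach a part-of-speech tag to every token."""
    try:
        tokens = nltk.word_tokenize(text)   # step 1: split into words
        return nltk.pos_tag(tokens)         # step 2: list of (word, tag) pairs
    except LookupError:
        # Raised when the tokenizer or tagger data is not installed.
        return FALLBACK

tagged = tag_sentence(sentence)
print(tagged)
```

The result is a list of `(word, tag)` tuples, which is the input format the grammars in the next section work on. The exact tags depend on the tagger model, so they may differ from the hand-tagged fallback.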

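2. Regular expression (regex) grammar

Regular expressions over NLTK tokens work quite differently from regular expressions over normal text. The grammar treats each token's tag as a string wrapped in angle brackets, such as <DT>, applies the regex pattern to the sequence of those tags, and maps every match back to the positions of the tokens in the sentence.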
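
To make that idea concrete, here is a simplified illustration that uses only Python's `re` module. It is not NLTK's actual implementation; the hand-tagged sentence and the helper name `match_tags` are assumptions made for this sketch.

```python
import re

# Hand-tagged sentence (an assumption; a real tagger may choose other tags).
tagged = [("The", "DT"), ("quick", "JJ"), ("brown", "JJ"), ("fox", "NN"),
          ("jumps", "VBZ"), ("over", "IN"), ("the", "DT"), ("lazy", "JJ"),
          ("dog", "NN")]

def match_tags(tagged_tokens, pattern):
    """Return the words whose tags form the first match of pattern."""
    # Every token contributes exactly one "<TAG>" piece to the string.
    tag_string = "".join(f"<{tag}>" for _, tag in tagged_tokens)
    found = re.search(pattern, tag_string)
    if found is None:
        return []
    # Counting "<" converts character offsets back into token positions.
    start = tag_string[:found.start()].count("<")
    end = start + found.group().count("<")
    return [word for word, _ in tagged_tokens[start:end]]

words = match_tags(tagged, r"<DT>(<JJ>)*<NN>")
print(words)  # ['The', 'quick', 'brown', 'fox']
```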

Below is a list of grammars, from the simplest to the more complex, all using the same sentence as above.

2.1. Exact match
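
The original pattern for this section is not preserved. As a minimal sketch, assuming an exact match means listing every tag of the chunk in order, a grammar such as `NP: {<DT><JJ><JJ><NN>}` matches only that precise tag sequence. The `NP` label and the hand-tagged sentence are choices made for this illustration.

```python
from nltk import RegexpParser

# Hand-tagged sentence (an assumption; a real tagger may choose other tags).
tagged = [("The", "DT"), ("quick", "JJ"), ("brown", "JJ"), ("fox", "NN"),
          ("jumps", "VBZ"), ("over", "IN"), ("the", "DT"), ("lazy", "JJ"),
          ("dog", "NN")]

# Grammar format is "LABEL: {tag pattern}"; each <...> stands for one token.
parser = RegexpParser("NP: {<DT><JJ><JJ><NN>}")
tree = parser.parse(tagged)

# Collect the words of every subtree labelled NP.
chunks = [" ".join(word for word, _ in subtree.leaves())
          for subtree in tree.subtrees(lambda t: t.label() == "NP")]
print(chunks)  # ['The quick brown fox']
```

"the lazy dog" is not matched because its tags are DT JJ NN, with only one JJ.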


2.2. Skip some tags


We will skip all tags between "The" and "fox" (that is, between the DT and NN tags).

Explanation:

<.*> matches any tag. The dot (.) matches any character of the tag, and the asterisk (*) inside the brackets repeats that match zero or more times.

The second asterisk (*), right after the closing bracket, repeats the whole tag match zero or more times, so <.*>* matches any run of tags.
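
The pattern described above appears to be `<DT><.*>*<NN>`; the original code is not preserved, so treat that as a reconstruction. Below is a minimal sketch on a hand-tagged sentence, where the `CHUNK` label and the `chunk_words` helper are assumptions. One thing worth knowing: `*` is greedy, so with this tagging the match runs on to the last NN in the sentence. Adding `?` after the second asterisk makes it stop at the first NN instead.

```python
from nltk import RegexpParser

# Hand-tagged sentence (an assumption; a real tagger may choose other tags).
tagged = [("The", "DT"), ("quick", "JJ"), ("brown", "JJ"), ("fox", "NN"),
          ("jumps", "VBZ"), ("over", "IN"), ("the", "DT"), ("lazy", "JJ"),
          ("dog", "NN")]

def chunk_words(grammar, tagged_tokens, label="CHUNK"):
    """Parse with the grammar and return each labelled chunk as a string."""
    tree = RegexpParser(grammar).parse(tagged_tokens)
    return [" ".join(word for word, _ in subtree.leaves())
            for subtree in tree.subtrees(lambda t: t.label() == label)]

# Greedy: <.*>* swallows as many tags as it can before the final <NN>.
greedy = chunk_words("CHUNK: {<DT><.*>*<NN>}", tagged)
# Lazy: <.*>*? takes as few tags as possible, stopping at the first <NN>.
lazy = chunk_words("CHUNK: {<DT><.*>*?<NN>}", tagged)

print(greedy)  # ['The quick brown fox jumps over the lazy dog']
print(lazy)    # ['The quick brown fox', 'the lazy dog']
```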

2.3. Match all tags starting with a character


We will match all tags starting with ‘N’, which includes the ‘NN’, ‘NNP’, ‘NNPS’, and ‘NNS’ tags.
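
A minimal sketch of this rule, assuming the pattern is `<N.*>` (the original code is not preserved; the `NOUN` label and the hand-tagged sentence are illustrative):

```python
from nltk import RegexpParser

# Hand-tagged sentence (an assumption; a real tagger may choose other tags).
tagged = [("The", "DT"), ("quick", "JJ"), ("brown", "JJ"), ("fox", "NN"),
          ("jumps", "VBZ"), ("over", "IN"), ("the", "DT"), ("lazy", "JJ"),
          ("dog", "NN")]

# "N" followed by any characters: NN, NNP, NNPS and NNS all qualify.
parser = RegexpParser("NOUN: {<N.*>}")
tree = parser.parse(tagged)

# Without a quantifier after <N.*>, every noun becomes its own chunk.
nouns = [subtree.leaves()
         for subtree in tree.subtrees(lambda t: t.label() == "NOUN")]
print(nouns)  # [[('fox', 'NN')], [('dog', 'NN')]]
```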

To scan your text with multiple grammars, simply combine them with the newline character \n.


For example:

Single pattern grammar

Multiple patterns grammar
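
The two grammars are missing from this copy. Here is a minimal sketch of what they might look like, with `NP` and `VP` as labels chosen for illustration. Each line of the multi-pattern grammar is a separate `LABEL: {pattern}` rule, and the rules are applied in order.

```python
from nltk import RegexpParser, Tree

# Hand-tagged sentence (an assumption; a real tagger may choose other tags).
tagged = [("The", "DT"), ("quick", "JJ"), ("brown", "JJ"), ("fox", "NN"),
          ("jumps", "VBZ"), ("over", "IN"), ("the", "DT"), ("lazy", "JJ"),
          ("dog", "NN")]

# Single pattern grammar: one label, one rule.
single = "NP: {<DT><JJ>*<NN>}"

# Multiple patterns grammar: one rule per line, joined with "\n".
multiple = "\n".join([
    "NP: {<DT><JJ>*<NN>}",   # noun phrases first
    "VP: {<VB.*>}",          # then any verb tag
])

def top_level_labels(grammar):
    """Return the labels of the chunks directly under the root of the tree."""
    tree = RegexpParser(grammar).parse(tagged)
    return [child.label() for child in tree if isinstance(child, Tree)]

print(top_level_labels(single))    # ['NP', 'NP']
print(top_level_labels(multiple))  # ['NP', 'VP', 'NP']
```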

Here is the full list of tags to save you some time:

Tag: Explanation
CC: conjunction, coordinating
    & ‘n and both but either et for less minus neither nor or plus so
    therefore times v. versus vs. whether yet
CD: numeral, cardinal
    mid-1890 nine-thirty forty-two one-tenth ten million 0.5 one
    forty-seven 1987 twenty ‘79 zero two 78-degrees eighty-four IX ‘60s .025
    fifteen 271,124 dozen quintillion DM2,000 …
DT: determiner
    all an another any both del each either every half la many much nary
    neither no some such that the them these this those
EX: existential there
    there
FW: foreign word
    gemeinschaft hund ich jeux habeas Haementeria Herr K’ang-si vous
    lutihaw alai je jour objets salutaris fille quibusdam pas trop Monte
    terram fiche oui corporis …
IN: preposition or conjunction, subordinating
    astride among uppon whether out inside pro despite on by throughout
    below within for towards near behind atop around if like until below
    next into if beside …
JJ: adjective or numeral, ordinal
    third ill-mannered pre-war regrettable oiled calamitous first separable
    ectoplasmic battery-powered participatory fourth still-to-be-named
    multilingual multi-disciplinary …
JJR: adjective, comparative
    bleaker braver breezier briefer brighter brisker broader bumper busier
    calmer cheaper choosier cleaner clearer closer colder commoner costlier
    cozier creamier crunchier cuter …
JJS: adjective, superlative
    calmest cheapest choicest classiest cleanest clearest closest commonest
    corniest costliest crassest creepiest crudest cutest darkest deadliest
    dearest deepest densest dinkiest …
LS: list item marker
    A A. B B. C C. D E F First G H I J K One SP-44001 SP-44002 SP-44005
    SP-44007 Second Third Three Two * a b c d first five four one six three
    two
MD: modal auxiliary
    can cannot could couldn’t dare may might must need ought shall should
    shouldn’t will would
NN: noun, common, singular or mass
    common-carrier cabbage knuckle-duster Casino afghan shed thermostat
    investment slide humour falloff slick wind hyena override subhumanity
    machinist …
NNP: noun, proper, singular
    Motown Venneboerger Czestochwa Ranzer Conchita Trumplane Christos
    Oceanside Escobar Kreisler Sawyer Cougar Yvette Ervin ODI Darryl CTCA
    Shannon A.K.C. Meltex Liverpool …
NNPS: noun, proper, plural
    Americans Americas Amharas Amityvilles Amusements Anarcho-Syndicalists
    Andalusians Andes Andruses Angels Animals Anthony Antilles Antiques
    Apache Apaches Apocrypha …
NNS: noun, common, plural
    undergraduates scotches bric-a-brac products bodyguards facets coasts
    divestitures storehouses designs clubs fragrances averages
    subjectivists apprehensions muses factory-jobs …
PDT: pre-determiner
    all both half many quite such sure this
POS: genitive marker
    ‘ ‘s
PRP: pronoun, personal
    hers herself him himself hisself it itself me myself one oneself ours
    ourselves ownself self she thee theirs them themselves they thou thy us
PRP$: pronoun, possessive
    her his mine my our ours their thy your
RB: adverb
    occasionally unabatingly maddeningly adventurously professedly
    stirringly prominently technologically magisterially predominately
    swiftly fiscally pitilessly …
RBR: adverb, comparative
    further gloomier grander graver greater grimmer harder harsher
    healthier heavier higher however larger later leaner lengthier
    less-perfectly lesser lonelier longer louder lower more …
RBS: adverb, superlative
    best biggest bluntest earliest farthest first furthest hardest
    heartiest highest largest least less most nearest second tightest worst
RP: particle
    aboard about across along apart around aside at away back before behind
    by crop down ever fast for forth from go high i.e. in into just later
    low more off on open out over per pie raising start teeth that through
    under unto up up-pp upon whole with you
SYM: symbol
    % & ‘ ‘’ ‘’. ) ). * + ,. < = > @ A[fj] U.S U.S.S.R * ** ***
TO: to as preposition or infinitive marker
    to
UH: interjection
    Goodbye Goody Gosh Wow Jeepers Jee-sus Hubba Hey Kee-reist Oops amen
    huh howdy uh dammit whammo shucks heck anyways whodunnit honey golly
    man baby diddle hush sonuvabitch …
VB: verb, base form
    ask assemble assess assign assume atone attention avoid bake balkanize
    bank begin behold believe bend benefit bevel beware bless boil bomb
    boost brace break bring broil brush build …
VBD: verb, past tense
    dipped pleaded swiped regummed soaked tidied convened halted registered
    cushioned exacted snubbed strode aimed adopted belied figgered
    speculated wore appreciated contemplated …
VBG: verb, present participle or gerund
    telegraphing stirring focusing angering judging stalling lactating
    hankerin’ alleging veering capping approaching traveling besieging
    encrypting interrupting erasing wincing …
VBN: verb, past participle
    multihulled dilapidated aerosolized chaired languished panelized used
    experimented flourished imitated reunifed factored condensed sheared
    unsettled primed dubbed desired …
VBP: verb, present tense, not 3rd person singular
    predominate wrap resort sue twist spill cure lengthen brush terminate
    appear tend stray glisten obtain comprise detest tease attract
    emphasize mold postpone sever return wag …
VBZ: verb, present tense, 3rd person singular
    bases reconstructs marks mixes displeases seals carps weaves snatches
    slumps stretches authorizes smolders pictures emerges stockpiles
    seduces fizzes uses bolsters slaps speaks pleads …
WDT: WH-determiner
    that what whatever which whichever
WP: WH-pronoun
    that what whatever whatsoever which who whom whosoever
WP$: WH-pronoun, possessive
    whose
WRB: Wh-adverb
    how however whence whenever where whereby whereever wherein whereof why