

Lesson 18 Named Entity Recognition & Relation Extraction

I. Named Entity Recognition (NER)

NE Type                       Examples
ORGANIZATION                  Georgia-Pacific Corp., WHO
PERSON                        Eddy Bonte, President Obama
LOCATION                      Murray River, Mount Everest
DATE                          June, 2008-06-29
TIME                          two fifty a m, 1:30 p.m.
MONEY                         175 million Canadian Dollars, GBP 10.40
PERCENT                       twenty pct, 18.75 %
FACILITY                      Washington Monument, Stonehenge
GPE (geo-political entity)    South East Asia, Midlothian

s="""The fourth Wells account moving to another agency is the packaged paper-products division of Georgia-Pacific Corp., which arrived at Wells only last fall. Like Hertz and the History Channel, it is also leaving for an Omnicom-owned agency, the BBDO South unit of BBDO Worldwide. BBDO South in Atlanta, which handles corporate advertising for Georgia-Pacific, will assume additional duties for brands like Angel Soft toilet tissue and Sparkle paper towels, said Ken Haldin, a spokesman for Georgia-Pacific in Atlanta."""

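import nltk

s_w = nltk.word_tokenize(s)    # tokenize the text into words
s_tag = nltk.pos_tag(s_w)      # part-of-speech (POS) tagging
print(nltk.ne_chunk(s_tag))    # ne_chunk is the named entity recognition function
# print(nltk.ne_chunk(s_tag, binary=True))  # with binary=True every entity is labeled NE; otherwise the specific type is shown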
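The tree printed by ne_chunk is easier to work with once it is reduced to a list of entities. The sketch below is one possible way to do that, assuming the tree has first been flattened into (word, POS, IOB-tag) triples, for example with nltk.chunk.tree2conlltags. The function name group_iob_entities and the sample triples are hypothetical and hand-written for illustration; they are not real tagger output.

```python
def group_iob_entities(tagged):
    """Group (word, pos, iob) triples into (label, text) entity pairs."""
    entities = []
    current_label, current_words = None, []
    for word, _pos, iob in tagged:
        starts_new = iob.startswith("B-") or (
            iob.startswith("I-") and iob[2:] != current_label
        )
        if starts_new:
            # A new entity begins: flush any entity that is still open.
            if current_words:
                entities.append((current_label, " ".join(current_words)))
            current_label, current_words = iob[2:], [word]
        elif iob.startswith("I-"):
            # Continuation of the entity that is currently open.
            current_words.append(word)
        else:
            # "O": outside any entity, so close the open one (if any).
            if current_words:
                entities.append((current_label, " ".join(current_words)))
            current_label, current_words = None, []
    if current_words:
        entities.append((current_label, " ".join(current_words)))
    return entities


# Hand-written sample in the (word, POS, IOB) format; labels are assumptions.
sample = [
    ("Ken", "NNP", "B-PERSON"),
    ("Haldin", "NNP", "I-PERSON"),
    (",", ",", "O"),
    ("a", "DT", "O"),
    ("spokesman", "NN", "O"),
    ("for", "IN", "O"),
    ("Georgia-Pacific", "NNP", "B-ORGANIZATION"),
    ("in", "IN", "O"),
    ("Atlanta", "NNP", "B-GPE"),
]
entities = group_iob_entities(sample)
print(entities)
```

With binary=True the same grouping still applies, but every label would simply be NE.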


Exercise: Following the example above, perform NER on the text below.

Guangdong University of Foreign Studies (GDUFS) is a major internationalized university in South China for its global-minded faculty/students and its research on international languages, literature, culture, trade and strategic studies. 

Dating back to 1965 when the Guangzhou Institute of Foreign Languages was established and 1980 when the Guangzhou Institute of Foreign Trade was founded, the University had its present form by merging the two in 1995, with the Guangdong College of Finance and Economics incorporated into the University in 2008. The University has three campuses with a total area of 153 hectares: the North Campus at the foot of the Baiyun Mountain, the South Campus in Guangzhou Higher Education Mega Center, and Dalang Campus.


II. Relation Extraction

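Once the named entities in a text have been identified, relation extraction can be used to pull out information. One approach is to look for all triples of the form (X, a, Y), where X and Y are named entities and a is the string that expresses the relation between them. For example: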


import nltk, re

IN = re.compile(r'.*\bin\b')  # regular expression prepared in advance; matches the word "in"

for doc in nltk.corpus.ieer.parsed_docs('NYT_19980315'):
    for rel in nltk.sem.extract_rels('ORG', 'LOC', doc, corpus='ieer', pattern=IN):
        print(nltk.sem.rtuple(rel))
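To make the (X, a, Y) search concrete without the IEER corpus, here is a minimal sketch in plain Python. It is not NLTK's implementation: it only pairs each entity with the next one and tests the words in between against the pattern. The function extract_triples, the mixed-sequence input format, and the hand-annotated sample are all assumptions made for illustration.

```python
import re

IN = re.compile(r".*\bin\b")  # same pattern as above: filler containing the word "in"


def extract_triples(sequence, subj_type, obj_type, pattern):
    """Return (X, a, Y) triples from a flat sequence.

    `sequence` mixes plain word strings with (label, text) entity tuples.
    Only adjacent entity pairs are considered (a simplification).
    """
    triples = []
    last_entity = None  # most recent entity seen so far
    filler = []         # words seen since that entity
    for item in sequence:
        if isinstance(item, tuple):
            label, text = item
            if last_entity is not None:
                between = " ".join(filler)
                if (last_entity[0] == subj_type and label == obj_type
                        and pattern.match(between)):
                    triples.append((last_entity[1], between, text))
            last_entity, filler = item, []
        else:
            filler.append(item)
    return triples


# Hand-annotated sample loosely based on the Georgia-Pacific text; labels are assumptions.
sample = [
    ("ORG", "BBDO South"), "in", ("LOC", "Atlanta"),
    "handles", "advertising", "for", ("ORG", "Georgia-Pacific"),
    ",", "said", ("PER", "Ken Haldin"),
    ",", "a", "spokesman", "for", ("ORG", "Georgia-Pacific"),
    "in", ("LOC", "Atlanta"),
]
triples = extract_triples(sample, "ORG", "LOC", IN)
for x, a, y in triples:
    print(f"({x}, {a!r}, {y})")
```

Changing the pattern or the two entity types selects a different relation, which is the same role the pattern, 'ORG' and 'LOC' arguments play in the extract_rels call.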


III. BosonNLP
https://bosonnlp.com/

An open platform for Chinese semantic analysis.



School of Information Science and Technology, Guangdong University of Foreign Studies
SCHOLAT.com