Please use this identifier to cite or link to this item:
https://ir.iimcal.ac.in:8443/jspui/handle/123456789/1685
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Basu, Moumita | |
dc.contributor.author | Roy, Anurag | |
dc.contributor.author | Ghosh, Kripabandhu | |
dc.contributor.author | Bandyopadhyay, Somprakash | |
dc.contributor.author | Ghosh, Saptarshi | |
dc.date.accessioned | 2021-08-26T06:23:44Z | - |
dc.date.available | 2021-08-26T06:23:44Z | - |
dc.date.issued | 2017 | |
dc.identifier.uri | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85017930397&doi=10.1007%2f978-3-319-56608-5_53&partnerID=40&md5=bd2df865f16c09422df202c7fb94b8d7 | |
dc.identifier.uri | https://ir.iimcal.ac.in:8443/jspui/handle/123456789/1685 | - |
dc.description | Basu, Moumita, Indian Institute of Engineering Science and Technology, Shibpur, India, Indian Institute of Management, Calcutta, India; Roy, Anurag, Indian Institute of Engineering Science and Technology, Shibpur, India; Ghosh, Kripabandhu, Indian Institute of Technology, Kanpur, India; Bandyopadhyay, Somprakash, Indian Institute of Management, Calcutta, India; Ghosh, Saptarshi, Indian Institute of Engineering Science and Technology, Shibpur, India, Indian Institute of Technology, Kharagpur, India | |
dc.description | ISSN/ISBN - 0302-9743 | |
dc.description | pp.589-597 | |
dc.description | DOI - 10.1007/978-3-319-56608-5_53 | |
dc.description.abstract | IR methods are increasingly being applied over microblogs to extract real-time information, such as during disaster events. In such sites, most of the user-generated content is written informally – the same word is often spelled differently by different users, and words are shortened arbitrarily due to the length limitations on microblogs. Stemming is a common step for improving retrieval performance by unifying different morphological variants of a word. In this study, we show that rule-based stemming meant for formal text often cannot capture the arbitrary variations of words in microblogs. We propose a context-specific stemming algorithm, based on word embeddings, which can capture many more variations of words than can be detected by conventional stemmers. Experiments on a large set of English microblogs posted during a recent disaster event show that the proposed stemming gives considerably better retrieval performance than Porter stemming. © Springer International Publishing AG 2017. | |
dc.publisher | SCOPUS | |
dc.publisher | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | |
dc.publisher | Springer Verlag | |
dc.relation.ispartofseries | 10193 LNCS | |
dc.subject | Disasters | |
dc.subject | Microblog retrieval | |
dc.subject | Stemming | |
dc.subject | Word embedding | |
dc.subject | Word2vec | |
dc.title | A novel word embedding based stemming approach for microblog retrieval during disasters | |
dc.type | Conference Paper | |
Appears in Collections: | Management Information Systems |
Files in This Item:
There are no files associated with this item.