Please use this identifier to cite or link to this item: https://ir.iimcal.ac.in:8443/jspui/handle/123456789/1685
Full metadata record
DC Field | Value | Language
dc.contributor.author | Basu, Moumita
dc.contributor.author | Roy, Anurag
dc.contributor.author | Ghosh, Kripabandhu
dc.contributor.author | Bandyopadhyay, Somprakash
dc.contributor.author | Ghosh, Saptarshi
dc.date.accessioned | 2021-08-26T06:23:44Z | -
dc.date.available | 2021-08-26T06:23:44Z | -
dc.date.issued | 2017
dc.identifier.uri | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85017930397&doi=10.1007%2f978-3-319-56608-5_53&partnerID=40&md5=bd2df865f16c09422df202c7fb94b8d7
dc.identifier.uri | https://ir.iimcal.ac.in:8443/jspui/handle/123456789/1685 | -
dc.description | Basu, Moumita, Indian Institute of Engineering Science and Technology, Shibpur, India, Indian Institute of Management, Calcutta, India; Roy, Anurag, Indian Institute of Engineering Science and Technology, Shibpur, India; Ghosh, Kripabandhu, Indian Institute of Technology, Kanpur, India; Bandyopadhyay, Somprakash, Indian Institute of Management, Calcutta, India; Ghosh, Saptarshi, Indian Institute of Engineering Science and Technology, Shibpur, India, Indian Institute of Technology, Kharagpur, India
dc.description | ISSN/ISBN - 0302-9743
dc.description | pp. 589-597
dc.description | DOI - 10.1007/978-3-319-56608-5_53
dc.description.abstract | IR methods are increasingly being applied over microblogs to extract real-time information, such as during disaster events. On such sites, most of the user-generated content is written informally – the same word is often spelled differently by different users, and words are shortened arbitrarily due to the length limitations on microblogs. Stemming is a common step for improving retrieval performance by unifying different morphological variants of a word. In this study, we show that rule-based stemming meant for formal text often cannot capture the arbitrary variations of words in microblogs. We propose a context-specific stemming algorithm, based on word embeddings, which can capture many more variations of words than can be detected by conventional stemmers. Experiments on a large set of English microblogs posted during a recent disaster event show that the proposed stemming gives considerably better retrieval performance than Porter stemming. © Springer International Publishing AG 2017.
dc.publisher | SCOPUS
dc.publisher | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
dc.publisher | Springer Verlag
dc.relation.ispartofseries | 10193 LNCS
dc.subject | Disasters
dc.subject | Microblog retrieval
dc.subject | Stemming
dc.subject | Word embedding
dc.subject | Word2vec
dc.title | A novel word embedding based stemming approach for microblog retrieval during disasters
dc.type | Conference Paper
Appears in Collections: Management Information Systems
