Automatic keyword extraction with LDA topic modeling: Measuring similarity to standard keywords and user evaluation

Authors
1 Kharazmi University, Faculty of Psychology and Educational Sciences, Department of Knowledge and Information Science
2 Kharazmi University
3 Tarbiat Modares University
Abstract
Background and purpose: This study examines the results of automatic keyword extraction from the tables of contents of Persian science e-books using LDA topic modeling, measures the similarity of the output keywords to standard keywords, and examines users' evaluation of the machine-extracted keywords.

Method: This applied study is a text-mining study and, in terms of the methods used, a mixed-methods study. LDA topic modeling was used to extract keywords from the books' tables of contents, and the results of applying the model were evaluated in two ways: by measuring cosine similarity and through a qualitative study with users.

Findings: With a trimmed mean of 260.02 words, the tables of contents examined count as medium-length texts, and about 20 percent of their words are stop words. The cosine similarity between the standard subject-heading keywords and the output keywords of the LDA model was 0.0932, which is very low. Users agreed unanimously that the output keywords of the LDA topic model represent the subject field of the whole corpus. For describing the subjects of each individual document, however, users rated the standard subject-heading keywords as most successful, followed by the keywords extracted by the model within subject sub-domains and then the keywords extracted by the model from the whole corpus.

Conclusion: Keywords obtained from the LDA topic model can be used in unknown collections to extract the latent subject content of the collection as a whole, but the method cannot be used to relate subjects precisely to documents in large corpora with heterogeneous and diverse subjects. In formal procedures for the subject description of individual documents, the method can serve as a keyword suggestion system for human indexers.

Keywords

Article title (English)

Automatic keyword extraction using Latent Dirichlet Allocation topic modeling: Similarity with golden standard and users' evaluation

Authors (English)

Nosrat RiahiNia 1
Farzaneh Shadanpour 2
Keyvan Borna 2
Gholam Ali Montazer 3
1 Kharazmi University
2 Kharazmi University
3 Tarbiat Modares University
Abstract (English)

Purpose: This study investigates automatic keyword extraction from the tables of contents of Persian science e-books using Latent Dirichlet Allocation (LDA) topic modeling, measures the similarity of the extracted keywords to gold-standard keywords, and examines users' evaluations of the machine-extracted keywords.

Methodology: This applied text-mining study uses mixed methods. LDA topic modeling was used to extract keywords from the tables of contents of scientific e-books, and the results were evaluated in two ways: by computing cosine similarity and through a qualitative evaluation by users.
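As a rough illustration of the extraction step, the sketch below fits LDA with collapsed Gibbs sampling on a toy tokenized corpus and reads off the highest-count words of each topic as candidate keywords. It is not the authors' implementation: the abstract does not name the software, inference method, or parameters used, so the function name `lda_top_words`, the hyperparameters `alpha` and `beta`, the number of topics, and the English toy documents are all assumptions made for illustration.

```python
import random


def lda_top_words(docs, num_topics=2, top_n=3, iters=200,
                  alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return top_n words per topic.

    docs is a list of token lists (hypothetical, already preprocessed).
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    # Count tables updated during sampling.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Random initial topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][wid] -= 1
                topic_total[z] -= 1
                # Conditional probability of each topic for this token
                # (unnormalized).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][wid] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                # Record the newly sampled assignment.
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][wid] += 1
                topic_total[z] += 1

    # Rank words in each topic by count; break ties alphabetically.
    topics = []
    for k in range(num_topics):
        ranked = sorted(range(vocab_size),
                        key=lambda wid: (-topic_word[k][wid], vocab[wid]))
        topics.append([vocab[wid] for wid in ranked[:top_n]])
    return topics


# Hypothetical toy "tables of contents", already tokenized.
toy_docs = [
    ["force", "motion", "energy", "force", "momentum"],
    ["energy", "motion", "force", "work"],
    ["cell", "gene", "protein", "cell", "enzyme"],
    ["gene", "protein", "cell", "tissue"],
]
print(lda_top_words(toy_docs))
```

Because the model is fitted on the whole corpus, the resulting word lists describe topics of the collection rather than of any single document.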

Findings: With a trimmed mean of 260.02 words, the tables of contents are medium-length texts, and about 20% of their words are stop words. The cosine similarity between the gold-standard keywords and the output keywords was 0.0932, which is very low. Users agreed unanimously that the keywords extracted by the LDA topic model represent the subject field of the whole corpus. For describing the subject of each individual document, however, users ranked the gold-standard keywords first, the keywords extracted by the model within sub-domains of the corpus second, and the keywords extracted from the whole corpus third.
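For reference, cosine similarity between two keyword lists can be computed over term-count vectors, as in the minimal sketch below. The abstract does not say how the keyword vectors were built or weighted, so the bag-of-words representation, the function name `cosine_similarity`, and the toy keyword lists are assumptions. The toy value is unrelated to the reported 0.0932.

```python
import math
from collections import Counter


def cosine_similarity(keywords_a, keywords_b):
    """Cosine similarity between two keyword lists as term-count vectors."""
    vec_a, vec_b = Counter(keywords_a), Counter(keywords_b)
    # Only terms present in both lists contribute to the dot product.
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty keyword list shares nothing with any other list
    return dot / (norm_a * norm_b)


# Hypothetical keyword lists for one document.
standard_keywords = ["physics", "mechanics", "thermodynamics", "optics"]
model_keywords = ["physics", "energy", "chapter", "equation"]
print(cosine_similarity(standard_keywords, model_keywords))  # 0.25
```

With exact term matching, a low score means the two lists share few literal terms. It does not by itself show that the extracted keywords are off-topic.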

Conclusion: Keywords extracted with the LDA topic model can be used in unknown collections to extract the hidden thematic content of the collection as a whole, but not to relate topics accurately to individual documents in large corpora with heterogeneous and diverse subjects. In collections of texts from a single subject field, such as mathematics or physics, where the vocabulary is less diverse and more uniform, the model yields more coherent and relevant keywords, but the relevance of the keywords to each document still has to be checked. In formal subject-analysis procedures for individual documents, the approach can serve as a keyword suggestion system for human indexers.

Keywords (English)

Keyword extraction
Topic Modeling
Latent Dirichlet Allocation (LDA)
Similarity evaluation
Users' evaluation