Natural Language Processing - Usage and Scope in Modern Day Data


Dewan Ziaul Karim (DZK)

Lecturer

ziaul.karim@bracu.ac.bd

Synopsis

 

Natural language processing (NLP) is a branch of computer science, and more specifically of artificial intelligence (AI), concerned with giving computers the capacity to understand text and spoken words in much the same way that humans do. NLP combines computational linguistics (rule-based modeling of human language) with statistical, machine learning, and deep learning models. Together, these technologies allow computers to analyze human language in the form of text or speech data and 'understand' its meaning, including the speaker's or writer's intent and sentiment. Efficient NLP techniques (in both Bengali and English) can help machines communicate well with users and provide useful insights. NLP can be used in speech recognition, part-of-speech tagging, co-reference resolution, summarization, sentiment analysis, etc.


Relevance of the Topic

 

Natural language processing enables computers to communicate with humans in their own language while also automating other language-related tasks. For example, NLP enables computers to read text, hear speech, interpret it, gauge sentiment, and identify which parts are significant. Machines can now process far more language-based data than humans can, without becoming fatigued and in a consistent manner. Given the massive volume of unstructured data generated every day, from medical records to social media, automation will be essential for efficiently analyzing text and audio data.


Future Research/Scope

 

Build a model to perform sentiment analysis, hate speech recognition, fake news classification, etc.

Build an application for summarization, medical document analysis, etc.

Provide useful insights to businesses regarding market trends, customer behavior, etc.


Skills Learned

 

  • Basics of machine learning.
  • Usage of high-level neural network APIs such as Keras.
  • Usage of different libraries such as Natural Language Toolkit (NLTK), Gensim, CoreNLP, spaCy, TextBlob, etc.
  • Usage of different model performance metrics such as precision, recall, F1 score, confusion matrix, etc.
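
As a rough illustration of the performance metrics listed above, the following minimal sketch computes the confusion-matrix counts, precision, recall, and F1 score for a binary classifier in plain Python. The function names (`confusion_counts`, `precision_recall_f1`) and the sample labels are hypothetical and chosen only for this example; in practice a library such as scikit-learn would typically be used instead.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary task.

    `positive` is the label treated as the positive class
    (an assumption made for this illustration).
    """
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # correctly predicted positive
        elif pred == positive:
            fp += 1  # predicted positive, actually negative
        elif truth == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # correctly predicted negative
    return tp, fp, fn, tn


def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 from the confusion counts."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred, positive)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Made-up labels, e.g. 1 = "positive sentiment", 0 = "negative sentiment".
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print("Confusion matrix counts:", {"tp": tp, "fp": fp, "fn": fn, "tn": tn})
print("precision=%.2f recall=%.2f f1=%.2f" % (precision, recall, f1))
```

Precision measures how many predicted positives were correct, recall measures how many actual positives were found, and the F1 score is their harmonic mean.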

Relevant courses to the topic

 

  • Artificial Intelligence (CSE422)
  • Machine Learning (CSE427)
  • Natural Language Processing-I (CSE431)
  • Natural Language Processing-II (CSE440)
  • Neural Networks (CSE425)

Reading List

 

  • Pak, A. and Paroubek, P., 2010, May. Twitter as a corpus for sentiment analysis and opinion mining. In LREC (Vol. 10, No. 2010, pp. 1320-1326).
  • Liu, B. and Zhang, L., 2012. A survey of opinion mining and sentiment analysis. In Mining text data (pp. 415-463). Springer, Boston, MA.
  • Tripto, N.I. and Ali, M.E., 2018, September. Detecting multilabel sentiment and emotions from Bangla YouTube comments. In 2018 International Conference on Bangla Speech and Language Processing (ICBSLP) (pp. 1-6). IEEE.
  • Del Vigna, F., Cimino, A., Dell’Orletta, F., Petrocchi, M. and Tesconi, M., 2017, January. Hate me, hate me not: Hate speech detection on Facebook. In Proceedings of the First Italian Conference on Cybersecurity (ITASEC17) (pp. 86-95).
  • Neto, J.L., Freitas, A.A. and Kaestner, C.A., 2002. Automatic text summarization using a machine learning approach. In Advances in Artificial Intelligence: 16th Brazilian Symposium on Artificial Intelligence, SBIA 2002 Porto de Galinhas/Recife, Brazil, November 11–14, 2002 Proceedings 16 (pp. 205-215). Springer Berlin Heidelberg.
  • Murff, H.J., FitzHenry, F., Matheny, M.E., Gentry, N., Kotter, K.L., Crimin, K., Dittus, R.S., Rosen, A.K., Elkin, P.L., Brown, S.H. and Speroff, T., 2011. Automated identification of postoperative complications within an electronic medical record using natural language processing. JAMA, 306(8), pp.848-855.
  • Liu, X., Shin, H. and Burns, A.C., 2021. Examining the impact of luxury brand's social media marketing on customer engagement: Using big data analytics and natural language processing. Journal of Business Research, 125, pp.815-826.

 



©2024 BracU CSE Department