Hi, I am Dwi Setyo Aji from Indonesia. I am currently doing my bachelor's degree research in Communication Science at UPN Veteran Jawa Timur. You may wonder why a communication student is building artificial intelligence, so let me explain briefly. Lasswell, in Littlejohn (2011:334), defines communication as "Who, Says What, In Which Channel, To Whom, With What Effect". I am doing this project because of the "Says What" part. The message is crucial in communication, especially in corporate settings. We focus on text messages and analyze only that object. This method is called "content analysis": we analyze the content of the message, which in this scenario is text.
We already have large amounts of text data, but we rarely look at them closely because automation like this is not commonly built in our field of study. We live in an era where there is more data than we could read in a lifetime. Automating this work with Natural Language Processing needs more research if we want to improve the quality of Communication Science. I am ready to take on whatever difficulties I will face.
In this project I will primarily use artificial intelligence, especially Natural Language Processing libraries: TensorFlow, scikit-learn, spaCy, and gensim, plus a few others such as pandas, NumPy, Matplotlib, NetworkX, Plotly, and pyvis. I will use PyCharm as my primary IDE.
The first task I will complete is topic classification with a Bidirectional Long Short-Term Memory (BiLSTM) neural network architecture[1]. This is a multiclass classification problem. Companies need to classify text so they can take a different approach to each kind of problem. From a digital humanities perspective, this idea comes from what we call Public Relations: building a trusted relationship with the public requires a good understanding of the text data. Every customer message needs to be classified into the correct category. For example, a complaint will be processed differently from other kinds of customer messages.
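As a rough illustration of this first task, the sketch below builds a small BiLSTM multiclass classifier with Keras. The vocabulary size, sequence length, number of classes, layer sizes, and the random training data are all placeholder assumptions, not values from this project.

```python
import numpy as np
import tensorflow as tf

# Placeholder settings (assumptions for illustration only).
NUM_CLASSES = 3    # e.g. three message categories
VOCAB_SIZE = 1000  # number of distinct token ids
MAX_LEN = 20       # tokens per message after padding

model = tf.keras.Sequential([
    # Map each token id to a dense vector.
    tf.keras.layers.Embedding(VOCAB_SIZE, 32),
    # Read the sequence in both directions and keep the final states.
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(16)),
    # One probability per class.
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",  # labels are integer class ids
    metrics=["accuracy"],
)

# Random integer sequences stand in for tokenized customer messages.
rng = np.random.default_rng(0)
x = rng.integers(1, VOCAB_SIZE, size=(8, MAX_LEN))
y = rng.integers(0, NUM_CLASSES, size=(8,))

model.fit(x, y, epochs=1, verbose=0)
probs = model.predict(x, verbose=0)  # shape: (8, NUM_CLASSES)
predicted_class = probs.argmax(axis=1)
```

With real data, the random arrays would be replaced by padded token-id sequences and their category labels.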
The second task is sentiment analysis with Bidirectional Encoder Representations from Transformers (BERT)[2]. This task is binary classification only, because that makes a public relations crisis much easier to detect.
The third task is to extract topics from all customer messages with an unsupervised Latent Dirichlet Allocation (LDA)[3] model from gensim[4]. We do this to better understand what is happening and which topics people associate with the research subject. With this kind of machine learning we can see otherwise unseen patterns in the messages[5].
The fourth task is relation extraction. First, though, we need to train and use Named Entity Recognition (NER). This will be the most complicated task, since NER for Bahasa Indonesia is still difficult to train. Honestly, I would say this is almost impossible for me to achieve, since I know nothing about this NLP task.
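As a starting point for the NER step only (not relation extraction), here is a small sketch that trains a spaCy NER component on a blank Indonesian pipeline. The two training sentences, the single `ORG` label, and the number of training rounds are assumptions for illustration; a usable model would need far more annotated data.

```python
import random

import spacy
from spacy.training import Example

# Blank Indonesian pipeline: tokenizer only, no pretrained components.
nlp = spacy.blank("id")
ner = nlp.add_pipe("ner")
ner.add_label("ORG")

# Hypothetical training data: text plus (start_char, end_char, label) spans.
TRAIN_DATA = [
    ("Saya membeli paket dari Telkomsel", {"entities": [(24, 33, "ORG")]}),
    ("Saya menghubungi Tokopedia kemarin", {"entities": [(17, 26, "ORG")]}),
]

examples = [
    Example.from_dict(nlp.make_doc(text), annotations)
    for text, annotations in TRAIN_DATA
]

# Initialize the model weights using the training examples.
optimizer = nlp.initialize(lambda: examples)

for _ in range(20):  # assumed number of training rounds
    random.shuffle(examples)
    losses = {}
    nlp.update(examples, sgd=optimizer, losses=losses)

# Run the trained pipeline on a sentence and list the entities it finds.
doc = nlp("Saya membeli paket dari Telkomsel")
print([(ent.text, ent.label_) for ent in doc.ents])
```

The recognized entities would then become the inputs for the relation extraction step.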
If time allows, I will also map the network graph[6]. In this scenario we can see who the most powerful person is, the one who influences sentiment on social media. To visualize the network we will use NetworkX and pyvis[7].
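One possible way to find the most influential account is to build a directed graph of who mentions whom and rank nodes by in-degree centrality. The sketch below does this with NetworkX; the account names and interactions are invented, and in-degree centrality is only one assumed measure of influence.

```python
import networkx as nx

# Hypothetical (author, mentioned_account) pairs from social media posts.
interactions = [
    ("user_a", "brand"),
    ("user_b", "brand"),
    ("user_c", "user_a"),
    ("user_d", "user_a"),
    ("user_e", "user_a"),
    ("user_b", "user_a"),
]

# An edge u -> v means "u mentioned v".
graph = nx.DiGraph()
graph.add_edges_from(interactions)

# In-degree centrality: fraction of other nodes that point at this node.
centrality = nx.in_degree_centrality(graph)

# The account mentioned by the largest share of other accounts.
most_influential = max(centrality, key=centrality.get)
print(most_influential, centrality[most_influential])
```

For an interactive view, a graph like this can be loaded into pyvis with `Network.from_nx`.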
[1]https://www.analyticsvidhya.com/blog/2021/06/lstm-for-text-classification/
[2]https://www.tensorflow.org/text/tutorials/classify_text_with_bert
[3]https://towardsdatascience.com/latent-dirichlet-allocation-lda-9d1cd064ffa2
[4]https://radimrehurek.com/gensim/models/ldamodel.html
[5]Mattingly, William. Introduction to Topic Modeling and Text Classification, 2021. topic-modeling.pythonhumanities.com.
[6]Aric A. Hagberg, Daniel A. Schult and Pieter J. Swart, “Exploring network structure, dynamics, and function using NetworkX”, in Proceedings of the 7th Python in Science Conference (SciPy2008), Gäel Varoquaux, Travis Vaught, and Jarrod Millman (Eds), (Pasadena, CA USA), pp. 11–15, Aug 2008