Deep Classification, Embedding & Text Generation - Orientation #33
Comments
LeCun et al. make a strong case for the power of deep learning in a wide array of applications. My question is simple: if recent advancements in neural networks and deep learning have allowed these algorithms to outperform their classical supervised learning peers in classification tasks, what is the main reason analysts/researchers like us might still prefer the latter to the former in our applications?
All the authors presented the usage of deep learning in NLP very clearly, also mentioning potential (or already implemented) applications of these techniques. It seems to me, though, that as methods in the CS community grow more and more sophisticated, it becomes harder to modify them to fit social science purposes. For example, Hopkins and King (2010) suggested a different objective function for classification. Are there any recent modifications of deep learning methods that directly target social science objectives? Also, do you view newly developed deep learning models as bigger "black boxes" that might be great for prediction, yet very hard to interpret?
It was interesting to see the examples of text generated by deep learning algorithms in Karpathy (2015). All of the readings seemed to agree that a corpus composed of more texts is better for training models, and the increased availability of computing resources makes these large corpora more feasible. My question, then, is at what point is it no longer helpful to add more text to the training base? How does the agreed-upon definition of "satisfactory" differ across applications?
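One common heuristic for the "when is more text no longer helpful" question is to plot a learning curve of held-out loss against training-set size and stop once the marginal gain from more data drops below a tolerance. A minimal sketch, where the learning curve is *simulated* by a made-up power law (the constants `a`, `b`, `c` and the tolerance are illustrative assumptions, not measurements from any model):

```python
def heldout_loss(n_tokens, a=10.0, b=0.5, c=1.5):
    # Hypothetical power-law learning curve: held-out loss falls as
    # a * n^(-b) toward an irreducible floor c (all constants made up).
    return a * n_tokens ** (-b) + c

def enough_data(sizes, tol=0.01):
    # Return the first corpus size at which *doubling* the data would
    # improve held-out loss by less than `tol` -- a diminishing-returns cutoff.
    for n in sizes:
        if heldout_loss(n) - heldout_loss(2 * n) < tol:
            return n
    return None

sizes = [10_000 * 2 ** k for k in range(10)]
print(enough_data(sizes))  # 160000 under these toy constants
```

In practice one would estimate the curve empirically by retraining on nested subsamples of the corpus, and the tolerance would depend on what "satisfactory" means for the application at hand.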
Social science applications of deep learning seem to use certain metrics pulled from the model, e.g., perplexity from BERT. Can you list the available metrics and, ideally, a publication using each of them in a social science context? (The exception to this seems to be "digital doubles," which make use of the whole model.)
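For reference, perplexity itself is just the exponential of the average negative log-probability the model assigns to each token. A minimal sketch, assuming we have already extracted per-token probabilities from some language model (the numbers below are hypothetical, not from BERT):

```python
import math

def perplexity(token_probs):
    # Perplexity = exp(mean negative log-probability per token).
    # Lower means the model finds the text less "surprising".
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

# Hypothetical per-token probabilities a model assigned to a short sentence.
probs = [0.2, 0.5, 0.1, 0.4]
print(round(perplexity(probs), 3))  # 3.976
```

A sanity check on the definition: if the model assigns every token probability 1/4, perplexity is exactly 4, i.e. the model is "choosing uniformly among four options" at each step.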
In their conclusion, LeCun et al. write that they "expect unsupervised learning to become far more important in the longer term."
Extending @RobertoBarrosoLuque's question: what are the limitations of deep learning methods compared to classical machine learning methods?
Deep learning is really popular these days, but I have been worried about the hidden layers of deep learning algorithms for a while: though they can provide really accurate predictions, how can we be sure this method will continue to work in the future, or across various applications, if we don't really understand it? For example, is the failure of Google Flu Trends an example of the failure of this unknown black box?
In Andrej Karpathy's blog post "The Unreasonable Effectiveness of Recurrent Neural Networks", the RNN models produce strange results on Wikipedia and literary works. Obviously, the results are not fully satisfactory. So I'm wondering whether this is a limitation of RNNs themselves, or whether this article is just an exception to the application of RNNs. Can RNNs perform well in more general situations? If so, could you please provide some examples of applications where RNNs have shown excellent performance? As for the OpenAI post, I am really glad to see that they have given attention to the potential abuse this powerful model could enable. They take a very cautious approach to this issue, which I think many scholars and researchers can learn from. While unleashing human ingenuity on technology, we should also be careful about the potential risks we may have to take and face!
Twitter is already crowded with bots that generate posts and activity by themselves. Even if organizations like OpenAI don't release their models, the technology will eventually emerge elsewhere and could fall into the wrong hands. How do you think we should deal with the potential problem of massive bot-generated misinformation on the internet, specifically on social media?
As researchers, we should be wary of the power and convenience brought by these extremely large neural networks; there is no free lunch. We should be careful about how we interpret results from black-box models. Isn't interpretable neural network research a more important avenue for social scientists than the benchmark-driven work at ML conferences?
Engineering NLP models has attracted more attention from text analysts than the cognitive/psycholinguistic viewpoint. Are there areas of social content analysis you think would benefit from prioritising this viewpoint? Are there any noteworthy perspectives from psycholinguistics that have been integrated into mainstream text analysis?
Through unsupervised learning and deep learning, we can capture patterns in data and predict from them. My concern is that, even as prediction accuracy improves, we don't actually understand the patterns or the causal relationships well, and it seems very possible to make mistakes when predicting inside a black box. How do you weigh the value of such techniques against their dangers? And will a science emerge that aims specifically at diagnosing such potential mistakes?
As we saw in the HW this week, deep learning algorithms and training seem to be extremely computationally intensive. I'm wondering: do you have any advice for achieving professional-level results, such as the algorithms in these readings, without professional-level hardware? Also, do you think that further developments will make these algorithms more accessible, or is it their nature to demand more and more computational power as they advance (meaning we will simply need better machines)?
I'm wondering how LSTM networks have evolved in recent years and whether alternative approaches exist. Since the paper concludes that "theoretical and empirical evidence shows that it is difficult to learn to store information for very long", would you mind explaining why this long-term storage might be a problem for machine translation? What useful ideas or mechanisms can we explore in this case?
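The LSTM's answer to the long-term storage problem is an *additive* cell state controlled by gates: the cell update is c' = f·c + i·g, so information persists across many steps whenever the forget gate f stays near 1. A scalar toy sketch (all weights are made-up values chosen to illustrate the mechanism, not a trained model):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_cell_step(x, h, c, w):
    # One scalar LSTM step. The cell state c is updated additively,
    # so information can survive many steps when the forget gate f ~ 1.
    f = sigmoid(w["f"] * x + w["uf"] * h + w["bf"])    # forget gate
    i = sigmoid(w["i"] * x + w["ui"] * h + w["bi"])    # input gate
    g = math.tanh(w["g"] * x + w["ug"] * h + w["bg"])  # candidate value
    o = sigmoid(w["o"] * x + w["uo"] * h + w["bo"])    # output gate
    c = f * c + i * g
    h = o * math.tanh(c)
    return h, c

# Toy weights: a large forget-gate bias keeps f near 1, so the initial
# cell state survives 100 steps of zero input with well under 1% decay.
w = dict(f=0, uf=0, bf=10, i=0, ui=0, bi=-5, g=0, ug=0, bg=0, o=0, uo=0, bo=0)
h, c = 0.0, 1.0
for _ in range(100):
    h, c = lstm_cell_step(0.0, h, c, w)
print(round(c, 3))  # 0.995 -- the "memory" persisted
```

This is also why the long-range difficulty matters for machine translation: producing a correct output word can depend on a subject or gender marker seen many tokens earlier, so the network must carry that information across the whole sentence.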
How can we use deep learning for causal inference? What are the advantages and disadvantages of deep learning for this particular purpose?
Deep learning methods are very powerful and have huge potential to solve great problems. What do you think are the key reasons behind their success? What is special about 'deep' versus 'shallow' learning? Thanks!
Will it be possible for deep learning to predict human behavior someday? Human behavior is becoming digitalized with the increasing use of digital devices (such as the Apple Watch). With millions of behavioral, physiological, and psychological records at hand, it seems that deep learning or some new technique has the potential to predict human behavior in the future. Is this a concern or an opportunity for the social sciences?
My question is similar to @lilygrier's: how do we decide when a corpus is large enough?
It is interesting to see, in Karpathy's blog post, how an RNN is implemented to generate text character by character. Although the author highlighted the effectiveness of text generation using RNNs, how can we evaluate whether the outputs of an RNN are successful?
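Setting the evaluation question aside, the character-by-character sampling mechanism itself can be illustrated with a much simpler stand-in: a bigram frequency table instead of an RNN. Each next character is drawn from the distribution of characters observed to follow the current one. A toy sketch (the training string is arbitrary; a real char-RNN conditions on the whole history, not just the last character):

```python
import random
from collections import defaultdict, Counter

def train_bigram(text):
    # Count, for each character, which characters follow it.
    counts = defaultdict(Counter)
    for a, b in zip(text, text[1:]):
        counts[a][b] += 1
    return counts

def generate(counts, start, n, seed=0):
    # Sample character by character, as Karpathy's char-RNN does --
    # except the "model" here is just a bigram frequency table.
    rng = random.Random(seed)
    out = [start]
    for _ in range(n):
        nxt = counts.get(out[-1])
        if not nxt:
            break  # no observed successor for this character
        chars, weights = zip(*nxt.items())
        out.append(rng.choices(chars, weights=weights)[0])
    return "".join(out)

model = train_bigram("the theory then thawed")
print(generate(model, "t", 20))
```

The output is locally plausible but globally incoherent, which is exactly the gap the evaluation question points at: per-character likelihood (or perplexity) rewards local plausibility, while judging global coherence usually still requires human raters or downstream-task performance.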
Thinking back to the discussion in week 4 about predictive models that are powerful but difficult to interpret, that seems to be even more the case for deep learning. What does this mean for the kinds of questions that deep learning methodologies are suited to answer? Are there instances in which less powerful models could actually be more analytically useful?
What would be some reasons for using simpler models rather than complex RNNs or CNNs for text analysis? Should complexity and interpretability be a concern for our text analysis models, or are we always interested in better performance and accuracy?
When working with text that is not in English, what models can we use? What risks and limitations in training our models (e.g., not enough computational power) should we take into account?
LeCun, Yann, Yoshua Bengio & Geoffrey Hinton. 2015. “Deep Learning.” Nature 521: 436-444.
Karpathy, Andrej. 2015. “The Unreasonable Effectiveness of Recurrent Neural Networks.” Blog post.
OpenAI. 2019. “Better Language Models and Their Implications.” Blog post.