Exploring Semantic Spaces - Fundamentals #32
My question is non-technical. According to the text, human languages use a wide variety of features to convey meaning. And as the paper included this week shows, Caliskan and colleagues found that semantics derived automatically from language corpora contain human-like biases. So I'm curious whether we have techniques to eliminate social biases when constructing our logical representations of sentence meaning. I worry that the social biases embedded in human expression will aggravate social imbalances as automated techniques come into wide use in the near future.
Is there an established way to run word2vec or another embedding algorithm with attention weights, i.e., where the context words are given unequal weights depending on their relevance? I see Sonkar et al. (2020) attempt this, but I don't know how viable their approach is, whether it's coded up so we can use it (e.g., in Gensim), or whether it really makes a difference in the semantic space we build.
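For concreteness, here is a minimal sketch of the standard (unweighted) baseline using Gensim's Word2Vec; as far as I know, attention-style per-word context weighting is not built in, and nearer context words are only favored implicitly through window subsampling. The toy sentences are invented.

```python
# Standard skip-gram word2vec in Gensim: every sampled context word
# contributes equally; there is no learned per-word attention weight.
from gensim.models import Word2Vec

sentences = [["the", "king", "rules", "the", "realm"],
             ["the", "queen", "rules", "the", "realm"],
             ["the", "fool", "mocks", "the", "king"]]

model = Word2Vec(sentences, vector_size=100, window=5, min_count=1, sg=1)
print(model.wv.most_similar("king", topn=3))
```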
Can multilayer perceptron networks be used in unsupervised learning?
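As one concrete case of what I mean by "unsupervised": an autoencoder is just an MLP trained to reconstruct its own input, so no labels are needed. A rough sketch with scikit-learn on random stand-in data (sizes are arbitrary):

```python
# An MLP used without labels: train it to reproduce its input, then use
# the hidden layer as a learned low-dimensional representation.
import numpy as np
from sklearn.neural_network import MLPRegressor

X = np.random.rand(500, 20)                       # unlabeled data
ae = MLPRegressor(hidden_layer_sizes=(5,), max_iter=2000)
ae.fit(X, X)                                      # target = input, no labels

# Hidden activations (relu is MLPRegressor's default activation).
codes = np.maximum(0, X @ ae.coefs_[0] + ae.intercepts_[0])
print(codes.shape)                                # (500, 5)
```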
I'd like to ask your opinion on semantic parsing. Many consider it a bulky approach, much less efficient than neural networks, but it is intuitive from a human-reading perspective. Do you think it's an approach worth investigating?
How can we choose the best loss functions? If I understand your first lecture this week correctly, you mentioned that some are good for understanding texts and others are good for prediction. Is there a rule of thumb for choosing loss functions?
Has any CSS research used dynamic contextual embeddings (from an attention-based transformer model) recently? Given the superior performance of BERT and other transformer-based models on 'meaning'-related benchmarks, I'm wondering what exciting new CSS research paradigms these dynamic embeddings can power.
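In case it helps frame the question, here is a minimal sketch of extracting dynamic token embeddings with the Hugging Face transformers library (the model name and sentence are just placeholders):

```python
# Each token gets a vector that depends on its sentence context, unlike
# the single static vector per word type that word2vec produces.
import torch
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tok("The bank raised interest rates.", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs)

print(out.last_hidden_state.shape)  # (1, num_tokens, hidden_size)
```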
I understand the foundations and sentence-level functionality of semantic parsing as explained in this piece, but I was hoping to hear a bit more about the types of texts in which it works best and is most often implemented. Could you elaborate on some examples where semantic parsing is most often used and most useful (i.e., what differences might we see when texts are more or less correlated, or more linguistically consistent)?
Since the Zhuangzi quote is originally in Chinese, I am wondering whether the embedding models are applicable to Chinese, given that the concepts introduced in Chapter 6 do not straightforwardly carry over: Chinese characters can express meanings alone or in combination, and there are usually no spaces in a sentence.
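One preprocessing step I have seen for the no-whitespace issue is to segment sentences into words first (for example with the jieba library) and then train the embedding model on the resulting token lists; a rough sketch:

```python
# Word segmentation turns an unspaced Chinese sentence into a token list
# that word-level embedding models (e.g. Gensim's Word2Vec) can consume.
import jieba

sentence = "庄子与惠子游于濠梁之上"   # no whitespace between words
tokens = jieba.lcut(sentence)         # list of word-level tokens
print(tokens)
```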
I'm wondering whether tense logic would be easier to deal with in other languages. In particular, "temporal expressions in English are frequently expressed in spatial terms, as is illustrated by the various uses of at, in, somewhere, and near in these examples".
The problem of bias in embeddings appears again; it is an important topic in one of the exemplary readings. What practical problems does it create in analysis, and what techniques can reduce the bias when using embeddings?
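One technique I have seen mentioned is projection-based ("hard") debiasing in the spirit of Bolukbasi et al. (2016), where the component of a word vector along an estimated bias direction is removed. A rough numpy sketch of that single step, with random stand-in vectors:

```python
# Neutralize step: remove a word vector's component along an estimated
# bias direction (e.g. a normalized "he - she" difference vector).
import numpy as np

rng = np.random.default_rng(0)
v_word = rng.normal(size=50)                 # embedding of a target word (stand-in)
bias_dir = rng.normal(size=50)               # stand-in for an estimated bias axis
bias_dir /= np.linalg.norm(bias_dir)

v_debiased = v_word - (v_word @ bias_dir) * bias_dir   # project out the bias axis
print(v_debiased @ bias_dir)                 # ~0: no component left on that axis
```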
In “Vector Semantics”, why can we assume that "battle", "good", "fool", and "wit" are orthogonal to each other and assign each its own dimension, as in Figure 6.2?
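To make my question concrete, here is a toy version of what the figure seems to assume (the counts are invented): each word is its own axis of a term-document count space, so the word axes are orthogonal by construction rather than because the words' meanings are unrelated.

```python
# Documents as vectors in a space whose four axes are the four words;
# orthogonality is a property of the coordinate system, not of meaning.
import numpy as np

axes = ["battle", "good", "fool", "wit"]
doc_a = np.array([7, 62, 1, 2])     # invented word counts for one play
doc_b = np.array([1, 89, 58, 20])   # invented word counts for another

cosine = doc_a @ doc_b / (np.linalg.norm(doc_a) * np.linalg.norm(doc_b))
print(cosine)   # similarity of the two documents in this word space
```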
Is there any literature that examines the qualitative differences between embeddings generated by, say, LSA with 300 SVD components and a learned feedforward neural network representation layer?
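For reference, a sketch of the LSA side of that comparison using scikit-learn; 300 components is the conventional choice, though the toy corpus below only supports a couple.

```python
# Classic LSA: truncated SVD over a TF-IDF term-document matrix yields
# dense document and term embeddings without any neural training.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

docs = ["the battle was fierce and good",
        "the fool spoke with wit",
        "wit and folly in a battle of words"]

X = TfidfVectorizer().fit_transform(docs)    # documents x terms
lsa = TruncatedSVD(n_components=2).fit(X)    # ~300 in realistic settings

doc_vecs = lsa.transform(X)        # dense document embeddings
term_vecs = lsa.components_.T      # dense term embeddings
print(doc_vecs.shape, term_vecs.shape)
```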
The word2vec technique is awesome! The co-occurrence information brings in much richer second-order information about texts. I have a question about the assessment of word embeddings: what are some common metrics for evaluating the effectiveness of the learned embeddings?
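As a starting point, the intrinsic checks I am aware of are word-similarity correlation and analogy accuracy, both of which Gensim can compute against benchmark files it ships with (file and model names below are as I understand recent Gensim versions to provide them):

```python
# Two common intrinsic metrics: correlation with human similarity
# judgments (WordSim-353) and analogy-task accuracy.
import gensim.downloader as api
from gensim.test.utils import datapath

wv = api.load("glove-wiki-gigaword-50")   # small pretrained vectors (downloads once)

pearson, spearman, oov_ratio = wv.evaluate_word_pairs(datapath("wordsim353.tsv"))
analogy_score, sections = wv.evaluate_word_analogies(datapath("questions-words.txt"))
print(spearman, analogy_score)
```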
Extending @ming-cui's question, I wonder whether the usefulness of embedding models is stable across other kinds of languages.
Chapter 6 discusses attempts at debiasing, but notes that bias cannot be eliminated. Are there particular forms of bias that are especially difficult to reduce? What kinds of consequences can this have in applications of these methods?
If possible, could we go more in depth into how word embeddings are trained? Say we want to compare how two actors use the same term: should we build a corpus for each, train a word embedding model on each, and then find the closest words to our term of interest? Also, word embeddings are more memory-efficient than tf-idf or count vectors because, unlike the other two, embeddings are dense, yet they seem more computationally expensive because they require learning complex weights through stochastic gradient descent. How should we weigh these trade-offs when deciding which word representations to use?
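To make the first part concrete, here is roughly the workflow I am imagining, with invented toy corpora standing in for each actor's tokenized sentences:

```python
# One embedding model per actor's corpus, then compare the nearest
# neighbors of the shared term of interest in each semantic space.
from gensim.models import Word2Vec

corpus_a = [["climate", "change", "is", "a", "hoax"],
            ["climate", "policy", "hurts", "jobs"]]
corpus_b = [["climate", "change", "is", "an", "emergency"],
            ["climate", "policy", "saves", "lives"]]

model_a = Word2Vec(corpus_a, vector_size=50, window=3, min_count=1)
model_b = Word2Vec(corpus_b, vector_size=50, window=3, min_count=1)

# With real corpora, these neighbor lists show how usage differs by actor.
print(model_a.wv.most_similar("climate", topn=3))
print(model_b.wv.most_similar("climate", topn=3))
```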
For this week’s reading, I noticed that it is necessary to generate training data for transition-based dependency parsing. What is an “appropriate” training-set size for obtaining a reliable model? Would the algorithm be robust if we are not able to provide enough data?
I would like to hear more about unsupervised vs. supervised neural nets and their different uses in text analysis. Also, for a classification problem, is there a reason besides interpretability why a researcher would choose logistic regression over a neural network? And it would be great if you could talk more about the word2vec representation of words and how its output is processed by neural networks.
Given the extra computational effort required, in what contexts do word embeddings provide insight above and beyond more straightforward analyses of co-occurrences, such as looking at k-grams? Are there cases where it actually makes more sense not to use a computationally intensive method such as word2vec?
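For comparison, the "straightforward" co-occurrence analysis I have in mind is just counting within a fixed window, which needs no training at all; a quick sketch:

```python
# Raw co-occurrence counts within a +/- 2 word window; nothing is learned,
# unlike word2vec's gradient-trained dense vectors.
from collections import Counter

sentences = [["the", "king", "rules", "the", "realm"],
             ["the", "queen", "rules", "the", "realm"]]
window = 2

cooc = Counter()
for sent in sentences:
    for i, word in enumerate(sent):
        lo, hi = max(0, i - window), min(len(sent), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                cooc[(word, sent[j])] += 1

print(cooc[("king", "rules")])
```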
Post questions here for one or more of our fundamentals readings:
Jurafsky, Daniel and James H. Martin. 2015. Speech and Language Processing. Chapters 15-16 (“Vector Semantics”, “Semantics with Dense Vectors”)