
Exploring Semantic Spaces - Fundamentals #32

Open
HyunkuKwon opened this issue Jan 12, 2021 · 19 comments

Comments

@HyunkuKwon
Collaborator

Post questions here for one or more of our fundamentals readings:

Jurafsky, Daniel and James H. Martin. 2015. Speech and Language Processing. Chapters 15-16 (“Vector Semantics”, “Semantics with Dense Vectors”)

@Raychanan

My question is non-technical. According to the text, human languages have a wide variety of features that are used to convey meaning. As the Caliskan et al. paper included this week shows, semantics derived automatically from language corpora contain human-like biases. So I'm curious whether we have techniques to eliminate social biases when constructing our logical representations of sentence meaning. I'm afraid the social biases embedded in human expression will aggravate social imbalances as automated techniques come into wide use in the near future.

@jacyanthis

Is there an established way to run word2vec or another embedding algorithm with attention weights, i.e., where the context words are given unequal weights depending on their relevance? I see Sonkar et al. 2020 attempt this, but I don't know how viable their approach is, whether it's coded up so we can use it (e.g., in Gensim), or whether it really makes a difference in the semantic space we build.
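As a point of reference, the core idea of unequal context weights can be sketched without any library. The toy pure-Python example below weights context words by inverse distance from the center word, a crude stand-in for learned attention (standard CBOW averages them uniformly). This is not Sonkar et al.'s method and not a Gensim API, just an illustration of the idea:

```python
def weighted_context_vector(vectors, center_idx, tokens, window=2):
    """Average the embeddings of the context words, weighting each by
    1/distance from the center word (a simple stand-in for learned
    attention weights; plain CBOW would weight them all equally)."""
    dim = len(next(iter(vectors.values())))
    acc = [0.0] * dim
    total = 0.0
    for j in range(max(0, center_idx - window),
                   min(len(tokens), center_idx + window + 1)):
        if j == center_idx:
            continue
        w = 1.0 / abs(j - center_idx)   # closer words count more
        vec = vectors[tokens[j]]
        acc = [a + w * v for a, v in zip(acc, vec)]
        total += w
    return [a / total for a in acc]

# Made-up 2-d embeddings, purely for illustration
toy = {"the": [1.0, 0.0], "cat": [0.0, 1.0], "sat": [1.0, 1.0]}
ctx = weighted_context_vector(toy, 1, ["the", "cat", "sat"])
print(ctx)  # [1.0, 0.5]
```

A learned attention mechanism would replace the fixed 1/distance weights with weights computed from the vectors themselves.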

@xxicheng

Can multilayer perceptron networks be used in unsupervised learning?

@romanticmonkey

I'd like to ask your opinion on semantic parsing. Many think that semantic parsing is a bulky approach, much less efficient than neural networks, but it is intuitive to human reading. Do you think it's an approach worth investigating?

@jinfei1125

How can we choose the best loss functions? If I understand your first lecture this week correctly, you mentioned some are good at understanding the texts, and others are good at predictions. Is there any rule of thumb when choosing loss functions?

@k-partha

Has any CSS research used dynamic contextual embeddings (from an attention-based transformer model) recently? Given the superior performance of BERT and transformer-based models on 'meaning' related benchmarks, I'm wondering what exciting new CSS research paradigms these dynamic embeddings can power.

@sabinahartnett

I understand the foundations and sentence-level functionality of semantic parsing as explained in this piece, but I was hoping to hear a bit more about the types of texts on which it functions best and is most often implemented. Could you elaborate on some examples where semantic parsing is most often used and most useful (i.e., what differences might we see when texts are more or less correlated, or more linguistically consistent)?

@ming-cui

ming-cui commented Feb 26, 2021

As the quote from Zhuangzi is originally in Chinese, I am wondering whether the embedding models are applicable to Chinese, given that some assumptions in Chapter 6 do not hold for Chinese: characters can express meanings alone or in combination, and there are usually no spaces in a sentence.
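For what it's worth, the usual workaround is to segment Chinese text into words before training embeddings (or to train on characters directly). A minimal greedy maximum-matching segmenter, assuming a small hypothetical dictionary, can be sketched as below; production tools such as jieba use more sophisticated methods:

```python
def max_match(text, vocab, max_len=4):
    """Greedy longest-match-first segmentation: at each position take
    the longest dictionary word that matches, else a single character.
    A simple baseline, not what production segmenters actually do."""
    tokens, i = [], 0
    while i < len(text):
        for L in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + L]
            if L == 1 or piece in vocab:
                tokens.append(piece)
                i += L
                break
    return tokens

vocab = {"北京", "大学", "北京大学"}
print(max_match("北京大学", vocab))  # ['北京大学']
print(max_match("北京大", vocab))    # ['北京', '大']
```

Once the text is segmented, the resulting token lists can be fed to any embedding trainer exactly as space-delimited English would be.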

@MOTOKU666

I'm wondering whether tense logic would be easier to deal with in other languages. In particular, "temporal expressions in English are frequently expressed in spatial terms, as is illustrated by the various uses of at, in, somewhere, and near in these examples".

@Rui-echo-Pan

The problem of bias in embeddings appears again, which is an important topic in one of the exemplary readings. What practical problems does it cause in analysis, and what techniques can reduce the bias when using embeddings?

@zshibing1

In “Vector Semantics”, why can we assume that "battle", "good", "fool", "wit" are orthogonal to each other and assign them four distinctive dimensions, as in figure 6.2?
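One way to see why the orthogonality is harmless: each word type is simply assigned its own axis of the term-document count matrix by construction, with no claim that the four meanings are unrelated; similarity is then measured by comparing count vectors, e.g. with cosine. A toy sketch with made-up counts (not the textbook's actual figures):

```python
import math

def cosine(u, v):
    """Cosine similarity between two count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Hypothetical counts of "battle", "good", "fool", "wit" in two plays.
# Each word is its own axis by construction; no semantic claim is made.
as_you_like_it = [1, 114, 36, 20]
julius_caesar  = [7, 62, 1, 2]
print(round(cosine(as_you_like_it, julius_caesar), 3))  # 0.945
```

The dimensions only serve as a coordinate system; any semantic relatedness between the words shows up later, in how documents (or contexts) distribute over those axes.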

@Bin-ary-Li

Is there any literature that examines the qualitative difference between embeddings generated by, say, LSA with 300 SVD components and a learned feedforward neural network representation layer?

@hesongrun

hesongrun commented Feb 26, 2021

The word2vec technique is awesome! The co-occurrence information brings in much richer second-order information of texts! I have a question about the assessment of word-embeddings. What are some common metrics to assess the effectiveness of the embeddings learnt?
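On intrinsic evaluation, one common metric is the Spearman rank correlation between the model's similarity scores and human similarity judgments on benchmark word pairs (e.g. WordSim-353 or SimLex-999); analogy accuracy is another. A self-contained sketch of the Spearman step, with hypothetical ratings:

```python
def spearman(xs, ys):
    """Spearman rank correlation (assuming no ties): correlate model
    similarity scores with human judgments over the same word pairs."""
    def ranks(vals):
        order = sorted(range(len(vals)), key=lambda i: vals[i])
        r = [0] * len(vals)
        for rank, i in enumerate(order):
            r[i] = rank
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Hypothetical human ratings vs. model cosine similarities, 5 word pairs
human = [9.2, 8.5, 6.0, 3.1, 1.0]
model = [0.81, 0.90, 0.55, 0.20, 0.05]
print(spearman(human, model))  # 0.9
```

A higher correlation means the embedding space orders word pairs the way human raters do; extrinsic evaluation (performance on a downstream task) is the usual complement.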

@william-wei-zhu

william-wei-zhu commented Feb 26, 2021

Extending from @ming-cui, I wonder whether the usefulness of embedding models is stable across other languages.

@theoevans1

Chapter 6 discusses attempts at debiasing, but notes that bias cannot be eliminated. Are there particular forms of bias that are especially difficult to reduce? What kinds of consequences can this have in applications of these methods?

@RobertoBarrosoLuque

RobertoBarrosoLuque commented Feb 26, 2021

If possible, could we go more in depth into how word embeddings are trained? Say we want to compare how two actors use the same term: should we have a corpus for each, train a word embedding on each, and then find the closest words to our term of interest?

Also, using word embeddings is more memory efficient than using tf-idf or count vectors since, unlike the other two, embeddings are dense, yet they seem more computationally inefficient since they require learning complex weights through stochastic gradient descent. How should we weigh these trade-offs when thinking about which word representations to use?
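On the first question, a common recipe is indeed to train one model per corpus (e.g. with Gensim's Word2Vec) and compare the term's nearest neighbors across the two spaces. The neighbor-lookup step can be sketched in plain Python, with made-up vectors standing in for the trained embeddings:

```python
import math

def nearest(word, vectors, k=2):
    """Rank all other words by cosine similarity to `word` -- the
    'closest words to our term of interest' step of the comparison."""
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        return dot / (math.sqrt(sum(a * a for a in u)) *
                      math.sqrt(sum(b * b for b in v)))
    target = vectors[word]
    scored = [(w, cos(target, v)) for w, v in vectors.items() if w != word]
    return [w for w, _ in sorted(scored, key=lambda p: -p[1])][:k]

# Hypothetical 2-d embeddings, as if trained separately per actor
actor_a = {"tax": [1.0, 0.1], "burden": [0.9, 0.2], "revenue": [0.2, 1.0]}
actor_b = {"tax": [0.1, 1.0], "burden": [1.0, 0.1], "revenue": [0.2, 0.9]}
print(nearest("tax", actor_a))  # ['burden', 'revenue']
print(nearest("tax", actor_b))  # ['revenue', 'burden']
```

Diverging neighbor lists like these are exactly the kind of signal the two-corpus comparison is after; with real data one would also want to check that each corpus is large enough for stable embeddings.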

@mingtao-gao

For this week’s reading, I noticed that it is necessary to generate training data for the transition-based dependency parsing. I wonder what is the “appropriate” size of the training set in order to obtain a reliable model? Would this algorithm be robust if we are not able to provide enough information?

@egemenpamukcu

I would like to hear more about unsupervised vs. supervised neural nets and their different uses in text analysis. Also, for a classification problem, is there a reason besides interpretability why a researcher would choose logistic regression over a neural network? And it would be great if you could talk more about the word2vec representation of words and how its results are processed by neural networks.

@lilygrier

lilygrier commented Feb 26, 2021

Given the extra computational effort required, in what contexts do word embeddings provide insight above and beyond more straightforward analyses of co-occurrence through looking at k-grams? Are there cases where it actually makes more sense not to use a computationally intensive method such as word2vec?
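For comparison, the count-based baseline really is cheap: a raw co-occurrence table needs only a single pass over the corpus and no gradient training. A minimal sketch:

```python
from collections import Counter

def cooccurrence_counts(tokens, window=2):
    """Count how often each ordered word pair co-occurs within a
    +/-window context -- the cheap count-based baseline to try before
    paying for gradient-based embedding training."""
    counts = Counter()
    for i, w in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if i != j:
                counts[(w, tokens[j])] += 1
    return counts

toks = "the cat sat on the mat".split()
c = cooccurrence_counts(toks)
print(c[("cat", "sat")])  # 1
```

Reweighting these counts (e.g. with PPMI) often gets surprisingly close to embedding quality on small, focused corpora, which is one practical answer to when the extra effort may not pay off.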
