Question regarding Ternary operator in `clocal_exp_editsim` #23

KeitaW · 2020-05-31T23:48:58Z

I got the following question from https://github.com/rcojocaru.

"""
I have a short question about the code. In editsim.py --> clocal_exp_editsim(_withbp) you have this:

for col1 in range(nrow):
for col2 in range(ncol):
match = 0
for row in range(nneuron):
match += mat1[row, col1] * mat2[row, col2]
match = -10 if match == 0 else match
...
dp[col1+1, col2+1] = max4(
0,
down_score,
right_score,
dp[col1, col2] + match

Do you remember why you introduced this if clause for the case in which match is 0? I think it can have radical effects on the edit similarity score. For example, even if comparing identical sequences, instead of getting the maximum edit similarity score, the result would be alpha dependent because of this if clause. I would really appreciate your input before modifying core things.
"""

The text was updated successfully, but these errors were encountered:

KeitaW · 2020-05-31T23:50:54Z

This change was intentional. Let me explain why I made this change.
https://github.com/KeitaW/spykesim/blob/f095c24e20ef07a1029f422344f02ed0741059cc/spykesim/editsim.pyx#L291-L317

KeitaW · 2020-05-31T23:55:38Z

match == 0 means there is no neuron that fires in both time windows. In that case, we would like to penalize the score by either the down_score or right_score but since the minimum of the match score is zero, those two values are not considered in the previous version.
The ternary operator was introduced to circumvent the problem.

rcojocaru · 2020-06-22T04:58:33Z

I am still not convinced about this.

"match == 0 means there is no neuron that fires in both time windows" --> as I understand it, match is the scalar product between two specific columns of the time windows; so, if this is 0, it might mean there's no match between them (different neurons firing in one and the other) or no activity at all in those two columns (no neuron firing). Maybe we should penalize the first situation, but not the second one.

For example if you compare two identical windows of 100 ms with 10 neurons firing, this condition will systematically reduce the edit similarity score for these two identical windows - the reduction will depend on alpha. It is true that for low enough alpha the reduction might not be significant. But this is the main thing that bothers me, that even identical windows with sufficient activity can end up with lower scores, because it is penalizing "holes" or gaps in the profile.

Is it a good idea for this penalty to be fixed/always used? Maybe it should be parameterized and applied in function of the data set.

KeitaW · 2020-07-01T22:59:47Z

Thank you for the comment.

Yeah, I understand what you mean. We can of course do something like below:

nspikes1 = 0
nspikes2 = 0
match = 0
for row in range(nneuron):
    nspikes1 += mat1[row, col1]
    nspikes2 += mat2[row, col2]
    match += mat1[row, col1] * mat2[row, col2]  
match = -10 if match == 0 else match
if nspikes1 == 0 or nspikes2 == 0:
    down[col1+1, col2+1] = down[col1, col2+1]
    down_score = dp[col1-down[col1+1, col2+1]+1, col2+1] - cexp(a, down[col1+1, col2+1]) + 1
    # For Rightp
    right[col1+1, col2+1] = right[col1+1, col2]
    right_score = dp[col1+1, col2-right[col1+1, col2+1]+1] - cexp(a, right[col1+1, col2+1]) + 1
else:
    # For Down
    # Comparing options: newly extend a gap or extend the exsinting gap
    dirpair = pairmax2(
        dp[col1, col2+1] - cexp(a, 1) + 1,
        dp[col1-down[col1, col2+1], col2+1] - cexp(a, down[col1, col2+1] + 1) + 1
    )
    down[col1+1, col2+1] = 1 if dirpair.idx == 0 else down[col1, col2+1] + 1
    down_score = dp[col1-down[col1+1, col2+1]+1, col2+1] - cexp(a, down[col1+1, col2+1]) + 1
    # For Rightp
    # Comparing options: newly extend a gap or extend the exsinting gap
    dirpair = pairmax2(
        dp[col1+1, col2] - cexp(a, 1) + 1,
        dp[col1+1, col2-right[col1+1, col2]] - cexp(a, right[col1+1, col2] + 1) + 1
    )
    right[col1+1, col2+1] = 1 if dirpair.idx == 0 else right[col1+1, col2] + 1
    right_score = dp[col1+1, col2-right[col1+1, col2+1]+1] - cexp(a, right[col1+1, col2+1]) + 1
# Update dp
pair = pairmax4(
    0,
    down_score,
    right_score,
    dp[col1, col2] + match
)
bp[col1 + 1, col2 + 1] = pair.idx
dp[col1 + 1, col2 + 1] = pair.value

to avoid penalization when there is no spikes in a certain column.
How do you think?

rcojocaru · 2020-07-02T02:03:24Z

Thank you for looking into it. I think this solution could be a good compromise for the cases I am worried about (my initial solution was to remove this zero-match penalty completely, but that might be too drastic). Let me test this new code on some of the data I had issues with and get back to you!

rcojocaru · 2020-07-14T08:27:22Z

Hi. I checked, and this version of the code does what I was asking for. Still, this modification can significantly change the range of edit similarity values one can obtain on a given dataset, so maybe before committing to it (if you plan to modify the repository) it could be good to have another parameter that chooses this version or the previous one. For reproducibility if nothing else.

I was thinking something about like:

def clocal_exp_editsim(DBL_C[:, :] mat1, DBL_C[:, :] mat2, DBL_C a = 0.01, zerospikes_penalty = False):
....
if (nspikes1 == 0 or nspikes2 == 0) and not zerospikes_penalty:
...

same for clocal_exp_editsim_withbp. Please check if this makes sense and feel free to close the issue, I consider it solved. Thank you!

KeitaW · 2020-07-15T01:30:46Z

Thanks @rcojocaru for testing the idea. I agree that we should have an option to enable/disable the modification.
Okay, let me make a PR that introduce the change shortly.

KeitaW self-assigned this May 31, 2020

KeitaW added the question Further information is requested label May 31, 2020

KeitaW mentioned this issue Jul 25, 2020

Add zerospikes_penalty argument #26

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Question regarding Ternary operator in `clocal_exp_editsim` #23

Question regarding Ternary operator in `clocal_exp_editsim` #23

KeitaW commented May 31, 2020

KeitaW commented May 31, 2020 •

edited

Loading

KeitaW commented May 31, 2020

rcojocaru commented Jun 22, 2020 •

edited

Loading

KeitaW commented Jul 1, 2020

rcojocaru commented Jul 2, 2020

rcojocaru commented Jul 14, 2020

KeitaW commented Jul 15, 2020

Question regarding Ternary operator in clocal_exp_editsim #23

Question regarding Ternary operator in clocal_exp_editsim #23

Comments

KeitaW commented May 31, 2020

KeitaW commented May 31, 2020 • edited Loading

KeitaW commented May 31, 2020

rcojocaru commented Jun 22, 2020 • edited Loading

KeitaW commented Jul 1, 2020

rcojocaru commented Jul 2, 2020

rcojocaru commented Jul 14, 2020

KeitaW commented Jul 15, 2020

Question regarding Ternary operator in `clocal_exp_editsim` #23

Question regarding Ternary operator in `clocal_exp_editsim` #23

KeitaW commented May 31, 2020 •

edited

Loading

rcojocaru commented Jun 22, 2020 •

edited

Loading