tlsh_diff = tlsh.diff(row, tlsh_value)
if tlsh_diff <= 120:  # MATCHED
    if (matched_tlsh_diff < 0) or (tlsh_diff < matched_tlsh_diff):
        matched_tlsh_diff = tlsh_diff
        matched_tlsh = row
I've noticed that FOSSLight treats two files as the same if their TLSH score (distance) is 120 or less.
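To make the behaviour I'm asking about concrete, here is a minimal, self-contained sketch of the matching logic as I understand it from the snippet above. The names `closest_match`, `diff_fn`, `threshold`, and `toy_diff` are my own and hypothetical, not FOSSLight's API; in the real code the distance would presumably come from `tlsh.diff`, while here a toy distance on integers stands in so the example runs without the TLSH library.

```python
from typing import Callable, Iterable, Optional, Tuple, TypeVar

T = TypeVar("T")


def closest_match(
    candidates: Iterable[T],
    target: T,
    diff_fn: Callable[[T, T], int],
    threshold: int = 120,  # the value in question; assumed configurable here
) -> Tuple[Optional[T], int]:
    """Return (best candidate, its distance), or (None, -1) if none qualifies."""
    matched: Optional[T] = None
    matched_diff = -1  # -1 means "no match yet", mirroring the original snippet
    for row in candidates:
        d = diff_fn(row, target)
        if d <= threshold:  # candidate counts as "the same file"
            # Keep the candidate with the smallest distance seen so far.
            if matched_diff < 0 or d < matched_diff:
                matched_diff = d
                matched = row
    return matched, matched_diff


def toy_diff(a: int, b: int) -> int:
    """Stand-in distance for illustration only; NOT the TLSH distance."""
    return abs(a - b)


if __name__ == "__main__":
    # With the default threshold, 130 is the closest qualifying candidate.
    print(closest_match([100, 300, 130], 120, toy_diff))
    # With a much stricter threshold, nothing qualifies.
    print(closest_match([100, 300, 130], 120, toy_diff, threshold=5))
```

Written this way, the threshold is the only thing that decides whether a near neighbour counts as a match at all, which is why I'd like to understand how 120 was chosen.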
I'm curious about the rationale behind choosing 120 as the threshold for file similarity.
Could you please provide some insight into how this particular value was determined?
I know that a lower score means the files are more similar, but I couldn't find a specific standard threshold on either the TLSH web page or in the technical paper from Trend Micro.
Was it based on empirical testing, or other considerations?
Thank you for the amazing binary scanner!
Have a question regarding the use of TLSH for file comparison.
Related code
Lines 109-113 in src/fosslight_binary/_binary_dao.py