In most decompilers, like IDA Pro, you can have types that have comments in them, like:
struct Elf64_Vernaux // sizeof=0x10
{                                   // XREF: LOAD:0000000000400410/r
    unsigned __int32 vna_hash;      // this is some comment on this first member
    unsigned __int16 vna_flags;
    unsigned __int16 vna_other;
    unsigned __int32 vna_name __offset(OFF64,0x400390);
    unsigned __int32 vna_next;
};
Which libbs does not currently support. An ideal solution would look like this:
if you don't want to have to use the edm_t.cmt and udm_t.cmt attributes to enumerate or serialize complex field comments, you can also unpack/save them from the result of tinfo_t.serialize(), which was the pre-8.4 method anyway ("fields" are similar).
decoding the bytes returned by tinfo_t.serialize into a list of comments is basically: consume a byte, determine whether it's an 8-bit or 16-bit length, decode said length, use the length to extract the comment, utf-8 decode those bytes, and repeat until done.
import builtins

def decode_bytes(bytes):
    '''Decode the given `bytes` into a list containing the length and the bytes for each encoded string.'''
    ok, results, iterable = True, [], (ord for ord in bytearray(bytes))
    # prime the loop with the first length byte (stored as length + 1)
    integer = next(iterable, None)
    length_plus_one, ok = integer or 0, False if integer is None else True
    while ok:
        # a prefix below 0x80 is a single byte (same cutoff as encode_length); otherwise the next byte must be 1
        one = 1 if length_plus_one < 0x80 else next(iterable, None)
        assert((one == 1) and length_plus_one > 0)
        encoded = bytearray(ord for index, ord in zip(builtins.range(length_plus_one - 1), iterable)) # using zip to clamp bytes consumed
        results.append((length_plus_one - 1, encoded)) if ok else None
        # read the next length byte, stopping once the stream is exhausted
        integer = next(iterable, None)
        length_plus_one, ok = integer or 0, False if integer is None else True
    return results
encoding the string passed to tinfo_t.deserialize(til, type, fields, cmts=None) requires encoding the length for each utf-8 encoded comment, and concatenating back into a stream of bytes.
apologies for the unreadability of the following.. "encode_length" is all that is relevant
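import itertools

def encode_bytes(strings):
    '''Encode the list of `strings` with their lengths and return them as bytes.'''
    # the stored length is always length + 1: one byte when below 0x80, otherwise that byte followed by a 1
    encode_length = lambda integer: bytearray([integer + 1] if integer + 1 < 0x80 else [integer + 1, 1])
    # accept both raw bytes and text, encoding text as utf-8
    iterable = (bytes(string) if isinstance(string, (bytes, bytearray)) else string.encode('utf-8') for string in strings)
    pairs = ((len(chunk), chunk) for chunk in iterable)
    # interleave each length prefix with its chunk and join everything into one stream
    return bytes(bytearray().join(itertools.chain(*((encode_length(length), bytearray(chunk)) for length, chunk in pairs))))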
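For a more readable take on the pair above, here is a minimal round-trip sketch. It assumes the layout those snippets imply: each comment is prefixed with its length plus one, stored as a single byte when that value is below 0x80 and otherwise as that byte followed by a 1. That two-byte form only covers comments up to 254 bytes, so the sketch raises an error for anything longer rather than guessing at the full format. The names `encode_comments` and `decode_comments` are made up for this example and are not part of IDA or libbs.

```python
def encode_comments(comments):
    """Encode a list of str/bytes comments into one length-prefixed blob."""
    out = bytearray()
    for comment in comments:
        if isinstance(comment, (bytes, bytearray)):
            chunk = bytes(comment)
        else:
            chunk = comment.encode('utf-8')
        prefix = len(chunk) + 1              # stored value is always length + 1
        if prefix < 0x80:
            out.append(prefix)               # short form: a single byte
        elif prefix <= 0xff:
            out.extend((prefix, 1))          # long form, as in the snippet above
        else:
            raise ValueError('comments over 254 bytes are not handled by this sketch')
        out.extend(chunk)
    return bytes(out)


def decode_comments(blob):
    """Decode a length-prefixed blob back into a list of str comments."""
    data, offset, results = bytearray(blob), 0, []
    while offset < len(data):
        prefix = data[offset]
        offset += 1
        if prefix == 0:
            raise ValueError('length prefix must be at least 1')
        if prefix >= 0x80:
            # long form: this sketch only accepts a trailing byte of 1
            if offset >= len(data) or data[offset] != 1:
                raise ValueError('unsupported length prefix')
            offset += 1
        length = prefix - 1
        chunk = data[offset:offset + length]
        if len(chunk) != length:
            raise ValueError('blob ends in the middle of a comment')
        results.append(chunk.decode('utf-8'))
        offset += length
    return results


# round trip: a short comment, an empty one, and one that needs the long form
comments = ['this is some comment on this first member', '', 'x' * 200]
assert decode_comments(encode_comments(comments)) == comments
```

Members without a comment still take up one byte (0x01) in the blob, which keeps the list positions lined up with the member order.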
however, it's worth confirming that the performance of serializing/deserializing them at scale is actually relevant in binsync. minsc creates an index for all commentable "things" so that they can be tagged for searching and (mis-)used to store nearly-arbitrary data, so being able to check whether a tinfo_t even has comments, or to distinguish what exactly was updated (name/comment/other) in response to events (w/o having to iterate through all the fields one-by-one), made a difference.
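The two checks mentioned here can be sketched cheaply against the serialized form. This is only an illustration under the same assumed layout as the encoder above (an empty comment is stored as the single byte 0x01); `has_comments` and `changed_comments` are hypothetical helper names, not functions from minsc, binsync, or IDA.

```python
def has_comments(blob):
    """Return True if any comment in the length-prefixed blob is non-empty."""
    # An empty comment is stored as the single byte 0x01 (length + 1), so a
    # blob made up only of 0x01 bytes (or an empty blob) carries no text.
    return any(byte != 1 for byte in bytearray(blob))


def changed_comments(old, new):
    """Return the indices whose comment differs between two decoded lists."""
    size = max(len(old), len(new))
    # pad the shorter list with empty comments so added members show up too
    pad = lambda items: list(items) + [''] * (size - len(items))
    return [index for index, (a, b) in enumerate(zip(pad(old), pad(new))) if a != b]


# three members, none commented, versus one commented member
assert not has_comments(b'\x01\x01\x01')
assert has_comments(b'\x01\x04foo\x01')

# only the second member's comment was touched
assert changed_comments(['a', '', 'c'], ['a', 'b', 'c']) == [1]
```

Checking the raw blob avoids decoding anything when a type has no comments at all, and diffing the decoded lists narrows an update event down to the members whose comments actually changed.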
...i'm literally praying that they don't try to retrofit repeatable/non-repeatable comments into this btw.
Background
In most decompilers, like IDA Pro, you can have types that have comments in them, like:
Which libbs does not currently support. An ideal solution would look like this:
Implementation
To support this type of commenting, we'll need to do a few things:
comment
attributeFunction
to support comments