Another observation is that Transformer-based methods
generally suffer from unaligned-length problem [49], which
denotes that the Transformer is hard to correct the vision
prediction if character number is unaligned with ground truth.
The unaligned-length problem is caused by the inevitable
implementation of padding mask which is fixed for filtering
context outside text length. Our iterative LM can alleviate
this problem as the visual feature and linguistic feature are
fused several times, and thus the predicted text length is also
refined gradually.
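What problem is this passage referring to? This framework presumably does not apply to complex layouts or very long text, right? Could someone experienced explain what problem is actually being solved here?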
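
To make the question concrete, here is a minimal sketch of what the quoted passage seems to describe. It assumes the padding mask is built from the length of the vision prediction; that is my reading of the passage, not something taken from the repository code. The names `padding_mask` and `masked_softmax` and the example strings are hypothetical and used only for illustration.

```python
import math

def padding_mask(pred_length, max_len):
    """Return True for positions at or beyond the predicted text length.

    Masked (True) positions are ignored by the attention step below.
    """
    return [i >= pred_length for i in range(max_len)]

def masked_softmax(scores, mask):
    """Softmax over scores that gives zero weight to masked positions."""
    exps = [0.0 if m else math.exp(s) for s, m in zip(scores, mask)]
    total = sum(exps)
    return [e / total for e in exps]

max_len = 8
ground_truth = "hello"   # 5 characters
vision_pred = "helo"     # hypothetical vision output that dropped a character

# Assumption: the mask is fixed by the predicted length (4), not the true length (5).
mask = padding_mask(len(vision_pred), max_len)

# Uniform scores keep the example simple; only the mask shapes the weights.
weights = masked_softmax([0.0] * max_len, mask)

# The slot where the fifth ground-truth character belongs receives zero weight,
# so a single pass has no context there with which to restore the character.
missing_slot = len(ground_truth) - 1
print(mask)
print(weights[missing_slot])
```

If this reading is right, a length error in the vision prediction is locked in by the mask during a single pass, and repeating the fusion gives the predicted length (and therefore the mask) a chance to change between iterations.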