Some questions about decoder position embedding for masked tokens #173

chrisway613 · 2021-11-24T07:30:14Z

In the decoder position embedding matrix, the first dimension has size number of patches + 1, where the extra 1 is for ViT's cls_token. But when the positions of the masked tokens are embedded, their indices are not shifted by 1, so they can collide with the position of ViT's cls_token. (MAE does not use the cls_token, but this limits extensibility if we want to use the cls_token later.)
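The off-by-one described above can be illustrated with a toy lookup table. This is only a sketch in plain Python: `pos_table`, `masked_indices`, and the sizes are hypothetical stand-ins, not the repository's actual code.

```python
# Hypothetical size, chosen only for illustration.
num_patches = 4

# A position table with num_patches + 1 rows: row 0 is reserved for the CLS
# token, rows 1..num_patches belong to patches 0..num_patches - 1.
pos_table = ["cls"] + [f"patch_{i}" for i in range(num_patches)]

# Patch indices selected for masking (hypothetical).
masked_indices = [0, 2]

# Unshifted lookup: patch 0 lands on the row reserved for CLS.
unshifted = [pos_table[i] for i in masked_indices]

# Shifted lookup: each patch index is offset by one to skip the CLS row.
shifted = [pos_table[i + 1] for i in masked_indices]

print(unshifted)  # ['cls', 'patch_1']
print(shifted)    # ['patch_0', 'patch_2']
```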

lucidrains · 2021-11-24T16:17:07Z

@chrisway613 Hi Chris! While this is true, I think leaving untrained parameters in the wrapper class isn't elegant. You can always just concat the CLS token onto the decoder_pos_emb after you have finished training, something like:

decoder_cls_token = nn.Parameter(torch.randn(1, decoder_dim))
pos_embs_with_cls_token = torch.cat((decoder_cls_token, self.decoder_pos_emb), dim = 0)
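A self-contained version of that idea might look like the sketch below. It assumes the trained `decoder_pos_emb` is an `nn.Embedding` with one row per patch, in which case the table is read through `.weight`. If it is a plain tensor or `nn.Parameter` instead, it could be passed to `torch.cat` directly. The sizes and the names `decoder_cls_pos` and `masked_indices` are hypothetical.

```python
import torch
import torch.nn as nn

# Hypothetical sizes, chosen only for illustration.
num_patches, decoder_dim = 4, 8

# Stand-in for the trained decoder position table, assumed here to be an
# nn.Embedding with one row per patch and no CLS row.
decoder_pos_emb = nn.Embedding(num_patches, decoder_dim)

# Freshly initialized position embedding for the CLS slot.
decoder_cls_pos = nn.Parameter(torch.randn(1, decoder_dim))

# Prepend the CLS row; .weight exposes the embedding table as a tensor.
pos_embs_with_cls_token = torch.cat((decoder_cls_pos, decoder_pos_emb.weight), dim=0)

# Patch i now lives at row i + 1, so patch indices are shifted on lookup.
masked_indices = torch.tensor([0, 2])
masked_pos = pos_embs_with_cls_token[masked_indices + 1]
```

After the concat, row 0 holds the (still untrained) CLS position, so any code that looks up patch positions needs the `+ 1` offset shown in the last line.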

some questions about decoder position embedding for masked tokens

6b7921f

