
Ablation study on Multi-way Transformer Decoder #9

Open
amos-x-wang opened this issue Dec 5, 2023 · 0 comments

amos-x-wang commented Dec 5, 2023

Dear authors,

First of all, amazing work, and I enjoyed it a lot.

Your paper states:

"In this ablation study, we use a hybrid transformer as a baseline encoder and train both models from scratch with a longer side of the input image of 768, and test them with an image size of 1280."
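
If I understand this protocol correctly, it corresponds to something like the following (a minimal sketch assuming PyTorch; `resize_longer_side` is my own hypothetical helper, not from your repo):

```python
import torch
import torch.nn.functional as F

def resize_longer_side(image: torch.Tensor, target: int) -> torch.Tensor:
    """Resize a CHW image so that max(H, W) == target, keeping aspect ratio."""
    _, h, w = image.shape
    scale = target / max(h, w)
    new_h, new_w = round(h * scale), round(w * scale)
    return F.interpolate(image.unsqueeze(0), size=(new_h, new_w),
                         mode="bilinear", align_corners=False).squeeze(0)

# Per my reading of the paper: resize_longer_side(img, 768) during training,
# and resize_longer_side(img, 1280) at test time.
```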

According to Table 7, the improvements in DET and E2E are minor (in my personal view only, since ICDAR 2015 and Total-Text are small-scale datasets):
[screenshot of Table 7]


Would you be happy to share more about this ablation study? For example:
1. What is the improvement from the Multi-way Decoder in the final model, i.e.,

  • when using the Swin Transformer, and
  • when the image resolution is 1920?

  (See the sketch after this list for how I currently understand the multi-way design.)

2. Could you share the convergence curves with and without the Multi-way Decoder?
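
For reference, here is how I currently understand the general multi-way design: shared attention with task-specific FFN experts, in the spirit of VLMo/BEiT-3. This is only my own minimal sketch under that assumption, and all names (`MultiWayDecoderLayer`, `ffn_det`, `ffn_rec`) are hypothetical; please correct me if your decoder differs:

```python
import torch
import torch.nn as nn

class MultiWayDecoderLayer(nn.Module):
    """My sketch: attention is shared across query types, while each query
    group (e.g. detection vs. recognition) gets its own FFN expert."""

    def __init__(self, d_model: int = 256, nhead: int = 8, d_ffn: int = 1024):
        super().__init__()
        # Shared across all queries.
        self.self_attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)
        # Separate FFN "experts" per query type (hypothetical names).
        self.ffn_det = nn.Sequential(
            nn.Linear(d_model, d_ffn), nn.ReLU(), nn.Linear(d_ffn, d_model))
        self.ffn_rec = nn.Sequential(
            nn.Linear(d_model, d_ffn), nn.ReLU(), nn.Linear(d_ffn, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.norm3 = nn.LayerNorm(d_model)

    def forward(self, det_q, rec_q, memory):
        # det_q: (B, Nd, C) detection queries, rec_q: (B, Nr, C) recognition
        # queries, memory: (B, HW, C) encoder features.
        q = torch.cat([det_q, rec_q], dim=1)
        q = self.norm1(q + self.self_attn(q, q, q)[0])              # shared self-attn
        q = self.norm2(q + self.cross_attn(q, memory, memory)[0])   # shared cross-attn
        det_q, rec_q = q.split([det_q.shape[1], rec_q.shape[1]], dim=1)
        # Route each query group through its own FFN expert.
        det_q = self.norm3(det_q + self.ffn_det(det_q))
        rec_q = self.norm3(rec_q + self.ffn_rec(rec_q))
        return det_q, rec_q
```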
