Download of GT fails partially #17

Open
stweil opened this issue Jan 10, 2024 · 1 comment

stweil commented Jan 10, 2024

The script scripts/prepare.sh downloads the GT for Reichsanzeiger, but fails to download the GT defined in data_srcs/default_data_sources.txt. Here is the trace of bash -x scripts/prepare.sh:

+ mkdir -p gt
+ echo 'Prepare OCR-D Ground Truth …'
Prepare OCR-D Ground Truth …
+ IFS=
+ read -r URL
++ echo https://github.com/tboenig/16_frak_simple
++ cut -d/ -f4
+ OWNER=tboenig
++ echo https://github.com/tboenig/16_frak_simple
++ cut -d/ -f5
+ REPO=16_frak_simple
+ [[ ! -f gt/16_frak_simple.zip ]]
+ echo 'Downloading 16_frak_simple …'
Downloading 16_frak_simple …
++ curl -L -H 'Accept: application/vnd.github+json' -H 'X-GitHub-Api-Version: 2022-11-28' https://api.github.com/repos/tboenig/16_frak_simple/releases/latest
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
^M  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0^M100  9963  100  9963    0     0  48600      0 --:--:-- --:--:-- --:--:-- 48600
+ RESULT='{
  "url": "https://api.github.com/repos/tboenig/16_frak_simple/releases/126253523",
  "assets_url": "https://api.github.com/repos/tboenig/16_frak_simple/releases/126253523/assets",
  "upload_url": "https://uploads.github.com/repos/tboenig/16_frak_simple/releases/126253523/assets{?name,label}",
  "html_url": "https://github.com/tboenig/16_frak_simple/releases/tag/v1.1.1",
  "id": 126253523,
  "author": {
    "login": "github-actions[bot]",
    "id": 41898282,
    "node_id": "MDM6Qm90NDE4OTgyODI=",
    "avatar_url": "https://avatars.githubusercontent.com/in/15368?v=4",
    "gravatar_id": "",
    "url": "https://api.github.com/users/github-actions%5Bbot%5D",
    "html_url": "https://github.com/apps/github-actions",
    "followers_url": "https://api.github.com/users/github-actions%5Bbot%5D/followers",
    "following_url": "https://api.github.com/users/github-actions%5Bbot%5D/following{/other_user}",
    "gists_url": "https://api.github.com/users/github-actions%5Bbot%5D/gists{/gist_id}",
    "starred_url": "https://api.github.com/users/github-actions%5Bbot%5D/starred{/owner}{/repo}",
    "subscriptions_url": "https://api.github.com/users/github-actions%5Bbot%5D/subscriptions",
    "organizations_url": "https://api.github.com/users/github-actions%5Bbot%5D/orgs",
    "repos_url": "https://api.github.com/users/github-actions%5Bbot%5D/repos",
    "events_url": "https://api.github.com/users/github-actions%5Bbot%5D/events{/privacy}",
    "received_events_url": "https://api.github.com/users/github-actions%5Bbot%5D/received_events",
    "type": "Bot",
    "site_admin": false
  },
  "node_id": "RE_kwDOIFGkSM4HhnnT",
  "tag_name": "v1.1.1",
  "target_commitish": "main",
  "name": "Release 81_v1.1.1",
  "draft": false,
  "prerelease": false,
  "created_at": "2023-10-23T14:29:21Z",
  "published_at": "2023-10-23T14:30:58Z",
  "assets": [
    {
      "url": "https://api.github.com/repos/tboenig/16_frak_simple/releases/assets/131958447",
      "id": 131958447,
      "node_id": "RA_kwDOIFGkSM4H3Yav",
      "name": "kistler_kraeuter_1500.ocrd.zip",
      "label": "",
      "uploader": {
        "login": "github-actions[bot]",
        "id": 41898282,
        "node_id": "MDM6Qm90NDE4OTgyODI=",
        "avatar_url": "https://avatars.githubusercontent.com/in/15368?v=4",
        "gravatar_id": "",
        "url": "https://api.github.com/users/github-actions%5Bbot%5D",
        "html_url": "https://github.com/apps/github-actions",
        "followers_url": "https://api.github.com/users/github-actions%5Bbot%5D/followers",
        "following_url": "https://api.github.com/users/github-actions%5Bbot%5D/following{/other_user}",
        "gists_url": "https://api.github.com/users/github-actions%5Bbot%5D/gists{/gist_id}",
        "starred_url": "https://api.github.com/users/github-actions%5Bbot%5D/starred{/owner}{/repo}",
        "subscriptions_url": "https://api.github.com/users/github-actions%5Bbot%5D/subscriptions",
        "organizations_url": "https://api.github.com/users/github-actions%5Bbot%5D/orgs",
        "repos_url": "https://api.github.com/users/github-actions%5Bbot%5D/repos",
        "events_url": "https://api.github.com/users/github-actions%5Bbot%5D/events{/privacy}",
        "received_events_url": "https://api.github.com/users/github-actions%5Bbot%5D/received_events",
        "type": "Bot",
        "site_admin": false
      },
      "content_type": "application/zip",
      "state": "uploaded",
      "size": 19379136,
      "download_count": 14,
      "created_at": "2023-10-23T14:31:01Z",
      "updated_at": "2023-10-23T14:31:02Z",
      "browser_download_url": "https://github.com/tboenig/16_frak_simple/releases/download/v1.1.1/kistler_kraeuter_1500.ocrd.zip"
    },
    {
      "url": "https://api.github.com/repos/tboenig/16_frak_simple/releases/assets/131958445",
      "id": 131958445,
[...]
++ jq -r '.assets | .[].browser_download_url'
+ ZIP_URL='https://github.com/tboenig/16_frak_simple/releases/download/v1.1.1/kistler_kraeuter_1500.ocrd.zip
https://github.com/tboenig/16_frak_simple/releases/download/v1.1.1/luther_auszlegunge_1520.ocrd.zip
https://github.com/tboenig/16_frak_simple/releases/download/v1.1.1/metadata-v81.zip
https://github.com/tboenig/16_frak_simple/releases/download/v1.1.1/trota_mordtbrenner_1540.ocrd.zip'
+ curl -L -o gt/16_frak_simple.zip 'https://github.com/tboenig/16_frak_simple/releases/download/v1.1.1/kistler_kraeuter_1500.ocrd.zip
https://github.com/tboenig/16_frak_simple/releases/download/v1.1.1/luther_auszlegunge_1520.ocrd.zip
https://github.com/tboenig/16_frak_simple/releases/download/v1.1.1/metadata-v81.zip
https://github.com/tboenig/16_frak_simple/releases/download/v1.1.1/trota_mordtbrenner_1540.ocrd.zip'
curl: (3) URL using bad/illegal format or missing URL
[...]

So instead of calling curl once per URL, the script calls it once with all four URLs joined by newlines into a single argument. curl rejects that argument as a malformed URL (error 3), so nothing is downloaded.
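One possible fix is to loop over the URL list and call curl once per line. The following is only a minimal sketch, not the actual code of scripts/prepare.sh: the helper names `fetch` and `download_assets` and the choice to keep each asset's own file name are assumptions made for illustration.

```bash
#!/usr/bin/env bash
# Sketch only: download each release asset with its own curl call.
# fetch and download_assets are hypothetical helpers, not part of scripts/prepare.sh.

# fetch TARGET URL: thin wrapper around curl so it can be replaced for a dry run.
fetch() {
    curl -L -o "$1" "$2"
}

# download_assets DIR URLS
#   DIR:  target directory (created if missing)
#   URLS: newline-separated list, as printed by jq -r '.assets | .[].browser_download_url'
download_assets() {
    local dir="$1"
    local urls="$2"
    local url
    mkdir -p "$dir"
    printf '%s\n' "$urls" | while IFS= read -r url; do
        # Skip empty lines, e.g. when the release has no assets.
        if [ -n "$url" ]; then
            # ${url##*/} is the last path component, used as the local file name.
            fetch "$dir/${url##*/}" "$url"
        fi
    done
}
```

With the variables from the trace above, a call could look like `download_assets "gt/$REPO" "$ZIP_URL"`. Whether the later steps of the script expect a single gt/$REPO.zip instead of a directory is not visible in the trace, so that part would need to be adapted as well.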

stweil commented Jan 10, 2024

@tboenig, it looks like a change in the released GT causes the breakage: older releases contained a single zip file, for example bagitDump-v79.zip, while the latest release contains several zip files. The download script assumes exactly one asset per release and does not handle several.
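To support both release layouts, the script could decide on the target path depending on the number of assets. This is a rough sketch under assumptions: `plan_downloads` is a hypothetical helper, and storing several assets under gt/REPO/ is just one possible convention, not something the project has decided.

```bash
#!/usr/bin/env bash
# Sketch only: choose target paths depending on how many assets a release has.
# plan_downloads is a hypothetical helper, not part of scripts/prepare.sh.

# plan_downloads REPO URLS
#   Prints one "TARGET<TAB>URL" line per asset.
#   URLS is a newline-separated list of browser_download_url values.
plan_downloads() {
    local repo="$1"
    local urls="$2"
    local url count
    # Count the non-empty lines, i.e. the number of assets.
    count=$(printf '%s\n' "$urls" | grep -c . || true)
    if [ "$count" -eq 1 ]; then
        # Old layout: a single archive, stored as gt/REPO.zip as before.
        printf '%s\t%s\n' "gt/$repo.zip" "$urls"
    else
        # New layout: several archives, each keeps its own name under gt/REPO/.
        printf '%s\n' "$urls" | while IFS= read -r url; do
            if [ -n "$url" ]; then
                printf '%s\t%s\n' "gt/$repo/${url##*/}" "$url"
            fi
        done
    fi
}
```

Each printed line could then be passed to a separate `curl -L -o TARGET URL` call. The sketch assumes the single-asset case passes a bare URL without surrounding blank lines.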
