fix: parse and clean archive badges and markdown links to URL #243

Merged
merged 15 commits into from
Jan 14, 2025
1 change: 1 addition & 0 deletions pyproject.toml
@@ -25,6 +25,7 @@ classifiers = [
 ]
 dependencies = [
     "pydantic>=2.0",
+    "python-doi",
     "python-dotenv",
     "requests",
     "ruamel-yaml>=0.17.21",
20 changes: 19 additions & 1 deletion src/pyosmeta/models/base.py
@@ -19,7 +19,7 @@
)

from pyosmeta.models.github import Labels
-from pyosmeta.utils_clean import clean_date, clean_markdown
+from pyosmeta.utils_clean import clean_archive, clean_date, clean_markdown


class Partnerships(str, Enum):
@@ -403,3 +403,21 @@
label.name if isinstance(label, Labels) else label
for label in labels
]

    @field_validator(
        "archive",
        mode="before",
    )
    @classmethod
    def clean_archive(cls, archive: str) -> str:
        """Clean the archive value to ensure it's a valid archive URL."""
        return clean_archive(archive)

    @field_validator(
        "joss",
        mode="before",
    )
    @classmethod
    def clean_joss(cls, joss: str) -> str:
        """Clean the joss value to ensure it's a valid URL."""
        return clean_archive(joss)

Codecov / codecov/patch: added line #L423 in src/pyosmeta/models/base.py was not covered by tests.
29 changes: 29 additions & 0 deletions src/pyosmeta/utils_clean.py
@@ -7,6 +7,8 @@
from datetime import datetime
from typing import Any

import doi
Member

Since we've added a new dep, we should make sure that it is noted in the changelog and also document why we added it.

Contributor Author

Further clarified this in e1246e0



def get_clean_user(username: str) -> str:
"""Cleans a GitHub username provided in a review issue by removing any
@@ -125,3 +127,30 @@
review_dict["date_accepted"] = value
break
return review_dict


def clean_archive(archive):
    """Clean an archive link to ensure it is a valid URL."""

    def is_doi(archive):
        try:
            return doi.validate_doi(archive)
        except ValueError:
            return False

    if archive.startswith("[") and archive.endswith(")"):
        # Extract the outermost link
        link = archive[archive.rfind("](") + 2 : -1]
        if not link.startswith("http"):
            return clean_archive(link)
        return link
    elif archive.startswith("http"):
        return archive
    elif link := is_doi(archive):
        return link
    elif archive.lower() == "n/a":
        return None
    elif archive.lower() == "tbd":
        return None
    else:
        raise ValueError(f"Invalid archive URL: {archive}")

Codecov / codecov/patch: added lines #L136-L139, #L145, #L148, #L150, #L152, #L154, and #L156 in src/pyosmeta/utils_clean.py were not covered by tests.
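
The branches flagged by Codecov above can be exercised without network access. The following is a minimal standalone sketch, not the merged implementation: `clean_archive_sketch` and `fake_resolver` are hypothetical names, and `fake_resolver` is an assumed offline stand-in for `doi.validate_doi`, which the diff's `except ValueError` suggests raises on an invalid DOI. The DOI used in the usage note is made up for illustration.

```python
from typing import Callable, Optional


def fake_resolver(value: str) -> str:
    """Hypothetical offline stand-in for doi.validate_doi (assumption).

    Treats anything shaped like "10.<registrant>/<suffix>" as a DOI and
    returns a doi.org URL; raises ValueError for everything else.
    """
    if value.startswith("10.") and "/" in value:
        return f"https://doi.org/{value}"
    raise ValueError(f"Not a DOI: {value}")


def clean_archive_sketch(
    archive: str,
    resolve_doi: Callable[[str], str] = fake_resolver,
) -> Optional[str]:
    """Follow the same branch order as clean_archive in the diff."""

    def is_doi(value: str):
        # Return the resolved URL, or False when the resolver rejects it.
        try:
            return resolve_doi(value)
        except ValueError:
            return False

    if archive.startswith("[") and archive.endswith(")"):
        # Take the target of the last "](...)" pair, i.e. the outermost link.
        link = archive[archive.rfind("](") + 2 : -1]
        if not link.startswith("http"):
            # A non-URL target (for example a bare DOI) is cleaned again.
            return clean_archive_sketch(link, resolve_doi)
        return link
    elif archive.startswith("http"):
        return archive
    elif link := is_doi(archive):
        return link
    elif archive.lower() == "n/a":
        return None
    elif archive.lower() == "tbd":
        return None
    else:
        raise ValueError(f"Invalid archive URL: {archive}")
```

With this sketch, a badge such as `[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.1234567.svg)](https://doi.org/10.5281/zenodo.1234567)` reduces to its outer link target, a bare `10.5281/zenodo.1234567` goes through the resolver, `N/A` and `TBD` become `None`, and any other string raises `ValueError`. Passing the resolver as a parameter is only a convenience for testing here; the merged code calls `doi.validate_doi` directly.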