-
Notifications
You must be signed in to change notification settings - Fork 465
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixing #894 and correcting tests #901
Conversation
It seems removing this line resolves the issue #894 : search_dates() wrong result for 12:xx am
Codecov Report
@@ Coverage Diff @@
## master #901 +/- ##
=======================================
Coverage 98.26% 98.26%
=======================================
Files 231 231
Lines 2597 2599 +2
=======================================
+ Hits 2552 2554 +2
Misses 45 45
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added some doubts I have. I'm not also sure if we can just modify the parse_item
and the parse_found_objects
signature directly...
Put together The German language date parsing needs some correction but this PR fixes |
Co-authored-by: Marc Hernández <[email protected]>
Hi, @noviluni I have made the changes, and thanks for the review |
dateparser/search/search.py
Outdated
if language == "de": | ||
item = item.replace('am', '') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if language == "de": | |
item = item.replace('am', '') | |
if language == "de": | |
# Replace "am" because "am 8" means "on the 8th" but "8 am" actually | |
# means "8 am" | |
item = item.replace('am', '') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmmm... Maybe we can use regex (re.sub
) to remove it only when preceding a number? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @noviluni, I have made some can you please check and recommend
('June 5 am utc', datetime.datetime(2023, 6, datetime.datetime.utcnow().day, 5, 0, tzinfo=pytz.utc)), | ||
('June 23th 5 pm EST', datetime.datetime(2023, 6, 23, 17, 0, tzinfo=pytz.timezone("EST"))), | ||
('May 31', datetime.datetime(2023, 5, 31, 0, 0)), | ||
('8am UTC', datetime.datetime(2023, 8, 31, 0, 0, tzinfo=pytz.utc))]), | ||
('8am UTC', datetime.datetime(2023, 5, 31, 8, 0, tzinfo=pytz.utc))]), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the parsing of 8am UTC
is a change for the better, but I’m not sure about June 5 am utc
. The previous interpretation seems OK to me, “June 5th in the morning UTC”. In fact, I think it may even be better, because I suspect leaving a unit missing in the middle (i.e. having month and hour, but missing day) may be less likely than the other interpretation (month and day, am meaning sometime in the morning).
I checked and that test string was added as part of a performance improvement, #881, and I imagine it does not come from an actual case found in the internet, which makes it hard to argue either way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I guess you are right that fixing it is out of the scope of this change and should be a separate issue. It was just bad luck that your change broke it, or rather it was just dumb luck that it was working in the first place.
However, what I would do here is move that scenario into a separate test, with the previous value, and mark the test as an expected failure (xfail), to make it clear that the expected outcome is still the old one, only that Dateparser does not support it yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great after this PR is merged I will create an issue and create a supporting PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don’t forget the xfail part, though, I do think that needs to be addressed in this pull request, instead of changing the test expectations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am working on it now and by the time you create separate test, with the previous value, and mark the test as an expected failure (xfail) it should be done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
by the time you create separate test
I was hoping for you to do that, as it needs to be done as part of this pull request (which is the one that causes the test to fail).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok on it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems it is broken throughout search_dates
and parse
function as with other cases like
June 5th pm EST ---> [('June 5th pm EST', datetime.datetime(2021, 6, 24, 17, 0, tzinfo=<StaticTzInfo 'EST'>))]
or
June 5th pm EST ---> ('June 9 pm', datetime.datetime(2021, 6, 24, 21, 0))]
Which is incorrect. I am working on the patch but it will require some time and significant change in other parts of the library and many tests will be broken because they are also incorrect. Or the current interpretation is correct.
Thanks, for your support and suggestions these improvements are now made on PR #945. |
Hi, @Gallaecio ,
I have fixed #894 and submitted PR #900 but it failed because the tests didn't supported
am
so tests failed.In this PR the tests and #894 is fixed.