Fixing #894 and correcting tests #901

gavishpoddar · 2021-03-30T20:11:00Z

I fixed #894 and submitted PR #900, but its checks failed because the existing tests did not support "am".

This PR fixes both #894 and the tests.

It seems that removing this line resolves issue #894: search_dates() wrong result for 12:xx am

The tests were corrected to recognize "am". Previously, "am" was ignored because of issue #894. Once the test changes are merged, issue #894 is resolved.

codecov · 2021-03-30T20:14:00Z

Codecov Report

Merging #901 (ff1e856) into master (803d445) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master     #901   +/-   ##
=======================================
  Coverage   98.26%   98.26%           
=======================================
  Files         231      231           
  Lines        2597     2599    +2     
=======================================
+ Hits         2552     2554    +2     
  Misses         45       45

Impacted Files	Coverage Δ
dateparser/search/search.py	`99.35% <100.00%> (+<0.01%)`	⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 803d445...ff1e856.

noviluni

I left some questions as review comments. I'm also not sure whether we can just modify the parse_item and parse_found_objects signatures directly...

dateparser/search/search.py

gavishpoddar · 2021-03-31T10:48:19Z

Put together

German date parsing still needs some correction, but this PR fixes AM/PM handling for other languages, including English.

tests/test_search.py

dateparser/search/search.py

Co-authored-by: Marc Hernández

gavishpoddar · 2021-04-12T16:01:40Z

Hi @noviluni, I have made the changes. Thanks for the review.

noviluni · 2021-04-13T09:23:52Z

dateparser/search/search.py

+        if language == "de":
+            item = item.replace('am', '')


Suggested change

-        if language == "de":
-            item = item.replace('am', '')
+        if language == "de":
+            # Replace "am" because "am 8" means "on the 8th" but "8 am" actually
+            # means "8 am"
+            item = item.replace('am', '')

Hmmmm... Maybe we can use regex (re.sub) to remove it only when preceding a number? 🤔
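One way to read the re.sub suggestion, as a minimal sketch: drop the German preposition "am" only when it directly precedes a digit, so that "am 8" loses it while "8 am" keeps its meridiem marker. The helper name and the exact pattern are assumptions made for illustration, not the code that was committed to dateparser.

```python
import re

# Hypothetical pattern: a standalone "am" followed by whitespace and a digit.
# The lookahead (?=\d) keeps the number itself in the string.
_AM_BEFORE_NUMBER = re.compile(r'\bam\s+(?=\d)', re.IGNORECASE)


def strip_german_am(item):
    """Remove "am" only where it precedes a number (hypothetical helper)."""
    return _AM_BEFORE_NUMBER.sub('', item)


print(strip_german_am('am 8. Juni'))  # "am" precedes a number, so it is dropped
print(strip_german_am('8 am'))        # "am" follows the number, so it is kept
```

Compared with a plain item.replace('am', ''), this leaves words that merely contain "am" untouched, because the pattern requires a word boundary before "am" and whitespace after it.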

Hi @noviluni, I have made some changes. Can you please check and recommend?

tests/test_search.py

Gallaecio · 2021-04-20T14:42:04Z

tests/test_search.py

+               ('June 5 am utc', datetime.datetime(2023, 6, datetime.datetime.utcnow().day, 5, 0, tzinfo=pytz.utc)),
               ('June 23th 5 pm EST', datetime.datetime(2023, 6, 23, 17, 0, tzinfo=pytz.timezone("EST"))),
               ('May 31', datetime.datetime(2023, 5, 31, 0, 0)),
-               ('8am UTC', datetime.datetime(2023, 8, 31, 0, 0, tzinfo=pytz.utc))]),
+               ('8am UTC', datetime.datetime(2023, 5, 31, 8, 0, tzinfo=pytz.utc))]),


I think the parsing of 8am UTC is a change for the better, but I’m not sure about June 5 am utc. The previous interpretation seems OK to me, “June 5th in the morning UTC”. In fact, I think it may even be better, because I suspect leaving a unit missing in the middle (i.e. having month and hour, but missing day) may be less likely than the other interpretation (month and day, am meaning sometime in the morning).

I checked, and that test string was added as part of a performance improvement, #881. I imagine it does not come from an actual case found on the internet, which makes it hard to argue either way.
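To make the two readings of "June 5 am utc" concrete, here is a small standard-library sketch. The function name is hypothetical, and using midnight for the "month and day" reading is an assumption, since the thread does not state which time the previous interpretation produced.

```python
import datetime

UTC = datetime.timezone.utc


def candidate_readings(year, today):
    """Return both readings of 'June 5 am utc' as (label, datetime) pairs."""
    # Reading A: "5" is the day and "am" loosely means "in the morning".
    # Midnight is only a placeholder time here (assumption).
    month_and_day = datetime.datetime(year, 6, 5, 0, 0, tzinfo=UTC)
    # Reading B: "5 am" is the time and the missing day is taken from today.
    # Note: this raises ValueError when today.day is 31, as June has 30 days.
    month_and_hour = datetime.datetime(year, 6, today.day, 5, 0, tzinfo=UTC)
    return [('month + day', month_and_day), ('month + hour', month_and_hour)]


for label, value in candidate_readings(2023, datetime.date(2023, 5, 17)):
    print(label, value.isoformat())
```

Reading B depends on the day the code runs, which is why the updated test expectation uses datetime.datetime.utcnow().day.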

Right, “June 5th in the morning UTC” is the better interpretation. I am fixing that, but I think it is beyond the scope of #894.

So should I open a new issue, directly create a new pull request, or fix it in the current PR, even though it is not directly related to issue #894?

Well, I guess you are right that fixing it is out of the scope of this change and should be a separate issue. It was just bad luck that your change broke it, or rather it was just dumb luck that it was working in the first place.

However, what I would do here is move that scenario into a separate test, with the previous value, and mark the test as an expected failure (xfail), to make it clear that the expected outcome is still the old one, only that Dateparser does not support it yet.
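A minimal sketch of that idea, using the standard library's unittest.expectedFailure (the pytest equivalent is the @pytest.mark.xfail decorator). The search_dates_stub function is a hypothetical stand-in that mimics the new behavior so the example runs without dateparser; a real test would call search_dates instead.

```python
import datetime
import unittest


def search_dates_stub(text):
    """Hypothetical stand-in: reads "5 am" as the time, not "5" as the day."""
    return [(text, datetime.datetime(2023, 6, 1, 5, 0))]


class TestJuneFiveAm(unittest.TestCase):
    @unittest.expectedFailure
    def test_am_after_day_means_morning_of_that_day(self):
        # Desired (old) outcome: "June 5 am utc" refers to June 5th.
        # The stub returns day 1, so this assertion fails, and the decorator
        # records that failure as expected instead of breaking the suite.
        result = search_dates_stub('June 5 am utc')
        self.assertEqual(result[0][1].day, 5)
```

Keeping the old expectation in an expected-failure test documents the desired behavior, and the test starts reporting an unexpected success once the parser supports it.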

Great. After this PR is merged, I will create an issue and a supporting PR.

Don’t forget the xfail part, though. I do think that needs to be addressed in this pull request, instead of changing the test expectations.

I am working on it now, and by the time you create separate test, with the previous value, and mark the test as an expected failure (xfail), it should be done.

by the time you create separate test

I was hoping for you to do that, as it needs to be done as part of this pull request (which is the one that causes the test to fail).

It seems this is broken throughout search_dates and the parse function, as in other cases like:

June 5th pm EST ---> [('June 5th pm EST', datetime.datetime(2021, 6, 24, 17, 0, tzinfo=<StaticTzInfo 'EST'>))]

or

June 9 pm ---> [('June 9 pm', datetime.datetime(2021, 6, 24, 21, 0))]

That is incorrect. I am working on a patch, but it will take some time and require significant changes in other parts of the library, and many tests will break because they are also incorrect. Or else the current interpretation is correct.

gavishpoddar · 2021-07-16T22:47:42Z

Thanks for your support and suggestions. These improvements are now made in PR #945.

gavishpoddar added 3 commits March 30, 2021 23:37

Fixing Issue #894

5583f3a

It seems that removing this line resolves issue #894: search_dates() wrong result for 12:xx am

Tests are corrected for recognising am

fff5112

The tests were corrected to recognize "am". Previously, "am" was ignored because of issue #894. Once the test changes are merged, issue #894 is resolved.

Fixing the issue #894

a77cb4c

noviluni reviewed Mar 31, 2021

dateparser/search/search.py (outdated, resolved)

dateparser/search/search.py (outdated, resolved)

dateparser/search/search.py (outdated, resolved)

dateparser/search/search.py (outdated, resolved)

gavishpoddar added 5 commits March 31, 2021 15:20

Removing unnecessary comments

3a93e8a

Improving test_search.py

0ea06bc

Correcting 'am' for German Language

4e729e8

Update test_search.py

5773f76

Update search.py

b169301

gavishpoddar commented Mar 31, 2021

dateparser/search/search.py (resolved)

gavishpoddar mentioned this pull request Apr 1, 2021

Date Search improvements #897

Open

Gallaecio reviewed Apr 5, 2021

tests/test_search.py (resolved)

Gallaecio reviewed Apr 5, 2021

tests/test_search.py (outdated, resolved)

gavishpoddar added 4 commits April 5, 2021 20:22

Fixing the static date to datetime.datetime.utcnow().day

fba58dd

Update test_search.py

2aff295

Update test_search.py

5d8e3d5

Update test_search.py

c21e428

noviluni reviewed Apr 12, 2021

dateparser/search/search.py (outdated, resolved)

noviluni reviewed Apr 12, 2021

dateparser/search/search.py (outdated, resolved)

noviluni reviewed Apr 12, 2021

dateparser/search/search.py (outdated, resolved)

gavishpoddar and others added 5 commits April 12, 2021 20:50

Update dateparser/search/search.py

34f49b9

Co-authored-by: Marc Hernández

Update search.py

50c1d30

newlines removed

0563be4

Update test_search.py

c089907

Adding comment

dbd12ec

noviluni reviewed Apr 13, 2021

Fixing test from line break to comma

4343cd4

gavishpoddar added 3 commits April 16, 2021 18:08

Fixing tests

2ff140e

Update test_search.py

14bb443

Update test_search.py

840e2a3

Gallaecio reviewed Apr 16, 2021

tests/test_search.py (outdated, resolved)

gavishpoddar added 5 commits April 16, 2021 19:24

Test's fixed

2f61776

Using re.sub to replace

4810bc6

Fixing tests

95fbe1d

Update test_search.py

61bfe13

Update test_search.py

98b6ba6

Gallaecio reviewed Apr 20, 2021

gavishpoddar added 2 commits May 19, 2021 12:49

Merge branch 'scrapinghub:master' into master

da1c1c3

Merge branch 'scrapinghub:master' into master

ff1e856

gavishpoddar closed this Jul 16, 2021
