Fixes to deal with Pandas tz_convert and fromtimestamp issues #231

Closed
wants to merge 25 commits into from
f4726b6
Fixes for pandas fromtimestamp issues
jhollanderpax8 May 1, 2023
fb0a26f
Fixing timezone stuff
jhollanderpax8 Oct 30, 2023
e92ea5f
Fix screeners with numbers in ID
ms32035 Jul 12, 2023
5411207
Update quote summary endpoint from v10 to v6
dpguthrie Jul 15, 2023
5797bd3
Update to 2.3.2
dpguthrie Jul 15, 2023
55a20a8
Update changelog for 2.3.2
dpguthrie Jul 15, 2023
2b1e066
Modifications to switch to poetry
dpguthrie Jul 14, 2023
a21e9c5
Make selenium pkgs optional for poetry
dpguthrie Jul 17, 2023
ffd60d3
Add function to get crumb, set in init
dpguthrie Jul 17, 2023
3154020
Add webdriver manager for selenium
dpguthrie Jul 17, 2023
f993385
Access public instead of private method
dpguthrie Jul 17, 2023
43643fb
Initialize session with cookies, need appropriate headers
dpguthrie Jul 17, 2023
91d0774
Change to appropriate branch
dpguthrie Jul 17, 2023
f5358a4
Account for FuturesSession when setting up for cookies
dpguthrie Jul 17, 2023
ea2a7e2
Updates for selenium 4
dpguthrie Jul 18, 2023
c6da27a
Update optional deps
dpguthrie Jul 18, 2023
d457fcc
Update to get crumb when setting up session
dpguthrie Jul 18, 2023
331c134
Add better error handling when unable to find crumb
dpguthrie Jul 24, 2023
ad5eceb
Create variable indicating if user has optional dependency
dpguthrie Jul 24, 2023
ef3bd52
Add fallback if crumb, cookies retrieval fails
dpguthrie Jul 24, 2023
e9c66ed
Add warning
dpguthrie Oct 28, 2023
7baf030
Update streamlit app url
dpguthrie Jul 18, 2023
667106e
Update how cookie is obtained, remove selenium backup
dpguthrie Oct 28, 2023
77f201e
Remove unnecesasry imports
dpguthrie Oct 28, 2023
bdf0a43
Bump urllib3 from 2.0.3 to 2.0.7
dependabot[bot] Oct 28, 2023
Add better error handling when unable to find crumb
dpguthrie authored and jhollanderpax8 committed Oct 31, 2023
commit 331c1341bc9037f62ba68ceb34115d183684149b
45 changes: 34 additions & 11 deletions yahooquery/utils/__init__.py
@@ -1,14 +1,21 @@
# stdlib
import datetime
import json
import random
import re
from urllib3.exceptions import MaxRetryError

# third party
import pandas as pd
from requests import Response, Session
from requests.adapters import HTTPAdapter
from requests.exceptions import ConnectionError, RetryError
from requests.packages.urllib3.util.retry import Retry
from requests_futures.sessions import FuturesSession

# first party
from yahooquery.login import YahooSelenium, _has_selenium


DEFAULT_TIMEOUT = 5

@@ -168,21 +175,37 @@ def _init_session(session=None, **kwargs):
 def setup_session_with_cookies_and_crumb(session: Session):
     headers = {**random.choice(HEADERS), **addl_headers}
     session.headers = headers
-    response = session.get('https://finance.yahoo.com', hooks={'response': get_crumb})
-    if isinstance(session, FuturesSession):
-        response = response.result()
-    return session, response.crumb
+    try:
+        response = session.get('https://finance.yahoo.com')
+    except Exception:
+        return session, None
+    else:
+        if isinstance(session, FuturesSession):
+            response = response.result()
+        crumb = _get_crumb(response.text, session)
+        return session, crumb


-def get_crumb(r: Response, *args, **kwargs):
-    r.crumb = None
+def _get_crumb(page_text, session):
+    crumb = None
     path = re.compile(r'window\.YAHOO\.context = ({.*?});', re.DOTALL)
-    match = re.search(path, r.text)
+    match = re.search(path, page_text)
     if match:
-        js_dict = json.loads(match.group(1))
-        r.crumb = js_dict.get('crumb', None)
-
-    return r
+        dct = json.loads(match.group(1))
+        crumb = dct.get('crumb', None)
+    if crumb is not None:
+        return crumb
+
+    try:
+        response = session.get('https://query2.finance.yahoo.com/v1/test/getcrumb')
+        if isinstance(session, FuturesSession):
+            response = response.result()
+        crumb = response.text
+    except (ConnectionError, RetryError) as e:
+        # Cookies most likely not set in previous request
+        pass
+
+    return crumb

 def _flatten_list(ls):
     return [item for sublist in ls for item in sublist]
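
The two-step crumb lookup in this commit (parse the page first, then fall back to a second request) can be tried without network access. The following is a minimal sketch, not the library's code: `extract_crumb`, `get_crumb_with_fallback`, the `fetch_fallback` callable, and the sample page are hypothetical names and data introduced for illustration. Only the regular expression is taken from the diff.

```python
import json
import re

# Pattern copied from the diff: captures the JSON object assigned to
# window.YAHOO.context in the page source.
CONTEXT_PATTERN = re.compile(r'window\.YAHOO\.context = ({.*?});', re.DOTALL)


def extract_crumb(page_text):
    """Return the crumb embedded in page_text, or None if it is absent."""
    match = CONTEXT_PATTERN.search(page_text)
    if not match:
        return None
    try:
        context = json.loads(match.group(1))
    except json.JSONDecodeError:
        # The non-greedy match can cut a nested object short; treat as "not found".
        return None
    return context.get('crumb')


def get_crumb_with_fallback(page_text, fetch_fallback):
    """Try the page first, then a caller-supplied fallback (hypothetical hook).

    fetch_fallback stands in for the second HTTP request in the diff; it is
    an assumption of this sketch, not part of yahooquery.
    """
    crumb = extract_crumb(page_text)
    if crumb is not None:
        return crumb
    try:
        return fetch_fallback()
    except OSError:
        # Network-style failures (ConnectionError is an OSError) yield no crumb.
        return None


# Invented page snippet, for illustration only.
sample_page = (
    '<script>window.YAHOO.context = '
    '{"crumb": "abc123", "lang": "en-US"};</script>'
)
```

Passing the fallback in as a callable keeps the parsing step testable on its own, which is the same separation the commit makes by having `_get_crumb` accept `page_text` instead of a `Response` object.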