ejtraderNS

Programmatically collect normalized news from (almost) any website.

Filter by topic, country, or language.

Installation

pip install ejtraderNS --upgrade

Quick Start

from ejtraderNS import Client

Get the latest news from the nytimes.com main news feed (thousands of news websites are supported; try it yourself):

api = Client(website = 'nytimes.com')
results = api.get_news()

# results.keys()
# 'url', 'topic', 'language', 'country', 'articles'

# Get the articles
articles = results['articles']

first_article_summary = articles[0]['summary']
first_article_title = articles[0]['title']

Get the latest news from the nytimes.com politics feed:

api = Client(website = 'nytimes.com', topic = 'politics')

results = api.get_news()
articles = results['articles']

Some websites, such as investing.com or tradingeconomics.com, support multiple countries.

The following example queries a multi-country website for several topics and converts the results into a pandas DataFrame.

import pandas as pd
from ejtraderNS import Client
from datetime import datetime

url = 'investing.com' # or tradingeconomics.com
country = 'GB'
country_topic = ["finance","news","economics"]
dfs = []

for topic in country_topic:
    api = Client(website=url, topic=topic, country=country)
    getdata = api.get_news()
    print(f"topic: {topic}")

    if getdata is None:
        continue

    data = []

    for article in getdata['articles']:
        article_data = {
            'topic': getdata['topic'],
            # Not every feed entry carries an author or a parsed date, so use .get()
            'author': article.get('author'),
            'date': article.get('published_parsed') or article.get('published'),
            'country': getdata['country'],
            'language': getdata['language'],
            'title': article['title'],
            'summary': article.get('summary', article['title'])
        }
        data.append(article_data)

    df = pd.DataFrame(data)

    df['date'] = pd.to_datetime(df['date'].apply(lambda x: datetime(*x[:6]) if isinstance(x, tuple) else x), utc=True, errors='coerce')
    df.set_index('date', inplace=True)
    dfs.append(df)

df = pd.concat(dfs)
df.sort_index(inplace=True)
print(df)

Example output:

topic	author	country	language	title	summary
finance	Reuters	GB	en	Italy pushes to limit executive pay in listed state-run firms	Italy pushes to limit executive pay in listed state-run firms
economics	Reuters	GB	en	UK's Cleverly raises Xinjiang and Taiwan with Chinese vice president	UK's Cleverly raises Xinjiang and Taiwan with Chinese vice president
news	Reuters	GB	en	Ukraine hails return of 45 Azov fighters, Russia says 3 pilots released	Ukraine hails return of 45 Azov fighters, Russia says 3 pilots released
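The date handling in the loop above can be isolated into a small helper. This is a minimal sketch, not part of ejtraderNS: `parsed_to_datetime` is a hypothetical name, and it assumes that `published_parsed` is a feedparser-style time tuple expressed in UTC.

```python
import calendar
import time
from datetime import datetime, timezone


def parsed_to_datetime(value):
    """Convert a feedparser-style time tuple to an aware UTC datetime.

    Returns None for anything that is not a time tuple (for example a raw
    'published' string), so the caller can decide how to parse it.
    """
    if isinstance(value, (tuple, time.struct_time)) and len(value) >= 6:
        # calendar.timegm treats the fields as UTC, unlike time.mktime,
        # which would apply the local timezone.
        seconds = calendar.timegm(tuple(value[:6]))
        return datetime.fromtimestamp(seconds, tz=timezone.utc)
    return None


# Hypothetical sample value, shaped like article['published_parsed'].
sample = time.struct_time((2023, 5, 6, 14, 30, 0, 5, 126, 0))
print(parsed_to_datetime(sample).isoformat())  # 2023-05-06T14:30:00+00:00
```

Returning None for strings keeps the fallback explicit: raw `published` values can still be handed to `pd.to_datetime` with `errors='coerce'`.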

Only a limited set of topics is available: 'tech', 'news', 'business', 'science', 'finance', 'food', 'politics', 'economics', 'travel', 'entertainment', 'music', 'sport', 'world'

Extra topics available only for investing.com:

'crypto', 'forex', 'stock', 'commodities', 'central_bank', 'forex_analysis', 'forex_technical', 'forex_fundamental', 'forex_opinion', 'forex_signal', 'bonds_analysis', 'bonds_technical', 'bonds_fundamental', 'bonds_opinion', 'bonds_strategy', 'bonds_government', 'bonds_corporate', 'stock_analysis', 'stock_technical', 'stock_fundamental', 'stock_opinion', 'stock_picks', 'indices_analysis', 'futures_analysis', 'options_analysis', 'commodities_analysis', 'commodities_technical', 'commodities_Fundamental', 'commodities_opinion', 'commodities_strategy', 'commodities_metals', 'commodities_energy', 'commodities_agriculture', 'overview_analysis', 'overview_technical', 'overview_fundamental', 'overview_opinion', 'overview_investing', 'crypto_opinion'

However, not all topics are supported by every newspaper.

How to check which topics are supported by which newspaper:

from ejtraderNS import describe_url

describe = describe_url('nytimes.com')

print(describe['topics'])
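Once you have the list of supported topics, it can be worth filtering your requests before calling the client. The sketch below assumes that `describe['topics']` is a plain list of topic strings; `split_topics` and the sample list are hypothetical and only illustrate the check.

```python
def split_topics(wanted, supported):
    """Split wanted topics into (available, unavailable), keeping their order."""
    supported_set = set(supported)
    available = [t for t in wanted if t in supported_set]
    unavailable = [t for t in wanted if t not in supported_set]
    return available, unavailable


# Hypothetical stand-in for describe_url('nytimes.com')['topics'].
supported_topics = ['news', 'politics', 'tech', 'world']

available, unavailable = split_topics(['finance', 'news', 'politics'], supported_topics)
print(available)    # ['news', 'politics']
print(unavailable)  # ['finance']
```

Only the topics in `available` would then be passed to `Client(website=..., topic=...)`.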

Get the list of all news feeds by topic/language/country

To find the full list of supported news websites, use the urls() function:

from ejtraderNS import urls

# URLs by TOPIC
politic_urls = urls(topic = 'politics')

# URLs by COUNTRY
american_urls = urls(country = 'US')

# URLs by LANGUAGE
english_urls = urls(language = 'en')

# Combine any from topic, country, language
american_english_politics_urls = urls(country = 'US', topic = 'politics', language = 'en') 

# Note: some websites do not explicitly declare their language,
# so they are excluded from queries that filter by language

Documentation

`Client` Class

from ejtraderNS import Client

Client(website, topic = None, country = None)

Pass the base form of the website URL: no www. prefix, no https:// scheme, and no trailing /.

For example: "nytimes.com", "news.ycombinator.com" or "theverge.com".
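If your URLs come from elsewhere (bookmarks, scraped links), they may need normalizing first. The helper below is a sketch using only the standard library; `to_base_form` is a hypothetical name and is not provided by ejtraderNS.

```python
from urllib.parse import urlsplit


def to_base_form(url):
    """Reduce a URL to its bare host form, e.g. 'nytimes.com'."""
    url = url.strip()
    # urlsplit only recognises the host when the input starts with a scheme or '//'.
    if '://' not in url and not url.startswith('//'):
        url = '//' + url
    host = urlsplit(url).hostname or ''
    # Drop a leading 'www.' but keep other subdomains such as 'news.'.
    if host.startswith('www.'):
        host = host[len('www.'):]
    return host


print(to_base_form('https://www.nytimes.com/'))   # nytimes.com
print(to_base_form('news.ycombinator.com'))       # news.ycombinator.com
```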

Client.get_news() - Get the latest news from the website of interest.

Allowed topics: tech, news, business, science, finance, food, politics, economics, travel, entertainment, music, sport, world

If no topic is provided, the main feed is returned.

Returns a dictionary of 5 elements:

url - URL of the website
topic - topic of the returned feed
language - language of returned feed
country - country of returned feed
articles - articles of the feed. Feedparser object
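To illustrate the shape of that dictionary, the sketch below flattens it into one plain dict per article. The sample data is invented, and `article_rows` is a hypothetical helper; real articles come from feedparser and may carry more (or fewer) fields than shown here.

```python
def article_rows(results):
    """Flatten a get_news()-style dictionary into one plain dict per article."""
    rows = []
    for article in results.get('articles', []):
        title = article.get('title', '')
        rows.append({
            'url': results.get('url'),
            'topic': results.get('topic'),
            'language': results.get('language'),
            'country': results.get('country'),
            'title': title,
            # Fall back to the title when a feed omits the summary.
            'summary': article.get('summary', title),
        })
    return rows


# Hypothetical sample shaped like the dictionary described above.
sample_results = {
    'url': 'nytimes.com',
    'topic': 'politics',
    'language': 'en',
    'country': 'US',
    'articles': [
        {'title': 'First headline', 'summary': 'First summary'},
        {'title': 'Second headline'},
    ],
}

rows = article_rows(sample_results)
print(len(rows))           # 2
print(rows[1]['summary'])  # Second headline
```

Using `.get()` throughout avoids a KeyError on feeds that leave out optional fields.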

Client.get_headlines() - Returns only the headlines

Client.print_headlines(n) - Print top n headlines

`describe_url()` & `urls()`

These functions help you navigate the package.

from ejtraderNS import describe_url

describe_url(website) - Get the main info on the website.

Returns a dictionary of 5 elements:

url - URL of the website
topics - list of all supported topics
language - language of website
country - country of the website
main_topic - main topic of a website

from ejtraderNS import urls

urls(topic = None, language = None, country = None) - Get a list of all supported news websites given any combination of topic, language, country

Returns a list of websites that match your combination of topic, language, country

Supported topics: tech, news, business, science, finance, food, politics, economics, travel, entertainment, music, sport, world

Supported countries: US, GB, DE, FR, IN, RU, ES, BR, IT, CA, AU, NL, PL, NZ, PT, RO, UA, JP, AR, IR, IE, PH, IS, ZA, AT, CL, HR, BG, HU, KR, SZ, AE, EG, VE, CO, SE, CZ, ZH, MT, AZ, GR, BE, LU, IL, LT, NI, MY, TR, BM, NO, ME, SA, RS, BA

Supported languages: EL, IT, ZH, EN, RU, CS, RO, FR, JA, DE, PT, ES, AR, HE, UK, PL, NL, TR, VI, KO, TH, ID, HR, DA, BG, NO, SK, FA, ET, SV, BN, GU, MK, PA, HU, SL, FI, LT, MR, HI

Tech/framework used

The package itself is nothing more than a SQLite database of RSS feed endpoints for each website, plus a thin wrapper around feedparser.
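As a rough illustration of that design, the sketch below builds an in-memory SQLite table and filters it by topic, language, and country. The schema, table name, column names, and `find_feeds` are all assumptions made for this example; the package's actual database layout is not documented here.

```python
import sqlite3


def find_feeds(conn, topic=None, language=None, country=None):
    """Return website names matching every filter that is not None."""
    clauses, params = [], []
    # Column names come from this fixed tuple, never from user input.
    for column, value in (('topic', topic), ('language', language), ('country', country)):
        if value is not None:
            clauses.append(f'{column} = ?')
            params.append(value)
    where = f" WHERE {' AND '.join(clauses)}" if clauses else ''
    query = f'SELECT DISTINCT website FROM feeds{where} ORDER BY website'
    return [row[0] for row in conn.execute(query, params)]


# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(':memory:')
conn.execute(
    'CREATE TABLE feeds (website TEXT, topic TEXT, language TEXT, country TEXT, rss_url TEXT)'
)
conn.executemany(
    'INSERT INTO feeds VALUES (?, ?, ?, ?, ?)',
    [
        ('example-a.com', 'politics', 'en', 'US', 'https://example-a.com/politics.rss'),
        ('example-b.com', 'politics', None, 'US', 'https://example-b.com/politics.rss'),
        ('example-c.com', 'tech', 'en', 'GB', 'https://example-c.com/tech.rss'),
    ],
)

print(find_feeds(conn, topic='politics', country='US'))
# ['example-a.com', 'example-b.com']
print(find_feeds(conn, topic='politics', country='US', language='en'))
# ['example-a.com']
```

The second query shows one plausible reason why websites without a declared language drop out of language-based queries: in SQL, a NULL language never equals 'en'.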

Acknowledgements

I would like to express my gratitude to @kotartemiy for creating the initial project. Their work has been an invaluable starting point for my modifications and improvements.
