Fintech Project 2 - twitter data sentiment analysis

sububer/sentimental

sentimental

Fintech Project 2 - Analysis Of Crypto Pricing and Ukraine War Twitter Sentiment

Project Overview

An analysis of Twitter data collected with Ukraine- and crypto-related queries. The data was cleaned and then run through sentiment analysis to look for relationships between crypto prices and Twitter sentiment on war and crypto topics.

Presentation slides

See installation guide below for specifics on setting up your environment.


Data Collection And Preparation

The data source for the sentiment analysis is the Twitter Search API, specifically the /2/tweets/search endpoint.

  1. To run this application, you will need an account on the Twitter Developer platform to obtain a bearer token. See config_example.py for how to stage your bearer token.
  2. Modify the query
    • By default, data_input.py grabs the past 7 days of data, which is the limit of the API.
  3. query.py contains the TwitterQuery class, which manages querying and data scraping/prep.
  4. utils.py wraps the cleaning steps, such as:
    • Normalizing case / standardizing text
    • Removing Unicode characters (punctuation, emojis, URLs and @mentions)
    • Removing hyperlinks, marks and styles
    • Removing stopwords (words that don't add value)
    • Stemming / lemmatizing text
    • Tokenizing tweet text
  5. The resulting data is saved in .csv format in the ./data/ folder, e.g. 2022323_144.csv
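
The cleaning steps listed above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the actual contents of utils.py: the function name clean_tweet, the regular expressions, and the tiny stopword set are all assumptions made for the example, and stemming/lemmatizing is left out because it would need an extra library such as NLTK.

```python
import re
import string

# Tiny illustrative stopword set (assumption); a real run would likely
# use a fuller list, for example the one shipped with NLTK.
STOPWORDS = {"a", "an", "and", "the", "is", "in", "on", "of", "to", "for", "at"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NON_ASCII_RE = re.compile(r"[^\x00-\x7f]")
# Maps every ASCII punctuation character to a space.
PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def clean_tweet(text):
    """Return a list of cleaned tokens for one tweet (hypothetical helper)."""
    text = text.lower()                  # case normalization
    text = URL_RE.sub(" ", text)         # drop hyperlinks
    text = MENTION_RE.sub(" ", text)     # drop @mentions
    text = NON_ASCII_RE.sub(" ", text)   # drop emojis and other non-ASCII characters
    text = text.translate(PUNCT_TABLE)   # drop punctuation
    tokens = text.split()                # whitespace tokenization
    return [t for t in tokens if t not in STOPWORDS]


if __name__ == "__main__":
    sample = "BTC is UP 5% 🚀 says @someone at https://example.com #crypto"
    print(clean_tweet(sample))  # ['btc', 'up', '5', 'says', 'crypto']
```

URLs and @mentions are removed before punctuation is stripped; otherwise the ":" and "/" in a link would be replaced first and the URL pattern would no longer match.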

BTC Price Data from FTX BTC Feed

NOTE: Data Prep details in slide presentation pages 4 and 5.

Analysis Approach

Analysis Steps:

  1. sentiment.py analyzes the csv data and generates a sentiment infer csv
  2. consolidate_data.py reads in the infer csv data and the BTC data, then generates training data and plots
  3. keras_train.py consumes train_dataset.csv
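
As a rough illustration of the consolidation step, the sketch below averages per-tweet sentiment scores by hour, joins them with hourly BTC prices, and computes a Pearson correlation. It is not the code in consolidate_data.py: the function names, the hourly granularity, and the use of plain dictionaries instead of pandas are assumptions, and the numbers in the usage example are made up purely to show the call pattern.

```python
from collections import defaultdict
from datetime import datetime
from math import sqrt


def hourly_mean_sentiment(rows):
    """Average sentiment per hour. rows: iterable of (iso_timestamp, score)."""
    buckets = defaultdict(list)
    for ts, score in rows:
        # Truncate each timestamp to the start of its hour.
        hour = datetime.fromisoformat(ts).replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(score)
    return {hour: sum(v) / len(v) for hour, v in buckets.items()}


def join_on_hour(sentiment, prices):
    """Inner-join hourly sentiment with hourly prices, sorted by time."""
    hours = sorted(set(sentiment) & set(prices))
    return [(h, sentiment[h], prices[h]) for h in hours]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


if __name__ == "__main__":
    # Made-up illustrative values, not project data.
    tweets = [
        ("2022-03-23T14:05:00", 0.2),
        ("2022-03-23T14:40:00", 0.4),
        ("2022-03-23T15:10:00", -0.1),
        ("2022-03-23T16:30:00", 0.5),
    ]
    prices = {
        datetime(2022, 3, 23, 14): 42000.0,
        datetime(2022, 3, 23, 15): 41800.0,
        datetime(2022, 3, 23, 16): 42500.0,
    }
    joined = join_on_hour(hourly_mean_sentiment(tweets), prices)
    print(pearson([s for _, s, _ in joined], [p for _, _, p in joined]))
```

An inner join keeps only hours present in both series, so gaps in either the tweet data or the price feed do not produce rows with missing values.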

NOTE: Model details contained in slide presentation page 6.

Results

Figures: Sentiment 7 Day, Sentiment 24 hr, Emotions 7 Day, BTC Price (04_BTC Price.png), Wordcloud

Technologies And Modules Used

This project uses Python 3.7 and the following modules:

See installation guide below for specifics on setting up your environment.


Installation Guide

You will need Python 3.7 to run this application. An easy way to install Python 3.7 is to download and install Anaconda. After installing Anaconda, open a terminal/command prompt, set up a Python 3.7 environment, and then activate it like so:

# creating a python 3.7 environment
# name can be any friendly name to refer to your environment, eg 'dev'
conda create --name dev python=3.7 anaconda

# activating the environment
conda activate dev

Next, use pip to install the required modules from the list above:

# installing required modules
$ pip install pandas
$ pip install numpy
$ etc...

You are now ready to run the program!


Usage Notes

IMPORTANT NOTE: Twitter API Usage
You must sign up for a Twitter API key in order to authenticate and fetch Twitter data.
See config_example.py for how to stage your Twitter API bearer token.

Also, allow time to apply for an academic Twitter API key rather than relying on the free tier. Academic access significantly raises the usage and data granularity limits; the free tier restricts the amount of data you can pull.
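
For orientation, here is a minimal sketch of how a bearer token and a 7-day window might be attached to a search request. It only builds the request and does not send it. Several details are assumptions rather than what query.py or config_example.py actually do: the full URL (the v2 recent-search endpoint), the environment variable name TWITTER_BEARER_TOKEN, the function name, and the chosen tweet fields.

```python
import os
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: the v2 recent-search URL; this README only names "/2/tweets/search".
SEARCH_URL = "https://api.twitter.com/2/tweets/search/recent"


def build_search_request(query, bearer_token, days_back=7, max_results=100, now=None):
    """Build (but do not send) an authenticated search request (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    start = now - timedelta(days=days_back)
    params = {
        "query": query,
        "start_time": start.strftime("%Y-%m-%dT%H:%M:%SZ"),  # UTC timestamp
        "max_results": max_results,
        "tweet.fields": "created_at,lang",
    }
    url = SEARCH_URL + "?" + urlencode(params)
    # The bearer token travels in the Authorization header.
    return Request(url, headers={"Authorization": "Bearer " + bearer_token})


if __name__ == "__main__":
    # Hypothetical variable name; the project stages the token via config_example.py.
    token = os.environ.get("TWITTER_BEARER_TOKEN", "YOUR_TOKEN_HERE")
    req = build_search_request("bitcoin lang:en", token)
    print(req.full_url)
```

A start_time sitting exactly on the 7-day boundary may be rejected by the API, so a slightly smaller window (or a few minutes of margin) is probably safer in practice.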

WordCloud: generates a wordcloud visual from query data.

Contributors

Peter Morales
Shivangi Gupta
Jaime Aranda
David Lopez


License

MIT
