Measuring Meaning & Computational Introduction - Challenge #39

Open
jamesallenevans opened this issue Jan 12, 2021 · 35 comments

Comments

@jamesallenevans

jamesallenevans commented Jan 12, 2021

Pose a research question you would like to answer (in one, artfully worded sentence...ending with a question mark). This need not be the basis of your final project...but it could lead there. Then describe a collection of sources in a short (2-5 sentence) paragraph you would like to assemble, scrape, generate or spider (see this week’s code for examples) into a textual corpus that you believe will help you answer your stated question. Please do NOT spend time/space explaining how you will answer the question with the assembled corpus.

@hesongrun

hesongrun commented Jan 13, 2021

Research Question: What key textual information in financial announcements or news gives investors insight into companies' future performance?

Collection of Sources: Company announcements, such as 10-Ks and 10-Qs, are available on the Electronic Data Gathering, Analysis, and Retrieval system (EDGAR). Companies' earnings conference calls are available from Factiva's Fair Disclosure (FD) Wire. We can link this textual information with CRSP for stock returns or Compustat for company fundamentals.

@jacyanthis

How do the goals of artificial intelligence (accuracy, social benefit, fairness, interpretability, etc.) relate to each other and evolve over time? Sources for the "framing" (Goffman 1972) and "sensemaking" (Weick 1995) of these goals include:

  • newspaper articles about AI (e.g. NOW, ProQuest)
  • scholarly papers on AI in computer science, social science, and ethics (e.g. Scopus)
  • analyst reports (e.g. Thomson One Investext)
  • press releases (e.g. LexisNexis)
  • mission statements (e.g. company websites)
  • earnings conference calls (e.g. Refinitiv)

@william-wei-zhu

william-wei-zhu commented Jan 13, 2021

Research question: Do CEO turnovers impact company culture? If so, what factors influence the magnitude and speed of company cultural change due to CEO turnover?

Sources: Glassdoor company review data. Indeed company review data.

@toecn

toecn commented Jan 14, 2021

RQ: Research on populism has centered on the speech characteristics of populist politicians; can we inquire into the context of reception and its relationship to candidates' rhetoric using computational content analysis? In other words, what can computational text analysis tell us about political opportunities for populism? Do populist politicians disrupt the rhetorical landscape when they enter the field? Does the composition of their speech reflect the composition of their electorate?

Sources: the project will be centered on Latin America.

  • Corpora from the web: 1. Sketch Engine, 2. COW, 3. Corpus del Español
  • Candidates' speeches: an effort to collect and assemble a corpus from available speeches (ideas here would be great!!!).
  • Electoral outcomes.

@egemenpamukcu

My research question is similar to the one above: Do political leaders use hostile rhetoric against foreign nations or international organizations as a tool to increase their national support and distract their populations from domestic strife?

Sources can be popular online news sources and government webpages that publish official speeches and announcements made by politicians. To gauge the "distraction" of the public or a change in support for political leadership, social media sites like Twitter and YouTube, and political opinion blogs/forums, might be useful sources.

@RobertoBarrosoLuque

RobertoBarrosoLuque commented Jan 14, 2021

What are the best textual predictors for acceptance of papers in high impact journals?
Are such predictors shared across academic disciplines, or are there domain-specific features that predict which papers will be published in each field's most influential journals?

Sources:

  • dataset containing scholarly articles across a wide range of fields. For example this dataset on COVID literature.
  • dataset containing acceptance and rejection of papers to different scientific journals.

@medcar8879

How have the COVID pandemic and the subsequent shift to remote work/learning environments affected public sentiment towards video-communication tools (e.g., Zoom, Skype, FaceTime)?

Sources:

  • Tweets

  • Instant-messages from common messaging platforms (e.g., Slack)

@mingtao-gao

Research question: How does brand-related user-generated content (UGC) differ across theoretically categorized social media platforms? In other words, do conceptual categorizations of social media platforms in fact influence brand-related UGC and consumer engagement?

Source: Social media content scraped via public APIs, covering differently categorized platforms, including Facebook (relationship media), Twitter (self-media), Instagram (creative outlets), and Reddit (collaboration platforms).

@chiayunc

Research question: Has the legal rhetoric of international climate change law changed over its 25 years of implementation, and if so, how? Does it reflect long-standing international cooperation issues? Does it reflect the new environmental governance paradigm?

Source: UNFCCC COP decisions and resolutions.

@theoevans1

Research question: In fan communities surrounding media, how do the themes and language of fan creations or discussions shift from those of the source material?

The TV or movie corpora could provide data on source material. Fan works could be found through sites like fanfiction.net, and fan discussions could be found on Reddit, Twitter, or Tumblr.

@k-partha

k-partha commented Jan 15, 2021

Can we understand the cognitive/personality basis of political/moral philosophies through text and social media posts?

  • Twitter, Reddit, Tumblr discussion forums and profiles
  • Influential political texts associated with particular ideologies (libertarianism/anarcho-socialism/communism etc.)

@xxicheng

Research question: Over the past century, has it become easier for kids from middle-class families to engage in high culture?

Source: Musicians of Major American Orchestras on the Stokowski website (https://www.stokowski.org/), and musicians’ Wikipedia pages (if available).

@dtanoglidis

Research question: What makes visiting Paris (or Athens, or New York) feel like visiting Paris (or Athens, or New York)? What are the salient, location-specific characteristics in the way people describe their stays in different cities across the world, and are there ethnic stereotypes embedded in them?

Sources:

  • Airbnb reviews (from Inside Airbnb).
  • Tripadvisor reviews.
  • Google reviews (of landmarks).

@vfuentesc

Question: What are the tone of the statement of purpose, the rigor of the arguments, and the level of plagiarism in legislative projects in Peru? The number of legislative projects rocketed from 13k in 2018 to 97k in 2020 after the former (former) President* dissolved the entire Congress in 2018, resulting in a completely inexperienced new legislative body.

Source: scraping the Peruvian Congress website

*FYI: Peru's Congress ousted President Vizcarra in an impeachment vote in November. His successor resigned 5 days later.

@jinfei1125

Question: What's the time trend of people's anxiety about disposable income in the U.S.?
Source: Data scraped from the Personal Finance page on Reddit

A question about @william-wei-zhu's question "Do CEO turnovers impact company culture?": I think content analysis can tell us only about correlation, not causality? For example, we don't know whether the true mechanism is that CEO turnover impacts company culture or that company culture impacts CEO turnover.

@jcvotava

Question: What words, phrases, or sentiments are used by companies in advertisements in order to convince their audience that a product or service is of high value? How do these words, phrases, or sentiments vary across place (i.e. country) or time?

Sources: Scraped newspaper advertisements, transcripts of radio advertisements, or transcripts of visual (TV or Web) advertisements

@Raychanan

I have seen a lot of reports and pieces claiming that, after COVID-19, attitudes towards China changed substantially at the national level in many regions. But I want to verify whether this is true based on what people have spontaneously posted on the Internet. So the question I want to explore is: How did the negative attitudes of the "people" of various developed countries toward the Chinese change after the COVID-19 pandemic?

I think there are the following possible sources.

  1. Textual data: Twitter
  2. Corpus: Global Web-Based English (GloWbE); Corpus of Contemporary American English (COCA)

@chuqingzhao

chuqingzhao commented Jan 15, 2021

Research on labor skills and firms' innovation capability. Research Question: What is the relationship between firms' skill requirements and innovation? What kinds of skills are popular among companies that have strong innovation capability? How do labor skill requirements change over time across different industries?

Source: job posting data from Glassdoor and Indeed

@romanticmonkey

Question: Can ten years of movie reviews reflect cultural change in the movie industry, or in society in general?

Source: movie review websites like Rotten Tomatoes, perhaps combined with changes in Google search terms

@Rui-echo-Pan

Rui-echo-Pan commented Jan 15, 2021

Research Question: Donation behavior/decision
How do the emotional descriptions and themes of medical crowdfunding (MCF) influence the donation outcome (number of donations, average donation amount)? How does it interact with the demographic characteristics (race, gender, age, etc) of the campaigners/ the recipients?
To be more specific, would emotional descriptions or an authentic style be more effective in persuading others to donate?
Source: GoFundMe can be a good source from which to scrape MCF passages.

  • There are abundant narrative descriptions asking for help with urgent medical needs;
  • dependent variables: number of donations, average donation amount for each campaign;
  • other control variables: monetary goal, money raised, percent of goal reached, number of donations, largest single donation, shares, followers, comments; days from starting; the number of pictures, videos, updates

@sabinahartnett

RQ: What language/themes are propagated by extremist-affiliated blogs, social media accounts and other news outlets? And how might these trends seep into and be reflected in online civic discourse?

Sources:

  • Social media platforms - via public API (incl. Twitter, Youtube, FB)
  • Instant-messaging platforms (incl. WhatsApp, Slack)
  • Parler, Gab, other host sites for extreme content
  • ** @donk_enby’s scraping of Parler postings

@Bin-ary-Li

Bin-ary-Li commented Jan 15, 2021

Research Question: How do the authorship, provenance, and artist statement of an artwork influence its auction value in the market?

Sources:

  • Scraping artwork auction entries from fine art auction houses like Sotheby's and Christie's.
  • Scraping artists' bios from their Wikipedia entries
  • Getty Provenance Index

@MOTOKU666

RQ: Would COVID change companies' employment decisions regarding people with a certain cultural or national background?

Sources:

  • Glassdoor reviews
  • Social media platform comments

@ming-cui

Research Question: Will people who have intensive communications/connections with high-status individuals gain status?

Sources: Comments, likes, retweets on Twitter. Connections on LinkedIn. Likes on Facebook.

@dtmlinh

dtmlinh commented Jan 15, 2021

Research Question: How have the lyrics and sentence structures of popular songs changed over time?

Sources: Song lyrics, song metadata (artist, year, genre, etc.), song rankings

@zshibing1

zshibing1 commented Jan 15, 2021

How does China's Social Credit System (SCS) construct the idea of trustworthiness (xinyong or chenxin) in its discourses and practices?
Sources include the SCS's policy and legal documents and published records of businesses and individuals on the city-level and national SCS online platforms.

@keiraou

keiraou commented Jan 15, 2021

Research Question: What are the patterns (emotional descriptions, themes) of Trump's tweets and their spread?
Sources: https://www.thetrumparchive.com/ ; Twitter API

@YijingZhang-98

RQ: How does COVID-19 affect labor market demand?

Sources: scraping job posting information from leading hiring websites, such as https://www.51job.com/

@yushiouwillylin

Research Question:
Will average lyric sentiments or topics in a certain time period be influenced by economic conditions? (Could be lagged or actually predict future economic conditions, who knows?)

Source:

@acozzubo

Research question
How do the "pro-life" and "pro-choice" movements present their arguments? Has this changed over time? Is there any middle ground between their positions?

Sources

  • Social media (Facebook, Twitter, Instagram, or TikTok) posts of each side
  • Social media followers and interactions
  • Statements from each side's leaders and their organizations
  • Statements from political parties or leaders that openly identify with one side

@jetienne6

Research question: How do white parents who participated in Black Lives Matter protests participate in the diversity, equity, and inclusion initiatives of their students' schools and/or school districts?

Sources:

  • social media posts on Facebook, Twitter, Instagram
  • social media following lists
  • notes from school district meetings, PTA meetings
  • attendance lists from district, PTA meetings

@MayarakQuintero

Research question:
What are the best predictors for mental distress (e.g. depression, anxiety)? Are these predictors different based on geography or cultural patterns?

Sources:

  • Social media posts and interactions on Twitter and Facebook

@mkjang17

Question: How does information on product description pages relate to consumer perceptions about the products, in particular, how people categorize products, or some dimensions of the products, such as price?

Sources: Amazon.com, Bestbuy.com, etc.

@Atkinson1

Research question: How do local news/newspaper comment sections on national news stories differ in emotionally charged language relative to comment sections for major city newspapers? Does emotionally charged language significantly vary between either source and/or type of news story?

Sources: local news/newspapers, geographically major city newspaper counterparts (NY Times, LA Times, Chicago Tribune, etc.), scholarly research, language analysis software/methods.

@anariaah

Research question: How do organizations in the STEM industry frame their diversity and inclusion policies? How has the framing evolved over time?
Sources: diversity statements and reports from companies (web scraping)
