Collaboration_Network

Assignment

Build a network of co-occurring cities in publication affiliations. Visualise the network and find interesting relationships. Which are the strongest co-occurrences across countries? Does geography play a role? Use relative optionally log-odds ratios in your analysis. Draw the network on a map.
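As a rough illustration of what "co-occurring cities" could mean here, the sketch below counts how often two cities appear on the same publication. It assumes each paper has already been reduced to a list of city names; the function name count_city_pairs and the sample data are made up for this example and are not taken from the repository.

```python
from collections import Counter
from itertools import combinations


def count_city_pairs(papers):
    """Count how often two cities appear together on the same paper.

    `papers` is an iterable of city-name collections, one per publication.
    """
    pair_counts = Counter()
    for cities in papers:
        # Deduplicate and sort so (A, B) and (B, A) map to the same edge.
        unique = sorted(set(cities))
        pair_counts.update(combinations(unique, 2))
    return pair_counts


# Made-up sample data: one list of affiliation cities per paper.
papers = [
    ["Berlin", "Paris", "Berlin"],
    ["Berlin", "Paris"],
    ["Paris", "Tokyo"],
    ["Vienna"],
]
edges = count_city_pairs(papers)
for (city_a, city_b), weight in edges.most_common():
    print(city_a, city_b, weight)
```

Each key of the resulting counter can serve as a weighted edge of the network. Papers with a single city add no edges.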

How to use

python3 -m venv .venv # create virtual environment
source .venv/bin/activate # activate virtual environment
pip install -r requirements.txt # install dependencies
python data_collection.py # to fetch some papers and coordinates
python main.py # to build graph, run visualization, ...

Papers and coordinates fetched from the API can be saved locally with the methods in file_io.py. The saved files are stored in data/papers and data/coordinates.
A list of country names, coordinates, codes and alternative names can be found in data/dictionaries.
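A dictionary of country names and alternative names can help with location extraction. The sketch below is one possible heuristic, not the repository's implementation: it assumes comma-separated affiliations with the country near the end and the city in the segment just before it. The COUNTRY_ALIASES table and the function guess_location are hypothetical, and the real files in data/dictionaries may be structured differently.

```python
import re

# Hypothetical alias table; the real dictionary files may look different.
COUNTRY_ALIASES = {
    "germany": "DE",
    "deutschland": "DE",
    "united states": "US",
    "usa": "US",
    "austria": "AT",
}


def guess_location(affiliation):
    """Guess (city, country_code) from an affiliation string, or return None."""
    # Remove e-mail addresses, which often trail the affiliation.
    cleaned = re.sub(r"\S+@\S+", "", affiliation)
    parts = [p.strip(" .;") for p in cleaned.split(",")]
    parts = [p for p in parts if p]
    # Scan from the end, because the country is usually one of the last segments.
    for i in range(len(parts) - 1, 0, -1):
        code = COUNTRY_ALIASES.get(parts[i].lower())
        if code:
            # Assume the city is the segment before the country; drop postal codes.
            city = re.sub(r"\b[\dA-Z]*\d[\dA-Z-]*\b", "", parts[i - 1]).strip()
            return (city, code) if city else None
    return None


print(guess_location("Department of Physics, University of Vienna, 1090 Vienna, Austria."))
print(guess_location("Some Lab without location"))
```

Affiliations that do not follow this layout return None, so the share of None results gives a rough measure of how many affiliations the heuristic misses.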

Questions

How should we use log odds ratios? What is meant by "relative optionally"?
Concerning "interesting relationships" in the assignment: one idea we had was to find out which countries are particularly strong in research on a specific topic. Are we on the right track? Would that be a use case for log odds ratios?
Is it crucial to have great performance when analyzing the data? How big should the dataset be?
How much accuracy is expected when extracting city names? Probably there will always be affiliations where we won't be able to extract a city and country. (Affiliations don't have a uniform format.)
(How to fetch all papers from PubMed at once?)
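Regarding the first question, one common reading of a log odds ratio for two cities A and B is sketched below. It is our assumption, not something the assignment confirms. Papers are split into four groups (both cities, only A, only B, neither), and the ratio is ln((both * neither) / (only A * only B)). A value above 0 means the two cities co-occur more often than their individual frequencies would suggest. The function name and the default smoothing of 0.5 per cell are choices made for this example.

```python
import math


def log_odds_ratio(n_both, n_a, n_b, n_total, smoothing=0.5):
    """Log odds ratio of cities A and B appearing on the same paper.

    n_both:  papers listing both A and B
    n_a:     papers listing A (with or without B)
    n_b:     papers listing B (with or without A)
    n_total: all papers in the dataset
    """
    a_only = n_a - n_both
    b_only = n_b - n_both
    neither = n_total - n_a - n_b + n_both
    # Add a small constant to every cell so that empty cells do not
    # cause a division by zero or log(0).
    s = smoothing
    return math.log(((n_both + s) * (neither + s)) / ((a_only + s) * (b_only + s)))


# Made-up counts: 100 papers, each city on 50 of them.
print(log_odds_ratio(n_both=40, n_a=50, n_b=50, n_total=100))  # above 0
print(log_odds_ratio(n_both=25, n_a=50, n_b=50, n_total=100))  # 0, independent
print(log_odds_ratio(n_both=5, n_a=50, n_b=50, n_total=100))   # below 0
```

Unlike raw co-occurrence counts, this measure does not automatically favour large cities that appear on many papers, which may be what "relative" refers to in the assignment.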

TODO

improve city visualization (Florian)
visualize relationships between countries (Florian)
improve location extraction (Oliver)
make coords fetching faster (persist city coordinates) (Albrecht)
try to find interesting relationships (e.g. per research field) (Markus)
special tasks for Karl (Karl)
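
For the coordinate-fetching item above, one possible approach is a small on-disk cache, so that each city is looked up at most once across runs. This is only a sketch: the class CoordinateCache, the JSON file layout and the fake_fetch stand-in for the real API call are all assumptions and do not reflect what file_io.py currently does.

```python
import json
import os
import tempfile


class CoordinateCache:
    """Keep city coordinates in a JSON file and fetch only unknown cities."""

    def __init__(self, path, fetch):
        self.path = path
        self.fetch = fetch  # callable: city name -> (lat, lon)
        self.data = {}
        if os.path.exists(path):
            with open(path, encoding="utf-8") as fh:
                self.data = json.load(fh)

    def get(self, city):
        if city not in self.data:
            # Cache miss: call the (slow) fetcher once and persist the result.
            self.data[city] = list(self.fetch(city))
            self._save()
        return tuple(self.data[city])

    def _save(self):
        with open(self.path, "w", encoding="utf-8") as fh:
            json.dump(self.data, fh, ensure_ascii=False, indent=2)


# Demonstration with a fake fetcher that records how often it is called.
calls = []


def fake_fetch(city):
    calls.append(city)
    return (48.2082, 16.3738)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "coordinates.json")
    first = CoordinateCache(path, fake_fetch)
    first.get("Vienna")
    first.get("Vienna")  # served from memory
    second = CoordinateCache(path, fake_fetch)
    coords = second.get("Vienna")  # served from the file written earlier

print(len(calls), coords)
```

Writing the file on every miss keeps the sketch simple. For large batches it would be cheaper to save once at the end of a run.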

MarkusKramer1/Collaboration_Network

Folders and files

Latest commit

History

Repository files navigation

Collaboration_Network

Assignment

How to use

Questions

TODO

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages