version control: what should be on git, directory structure, .gitignore #1

tavareshugo · 2022-05-13T09:22:45Z

Covering three somewhat related points (sorry if I missed these in the materials somewhere):

Clarify that data should not be under version control
Add some suggestions about organising files/folders within a project directory
Add note about .gitignore and its uses

Add a note somewhere about what files should or should not be under version control with Git.
Mainly, data files and binary documents should live elsewhere (e.g. data repositories released with publications).

Large files that can be recreated from code do not need to be under version control: the code that generates them is what gets versioned, and it can be run to recreate them.

We could give some advice on structuring a project directory that ties in with the version control process, insisting that raw data should be left alone and that, as much as possible, results should be reproducible from code.
For example:

my_project
   |_ data        # raw data, never changed (preferably read-only)
   |_ documents   # miscellaneous documents such as Word files, presentations, etc.
   |_ results     # results of the analysis (these files can usually be re-created from data + code)
   |_ scripts     # analysis code
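
If it helps the materials, the layout above could be set up with a few lines of code. This is only a sketch, assuming Python 3 is available; the function names `create_project` and `make_read_only` are made up for illustration and are not part of any existing material.

```python
from pathlib import Path
import stat

# Folder names taken from the example layout above.
SUBDIRS = ("data", "documents", "results", "scripts")


def create_project(root):
    """Create the project folder and its four subfolders; return the root path."""
    root = Path(root)
    for name in SUBDIRS:
        (root / name).mkdir(parents=True, exist_ok=True)
    return root


def make_read_only(folder):
    """Remove write permission from every regular file below `folder`."""
    write_bits = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH
    for path in Path(folder).rglob("*"):
        if path.is_file():
            path.chmod(path.stat().st_mode & ~write_bits)
```

For instance, `create_project("my_project")` followed by `make_read_only("my_project/data")` (once the raw files have been copied in) would give the structure above with the raw data protected against accidental edits. On Windows, `chmod` only toggles the read-only flag, so the effect is more limited there.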

And with this we could motivate and introduce the .gitignore file; for the example above we could have:

data
documents
results
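
As a possible illustration, the ignore file could also be written from a script. This is a minimal sketch, assuming Python 3; `write_gitignore` is a hypothetical helper name. One deliberate difference from the list above: the patterns carry a trailing slash, which in .gitignore syntax restricts a pattern to directories.

```python
from pathlib import Path

# Folders ignored in the example above. The trailing slash is an addition
# in this sketch: it makes each pattern match directories only.
IGNORED = ("data/", "documents/", "results/")


def write_gitignore(project_root, patterns=IGNORED):
    """Write one ignore pattern per line to <project_root>/.gitignore."""
    gitignore = Path(project_root) / ".gitignore"
    gitignore.write_text("\n".join(patterns) + "\n")
    return gitignore
```

After running this in a Git repository, `git status --ignored` lists the paths being ignored, and `git check-ignore -v <path>` reports which pattern is responsible for a given path.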
