updates to paper production to allow for separate paper information yaml and automated manuscript PDF generation using R script
AaronGullickson committed May 27, 2019
1 parent 6d56826 commit 095378a
Showing 9 changed files with 1,423 additions and 49 deletions.
6 changes: 4 additions & 2 deletions products/paper/README.md
@@ -2,9 +2,11 @@

This directory contains all the files necessary to produce a final manuscript.

When possible, I attempt to write the manuscript in R Markdown. The `main.Rmd` is used to write the main manuscript. I use a custom tex template [adapted from Steven V. Miller](http://svmiller.com/blog/2016/02/svm-r-markdown-manuscript/) to produce the R Markdown in PDF format. You can define a variety of options in the YAML header to get a different look. The template itself contains more details about all of these options. Most importantly, the `anonymous` option can be changed to true to remove author names and change to the crappy double spaced and ragged edged manuscript that most journals expect. You can see a version of what a nicely formatted version of this template looks like [here](http://pages.uoregon.edu/aarong/assets/fullmanuscript.pdf) and a version that has been anonymized [here](http://pages.uoregon.edu/aarong/assets/fullmanuscript_submission.pdf). I also include an MS Word template that can be used to knit to Word when necessary.
When possible, I attempt to write the manuscript in R Markdown. The `main.Rmd` is used to write the main manuscript. I use a custom tex template [adapted from Steven V. Miller](http://svmiller.com/blog/2016/02/svm-r-markdown-manuscript/) to produce the R Markdown in PDF format. You can define a variety of options in the YAML header to get a different look. The template itself contains more details about all of these options. Most importantly, the `anonymous` option can be changed to true to remove author names and change to the crappy double spaced and ragged edged manuscript that most journals expect. You can see a version of what a nicely formatted version of this template looks like [here](https://aarongullickson.github.io/assets/fullmanuscript.pdf) and a version that has been anonymized [here](https://aarongullickson.github.io/assets/fullmanuscript_submission.pdf). I also include an MS Word template that can be used to knit to Word when necessary.

I sometimes just write the manuscript using `main.Rmd`. However, sometimes it is preferable to keep separate Rmd files for tables and figures and any technical appendices. Journals often want tables and figures at the end of the document and sometimes I just don't want my main document to be slowed down by the processing of code. Starter code for these two R Markdown files are also located in this directory. When I do it this way, the three PDFs can be combined together with the `combine_pdfs.py` python script. This will create a `full_manuscript.pdf` file that includes all PDFs. Users will have to have the [PyPDF2 library](https://github.com/mstamy2/PyPDF2) installed in order for this script to work. In order for page numbering to be accurate, the pagenumber variable will also need to be adjusted for the supplementary R Markdown files.
I keep information on title, authors, affiliations, abstract, and acknowledgements in a separate `paper_info.yaml`. This information will be added to the `main.Rmd` output when knit. This way the same information can also be used to generate a title page pdf with `title_page.Rmd` without having to copy everything over. Such title pages are often required by journals (god knows why we need them in this day and age).

I sometimes just write the manuscript using `main.Rmd`. However, sometimes it is preferable to keep separate Rmd files for tables and figures and any technical appendices. Journals often want tables and figures at the end of the document and sometimes I just don't want my main document to be slowed down by the processing of code. Starter code for these two R Markdown files is also located in this directory. When I do it this way, the three PDFs can be combined together with the `generate_full_paper.R` script. This will create a `full_manuscript.pdf` file that includes all PDFs. One of the nice features of this script is that it will adjust the page numbers of the additional R Markdown files when it knits so that the final document will have proper pagination throughout. At the top of this script are two boolean options to indicate whether you want a particular thing to be knit and added to the full manuscript.
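
The page-number adjustment amounts to a running sum of page counts. The sketch below is only an illustration of that arithmetic in base R: the function name `start_pages` and the page counts are made up for the example and are not part of this repository.

```r
# Illustrative sketch only: `start_pages` is a hypothetical helper.
# Given the page count of each PDF in the order it will be combined,
# return the page number on which each PDF should start.
start_pages <- function(page_counts) {
  if (length(page_counts) == 0) {
    return(numeric(0))
  }
  # The first PDF starts on page 1; each later PDF starts one page
  # after the cumulative length of everything before it.
  starts <- cumsum(c(1, head(unname(page_counts), -1)))
  setNames(starts, names(page_counts))
}

# Assumed page counts, purely for demonstration
counts <- c(main = 30, tablesfigs = 8, appendix = 5)
print(start_pages(counts))
```

With these assumed counts, the tables and figures would start on page 31 and the appendix on page 39.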

In order to create tables and figures, I usually just load constructed data from the analysis side of the directory using relative paths from within R Markdown. Something like:

6 changes: 1 addition & 5 deletions products/paper/appendix.Rmd
@@ -5,12 +5,8 @@ output:
template: ./resources/aog-latex-ms.tex
fontfamily: mathpazo
fontsize: 11pt
anonymous: false
endnotes: false
appendix: true
appendixletter: A
#change page number for the final pdf
pagenumber: 1
---

#Technical Appendix
# Technical Appendix
28 changes: 0 additions & 28 deletions products/paper/combine_pdfs.py

This file was deleted.

45 changes: 45 additions & 0 deletions products/paper/generate_full_paper.R
@@ -0,0 +1,45 @@
#a script to knit and produce the full document

#You can specify below what additional Rmd files you want to be added to
#main.pdf for the full manuscript (currently tables and figures and appendix).
#The program will change page numbering for the supplementary pdfs so there
#is no need to change page numbers in the YAML. The script will also
#create a title page since many journals require this.

separate_tabfigs <- TRUE
separate_appendix <- TRUE

library(pdftools)
library(rmarkdown)

# Knit PDFs ---------------------------------------------------------------

#first remove all existing PDF files
system("rm *.pdf")

#assume main.Rmd exists or no point to this
render("main.Rmd")

#now render additional pdfs
list_of_pdfs <- "main.pdf"
start_page <- pdf_length("main.pdf")+1
if(separate_tabfigs & file.exists("tablesfigs.Rmd")) {
  render("tablesfigs.Rmd",
         output_options = list(pandoc_args = c(paste("--metadata=pagenumber:",
                                                     start_page, sep=" "))))
  start_page <- start_page + pdf_length("tablesfigs.pdf")
  list_of_pdfs <- c(list_of_pdfs, "tablesfigs.pdf")
}
if(separate_appendix & file.exists("appendix.Rmd")) {
  render("appendix.Rmd",
         output_options = list(pandoc_args = c(paste("--metadata=pagenumber:",
                                                     start_page, sep=" "))))
  list_of_pdfs <- c(list_of_pdfs, "appendix.pdf")
}

#now do the title page
if(file.exists("title_page.Rmd")) {
  render("title_page.Rmd")
}

pdf_combine(list_of_pdfs, output = "full_manuscript.pdf")
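
One note on the cleanup step in this script: `system("rm *.pdf")` depends on a Unix-style shell being available. A possible alternative, sketched here under the assumption that only PDFs in a single directory need to be removed, uses base R file functions instead. The helper name `remove_pdfs` is hypothetical and does not appear in the commit.

```r
# Illustrative sketch only: `remove_pdfs` is a hypothetical helper.
# Deletes every file ending in ".pdf" in `dir` without calling a shell.
remove_pdfs <- function(dir = ".") {
  pdfs <- list.files(dir, pattern = "\\.pdf$", full.names = TRUE)
  # file.remove() returns one logical per file; an empty vector is fine
  invisible(file.remove(pdfs))
}

# Demonstration in a throwaway directory so no real files are touched
demo_dir <- tempfile("pdf_demo")
dir.create(demo_dir)
file.create(file.path(demo_dir, c("main.pdf", "appendix.pdf", "main.Rmd")))
remove_pdfs(demo_dir)
print(list.files(demo_dir))
```

In the demonstration only the `.Rmd` file should remain, since the pattern matches the `.pdf` extension alone.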
14 changes: 4 additions & 10 deletions products/paper/main.Rmd
@@ -1,27 +1,21 @@
---
output:
  pdf_document:
    pandoc_args: './paper_info.yaml'
    citation_package: natbib
    fig_caption: yes
    template: ./resources/aog-latex-ms.tex
  word_document:
    pandoc_args: './paper_info.yaml'
    reference_docx: ./resources/aog_word_style.docx
  html_document:
    pandoc_args: './paper_info.yaml'
fontfamily: mathpazo
fontsize: 11pt
anonymous: false
endnotes: false
pagenumber: 1
bibliography: ../project.bib
biblio-style: ./resources/ajs.bst
title: "The Title"
author:
- affiliation: University of Oregon, Sociology
  name: Aaron Gullickson
- affiliation: Some Research University
  name: My co-author
keywords: keywords
thanks: Thanks to people and stuff
abstract: This is a test abstract
---

# Introduction
12 changes: 12 additions & 0 deletions products/paper/paper_info.yaml
@@ -0,0 +1,12 @@
---
title: "The Title"
author:
- affiliation: University of Oregon, Sociology
  name: Aaron Gullickson
- affiliation: Some Research University
  name: My co-author
keywords: keywords
#FYI: the thanks and abstract are easier to view and edit in Atom which does proper linewrapping
thanks: Thanks to people and stuff
abstract: This is a test abstract
---
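
The shared metadata file and the page-number override both reach pandoc as extra arguments, so they could be assembled in one place. The following is a minimal sketch under that assumption; `paper_pandoc_args` is a hypothetical helper, not something this commit defines, and the commented `render()` call is only a suggestion of how it might be used.

```r
# Illustrative sketch only: `paper_pandoc_args` is a hypothetical helper.
# Builds the pandoc arguments for a supplementary document: the shared
# metadata file plus, optionally, a starting page number.
paper_pandoc_args <- function(info_file = "./paper_info.yaml",
                              start_page = NULL) {
  args <- info_file
  if (!is.null(start_page)) {
    # paste0() avoids a stray space between the key and the value
    args <- c(args, paste0("--metadata=pagenumber:", start_page))
  }
  args
}

print(paper_pandoc_args(start_page = 31))

# Possible use with rmarkdown (not run here):
# rmarkdown::render("appendix.Rmd",
#                   output_options = list(pandoc_args = paper_pandoc_args(start_page = 31)))
```

Whether a supplementary file should also receive `paper_info.yaml` depends on the template, so treat the default `info_file` as an assumption.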