updates to paper production to allow for separate paper information yaml and automated manuscript PDF generation using R script
AaronGullickson · May 27, 2019 · 095378a
1 parent 6d56826

Showing 9 changed files with 1,423 additions and 49 deletions.
diff --git a/products/paper/README.md b/products/paper/README.md
@@ -2,9 +2,11 @@
 
 This directory contains all the files necessary to produce a final manuscript. 
 
-When possible, I attempt to write the manuscript in R Markdown. The `main.Rmd` is used to write the main manuscript. I use a custom tex template [adapated from Steven V. Miller](http://svmiller.com/blog/2016/02/svm-r-markdown-manuscript/) to produce the R Markdown in PDF format. You can define a variety of options in the YAML header to get a different look. The template itself contains more details about all of these options. Most importantly, the `anonymous` option can be changed to true to remove author names and change to the crappy double spaced and ragged edged manuscript that most journals expect. You can see a version of what a nicely formatted version of this template looks like [here](http://pages.uoregon.edu/aarong/assets/fullmanuscript.pdf) and a version that has been anonymized [here](http://pages.uoregon.edu/aarong/assets/fullmanuscript_submission.pdf). I also include an MS Word template that can be used to knit to Word when necessary.  
+When possible, I attempt to write the manuscript in R Markdown. The `main.Rmd` is used to write the main manuscript. I use a custom tex template [adapted from Steven V. Miller](http://svmiller.com/blog/2016/02/svm-r-markdown-manuscript/) to produce the R Markdown in PDF format. You can define a variety of options in the YAML header to get a different look. The template itself contains more details about all of these options. Most importantly, the `anonymous` option can be changed to `true` to remove author names and change to the crappy double-spaced and ragged-edged manuscript that most journals expect. You can see a nicely formatted version of this template [here](https://aarongullickson.github.io/assets/fullmanuscript.pdf) and a version that has been anonymized [here](https://aarongullickson.github.io/assets/fullmanuscript_submission.pdf). I also include an MS Word template that can be used to knit to Word when necessary.  
 
-I sometimes just write the manuscript using `main.Rmd`. However, sometimes it is preferable to keep separate Rmd files for tables and figures and any technical appendices. Journals often want tables and figures at the end of the document and sometimes I just don't want my main document to be slowed down by the processing of code. Starter code for these two R Markdown files are also located in this directory. When I do it this way, the three PDFs can be combined together with the `combine_pdfs.py` python script. This will create a `full_manuscript.pdf` file that includes all PDFs. Users will have to have the [PyPDF2 library](https://github.com/mstamy2/PyPDF2) installed in order for this script to work. In order for page numbering to be accurate, the pagenumber variable will also need to be adjusted for the supplementary R Markdown files. 
+I keep information on title, authors, affiliations, abstract, and acknowledgements in a separate `paper_info.yaml`. This information will be added to the `main.Rmd` output when knit. This way the same information can also be used to generate a title page pdf with `title_page.Rmd` without having to copy everything over. Such title pages are often required by journals (god knows why we need them in this day and age). 
+
+I sometimes just write the manuscript using `main.Rmd`. However, sometimes it is preferable to keep separate Rmd files for tables and figures and any technical appendices. Journals often want tables and figures at the end of the document and sometimes I just don't want my main document to be slowed down by the processing of code. Starter code for these two R Markdown files is also located in this directory. When I do it this way, the three PDFs can be combined together with the `generate_full_paper.R` script. This will create a `full_manuscript.pdf` file that includes all PDFs. One of the nice features of this script is that it will adjust the page numbers of the additional R Markdown files when it knits so that the final document will have proper pagination throughout. At the top of this script are two boolean options, `separate_tabfigs` and `separate_appendix`, that indicate whether each supplementary file should be knit and added to the full manuscript. 
 
 In order to create tables and figures, I usually just load constructed data from the analysis side of the directory using relative paths from within R Markdown. Something like:
 

diff --git a/products/paper/appendix.Rmd b/products/paper/appendix.Rmd
@@ -5,12 +5,8 @@ output:
     template: ./resources/aog-latex-ms.tex
 fontfamily: mathpazo
 fontsize: 11pt
-anonymous: false
-endnotes: false
 appendix: true
 appendixletter: A
-#change page number for the final pdf
-pagenumber: 1
 ---
 
-#Technical Appendix
+# Technical Appendix
diff --git a/products/paper/combine_pdfs.py b/products/paper/combine_pdfs.py
diff --git a/products/paper/generate_full_paper.R b/products/paper/generate_full_paper.R
@@ -0,0 +1,45 @@
+#a script to knit and produce the full document
+
+#You can specify below what additional Rmd files you want to be added to
+#main.pdf for the full manuscript (currently tables and figures and appendix).
+#The program will change page numbering for the supplementary pdfs so there
+#is no need to change page numbers in the YAML. The script will also
+#create a title page since many journals require this. 
+
+separate_tabfigs <- TRUE
+separate_appendix <- TRUE
+
+library(pdftools)
+library(rmarkdown)
+
+# Knit PDFs ---------------------------------------------------------------
+
+#first remove all existing PDF files (portable, no shell call needed)
+unlink(Sys.glob("*.pdf"))
+
+#assume main.Rmd exists or no point to this
+render("main.Rmd")
+
+#now render additional pdfs
+list_of_pdfs <- "main.pdf"
+start_page <- pdf_length("main.pdf")+1
+if(separate_tabfigs && file.exists("tablesfigs.Rmd")) {
+  render("tablesfigs.Rmd", 
+         output_options = list(pandoc_args = c(paste("--metadata=pagenumber:", 
+                                                     start_page, sep=" "))))
+  start_page <- start_page + pdf_length("tablesfigs.pdf")
+  list_of_pdfs <- c(list_of_pdfs, "tablesfigs.pdf")
+}
+if(separate_appendix && file.exists("appendix.Rmd")) {
+  render("appendix.Rmd", 
+         output_options = list(pandoc_args = c(paste("--metadata=pagenumber:", 
+                                                     start_page, sep=" "))))
+  list_of_pdfs <- c(list_of_pdfs, "appendix.pdf")
+}
+
+#now do the title page
+if(file.exists("title_page.Rmd")) {
+  render("title_page.Rmd")
+}
+
+pdf_combine(list_of_pdfs, output = "full_manuscript.pdf")
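
The page-numbering logic in `generate_full_paper.R` amounts to a running total: each supplementary PDF starts one page after everything combined before it. A minimal sketch of that arithmetic in base R, separated from the knitting itself, might look like the following. The helper `start_pages()` and the page counts are hypothetical and are not part of this commit; in the script the counts come from `pdf_length()`.

```r
# Hypothetical helper (not part of the commit): given the page count of each
# PDF in the order it will be combined, return the page number on which each
# one should start.
start_pages <- function(page_counts) {
  stopifnot(is.numeric(page_counts), all(page_counts >= 0))
  # the first document starts on page 1; each later one starts one page
  # after the cumulative total of the documents before it
  starts <- c(1, 1 + cumsum(page_counts))[seq_along(page_counts)]
  names(starts) <- names(page_counts)
  starts
}

# Made-up page counts, for illustration only
counts <- c(main = 20, tablesfigs = 6, appendix = 4)
print(start_pages(counts))
```

With these made-up counts, the tables and figures would start on page 21 and the appendix on page 27, which are the values the script would pass to pandoc as the `pagenumber` metadata.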
diff --git a/products/paper/main.Rmd b/products/paper/main.Rmd
@@ -1,27 +1,21 @@
 ---
 output:
   pdf_document:
+    pandoc_args: './paper_info.yaml'
     citation_package: natbib
     fig_caption: yes
     template: ./resources/aog-latex-ms.tex
   word_document:
+    pandoc_args: './paper_info.yaml'
     reference_docx: ./resources/aog_word_style.docx
+  html_document:
+    pandoc_args: './paper_info.yaml'
 fontfamily: mathpazo
 fontsize: 11pt
 anonymous: false
 endnotes: false
-pagenumber: 1
 bibliography: ../project.bib
 biblio-style: ./resources/ajs.bst
-title: "The Title"
-author:
-- affiliation: University of Oregon, Sociology
-  name: Aaron Gullickson
-- affiliation: Some Research University
-  name: My co-author
-keywords: keywords
-thanks: Thanks to people and stuff
-abstract: This is a test abstract
 ---
 
 # Introduction

diff --git a/products/paper/paper_info.yaml b/products/paper/paper_info.yaml
@@ -0,0 +1,12 @@
+---
+title: "The Title"
+author:
+- affiliation: University of Oregon, Sociology
+  name: Aaron Gullickson
+- affiliation: Some Research University
+  name: My co-author
+keywords: keywords
+#FYI: the thanks and abstract are easier to view and edit in Atom which does proper linewrapping
+thanks: Thanks to people and stuff
+abstract: This is a test abstract
+---