markup: add --citeproc to pandoc converter #9953

Open
wants to merge 1 commit into
base: master
56 changes: 54 additions & 2 deletions docs/content/en/content-management/formats.md
@@ -43,7 +43,7 @@ Hugo passes reasonable default arguments to these external helpers by default:

- `asciidoctor`: `--no-header-footer -`
- `rst2html`: `--leave-comments --initial-header-level=2`
- `pandoc`: `--mathjax`
- `pandoc`: `--mathjax` and, for pandoc >= 2.11, `--citeproc`

{{% note %}}
Because additional formats are external commands, generation performance will rely heavily on the performance of the external tool you are using. As this feature is still in its infancy, feedback is welcome.
@@ -106,7 +106,59 @@ parameters. Run Hugo with `-v`. You will get an output like
INFO 2019/12/22 09:08:48 Rendering book-as-pdf.adoc with C:\Ruby26-x64\bin\asciidoctor.bat using asciidoc args [--no-header-footer -r asciidoctor-html5s -b html5s -r asciidoctor-diagram --base-dir D:\prototypes\hugo_asciidoc_ddd\docs -a outdir=D:\prototypes\hugo_asciidoc_ddd\build -] ...
```

## Learn markdown
### External Helper Pandoc

[Pandoc](https://pandoc.org) is a universal document converter and can be used to convert Markdown files.
Hugo passes the `--mathjax` command-line option to Pandoc, so you can write LaTeX-style math:

```
---
title: Math document
---

Some inline math: $a^2 + b^2 = c^2$.
```

This will render in your HTML as:

```
<p>Some inline math: <span class="math inline">\(a^2 + b^2 = c^2\)</span></p>
```
You will have to [add MathJax](https://www.mathjax.org/#gettingstarted) to your template to properly render the math.

For **Pandoc >= 2.11**, you can use [citations](https://pandoc.org/MANUAL.html#extension-citations).
One way is to employ [BibTeX files](https://en.wikibooks.org/wiki/LaTeX/Bibliography_Management#BibTeX) to cite:

```
---
title: Citation document
---
---
bibliography: assets/bibliography.bib
...
This is a citation: @Doe2022
```

Note that Hugo does **not** pass its own YAML metadata block to Pandoc. It does pass the **second** metadata block, delimited by `---` and `...`, so all Pandoc settings should go there.

You can also add all elements from a bibliography file (without citing them explicitly) using:

```
---
title: My Publications
---
---
bibliography: assets/bibliography.bib
nocite: |
@*
...
```

It is also possible to provide a custom [CSL style](https://citationstyles.org/authors/) by passing `csl: path-to-style.csl` as a Pandoc option.


## Learn Markdown

Markdown syntax is simple enough to learn in a single sitting. The following are excellent resources to get you up and running:

72 changes: 71 additions & 1 deletion markup/pandoc/convert.go
@@ -15,10 +15,14 @@
package pandoc

import (
"bytes"
"strconv"
"strings"
"sync"

"github.com/gohugoio/hugo/common/hexec"
"github.com/gohugoio/hugo/htesting"
"github.com/gohugoio/hugo/identity"

"github.com/gohugoio/hugo/markup/converter"
"github.com/gohugoio/hugo/markup/internal"
)
@@ -65,6 +69,9 @@ func (c *pandocConverter) getPandocContent(src []byte, ctx converter.DocumentCon
return src, nil
}
args := []string{"--mathjax"}
if supportsCitations(c.cfg) {
args = append(args, "--citeproc")
}
return internal.ExternallyRenderContent(c.cfg, ctx, src, binaryName, args)
}

@@ -77,6 +84,69 @@ func getPandocBinaryName() string {
return ""
}

type pandocVersion struct {
major, minor int64
}

func (left pandocVersion) greaterThanOrEqual(right pandocVersion) bool {
return left.major > right.major || (left.major == right.major && left.minor >= right.minor)
}

var versionOnce sync.Once
var foundPandocVersion pandocVersion

// getPandocVersion parses the pandoc version output
func getPandocVersion(cfg converter.ProviderConfig) (pandocVersion, error) {
var err error

versionOnce.Do(func() {
argsv := []any{"--version"}

var out bytes.Buffer
argsv = append(argsv, hexec.WithStdout(&out))

cmd, err := cfg.Exec.New(pandocBinary, argsv...)
if err != nil {
cfg.Logger.Errorf("Could not call pandoc: %v", err)
foundPandocVersion = pandocVersion{0, 0}
return
}

err = cmd.Run()
if err != nil {
cfg.Logger.Errorf("%s --version: %v", pandocBinary, err)
foundPandocVersion = pandocVersion{0, 0}
return
}

outbytes := bytes.Replace(out.Bytes(), []byte("\r"), []byte(""), -1)
output := strings.Split(string(outbytes), "\n")[0]
// Split, e.g., "pandoc 2.5" into 2 and 5 and convert them to integers
versionStrings := strings.Split(strings.Split(output, " ")[1], ".")
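majorVersion, err := strconv.ParseInt(versionStrings[0], 10, 64)
if err != nil {
// Report through the configured logger instead of the println builtin.
cfg.Logger.Errorf("Could not parse pandoc major version: %v", err)
}
minorVersion, err := strconv.ParseInt(versionStrings[1], 10, 64)
if err != nil {
cfg.Logger.Errorf("Could not parse pandoc minor version: %v", err)
}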
foundPandocVersion = pandocVersion{majorVersion, minorVersion}
})

return foundPandocVersion, err
}
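
Review note: the parsing in `getPandocVersion` indexes into the results of `strings.Split` without checking their length, so an unexpected first line could panic. Below is a minimal sketch of a stricter parser. The helper name `parsePandocVersion` and the `version` type are hypothetical and not part of this PR; the sketch only assumes that the first line of the output looks like `pandoc 2.11.4`.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// version mirrors the major/minor pair used in the diff (hypothetical name).
type version struct {
	major, minor int64
}

// parsePandocVersion is a hypothetical helper, not part of this PR.
// It expects the first line of output to look like "pandoc 2.11.4"
// and returns an error instead of panicking on unexpected input.
func parsePandocVersion(output string) (version, error) {
	// Normalize Windows line endings and keep only the first line.
	firstLine, _, _ := strings.Cut(strings.ReplaceAll(output, "\r", ""), "\n")

	fields := strings.Fields(firstLine)
	if len(fields) < 2 {
		return version{}, fmt.Errorf("unexpected version line %q", firstLine)
	}

	parts := strings.Split(fields[1], ".")
	if len(parts) < 2 {
		return version{}, fmt.Errorf("unexpected version number %q", fields[1])
	}

	major, err := strconv.ParseInt(parts[0], 10, 64)
	if err != nil {
		return version{}, fmt.Errorf("parsing major version: %w", err)
	}
	minor, err := strconv.ParseInt(parts[1], 10, 64)
	if err != nil {
		return version{}, fmt.Errorf("parsing minor version: %w", err)
	}
	return version{major, minor}, nil
}

func main() {
	// Illustrative input; the second line is a placeholder.
	v, err := parsePandocVersion("pandoc 2.11.4\nmore output\n")
	fmt.Println(v, err)

	// Empty output yields an error rather than an index-out-of-range panic.
	_, err = parsePandocVersion("")
	fmt.Println(err != nil)
}
```

Returning an error keeps the decision about logging and fallback values with the caller, which may be easier to test than logging inside the parser.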

// supportsCitations returns true for pandoc versions >= 2.11, which include citeproc
func supportsCitations(cfg converter.ProviderConfig) bool {
if Supports() {
foundPandocVersion, err := getPandocVersion(cfg)
supportsCitations := foundPandocVersion.greaterThanOrEqual(pandocVersion{2, 11}) && err == nil
return supportsCitations
}
return false
}
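
Review note: `supportsCitations` checks `err == nil`, but inside the `versionOnce.Do` closure `err` is declared with `:=`, which appears to shadow the outer variable, so `getPandocVersion` would return a nil error even when the probe fails. One possible pattern is to cache the error next to the value and assign with `=`. The sketch below uses hypothetical names (`probeVersion`, `getVersion`, `cachedErr`) and a stand-in probe that always fails; it is not Hugo code.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

type version struct {
	major, minor int64
}

// Package-level cache: both the value and the error are stored, so every
// caller sees the outcome of the single probe.
var (
	versionOnce   sync.Once
	cachedVersion version
	cachedErr     error
)

// probeCalls counts how often the stand-in probe actually runs.
var probeCalls int

// probeVersion stands in for running `pandoc --version`. It is hypothetical
// and always fails here so that the error path is visible.
func probeVersion() (version, error) {
	probeCalls++
	return version{}, errors.New("pandoc not found")
}

// getVersion runs the probe once and returns the cached result afterwards.
// Plain assignment (=) is used inside the closure, so the package-level
// variables are written rather than shadowed.
func getVersion() (version, error) {
	versionOnce.Do(func() {
		cachedVersion, cachedErr = probeVersion()
	})
	return cachedVersion, cachedErr
}

func main() {
	_, err1 := getVersion()
	_, err2 := getVersion()
	// Both calls report the cached error; the probe itself ran only once.
	fmt.Println(err1, err2, probeCalls)
}
```

With this shape, a second call still reports the failure, which a test like `TestGetPandocVersionCallTwice` could assert on.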

// Supports returns whether Pandoc is installed on this computer.
func Supports() bool {
hasBin := getPandocBinaryName() != ""
139 changes: 136 additions & 3 deletions markup/pandoc/convert_test.go
@@ -25,7 +25,7 @@ import (
qt "github.com/frankban/quicktest"
)

func TestConvert(t *testing.T) {
func setupTestConverter(t *testing.T) (*qt.C, converter.Converter, converter.ProviderConfig) {
if !Supports() {
t.Skip("pandoc not installed")
}
@@ -38,7 +38,140 @@ func TestConvert(t *testing.T) {
c.Assert(err, qt.IsNil)
conv, err := p.New(converter.DocumentContext{})
c.Assert(err, qt.IsNil)
b, err := conv.Convert(converter.RenderContext{Src: []byte("testContent")})
return c, conv, cfg
}

func TestConvert(t *testing.T) {
c, conv, _ := setupTestConverter(t)
output, err := conv.Convert(converter.RenderContext{Src: []byte("testContent")})
c.Assert(err, qt.IsNil)
c.Assert(string(output.Bytes()), qt.Equals, "<p>testContent</p>\n")
}

func runCiteprocTest(t *testing.T, content string, expected string) {
c, conv, cfg := setupTestConverter(t)
if !supportsCitations(cfg) {
t.Skip("pandoc does not support citations")
}
output, err := conv.Convert(converter.RenderContext{Src: []byte(content)})
c.Assert(err, qt.IsNil)
c.Assert(string(b.Bytes()), qt.Equals, "<p>testContent</p>\n")
c.Assert(string(output.Bytes()), qt.Equals, expected)
}

func TestGetPandocVersionCallTwice(t *testing.T) {
c, _, cfg := setupTestConverter(t)

version1, err1 := getPandocVersion(cfg)
version2, err2 := getPandocVersion(cfg)
c.Assert(version1, qt.Equals, version2)
c.Assert(err1, qt.IsNil)
c.Assert(err2, qt.IsNil)
}

func TestPandocVersionEquality(t *testing.T) {
c := qt.New(t)
v1 := pandocVersion{1, 0}
v2 := pandocVersion{2, 0}
v3 := pandocVersion{2, 2}
v4 := pandocVersion{1, 2}
v5 := pandocVersion{2, 11}

// 1 >= 1 -> true
c.Assert(v1.greaterThanOrEqual(v1), qt.IsTrue)

// 1 >= 2 -> false, 2 >= 1 -> true
c.Assert(v1.greaterThanOrEqual(v2), qt.IsFalse)
c.Assert(v2.greaterThanOrEqual(v1), qt.IsTrue)

// 2.0 >= 2.2 -> false, 2.2 >= 2.0 -> true
c.Assert(v2.greaterThanOrEqual(v3), qt.IsFalse)
c.Assert(v3.greaterThanOrEqual(v2), qt.IsTrue)

// 2.2 >= 1.2 -> true, 1.2 >= 2.2 -> false
c.Assert(v3.greaterThanOrEqual(v4), qt.IsTrue)
c.Assert(v4.greaterThanOrEqual(v3), qt.IsFalse)

// 2.11 >= 2.2 -> true, 2.2 >= 2.11 -> false
c.Assert(v5.greaterThanOrEqual(v3), qt.IsTrue)
c.Assert(v3.greaterThanOrEqual(v5), qt.IsFalse)
}
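
Review note: the comparison checks in `TestPandocVersionEquality` could also be written as a table, which keeps each pair and its expected result on one row. The sketch below is standalone and uses hypothetical names (`version`, `comparisonCase`, `failedCases`); the comparison itself is copied from the diff, and the rows are the same pairs the test asserts.

```go
package main

import "fmt"

type version struct {
	major, minor int64
}

// greaterThanOrEqual uses the same comparison as the pandocVersion method in the diff.
func (left version) greaterThanOrEqual(right version) bool {
	return left.major > right.major || (left.major == right.major && left.minor >= right.minor)
}

// comparisonCase is one row of the table: left >= right should equal want.
type comparisonCase struct {
	left, right version
	want        bool
}

// comparisonCases lists the pairs checked by TestPandocVersionEquality.
var comparisonCases = []comparisonCase{
	{version{1, 0}, version{1, 0}, true},
	{version{1, 0}, version{2, 0}, false},
	{version{2, 0}, version{1, 0}, true},
	{version{2, 0}, version{2, 2}, false},
	{version{2, 2}, version{2, 0}, true},
	{version{2, 2}, version{1, 2}, true},
	{version{1, 2}, version{2, 2}, false},
	{version{2, 11}, version{2, 2}, true},
	{version{2, 2}, version{2, 11}, false},
}

// failedCases returns the rows whose actual result differs from want.
func failedCases(cases []comparisonCase) []comparisonCase {
	var failed []comparisonCase
	for _, tc := range cases {
		if tc.left.greaterThanOrEqual(tc.right) != tc.want {
			failed = append(failed, tc)
		}
	}
	return failed
}

func main() {
	// Print the number of mismatching rows.
	fmt.Println(len(failedCases(comparisonCases)))
}
```

In the real test file, the loop body would presumably call `c.Assert` with quicktest instead of collecting failures.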

func TestCiteprocWithHugoMeta(t *testing.T) {
content := `
---
title: Test
published: 2022-05-30
---
testContent
`
expected := "<p>testContent</p>\n"
runCiteprocTest(t, content, expected)
}

func TestCiteprocWithPandocMeta(t *testing.T) {
content := `
---
---
---
...
testContent
`
expected := "<p>testContent</p>\n"
runCiteprocTest(t, content, expected)
}

func TestCiteprocWithBibliography(t *testing.T) {
content := `
---
---
---
bibliography: testdata/bibliography.bib
...
testContent
`
expected := "<p>testContent</p>\n"
runCiteprocTest(t, content, expected)
}

func TestCiteprocWithExplicitCitation(t *testing.T) {
content := `
---
---
---
bibliography: testdata/bibliography.bib
...
@Doe2022
`
expected := `<p><span class="citation" data-cites="Doe2022">Doe and Mustermann
(2022)</span></p>
<div id="refs" class="references csl-bib-body hanging-indent"
role="doc-bibliography">
<div id="ref-Doe2022" class="csl-entry" role="doc-biblioentry">
Doe, Jane, and Max Mustermann. 2022. <span>“A Treatise on Hugo
Tests.”</span> <em>Hugo Websites</em>.
</div>
</div>
`
runCiteprocTest(t, content, expected)
}

func TestCiteprocWithNocite(t *testing.T) {
content := `
---
---
---
bibliography: testdata/bibliography.bib
nocite: |
@*
...
`
expected := `<div id="refs" class="references csl-bib-body hanging-indent"
role="doc-bibliography">
<div id="ref-Doe2022" class="csl-entry" role="doc-biblioentry">
Doe, Jane, and Max Mustermann. 2022. <span>“A Treatise on Hugo
Tests.”</span> <em>Hugo Websites</em>.
</div>
</div>
`
runCiteprocTest(t, content, expected)
}
6 changes: 6 additions & 0 deletions markup/pandoc/testdata/bibliography.bib
@@ -0,0 +1,6 @@
@article{Doe2022,
author = "Jane Doe and Max Mustermann",
title = "A Treatise on Hugo Tests",
journal = "Hugo Websites",
year = "2022",
}