Plain word alignment files are difficult for a human to read and interpret. This simple tool lets you visualize the word alignments between parallel sentences by drawing edges between aligned words. It outputs a LaTeX file (relying on TikZ) that can be compiled to PDF.
A prebuilt version of align2tex is available as a release.
The project can be built with sbt 0.13.5. Run sbt assembly to create a fat jar in target/scala-2.11/.
java -jar align2tex.jar SOURCE_SENTS_FILE TARGET_SENTS_FILE ALIGNMENTS_FILE
java -jar align2tex.jar src/main/resources/sample1.en src/main/resources/sample1.fr src/main/resources/sample1.align
This will produce src/main/resources/sample1.align.tex, which can be compiled with pdflatex to give a PDF of the alignment visualization. Although the tool adjusts some TikZ spacing parameters internally depending on the source- and target-sentence lengths, it might still be necessary to adjust the spacing manually.
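To give a rough idea of what "drawing edges between words" means, here is a minimal Python sketch that turns one sentence pair and its alignment into a tikzpicture. It is not the code align2tex uses. It assumes the alignment line holds 0-based `i-j` index pairs (source-target), which this README does not confirm, and the names `parse_alignment`, `tikz_alignment`, `x_step` and `y_gap` are made up for illustration.

```python
def parse_alignment(line):
    """Parse 'i-j' tokens into (source_index, target_index) pairs.

    Assumption: indices are 0-based and separated by a hyphen.
    """
    pairs = []
    for token in line.split():
        src, tgt = token.split("-")
        pairs.append((int(src), int(tgt)))
    return pairs


def tikz_alignment(source, target, pairs, x_step=2.0, y_gap=2.0):
    """Build a tikzpicture: source words on top, target words below, one edge per pair.

    Words are inserted verbatim, so LaTeX special characters are not escaped here.
    """
    for s, t in pairs:
        if not (0 <= s < len(source) and 0 <= t < len(target)):
            raise ValueError(f"alignment {s}-{t} is out of range")
    lines = [r"\begin{tikzpicture}"]
    # One node per source word, spaced x_step apart on the upper row.
    for i, word in enumerate(source):
        lines.append(rf"  \node (s{i}) at ({i * x_step:.1f}, {y_gap:.1f}) {{{word}}};")
    # One node per target word on the lower row.
    for j, word in enumerate(target):
        lines.append(rf"  \node (t{j}) at ({j * x_step:.1f}, 0.0) {{{word}}};")
    # One edge per aligned word pair.
    for s, t in pairs:
        lines.append(rf"  \draw (s{s}.south) -- (t{t}.north);")
    lines.append(r"\end{tikzpicture}")
    return "\n".join(lines)


if __name__ == "__main__":
    pairs = parse_alignment("0-0 1-2 2-1")
    print(tikz_alignment("the black cat".split(), "le chat noir".split(), pairs))
```

In this sketch, increasing `x_step` or `y_gap` plays the role of the manual spacing adjustment mentioned above: long words or long sentences typically need more horizontal room between nodes.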
Any of the languages supported by the babel package for LaTeX can be used. Currently, these are hard-coded in LangCodes.scala. Using the ISO 639-1 language code (en, fr, ru, es, de, etc.) as the extension of the input sentence files is sufficient and should insert the correct language options for babel.
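The extension-to-babel lookup can be pictured roughly as follows. This Python sketch is only an illustration: the `BABEL_OPTIONS` table is a small hypothetical excerpt, not the contents of LangCodes.scala, and the function names are invented here.

```python
from pathlib import Path

# Hypothetical excerpt of an ISO 639-1 -> babel option table.
# The real mapping lives in LangCodes.scala and may differ.
BABEL_OPTIONS = {
    "en": "english",
    "fr": "french",
    "ru": "russian",
    "es": "spanish",
    "de": "german",
}


def babel_option(sentence_file):
    """Derive the babel option from the file extension, e.g. sample1.fr -> french."""
    code = Path(sentence_file).suffix.lstrip(".").lower()
    if code not in BABEL_OPTIONS:
        raise ValueError(f"no babel option known for extension '{code}'")
    return BABEL_OPTIONS[code]


def babel_line(source_file, target_file):
    """Build a \\usepackage line that loads both languages, without duplicates."""
    options = [babel_option(source_file), babel_option(target_file)]
    unique = list(dict.fromkeys(options))  # keeps first-seen order
    return r"\usepackage[" + ",".join(unique) + "]{babel}"


if __name__ == "__main__":
    print(babel_line("src/main/resources/sample1.en", "src/main/resources/sample1.fr"))
```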
To get only the tikzpicture block, without the document and package definitions, run align2tex with concise as the last option.
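If you call the tool from a script, the invocation could be wrapped along these lines. This is a sketch under assumptions: the helper names are made up, the jar is assumed to sit in the working directory as align2tex.jar, and the output path is assumed to always be the alignments file name plus `.tex` (the README only shows this for the sample files).

```python
import subprocess


def build_command(source_file, target_file, alignments_file,
                  concise=False, jar="align2tex.jar"):
    """Assemble the argument list for the command line shown in this README."""
    cmd = ["java", "-jar", jar, source_file, target_file, alignments_file]
    if concise:
        # 'concise' goes last and requests only the tikzpicture block.
        cmd.append("concise")
    return cmd


def expected_tex_path(alignments_file):
    """Assumption: the output is written next to the alignments file with '.tex' appended."""
    return alignments_file + ".tex"


def visualize(source_file, target_file, alignments_file, concise=False):
    """Run align2tex; compile to PDF only when a full document was produced."""
    subprocess.run(build_command(source_file, target_file, alignments_file, concise),
                   check=True)
    tex_path = expected_tex_path(alignments_file)
    if not concise:
        # A concise file has no preamble, so it cannot be compiled on its own.
        subprocess.run(["pdflatex", tex_path], check=True)
    return tex_path
```

The concise output is meant to be pasted or `\input` into a document of your own that already loads TikZ, which is why the sketch skips pdflatex in that case.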
(c) Simon Šuster, 2016