Offene Bibel Parser

Building

To build the project:

The build results reside in the install/ folder. To see how to run the converter, call install/bin/exporter.sh --help.

The converter downloads the translation from the Offene Bibel wiki and creates two .osis files in the install/results/ folder. The exporter caches every file it downloads in install/tmp/pageCache/, and cached files are not downloaded again. To force a fresh download of a file, delete it from the cache.
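The caching behaviour described above can be sketched as follows. This is a minimal illustration, not the exporter's actual code: the class name PageCacheSketch, the downloader function, and the URL-encoded file naming are all assumptions made for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Function;

// Hypothetical sketch of a download cache; not the exporter's real implementation.
final class PageCacheSketch {
    private final Path cacheDir;
    // Stands in for the real wiki download; injected so the sketch needs no network.
    private final Function<String, String> downloader;

    PageCacheSketch(Path cacheDir, Function<String, String> downloader) {
        this.cacheDir = cacheDir;
        this.downloader = downloader;
    }

    // Returns the cached copy if one exists; otherwise downloads and stores the page.
    String get(String pageName) {
        try {
            Files.createDirectories(cacheDir);
            // Assumption: cache files are named after the URL-encoded page name.
            Path cached = cacheDir.resolve(URLEncoder.encode(pageName, "UTF-8"));
            if (Files.exists(cached)) {
                return new String(Files.readAllBytes(cached), StandardCharsets.UTF_8);
            }
            String content = downloader.apply(pageName);
            Files.write(cached, content.getBytes(StandardCharsets.UTF_8));
            return content;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("pageCache");
        int[] downloads = {0};
        PageCacheSketch cache = new PageCacheSketch(dir, name -> {
            downloads[0]++;
            return "content of " + name;
        });
        cache.get("Genesis 1");
        cache.get("Genesis 1"); // served from the cache, no second download
        System.out.println("downloads: " + downloads[0]); // prints "downloads: 1"
    }
}
```

Deleting the cached file and calling get again triggers a new download, which mirrors the "delete it from the cache" instruction above.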

There is also a convenience script, install/bin/swordConverter.sh, that creates two SWORD modules, copies them to ~/.sword, and creates a .zip archive.

Finally, the validator checks a given wiki page for validity. To see how to run it, call install/bin/validator.sh --help.

More format converters

The remaining formats can be created by calling different Main classes. Most of these tools take no command line arguments; they simply read their input files (OSIS or Zefania XML) and write output files. I run them on Windows, since most of the post-processing tools run on Windows as well, but the converters should also run on Linux (there are just no individual shell scripts for them).

  • offeneBibel.zefania.ZefaniaConverter
    Produces Zefania XML from OSIS
  • offeneBibel.zefania.FootnoteHTMLGrabber
    Reads existing Zefania XML and produces Zefania XML with HTML footnotes (by grabbing the wiki)
  • offeneBibel.zefania.LogosConverter
    Reads Zefania XML (both with and without HTML footnotes) and produces HTML that can be converted for Logos
  • offeneBibel.zefania.ESwordConverter
    Reads Zefania XML (without HTML footnotes) and produces HTML that can be converted for E-Sword (with E-Sword ToolTipTool NT). This converter accepts one parameter, a marker value (e.g. $MARKER$) that marks the end of lines/verses. The marker works around bugs in ToolTipTool's HTML import, which sometimes skips or adds line breaks. The marker must not appear anywhere in the Bible text, and you will have to use the same marker later when producing the actual E-Sword files.
  • offeneBibel.zefania.MyBibleZoneConverter
    Reads Zefania XML (with HTML footnotes) and produces MyBible.Zone database files
  • offeneBibel.zefania.MySwordConverter
    Reads Zefania XML (with HTML footnotes) and produces MySword database files
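The end-of-verse marker used by ESwordConverter can be illustrated with a small sketch. It only demonstrates the principle (append a unique marker to each verse, then recover verse boundaries from the marker after line breaks have been altered). The class VerseMarkerSketch and both methods are hypothetical and are not part of the converter or of ToolTipTool.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating the marker idea; not the converter's code.
final class VerseMarkerSketch {

    // Appends the marker to every verse and rejects markers that occur in the text.
    static List<String> markVerseEnds(List<String> verses, String marker) {
        if (marker == null || marker.isEmpty()) {
            throw new IllegalArgumentException("marker must not be empty");
        }
        List<String> marked = new ArrayList<>();
        for (String verse : verses) {
            if (verse.contains(marker)) {
                throw new IllegalArgumentException("marker occurs in verse text: " + verse);
            }
            marked.add(verse + marker);
        }
        return marked;
    }

    // Recovers the verses from text whose line breaks can no longer be trusted.
    static List<String> splitAtMarkers(String importedText, String marker) {
        List<String> verses = new ArrayList<>();
        int start = 0;
        int end;
        while ((end = importedText.indexOf(marker, start)) >= 0) {
            // Collapse stray line breaks inside the verse into single spaces.
            String verse = importedText.substring(start, end).replaceAll("\\s*\\R\\s*", " ").trim();
            verses.add(verse);
            start = end + marker.length();
        }
        return verses;
    }
}
```

Because verse boundaries are taken from the marker alone, a line break that the import drops or adds does not shift them, which is why the marker must never occur in the Bible text itself.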

Web viewer file generation

The parser can generate files suitable as input for the Offene Bibel Web Viewer. It generates a file structure as follows:

webResults/Matthäus_12_lf
webResults/Matthäus_12_sf
webResults/Matthäus_12_ls
webResults/generated.index

Repeated runs overwrite both the chapter files and the status file. The generated.index file starts with a comment stating the date and the parameters used for generation.

AST layout

Important types:

  • TreeNode Generic base class of all AST elements. Contains nothing specific to Offene Bibel. Defines the tree structure and supports the visitor pattern.
  • AstNode Generic base class of all Offene Bibel (...Node) AST nodes. Contains an enum with all possible node types. Not every node type has its own class; most are plain instances of AstNode with the respective type set.
  • TextNode Used for all textual information.
  • VerseStatus Represents the status of one single verse. It's calculated from the chapter tags via VerseNode.getStatus().

Other *Node types:

  • FassungNode
  • ChapterNode
  • VerseNode
  • NoteNode
  • ParallelPassageNode
  • SuperScriptTextNode
  • NoteLinkNode
  • WikiLinkNode

Verses have no children. They are markers.
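To make the relationship between these types more concrete, here is a simplified sketch. All class names carry a Sketch suffix because they are hypothetical stand-ins: the real TreeNode, AstNode, TextNode, and VerseNode have different members, and the enum shown is only an assumed subset. The sketch shows a generic tree with visitor support, an enum-typed node, and verses as childless markers that interleave with other elements.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical visitor interface; the real visitor API may look different.
interface NodeVisitorSketch {
    void visit(TreeNodeSketch node);
}

// Generic tree structure without any Offene Bibel specifics.
class TreeNodeSketch {
    private final List<TreeNodeSketch> children = new ArrayList<>();

    // Adds a child and returns this node so calls can be chained.
    TreeNodeSketch add(TreeNodeSketch child) {
        children.add(child);
        return this;
    }

    // Depth-first traversal: this node first, then its children in order.
    void accept(NodeVisitorSketch visitor) {
        visitor.visit(this);
        for (TreeNodeSketch child : children) {
            child.accept(visitor);
        }
    }
}

// Most node kinds are plain instances distinguished only by their type.
class AstNodeSketch extends TreeNodeSketch {
    enum NodeType { CHAPTER, FASSUNG, QUOTE, TEXT, VERSE } // assumed subset

    final NodeType type;

    AstNodeSketch(NodeType type) {
        this.type = type;
    }
}

class TextNodeSketch extends AstNodeSketch {
    final String text;

    TextNodeSketch(String text) {
        super(NodeType.TEXT);
        this.text = text;
    }
}

// A verse is a marker: it has a number but never any children.
class VerseNodeSketch extends AstNodeSketch {
    final int number;

    VerseNodeSketch(int number) {
        super(NodeType.VERSE);
        this.number = number;
    }

    @Override
    TreeNodeSketch add(TreeNodeSketch child) {
        throw new UnsupportedOperationException("verses are markers without children");
    }
}

final class AstSketchDemo {
    // Builds a tiny chapter in which verse 2 starts in the middle of a quote.
    static TreeNodeSketch sampleChapter() {
        TreeNodeSketch quote = new AstNodeSketch(AstNodeSketch.NodeType.QUOTE)
                .add(new TextNodeSketch("Let there be light"))
                .add(new VerseNodeSketch(2))
                .add(new TextNodeSketch("and there was light"));
        TreeNodeSketch fassung = new AstNodeSketch(AstNodeSketch.NodeType.FASSUNG)
                .add(new VerseNodeSketch(1))
                .add(new TextNodeSketch("In the beginning"))
                .add(quote);
        return new AstNodeSketch(AstNodeSketch.NodeType.CHAPTER).add(fassung);
    }

    // Assigns each text node to the most recent verse marker seen in document order.
    static List<String> textByVerse(TreeNodeSketch root) {
        List<String> result = new ArrayList<>();
        root.accept(node -> {
            if (node instanceof VerseNodeSketch) {
                result.add(((VerseNodeSketch) node).number + ":");
            } else if (node instanceof TextNodeSketch && !result.isEmpty()) {
                int last = result.size() - 1;
                result.set(last, result.get(last) + " " + ((TextNodeSketch) node).text);
            }
        });
        return result;
    }

    public static void main(String[] args) {
        // Prints "1: In the beginning Let there be light" and "2: and there was light".
        textByVerse(sampleChapter()).forEach(System.out::println);
    }
}
```

Treating verses as markers rather than containers is what lets a verse boundary fall inside a wrapping element such as a quote without breaking the tree structure.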

The basic page and AST layout is:

chapter
  chapterNotes
  fassung
    [text]
    fassungNotes
  fassung
    [text]
    fassungNotes

[text] is a mostly unconstrained mixture of the following elements:

  • text is some text.
  • The following elements typically wrap some text.
    • insertion
    • omission
    • alternative
    • alternateReading
    • fat
    • italics
    • secondaryContent
    • textBreak
    • quote Wraps text. Often wraps longer passages.
  • Standalone elements are:
    • parallelPassage
    • noteLink
    • heading (Only allowed in Lesefassung)
    • verse Standalone because verses can freely interleave with other elements.
  • note Contains a note, which has its own, completely different syntax.
    • hebrew
    • wikiLink
    • superScript
    • strikeThrough
    • underline
  • poemStart / poemStop Mark a text passage with a syntax differing from normal scripture text. In poems newlines are significant and not removed from the text.
    • secondVoice