evbacher edited this page Nov 4, 2024 · 143 revisions

Docs to Markdown (GD2md-html): convert a Google Doc™ to Markdown or HTML

Install: Docs to Markdown

A request: If you rate this add-on, and you give it less than 4 or 5 stars, please file a bug at Docs to Markdown issues so we can understand what's bugging you. Or, ask a question on the Docs to Markdown mailing list. Thanks!

NEW: See News for significant recent changes.

Docs to Markdown quick start:

  1. Install Docs to Markdown from the G Suite Marketplace.
  2. From the Google Docs Extensions menu in a Google Doc, select Docs to Markdown > Convert.
  3. Use the Markdown or HTML buttons in the sidebar window to convert your document to either Markdown or HTML.
  4. Copy the entire text from the text area in the sidebar to the clipboard and paste it into either a Markdown or HTML file.
  5. View the rendered page using a web browser or your publishing tools.

Details

Docs to Markdown is a Google Docs™ add-on that converts a Google Doc to a simple, readable Markdown or HTML text file. Docs to Markdown lets you use Google Docs' editing, formatting, and collaboration tools before you publish to a Markdown or HTML platform. Docs to Markdown also lets you select and convert part of a Google Doc.

Docs to Markdown requires minimal permissions: it will ask for permission to access the current Doc (to convert it) and permission to create a sidebar (the user interface). It requires no other permissions.

Use Google Docs styles for headings (Heading 1, Heading 2, etc.) so that Docs to Markdown can convert them properly to Markdown or HTML heading styles. If you just make them bold and large, they will convert as normal paragraphs.

Note: Not all Markdown renderers support all Markdown features. For example, GitHub Markdown does not support a table of contents ([TOC]), footnotes, or definition lists. Also, some Markdown environments strip heading IDs and replace them with their own generated IDs. You'll have to make some manual adjustments, depending on your target environment.

If you find any bugs, please file them at https://github.com/evbacher/gd2md-html/issues. Thanks for helping to make Docs to Markdown better!

Join the GD2md-html group for announcements, questions, and discussion about Docs to Markdown (formerly GD2md-html). Low traffic.

Thanks to Renato Mangini for inspiration: he created an earlier conversion script at https://github.com/mangini/gdocs2md before Drive add-ons even existed!

Installing Docs to Markdown

Install Docs to Markdown from the G Suite Marketplace.

Using Docs to Markdown

You can use the Docs to Markdown add-on with any Doc for which you have edit permission. If you only have view permission, make a copy.

  1. From the Google Docs Extensions menu, select Docs to Markdown > Convert. The sidebar window opens:

GD2md-html sidebar

  2. Use the Markdown or HTML buttons in the sidebar window to convert your document to either Markdown or HTML. If you select part of the document, Docs to Markdown will convert only the selection. Otherwise it will convert the entire document. Click the Docs link for more information.

Partial selection

By default, Docs to Markdown will convert the entire document, but you can limit the scope of the conversion by selecting part of the document. Note that for table conversion, you need to select the entire table, otherwise Docs to Markdown will not see the containing table element.

Options

There are a number of options that you can select before you do a conversion:

GD2md-html sidebar

  • _/__ for italic/bold: The default markup for italic or bold uses asterisks (* or **), but you can switch to underscores (_ and __) if you choose this option.
  • Demote headings (H1 → H2, etc.): If you have used multiple Heading 1 headings in your Doc, choose this option to demote all heading levels to conform with the following standard: single H1 as title, H2 as top-level heading in the text body. You may need to add an H1 to function as the title after conversion.
  • HTML headings/IDs: Not all Markdown renderers handle Markdown-style IDs. If that is the case for your target platform, choose this option to generate HTML headings and IDs.
  • Wrap HTML: By default, HTML output is not wrapped. Selecting this option will wrap HTML text (but note that some publishing platforms treat line breaks as <br> or <p> tags).
  • Render HTML tags: By default, opening angle brackets (<) are replaced by the &lt; entity. If you really want to embed HTML tags in your Markdown, select this option to preserve them. Or, if you just want to use an occasional HTML tag, you can escape the opening angle bracket like this: \<tag>text\</tag>.
  • Suppress info comments: Removes the informational comment at the top of the output. Warnings and errors that result from the conversion will still be part of the output.
  • Use reckless mode (no inline alerts): Removes any alert messages embedded in the conversion output. If you enable reckless mode, Docs to Markdown will print a summary comment at the top of the output to let you know if there are any warnings or errors.
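
As a rough illustration of the default angle-bracket handling described under "Render HTML tags", the following Python sketch replaces bare opening angle brackets with the &lt; entity and turns an escaped \< into a literal <. It is not the add-on's actual code: the function name and the render_html_tags parameter are hypothetical, and the behavior when the option is enabled (text passed through unchanged) is an assumption.

```python
import re


def escape_angle_brackets(text: str, render_html_tags: bool = False) -> str:
    """Illustrative only: mimic the documented default handling of '<'.

    Assumption: with the option enabled, the text is left untouched.
    """
    if render_html_tags:
        return text
    # One pass: an escaped '\<' becomes a literal '<',
    # any other '<' becomes the '&lt;' entity.
    return re.sub(
        r"\\<|<",
        lambda m: "<" if m.group(0) == "\\<" else "&lt;",
        text,
    )


print(escape_angle_brackets("if a < b"))
print(escape_angle_brackets(r"\<b>bold\</b>"))
```

The single-pass substitution avoids the ordering problem of two separate replacements, where an unescaped tag produced by the first step could be re-escaped by the second.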

Using Docs to Markdown output

Docs to Markdown automatically copies the converted text to the clipboard. You can paste the contents of the clipboard into either a Markdown or an HTML file.

If there are any errors or warnings during the conversion, they will appear in the conversion notes comment at the top of the text output.

Images

Docs to Markdown creates placeholder markup for images. You will need to move images to your server and change the path in the image tags to match the image location. Docs to Markdown now creates image paths in the form images/image1.png, images/image2.jpg, etc. that are consistent with the images in the zip file from Download > Web Page in Google Docs. However, the order of images in the zip file is not always the same as the order in your Doc -- please check all your images!
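
Changing the image paths by hand gets tedious for long documents. The following Python sketch shows one way to rewrite the images/ prefix after conversion. It assumes the converted output uses either Markdown image syntax or HTML img tags with paths starting with images/ as described above; the function name and the example prefixes are hypothetical.

```python
import re


def rebase_image_paths(text: str, new_prefix: str) -> str:
    """Point 'images/...' paths at new_prefix (a directory path or URL)."""
    prefix = new_prefix.rstrip("/") + "/"
    # Markdown images: ![alt](images/image1.png "tooltip")
    text = re.sub(
        r"(!\[[^\]]*\]\()images/",
        lambda m: m.group(1) + prefix,
        text,
    )
    # HTML images: <img src="images/image1.png" ...>
    text = re.sub(
        r'(<img\b[^>]*\bsrc=")images/',
        lambda m: m.group(1) + prefix,
        text,
    )
    return text


print(rebase_image_paths('![alt_text](images/image1.png "tip")', "/static/img"))
```

This only rewrites the paths. You still need to check that each file on the server matches the image it replaces, since the order of images in the zip file may differ from the order in your Doc.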

Drawings

The converter cannot handle inline Google Drawings, since Google Drawings does not (yet) have an API to get drawing data. However, you can copy embedded drawings to standalone Google Drawings and refer to them by reference in your Markdown or HTML documents (see Google Drawings by reference for details).

Features

Text element conversion features

These features are supported for both Markdown and HTML conversion, unless noted. Note that, in general, Docs to Markdown does only basic text formatting (for more on this, see the Development Guide).

  • Basic elements: Paragraphs, lists, inline code, links.

  • Text formatting: italic, bold, strikethrough, inline code, and combinations of text formatting.

  • Headings: Heading levels convert directly, unless you choose the option to demote heading levels.

  • Line breaks: If you use shift-enter to insert an explicit line break, the converter will preserve that line break, inserting a \ at the end of the line for Markdown, or <br> for HTML.

  • Intra-document links: If you linked to headings in your Doc, those links will convert properly as long as you generated a TOC (Note: generate the TOC with blue links).

  • Mixed inline code: Docs to Markdown may fall back to HTML to handle mixed code spans in normal text properly.

  • Code blocks: Constant-width-font paragraphs (Courier New, Consolas, Inconsolata, Roboto Mono, and Source Code Pro) and single-cell tables convert to fenced code blocks (Markdown) or <pre> blocks (HTML).

  • Code block language specification: You can optionally specify the language for a code block by adding lang: langspec as the first line. For example, this code block (in your Google Doc):

    lang:java
    public class HelloWorld {
      public static void main(String[] args) {
        System.out.println("Hello, World");
      }
    }
    

    renders as:

    public class HelloWorld {
      public static void main(String[] args) {
        System.out.println("Hello, World");
      }
    }

    This applies to both constant-width-font paragraphs and single-cell tables.

  • Smart quotes: Smart quotes in code and code blocks will convert to straight quotes. Smart quotes in other text will be preserved.

  • Definition lists: Docs to Markdown supports simple definition lists.

    Definition list syntax (in a Google Doc):
    
    ?Term starts with a question mark.
    :Definition here (starts with a leading colon).
    
  • Tables: Tables currently convert to HTML tables for both Markdown and HTML. Full support for merged rows and columns. Note that single-cell tables convert to code blocks.

  • Images: Docs to Markdown creates image links. You will need to save images to your web server and adjust the image paths.

  • Footnotes: Footnotes convert to standard Markdown footnotes, and corresponding markup in HTML.

  • Right-to-left languages: Embedded RTL text or RTL paragraphs display properly in the converted output.
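
The code block language specification in the list above can be pictured as a small text transformation: if the first line of a block starts with lang:, that line is consumed and its value becomes the language tag of the fenced block. The Python sketch below illustrates the idea for Markdown output. It is not the add-on's implementation, and the function name is hypothetical.

```python
FENCE = "`" * 3  # three backticks, built this way to keep the example readable


def to_fenced_block(block: str) -> str:
    """Wrap code text in a Markdown fence, honoring an optional 'lang:' first line."""
    lines = block.strip("\n").split("\n")
    lang = ""
    first = lines[0].strip()
    if first.lower().startswith("lang:"):
        # Accept both 'lang:java' and 'lang: java'.
        lang = first[len("lang:"):].strip()
        lines = lines[1:]
    return "\n".join([FENCE + lang, *lines, FENCE])


doc_block = 'lang:java\nSystem.out.println("Hello, World");'
print(to_fenced_block(doc_block))
```

A block without a lang: line is fenced with no language tag.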

UI/reporting features

  • Partial selections: If you select part of the document, Docs to Markdown will convert only the selection. Otherwise it will convert the entire document. This is useful for converting code samples or tables.
  • ERRORs, WARNINGs, ALERTs: Docs to Markdown output includes information about conditions that will probably require your attention (both in a comment at the top of the output and in red in the rendered output). Please proofread and correct before publishing.

Sample conversion

Both this document and the sample/demo page were converted from Google Docs using Docs to Markdown.

Troubleshooting

If you find trouble that's not noted here, please file a bug at Docs to Markdown bugs.

Installation issues

These are issues that crop up before the add-on can even start executing, not issues with the add-on itself:

  • Server errors when you click the Install button: (for example, Bad Request: Error 400). These are errors on the server side that have nothing to do with the add-on itself. The best thing to do is to try again later.
  • Missing menu items: If you installed the add-on, but you see only the Help menu, try uninstalling the add-on, then reinstalling it. You should see both the Convert and Help menus.

drive.google.com refused to connect

This is a general problem for add-ons that encounter multiple Google accounts.

If you open the add-on and see this message:

drive.google.com refused to connect error

you may be logged into more than one Google account from the same browser. Try opening a private (or incognito) window and logging into only one Google account. See https://github.com/evbacher/gd2md-html/issues/76 for more discussion.

This is probably related to this long-standing (since 2017) bug that Google seems reluctant to fix: https://issuetracker.google.com/issues/69270374.

Output not copied to the clipboard

Docs to Markdown writes the converted output to the clipboard automatically. However, if the conversion takes several seconds and you switch the focus to another tab, Docs to Markdown can’t write to the clipboard, because the clipboard is owned by the current tab.

If Docs to Markdown cannot write to the clipboard, you’ll see this comment at the top of the output:

    <!-- Copy and paste the converted output. -->

Conversion error

If a conversion fails completely, file a bug against the converter at Docs to Markdown bugs. Be sure to include the error message and a link to the document you were trying to convert.

TIP: You can do a crude sort of binary search to localize the problem by selecting the first half of the doc and converting. Change and/or reduce the selection until you find the offending portion of the document. You may be able to work around the problem, but please still file a bug! (And note the area of the problem if you localized it. Thanks!)
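
The search strategy in the tip can be sketched in Python. In practice you do this by hand in the Doc; here a hypothetical converts_ok callback stands in for "select these paragraphs and convert", and the sketch assumes that a selection fails exactly when it contains the offending paragraph.

```python
from typing import Callable, Sequence


def find_offending_index(
    items: Sequence[str],
    converts_ok: Callable[[Sequence[str]], bool],
) -> int:
    """Return the index of the first item that makes conversion fail, or -1."""
    if converts_ok(items):
        return -1
    # Invariant: items[:lo] converts cleanly, items[:hi] fails.
    lo, hi = 0, len(items)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if converts_ok(items[:mid]):
            lo = mid
        else:
            hi = mid
    return hi - 1


paragraphs = ["intro", "setup", "BAD TABLE", "usage", "summary"]
print(find_offending_index(paragraphs, lambda chunk: "BAD TABLE" not in chunk))
```

Each step halves the range, so even a long document needs only a handful of trial conversions.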

Alerts

Problem

Red marks all over the page.

Docs to Markdown alerts you to known issues, warnings, or errors that require your attention by inserting alert messages in the rendered output. For example, if you use a Google Docs equation, you'll get a message like this:

>>>>> equation: use MathJax/LaTeX if your publishing platform supports it.

Solution

Address the problem and remove the message from the Markdown/HTML file before publishing.
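
Before publishing, it can help to scan the converted file for leftover alerts. The Python sketch below lists every line that contains the >>>>> marker shown in the example message above. Treating that marker as the only sign of an alert is an assumption; the function name is hypothetical.

```python
from typing import List, Tuple


def find_alerts(output: str, marker: str = ">>>>>") -> List[Tuple[int, str]]:
    """Return (line number, text) for each line that contains the alert marker."""
    return [
        (number, line.strip())
        for number, line in enumerate(output.splitlines(), start=1)
        if marker in line
    ]


converted = "# Title\n\n>>>>> equation: use MathJax/LaTeX if your publishing platform supports it.\n\nBody text."
for number, text in find_alerts(converted):
    print(f"line {number}: {text}")
```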

Heading IDs not generated

Problem

Docs to Markdown does not generate heading IDs and intra-document links properly if you're using a Doc that has the new-style TOC (with page numbers). Currently, Docs to Markdown requires the old-style ("blue links") TOC to correctly generate heading IDs and intra-document links.

Solution

Delete your TOC and insert a new TOC (choose the "blue links" style).

Alternate solution (in case the "blue links" style TOC is not available):

Find an old Doc that has the old-style TOC.

Cut and paste the entire TOC (not just the links -- you need to get the whole box) into your Doc, then regenerate the TOC. The TOC style is supposed to be selectable, but for now the print-style TOC seems to be the default for new Docs.

Problem

The conversion notes contain an error like this:

ERROR:
undefined internal link to this URL"#heading=h.64x194v1qq35".link text: link to an internal heading
?Did you generate a TOC?

Solution

First, check that you generated a table of contents (TOC) in your document. The conversion engine uses the TOC to create internal links. (The TOC should also come before any link references.) Google Docs now supports two different styles of TOC, so make sure you generate the one that has the blue links:

image of Google Docs insert TOC command menu

If you generated a TOC and you are still getting internal link errors, the heading link in the original TOC may be stale. Try regenerating the TOC in your Google Doc, then convert the document again.

To check for staleness: search through the converted output for the link URL (something like #heading=h.vwl3eupkcsuz). When you find it, go back to the doc and see if that link works in the doc itself (by appending #heading=h.vwl3eupkcsuz or similar to the end of the doc URL).
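
The staleness check can be partly automated. The Python sketch below collects the unconverted #heading=h. fragments left in the output and appends each one to the Doc URL, giving you a list of links to open and test by hand. The fragment pattern is inferred from the examples above, and the function name and DOC_ID placeholder are hypothetical.

```python
import re
from typing import List

# Fragment pattern inferred from examples such as '#heading=h.vwl3eupkcsuz'.
HEADING_LINK = re.compile(r"#heading=h\.[a-z0-9]+")


def stale_link_candidates(converted: str, doc_url: str) -> List[str]:
    """Build Doc URLs for each distinct heading fragment found in the output."""
    base = doc_url.split("#", 1)[0]  # drop any fragment already on the URL
    fragments = sorted(set(HEADING_LINK.findall(converted)))
    return [base + fragment for fragment in fragments]


output = 'See [setup](#heading=h.vwl3eupkcsuz) and [setup again](#heading=h.vwl3eupkcsuz).'
for url in stale_link_candidates(output, "https://docs.google.com/document/d/DOC_ID/edit"):
    print(url)
```

If a printed link does not jump to a heading in the Doc, regenerate the TOC and convert again.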

Permissions

The Docs to Markdown add-on does not collect or distribute any information about you or about the documents being converted.

Docs to Markdown will ask for permission to access the current Doc (to convert it). Docs to Markdown reads the Doc content and structure; it does not change any content.

Docs to Markdown will also ask for permission to display a sidebar. It uses the sidebar to display action buttons and option checkboxes as well as the converted text (Markdown or HTML).

It requires no other permissions.

See the official privacy policy for Docs to Markdown here: https://beanroad.com/docs-to-markdown/privacy.html.

Bugs

Go to https://github.com/evbacher/gd2md-html/issues to view open bugs or to file a new bug.

Acknowledgements

Thanks to Renato Mangini for inspiration: he created an earlier conversion script at https://github.com/mangini/gdocs2md before Drive add-ons even existed!

See also

See these links for more information about Markdown syntax and standards, and about Markdown previewers and editors.

Markdown previewers, editors

Markdown syntax, information

Articles about Docs to Markdown (GD2md-html)