mdast-util-from-markdown

mdast utility that turns markdown into a syntax tree

What is this?

This package is a utility that takes markdown input and turns it into a markdown abstract syntax tree.

This utility uses micromark, which turns markdown into tokens, and then turns those tokens into nodes.

When should I use this?

If you want to handle syntax trees manually, use this. When you just want to turn markdown into HTML, use micromark instead. For an easier time processing content, use the remark ecosystem instead.

Install

This package is ESM only.

In Node.js (version 18+) with yarn:

yarn add @flex-development/mdast-util-from-markdown

See the Yarn documentation on Git protocols for details on installing from Git.

In Deno with esm.sh:

import { fromMarkdown } from 'https://esm.sh/@flex-development/mdast-util-from-markdown'

In browsers with esm.sh:

<script type="module">
  import { fromMarkdown } from 'https://esm.sh/@flex-development/mdast-util-from-markdown'
</script>

Use

Say we have the following markdown file example.md:

## Hello, *World*!

…and our module example.mjs looks as follows:

import { fromMarkdown } from '@flex-development/mdast-util-from-markdown'
import { inspect } from '@flex-development/unist-util-inspect'
import { read } from 'to-vfile'

const file = await read('example.md')
const tree = fromMarkdown(String(file))

console.log(inspect(tree))

…now running node example.mjs yields:

root[1] (1:1-2:1, 0-19)
└─0 heading[3] (1:1-1:19, 0-18)
    │ depth: 2
    ├─0 text "Hello, " (1:4-1:11, 3-10)
    ├─1 emphasis[1] (1:11-1:18, 10-17)
    │   └─0 text "World" (1:12-1:17, 11-16)
    └─2 text "!" (1:18-1:19, 17-18)
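
The inspected output above corresponds to a plain object tree. As a rough illustration, the sketch below rebuilds that tree by hand (positions omitted) and walks it to collect its text. `SketchNode`, `sketchTree`, and `collectText` are hypothetical names used only for this example; they are not exports of this package.

```typescript
// Hypothetical minimal node shape; real mdast nodes also carry `position`.
type SketchNode = {
  type: string
  depth?: number
  value?: string
  children?: SketchNode[]
}

// Hand-written equivalent of the tree printed above for `## Hello, *World*!`.
const sketchTree: SketchNode = {
  type: 'root',
  children: [
    {
      type: 'heading',
      depth: 2,
      children: [
        { type: 'text', value: 'Hello, ' },
        { type: 'emphasis', children: [{ type: 'text', value: 'World' }] },
        { type: 'text', value: '!' }
      ]
    }
  ]
}

// Depth-first walk that concatenates the `value` of every literal node.
function collectText(node: SketchNode): string {
  if (typeof node.value === 'string') return node.value
  return (node.children ?? []).map(collectText).join('')
}

console.log(collectText(sketchTree)) // prints "Hello, World!"
```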

API

`fromMarkdown(value[, encoding][, options])`

Turn markdown into a syntax tree.

Overloads

(value: Value | null | undefined, encoding?: Encoding | null | undefined, options?: Options) => Root
(value: Value | null | undefined, options?: Options | null | undefined) => Root

Parameters

value (Value | null | undefined) — markdown to parse
encoding (Encoding | null | undefined, optional) — character encoding for when value is Uint8Array
- default: 'utf8'
options (Options | null | undefined, optional) — configuration

Returns

(Root) mdast.
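
The two overloads differ only in whether the second argument is an encoding or an options object. One plausible way to resolve them, shown here as a sketch and not necessarily how this package does it, is to check whether the second argument is a string. `SketchOptions` and `normalizeArgs` are hypothetical stand-ins.

```typescript
// Hypothetical stand-in for the documented `Options` type.
type SketchOptions = {
  extensions?: unknown[] | null
  mdastExtensions?: unknown[] | null
}

// Resolve both call shapes into one `{ encoding, options }` pair.
function normalizeArgs(
  encoding?: string | SketchOptions | null,
  options?: SketchOptions | null
): { encoding: string | undefined; options: SketchOptions } {
  if (typeof encoding === 'string') {
    return { encoding, options: options ?? {} }
  }

  // The second argument was an options object (or nullish): shift it over.
  return { encoding: undefined, options: encoding ?? options ?? {} }
}

const withEncoding = normalizeArgs('utf-16le', { extensions: [] })
const sharedOptions: SketchOptions = { mdastExtensions: [] }
const withoutEncoding = normalizeArgs(sharedOptions)
```

In this sketch an `undefined` encoding means "use the default", which the parameter list above gives as `'utf8'`.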

`compiler([options])`

Create an mdast compiler.

👉 The compiler only understands complete buffering, not streaming.

Parameters

options (Options | null | undefined, optional) — configuration

Returns

(Compiler) mdast compiler.

`handles`

(Handles) Token types mapped to default token handlers.

👉 Default handlers are also exported by name. See src/handles.ts for more info.

`CompileContext`

mdast compiler context (TypeScript type).

Properties

buffer ((this: CompileContext) => undefined) — capture some of the output data
config (Config) — configuration
data (CompileData) — info passed around; key/value store
enter ((this: CompileContext, node: Nodes, token: Token, onError?: OnEnterError) => undefined) — enter a node
exit ((this: CompileContext, token: Token, onError?: OnExitError) => undefined) — exit a node
resume ((this: CompileContext) => string) — stop capturing and access the output data
sliceSerialize (TokenizeContext['sliceSerialize']) — get the string value of a token
stack (StackedNode[]) — stack of nodes
tokenStack (TokenTuple[]) — stack of tokens
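
To make the `buffer` and `resume` pair more concrete, here is a minimal sketch of the capture pattern they describe: `buffer` starts collecting output in a temporary fragment, and `resume` stops collecting and returns what was gathered as a string. The `SketchContext` class and its `write` method are assumptions made for illustration; the real context stores mdast nodes and has no `write` method.

```typescript
type SketchLiteral = { type: 'text'; value: string }
type SketchFragment = { type: 'fragment'; children: SketchLiteral[] }

class SketchContext {
  // The stack starts with one container, playing the role of the root.
  stack: SketchFragment[] = [{ type: 'fragment', children: [] }]

  // Start capturing: later output lands in a fresh fragment.
  buffer(): void {
    this.stack.push({ type: 'fragment', children: [] })
  }

  // Hypothetical helper: append text to the node on top of the stack.
  write(value: string): void {
    const top = this.stack[this.stack.length - 1]
    if (top) top.children.push({ type: 'text', value })
  }

  // Stop capturing: pop the fragment and serialize what it collected.
  resume(): string {
    const fragment = this.stack.pop()
    if (!fragment) throw new Error('Nothing to resume')
    return fragment.children.map(child => child.value).join('')
  }
}

const captureContext = new SketchContext()
captureContext.buffer()
captureContext.write('alt ')
captureContext.write('text')
const captured = captureContext.resume()
console.log(captured) // prints "alt text"
```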

`CompileData`

Interface of tracked data (TypeScript interface).

interface CompileData {/* see code */}

When developing extensions that use more data, augment CompileData to register custom fields:

declare module '@flex-development/mdast-util-from-markdown' {
  interface CompileData {
    mathFlowInside?: boolean | undefined
  }
}

`Compiler`

Turn micromark events into a syntax tree (TypeScript type).

Parameters

events (Event[]) — list of events

Returns

(Root) mdast.
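
The general idea of compiling events into a tree can be sketched with a stack: an `enter` event opens a node and an `exit` event closes it. The code below is a simplified illustration under stated assumptions, not this package's compiler. It drops the `TokenizeContext` from each event, maps token types directly to node types instead of going through handles, and reads text from a hypothetical `value` field on the token instead of using `sliceSerialize`.

```typescript
// Hypothetical, simplified shapes; real events are `[kind, Token, TokenizeContext]`.
type SketchToken = { type: string; value?: string }
type SketchEvent = ['enter' | 'exit', SketchToken]
type SketchTreeNode = {
  type: string
  value?: string
  children: SketchTreeNode[]
}

function compileSketch(events: SketchEvent[]): SketchTreeNode {
  const root: SketchTreeNode = { type: 'root', children: [] }
  const stack: SketchTreeNode[] = [root]

  for (const [kind, token] of events) {
    if (kind === 'enter') {
      // Open a node, attach it to the current parent, and descend into it.
      const node: SketchTreeNode = { type: token.type, children: [] }
      if (token.value !== undefined) node.value = token.value
      const parent = stack[stack.length - 1]
      if (parent) parent.children.push(node)
      stack.push(node)
    } else {
      // Close the node that is currently open.
      stack.pop()
    }
  }

  return root
}

const compiled = compileSketch([
  ['enter', { type: 'heading' }],
  ['enter', { type: 'text', value: 'Hello' }],
  ['exit', { type: 'text' }],
  ['exit', { type: 'heading' }]
])
```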

`Config`

Configuration (TypeScript type).

Properties

canContainEols (string[]) — token types where line endings are used
enter (Handles) — opening handles
exit (Handles) — closing handles
transforms (Transform[]) — tree transforms

`Encoding`

Encodings supported by TextDecoder (TypeScript type).

See micromark-util-types for more info.

type Encoding =
  | 'utf-8' // always supported in node
  | 'utf-16le' // always supported in node
  | 'utf-16be' // not supported when ICU is disabled
  | (string & {}) // everything else (depends on browser, or full ICU data)

`Event`

The start or end of a token amongst other events (TypeScript type).

See micromark-util-types for more info.

type Event = ['enter' | 'exit', Token, TokenizeContext]

`Extension`

Change how tokens are turned into nodes (TypeScript type).

See Config for more info.

type Extension = Partial<Config>
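
Because an extension is a partial `Config`, several extensions have to be combined into one full configuration before compiling. The sketch below shows one way this could work, assuming list fields are concatenated and handle maps are merged with later extensions taking precedence. `SketchConfig`, `SketchExtension`, and `mergeExtensions` are hypothetical names, and the sample token types are illustrative only.

```typescript
type SketchHandle = (token: { type: string }) => void
type SketchConfig = {
  canContainEols: string[]
  enter: Record<string, SketchHandle>
  exit: Record<string, SketchHandle>
  transforms: Array<(tree: unknown) => unknown>
}
type SketchExtension = Partial<SketchConfig>

function mergeExtensions(
  base: SketchConfig,
  extensions: SketchExtension[]
): SketchConfig {
  // Copy the base so that merging never mutates the defaults.
  const merged: SketchConfig = {
    canContainEols: [...base.canContainEols],
    enter: { ...base.enter },
    exit: { ...base.exit },
    transforms: [...base.transforms]
  }

  for (const extension of extensions) {
    merged.canContainEols.push(...(extension.canContainEols ?? []))
    merged.transforms.push(...(extension.transforms ?? []))
    Object.assign(merged.enter, extension.enter)
    Object.assign(merged.exit, extension.exit)
  }

  return merged
}

const baseConfig: SketchConfig = {
  canContainEols: ['emphasis'],
  enter: {},
  exit: {},
  transforms: []
}
const mathExtension: SketchExtension = {
  canContainEols: ['mathFlow'],
  enter: { mathFlow() {} }
}
const mergedConfig = mergeExtensions(baseConfig, [mathExtension])
```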

`Fragment`

Temporary node (TypeScript type).

type Fragment = Omit<mdast.Parent, 'children' | 'type'> & {
  children: mdast.PhrasingContent[]
  type: 'fragment'
}

Properties

children (mdast.PhrasingContent[]) — list of children
type ('fragment') — node type

`Handle`

Handle a token (TypeScript type).

Parameters

this (CompileContext) — compiler context
token (Token) — token to handle

Returns

(undefined | void) Nothing.
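
Since a handle receives the compiler context as `this`, it needs to be a regular function and not an arrow function. A minimal sketch, with a hypothetical `SketchHandleContext` standing in for `CompileContext` and `atxHeading` used as a sample token type:

```typescript
type SketchHandleToken = { type: string }
type SketchHandleContext = { visited: string[] }
type SketchTokenHandle = (
  this: SketchHandleContext,
  token: SketchHandleToken
) => undefined | void

// `this` is typed from the handle signature, so no annotation is needed here.
const onEnterHeading: SketchTokenHandle = function (token) {
  this.visited.push(`enter:${token.type}`)
}

// A caller supplies the context explicitly when invoking the handle.
const handleContext: SketchHandleContext = { visited: [] }
onEnterHeading.call(handleContext, { type: 'atxHeading' })
console.log(handleContext.visited) // prints [ 'enter:atxHeading' ]
```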

`Handles`

Token types mapped to handles (TypeScript type).

type Handles = Record<string, Handle>

`OnEnterError`

Handle the case where the right token is open, but is closed by the left token, or because end of file was reached (TypeScript type).

Parameters

this (Omit<CompileContext, 'sliceSerialize'>) — compiler context
left (Token | undefined) — left token
right (Token) — open token

Returns

(undefined) Nothing.

`OnExitError`

Handle the case where the right token is open, but is closed by exiting the left token (TypeScript type).

Parameters

this (Omit<CompileContext, 'sliceSerialize'>) — compiler context
left (Token) — left token
right (Token) — open token

Returns

(undefined) Nothing.
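
The two error handlers come into play when the token being closed does not match the token on top of the stack. The sketch below illustrates that check under some assumptions: handlers are plain functions without a `this` context, a handler passed at exit takes precedence over one registered at enter, and an error is thrown when neither exists. `exitToken` and the `Sketch…` types are hypothetical.

```typescript
type SketchTok = { type: string }
type SketchOnError = (left: SketchTok | undefined, right: SketchTok) => void
type SketchTuple = [token: SketchTok, handler: SketchOnError | undefined]

function exitToken(
  tokenStack: SketchTuple[],
  closing: SketchTok,
  onExitError?: SketchOnError
): void {
  const open = tokenStack.pop()
  if (!open) throw new Error(`Cannot close ${closing.type}: nothing is open`)

  const [opened, onEnterError] = open
  if (opened.type === closing.type) return

  // Mismatch: prefer the handler given at exit, then the one given at enter.
  const handler = onExitError ?? onEnterError
  if (!handler) {
    throw new Error(`Cannot close ${closing.type}: ${opened.type} is still open`)
  }

  // `left` is the closing token, `right` is the token that was still open.
  handler(closing, opened)
}

const mismatches: string[] = []
const openTokens: SketchTuple[] = [
  [
    { type: 'listItem' },
    (left, right) => {
      mismatches.push(`${right.type} closed by ${left?.type}`)
    }
  ]
]
exitToken(openTokens, { type: 'list' })
console.log(mismatches[0]) // prints "listItem closed by list"
```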

`Options`

Configuration options (TypeScript type).

Properties

extensions? (micromark.Extension[] | null | undefined) — micromark extensions to change how markdown is parsed into tokens
from? (StartPoint | null | undefined) — point before the first character in the markdown value. Node positions are relative to this point
mdastExtensions? ((Extension | Extension[])[] | null | undefined) — extensions for this utility to change how tokens are turned into nodes
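
To illustrate what "relative to this point" could mean for `from`, the sketch below advances a point through some consumed text, starting from a given start point. It assumes `\n` line endings, 1-indexed lines and columns, and offsets counted in UTF-16 code units; `SketchPoint` and `advance` are hypothetical and only approximate how positions accumulate.

```typescript
type SketchPoint = { line: number; column: number; offset: number }

// Move a copy of `from` forward over `consumed`, one code unit at a time.
function advance(from: SketchPoint, consumed: string): SketchPoint {
  const point: SketchPoint = { ...from }

  for (let index = 0; index < consumed.length; index++) {
    point.offset++

    if (consumed.charCodeAt(index) === 10 /* `\n` */) {
      // A line ending moves to the start of the next line.
      point.line++
      point.column = 1
    } else {
      point.column++
    }
  }

  return point
}

const startPoint: SketchPoint = { line: 3, column: 5, offset: 40 }
const endPoint = advance(startPoint, '## Hi\nthere')
console.log(endPoint) // prints { line: 4, column: 6, offset: 51 }
```

Only the first line is shifted horizontally by `from.column`; later lines start again at column 1, while lines and offsets keep counting from the start point.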

`Point`

A location in the source document and chunk (TypeScript type).

See micromark-util-types for more info.

`StackedNode`

A node on the compiler context stack (TypeScript type).

type StackedNode = Fragment | mdast.Nodes

`StartPoint`

Point before first character in a markdown value (TypeScript type).

type StartPoint = Omit<Point, '_bufferIndex' | '_index'>

`TokenTuple`

List containing an open token on the stack, and an optional error handler to use if the token isn't closed properly (TypeScript type).

type TokenTuple = [token: Token, handler: OnEnterError | undefined]

`Token`

A span of chunks (TypeScript interface).

See micromark-util-types for more info.

`TokenizeContext`

A context object that helps with tokenizing markdown constructs (TypeScript interface).

See micromark-util-types for more info.

`Transform`

Extra transform, to change the AST afterwards (TypeScript type).

Parameters

tree (Root) — tree to transform

Returns

(Root | null | undefined | void) New tree or nothing (in which case the current tree is used).
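
The return contract means a transform can either hand back a new tree or return nothing to keep the current one. A minimal sketch of applying a list of transforms under that contract, with hypothetical `SketchTransformNode`, `SketchTransform`, and `applyTransforms` names:

```typescript
type SketchTransformNode = {
  type: string
  value?: string | undefined
  children?: SketchTransformNode[]
}
type SketchTransform = (
  tree: SketchTransformNode
) => SketchTransformNode | null | undefined | void

function applyTransforms(
  tree: SketchTransformNode,
  transforms: SketchTransform[]
): SketchTransformNode {
  let current = tree

  for (const transform of transforms) {
    // Fall back to the current tree when the transform returns nothing.
    current = transform(current) || current
  }

  return current
}

// Returns nothing, so the tree passes through unchanged.
const noop: SketchTransform = () => undefined
// Returns a new tree with an uppercased value.
const uppercase: SketchTransform = tree => ({
  ...tree,
  value: tree.value?.toUpperCase()
})

const transformed = applyTransforms({ type: 'text', value: 'hi' }, [noop, uppercase])
console.log(transformed.value) // prints "HI"
```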

`Value`

Contents of a file (TypeScript type).

See micromark-util-types for more info.

type Value = Uint8Array | string
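
Since a value can be either bytes or a string, the encoding only matters for the `Uint8Array` case. As a sketch of that distinction, the hypothetical `valueToString` helper below decodes bytes with the standard `TextDecoder` and passes strings through unchanged, defaulting to `'utf8'` as documented for `fromMarkdown`. It is not how this package is necessarily implemented.

```typescript
type SketchValue = Uint8Array | string

function valueToString(value: SketchValue, encoding?: string | null): string {
  // Strings need no decoding; the encoding is ignored for them.
  if (typeof value === 'string') return value
  return new TextDecoder(encoding ?? 'utf8').decode(value)
}

const markdownBytes = new TextEncoder().encode('## Hello, *World*!')
console.log(valueToString(markdownBytes)) // prints "## Hello, *World*!"
```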

List of extensions

mdast-util-directive — directives
mdast-util-frontmatter — frontmatter (YAML, TOML, more)
mdast-util-gfm — GFM
mdast-util-gfm-autolink-literal — GFM autolink literals
mdast-util-gfm-footnote — GFM footnotes
mdast-util-gfm-strikethrough — GFM strikethrough
mdast-util-gfm-table — GFM tables
mdast-util-gfm-task-list-item — GFM task list items
mdast-util-math — math
mdast-util-mdx — MDX
mdast-util-mdx-expression — MDX expressions
mdast-util-mdx-jsx — MDX JSX
mdast-util-mdxjs-esm — MDX ESM

Syntax

Markdown is parsed according to CommonMark. Extensions can add support for other syntax. If you’re interested in extending markdown, more information is available in micromark’s readme.

Syntax tree

The syntax tree is mdast.

Types

This package is fully typed with TypeScript.

Security

As markdown is sometimes used for HTML, and improper use of HTML can open you up to a cross-site scripting (XSS) attack, use of mdast-util-from-markdown can also be unsafe.

When going to HTML, use this utility in combination with hast-util-sanitize to make the tree safe.

Contribute

See CONTRIBUTING.md.

