Qubyte Codes

Custom markdown blocks with marked

I use marked to do the markdown rendering for this blog. A recent feature makes it possible to create custom block types with a little hacking. In this post I show you how!

I'll be using mathematics (TeX) blocks for my example. The marked setup code looks like this:

marked.use({
  walkTokens(token) {
    const { type, raw } = token;

    // Modify paragraph blocks beginning and ending with $$.
    if (type === 'paragraph' && raw.startsWith('$$\n') && raw.endsWith('\n$$')) {
      token.type = 'code';
      token.lang = 'mathematics';
      token.text = token.raw.slice(3, -3); // Remove the $$ boundaries.
      token.tokens.length = 0; // Remove child tokens.
    }
  },
  renderer: {
    code(code, language) {
      // Use custom mathematics renderer.
      if (language === 'mathematics') {
        return renderMathematics(code);
      }

      // Use default code renderer.
      return false;
    }
  }
});

I'm passing two things to marked to configure it here. The first is a token walker function, which is the recent feature which makes this all possible. It is called for each token, traversing the children of a token before it progresses to its siblings (so it's sort of depth-first).

The idea is for blocks of text with a $$ above and below them to be handled as mathematics. To a person this looks like a fenced code block with two dollar symbols in the place of the three backticks. For example:

Some example text with some mathematics to render below:

$$
a^2 = b^2 + c^2
$$

Some example text below.

It's a common extension to place LaTeX code inside $$ delimited blocks. Even if you're not familiar, the dollar symbols above and below are a little like a fenced code block. To marked the block looks like a paragraph. This means that the token walker will receive some paragraph tokens which need to be modified.

To know the difference, paragraph tokens are checked by the token walker to see if they begin with $$\n and end with \n$$. When they do, the block is modified to look like a code block with a special 'markdown' language. Child tokens are removed because the content shouldn't be treated as markdown, and the text property is set by snipping the leading and trailing dollar symbols and newlines off.

The second part of this trick is in the renderer option. The renderer for code blocks is modified with special handling for the 'mathematics' language. The code parameter received by it is the text we set on the token, so it's ready to be rendered to mathematics. The rendering itself is beyond the scope of this post, but I use MathJax. When code is of any other language the custom code renderer returns false to instruct marked to use the default code rendering behaviour.