Pandoc is a program created by John MacFarlane that can convert one type of file to another.
(example.extension) -> (example.other-extension)
It basically reads input and with a little magic converts it to the specified output. So if you have a markdown file you can very simply convert it to html using:
pandoc example.md -o example.html
Say your example.md looks like this:
# Example
Text
it will be converted to:
<h1 id="example">Example</h1>
<p>Text</p>
Pandoc supports many more formats, you can check them here. If your file
doesn’t have an extension, you can specify the format by using
--from/-f format or --to/-t format.
Another cool thing is that you don’t need to specify the output
file: you can simply run it and the output will be written
to stdout.
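A quick way to try both points at once is to skip files entirely. The sketch below pipes markdown into pandoc on stdin, names the formats with --from and --to, and captures what pandoc prints on stdout. The variable name html is only for illustration.

```shell
# No input file and no -o: pandoc reads stdin and writes to stdout.
# There is no file extension to guess from, so the formats are named explicitly.
html=$(printf '# Example\n\nText\n' | pandoc --from markdown --to html)
printf '%s\n' "$html"
```

This should print the same two html lines as the example.md conversion above.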
If you run:
pandoc example.md --standalone -o example.html
you will get a whole html document with head and body. This works
through templates. If you pass
--standalone/-s, pandoc looks up the default
template for the specified output format, in our case html, and uses it in
the conversion. If you want to see the default template for some
format, run:
pandoc -D format
Default html template:
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="$lang$" xml:lang="$lang$"$if(dir)$ dir="$dir$"$endif$>
<head>
<meta charset="utf-8" />
...
But what is this weird lang="$lang$" and other
similar definitions? When generating output, pandoc copies
the template and replaces these placeholders with the
metadata values.
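To see the substitution in isolation, here is a minimal sketch that uses a two-line template instead of the full default one. The file name tiny.html and the title value are made up for the example; $title$ and $body$ are regular pandoc template placeholders, and the title is set on the command line with --metadata.

```shell
workdir=$(mktemp -d)

# A tiny template: $title$ comes from metadata, $body$ is the converted document.
printf '<title>$title$</title>\n$body$\n' > "$workdir/tiny.html"

# --template implies --standalone, so the template is used for the output.
page=$(printf 'Text\n' | pandoc --from markdown --to html \
  --template "$workdir/tiny.html" --metadata title=Example)
printf '%s\n' "$page"
```

The output should be the template with both placeholders filled in: a title element containing Example, followed by the converted paragraph.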
Metadata are values that the user can define and that are
used in forming the output. You can specify them with the command line option
--metadata/-M key=value or in a YAML block at the top of the
markdown file:
---
title: Example
author: Your Name
---
It’s also possible to use
--metadata-file file.
Metadata can have types (like booleans, strings, lists), which is what distinguishes them from variables.
Variables are basically the same, but as mentioned
they don’t have types, so they can only be strings. Another
difference is that metadata can be used in filters, variables can’t.
To define a variable use
--variable/-V key=value.
If we want to put some metadata in our template, we can reference it inside the template by its name:
$value$
Pandoc will check if the value is defined in metadata and then put it where we placed it in the template.
If we are not sure that the value is set, we can use the
if check:
$if(value)$
//
$endif$
Note that if you use metadata and set
value=false, the check will not pass, because metadata is typed and false is a boolean. In a YAML block write value: "false" (a string) instead, or use variables, not metadata.
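The difference between a boolean false and the string "false" can be checked with a one-line template. This is only a sketch: the key name flag and the file name flag.txt are made up, and $else$ is the optional else branch of pandoc's if check.

```shell
workdir=$(mktemp -d)
printf '$if(flag)$flag is set$else$flag is not set$endif$\n' > "$workdir/flag.txt"

# --metadata parses false as a boolean, so the $if$ branch is skipped.
as_bool=$(printf 'Text\n' | pandoc --from markdown --to plain \
  --template "$workdir/flag.txt" --metadata flag=false)

# Quoted in a YAML block, "false" is a non-empty string, which counts as true.
as_string=$(printf -- '---\nflag: "false"\n---\nText\n' | pandoc --from markdown --to plain \
  --template "$workdir/flag.txt")

printf 'boolean: %s\nstring: %s\n' "$as_bool" "$as_string"
```

The first run should report that the flag is not set and the second that it is set.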
If we want to loop through a list we can write this:
$for(values)$
$values$
$endfor$
It will iterate over the values list and print every
element. To define a list in metadata we can use:
---
values: [example, ...]
---
Other cool things that you can put in templates are described in the manual (or
just man pandoc).
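Putting the if check and the loop together, a small end-to-end sketch could look like this. The file names list.md, list.html and list.out.html and the list items one and two are placeholders chosen for the example.

```shell
workdir=$(mktemp -d)

# Input with a title and a YAML list in its metadata block.
cat > "$workdir/list.md" <<'EOF'
---
title: Example
values: [one, two]
---
Text
EOF

# Template: optional heading via $if$, one <li> per list element via $for$.
cat > "$workdir/list.html" <<'EOF'
$if(title)$<h1>$title$</h1>$endif$
<ul>
$for(values)$
<li>$values$</li>
$endfor$
</ul>
EOF

pandoc "$workdir/list.md" --template "$workdir/list.html" -o "$workdir/list.out.html"
cat "$workdir/list.out.html"
```

Inside the loop, $values$ refers to the current element, so each item should end up in its own list entry.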
Okay, so we are now ready to make our first custom template.
We will change the title of our website to
title | author
We can base it on the default html template by running:
pandoc -D html > custom.html
After that, in the head section, we can change
this line:
19c19
< <title>$if(title-prefix)$$title-prefix$ – $endif$$pagetitle$</title>
---
> <title>$pagetitle$$if(author)$ | $author$$endif$</title>
Now the only thing left is to define metadata. Create
input.md with the following text:
---
title: Title
author: Your Name
---
And then compile it with
pandoc input.md --template custom.html -o output.html
and open output.html in your browser. A tab with the text
Title | Your Name should appear.
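If you would rather not edit the template by hand, the same change can be scripted. The sketch below swaps the title line with sed instead of a manual edit, which is an alternative approach and not part of the steps above. It assumes the default template keeps its title element on a single line.

```shell
workdir=$(mktemp -d)

# Start from the default template and replace the <title> line in one step.
pandoc -D html \
  | sed 's#<title>.*</title>#<title>$pagetitle$$if(author)$ | $author$$endif$</title>#' \
  > "$workdir/custom.html"

printf -- '---\ntitle: Title\nauthor: Your Name\n---\n' > "$workdir/input.md"
pandoc "$workdir/input.md" --template "$workdir/custom.html" -o "$workdir/output.html"

# Show the resulting <title> element.
grep '<title>' "$workdir/output.html"
```

The last command should show a title element containing Title | Your Name.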
The --standalone/-s output already includes some default css. You
can use it as is, but you can also make your own: copy this css,
change what you need, and then pass it with --css file or
link it in your template using:
<link rel="stylesheet" href="file">
Pandoc has built-in support for syntax highlighting using skylighting. It’s nice, but if you want to change the colors, there are a few ways to do it.
Pick one of the built-in styles listed by pandoc --list-highlight-styles and write
pandoc --syntax-highlighting name
Or export a style with
pandoc -o my.theme --print-highlight-style name
then manually change the colors using the #000000 format and write
pandoc --syntax-highlighting my.theme
Or generate a page with
pandoc --syntax-highlighting tango example.md -o example.html
and then copy the css, manually change the colors and write the css to your template.
As mentioned earlier, in pandoc we can use filters. Filters are programs that can change pandoc’s Abstract Syntax Tree (AST). It consists of blocks with the content that pandoc parsed from the input, which will be used to generate the specified output format.
input -> magic -> AST -> filter -> AST -> magic -> output
We can write filters in haskell or in other languages, but the best choice is usually lua. Pandoc embeds a lua interpreter, so lua filters are faster and there is no need to install other tools just to write a filter.
Calling --lua-filter name.lua will apply the
filter.
We write lua filters by creating a function named after the element we want to change. Some of the elements are Meta, Pandoc, Header and others.
We will start with creating a file date.lua and
writing:
function Meta(m)
  return m
end
Meta is the element that holds the metadata. The value m
is a table that acts like a dictionary. To change the date we need
to set m.date to a new date using os.date:
function Meta(m)
  m.date = os.date("%a, %d %b %Y %H:%M:%S %z", os.time())
  return m
end
That’s how you write a simple filter. You define a function named after the element, change some values in the table and return it. Here are more examples of lua filters.
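To check that the filter really sets the date, it can be run from the shell against a one-line template. This is a sketch under a few assumptions: the temporary directory and the template name date.txt are made up, and the exact timezone offset printed by %z depends on your system.

```shell
workdir=$(mktemp -d)

# The date filter from above, written to a file.
cat > "$workdir/date.lua" <<'EOF'
function Meta(m)
  m.date = os.date("%a, %d %b %Y %H:%M:%S %z", os.time())
  return m
end
EOF

# A one-line template that prints only the date field.
printf '$date$\n' > "$workdir/date.txt"

stamp=$(printf 'Text\n' | pandoc --from markdown --to plain \
  --lua-filter "$workdir/date.lua" --template "$workdir/date.txt")
printf '%s\n' "$stamp"
```

The printed line should be the current date in the format given to os.date, even though the input document defines no date at all.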
RSS is just a simple thing that cool kids use to inform people about changes on their site or to provide news. It is an xml file that you put on your website, and people use an rss reader to see if something was added. We can easily add this to our website using pandoc.
To create rss we need to comply with its specification.
After carefully reading it, the rss.xml should look
like this:
<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0">
<channel>
<title>Title of your rss</title>
<link>Link to your site</link>
<description>Description of your rss feed</description>
<language>en-us</language>
<copyright>Copyright 2025, Your Name</copyright>
<lastBuildDate>Last build date</lastBuildDate>
<item>
<title>Title of item</title>
<author>Your Name</author>
<link>Link to item</link>
<description><![CDATA[Description]]></description>
<pubDate>Publication date</pubDate>
</item>
...
</channel>
</rss>
Using template values we can break this into 2 files,
rss.xml and rss-item.xml. The template
rss.xml will not contain any items (we will add
them with $body$):
<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0">
<channel>
<title>$title$</title>
<link>$link$</link>
<description>$description$</description>
<language>en-us</language>
<copyright>Copyright $year$, $author$</copyright>
<lastBuildDate>$date$</lastBuildDate>
$body$
</channel>
</rss>
rss-item.xml should have this:
<item>
<title>$title$</title>
<author>$author$</author>
<link>$link$</link>
<description><![CDATA[$body$]]></description>
<pubDate>$date$</pubDate>
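</item>
We can collect the metadata that each template file needs, it will be useful later: rss.xml uses title, link, description, year, author and date, and rss-item.xml uses title, author, link and date.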
If we have a folder with news we can use:
for file in news/*.md; do
  pandoc "$file"
done
to iterate over every markdown file and print the output to stdout.
The next thing is to pipe the output for these files to pandoc:
for file in news/*.md; do
  pandoc "$file"
done | pandoc -o news.xml
Okay, so this takes every markdown file, converts it to
html and then pipes the combined output to a second pandoc that creates the file
news.xml. But this is not the end, because we need to
use templates to form our items and the whole rss:
for file in news/*.md; do
  pandoc "$file" --template rss-item.xml
done | pandoc --template rss.xml -o news.xml
Another thing to do is to format the dates, because they need to
follow the format from the specification. We can use lua filters for this. For the feed’s own date we can reuse
the file date.lua from the filters section.
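For reference, the date format the RSS 2.0 specification asks for (RFC 822 style) can be previewed from the shell before writing any lua. This sketch assumes GNU date, since the -d option is not available in every date implementation, and pins the timezone and locale so the result is predictable.

```shell
# Format a "year-month-day hour:minute" date the way an RSS pubDate expects.
# TZ and LC_ALL are pinned so the names come out in English and the time in UTC.
pubdate=$(TZ=UTC LC_ALL=C date -d '2025-09-27 21:20' '+%a, %d %b %Y %H:%M:%S %z')
printf '%s\n' "$pubdate"   # Sat, 27 Sep 2025 21:20:00 +0000
```

The format string is the same one passed to os.date in the lua filters, so both should produce dates of this shape.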
The second filter formats the date of each item
(item-date.lua):
function Meta(m)
  -- we take the date from metadata
  local date = pandoc.utils.stringify(m.date)
  -- date "year-month-day hour:minute"
  local year, month, day, hour, min = date:match("(%d+)-(%d+)-(%d+) (%d+):(%d+)")
  local time = os.time({year=year, month=month, day=day, hour=hour, min=min})
  m.date = os.date("%a, %d %b %Y %H:%M:%S %z", time)
  return m
end
The files from news should have the metadata collected earlier:
---
title: Title
author: Your Name
date: 2025-09-27 21:20
link: https://example.com/link
---
Also, we need to supply metadata for rss.xml. We can
use --metadata-file file or the arguments
--metadata key=value. A file is simpler, so create
news.md:
---
title: Title
author: Your Name
date: 2025-09-27 21:20
link: https://example.com/link
description: Description
---
We glue everything together and get:
for file in news/*.md; do
  pandoc "$file" --lua-filter item-date.lua --template rss-item.xml
done | pandoc --lua-filter date.lua --metadata-file news.md --template rss.xml -o news.xml
This should work, right?
No, there is one extra step, because the second pandoc treats the piped items as a document to convert and will try to format them as html, not xml. One way to fix this is to create a custom reader and writer; a simpler one is to collect the items in a shell variable and pass it as the body variable:
body=$(for file in news/*.md; do
  pandoc "$file" --lua-filter item-date.lua --template rss-item.xml
done)
pandoc --lua-filter date.lua --metadata-file news.md --variable body="$body" --template rss.xml -o news.xml < /dev/null
The quotes around $body keep all items together as one argument, and < /dev/null gives pandoc an empty input so it does not wait for stdin.
We can now create the website, but we need to manually call
pandoc or create special sh files; that’s inconvenient. So the
cherry on top of this article is a Makefile. There,
you can write your shell commands, give them a name and then call
make name to run them. So, for example, if
you want to create index.html:
index:
	pandoc index.md -o index.html ...
We can also define directories:
BUILD_DIR = build
index:
	pandoc index.md -o ${BUILD_DIR}/index.html ...
One thing to note here is that if you want to use shell variables in a recipe, you need to write $$ instead of $ and chain the lines with ; \ (otherwise each recipe line runs in its own shell):
index:
	filename=index; \
	pandoc $$filename.md -o ${BUILD_DIR}/$$filename.html ...
We can now use the swiss-army knife for converting documents to create websites with markdown that have an rss feed. You can change some things to fit your needs and create a website generator. The code for generating this website is here.