serhii.net

In the middle of the desert you can say anything you want

27 Dec 2023

Converting markdown thesis to pdf and stuff

Context:

> pandoc 230928-1745\ Masterarbeit\ draft.md -o master_thesis.pdf
# unicode magic
Try running pandoc with --pdf-engine=xelatex.
# thank you

> pandoc 230928-1745\ Masterarbeit\ draft.md -o master_thesis.pdf --pdf-engine=xelatex
# a volley of...
[WARNING] Missing character: There is no о (U+043E) in font [lmroman10-italic]:mapping=tex-text;!
  • Ugly:
    • 2023-12-27-170016_294x67_scrot.png
    • 2023-12-27-170044_623x83_scrot.png
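
The warning above means the font in use (lmroman10-italic) has no glyph for Cyrillic о. Before picking a replacement font, it may help to list which non-ASCII characters the draft actually contains. A minimal Python sketch; the function name `non_ascii_chars` and the sample string are made up for illustration and have nothing to do with pandoc itself:

```python
import unicodedata
from collections import Counter


def non_ascii_chars(text: str) -> list[tuple[str, str, str, int]]:
    """Return (char, code point, Unicode name, count) for every non-ASCII character."""
    counts = Counter(ch for ch in text if ord(ch) > 127)
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"), n)
        for ch, n in sorted(counts.items())
    ]


# Hypothetical stand-in for the thesis text; in practice read the .md file instead.
sample = "чоловік saw a dog"
for ch, code, name, n in non_ascii_chars(sample):
    print(ch, code, name, n)
```

Any font passed as mainfont then has to cover every block that shows up in this list.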

Makefile magic etc

Exporting Hugo to PDF | akos.ma looks nice.

build/pdf/%.pdf: content/posts/%/index.md
	$(PANDOC) --write=pdf --pdf-engine=xelatex \
		--variable=papersize:a4 --variable=links-as-notes \
		--variable=mainfont:DejaVuSans \
		--variable=monofont:DejaVuSansMono \
		--resource-path=$$(dirname $<) --out=$@ $< 2> /dev/null

Let’s try:

pandoc 230928-1745\ Masterarbeit\ draft.md -o master_thesis.pdf --pdf-engine=xelatex  --variable=links-as-notes \
--variable=mainfont:DejaVuSans \
--variable=monofont:DejaVuSansMono

Better, but not by much: HTML is not parsed, and lists only count as lists after a blank line, it seems.

2023-12-27-170319_692x290_scrot.png 2023-12-27-170246_579x202_scrot.png 2023-12-27-170253_812x137_scrot.png

Pandoc’s Markdown requires a newline after a paragraph for a list to render · Issue #6590 · jgm/pandoc

pandoc 230928-1745\ Masterarbeit\ draft.md -o master_thesis.pdf --pdf-engine=xelatex  --variable=links-as-notes \
--variable=mainfont:DejaVuSans \
--variable=monofont:DejaVuSansMono \
--from=markdown+lists_without_preceding_blankline

Better, but quotes unsolved: 2023-12-27-170541_681x194_scrot.png

Markdown blockquote shouldn’t require a leading blank line · Issue #7069 · jgm/pandoc

pandoc 230928-1745\ Masterarbeit\ draft.md -o master_thesis.pdf --pdf-engine=xelatex  --variable=links-as-notes \
--variable=mainfont:DejaVuSans \
--variable=monofont:DejaVuSansMono \
--from=markdown+lists_without_preceding_blankline
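# +blank_before_blockquote is already the default in pandoc markdown; disabling it (-blank_before_blockquote) is what lifts the blank-line requirement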

ACTUALLY, -f gfm (GitHub-flavoured Markdown) solves basically everything. commonmark doesn’t parse LaTeX; commonmark_x (‘with many md extensions’) at first sight behaves similarly to gfm.

I think HTML is the last remaining problem.

Raw HTML says it’s only for strict:

--from=markdown_strict+markdown_in_html_blocks

msword - Pandoc / Latex / Markdown - TeX - LaTeX Stack Exchange suggests going from md to tex and then from tex to pdf; interesting approach.

6.11 Write raw LaTeX code | R Markdown Cookbook says complex LaTeX code may be too complex for the Markdown parser, so it should go into a raw block.

That means writing it like this, except w/o the backslashes:

\```{=latex}
$\underset{\text{NOUN-NOM}}{\overset{\text{man}}{\text{чоловік-}\varnothing}}$ $\underset{\text{PST}}{\overset{\text{saw}}{\text{побачив}}}$ $\underset{\text{NOUN-ACC}}{\overset{\text{dog}}{\text{собак-у}}}$.
\```

Then commonmark_x can handle that.

EDIT: --standalone!

More on HTML sub/sup to PDF

I don’t need HTML, I need <sub>.

  • pandoc md has a syntax for this: Pandoc - Pandoc User’s Guide

    • …but I’m not using pandoc md :(
  • Options

    • Can I replace my tags w/ that with yet another filter?
    • Ignore that obsidian/hugo can’t parse them and use pandoc syntax?.. and do --from=markdown+lists_without_preceding_blankline-blank_before_blockquote? :(
    • export to HTML w/ mathjax and from it PDF?
    • just use latex syntax everywhere? :(

ChatGPT tried to create a filter but nothing works, I’ll leave it for later: https://chat.openai.com/share/c94fffbe-1e90-4bc0-9e97-6027eeab281a
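
One way to sidestep a filter entirely would be to rewrite the tags before pandoc ever sees the file. The sketch below is a plain regex preprocessing step, not a pandoc filter; the function name `html_subsup_to_pandoc` is made up here. It turns `<sub>x</sub>` into `~x~` and `<sup>x</sup>` into `^x^`, which only helps if the chosen reader has the subscript/superscript extensions enabled (pandoc's own markdown does by default; whether gfm or commonmark_x accept them still needs checking).

```python
import re

# Matches <sub>...</sub> or <sup>...</sup> with no nested tags inside.
TAG = re.compile(r"<(sub|sup)>([^<]*)</\1>")


def html_subsup_to_pandoc(text: str) -> str:
    """Rewrite <sub>x</sub> as ~x~ and <sup>x</sup> as ^x^ (pandoc markdown syntax)."""

    def repl(m: re.Match) -> str:
        marker = "~" if m.group(1) == "sub" else "^"
        # pandoc wants spaces inside sub/superscripts escaped with a backslash
        body = m.group(2).replace(" ", "\\ ")
        return f"{marker}{body}{marker}"

    return TAG.sub(repl, text)


print(html_subsup_to_pandoc("H<sub>2</sub>O and x<sup>n + 1</sup>"))
```

Caveat: this is blind to context, so tags inside code blocks would be rewritten as well.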

HTML

This produces the best HTML documents:

> pandoc 230928-1745\ Masterarbeit\ draft.md -o master_thesis.html \
--from=gfm --mathjax --standalone

NB: if I add CSS, it should be given as an absolute path.
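
A small Python sketch of building that command with the stylesheet resolved to an absolute path. The helper name `pandoc_html_args` and the file name `style.css` are hypothetical; only `--css`, `--from`, `--mathjax` and `--standalone` are actual pandoc options:

```python
from pathlib import Path


def pandoc_html_args(source: str, output: str, css: str) -> list[str]:
    """Build the pandoc argument list, resolving the stylesheet to an absolute path."""
    css_abs = Path(css).resolve()  # resolved against the current working directory
    return [
        "pandoc", source,
        "-o", output,
        "--from=gfm", "--mathjax", "--standalone",
        f"--css={css_abs}",
    ]


args = pandoc_html_args(
    "230928-1745 Masterarbeit draft.md",
    "master_thesis.html",
    "style.css",  # hypothetical stylesheet name
)
print(args)
# To actually run it: import subprocess; subprocess.run(args, check=True)
```

Passing the arguments as a list also avoids having to escape the spaces in the file name.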

Later

Convert Markdown to PDF

Callouts

It’d be cool to wrap examples in the same environment!

https://forum.obsidian.md/t/rendering-callouts-similarly-in-pandoc/40020:

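-- https://forum.obsidian.md/t/rendering-callouts-similarly-in-pandoc/40020/6
-- Turns Obsidian callouts ("> [!NOTE]- Title") into divs with class "callout".
local stringify = (require "pandoc.utils").stringify

function BlockQuote (el)
    local start = el.content[1]
    -- A callout starts with a paragraph whose first token is [!TYPE], [!TYPE]- or [!TYPE]+
    if (start and start.t == "Para" and start.content[1] and
        start.content[1].t == "Str" and
        start.content[1].text:match("^%[!%w+%][-+]?$")) then
        local _, _, ctype = start.content[1].text:find("%[!(%w+)%]")
        el.content:remove(1)     -- drop the header paragraph from the body
        start.content:remove(1)  -- drop the [!TYPE] marker, leaving only the title
        local div = pandoc.Div(el.content, {class = "callout"})
        div.attributes["data-callout"] = ctype:lower()
        -- the parentheses keep only gsub's first return value (the string)
        div.attributes["title"] = (stringify(start.content):gsub("^ ", ""))
        return div
    else
        return el
    end
end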

Makes:

> [!NOTE]- callout Title
>
> callout content

into

::: {.callout data-callout="note" title="callout Title"}
callout content
:::
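
The same mapping can be sketched at the text level in Python, mostly as a quick way to check what the filter is expected to produce. This is a rough regex analogue and not a replacement for the AST-based Lua filter (a title containing quotes, for instance, would break the attribute); the name `callout_to_div` is made up here:

```python
import re

# Same shape as the Lua pattern: [!TYPE], optionally followed by - or +, then a title.
HEADER = re.compile(r"^>\s*\[!([A-Za-z0-9]+)\][-+]?\s*(.*)$")


def callout_to_div(block: str) -> str:
    """Convert one Obsidian callout blockquote into a pandoc fenced div.

    Text that does not start with a callout header is returned unchanged.
    """
    lines = block.strip("\n").split("\n")
    m = HEADER.match(lines[0])
    if not m:
        return block
    ctype, title = m.group(1).lower(), m.group(2).strip()
    # Strip the leading "> " from the remaining lines
    body = [re.sub(r"^>\s?", "", line) for line in lines[1:]]
    content = "\n".join(body).strip("\n")
    return f'::: {{.callout data-callout="{ctype}" title="{title}"}}\n{content}\n:::'


print(callout_to_div("> [!NOTE]- callout Title\n>\n> callout content"))
```
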

And some CSS for it:

.callout {
    color: red; /* Set text color to red */
    border: 1px solid red; /* Optional: add a red border */
    padding: 10px; /* Optional: add some padding */
    /* Add any other styling as needed */
}

Then this turns it into pretty HTML:

pandoc callout.md -L luas/obsidian-callouts.lua -t markdown -s | pandoc --standalone -o some_test.html --css luas/callout-style.css

The callout ends up in the HTML as:

<div class="callout" data-callout="note" title="callout Title">
<p>callout content</p>
</div>

For PDF it’s more complex: it will need a header file like the one below etc. later on. TODO

\usepackage{xcolor} % Required for color definition
\newenvironment{callout}{
  \color{red} % Sets the text color to red within the environment
  % Add any other formatting commands here
}{}

Unrelated

Footnotes

Tufte CSS with pandoc

Damn! Just had to replace index.md with my thesis, then make all and it just …worked. Wow. 2023-12-28-151213_1231x720_scrot.png

Apparently, to make a footnote not a sidenote I just have to add - to the footnote itself. It would be trivial to replace that with an @ etc., and then I get my initial plan: citations as citations, and footnotes with my remarks as sidenotes.

I can add --from gfm --mathjax to the makefile command and it works with all my other requirements!

pandoc \
	--katex \
	--section-divs \
	--from gfm \
	--mathjax \
	--filter pandoc-sidenote \
	--to html5+smart \
	--template=tufte \
	--css tufte.css --css pandoc.css --css pandoc-solarized.css --css tufte-extra.css \
	--output docs/tufte-md/index.html \
	docs/tufte-md/index.md

I wonder if I can modify it to create latex-style sidenotes, it should be very easy: pandoc-sidenote/src/Text/Pandoc/SideNote.hs at master · jez/pandoc-sidenote

Numbering references etc

![Caption](file.ext){#fig:label}
$$ math $$ {#eq:label}
# Section {#sec:section}

TODO figure out, and latex as well.

Citations

TODO
