In this tutorial, you will first learn the basics of Markdown, an easy to read and write markup syntax for plain text, as well as Pandoc, a command line tool that converts plain text into a number of beautifully formatted file types: PDF, .docx, HTML, LaTeX, slide decks, and more.[1] With Pandoc as your digital typesetting tool, you can use Markdown syntax to add figures, a bibliography, formatting, and easily change citation styles from Chicago to MLA (for instance), all using plain text.
The tutorial assumes no prior technical knowledge, but it scales with experience, as we often suggest more advanced techniques towards the end of each section. These are clearly marked and can be revisited after some practice and experimentation.
Instead of following this tutorial in a mechanical way, we recommend you strive to understand the solutions offered here as a methodology, which may need to be tailored further to fit your environment and workflow. The installation of the necessary tools presents perhaps the biggest barrier to participation. Allot yourself enough time and patience to install everything properly, or do it with a colleague who has a similar set-up and help each other out. Consult the Useful Resources section below if you get stuck.[2]
Writing, storing, and retrieving documents are activities central to the humanities research workflow. And yet, many authors base their practice on proprietary tools and formats that sometimes fall short of even the most basic requirements of scholarly writing. Perhaps you can relate to being frustrated by the fragility of footnotes, bibliographies, figures, and book drafts authored in Microsoft Word or Google Docs. Nevertheless, most journals still insist on submissions in .docx format.
More than causing personal frustration, this reliance on proprietary tools and formats has long-term negative implications for the academic community. In such an environment, journals must outsource typesetting, alienating authors from the material contexts of publication and adding further unnecessary barriers to the unfettered circulation of knowledge.[3]
When you use MS Word, Google Docs, or Open Office to write documents, what you see is not what you get. Beneath the visible layer of words, sentences, and paragraphs lies a complicated layer of code understandable only to machines. Because of that hidden layer, your .docx and .pdf files depend on proprietary tools to be rendered correctly. Such documents are difficult to search, to print, and to convert into other file formats.
Moreover, time spent formatting your document in MS Word or Open Office is wasted, because all that formatting is removed by the publisher during submission. Both authors and publishers would benefit from exchanging files with minimal formatting, leaving the typesetting to the final stage of the publishing process.
This is where Markdown shines. Markdown is a syntax for marking semantic elements within a document explicitly, not in some hidden layer. The idea is to identify units that are meaningful to humans, like titles, sections, subsections, footnotes, and illustrations. At the very least, your files will always remain comprehensible to you, even if the editor you are currently using stops working or “goes out of business.”
Writing in this way liberates the author from the tool. Markdown can be written in any plain text editor and offers a rich ecosystem of software that can render that text into beautiful looking documents. For this reason, Markdown is currently enjoying a period of growth, not just as a means for writing scholarly papers but as a convention for online editing in general.
Popular general purpose plain text editors include Atom (all platforms) and Notepad++ (Windows only).
It is important to understand that Markdown is merely a convention. Markdown files are stored as plain text, further adding to the flexibility of the format. Plain text files have been around since the electronic typewriter. The longevity of this standard inherently makes plain text more sustainable and stable than proprietary formats. While files produced even ten years ago in Microsoft Word and Apple’s Pages can cause significant problems when opened with the latest version, it is still possible to open a file written in any number of “dead” plain text editors from the past several decades: AlphaPlus, Perfect Writer, Text Wizard, Spellbinder, WordStar, or Isaac Asimov’s favorite SCRIPSIT 2.0, made by Radio Shack. Writing in plain text guarantees that your files will remain readable ten, fifteen, twenty years from now. In this tutorial, we outline a workflow that frees the researcher from proprietary word processing software and fragile file formats.
It is now possible to write a wide range of documents in one format (articles, blog posts, wikis, syllabi, and recommendation letters) using the same set of tools and techniques to search, discover, back up, and distribute our materials. Your notes, blog entries, code documentation, and wikis can all be authored in Markdown. Increasingly, many platforms like WordPress, Reddit, and GitHub support Markdown authorship natively. In the long term, your research will benefit from such unified workflows, making it easier to save, search, share, and organize your materials.
Inspired by best practices in a variety of disciplines, we were guided by the following principles:
Sustainability. Plain text both ensures transparency and answers the standards of long-term preservation. MS Word may go the way of Word Perfect in the future, but plain text will always remain easy to read, catalog, mine, and transform. Furthermore, plain text enables easy and powerful versioning of the document, which is useful in collaboration and organizing drafts. Your plain text files will be accessible on cell phones, tablets, or, perhaps, on a low-powered terminal in some remote library. Plain text is backwards compatible and future-proof. Whatever software or hardware comes along next, it will be able to understand your plain text files.
Preference for human-readable formats. When writing in Word or Google Docs, what you see is not what you get. The .doc file contains hidden, automatically-generated formatting characters, creating an obfuscated typesetting layer that is difficult for the user to troubleshoot. Something as simple as pasting an image or text from the browser can have unpredictable effects on your document’s formatting.
Separation of form and content. Writing and formatting at the same time is distracting. The idea is to write first, and format later, as close as possible to the time of publication. A task like switching from Chicago to MLA formatting should be painless. Journal editors who want to save time on needless formatting and copy editing should be able to provide their authors with a formatting template which takes care of the typesetting minutiae.
Support for the academic apparatus. The workflow needs to handle footnotes, figures, international characters, and bibliographies gracefully.
Platform independence. As the vectors of publication multiply, we need to be able to generate a multiplicity of formats including for slide projection, print, web, and mobile. Ideally, we would like to be able to generate the most common formats without breaking bibliographic dependencies. Our workflow needs to be portable as well: it would be nice to be able to copy a folder to a thumb drive and know that it contains everything needed for publication. Writing in plain text means you can easily share, edit, and archive your documents in virtually any environment. For example, a syllabus written in Markdown can be saved as a PDF, printed as a handout, and converted into HTML for the web, all from the same file. Both web and print documents should be published from the same source and look similar, preserving the logical layout of the material.
Markdown and LaTeX answer all of these requirements. We chose Markdown (and not LaTeX) because it offers the most light-weight and clutter free syntax (hence, mark down) and because when coupled with Pandoc it allows for the greatest flexibility in outputs (including .docx and .tex files).[4]
We purposefully omit some of the granular, platform- or operating system-bound details of installing the software listed below. For example, it makes no sense to provide installation instructions for LaTeX, when the canonical online instructions for your operating system will always remain more current and more complete. Similarly, the mechanics of Pandoc installation are best explored by searching for “installing Pandoc” on Google, with the likely first result being Pandoc’s homepage.
Plain text editor. Entering the world of plain-text editing expands your choice of innovative authoring tools dramatically. Search online for “markdown text editor” and experiment with your options. It does not matter what you use as long as it is explicitly a plain text editor, such as Atom or Notepad++. Remember, since we are not tied to the tool, you can change editors at any time.
Command line terminal. Working “in the command line” is equivalent to typing commands into the terminal. On a Mac you simply need to search your Finder for “Terminal”. On Windows, use PowerShell. Linux users are likely to be familiar with their terminals already. We will cover the basics of how to find and use the command line below.
Pandoc. Detailed, platform-specific installation instructions are available at the Pandoc website. Installation of Pandoc on your machine is crucial for this tutorial, so be sure to take your time and click through the instructions. Pandoc was created and is maintained by John MacFarlane, Professor of Philosophy at the University of California, Berkeley. This is humanities computing at its best and will serve as the engine of our workflow. With Pandoc, you will be able to compile text and bibliography into beautifully formatted and flexible documents. Once you’ve followed the installation instructions, verify that Pandoc is installed by entering pandoc --version into the command line. We assume that you have at least version 1.12.3, released in January 2014.
The following two pieces of software are recommended, but not required to complete this tutorial.
Zotero or Endnote. Bibliographic reference software like Zotero and Endnote are indispensable tools for organizing and formatting citations in a research paper. These programs can export your libraries as a BibTeX file (which you will learn more about in Case 2 below). This file, itself a formatted plain text document of all your citations, will allow you to quickly and easily cite references using @tags. It should be noted that it’s also possible to type all of your bibliographic references by hand, using our bibliography as a template.
LaTeX. Detailed, platform-specific installation instructions are available at the Pandoc website. Although LaTeX is not covered in this tutorial, it is used by Pandoc for .pdf creation. Advanced users will often convert into LaTeX directly to have more granular control over the typesetting of the .pdf. Beginners may want to consider skipping this step. Otherwise, type latex -v to see if LaTeX was installed correctly (you will get an error if it was not and some information on the version if it was).
Markdown is a convention for structuring your plain-text documents semantically. The idea is to identify logical structures in your document (a title, sections, subsections, footnotes, etc.), mark them with some unobtrusive characters, and then “compile” the resulting text with a typesetting interpreter which will format the document consistently, according to a specified style.
Markdown conventions come in several “flavors” designed for use in particular contexts, such as blogs, wikis, or code repositories. The flavor of Markdown used by Pandoc is geared for academic use. Its conventions are described on Pandoc’s Markdown page and include the “YAML” block, which contains some useful metadata.[5]
Let’s now create a simple document in Markdown. Open a plain-text editor of your choice and begin typing. It should look like this:
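A minimal sketch of such a title block follows. The field values are placeholders (assumptions for illustration), and the file name main.md anticipates the name suggested later in this tutorial. It is written as a shell command so that it can be pasted into a terminal, but typing the lines between the two EOF markers into your editor produces the same file.

```shell
# Write a YAML title block to main.md.
# All values below are placeholders; replace them with your own.
cat > main.md <<'EOF'
---
title: Plain Text Workflow
author: Your Name
date: 20 January 2014
---
EOF

# Print the file to check the result.
cat main.md
```

The three dashes open and close the block; everything between them is metadata rather than body text.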
Pandoc-flavored Markdown stores each of the above values, and “prints” them in the appropriate location of your outputted document once you are ready to typeset. We will later learn to add other, more powerful fields to the YAML block. For now, let’s pretend we are writing a paper that contains three sections, each subdivided into two subsections. Leave a blank line after the last three dashes in the YAML block and paste the following:
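As a sketch of that skeleton, assuming the file is called main.md: one # marks a section and two ## mark a subsection. The heading titles are placeholders. The command appends to the file, so the title block stays in place.

```shell
# Append a blank line and a three-section skeleton to main.md.
# Heading titles are placeholders.
cat >> main.md <<'EOF'

# Section 1

## Subsection 1.1

## Subsection 1.2

# Section 2

## Subsection 2.1

## Subsection 2.2

# Section 3

## Subsection 3.1

## Subsection 3.2
EOF
```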
Go ahead and enter some dummy text as well. Empty space is meaningful in Markdown: do not indent your paragraphs. Instead, separate paragraphs by using a blank line. Blank lines must also precede section headers.
You can use asterisks to add bold or italicized emphasis to your words, like this: *italics* and **bold**. We should also add a link and a footnote to our text to cover the basic components of an average paper. Type:
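A small sketch of a footnote and a link in Pandoc-flavored Markdown, again appended to main.md. The sentence text is a placeholder; the address is the one used in the next paragraph.

```shell
# Append a sentence with a footnote marker, plus the footnote itself,
# which contains an inline link. The wording is a placeholder.
cat >> main.md <<'EOF'

A sentence that needs a note.[^1]

[^1]: My first footnote! And a [link](https://www.eff.org/).
EOF
```

The marker [^1] in the sentence and the matching [^1]: line can sit anywhere in the file; Pandoc pairs them up by label.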
When the text of the link and the address are the same it is faster to write <www.eff.org> instead of [www.eff.org](www.eff.org).
Let’s save our file before advancing any further. Create a new folder that will house this project. You are likely to have some system of organizing your documents, projects, illustrations, and bibliographies. But often, your document, its illustrations, and bibliography live in different folders, which makes them hard to track. Our goal is to create a single folder for each project, with all relevant materials included. The general rule of thumb is one project, one paper, one folder. Name your file something like main.md, where “md” stands for markdown.
Once your file is saved, let’s add an illustration. Copy an image (any small image) to your folder, and add the following somewhere in the body of the text:
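A sketch of the image syntax, assuming a hypothetical image file named your_image.jpg in the project folder. The text in square brackets becomes the caption.

```shell
# Append an image reference to main.md.
# your_image.jpg is a hypothetical file name; use the name of your own image.
cat >> main.md <<'EOF'

![image caption](your_image.jpg)
EOF
```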
At this point, your main.md should look something like the following. You can download this sample .md file here.
As we shall do shortly, this plain text file can be rendered as a very nice PDF.
If you’d like to get an idea of how this kind of markup will be interpreted as HTML formatting, try this online sandbox and play around with various kinds of syntax. Remember that certain elements of Pandoc-flavored Markdown (such as the title block and footnotes) will not work in this web form, which only accepts the basics.
At this point, you should spend some time exploring some of the other features of Markdown like quotations (referenced by the > symbol), bullet lists which start with * or -, verbatim line breaks which start with | (useful for poetry), tables, and a few of the other functions listed on Pandoc’s markdown page.
Pay particular attention to empty space and the flow of paragraphs. The documentation puts it succinctly when it defines a paragraph to be “one or more lines of text followed by one or more blank lines.” Note that “newlines are treated as spaces” and that “if you need a hard line break, put two or more spaces at the end of a line.” The best way to understand what that means is to experiment freely. Use your editor’s preview mode or just run Pandoc to see the results of your experiments.
Above all, avoid the urge to format. Remember that you are identifying semantic units: sections, subsections, emphasis, footnotes, and figures. Even *italics* and **bold** in Markdown are not really formatting marks, but indicate different levels of emphasis. The formatting will happen later, once you know the venue and the requirements of publication.
There are programs that allow you to watch a live preview of Markdown output as you edit your plain text file, which we detail below in the Useful Resources section. Few of them support footnotes, figures, and bibliographies, however. To take full advantage of Pandoc, we recommend that you stick with simple, plain text files stored locally, on your computer.
Getting in touch with your inner terminal
Before we can start publishing our main.md file into other formats, we need to get oriented with working on the command line using your computer’s terminal program, which is the only (and best) way to use Pandoc.
The command line is a friendly place, once you get used to it. If you are already familiar with using the command line, feel free to skip this section. For others, it is important to understand that being able to use your terminal program directly will allow you to use a broad range of powerful research tools that you couldn’t use otherwise, and can serve as a basis for more advanced work. For the purposes of this tutorial, you need to learn only a few, very simple commands.
First, open a command line window. If you are using macOS, open the Terminal application in the ‘Applications/Utilities’ directory. On Windows, we recommend you use PowerShell or, for a more robust solution, install the Windows Subsystem for Linux and use the terminal that comes with your favorite Linux distribution. For an excellent introduction to the command line, consult “Introduction to the Bash Command Line” by Ian Milligan and James Baker.
In the terminal, you should see a text window and a prompt that looks something like this: computer-name:~username$. The tilde indicates your “home” directory, and in fact you can type $ cd ~ at any point to return to your home directory. Don’t type the dollar sign; it just symbolizes the command prompt of your terminal, prompting you to type something into your terminal (as opposed to typing it into your document). Remember to hit enter after every command.
It is very likely that your “Documents” folder is located here. Type $ pwd (= print working directory) and press enter to display the name of the current directory. Use $ pwd whenever you feel lost.
The command $ ls (= list) simply lists the files in the current directory. Finally, you can use $ cd (= change directory) like $ cd DIRECTORY_NAME (where DIRECTORY_NAME is the name of the directory you’d like to navigate to). You can use $ cd .. to automatically move up one level in the directory structure (the parent directory of the directory you are currently in). Once you start typing the directory name, use the Tab key to auto complete the text, which is particularly useful for long directory names, or directory names that contain spaces.[6]
These three terminal commands: pwd, ls, and cd are all you need for this tutorial. Practice them for a few minutes to navigate your documents folder and think about the way you have organized your files. If you’d like, follow along with your regular graphical file manager to keep your bearings.
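A short practice session might look like the sketch below. The folder name plain-text-practice is hypothetical, and mkdir (make directory) is an extra command not covered above, used here only so there is somewhere to navigate to.

```shell
# Show where we are.
pwd

# Create a practice folder (hypothetical name) and step into it.
mkdir -p plain-text-practice
cd plain-text-practice
pwd

# List its contents; a new folder is empty, so nothing is printed.
ls

# Move back up to the parent directory and confirm.
cd ..
pwd
```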
Using Pandoc to convert Markdown to an MS Word document
We are now ready to typeset! Open your terminal window, use $ pwd and $ cd DIRECTORY-NAME to navigate to the correct folder for your project. Once you are there, type $ ls in the terminal to list the files. If you see your .md file and your images, you are in the right place. To convert .md into .docx type:
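A minimal sketch of that conversion, assuming the file names main.md and main.docx used elsewhere in this tutorial. The first line creates a tiny stand-in main.md only if you do not have one yet, and the check around the pandoc call is there so the snippet fails gently when Pandoc is missing. The essential line is pandoc main.md -o main.docx.

```shell
# Create a tiny stand-in main.md if none exists yet (placeholder content).
[ -f main.md ] || printf '# Sample\n\nSome text.\n' > main.md

# Convert Markdown to Word; Pandoc infers both formats from the extensions.
if command -v pandoc >/dev/null 2>&1; then
  pandoc main.md -o main.docx
else
  echo "pandoc not found; see the installation notes above" >&2
fi
```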
Open the file with MS Word to check your results. Alternatively, if youuse Open- or LibreOffice you can run:
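For the OpenDocument case, a sketch under the same assumptions; the output name main.odt is our choice, and only the file extension matters to Pandoc.

```shell
# Create a tiny stand-in main.md if none exists yet (placeholder content).
[ -f main.md ] || printf '# Sample\n\nSome text.\n' > main.md

# Convert Markdown to OpenDocument Text for Open- or LibreOffice.
if command -v pandoc >/dev/null 2>&1; then
  pandoc main.md -o main.odt
else
  echo "pandoc not found; see the installation notes above" >&2
fi
```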
If you are new to the command line, imagine reading the above command as saying something like: “Pandoc, create an MS Word file out of my Markdown file.” The -o part is a “flag,” which in this case says something like “instead of me explicitly telling you the source and the target file formats, just guess by looking at the file extension” or simply “output.” Many options are available through such flags in Pandoc. You can see the complete list on Pandoc’s website or by typing $ man pandoc in the terminal.
Try running the command
Now navigate back to your project directory. Can you tell what happened?
More advanced users who have LaTeX installed may want to experiment by converting Markdown into .tex or specially formatted .pdf files. Once LaTeX is installed, a beautifully formatted PDF file can be created using the same command structure:
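A sketch of the PDF conversion, assuming main.md as input and a LaTeX installation that provides pdflatex (the engine Pandoc uses by default). The essential line is pandoc main.md -o main.pdf.

```shell
# Create a tiny stand-in main.md if none exists yet (placeholder content).
[ -f main.md ] || printf '# Sample\n\nSome text.\n' > main.md

# Convert Markdown to PDF; this needs both Pandoc and LaTeX.
if command -v pandoc >/dev/null 2>&1 && command -v pdflatex >/dev/null 2>&1; then
  pandoc main.md -o main.pdf
else
  echo "PDF output needs both pandoc and a LaTeX installation" >&2
fi
```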
If your document is written in languages other than English, you will likely need to use the XeLaTeX engine instead of plain LaTeX for .pdf conversion:
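A sketch of the XeLaTeX variant. Note the assumption about versions: Pandoc 2.0 and later call this option --pdf-engine, while the 1.x releases this tutorial was written against called it --latex-engine, so use whichever your installed version accepts.

```shell
# Create a tiny stand-in main.md with accented text if none exists yet.
[ -f main.md ] || printf '# Exemple\n\nUn peu de texte accentué.\n' > main.md

# Convert to PDF through XeLaTeX (Pandoc 2.0+ syntax).
# On Pandoc 1.x, replace --pdf-engine with --latex-engine.
if command -v pandoc >/dev/null 2>&1 && command -v xelatex >/dev/null 2>&1; then
  pandoc main.md --pdf-engine=xelatex -o main.pdf
else
  echo "This step needs both pandoc and xelatex" >&2
fi
```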
Make sure your text editor supports the UTF-8 encoding. When using XeLaTeX for conversion into .pdf, instead of the fontfamily attribute in YAML to change fonts, specify the mainfont attribute, to produce something like the following:
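A sketch of such a header, written to a hypothetical scratch file called header-example.yaml so that it does not overwrite your work; in practice you would type these lines at the top of main.md. The values are placeholders, and the font named in mainfont must be one that is installed on your system.

```shell
# Write a sample title block that sets the main font for XeLaTeX.
# header-example.yaml is a hypothetical scratch file; values are placeholders.
cat > header-example.yaml <<'EOF'
---
title: Plain Text Workflow
author: Your Name
date: 20 January 2014
mainfont: times
---
EOF

cat header-example.yaml
```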
Many of these YAML options can also be set through Pandoc’s command line argument (flag) functionality. For example, font styles could be passed to Pandoc in the form of pandoc main.md --variable mainfont=times -o target.pdf. However, we prefer to use the YAML header options whenever possible, since it makes our command line incantations easier to type and to remember. Using a version control tool such as Git will preserve your YAML changes, where what you type in the terminal is rather more ephemeral. Consult the Templates section in the Pandoc manual (man pandoc) for the list of available YAML variables.
Working with Bibliographies
In this section, we will add a bibliography to our document and thenconvert from Chicago to MLA formats.
If you are not using a reference manager like Endnote or Zotero, you should. We prefer Zotero, because, like Pandoc, it was created by the academic community and like other open-source projects it is released under the GNU General Public License. Most importantly for us, your reference manager must have the ability to generate bibliographies in plain text format, to keep in line with our “everything in plain text” principle. Go ahead and open a reference manager of your choice and add some sample entries. When you are ready, find the option to export your bibliography in BibTeX (.bib) format. Save your .bib file in your project directory, and give it a reasonable title like “project.bib”.
The general idea is to keep your sources organized under one centralized bibliographic database, while generating specific and much smaller .bib files that will live in the same directory as your project. Go ahead and open your .bib file with the plain-text editor of your choice.[7]
Your .bib file should contain multiple entries that look something like this:
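A sketch of one such entry, written to project.bib from the shell. Only the entry type (article) and the unique ID (fyfe_digital_2011) come from the text of this tutorial; the field values are our best reconstruction and should be treated as illustrative. An export from your own reference manager will contain your own sources.

```shell
# Write a single sample BibTeX entry to project.bib.
# Field values are illustrative; your exported file will differ.
cat > project.bib <<'EOF'
@article{fyfe_digital_2011,
    title = {Digital Pedagogy Unplugged},
    author = {Fyfe, Paul},
    journal = {Digital Humanities Quarterly},
    volume = {5},
    number = {3},
    year = {2011}
}
EOF

cat project.bib
```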
You will rarely have to edit these by hand (although you can). In most cases, you will simply “export” the .bib file from Zotero or from a similar reference manager. Take a moment to orient yourself here. Each entry consists of a document type, “article” in our case, a unique identifier (fyfe_digital_2011), and the relevant meta-data on title, volume, author, and so on. The thing we care most about is the unique ID which immediately follows the curly bracket in the first line of each entry. The unique ID is what allows us to connect the bibliography with the main document. Leave this file open for now and go back to your main.md file.
Edit the footnote in the first line of your main.md file to look something like the following examples, where @name_title_date can be replaced with one of the unique IDs from your project.bib file.
A reference formatted like this will render properly as inline- or footnote- style citation [@name_title_date, 67].[8]
'For citations within quotes, put the comma outside the quotation mark' [@name_title_2011, 67].
Once we run the markdown through Pandoc, “@fyfe_digital_2011” will be expanded to a full citation in the style of your choice. You can use the @citation syntax in any way you see fit: in-line with your text or in the footnotes. To generate a bibliography simply include a section called # Bibliography at the end of the document.
Now, go back to your metadata header at the top of your .md document, and specify the bibliography file to be used, like so:
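A sketch of the updated header, written to a hypothetical scratch file (header-example.yaml) rather than to main.md itself; in your own document you would simply add the bibliography line by hand. The other values are placeholders.

```shell
# Sample title block that points Pandoc at the bibliography file.
# header-example.yaml is a hypothetical scratch file; values are placeholders.
cat > header-example.yaml <<'EOF'
---
title: Plain Text Workflow
author: Your Name
date: 20 January 2014
bibliography: project.bib
---
EOF

cat header-example.yaml
```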
This tells Pandoc to look for your bibliography in the project.bib file, under the same directory as your main.md. Let’s see if this works. Save your file, switch to the terminal window and run:
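A sketch of that command, assuming main.md and project.bib sit in the current directory. For the Pandoc versions this tutorial targets, citations are handled by the separate pandoc-citeproc filter; from Pandoc 2.11 onward the same job is done by the built-in --citeproc option, so the snippet picks whichever is available.

```shell
# Run only when Pandoc and both input files are present.
if command -v pandoc >/dev/null 2>&1 && [ -f main.md ] && [ -f project.bib ]; then
  if command -v pandoc-citeproc >/dev/null 2>&1; then
    # Older Pandoc releases: citation processing is a separate filter.
    pandoc main.md --filter pandoc-citeproc -o main.docx
  else
    # Pandoc 2.11 and later: citation processing is built in.
    pandoc main.md --citeproc -o main.docx
  fi
else
  echo "Need pandoc, main.md, and project.bib in the current directory." >&2
fi
```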
The “pandoc-citeproc” filter will parse any citation tags found in your document. The result should be a decently formatted MS Word file. If you have LaTeX installed, convert into .pdf using the same syntax for prettier results. Do not worry if things are not exactly the way you like them. Remember, you are going to fine-tune the formatting all at once and at a later time, as close as possible to the time of publication. For now we are just creating drafts based on reasonable defaults.
Changing citation styles
The default citation style in Pandoc is Chicago Author-date. We can specify a different style by using a stylesheet, written in the “Citation Style Language” (yet another plain-text convention, in this case for describing citation styles) and denoted by the .csl file extension. Luckily, the CSL project maintains a repository of common citation styles, some even tailored for specific journals. Visit http://editor.citationstyles.org/about/ to find the .csl file for Modern Language Association, download modern-language-association.csl, and save to your project directory as mla.csl. Now we need to tell Pandoc to use the MLA stylesheet instead of the default Chicago. We do this by updating the YAML header:
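A sketch of the header with the stylesheet added, again written to a hypothetical scratch file (header-example.yaml); in your own main.md you would add the csl line below the bibliography line. The other values are placeholders.

```shell
# Sample title block that selects the MLA citation stylesheet.
# header-example.yaml is a hypothetical scratch file; values are placeholders.
cat > header-example.yaml <<'EOF'
---
title: Plain Text Workflow
author: Your Name
date: 20 January 2014
bibliography: project.bib
csl: mla.csl
---
EOF

cat header-example.yaml
```

Switching back to Chicago later is a matter of removing the csl line or pointing it at a different .csl file.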
You then repeat the pandoc incantation from the previous section to cast your Markdown file into your target format (.pdf or .docx).
Parse the command into English as you are typing. In my head, I translate the above into something like: “Pandoc, take my Markdown file, run it through a citation filter, and output a Word (or PDF) file.” As you get more familiar with citation stylesheets, consider adding your custom-tailored .csl files for journals in your field to the archive as a service to the community.
You should now be able to write papers in Markdown, to create drafts in multiple formats, to add bibliographies, and to easily change citation styles. A final look at the project directory will reveal a number of “source” files: your main.md file, your project.bib file, your mla.csl file, and some images. Besides the source files you should see some “target” files that we created during the tutorial: main.docx or main.pdf.
Treat your source files as an authoritative version of your text, and your target files as disposable “print outs” that you can easily generate with Pandoc on the fly. All revisions should go into main.md. The main.docx file is there for final-stage clean up and formatting. For example, if the journal requires double-spaced manuscripts, you can quickly double-space in Open Office or Microsoft Word. But don’t spend too much time formatting. Remember, it all gets stripped out when your manuscript goes to print. The time spent on needless formatting can be put to better use in polishing the prose of your draft.
Should you run into trouble, there is no better place to start looking for support than John MacFarlane’s Pandoc site and the affiliated mailing list. At least two “Question and Answer” type sites can field questions on Pandoc: Stack Overflow and Digital Humanities Q&A. Questions may also be asked live, on Freenode IRC, #Pandoc channel, frequented by a friendly group of regulars. As you learn more about Pandoc, you can also explore one of its most powerful features: filters.
Although we suggest starting out with a simple editor, many (70+, according to this blog post) other, Markdown-specific alternatives to MS Word are available online, and often free of cost. From the standalone ones, we liked Mou, Write Monkey, and Sublime Text. Several web-based platforms have recently emerged that provide slick, graphic interfaces for collaborative writing and version tracking using Markdown. These include: prose.io, Authorea, Penflip, Draft, and StackEdit.
But the ecosystem is not limited to editors. Gitit and Ikiwiki support authoring in Markdown with Pandoc as parser. To this list we may add a range of tools that generate fast, static webpages: Yst, Jekyll, Hakyll, and a bash shell script by the historian Caleb McDaniel.
Finally, whole publishing platforms are forming around the use of Markdown. The Markdown to marketplace platform Leanpub could be an interesting alternative to the traditional publishing model. And we ourselves are experimenting with academic journal design based on GitHub and readthedocs.org (tools usually used for technical documentation).
1. Don’t worry if you don’t understand some of this terminology yet!
2. The source files for this document can be downloaded from GitHub. Use the “raw” option when viewing in GitHub to see the source Markdown. The authors would like to thank Alex Gil and his colleagues from Columbia’s Digital Humanities Center, and the participants of openLab at the Studio in the Butler library for testing the code in this tutorial on a variety of platforms.
3. See Charlie Stross’s excellent discussion of this topic in Why Microsoft Word Must Die.
4. There are no good solutions for directly arriving at MS Word from LaTeX.
5. Note that YAML often replicates some, although not all, of the command line argument (flag) functionality, as discussed alongside the mainfont attribute above.
6. It is a good idea to get into the habit of not using spaces in folder or file names. Dashes or underscores instead of spaces in your filenames ensure lasting cross-platform compatibility.
7. Note that the .bib extension may be “registered” to Zotero in your operating system. That means when you click on a .bib file it is likely that Zotero will be called to open it, whereas we want to open it within a text editor. Eventually, you may want to associate the .bib extension with your text editor.
8. Thanks to @njbart for the correction. In response to our original suggestion, Some sentence that needs citation.^[@fyfe_digital_2011 argues that too.] he writes: “This is not recommended since it keeps you from switching easily between footnote and author-date styles. Better use the [corrected] (no circumflex, no final period inside the square braces, and the final punctuation of the text sentence after the square braces; with footnote styles, pandoc automatically adjusts the position of the final punctuation).”