Reducing the hassle with BibTeX

BibTeX is great for generating bibliographies, particularly in combination with Inspire, but it also has its annoying aspects. A typical workflow for generating the references for a paper looks like this:

  1. Find the texkey of a paper on Inspire and \cite it in the manuscript
  2. Copy & paste the BibTeX entry into the .bib file
  3. Correct LaTeX code in the title (often missing the dollar signs or containing characters like “->”)
  4. After having completed the paper, check whether any of the preprints have been published in the meantime and add the journal reference.

In this list, step 1 is the only one requiring a brain, while steps 2-4 are increasingly annoying. This is why I have written a script that automates most of these steps, and I want to explain it in this post.

Spare me the details, tell me how to use it

Having completed step 1 above, you can compile your LaTeX document (let’s call it paper.tex) and a paper.aux file will be generated. This is the case even if you don’t have a bibliography file yet (and the compilation will thus fail). After installing my inspiretools script from GitHub, you can execute the following command:

auxtobib paper.aux > bibliography.bib

This command will download all the BibTeX entries from Inspire and save them to the .bib file. Step 2 has been automated! When you add citations to the paper, just rerun the command. It always fetches all the references anew, so if one of them has gained a journal reference in the meantime, your bibliography will be up to date. So step 4 is redundant as well!

What about step 3? Well, you could still do it manually, but all changes will be overwritten when you update the bibliography. The best way would be to fix the titles on Inspire itself! And you can help with that. The code contains a second script that you can invoke as

auxtoxml paper.aux > titles.xml

This will generate an XML file containing all the titles of the references in your bibliography. Correct all the LaTeX errors there and then send the XML file to feedback@inspirehep.net. The file is in the right format for the Inspire staff to quickly update the information in their database. This way, the change will not only persist when you update your references, but you will also have saved your colleagues some time!

How it works

The code builds on the pyinspire script by Ian Huston (with some modifications of my own), which uses the Inspire API to fetch entries. It is written in Python.

In case you are wondering why I am taking the detour via the .aux file rather than directly extracting the references from the .tex file: I have found this to be more robust since it works with many different citation commands like \cite, \nocite, \autocites, and even with custom macros without the need to use complicated regular expressions.
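To illustrate why the .aux route is simple: with BibTeX, every citation command ends up in the .aux file as a \citation{...} line, possibly holding several comma-separated keys. The following is a minimal sketch of how the keys could be collected, not the actual inspiretools code; the function name extract_texkeys is made up for this example, and it assumes the plain BibTeX format of the .aux file (biblatex writes different lines).

```python
import re

# LaTeX writes one \citation{key1,key2,...} line to the .aux file per citation command
CITATION_RE = re.compile(r"\\citation\{([^}]*)\}")


def extract_texkeys(aux_text):
    """Return the cited keys in order of first appearance, without duplicates.

    Hypothetical helper for illustration; assumes BibTeX-style .aux content.
    """
    seen = {}
    for match in CITATION_RE.finditer(aux_text):
        for key in match.group(1).split(","):
            key = key.strip()
            # \nocite{*} shows up as \citation{*}, which is not a real key
            if key and key != "*":
                seen.setdefault(key, None)
    return list(seen)
```

Since the regular expression only has to match one fixed command, it does not need to know anything about the citation macros used in the .tex file.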

Note that the current implementation is quite slow because it fetches each entry separately, which can take a while, especially for papers with many references. In principle this could be sped up by fetching several entries simultaneously. If you want to improve on this, you are welcome to contribute to the repository.
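One possible way to fetch several entries simultaneously is a thread pool from the Python standard library. This is only a sketch under assumptions: fetch_all and its fetch_entry argument are hypothetical names, with fetch_entry standing in for whatever function performs the lookup of a single texkey on Inspire.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(texkeys, fetch_entry, max_workers=8):
    """Fetch the BibTeX entries for all keys concurrently.

    fetch_entry is a hypothetical callable (texkey -> BibTeX string);
    it is an assumption of this sketch, not part of inspiretools.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the order of texkeys, not completion order
        return list(pool.map(fetch_entry, texkeys))
```

Threads are a reasonable fit here because the time is spent waiting for the network rather than computing. The number of workers should stay modest so as not to hammer the Inspire servers.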

2 comments

  1. I was going to write my own tool to do steps 2 and 4, but fortunately looked around first to see if anyone else had already done it. I quite like the auxtoxml feature – hadn’t even considered automating step 3.
