Introducing Photocite

2 May 2025 ImageMagick, images, citations

Recently I’ve been processing a number of family history photos and scans of old family artifacts such as letters. For historic images that I may share or distribute as part of my research, I want a good citation, and I want it embedded in the image itself so that the sourcing information is less likely to be lost as the image gets passed around. I started off using Pixelmator Pro to painstakingly add text citations to the images, but this became arduous and the results were somewhat inconsistent, and tweaking a citation afterwards was fiddly.

I thought I could probably figure out a way to do this from the command line. It took more effort than I expected to find an efficient approach that behaved the way I wanted, so I thought I would share the tool and explain how it works.

Photocite is a Python script that chains together a few different tools to do this:

ImageMagick is a command-line Swiss Army knife for images; it’s a great tool and very fast. It even has some built-in captioning capability. I spent a bunch of time trying to get this to produce the type of captions I wanted, but I didn’t have any luck. I particularly had problems with mixing regular and italic text and with positioning the captions how I wanted.
LaTeX is a document preparation and typesetting system. With this I found I could get the consistency and formatting I wanted in the citations.
Pandoc is a universal document converter. I use this for converting Markdown to LaTeX.
PdfCrop (the pdfcrop command) trims the margins of a PDF so that only the content remains. It comes with the LaTeX/TeX installation.

The basic flow is as follows:

Get the dimensions and DPI of the image
If the image is a JPEG, get the quality of the JPEG
Read in Markdown and use Pandoc and LaTeX to create a high-resolution PDF of the citation
Crop the PDF down to just encompass the text and some padding
Convert the PDF to a PNG and resize it to be smaller than the original image
Use ImageMagick to append the PNG to the original image.
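
The steps above could be sketched roughly as follows. This is not Photocite's actual source, just a minimal illustration. It assumes ImageMagick 6-style `identify` and `convert` commands (with ImageMagick 7 these would be `magick identify` and `magick`), plus Pandoc with a LaTeX engine and pdfcrop on the PATH. The helper names, the 90% citation width, the 600 DPI render density, and the 10-point crop margin are all made up for the example.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def parse_identify(output):
    """Parse the output of: identify -format "%w,%h,%x,%Q" IMAGE"""
    width, height, xres, quality = output.strip().split(",")
    return {
        "width": int(width),
        "height": int(height),
        # Some ImageMagick versions append units, e.g. "300 PixelsPerInch".
        "dpi": float(xres.split()[0]),
        "quality": int(quality),
    }


def citation_width(image_width, percent=90):
    """Assumed rule: the citation is 90% as wide as the photo."""
    return max(1, image_width * percent // 100)


def pandoc_cmd(markdown, pdf):
    # pagestyle=empty suppresses the page number so pdfcrop sees only the text.
    return ["pandoc", str(markdown), "-o", str(pdf), "-V", "pagestyle=empty"]


def pdfcrop_cmd(pdf, cropped, margin=10):
    # --margins adds padding (in PDF points) around the cropped text.
    return ["pdfcrop", "--margins", str(margin), str(pdf), str(cropped)]


def rasterize_cmd(cropped, png, width, density=600):
    # Render at high density, flatten onto white, then scale to the target width.
    return ["convert", "-density", str(density), str(cropped),
            "-background", "white", "-alpha", "remove", "-alpha", "off",
            "-resize", f"{width}x", str(png)]


def append_cmd(image, png, out, quality=None):
    # -append stacks the images vertically; gravity centers the narrower one.
    cmd = ["convert", str(image), str(png),
           "-background", "white", "-gravity", "center", "-append"]
    if quality:
        cmd += ["-quality", str(quality)]
    return cmd + [str(out)]


def main(image_path, citation_md):
    image = Path(image_path)
    out = image.with_name(f"{image.stem} with citation{image.suffix}")
    result = subprocess.run(
        ["identify", "-format", "%w,%h,%x,%Q", str(image)],
        check=True, capture_output=True, text=True)
    info = parse_identify(result.stdout)  # info["dpi"] is not used further here
    is_jpeg = image.suffix.lower() in (".jpg", ".jpeg")
    with tempfile.TemporaryDirectory() as tmp:
        pdf = Path(tmp) / "citation.pdf"
        cropped = Path(tmp) / "citation-crop.pdf"
        png = Path(tmp) / "citation.png"
        steps = [
            pandoc_cmd(citation_md, pdf),
            pdfcrop_cmd(pdf, cropped),
            rasterize_cmd(cropped, png, citation_width(info["width"])),
            append_cmd(image, png, out, info["quality"] if is_jpeg else None),
        ]
        for cmd in steps:
            subprocess.run(cmd, check=True)
    print(f"Created '{out.name}'")


if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv[1], sys.argv[2])
```

Building each command as a list and passing it to `subprocess.run` without a shell means that file names containing spaces need no extra quoting.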

I mostly used Claude.ai to generate the Python code, but did some hand-tweaking as well.

Here’s an example:

Assuming I have an image and a Markdown file that contains the text of my citation:

$ cat "Charles and Rhoda and possibly Hubert Crane.md"
Photograph depicting Charles Irvin Crane, Rhoda Ellen (Jenkins) Crane, and possibly Hubert Crane, ca. late 1895. Original print, approx. 6 × 4.5 in.; privately held by Todd Wells, Seattle, Washington, 2025. Inscription in the cursive handwriting of Agnes Crane Wells on back reads: “Charles & Rhoda Crane (& Hubert??)”.

Then I can execute:

$ photocite crane.jpg -c "Charles and Rhoda and possibly Hubert Crane.md"
Created 'crane with citation.jpg' using citation text from file: Charles and Rhoda and possibly Hubert Crane.md
$

And I get a new image file with a citation embedded: