Introducing Photocite
Recently I’ve been processing a number of family history photos and scans of old family artifacts such as letters. For historic images that I may share or distribute as part of my research, I want a good citation, and I want it embedded in the image itself so that the sourcing information is less likely to be lost as the image gets passed around. I started off using Pixelmator Pro to painstakingly add text citations to the images, but after a while this proved arduous and the results were somewhat inconsistent, and tweaking a citation afterwards was fiddly.
I thought I could probably figure out a way to do this from the command line. Finding an efficient approach that behaved the way I wanted took more effort than I expected, so I thought I would share the tool and explain how it works.
Photocite is a Python script that chains together a few different tools to do this:
- ImageMagick is a command-line Swiss Army knife for images. It’s a great tool and very fast, and it even has some built-in captioning capability. I spent a bunch of time trying to get it to produce the type of captions I wanted, without any luck. I particularly had problems mixing regular and italic text and positioning the captions how I wanted.
- LaTeX is a document preparation and typesetting system. With this I found I could get the consistency and formatting I wanted in the citations.
- Pandoc is a universal document converter. I use this for converting Markdown to LaTeX.
- PdfCrop comes with the LaTeX/TeX installation. I use it to crop the citation PDF down to the text plus some padding.
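Since the script depends on several external programs, it can help to check that they are installed before doing any work. The following is a minimal sketch of such a check, not Photocite's actual code. The executable names are assumptions: `magick` is the ImageMagick 7 entry point, and `pdflatex` is only one of several LaTeX engines Pandoc can use.

```python
import shutil

# Assumed executable names; adjust for your installation
# (for example, ImageMagick 6 ships `convert` instead of `magick`).
REQUIRED_TOOLS = ["magick", "pandoc", "pdflatex", "pdfcrop"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]
```

Calling `missing_tools()` returns an empty list when everything is available. The `which` parameter exists only so the lookup can be swapped out for testing.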
The basic flow is as follows:
- Get the dimensions and DPI of the image
- If the image is a JPEG, get the quality of the JPEG
- Read in the Markdown and use Pandoc and LaTeX to create a high-resolution PDF of the citation
- Crop the PDF down to just the text plus some padding
- Convert the PDF to a PNG and resize it to be smaller than the original image
- Use ImageMagick to append the PNG to the original image
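The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not Photocite's actual code. The function names, intermediate file names, the 10 pt margin, the 600 DPI rasterization, and scaling the citation to the full image width are all hypothetical choices. The sketch also lets Pandoc drive LaTeX directly to produce the PDF, and it records the DPI without using it further.

```python
import subprocess
from pathlib import Path

JPEG_SUFFIXES = {".jpg", ".jpeg"}


def identify_command(image: Path) -> list[str]:
    # %w/%h = pixel dimensions, %x = horizontal resolution, %Q = quality
    return ["magick", "identify", "-format", "%w|%h|%x|%Q", str(image)]


def parse_identify(output: str) -> dict:
    """Parse the 'width|height|dpi|quality' string requested above."""
    width, height, dpi, quality = output.strip().split("|")
    return {
        "width": int(width),
        "height": int(height),
        # Some ImageMagick versions print units after the number.
        "dpi": float(dpi.split()[0]),
        "quality": int(quality),
    }


def build_commands(image: Path, markdown: Path, info: dict,
                   workdir: Path) -> list[list[str]]:
    """Build the external commands for the citation pipeline, in order."""
    pdf = workdir / "citation.pdf"
    cropped = workdir / "citation-crop.pdf"
    png = workdir / "citation.png"
    out = image.with_name(f"{image.stem} with citation{image.suffix}")

    append = ["magick", str(image), str(png), "-append"]
    if image.suffix.lower() in JPEG_SUFFIXES:
        # Re-encode at the quality detected in the source JPEG.
        append += ["-quality", str(info["quality"])]
    append.append(str(out))

    return [
        # Markdown -> PDF (Pandoc calls LaTeX behind the scenes).
        ["pandoc", str(markdown), "-o", str(pdf)],
        # Trim the page down to the text plus a small margin.
        ["pdfcrop", "--margins", "10", str(pdf), str(cropped)],
        # Rasterize at high density, flatten onto white, fit to image width.
        ["magick", "-density", "600", str(cropped),
         "-background", "white", "-alpha", "remove",
         "-resize", f"{info['width']}x", str(png)],
        # Stack the citation underneath the original image.
        append,
    ]


def run_pipeline(image: Path, markdown: Path, workdir: Path) -> None:
    """Run the whole flow; requires the external tools to be installed."""
    result = subprocess.run(identify_command(image), check=True,
                            capture_output=True, text=True)
    info = parse_identify(result.stdout)
    for command in build_commands(image, markdown, info, workdir):
        subprocess.run(command, check=True)
```

Keeping the command construction separate from the code that runs it makes it easy to print and inspect the commands before anything touches an image.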
I mostly used Claude.ai to generate the Python code, but did some hand-tweaking as well.
Here’s an example:
Assuming I have an image and a Markdown file that contains the text of my citation:

$ cat "Charles and Rhoda and possibly Hubert Crane.md"
Photograph depicting Charles Irvin Crane, Rhoda Ellen (Jenkins) Crane, and possibly Hubert Crane, ca. late 1895. Original print, approx. 6 × 4.5 in.; privately held by Todd Wells, Seattle, Washington, 2025. Inscription in the cursive handwriting of Agnes Crane Wells on back reads: “Charles & Rhoda Crane (& Hubert??)”.
Then I can execute:
$ photocite crane.jpg -c "Charles and Rhoda and possibly Hubert Crane.md"
Created 'crane with citation.jpg' using citation text from file: Charles and Rhoda and possibly Hubert Crane.md
$
And I get a new image file with a citation embedded:
