PDF Slurping

I’ve been using Advantage Archives to browse the newspaper archives of several libraries as part of my genealogy research.

The trouble is that each library has a slightly different UI for browsing these newspapers, and the experience can be fairly cumbersome. Ultimately, you can download a PDF if you want, but all the clicking around still makes the process slow and frustrating.

Of course, the built-in macOS Preview tool can show PDFs too, and its navigation and zoom interface is easier to use (especially with pinch-to-zoom, etc.).

TextExpander Citations

In my research I’ve been finding a lot of newspaper articles and transcribing them using automation. But when I’m documenting those articles and linking them to people, I also need to create a source citation. I try to use Evidence Explained-style citations to the best of my ability, but there’s a lot of repeated boilerplate text when writing these up – and there is formatting! Some parts are italicized, so that gets tedious, too.
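To make the boilerplate problem concrete, here is a minimal sketch of a citation template in Python. It is only an illustration: the field names, the `format_citation` helper, and the exact punctuation are my own assumptions, loosely modeled on a newspaper citation rather than taken from Evidence Explained, and the asterisks stand in for italics (Markdown-style).

```python
from dataclasses import dataclass


@dataclass
class NewspaperArticle:
    """Hypothetical record of the details a newspaper citation needs."""
    title: str
    newspaper: str
    place: str
    date: str
    page: str


def format_citation(article: NewspaperArticle) -> str:
    """Fill the repeated boilerplate around the article-specific details.

    Asterisks mark the italicized newspaper name (an assumed convention).
    """
    return (
        f'"{article.title}," *{article.newspaper}* '
        f"({article.place}), {article.date}, p. {article.page}."
    )


if __name__ == "__main__":
    example = NewspaperArticle(
        title="Local News",
        newspaper="The Example Gazette",  # made-up newspaper name
        place="Exampleville, Iowa",       # made-up place
        date="4 July 1901",
        page="3",
    )
    print(format_citation(example))
```

Only the five fields change from one article to the next; the quotes, italics markers, parentheses, and punctuation are typed once in the template.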

Automating Transcription with ChatGPT

I recently discovered a trove of online newspaper archives for a region where a branch of my family research is focused. Advantage Archives partners with mostly small-town libraries, especially in the midwestern USA, to digitize old newspaper archives and put them online.

The discovery of all these newspaper archives has led me to want to transcribe and write source citations for hundreds of articles. You know what’s tedious? Transcribing and citing hundreds of articles! This is a job for automation!

Hello World

Welcome to Pedigree Pipeline, a place for me to share tips and techniques for automating genealogy and family history research — along with other insights or discoveries from my research itself. Much of my work is computer-centric these days, and I often find myself repeating the same tasks. As a software engineer, I’m always thinking about how to automate these workflows — primarily on a Mac, using the Unix command line and Mac-centric automation tools.