Ah, bureaucracy at its finest. Have you ever asked a client, another department or an agency for a list and, instead of saving it in a format that might be considered even marginally useful, they give it to you as a PDF? A… P…D…F.
After some prostrations to the great Google Apps Script gods, I had a thought.
“Hey, can’t we convert a PDF to a Google Doc with just a click of the button? Surely the great Google Apps Script devs have made it so we can do it programmatically too.”
And you know what? They bloody well did. The big legends.
I’ve just received a bunch of PDFs. The PDFs are all labelled by class number. Take a look at the files in my Google Drive:
Each PDF file contains a list of student IDs that I need to extract and put into a Google Sheet.
The aim is to have a list of student IDs in column A and their corresponding sections in column B.
As you can see, we have some pretty standard text in the PDF that should be easy for Google to recognise so that we can extract the IDs.
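To make the goal concrete, here is a minimal sketch in plain JavaScript (the language Apps Script is built on) of how the IDs might be pulled out of the text once it has been converted, and paired with their section. The function name extractStudentRows, the made-up sample text and the assumption that a student ID is a run of 6 to 10 digits are all hypothetical, so adjust the pattern to suit your own files.

```javascript
// Hypothetical helper, not taken from the tutorial's script.
// ASSUMPTION: a student ID is a standalone run of 6 to 10 digits.
function extractStudentRows(text, section) {
  const idPattern = /\b\d{6,10}\b/g;

  // match() returns null when nothing is found, so fall back to an empty array.
  const ids = text.match(idPattern) || [];

  // Drop any duplicate IDs while keeping their original order.
  const uniqueIds = [...new Set(ids)];

  // One row per student: [ID for column A, section for column B].
  return uniqueIds.map((id) => [id, section]);
}

// Made-up sample text standing in for the contents of one converted PDF.
const sampleText = "Class 101\n1. 20230001 Jane Doe\n2. 20230002 John Smith";
const rows = extractStudentRows(sampleText, "101");
console.log(rows);
```

Each inner array is one row, which is the two-dimensional shape a Google Sheets range expects when you write values to it.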
The list of names in the demo sheets was randomly generated by AI!
NOTE! As always, I have tried to create this tutorial for varying levels. Feel free to follow along, or just grab what you need and get stuck into your own project.
If you are playing along, you can find a copy of the PDF files below. Simply add them to your own Drive before you get started:
Google Sheets: COUNTIF, ARRAYFORMULA, FILTER, REGEXMATCH, REGULAR EXPRESSIONS (RE2)
Okay, wait! Stop!…
…I know the byline contained regular expressions, but I promise I won’t just hand you a formula and leave you to decipher your own jumbled mass of gibberish characters, or inundate you with a list of commands and leave you to piece it all together.
I promise to give you some clear examples, with an explanation of each, so you can apply them to your own project.
Take a deep breath, mate, and let’s get cracking.
We are going to look at two related scenarios:
Imagine that you have a huge list of items. You have a hunch that some of the cells contain values of interest to you, and you want to build a new list containing only those items.
Imagine that you have a list of full names, and you want to use Google Sheets to create a new list of full names that only contain John.
You have that same huge list again, but this time you only want the total count of all the rows that contain certain values in each cell.
Imagining that list of full names again, you now want a total count of all the full names that contain John.
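Before we get to the formulas, here is a rough sketch of the logic behind both scenarios in plain JavaScript. It is only an analogy for what the Sheets functions do, not the formulas themselves, and the list of names is made up for illustration.

```javascript
// Made-up sample data for illustration only.
const fullNames = ["John Smith", "Mary Johnson", "Alice Brown", "Peter Johns", "Elton John"];

// An unanchored pattern: it matches "John" anywhere in the text and is case-sensitive.
const containsJohn = /John/;

// Scenario 1: build a new list holding only the names that contain John.
const matches = fullNames.filter((name) => containsJohn.test(name));

// Scenario 2: count how many names contain John.
const total = matches.length;

console.log(matches, total);
```

Notice that Mary Johnson and Peter Johns make the cut too, because the pattern matches John anywhere in the text. REGEXMATCH also matches on part of a cell, so keep this in mind when we start tightening up the criteria.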
We’ll first go through how to create these formulas and then provide a number of clear examples of common criteria for searching cells for values using REGEXMATCH and regular expressions.