Google Apps Script: Extract Specific Data From a PDF and insert it into a Google Sheet (Updated Feb 2024)

Google Apps Script: DriveApp, Advanced Drive Service, SpreadsheetApp,  DocumentApp Javascript: spread operator, map, regex

Ah, bureaucracy at its finest.  Have you ever asked for a list from a client, another department or agency and instead of saving into a file that might even be considered marginally useful, they give it to you as a pdf. A… p…d…f.

Gee, thanks.

After some prostrations to the great Google Apps Script gods, I had a though.

“Hey, can’t we convert a PDF to a Google Doc with just a click of the button? Surely the great Google Apps Script devs have made it so we can do it programmatically too.”

And you know what? They bloody well did. The big legends.

The Scenario

I’ve just received a bunch of PDFs. The PFDs are all labelled by the class number. Take a look at the files in my Google Drive:

PDFs of sections in Google Drive

Each PDF file contains a list of student IDs that I need to extract and put into a Google Sheet.

PDF file with student IDs

The aim is to have a list of student IDs in column A and their corresponding sections in column B.

As you can see, we have some pretty standard text in the PDF that should be easy for Google to recognise so that we can extract the IDs.

The list of names in the demo sheets were randomly generated by AI!

NOTE! As always, I have tried to create this tutorial for varying levels. Feel free to follow along, or just grab what you need and get stuck into your own project.

If you are playing along, you can find a copy of the PDF files below. Simply add them to your own Drive before you get started:

PDF files by Section.

Continue reading “Google Apps Script: Extract Specific Data From a PDF and insert it into a Google Sheet (Updated Feb 2024)”

Google Apps Script – How to Change the File Name of Non-Native Files like MS Word Docs, PDF’s and Excel files.

Google Apps Script, DriveApp

I had an unexpected need to change some non-native file names – in my case, MS Word docs – to something else to appease my masters. Being the resident Igor, I delighted in the task. 

The Problem

I had a folder of MS Word documents on my Google Drive that needed to all be renamed. They all needed the same name with an index number to distinguish that they were different files. 

There were however other file types in the folder. 

The problem with non-native files in Google Drive is that it’s a little tricky to find their ID. Besides, I just wanted to change ALL files that were MS Word in the folder. 

What I would need to do is search for all files inside my selected folder that are of MS Word type and rename them with the same name but with a counter at the end. For example:

Old File Name New File Name
boring.doc cranberrySauce.doc
sad.doc cranberrySauce1.doc
yawn.doc cranberrySauce2.doc

Continue reading “Google Apps Script – How to Change the File Name of Non-Native Files like MS Word Docs, PDF’s and Excel files.”

Google Apps Script: Get File By Name – With Optional Parent Folder Crosscheck

Google Apps Script:  DriveApp

One of my recent projects in Google Apps Script required me to search for a file by name and get its ID. This can be problematic in Google Drive because you can have multiple files of the same name in multiple locations. My solution was to also check the file’s parent folder name as well. 

I created a function getFileByName() to handle this. The function takes the following parameters:

  • fileName – The name of the file you are looking for. (required)
  • fileInFolder – The parent folder of the file you are searching for. (optional)

The file path would look a little something like this:

.../fileInFolder/fileName

getFileByName() returns an object containing :

  1.  the ID of the file if the file exists or false if it does not.
  2. the ERROR if there is an error or false if there is not. 

The returned object would look like the following:

Successful

Error

getFileByName() reports the following errors:

  1. If no parent folder name is given and there are more than one file with the same file name. 
  2. There is more than one parent folder and file combination with the same name.
  3. There are multiple files with the same name but none are in the folder parent name provided.
  4. No file with the file name provided exists.

Continue reading “Google Apps Script: Get File By Name – With Optional Parent Folder Crosscheck”

Google Apps Script – How to Find the Folder ID of a Non-Unique Folder Using File Path Names

Google Apps Script – DriveApp

sighisoara Google Apps Script Tutorial
This tutorial comes from a comfy spot in the centre of Sighisoara.

Well, that title is such a mouthful. So I think it needs some explaining…

I’ve recently come across a task that requires me to access a particular folder in a directory based on the information in a Google Sheet. This means:

  1. I don’t have the folder or the sub-folder ID.
  2. I can get the directory path information from information supplied in a Google Sheet.

Basically what I needed to do was create a Report maker and store it in the following directory path:

My Drive(root) >> Year >>  Quarter >> Unit+ "Report"

For example:

MyDrive >> 2018 >> Q4  >> Unit 4 Report

Boring!!!! Take me to the code!!!

The Problem

Unfortunately, I could not simply change the last folder name from say, Unit 4 Report to Q4 Unit 4 Report 2018 so it is easily searchable and unique. The other problem is that there are other Unit 4 Reports in other years and quarters so I did not want to accidentally call them instead of the exact one I wanted.

So, what to do..?

The Solution

Continue reading “Google Apps Script – How to Find the Folder ID of a Non-Unique Folder Using File Path Names”

Google Apps Script – How to Create Folders in Directories with DriveApp

Google Apps Script – DriveApp

Well, it’s a rainy day here travelling in Romania, so time for a post.

coding in Romania
Relaxing in a cafe in Cluj Napoca, Romania on a Rainy. I love this country!!!

When creating a Google Apps Script’s I often find I am creating new folders and files in specific locations on Google Drive after, say, generating a report or something.

A Note on Folders in Google Drive

All files and folders in Google Drive are allocated a unique key that identifies them.

Google Drive Folder ID

The file location and all the data about the file are mapped to this ID. This means you can have as many folders or files with the same name even in the same directory without a duplicate error being thrown because they all have their own unique ID for their URL.

Getting Started

More often than not, I know the parent folder that I want to put my subfolders in. This means I can get the parent folder’s ID and use that as my starting point to add subfolders. To do this we use the DriveApp class.

Below are three useful functions for creating folders.

  • Simple – Create a folder under the Parent folder ID – Duplicates are not checked and there can be multiple subfolders with the same name but all have their own unique id.
  • Medium – Create a folder only if that folder name does not exist in the Parent folder – No folder is created if the folder already exists.
  • Hard-ish – Create a folder. If the name exists, add a counter to the name – If the file already exists then add a counter to the end of the file name.

All the functions will take two arguments: folderID – the unique id of the parent folder and folderName – the name you want to call your new folder.

The start() function will simply grab the two variables for the folderID and folderName and run the folder creation function. This is to simulate using the functions in your code.

Feel free to read what you need. I try and write these for a wide range of coding skill in mind.

Continue reading “Google Apps Script – How to Create Folders in Directories with DriveApp”