## Performance of Google Apps Script Text Finder Class on 2 Approaches to Searching Large Datasets

Inspired by research into a recent blog post, the Google Apps Script Text Finder Class’ Find All (`findAll()`) and Find Next (`findNext()`) methods were benchmarked over two different datasets containing 50,000 rows. The first dataset contained 1,000 cells matching the search text. The second dataset contained 100 matching cells.

For each dataset, a test was conducted to retrieve either the first 10 matching cells or the first 100 matching cells. The Find All and Find Next approaches were tested and compared on each test.

It was expected that Find Next would perform best on the condition where the dataset contained a large number of found items and only a small number of first cells needed to be reported. The benchmark results suggest that this hypothesis is most likely.

 First number of cells to retrieve Test Function Avg. run time over 100 runs. Fastest Function Fastest Avg. Time Avg. time Difference 1000 items to find 10 1 v2 findAll 1626.24 v3 findNext 1368.45 257.79 v3 findNext 1368.45 50 2 v2 findAll 1578.19 v2 findAll 1578.19 4993.61 v3 findNext 6571.8 100 items to find 10 3 v2 findAll 360.94 v2 findAll 360.94 975.16 v3 findNext 1336.1 50 4 v2 findAll 377.13 v2 findAll 377.13 6175.59 v3 findNext 6552.72

Table: The average time in milliseconds of 100 runs of each test of Apps Script Text Finder findAll() and findNext() methods. Image link for mobile-friendly viewers.

## Method

### Sample Data

Two columns of data 50,000 rows deep were generated for this test. Each cell in each column consisted of a number; either 1, 2, 3, 4 or 5. An equal spread of numbers 1 through 4 where added to each row. Each column differs by the number of 5s in each row:

• Col A: 1,000 5’s
• Col B: 100 5’s

Each column was then selected and randomised with: Data > Randomise range.

### Test

Two functions are compared to test their performance based on four test conditions based on 100 runs of each test:

1. Retrieve the first 10 cells containing the search text where the range contains 1,000 matching search items.
2. Retrieve the first 50 cells containing the search text where the range contains 1,000 matching search items.
3. Retrieve the first 10 cells containing the search text where the range contains 100 matching search items.
4. Retrieve the first 50 cells containing the search text where the range contains 100 matching search items.

The time in milliseconds was recorded using the JavaScript `Date.now()` method before and after the functions were run. The difference in time in milliseconds was then appended to an array and added to a Google Sheet column for each test type. This culminated in 8 sets of 100 results.

The average of each test was then recorded and used to compare performance.

Note: `Performance.now()` is not available in Google Apps Script.

### Code

All code and results can be found copied from this sheet:

Analysis of Google Apps Script Create Finder Class Retrieve n found values

To explore the code and run your own independent tests, make a copy of the Google Sheet: File > Make a copy.

More detailed breakdowns of the code for each test function can be found in the source tutorial.

Note! There is no v1. The version numbers refer to the tutorial related to this post.

#### Main Test RUN

This function ran all the test conditions. Modify `colPastePosition` to add the culminated times to the desired columns. Then uncomment the desired run.

#### test_v2 – Google Apps Script Text Finder Class- findAll()

Code breakdown can be found here: link.

This function retrieves the full list of all found cells using the `findAll()` method from the Text Finder Class. All available found items in the range are then stored in the `found` variable.

It then relies on a for-loop to iterate through each cell and collect the cell location using the Spreadsheet App Class’ range `getA1Notation` method. Each cell location is then stored in the locations variable as an array item before returning the array to the initialising function.

The for-loop breaks when the total number of required cell items (the `position`) equal the index variable (`i`) in the loop.

#### test_v3 – Google Apps Script Text Finder Class- findNext()

Code breakdown can be found here: link.

In this function, a call is made to the spreadsheet to retrieve the found cell value each time `findNext()` method of the Text Finder Class is called. On each iteration, the `getA1Notation` method is used to retrieve the cell location. This location is then stored as an array value in the `locations` variable before being returned to the initiating function.

The function used a while-loop to iterate through each next item found until the `counter` – or the number of required cells to collect – is reached.

## Results & Discussion

### Test 1: Retrieve the first 10 cells containing the search text where the range contains 1,000 matching search items.

Version 3 –`findNext()` performed better on average when there were 1000 potential items to find in the range but only the first 10 items need to be selected. Versions 3’s average speed was 1368.45ms compared to version 2’s average run speed of 1826.24ms. This is a performance increase of 257.79ms for version 3.

Version 2’s lower performance is likely due to needing to collect all available found cells before it can extract the top 10 items.

Version 3, makes 10 calls to the Google Sheets in this example. Compared to version 2, this takes relatively less time than collecting all available found cell references to the search item.

### Test 2: Retrieve the first 50 cells containing the search text where the range contains 1,000 matching search items.

Version 2 – `findAll()` performed significantly better over 100 runs than version 3 when retrieving the top 50 found cells from a possible 1000. Version 2 was, on average, 4993.61ms faster at an average runtime of 1578.19ms compared to version 3’s sluggish 6571.80ms average.

It was expected that test one and test two’s times for version 2 would be similar and there are only 48.05ms between their average runtimes.

Version 3’s poor performance is likely due to its reliance on calling the spreadsheet to collect the cell data on all 50 calls it needs to make.

### Test 3: Retrieve the first 10 cells containing the search text where the range contains 100 matching search items.

Version 2, again, performed better by 975.16ms than version 3 when there was a smaller potential number of items to find in the range and only the first ten items need to be retrieved.

Here the performance margin between the two versions was closer than in the previous test. Version 2’s average run speed was 360.94ms while version 3’s runtime was 1336.10ms.

With a smaller number of retrieved items, the version 2 `findAll()` function did not have to work as hard to collect the methods related to each range it collects. Whereas version 3 still needed to make 10 performance-intensive calls back to the Google Sheet each time with relatively no performance change to test one.

### Test 4: Retrieve the first 50 cells containing the search text where the range contains 100 matching search items.

Predictably, version 2 – `findAll()` performed the best when the expected match sample is small (100 available matches) and the total first set of cells to retrieve was relatively large (50).

Version 2’s average completion time was 377.13ms compared to version 3’s average of 6552.72ms, performing on average 6175.59ms faster. This is by far the largest margin on performance between the two versions.

Here again, version 3 must perform 50 calls to the Google Sheet, each one retrieving the cell range data. Alternatively, version 2 makes one call to the spreadsheet and then retrieves the cell data for all collected values. This is significantly faster than version 3’s approach.

### Overall

On datasets that may have the potential to contain a large number of matching items, but fewer required results to return, version 3 may be the best option. In all other cases, version 2 is the most optimal approach to finding data in a range.

It is important to note that it can be difficult to accurately measure performance with Apps Script runs because resource allocation to run a script does seem to vary. Nevertheless, with a sample size of 100 runs, it is hoped that average values will be more accurate than a smaller sample.

Grab Your Own Copy of the Google Sheet and Attached Code here

If you have found the tutorial helpful, why not shout me a coffee ☕? I'd really appreciate it.

Need help with Google Workspace development?

My team of experts can help you with all of your needs, from custom app development to integrations and security. We have a proven track record of success in helping businesses of all sizes get the most out of Google Workspace.