Arrays, one of the most flexible, performant and popular data structure in almost any programming language becomes especially useful in Excel. All you need to know about arrays in VBA is right here
In case you missed it, the last month in my Blog was dedicated to data structures in computer programming and their use in Excel VBA. After we covered Enumerations, Collections and User-Defined Data Types, we close this week with a grand finale: Arrays.
What is an Array?
An array is a memory storage structure for storing an indexed series of elements of the same data type.
This is how we would dimension and array of 4 String-type elements:
Dim Seasons(0 to 3) as String
As you can see, what differentiates a reference to an array variable from a regular variable is the parenthesis following the variable name.
In the parenthesis, we specify the size of the array, or more accurately, the index bounds of its elements (or members, as I like to call them). In our example, the Seasons array is defined to be able to accommodate four strings, the first of which will assume the index of 0. Our Seasons array variable would therefore be a perfect store for the four seasons of the year:
Let’s assign a value to the first member of the Seasons array:
Seasons(0) = “Winter”
At this time, we have four allocated slots in memory serving the Seasons variable, the first of which is populated with the (string) value “Winter” and the other three contains an empty string (the default for string-type variables). Let’s complete the whole year:
Seasons(1) = “Spring”
Seasons(2) = “Summer”
Seasons(3) = “Autumn”
By the way, a short form of dimensioning the Seasons array could be:
Dim Seasons(3) as String
The lower bound (or index of the first member) defaults to 0. While we can change that to be 1 by declaring Option Base 1 at the top of our module, real cool programmers always work with 0-based arrays in all programming languages (in some that’s the only option there is).
The Great Benefits of Arrays
Among the benefits of using arrays, I can offer:
- Very performant, in-memory processing
- Favorable for .Next loops
- Ideal for table-like structures (think about a Range…)
- Most efficient in data-exchange to/from Worksheet cells
- Supports up to 60 dimensions
By the end of this Blog post, you’ll understand all of this, trust me.
Working with Arrays in Excel VBA
Having indexed members, an array naturally lends itself to a For..Next loop. Let’s loop through our Seasons array to celebrate the summer:
For i = Lbound(Seasons) To Ubound(Seasons)
If (Seasons(i) = “Summer”) Then MsgBox “YEAH!”
Couple of things to note here:
We draw the limits of the For..Next loop straight from the array itself.
The Lbound function returns the lower bound of the array (0 in our example) and the Ubound function returns the upper bound of the array (which is 3).
Using Lbound and Ubound (instead of hard coding “0” and “3”, for example) is a very good practice when addressing the bounds of an array. If the array ever changes in size, there is no need to adjust any references to its bounds anywhere in our code.
Another important point is to recognize how the iteration variable of the loop (“i”) allows us to address a different array member in every iteration of the loop, in essence running through all of them in sequence.
Flexibly enough, an array can also be treated as a Collection of its members, making the following loop perfectly valid:
For Each season in Seasons
If (season = “Summer”) Then MsgBox “YEAH!”
However, there are 3 points to remember when using an array as a collection:
- The iterator variable (“season” in our example) must be of type Variant
- Array members can only be read (not written). In other words, Add will not work
- This will not work if the array members are of a user-defined data type
It is not always the case in which we know the size of the array at the time we dimension it, as in the easy 4-seasons example above.
Let’s say we want to store our employees’ salary in an array of Single numbers. However, only at run-time it will be known how many employees I have (I typically read them from a database or an Excel table that keeps changing as I add and remove employees).
One approach to handle the situation would be to dimension a large enough array to accommodate the largest data set anticipated. This is typically achieved by declaring a Constant to set the upper bound of the array:
Const MAX_EMPLOYEES = 500
Dim Salaries(MAX_EMPLOYEES) As Single
In some cases, that would be a good enough, especially when handling small arrays, not committing too much memory space.
A more elegant and efficient way would be to dimension the array without specifying its upper bound at all:
Dim Salaries() as Single
I still need the parenthesis to flag this variable as an array, though.
As we cannot work with an un-dimensioned array, before we actually use it we must re-dimension its bounds with the ReDim statement, like that:
ReDim Salaries(1 To EmpCount) as Single
The ReDim statement is carried out during runtime, therefore we can prepare the known bounds of the array in other variables beforehand. In this example, we probably read all employees from a table and stored the number of the employees in the EmpCount variable, and now it can be used to dimension the array.
I also over-written the default lower bound of zero to be 1 for this array.
Now that the array is legally dimensioned, we can start using it.
By the way, The ReDim statement can be used more than once on a given array. If you want to dynamically change the size of the array without losing the values stored in the array already, be sure to add the Preserve qualifier to the statement, like this:
ReDim Preserve Salaries(1 To EmpCount + 1) as Single
We added room for one more salary value for one more employee here, while preserving the salary values already stored in the array. Without the Preserve qualifier the Salaries array would be cleared of all values.
Be advised that re-dimensioning an array is a costly operation, therefore I would not advise to plan your program in such a way that will change the size of an array often, as in adding room for one more member within a loop over large amount of data or something like that.
Populating an array using the Split Function
We already saw how we can populate an array by using a loop. There are other ways to populate arrays, one of which is the Split function.
The Split function breaks a string by a defined delimiter into an array structure, populated with the string values.
We can break a single string holding all four seasons, separated by a comma, into our Seasons array in one shot. For this to work, our array must be un-dimensioned, to be dynamically populated by the Split function:
Dim Seasons() as String
Seasons = Split(“Winter,Spring,Summer,Autumn”, “,”)
The Split function, therefore, takes care of two functions at once: re-dimensioning the array variable and populating it with values. The first argument provided to the Split function is the string to be crunched. The second argument specifies the string delimiter, comma (“,”) in our example.
Populating an array using the Array Function
Similar to the Split function, we can use the Array function to gather separate variables (or values) into a single array.
For this to work, our array must be un-dimensioned and defined to store variables of Variant type:
Dim Seasons() as Variant
Seasons = Array(“Winter”,”Spring”,”Summer”,”Autumn”)
Think of a CSV file with comma-delimited values you want to process. It only takes a single statement to read the entire file and break it into an array using the Array function, ready for fast processing. This will happen very fast!
If we think of an array as a field of memory “cells”, the Seasons array would look like a list of seasons, lined up in a row, beautifully placed into the memory line of cells comprising the array.
Let’s now consider a field of memory “cells” accommodating a table:
Can we stretch the array from a “list” to a “table” structure? You bet we can!
While a list is a 1-dimensional structure, a table is a 2-dimensional structure (we have “rows” and “columns”). We can add the second dimension to the array declaration using a comma as a separator between the bounds of the first dimension and the bounds of the second dimension. Let’s define an array to accommodate the above table:
Dim Trainers(2, 3) as String
This is a declaration of a 3×4 elements array for holding strings. Let’s assign a value to the array element on “row” 1 and “column“ 2:
Trainers(0, 1) = “Israel”
Applying the classic For..Next loop to scan all members of an array, we come up with the following modal:
For i = Lbound(Trainers, 1) to Ubound(Trainers, 1)
For j = Lbound(Trainers, 2) to Ubound(Trainers, 2)
If Trainers(i,j) = “France” Then MsgBox “Salute”
As you can see, we added a second argument to the Lbound and Ubound functions, specifying the dimension for which we are requesting the bound. The outer loop scans the first dimension (loops 3 “rows”) and for each “row”, the inner loop scans the second dimension (loops 4 “columns”).
Can you see the endless possibilities we can play with here?
Just so you know, you need not limit your array to two dimensions. You can add a third dimension as well, representing sort of a 3-d cube of memory “cells”:
Dim xyz(1 To 10, 1 To 10, 1 To 50)
While you’re at it, add more dimensions (if you can conceptualize them in your head). Yes, up to 60(!) dimensions…
The Power of Arrays and Excel
The tabular nature of a 2-d array and the tabular nature of a Worksheet range makes arrays a perfect match for exchanging data with a range of cells.
As a matter of fact, exchanging data between a range and a 2-d array has been optimized and considered to be the most efficient and performant operation when such an exchange is called for. If you want a stunning demonstration, check out this Blog post I published a while ago.
If we want to read a range of cells into a 2-d array, all we need to do is to prepare a Variant, un-dimensioned variable, and assign the range values into the array variable:
Dim Trainers() as Variant ‘Must be Variant
Trainers = Range(“A1:H4000”).Value
This innocent statement first dimensions the array variable as per the target range size in two dimensions, and then swallow the whole range of cells in one shot, very fast, into the array variable – each cell into an array member placeholder. Isn’t this amazing?
By the way, this method of populating an array variable, defines the array to be 1-based, not the default 0-base. Keep that in mind when working with that array variable next.
In the opposite direction, the following statement will throw the array into a range of cells:
Range(“A1:H4000”).Value = Trainers
How elegant (and FAST) is that!
You shall never loop cells anymore – just sip them into an array in one shot and loop the array. Then, if needed, throw the manipulated array back into the Worksheet.
Couple of Last Comments on Arrays
- The Redim Preserve statement can only change the bounds of the last dimension of an array
- Passing an array to a Sub/Function is always done ByReference, not ByValue. The bounds need not be included in the Sub/Function stub but the type must match the array declaration:
Function FindTrainer(TrainerName as String, ByRef Trainers() as Variant) As Boolean
- The Erase arr statement will delete the array (if dynamic) or reset its values to default (0 for numbers, vbNullString, or empty string, for strings) otherwise
- An array dimensioned by a Worksheet range assignment always has 2 dimensions (even if reading a single cell) and a lower bound of 1
If some of the examples in this Blog post looks familiar to you, that’s not a coincidence – they are taken from my flagship online course: Computer Programming with Excel VBA.
Hey, if you found this Blog post valuable, please share it with a friend or a colleague – they’ll will thank you for that (and I will too!)
WANT TO LEARN PROGRAMMING WITH EXCEL VBA?
I created a complete 8-course program for you.
From programming to databases with Excel VBA to deliver business-grade solutions.
Take the first course today: Computer Programming with Excel VBA.
Click the button to see the complete program and start now!