A JoinAll function

Based on a discussion on the mrexcel.com website of Harlan Grove’s AConCat function, I wrote a join (or concatenate) function that works with a scalar data type (a number, string, or boolean) as well as complex data types like a range, an array, a collection, or a dictionary. It also accepts an arbitrary number of arguments (through the ParamArray parameter). The unfortunate consequence of using ParamArray is that the separator must be specified first and becomes mandatory.

On a side note, I don’t use the IsArray VB(A) function. For some reason I was under the impression that there was no such function. Now, after some basic testing I know why it’s not useful — so, it might as well not exist {grin}. The function returns True if the argument is a multi-cell range! I’ve used various functions to check if a variable contains an array and I share one below.

When processing a range argument, the function uses Range.Text. This means it gets the value as shown in the cell. It also means that the function shows error values correctly.

A simple test subroutine follows. It uses several fairly unusual data structures.

The function is also usable as a UDF in an Excel worksheet. Two examples:

=JoinAll(CHAR(10),Sheet2!E125:E129,{1,2,3,4,5,6},Sheet2!B123:B129,B5)

and

=JoinAll(CHAR(10),{11,12,13},B5,(C6,B7),B8:C10,{1,2,3;4,5,6})

A snapshot of the last example is below.

joinall

8 thoughts on “A JoinAll function

  1. Two possible red flags:

    1) A Collection object can contain a reference to itself, which would send your routine into an endless recursion (and eventual stack overflow).

    2) The algorithm for building up the result string takes a time proportional to the square of the number of elements.

    You could dodge both problems by using a “stringbuilder” of the sorts discussed in comments earlier here (http://www.dailydoseofexcel.com/archives/2010/07/19/string-building-class/), but adding some code to stop building after a certain length.

    As far as IsArray(), it works, but when you use it on a multi-cell Range object, VBA “helpfully” uses the range’s Value property because it’s the default property. So to check if a variable contains an array and not an object with a default property that’s an array, you have to first check for object-ness and only check for array-ness after that.

  2. Thanks for your comments, John.

    1) I had not considered that possibility. An object containing itself doesn’t seem to be very useful but it is technically feasible.

    Rather than restrict the length of the result, I would be more inclined to restrict the recursive calls to X levels.

    2) I considered and discarded both a collection and a dictionary. I have not seen any documentation that items will be returned in the order in which they were added. That left an array. Not a big deal but like you I wasn’t sure of the performance of Join() to concatenate all the elements of the array. My instinct is that a ‘buffer’ with Mid() to fill it up will perform better than the array approach but I don’t know for sure.

    And, as far as InArr goes, if I have to write a custom function to check for object-ness and then check for array-ness, why not use the custom function to directly check for array-ness? {grin}

  3. There’s an arguably more elegant way to prevent IsArray from returning false positives for ranges or any other object with a default property which could be an array of values: test to see if it’s an object first.

    Function isarraystrict(x As Variant) As Boolean
      If IsObject(x) Then Exit Function
      isarraystrict = IsArray(x)
    End Function

    Note: IsArray returns True for multicell ranges because IsArray(Range(..)) is equivalent to IsArray(Range(..).Value), and the default .Value property is an array for multicell ranges.

    Since arrays may be nested within other arrays but can’t be recursively nested, limiting the number of recursive calls of the generalized concatenation function in order to avoid infinite recursion through recursive collection references is problematic. Better to use a separate subprocedure with limited recursion just for collection objects, and better still to make the number of recursive calls an argument.

    For x . . . res = res & x & sep . . . Next and finish off with res = Left(res, Len(res) – Len(sep)) is just so much uglier and less clear than For x . . . res = res & sep & x . . . Next and finish off with res = Mid$(res, Len(sep) + 1).

  4. Detecting an array variant in Excel VBA:

    http://excellerando.blogspot.com/2009/02/detecting-array-variant-in-excel-vba.html

    Detecting the difference between Empty and Empty() is a challenge to all the published methods I’ve seen: one’s an array, and the other isn’t.

    My own approach (see link above) is to detect the brackets in the VarType:

        InStr(TypeName(varTest), “()”)

    The definitive answer is an ArrayDimensions function that uses the Kernel API. This was published on VB2TheMax several years ago, and it involves the dangerous lunacy of CopyMemory and pointers. I use it almost every day:

    Private Declare Sub CopyMemory Lib “kernel32” Alias “RtlMoveMemory” _
                            (Destination As Any, _
                             Source As Any, _
                             ByVal Length As Long)

    Public Function ArrayDimensions(arr As Variant) As Integer
      ‘—————————————————————–
     ‘ will return:
     ‘ -1 if not an array
     ‘ 0  if an un-dimmed array
     ‘ 1  or more indicating the number of dimensions of a dimmed array
     ‘—————————————————————–
     
      ‘ Originally published by R. B. Smissaert.
     ‘ Additional credits to Bob Phillips, Rick Rothstein, and  Thomas Eyde on VB2TheMax

      Dim ptr As Long
      Dim vType As Integer

      Const VT_BYREF = &H4000&

      ‘get the real VarType of the argument
     ‘this is similar to VarType(), but returns also the VT_BYREF bit
     CopyMemory vType, arr, 2

      ‘exit if not an array
     If (vType And vbArray) = 0 Then
        ArrayDimensions = -1
        Exit Function
      End If

      ‘get the address of the SAFEARRAY descriptor
     ‘this is stored in the second half of the
     ‘Variant parameter that has received the array
     CopyMemory ptr, ByVal VarPtr(arr) + 8, 4

      ‘see whether the routine was passed a Variant
     ‘that contains an array, rather than directly an array
     ‘in the former case ptr already points to the SA structure.
     ‘Thanks to Monte Hansen for this fix
     
      If (vType And VT_BYREF) Then
        ‘ ptr is a pointer to a pointer
       CopyMemory ptr, ByVal ptr, 4
      End If

      ‘get the address of the SAFEARRAY structure
     ‘this is stored in the descriptor

      ‘get the first word of the SAFEARRAY structure
     ‘which holds the number of dimensions
     ‘…but first check that saAddr is non-zero, otherwise
     ‘this routine bombs when the array is uninitialized

      If ptr Then
        CopyMemory ArrayDimensions, ByVal ptr, 2
      End If

    End Function

  5. A reply to FZZ:

    Since arrays may be nested within other arrays but can’t be recursively nested,

    Arrays definitely CAN be recursively nested. Question is, can the memory be cleaned up by an Erase()? You’re in deep trouble if you’ve got recursive objects! In both cases, you need a recursive call to an iterative destructor; that is to say, this type of code won’t release all the memory:

    For i = UBound(arrX) To LBound(arrX)
        Erase arrX(i)
       
    Next i

    Erase arrX

    For each ThisDictionary in MyDictionary

        Set ThisDictionary = Nothing

    Next

    Set MyDictionary = Nothing

    Feel free to run this recursive array-builder and prove it to your own satisfaction:

    Public Sub TestArrayRecursion()

    Dim arrX() As Variant
    Dim arrY() As Variant

    Dim i As Long
    Dim j As Long

    j = 65

    ‘Demonstrate a nested Array
    For i = 0 To 3

        ReDim Preserve arrX(0 To i)
       
        ‘ arrY = (“A”, “B”, “C”, “D”)
       arrY = Array(Chr(j + 0), Chr(j + 1), Chr(j + 2), Chr(j + 3))
        arrX(i) = arrY
       
        j = j + 4
       
        Erase arrY
       
    Next i
     
     
    ‘Demonstrate a recursed array:
    For i = 4 To 8

        ReDim Preserve arrX(0 To i)
        arrX(i) = arrX
     
    Next i
     
    Stop  ‘ Now view your work in the Locals Window

    Erase arrX

     
    End Sub

    You need something like this monster kludge for a recursive dictionary object – or even the dictionary-of-dictionaries object that’s returned by some crude XML parsers:

    Public Sub DictionaryDeepDelete(objDictionary As Scripting.Dictionary)
    ‘ Set a dictionary object = Nothing, setting all object members to nothing
    ‘ VBA does not always clean up objects in arrays, dictionaries and collections
    On Error Resume Next

    Dim i           As Long
    Dim iCount      As Long
    Dim iStep       As Long
    Dim arrKeys     As Variant
    Dim strKey      As String

    Static iRecurse As Integer

    ‘ Iterating a dictionary, even using the ordinal of .Items(), is slow.

        If objDictionary Is Nothing Then
            Exit Sub
        End If

        iRecurse = iRecurse + 1

        If iRecurse > 4 Then ‘ this is unsatisfactory, but an improvement on a stack overflow
       
            objDictionary.RemoveAll
            Set objDictionary = Nothing
            GoTo ExitSub
           
        End If
             
             
        iCount = objDictionary.Count
        iStep = CLng(iCount / 5)
        If iStep = 0 Then
            iStep = 1
        End If
       
        If iCount >= 200 Then
            If Not Application.StatusBar = “FALSE” Then
                Application.StatusBar = “Clearing local data… “
            End If
        End If
       
        arrKeys = objDictionary.Keys
       
        ‘ Note that the Keys() and Items() properties are static arrays, obtained by a ‘deep copy’
       ‘ from the parent dictionary: we can use them to list the keys, but not to obtain objects
       ‘ and variants ‘ByRef’ – so we get the keys, and always work via dictionary(key) references

        For i = 0 To iCount – 1
           
            strKey = “”
            strKey = arrKeys(i)
           
            If i Mod iStep = 0 And iCount >= 200 Then
                If Not Application.StatusBar = “FALSE” Then
                    Application.StatusBar = “Clearing local data… “ & CInt(100# * i / (1 + iCount)) & “%”
                End If
            End If
               
            Select Case varType(objDictionary(strKey))
            Case VBA.VbVarType.vbUserDefinedType
           
                objDictionary(strKey) = 0
                 
            Case VBA.VbVarType.vbObject
           
                If TypeName(objDictionary(strKey)) = “Dictionary” Then

                    DictionaryDeepDelete objDictionary(strKey)
                   
                Else

                    Set objDictionary(strKey) = Nothing
                   
                End If  ‘ type-specific destructors
           
            Case VBA.VbVarType.vbDataObject
           
                Set objDictionary(strKey) = Nothing
               
            Case Is >= VBA.VbVarType.vbArray
                   
                If Not IsEmpty(objDictionary(strKey)) Then
                    Erase objDictionary(strKey)
                End If
               
            Case Is < vbString
                ‘ it’s a native variant type, do nothing
           End Select

        Next i
       
        Erase arrKeys
       
        objDictionary.RemoveAll
        Set objDictionary = Nothing

        If iCount >= 200 Then
            If Not Application.StatusBar = “FALSE” Then
                Application.StatusBar = False
            End If
        End If

    ExitSub:
        iRecurse = iRecurse – 1

    End Sub

    Note that Tushar’s point about restrictive recursion is implemented here in the static iRecurse counter.

    Note, also, the call

    For i = 0 To iCount – 1
    The recursed objects have *probably* been added sequentially, so we’ve got a fairly good chance of navigating the recursion stack in a reversal of its creation and dismissing all the objects without skipping any items when the loop counter or enumeration ordinal is changing ‘on-the-fly’.
  6. Interesting stuff. I’ll just note that objects that contain references to themselves come up in all sorts of situations. Consider a containment situation where the container has references to the components and the components have a reference to the container. Even with raw collections it’s not implauisble:

    Public Sub adventure()
        Dim c1 As Collection: Set c1 = New Collection
        Dim c2 As Collection: Set c2 = New Collection
        Dim c3 As Collection: Set c3 = New Collection
       
        Call c1.Add(“gold!”, “contents”)
        Call c1.Add(c2, “down”)
       
        Call c2.Add(c1, “up”)
        Call c2.Add(c3, “down”)
       
        Call c3.Add(c2, “up”)
       
        Debug.Print c3(“up”)(“up”)(“contents”)
    End Sub

    On the serious side, this kind of thing does show why I find that completely general library functions are more trouble than they’re worth. The differences between collections, arrays, etc. probably mean that you’d never want to actually join raw instances without doing some processing first. You’re better off with a set of specific, more meaningful routines.

  7. Nigel — not the same thing. Try running the following.

    Sub foo()

      Dim a As Variant, c As New Collection

      Debug.Print String$(32, “=”)

      a = Array(0, 1, 2, 3)

      ReDim Preserve a(0 To 4)

      a(4) = a

      Debug.Print a(1), a(4)(1)

      a(1) = 1 / 2

      Debug.Print a(1), a(4)(1)

      a(4)(1) = 10

      Debug.Print a(1), a(4)(1)

      Debug.Print String$(32, “-“)

      c.Add key:=“0”, Item:=0

      c.Add key:=“1”, Item:=1

      c.Add key:=“2”, Item:=1

      c.Add key:=“3”, Item:=3

      c.Add key:=“4”, Item:=c

      Debug.Print c(“1”), c(“4”)(“1”)

      c.Remove (“1”)

      c.Add key:=“1”, Item:=1 / 4

      Debug.Print c(“1”), c(“4”)(“1”)

      c(“4”).Remove (“1”)

      c(“4”).Add key:=“1”, Item:=100

      Debug.Print c(“1”), c(“4”)(“1”)

      Debug.Print String$(32, “=”)

    End Sub

    On my machine this returns

    ================================

    1 1

    0.5 1

    0.5 10

    ——————––

    1 1

    0.25 0.25

    100 100

    ================================

    In other words, assigning array a to an entry in itself just copies the values already in a to that entry. It doesn’t put a reference to the array into the array. On the other hand, assigning a collection to an item within that collection does put a reference to the collection into that collection item.

    After you assign a to its last entry a(4), assign a different value to a(1); the values in a(1) and a(4)(1) differ. Then assign a third value to a(4)(1), and a(1) and a(4)(1) still differ. OTOH, replace the item for key “1? in collection c, and the item for key “1? in the collection item under key “4? is the same. Then replace the item for key “1? in the nested collection under key “4? in collection c, and the item for key “1? is the same.

    Assigning arrays to anything works ‘by value’. Assigning objects to collection items works ‘by reference’. That’s a fundamental distinction. I’ll stick by my assertion: arrays can’t be recursively nested in VBA.

  8. Performance in concatenation code is always better if the short string is joned to the delimiter and then to the long string. Rather than join long string to delimter, and then both to the short string (as per this code)

    It’s a basic design strategy that often gets missed.


Posting code? Use <pre> tags for VBA and <code> tags for inline.

Leave a Reply

Your email address will not be published.