Bushy Trees

A favorite peeve of mine is code with “bushy trees.” I first saw this phrase in Kernighan and Plauger’s Elements of Programming Style.

Recently, I saw some code that checked if a RefEdit control referred to a single cell that contained a non negative integer. I cleaned up the formatting some since the original indentation style might be best described as “random tabs.” But, code formatting is not the subject of this post.

The IFs are fairly easy to understand since they essentially follow how most people would think to validate the string parameter. The tricky part comes in ensuring that one has the correct Else clause at the correct indentation level. In this case, that too might not be too difficult since the Else clauses are fairly trivial, each consisting of a single boolean assignment. But, imagine how much more difficult the task would be if there were further If clauses or loop structures in some of the Else clauses!

Option Explicit
Public Function getNonNegInt(aRefEditText As String, _
        ByRef Rslt As Integer) As Boolean
    Dim aRng As Range
    Const MaxInt As Integer = 32767
    getNonNegInt = True
    On Error Resume Next
    Set aRng = Range(aRefEditText)
    On Error GoTo 0
    If Not aRng Is Nothing Then
        If aRng.Cells.Count = 1 Then
            If IsNumeric(aRng.Value) Then
                If CDbl(aRng.Value) >= 0 Then
                    If CDbl(aRng.Value) = CLng(aRng.Value) Then
                        If CLng(aRng.Value) < = MaxInt Then
                            Rslt = CInt(aRng.Value)
                        Else
                            getNonNegInt = False
                            End If
                    Else
                        getNonNegInt = False
                        End If
                Else
                    getNonNegInt = False
                    End If
            Else
                getNonNegInt = False
                End If
        Else
            getNonNegInt = False
            End If
    Else
        getNonNegInt = False
        End If
    End Function

Code Sample 1

As a first pass, one could remove all the boolean assignments by first setting the function value to false and not to true as in Code Sample 1. Then only if we have an acceptable value do we return a True value.

Option Explicit
Public Function getNonNegInt(aRefEditText As String, _
        ByRef Rslt As Integer) As Boolean
    Dim aRng As Range
    Const MaxInt As Integer = 32767
    On Error Resume Next
    Set aRng = Range(aRefEditText)
    On Error GoTo 0
    getNonNegInt = False
    If Not aRng Is Nothing Then
        If aRng.Cells.Count = 1 Then
            If IsNumeric(aRng.Value) Then
                If CDbl(aRng.Value) >= 0 Then
                    If CDbl(aRng.Value) = CLng(aRng.Value) Then
                        If CLng(aRng.Value) < = MaxInt Then
                            Rslt = CInt(aRng.Value)
                            getNonNegInt = True
                           End If
                        End If
                    End If
                End If
            End If
        End If
    End Function

Code Sample 2

However, this still doesn’t help with all the nested If and End If clauses.

So, how does one clean up this deeply nested code? How about if we reverse the tests? Instead of testing if the range is not nothing, test if it is nothing. Instead of testing if the range contains 1 cell, test if it contains more than 1 cell. And, so on. The result as shown in Code Sample 3 is a single If statement (with multiple ElseIf clauses) that is ‘flat’ — with no confusing nesting!

Option Explicit

Public Function getNonNegInt(aRefEditText As String, _
        ByRef Rslt As Integer) As Boolean
    Dim aRng As Range
    Const MaxInt As Integer = 32767
    getNonNegInt = False
    On Error Resume Next
    Set aRng = Range(aRefEditText)
    On Error GoTo 0
    If aRng Is Nothing Then
    ElseIf aRng.Cells.Count > 1 Then
    ElseIf Not IsNumeric(aRng.Value) Then
    ElseIf CDbl(aRng.Value) < 0 Then
    ElseIf CDbl(aRng.Value) <> CLng(aRng.Value) Then
    ElseIf CLng(aRng.Value) > MaxInt Then
    Else
        Rslt = CInt(aRng.Value)
        getNonNegInt = True
        End If
    End Function

Code Sample 3

The code above uses a very powerful concept — that of a ‘null clause.’

In most cases, when we have a If…Then, we have some statement in the True branch of the If statement. It might be a series of assignments or it might be another If or a loop of some sort but there is something in the True branch. For example, in Code Sample 1 and Code Sample 2 above, there are two assignments.

                        If CLng(aRng.Value) < = MaxInt Then
                            Rslt = CInt(aRng.Value)
                            getNonNegInt = True
                           End If

However, in Code Sample 3, each Then is followed not by a statement but by the ElseIf clause. This results in a null statement (or empty block) in the True branch. This is perfectly legal in every programming language I’ve used and in this case it serves a very powerful role in simplifying the code.

For the sake of completeness, we will look at one more way of coding the above. In this particular scenario since there is no further processing after the string is validated, one could use use a series of Ifs that simply go to an exit point for bad data. But, this would not work for scenarios in which we wanted to do additional processing after the string of Ifs.

Option Explicit
Public Function getNonNegInt(aRefEditText As String, _
        ByRef Rslt As Integer) As Boolean
    Dim aRng As Range
    Const MaxInt As Integer = 32767
    On Error Resume Next
    Set aRng = Range(aRefEditText)
    On Error GoTo 0
    If aRng Is Nothing Then GoTo ErrXIT
    If aRng.Cells.Count > 1 Then GoTo ErrXIT
    If Not IsNumeric(aRng.Value) Then GoTo ErrXIT
    If CDbl(aRng.Value) < 0 Then GoTo ErrXIT
    If CDbl(aRng.Value) <> CLng(aRng.Value) Then GoTo ErrXIT
    If CLng(aRng.Value) > MaxInt Then GoTo ErrXIT
   
    Rslt = CInt(aRng.Value)
    getNonNegInt = True
    Exit Function
ErrXIT:
    getNonNegInt = False
    End Function

Code Sample 4

Posted in Uncategorized

27 thoughts on “Bushy Trees

  1. Yes the code sample 2 example is one I like to use but I tend to rem it up a bit, so it is clear what is going on when I come back to it after six months

    If aRng Is Nothing Then
    ‘do nowt
    ElseIf aRng.Cells.Count > 1 Then
    ‘do nowt
    etc.

  2. @Tushar –

    I would do it like this:

    Public Function getNonNegInt(aRefEditText As String, _
            ByRef rslt As Integer) As Boolean
        Dim aRng As Range
        On Error GoTo ErrXIT:
        Set aRng = Range(aRefEditText)
        If Application.Evaluate(“(“ & aRefEditText & “>0)*(“ & aRefEditText & “<32767)”) Then
            rslt = aRng
            getNonNegInt = True
        End If
    ErrXIT:
    End Function

    Regards,
    Daniel Ferry
    excelhero.com

  3. Nice article Tushar.

    @Daniel

    Surely your solution also needs to test for the cell value being the same as the CLng case? Your evaluate would become something like

    Application.Evaluate(“(” & aRefEditText & “>0)*(” & aRefEditText & “<32767) * (CLng(” & aRefEditText & “)=” & aRefEditText &”)”)

  4. @Daniel, I might be tempted to use this slightly shorter function (which, as a personal choice, defaults to processing the first cell of a multi-cell range address is such a range is passed into the first argument)…

    Public Function getNonNegInt(aRefEditText As String, _
            ByRef rslt As Integer) As Boolean
      Dim CellValue As String
      CellValue = Range(aRefEditText)(1).Value
      getNonNegInt = IIf(Len(CellValue) > 0 And Not CellValue Like “*[!0-9]*”, CellValue < 32767, False)
      If getNonNegInt Then rslt = CellValue
    End Function
  5. @Daniel and Myself,

    Actually, I might consider passing the first argment ByVal and end up not creating that intermediate variable (and saving another line of code)…

    Public Function getNonNegInt(ByVal RefText As String, ByRef rslt As Integer) As Boolean
      RefText = Range(RefText)(1).Value
      getNonNegInt = IIf(Len(RefText) > 0 And Not RefText Like “*[!0-9]*”, RefText < 32767, False)
      If getNonNegInt Then rslt = CellValue
    End Function
  6. Another one

    Public Function getNonNegInt2(aRefEditText As String, _
        ByRef Rslt As Integer) As Boolean
       
        Dim aRng As Range
        Const MaxInt As Integer = 32767
        On Error Resume Next
            Set aRng = Range(aRefEditText)
        On Error GoTo 0
        Select Case True
            Case aRng Is Nothing, aRng.Cells.Count > 1, Not IsNumeric(aRng.Value), _
                CDbl(aRng.Value) < 0, CDbl(aRng.Value) <> CLng(aRng.Value), _
                CLng(aRng.Value) > MaxInt
            Case Else
                Rslt = CInt(aRng.Value)
                getNonNegInt2 = True
        End Select
       
    End Function
  7. @Rick –

    Nice. But won’t your method fail if RefText does not actually describe a valid range?

    Four of the lines in my procedure were there to deal with that possibility. If we are to ignore that, mine could have been:

    Public Function getNonNegInt(aRefEditText As String, ByRef rslt As Integer) As Boolean
        If Application.Evaluate(“(“ & aRefEditText & “>0)*(“ & aRefEditText & “<32767)”) Then
            rslt = aRng: getNonNegInt = True
        End If
    End Function
  8. Damn, damn, damn! Okay, a minor change to my functions is required. The IIf function call fails as written if the referenced cell does not contain a number; and the first assignment will fail if the reference cell contains an error value. Here is modified code for both of my posted functions that corrects these problems (still keeping the same number of code lines in each)…

    First posted version
    ————————————

    Public Function getNonNegInt(aRefEditText As String, _
            ByRef rslt As Integer) As Boolean
      Dim CellValue As String
      CellValue = Range(aRefEditText)(1).Text
      getNonNegInt = IIf(Len(CellValue) > 0 And Not CellValue Like “*[!0-9]*”, Val(CellValue) < 32767, False)
      If getNonNegInt Then rslt = CellValue
    End Function

    Second posted version
    ————————————

    Public Function getNonNegInt(ByVal RefText As String, ByRef rslt As Integer) As Boolean
      RefText = Range(RefText)(1).Text
      getNonNegInt = IIf(Len(RefText) > 0 And Not RefText Like “*[!0-9]*”, Val(RefText) < 32767, False)
      If getNonNegInt Then rslt = RefText
    End Function
  9. And this version forces the use of the first cell:

    Public Function getNonNegInt(aRefEditText As String, ByRef rslt As Integer) As Boolean
        If Application.Evaluate(“(index(” &amp; aRefEditText &amp; “,1,1)&gt;0)*(index(” &amp; aRefEditText &amp; “,1,1)
  10. @Daniel,

    Hmm, I didn’t think about the argument not describing a real range. Okay, I guess that will cost me two additional lines of code…

    First posted version
    —————————

    Public Function getNonNegInt(aRefEditText As String, _
            ByRef rslt As Integer) As Boolean
      Dim CellValue As String
      On Error GoTo Whoops
      CellValue = Range(aRefEditText)(1).Text
      getNonNegInt = IIf(Len(CellValue) > 0 And Not CellValue Like “*[!0-9]*”, Val(CellValue) < 32767, False)
      If getNonNegInt Then rslt = CellValue
    Whoops:
    End Function

    Second posted version
    —————————

    Public Function getNonNegInt(ByVal RefText As String, ByRef rslt As Integer) As Boolean
      On Error GoTo Whoops
      RefText = Range(RefText)(1).Text
      getNonNegInt = IIf(Len(RefText) > 0 And Not RefText Like “*[!0-9]*”, Val(RefText) < 32767, False)
      If getNonNegInt Then rslt = RefText
    Whoops:
    End Function
  11. Try that once more…

    Public Function getNonNegInt(aRefEditText As String, ByRef rslt As Integer) As Boolean
        If Application.Evaluate(“(index(“ & aRefEditText & “,1,1)>0)*(index(“ & aRefEditText & “,1,1)<32767)”) Then
            rslt = Range(aRefEditText)(1): getNonNegInt = True
        End If
    End Function
  12. And with the range error back in play…

    Public Function getNonNegInt(aRefEditText As String, ByRef rslt As Integer) As Boolean
        On Error GoTo Whoops
        If Application.Evaluate(“(index(“ & aRefEditText & “,1,1)>0)*(index(“ & aRefEditText & “,1,1)<32767)”) Then
            rslt = Range(aRefEditText)(1): getNonNegInt = True
        End If
    Whoops:
    End Function

    I would say both our methods are excellent, totally different roads to the same destination…

  13. In case folks missed the tags I used, they did *not* include Excel!

    This post was about programming techniques: bushy trees, reversing tests, null clauses, and empty blocks. Towards that goal I used an *example* of a RefEdit control in the context of Excel. That was an example, not an end on to itself.

    That said, Daniel, your code doesn’t check for an integer. You also have boundary condition errors. Also, over the years, I’ve been burned by Evaluate and use it only as a last resort. I have no idea how well it would fare if the consumer selected refedit fields from multiple workbooks/worksheets.

    Rick, the original example did not allow the customer to specify multiple cells. Selecting just the first might reduce the number of characters a developer has to type but it *ignores* bad input! Also, format a cell with 1, or more, decimals and the code flags an integer value as an error. Finally, there is a boundary condition error.

    I realize there are those who believe size matters, but is it really smaller is better? {vbg}

    There are times when explicitly and methodically testing the input pays off big time. Just for the record, there is a bug in the original code I posted but it was in the original code and I did not bother fixing since it was irrelevant to the concept of this post.

    The advantage of the the original code is that, with very little additional work, it is transportable to other environments for validating non negative integers that fit in an integer data type. It can also be *easily* adapted to include more specific error information. But, most of all, it is concept that matters. Avoid bushy trees, think of the benefit of reversing test conditions, and leverage null clauses.

  14. @Daniel,

    Yes, two totally different roads to reach the same destination. I notice, however, that you joined multiple statements on a single line using the colon operator. If the colon operator is “allowed”, then I can reduce my functon to a two-liner…

    Public Function getNonNegInt(ByVal RefText As String, ByRef rslt As Integer) As Boolean
      On Error GoTo Whoops: RefText = Range(RefText)(1).Text: getNonNegInt = IIf(Val(RefText) > 0 And Not RefText Like “*[!0-9]*”, Val(RefText) < 32767, False)
    Whoops: If getNonNegInt Then rslt = RefText
    End Function
  15. @Daniel… just to note, there was the symbol for “very big grin” at the end of my first sentence, but I forgot about these comment processors and their penchant for “eating” anything between “less than, greater than” symbols, so it did not show up in my posted message. Rest assured, there was DEFINITELY supposed to be a “very big grin” symbol introducing my “two-liner”.

  16. Tushar –

    Thanks for the response. I did indeed forget to check for the integer requirement. Over the years my path has gone opposite yours as I have gone from hating Evaluate to loving it. I use it frequently, especially the shortcut version, [].

    Here is a revised version that I believe accounts for all of the requirements:

    Public Function getNonNegInt3(aRefEditText As String, ByRef rslt As Integer) As Boolean
        Dim aRng As Range
        Dim t As String
        On Error GoTo ErrXIT:
        Set aRng = Range(aRefEditText)
        t = aRefEditText
        If Application.Evaluate(“(“ & t & “>0)*(“ & t & “<32767)” * (Int(” & t & “) = ” & t & “) * (Rows(” & t & “) = 1) * (Columns(” & t & “) = 1)) Then
            rslt = aRng
            getNonNegInt3 = True
        End If
    ErrXIT:
    End Function

    To point to the article tags as not including “Excel” when the article is posted on an Excel blog, and then proceed to use Excel-specific language items, and then not mention that the point was a general purpose routine that could be used or quickly converted to other platforms, is asking the reader to have ESP.

  17. @Tushar,

    I understand your point about your code being for example purposes and your point about lengthy versus brief code, but just in case anyone following this thread is interested, I believe the following revision to my function addresses the problems you found with how my code functioned…

    Public Function getNonNegInt(ByVal RefText As String, ByRef rslt As Integer) As Boolean
      On Error GoTo Whoops
      RefText = Range(RefText).Value
      getNonNegInt = IIf(Len(RefText) > 0 And RefText > 0 And Not RefText Like “*[!0-9]*”, RefText < 32768, False)
    Whoops: If getNonNegInt Then rslt = RefText
    End Function
  18. I would go with a modification of Code Sample 1:

        bln = False

        On Error Resume Next
        Set rng = Range(str)
        On Error GoTo 0

        If Not rng Is Nothing Then _
            If rng.Cells.Count = 1 Then _
                If IsNumeric(rng.Value) Then _
                    If CDbl(rng.Value) >= 0 Then _
                        If CDbl(rng.Value) = CLng(rng.Value) Then _
                            If CLng(rng.Value) <= cMaxInt Then _
                                bln = True
        If bln Then res = CInt(rng.Value)

  19. Tushar, thanks for the interesting post. I’ve used the null clause method before, although not to such an extent. I tend to put a comment in there like “‘continue”, to remember why it’s there. I like Dick’s Select Case True version – it’s easy to read, and would be easy to modify. (I see that syntax is still allowed in VB.Net, although it causes some teeth-gnashing.)

  20. Thinking about this more, I think when you’ve got a bunch of conditions any one of which leads to failure, it’s just a big OR statement, which is why Dick’s Select Case True appeals to me.

  21. Very neat examples and discussion – but I wonder which of these is easiest to maintain, especially for someone else to maintain (for obvious reasons we all know about). And my answer would be Code Sample 2.

    I like Dick’s suggestion, but I still think someone would scratch their head for a while before figuring it out.

    I guess comments could help speed the thinking, but if we are writing comments, then is it worth it – or should we just write the code in a way that would be understandable without the comments (I agree this is not possible in all situations, but in these situations there is a reasonable alternative)

    –Charlie

  22. Can’t use Evaluate if passing the function a textnum like “10? or a contrived textnumarray like “{1,2,3}” should throw an error. That is, if the string argument MUST be a textref, it’s NECESSARY to try to convert it to a range object.

    As for the actual coding, why return success/failure? The same Kernighan & Plauger wrote another book titled Software Tools, which was for a FORTRAN dialect, and a derivative Software Tools in Pascal, which was a screed against the poor design of Pascal’s I/O library which included asinine functions like this where the variable receiving the input was passed as a parameter to the library I/O function. Kinda inconsistent with these authors’ entire work to focus just on the bushy trees and ignore the suboptimal function interface.

    Lack of shortcircuit boolean evaluation in VBA necessitates multiple nested If blocks. Even so this could be cleanly handled by creating the range object, then derefencing it properly.

    Function foo(textref As String) As Long
      Const MAXINT As Long = 100000
      Dim rng As Range, v As Variant

      foo = -1  ‘defaults to error return value

      On Error GoTo ExitFunction
      ‘throws a runtime error when textref isn’t valid range address
     Set rng = Range(textref)

      ‘exits when rng isn’t a single cell
     ‘throws a runtime error when rng isn’t numeric
     If rng.Count = 1 Then v = CDbl(rng.Value2) Else GoTo ExitFunction

      ‘finally just check whole number between 0 and MAXINT
     If v >= 0 And v <= MAXINT And v = Int(v) Then foo = v

    ExitFunction:
    End Function

  23. @fzz –

    Agreed, but Tushar made it clear that the purpose was to validate a cell value – so Evaluate is a good choice.

    That being said, I like your approach and would choose it myself if I were not to use Evaluate.

    Regards,

    Daniel Ferry
    excelhero.com

  24. @Daniel –

    Agreed, validate a cell value. And if the function took a Range object as it’s 1st argument rather than a string textref, there’d be no reason for the initial validation checking that the first argument referred to a range. Evaluate, OTOH, won’t throw an error converting “10? to 10 or “{1,2,3,4}” to an array even though neither string refers to a range. That begs the essential question whether the string 1st argument should be tested to confirm that it refers to a range. If not, Evaluate would be expedient. If so, Evaluate would be inadequate.

    Perhaps worse than the syntax of passing the ‘return value’ as a ByRef parameter is passing the 1st reference as a textref rather than the intended Range object. If the 1st arg should be a range reference, WTH not make the argument a Range object and put that bit of validity checking into the calling procedure? If that’s not allowed or practical for some reason, this is an exquisite example of when encapsulating a type conversion inside another function is warranted.

    Function foo(TextRef As String) As Long
      Const MAXINT As Long = 100000
      Dim rng As Range, v As Variant

      foo = -1  ‘defaults to error return value

      Set rng = textref2range(TextRef)
      If rng Is Nothing Then Exit Function
      If rng.Cells.Count > 1 Then Exit Function

      v = CDbl(rng.Value2)
      If v >= 0 And v <= MAXINT And v = Int(v) Then foo = v

    End Function

    Function textref2range(TextRef As String) As Range
      On Error Resume Next
      Set textref2range = Range(TextRef)
    End Function

    This is closer to the true spirit of the collective writings of messrs Kernighan & Plaugher: multiple return/exit statements for different exceptions are clearer than nested If blocks.

  25. I’m with fzz on this one:

    This is closer to the true spirit of the collective writings of messrs Kernighan & Plaugher: multiple return/exit statements for different exceptions are clearer than nested If blocks.

    I write a lot of code with lengthy validation processes and my approach as always been a sequence of ‘drop-through-or-exit’ tests:

    If Not IsNumeric(Target.Value) Then
        Exit Sub
    End If

    If target.value < 0 Then
        Exit Sub
    End If

    … And so on.

    I do, however, make frequent use of Null ‘IF’ clauses, using a comment to show that I’m doing nothing:

    If Intersect(rng1, rng2) Is Nothing Then
        ‘ No action: range 2 is not in range 1
    Else
        arr1 = rng1.value2
        arr2 = rng2.value2
    Endif

    That being said, Tushar’s sequential ElseIf is functionally equivalent to the ‘drop-through-or-exit’ approach, with the additional benefit that you stay within the function and can handle the ‘rejected’ values with specific error messages.

    A nice example.

    Anyone got a good example out there of short-cutting (or ‘shortcircuiting’) a more complex decison tree in VBA? Constructing a binary token and applying bitwise operations is conceptually simple – we all do it, even if we don’t realise it, when we pass parameters to the msgBox function:

    MsgBox “Do you wish to retry this operation?”, vbExclamation + vbDefaultButton1 + vbRetryCancel, “File loading failed…”

    …However, constructing practical examples is much harder, and I’ve never seen a satisfactory solution to a decision tree with orthogonal tests: a common situation in which VBA coders always end up duplicating code in at least one ‘Else’ clause, no matter how the IF statements are nested.

    Orthogonality… This might not be expressed as clearly as it could be: – Tushar’s example can be thought of as a simple ‘linear’ test set, as we can drop straight through with a conceptual ‘valid=true’ or ‘valid=false’. An ‘orthogonal’ test forces us to ‘branch off’ in another direction with (say) ‘CurrentValue=true’ and ‘CurrentValue=false’, with each of those two choices containing separate ‘valid=true’ and ‘valid=false’ clauses.

    Add in (say) the test ‘ExtendedCharset=true’ and ‘ExtendedCharset=false’, and you’ll find yourself with eight separate cases – but there might sets of answers that are handled the same way (eg:three cases might go to a common error heandler)… But, in a genuinely orthogonal test set, you can only ever get one result set into the same ‘Else’ clause: all the others will end up in clauses containing duplicated code.


Posting code? Use <pre> tags for VBA and <code> tags for inline.

Leave a Reply

Your email address will not be published.