Archive for the ‘VBA Advanced’ Category.

Rows and Columns

gullible.info sez:

An average Microsoft Excel spreadsheet document has 1,102 rows and 18.2 columns.

I say:

That's 470 rows and 25 columns. I had 408 sheets whose last cell was A1, i.e. blank. I attribute that to two things: I probably have quite a few workbooks that are just code; Most of the workbooks I get from other people have a Sheet2 and a Sheet3 that are unused. The default number of sheets for a new workbook is three.

I also had 15 sheets whose last cell is IV65536, which is clearly wrong. That's a typical problem with using the SpecialCells(xlCellTypeLastCell) method, but I'll just exclude those from the average.

If I don't count the empty sheets and the "full" sheets, I get:

Mean: 103 rows and 28 columns
Median: 58 rows and 11 columns
Mode: 59 rows and 11 columns

That's over about 2,000 files. It's not all the Excel files I have, it's just all of them in the MyDocuments folder. If you'd like to see your average, I've posted the code I used below. It takes a few minutes to run and I had to click a few dialog boxes that asked me to edit links or start an external application.

Sub LastCells()
   
    Dim sro As Scripting.FileSystemObject
    Dim srFolder As Scripting.Folder
   
    Application.Calculation = xlCalculationManual
    Application.EnableEvents = False
   
    Set sro = New Scripting.FileSystemObject
   
    Set srFolder = sro.GetFolder("C:\Documents and Settings\dk\My Documents\")
   
    GetLastCells srFolder
   
    Application.Calculation = xlCalculationAutomatic
    Application.EnableEvents = True
   
End Sub
 
Sub GetLastCells(srFolder As Scripting.Folder)
   
    Dim srFile As Scripting.File
    Dim srSubFolder As Scripting.Folder
    Dim wb As Workbook, sh As Worksheet, rLast As Range
   
    For Each srFile In srFolder.Files
        If srFile.Type = "Microsoft Excel Worksheet" Then
            Set wb = Workbooks.Open(srFile.Path)
            For Each sh In wb.Worksheets
                If Not sh.ProtectContents Then
                    Set rLast = sh.Cells.SpecialCells(xlCellTypeLastCell)
                    With ThisWorkbook.Sheets(1).Range("A65536").End(xlUp)
                        .Offset(1, 0).Value = wb.FullName
                        .Offset(1, 1).Value = rLast.Address
                        .Offset(1, 2).Value = rLast.Row
                        .Offset(1, 3).Value = rLast.Column
                    End With
                End If
            Next sh
            wb.Close False
        End If
    Next srFile
   
    For Each srSubFolder In srFolder.SubFolders
        GetLastCells srSubFolder
    Next srSubFolder
       
End Sub

Terminating Dependent Classes

disco chick
I have four objects that are all dependent on each other. They are CInvoice has many CInvoiceLines, CInvoiceLine has many CRoyaltyLines, CRoyaltyLine has many CRoyaltyAdjustments. In order to properly terminate all of these class modules, I need to remove any and all dependencies. If I don't, the classes will stay alive and consume memory. With four levels like this eating memory, I will eventually have to answer the question: Do I want to send an error report?

I think I have my termination sequences right, but I'll let you be the judge in the comments. I start in the CInvoice class, which I'll call a Parent class. It has many children, including the CInvoiceLines class (a child collection) and each individual CInvoiceLine class (a child class). The CInvoiceLine class then becomes a parent class for my next round of terminations. His children are the CRoyaltyLines class and each instance of the CRoyaltyLine class.

All of my Terminate events are structured like

Public Sub Terminate()
 
End Sub

I don't use the built-in class terminate event because it doesn't fire at the right time. When I'm in a parent class, I do these three things:

  1. Call the Terminate method of the Child Collection Class
  2. Set the local Child Collection Class variable to Nothing
  3. Set the local Parent variable to Nothing

Items 1 and 2 are done in a parent class that is not also a child class. Item 3 is done in a child class that is not also a parent class. All three items are done in a class that is both a parent and a child. For instance, CInoviceLine is a parent with respect to CRoyaltyLines, but a child with respect to CInvoice, so all three steps must occur.

When I'm in a Child Collection Class (ex: CInvoiceLines), I do these three things:

  1. Call the Terminate method for each member of the collection
  2. Set the local Parent variable to Nothing
  3. Set the local Collection variable to Nothing

When item 1 is executed, it may be terminating a class that's a parent class and the whole things starts over again. Here's an example: In the CInvoice class

Public Sub Terminate()
   
    mobjLines.Terminate  'Term the child collection class
    Set mobjLines = Nothing 'term the local ccc variable
 
End Sub

Since mobjLines is a CInvoiceLines object, that Terminate method gets called first. In the CInvoiceLines class

Public Sub Terminate()
   
    Dim i As Long
   
    'Terminate each member
    For i = 1 To mcolLines.Count
        mcolLines.Item(i).Terminate
    Next i
       
    Set mobjParent = Nothing 'kill the parent variable
    Set mcolLines = Nothing 'kill the collection variable
   
End Sub

When mcolLines.Item(i).Terminate is called, it's calling the Terminate method of a class that's both a child and a parent. In CInvoiceLine

Public Sub Terminate()
   
    mobjRoyaltyLines.Terminate 'Term the child collection class
    Set mobjParent = Nothing    'kill the parent variable
    Set mobjRoyaltyLines = Nothing   'kill the ccc variable
   
End Sub

The only difference between this and CInvoice is that I killed the parent variable because it's also a child class. I won't go through the rest, except to show you the last class, CRoyaltyAdjustment

Public Sub Terminate()
   
    Set mobjParent = Nothing
   
End Sub

Since this is a child class, but not a parent class, only killing the local parent variable is necessary.

Boy, there's nothing more thrilling than terminating classes, is there? That's why I added the image - to spice it up a little. For a little background, I'm abstracting my relational database into objects in VBA. That way I can reference clsInvoice.Lines(1).RoyaltyLines(2).Adjustments(3).Amount, which I contend is easier to code and read. But the setup is a real pain.

VBA Code Documenting Tools: Project Analyzer and Visustin

Every VBA developer (which is anyone who knows how to get into the VBE I guess) develops his or her own programming habits: sparse commenting or elaborate commenting, naming convention (or no declaration at all), code indenting, preference for certain structures and methods over others. You name it. Many books have been written about this, of which I find "Code Complete" is a very good one.

Although I think I write pretty readable code, I do have one bad habit: I don't really document what I have done and I tend to "forget" to write comments.

Sometimes a customer wants elaborate documentation of the code. But of course they forgot to ask up front...
So here I am looking at this 10.000 lines-of-code VBA project and a request to produce tech documents on what the code does and how it is structured. Including flow diagrams (preferably in Visio), call trees, the works.

At first I estimated I'd need as much as maybe half the amount of time I originally used up to write the code itself. Which was significant of course. Let's say over a full week.

Like with any task that I find tedious: I look for a way to automate, so I dive into a search quest with Google.

Typical search strings: VBA code documenting, Document Code, Create Dependency tree, Show call stack,....

Well, I found this site.

Both their Project analyzer tool and Visustin looked like they might be a solution to my problem. But they're both rather expensive (I'm Dutch, remember?). I calculated I'd have to invest about € 1000 to cover for these two tools in the versions I think I'd need.

Luckily Aivosto granted me a time-limited full version of the Pro version of both tools so I could thoroughly test them (and to be frank I also promised to write up my experiences).

Here they are then.

I opened Project Analyzer and since I also installed the Office VBA plug, the File menu shows "Analyse Office VBA..." as one of the options. You point it to your file and it happily imports the entire VBA Project (if you have "Allow access to VBA project" set, of course). So far so good!

Take a look at the Report Menu. It shows a myriad of reporting tools. Very impressive!

projanal01.gif

I pick the "Problem Report" and it shows me lots of useful (and maybe even embarrassing) stuff, indicating line numbers and of course the routines and module, like:

Too many parameters: WriteName2sheet
194 Consider short circuit with nested Ifs
432 Unicode function is faster: ChrW$

Function without type specification
692 Too many uncommented lines: 81 (ouch)
Dead procedure

And lots of other useful stuff. Didn't know I produced such a load of rubbish :-).

OK, let's try something else:Procedure call tree. Wow. Everything's there:

projanal02.gif

Then I tried the graphical version of the call tree:

projanal03.gif

(Yes I blurred this one on purpose).

So far so good. I won't bother you all with the dozens of other reports I tried and used. I got more impressed every minute I can tell you!

...Lots of copying and pasting from Project Analyzer to Word followed...

Now let's have a look at Visustin. Ever needed to create a flow diagram? Well I haven't, because I tend to just dive in (I know, bad habit...).

Have a look at this procedure:

Option Explicit

Sub GetFilesInDirectory(ByVal sDirToSearch As String, colFoundFiles As Collection)
'-------------------------------------------------------------------------
' Procedure : GetFilesInDirectory Created by Jan Karel Pieterse
' Company   : JKP Application Development Services (c) 2006
' Author    : Jan Karel Pieterse
' Created   : 04-10-2007
' Purpose   : Retrieves all files in sDirToSearch, stacks matches into cLookForFIles
'-------------------------------------------------------------------------
    Dim NextFile As String
    Dim lCount As Long
    Dim sFileName As String
    Dim sFileSpec As String
    Dim lFoundMatches As Long
    Dim oCtlNew As CommandBarButton
    Application.EnableCancelKey = xlErrorHandler
    If Right(sDirToSearch, 1) <> "\" Then
        sDirToSearch = sDirToSearch & "\"
    End If
    NextFile = Dir(sDirToSearch & "*.xls")
    Do Until NextFile = ""
        If Err.Number = 0 Then
            If TypeName(oObj2Add2) Like "Command*" Then
                Set oCtlNew = oObj2Add2.Controls.Add(msoControlButton, , , , True)
                oCtlNew.Caption = NextFile
                oCtlNew.OnAction = "OpenFileFromMenu"
                oCtlNew.Tag = sDirToSearch & NextFile
            Else
                AddFile2Wizard oObj2Add2, NextFile, sDirToSearch
            End If
        End If

        NextFile = Dir()
    Loop
    On Error GoTo 0
TidyUp:
    Exit Sub
End Sub

So now what? Well, copy, paste and hit F5. You get this:

projanal04.gif

WOW! (also proves commenting is useful...)

So next I found myself in the process of alt-tab to the VBE, select code, control-c, alt-tab to Word, paste code, alt-tab to Visustin, control-v, F5 (builds this chart), control-c, alt-tab back to Word, paste the diagram, .....

And the fun part was creating the Visio diagrams. They didn't turn off screenupdating and I can tell you it is great fun seeing this program spitting out these (for me) complex diagrams in seconds, which would have taken me hours and hours...

All in all, producing the entire document set took me about half a day. Man, this tool cost me money! (but I gained quite a happy customer).

And to think that the enterprise version of Project Analyzer comes with macros...

Tell me what you think and what your experiences are! Have you got similar experiences, or completely different,...
Share them here!

Regards,

Jan Karel Pieterse
www.jkp-ads.com

Performance Monitor

Professional Excel Development

Professional Excel Development has a chapter on optimization that discusses the PerfMon utility (available on the companion CD). I used it for the first time on a 40 second process and I thought I would share the results. Thrilling, I know.

One hundred fifty thousand calls to class properties? Yikes! Noting that FillFinals was the biggest culprit, I manually added some perfmon calls inside that procedure to see what I could see.

Inserting the final reports consists of adding sheets to the final report workbooks, among other things. In this case it adds nine sheets to six different workbooks. I decided to break up that block of code even further. Specifically, I wanted to isolate the Sheets.Add line.

I guess adding sheets takes a lot of time. Maybe I should create a report with some 'final reports' already in it so I can limit the amount of sheets that I have to add. Of course I'll have to delete extraneous sheets, so I'll have to weigh the costs of that. Well, nothing earth shattering here. It was just the first time I used it on a real program and it was fun.

A couple of bugs I noted in the utility:
It puts PerfMonProcEnd statements before any Exit Sub statements, but when it deletes them it doesn't respect my original tabbing.
My manual lines looked like PerfMonProcEnd "FireAssay.MProcess.FillFinals.HeaderData". I don't think I was supposed to put a period after FillFinals (the procedure name) because the output file added another column. That's OK, but it didn't adjust the headers. In the screen shots above, I manually adjusted the headers and added a Section header. It's probably user error rather than a bug.

Preventing Event Conflicts

I have two custom add-ins loaded that use application-level events. I only want those events to run when a workbook associated with my application is active. Otherwise, the events in my purchase order application will try and do stuff to my invoices and vice versa. Not good.

For every application, the first thing I do is test to make sure I'm dealing with an appropriate object; be it a sheet, workbook, or whatever. I use two utilities to do the testing. The first utility verifies an open workbook and the second verifies a closed one.

As described in Custom Document Properties, I use custom document properties in my templates to identify the workbook as being part of the application. The utilities check the property and return True if it's there.

Function IsOpenInvoice(ByRef Wb As Workbook, _
    Optional ByVal sProperty As String = gsCDPINVAPP) As Boolean
   
    Dim bTemp As Boolean
   
    On Error Resume Next
        bTemp = Wb.CustomDocumentProperties(sProperty).Value
    On Error GoTo 0
   
    IsOpenInvoice = bTemp
   
End Function
 
Function IsInvoice(ByVal sName As String, _
    Optional ByVal sProperty As String = gsCDPINVAPP) As Boolean
       
    With Application.FileSearch
        .NewSearch
        .FileType = msoFileTypeAllFiles
        .Filename = Dir(sName)
        If Not sName = Dir(sName) Then
            .LookIn = Replace(sName, Dir(sName), "")
        End If
        .PropertyTests.Add sProperty, msoConditionIsYes
       
        .Execute
       
        IsInvoice = .FoundFiles.Count> 0
    End With
   
End Function

In IsOpenInvoice, the property value from the supplied workbook is set to a Boolean variable. If the property value is False or if the property doesn't exist, the variable is False. Of course the property value will never be false. I created the properties in the template and set the value to True. It will either exist or not.

For closed workbooks, IsInvoice uses FileSearch to find a file with the proper name and with the correct property. The argument sName is the full path and name of the file. If the file exists in the directory and has the property, True is returned. This is typically used before a file is opened, such as after GetOpenFilename is used but before the file is actually opened.

Both functions have an optional second argument, sProperty, that defaults to the property name that identifies the application in general. I also use these utilities to identify particular templates within my app. I may need to know, for instance, if the open workbook is an invoice, a sales order, or a report in particular, rather than just part of my app in general. I have constants set up for each template type that I want to test.

Here's an example of an application level event in a custom class module that uses IsOpenInvoice:

Private Sub mApp_SheetChange(ByVal Sh As Object, ByVal Target As Range)
   
    If IsOpenInvoice(Sh.Parent, gsINVOICE) Then
        'Do stuff
    End If
   
End Sub

Having a number and a formula “co-exist” in a cell

On an ongoing project, the client uses the TM1 OLAP system. One of the interesting things I noticed was this:

A user can "slice" data from the database into Excel. The result is a new worksheet where the appropriate cell contains a formula such as =DBRW(...), which essentially looks up the TM1 database for the current value that corresponds to the specified parameters.

So far so good. But, here's the twist. One can enter a new value into the cell containing the formula. TM1 will update the OLAP database with this new value and restore the formula.

As soon as I saw what was happening I guessed how it was done. Here's the implementation of a proof-of-concept. Of course, as a proof-of-concept there are a lot of safeties, performance issues, and other niceties that are missing.

An obvious requirement is that that one must have a secondary data storage since it is impossible for a value and a formula to actually co-exist in a cell. So, that requires a backend database to store the actual value and I decided to use MS Access to create one.

The database had a single table with 3 columns: Col1, Col2, DataVal. For those who want to know how this maps to an OLAP system, think of Col1 and Col2 as dimensions in a TM1 OLAP system and DataVal as the value at the intersection of specific elements in those dimensions.

That led to the infrastructure to access data in the database. In a standard module:

Option Explicit
Dim Cn As ADODB.Connection
    Dim aRSTable1 As ADODB.Recordset
Function initializeADO(DataSrcName As String) As ADODB.Connection
        Dim Cn As ADODB.Connection
        Set Cn = New ADODB.Connection
        With Cn
            .Provider = "Microsoft.Jet.OLEDB.4.0"
            .ConnectionString = DataSrcName
            .Open
            End With
        Set initializeADO = Cn
    End Function
Sub openADO()
    Set Cn = initializeADO("C:\Documents and Settings\Owner\My Documents\testADO\db1.mdb")
    End Sub
Public Function DBVal(Table, Col1, Col2)
    If aRSTable1 Is Nothing Then Set aRSTable1 = New ADODB.Recordset
    If Cn Is Nothing Then openADO
    On Error Resume Next: aRSTable1.Close: On Error GoTo 0
    aRSTable1.Open "SELECT DataVal FROM " & Table _
        & " WHERE Col1='" & Col1 & "' AND Col2='" & Col2 & "'", Cn
    DBVal = aRSTable1.Fields("DataVal").Value
    End Function

OK, nothing unusual about the above. It's standard stuff to write a User Defined Function that retrieves data from an external database. Again, remember this is proof-of-concept code and leaves out a lot of safeties and performance effectiveness issues.

This is used in a worksheet cell as =DBVal(A3,B3,C3) where A3 contains the Access table being queried, and B3 and C3 the values for the 2 columns Col1 and Col2 respectively. In TM1 parlance, this would correspond to the cube name, and the two elements of the 2 dimensions in the cube.

Next, the infrastructure to allow a new value to update the database.

First, a worksheet event procedure that keeps track of the existing formula. Note that I would never deploy an event procedure in a worksheet code module, but it works well to test concepts.

Option Explicit
Dim CellFormula As String

Private Sub Worksheet_SelectionChange(ByVal Target As Range)
    If Target.Cells.Count <> 1 Then Exit Sub
    If Not Target.HasFormula Then CellFormula = "": Exit Sub
    CellFormula = Target.Formula
    End Sub

Next, an event procedure that responds to a new value entered by the user.

Private Sub Worksheet_Change(ByVal Target As Range)

    If Target.Cells.Count <> 1 Then Exit Sub
    If Target.HasFormula Then Exit Sub
    If CellFormula = "" Then Exit Sub
   
   
    Dim NewVal
    NewVal = Target.Value
    updateDB CellFormula, NewVal
   
    On Error Resume Next
    Application.EnableEvents = False
    Target.Formula = CellFormula
    Application.EnableEvents = True
    On Error GoTo 0
   
    End Sub

The SelectionChange event procedure above saves the cell formula, if it has one, at the time the user selects the cell. Then, if the user enter a value, the Change event procedure uses the new value to update the database through UpdateDB and then restores the formula saved by the SelectionChange procedure.

The corresponding code for the updateDB subroutine in the standard code module:

Sub updateDB(CellFormula As String, NewVal)
    If Cn Is Nothing Then openADO
    Dim Params
    Params = Split(CellFormula, "(")
    Params = Split(Left(Params(1), Len(Params(1)) - 1), ",")
    Cn.Execute "UPDATE " & Range(Params(0)).Value & " SET DataVal=" & NewVal _
        & " WHERE Col1='" & Range(Params(1)).Value & "' AND Col2='" & Range(Params(2)).Value & "'"
    End Sub

And, for completeness, code, in the standard module, to close the database connection:

Sub closeADO()
    On Error Resume Next
    aRSTable1.Close
    Set aRSTable1 = Nothing
    Cn.Close
    Set Cn = Nothing
    End Sub

To use the above, in the Access database, a table named Table1. I had 4 records in it corresponding to values of "a" and "b" for Col1 and Col2. The associated DataVal values were 1, 2, 3, and 4, respectively.

In the Excel worksheet, Cell A3 had the value "Table1" and in B3 and C3 one could enter the values corresponding to Col1 and Col2 (i.e., either "a" or "b"). The =DBVal() formula then retrieved the corresponding value from the database.

Now, enter a new value in the cell that contains the =DBVal() formula. The code will update the database with the new value and restore the =DBVal() formula. This will show the new value in the cell.