Author Archive

Tell Tale Signs

There are a few programs whose output I can identify immediately. One is Quickbooks. When I get an invoice or a quote from someone using Quickbooks, I know it right away. Today, I was admitted to the Omaha West Rotary, officially making me a Rotarian. They gave me the list of members. Any guesses what software they’re using?

Sorry for the crappy scan. Here’s the footer in case the header didn’t give it away.

Excel Versions

I currently have Excel 2003 and Excel 2007 installed. I’m pretty happy with this arrangement from an Excel standpoint, but Word and Access are always trying to install themselves. Happy or not, I’m in a situation where I need to install Excel 2010. I don’t know what to do.

What I want to do is delete everything and install only 2010. But sometimes I need 2003 for support reasons. So maybe I have 2010 and 2003.

I haven’t totally embraced the Virtual Machine paradigm for production work. I use it for beta testing or for specific OS/Excel needs (like when I’m editing a book), but I don’t use it for every day client work. I know some people do and I’m genuinely impressed by them. When Windows or Office has an update, do they go into all their VMs and run update? That sounds almost as fun as poking myself in the eye.

So what should I do? Your comments are always appreciated.

Update

This isn’t really an update, but is a question that is somewhat related. My laptop is nearing the end of its life. What if…stay with me here…I replaced it with a Macbook Air? I only use it about 20% of the time and mostly while traveling. I wouldn’t mind having a Mac for Rails stuff, although I have it working under Windows pretty well now. I’ve heard many times, and twice in the last week, someone say something like “I bought a Mac in 2006 and never looked back.” It makes me we want to see if I’m missing anything. However, my first love is Excel, so if it doesn’t do that well, I’m not interested. I expect an opinion or two on this.

Excel Dev Con Live Blogging

Ross is live blogging the Excel Dev Conference in the UK. It’s already after lunch over there, but if you’re interested, check it out.

Messy Donut Junk

I’ve been searching for a new search engine in the wake of Google’s ‘plus’ enhancements. I’ve tried blekko, ixquick, and I’m currently using DuckDuckGo. They’re all lacking, but I’ll let you know if I settle on one. In the meantime, I happened on kleemi and tried it out. I was startled that half the page was a donut chart, so I couldn’t resist trying these search terms.

Fixing Exported Dates

If you export a Profit and Loss statement from Quickbooks to a CSV file, you get dates that look like this:

Excel turns them into this.

January 2005 becomes January 5th of the current year. February 2005 becomes February 5th. That’s no good. In a spare row, I used this formula

=DATE(DAY(A1)+2000,MONTH(A1)+1,0)

to create the date I wanted. Then I copied and pasted special – values over the top of the original dates. It adds 2000 the day to get the correct year. Adding 1 to the month and taking the zeroth day will get you the last day of the month.

January 5, 2012 becomes January 31, 2005.

Finding Almost This Friday

I have a utility that determines the date of the upcoming Friday. It used to look like this

dtReturn = Date + 8 - Weekday(Date, vbFriday)

It has served me well for many years. Until this year. This year, the US celebrated Veterans Day on November 11. While I’m work, the bankers are all at home rolling around in their money. Since my bank would not process any ACH transactions that day, I paid my employees the preceding Thursday. But my function didn’t say “return this Friday unless it happens to be a banking holiday” but now it does.

Public Function ThisFriday() As Date
   
    Dim dtReturn As Date
    Dim i As Long
   
    'if i'm testing, hardcode the date
   If Len(Dir(gsDEBUGDATE)) > 0 Then
        Stop
        dtReturn = #2/18/2011#
    Else
        'return the next friday
       dtReturn = Date + 8 - Weekday(Date, vbFriday)
       
        'if there are no checks on that date, look for
       'a nearby date with checks
       If Not gclsEmployees Is Nothing Then
            If Not gclsEmployees.HasChecks(dtReturn) Then
                For i = 1 To 7
                    If gclsEmployees.HasChecks(dtReturn + i) Then
                        dtReturn = dtReturn + i
                        Exit For
                    ElseIf gclsEmployees.HasChecks(dtReturn - i) Then
                        dtReturn = dtReturn - i
                        Exit For
                    End If
                Next i
            End If
        End If
    End If
   
    ThisFriday = dtReturn
   
End Function

And the HasChecks property is just a loop

Public Property Get HasChecks(dtCheck As Date) As Boolean
   
    Dim bReturn As Boolean
    Dim clsEmployee As CEmployee
   
    bReturn = False
   
    For Each clsEmployee In Me
        If Not clsEmployee.CheckByDate(dtCheck) Is Nothing Then
            bReturn = True
            Exit For
        End If
    Next clsEmployee
   
    HasChecks = bReturn
   
End Property

Man I hate when real life gets in the way of good code. I need to get rid of that arrow antipattern, but otherwise I’d say this is solved for all time. Jinx.

Procedure Flow and Raising Errors

I have a pretty simple procedure that I run a few times every other week. Someone emails their timesheet to me, I open the timesheet in Excel, and I want a quick way to save it to the proper directory. GMail stores the timesheet in my Temp directory and navigating through the file structure is a pain. I wrote this procedure and it includes the following conditions:

  • Is there an active workbook
  • Is that workbook a timesheet
  • Has the timesheet already been saved
  • Does the path exist

That results in four levels of nested ‘If’ statements. I like nested statements. I think it makes the code very readable and easy to follow.

Public Sub SaveDownloadedTimesheet()
   
    Dim wb As Workbook
    Dim dtThisFriday As Date
    Dim sNewPath As String
    Dim sNewName As String
   
    Const sPAYROLLPATH As String = "\\Server\Share\Accounting\Payroll\Processing\"
    Const sCOPY As String = "Copy of "
       
    On Error Resume Next
        Set wb = ActiveWorkbook
    On Error GoTo 0
   
    If Not wb Is Nothing Then
        If IsTimesheet(wb) Then
            If IsTemp(wb) Then
                dtThisFriday = ThisFriday
                sNewPath = sPAYROLLPATH & Format(dtThisFriday, "mmdd") & "\"
               
                If Len(Dir(sNewPath, vbDirectory)) > 0 Then
                    sNewName = Replace(wb.Name, sCOPY, "", 1, 1)
                    wb.SaveAs sNewPath & sNewName
                    wb.Close False
                End If
            End If
        End If
    End If
   
End Sub

The code works great except when it doesn’t. When there’s a problem, I don’t get any kind of message telling which one of my conditionals failed. It’s usually that the path doesn’t exist, but today I was absolutely freaking positive that the path existed and it still wasn’t saving. Of course I was wrong, but that’s not the point.

One solution is to add an Else clause to each If statement with a message box. There’s nothing wrong with that solution, but I don’t find it aesthetically pleasing. Did I just say “aesthetically pleasing”? My goodness. It adds eight lines of code and, to me, ruins the flow of the program.

An alternative is to raise errors. That might look like this:

Public Sub SaveDownloadedTimesheet()
   
    Dim wb As Workbook
    Dim dtThisFriday As Date
    Dim sNewPath As String
    Dim sNewName As String
   
    Const sPAYROLLPATH As String = "\\Server\Share\Accounting\Payroll\Processing\"
    Const sCOPY As String = "Copy of "
       
    On Error GoTo ErrHandler
   
    On Error Resume Next
        Set wb = ActiveWorkbook
    On Error GoTo ErrHandler
   
    If wb Is Nothing Then Err.Raise 9999, , "No active workbook"
    If Not IsTimesheet(wb) Then Err.Raise 9999, , "Active workbook is not a timesheet"
    If Not IsTemp(wb) Then Err.Raise 9999, , "Timesheet already saved"
   
    dtThisFriday = ThisFriday
    sNewPath = sPAYROLLPATH & Format(dtThisFriday, "mmdd") & "\"
   
    If Len(Dir(sNewPath, vbDirectory)) = 0 Then Err.Raise 9999, , "Can't find folder for " & Format(dtThisFriday, "mmdd")
   
    sNewName = Replace(wb.Name, sCOPY, "", 1, 1)
    wb.SaveAs sNewPath & sNewName
    wb.Close False
   
ErrExit:
    Exit Sub
   
ErrHandler:
    MsgBox Err.Description
    Stop
    Resume
   
End Sub

I don’t like that as much as nicely nested Ifs, but I like it better than a bunch of Else clauses. Most of my conditions are nicely grouped, so it’s not difficult to see what I’m checking. And it tells me the problem, so that’s nice. Which do you prefer? Do you use a different method?

Get a Table from a Web Page with an XML Request

I’m tired of trying to remember the library names and the syntax for doing this. I’m putting here so when I search for XMLRequest, Web Request, HTTPRequest, I can find it.

Sub GetData()

    Dim oHttp As MSXML2.XMLHTTP
    Dim sHtml As String
    Dim hDoc As HTMLDocument
    Dim hTable As HTMLTable
    Dim hRow As HTMLTableRow
    Dim hCell As HTMLTableCell
    Dim rStart As Range
           
    Const sURL As String = "http://www.contextures.com/xlsampledata01.html"
   
    Set oHttp = New MSXML2.XMLHTTP
    Set hDoc = New HTMLDocument
    Set rStart = Sheet1.Range("A1")
   
    'Send the web request
   oHttp.Open "GET", sURL
    oHttp.send
   
    'Give it enough time to process
   Do
        DoEvents
    Loop Until oHttp.readyState = 4
   
    'put the web page into an HTML Document
   hDoc.body.innerHTML = oHttp.responseText
   
    'Find the right table and write it to a sheet
   For Each hTable In hDoc.all.tags("TABLE")
        If hTable.Rows(0).Cells(0).innerText = "OrderDate" Then
            For Each hRow In hTable.Rows
                For Each hCell In hRow.Cells
                    rStart.Offset(hRow.RowIndex, hCell.cellIndex).Value = hCell.innerText
                Next hCell
            Next hRow
        End If
    Next hTable
   
End Sub

It sends a request to the sample data page over at Contextures, jams that into an HTML Doc, then writes the table out to a sheet.

UK Excel User Conference

If you’re in or around London in late January, don’t miss the UK Excel User Conference. It’s Wednesday, January 25th at Skills Matter
Big Ben, Westminster
Here’s the agenda.

Check out this gem from the agenda:

15:30 – 16:30
VBA to C : Pratfalls and Perils
- Stories based on a c++ newby’s efforts to convert 10K lines of VBA UDFs to
C++ XLLs.

If that doesn’t sound interesting to you, you’re just not trying.

Pay the fee here and somebody live blog the event. I can’t make it because I’ll be in Cancun that week but I’ll still be thinking about VBA.

A Case for Value as a Default Property

Charles Williams had an interesting post1 the other day about the Text, Value, and Value2 properties of the Range object. In it, he wrote:

So .Value2 really should be the default, and is definitely the one to use 99% of the time.

I disagree. I never use Value2.

The Value2 property returns a Double data type (64-bit floating point). The Value property returns the same thing except when the cell is formatted as Date or Currency. When it’s currency, Value returns a Currency data type (64-bit scaled integer), and when it’s a date, Value returns a Date date type (another 64-bit floating point). Who cares (besides Charles and me) and what does it mean?

Let’s imagine that we’re writing our own spreadsheet program to get a better understanding of what Excel is doing. I don’t actually know what Excel does under the hood, so this is all conjecture. In our spreadsheet application that we’re writing from scratch, we’ll store all numbers in memory as IEEE double precision floating point number, or Doubles. There is a case when the user may want more calculation accuracy than floating point allows and is willing to sacrifice a little precision (I’m looking at you fellow accountants). To offer that feature, we’ll allow the user to identify certain numbers as scaled integers with 15 digits to the left of the decimal and four digits to the right. Let’s call that data type Currency. We’re still going to store the value as a Double, but we’ll do the conversion prior to any calculations to ensure the accuracy.

As spreadsheet writers, we have a decision to make. I said we’d allow the user to identify certain numbers as Currency, but how exactly do they identify it. We could add a property called DataType to the Range class. OK, but how does the user set that property? Most Range properties are set in the Format Cells dialog box. We have a few tabs on that dialog already, so let’s add another one called Data Type. The user can set the data type and we’ll do the floating point to scaled integer conversion when the DataType property is set to Currency. That’s when the usability people come in.

The usability people begin by hurling epithets at us regarding our lack of sex lives and penchant for role playing games. They say that normal people (i.e. non-programmers) don’t have any idea what a “data type” is and if we make them learn, our sales will go down by 14.1% (a totally made up number because marketing people can’t do math). Surely, they go on, there is a better way to identify Currency values. Then they get out the corporate programming guidelines and remind us that no dialog box can have more than six tabs. We already have six tabs on the Format Cells dialog, so we can’t add another (Tools – Options hasn’t been invented yet).

Time to compromise. One of our bright, young interns suggests that we make Currency a format. As programmers, we are incensed. We already have an Accounting format and a Currency format will be confusing. And besides, Currency is a data type, not a format. Due to our lack of persuasiveness or the caffeine withdrawals, we agree to add it is a format. Because of the Currency debacle we go from two levels to three levels for data display. The bottom level is the raw Double that’s stored in memory, just like before. The top level is the text representation of the number with all the commas and periods and such, just like before. We need to add a pre-processor level. At this level, we’ll check to see if the Range object has it’s NumberFormat property set to Currency and, if so, we’ll convert from floating point to scaled integer. Once converted, we’ll send to the formatting layer to add the text goodies.

The deed is done and we all feel a little dirty. We code the changes and send them over to the VBA folks. The guy building the object model already has a Value property and a Text property. Now that we’ve added another layer, he doesn’t know which value to return for the Value property. There’s really not much debate – it has to be the value that comes from the pre-processor. If people “format” a value as Currency, they’re going to expect a scaled integer from the Value property even if they don’t know what the hell a scaled integer is. Object model guy then asks if he should expose the raw value. Of course. We’re all about exposing ourselves. We call in the dullest intern on staff to come up with the name and, true to form, he produces Value2. Positively inspired.

And scene.

You may disagree with some of the decision made in the above dramatic reenactment, but you have to realize they’re all connected. I personally disagree with calling Currency a format. But whether we call it format or a data type, doesn’t change the fact that we have to convert it. You can disagree with the decision to store all numbers as Doubles, and just store Currency as a scaled integer in memory. That doesn’t really change how the user identifies Currency values and you can’t simply store everything as currency because then everything would be slow.

Back to my assertion that Value is the appropriate default (or, put another way, that Value points to the right layer). I contend that if you “format” something as Currency or Date, that you do so knowingly and for a specific purpose. One of the knocks on Value is that you lose precision for Currency because Currency only goes four digits out to the right. I say you don’t lose precision. The precision that’s there is a myth. It’s only there because we had to convert the number to a float. When we chose Currency, we consciously forfeited all rights to precision beyond the 10,000th place (or should have). The fact that we can convert that number to a Double and see what looks like precision, doesn’t make it there.

For the Date type, I have a different argument. It’s not that there’s any problem with the data manipulation when converting from Double to Date, it’s just that it takes longer. If I’m reading in a date and spitting it back out, I agree that Value2 is probably the better choice. However, if that’s all I’m really doing I might want to address what value I’m adding. In most cases, I’m manipulating the date in such a way that I care that it’s a date. What I don’t want to do is read in a double using Value2 (avoiding the conversion overhead), then have to convert that number to a date myself to manipulate it, then convert it back to a double. In almost all cases, I want a Date or Currency typed number when I “format” it that way.

I suspect that most people who prefer Value2, in fact, disagree with the decision to confuse formatting and data types in the user interface. And if so, then we agree on that. Let me see if I can reword part of Charles’ statement so that we both agree with it: When performance matters, Value2 should be used 100% of the time there are no dates or currency, and Value should be used 100% of the time there are. I wonder if that will fly.

Footnotes
1In fact, all of Charles’ posts are interesting, so if you’re not subscribed to his RSS feed2 yet, get on it.
2If you’re still not reading blogs via RSS, be sure to say hello to my mom at the next Rotary meeting.