A ‘Price is Right’ Algorithm

You’ve probably watched The Price is Right TV game show. On it they run the Clock Game.

The game is played for two prizes. The actual price of the first prize is shown to the studio and home audiences. After the contestant gives their first bid, a 30 second clock is started and the host tells the contestant whether the actual price is higher or lower than the bid. The contestant continues to bid, responding to the host’s clues, until either the contestant wins by correctly guessing the price of the prize or the time expires. If time remains after the first prize is won, the process is repeated for the second prize.…With few exceptions, only prizes valued below $1,000 have traditionally been offered in the Clock Game.

The algorithm to use is simple:

  • Pick a ceiling ($1000) and a floor ($0)
  • Bid the average ($500)
  • If the bid is too low, the bid becomes the floor
  • If the bid is too high, the bid becomes the ceiling
  • Repeat

Being computer types, we’ll put the ceiling at $1024. For a $407-priced prize, your guesses would look like this:

  C D E F
1   407 <-- Price  
2 Floor BID   Ceiling
3 0 512 Too High 1024
4 0 256 Too Low 512
5 256 384 Too Low 512
6 384 448 Too High 512
7 384 416 Too High 448
8 384 400 Too Low 416
9 400 408 Too High 416
10 400 404 Too Low 408
11 404 406 Too Low 408
12 406 407 Stop 408

 
Ten guesses, and you’ve won a washing machine. Here are the formulas that make this work.

  • D1: = 407 (the unknown price)
  • C3: = 0 (the initial floor)
  • F3: = 1024 (the initial ceiling)
  • D3: = (C3+F3)/2 (512—the initial bid)
  • E3: = IF(D3=$D$1,”Stop”,IF(D3>$D$1,”Too High”,”Too Low”)) (the host’s clues)
  • C4: = IF(E3=”Stop”,”",IF(E3=”Too Low”,D3,C3))
  • D4: = IF(E3=”Stop”,”",(C4+F4)/2)
  • E4: = IF(D4=$D$1,”Stop”,IF(D4>$D$1,”Too High”,”Too Low”))
  • F4: = IF(E3=”Stop”,”",IF(E3=”Too High”,D3,F3))

Watch out for “curly quotes” if you copy and paste in. Filldown C4:F13. So what’s the point? We knew the “unknown price” going in. Here’s a recent prospective employee question the BBC got from Qualcomm:

Given 20 ‘destructible’ light bulbs (which break at a certain height), and a building with 100 floors, how do you determine the height the light bulbs break?

You watch The Price is Right or you read DDoE, and you think “Clock Game!” In 20 bulbs, if they break from a 407.407 foot drop, and a floor = 10 feet:

    407.407 <-- Break  
  Floor BID   Ceiling
1 0 512 Too High 1024
2 0 256 Too Low 512
3 256 384 Too Low 512
4 384 448 Too High 512
5 384 416 Too High 448
6 384 400 Too Low 416
7 400 408 Too High 416
8 400 404 Too Low 408
9 404 406 Too Low 408
10 406 407 Too Low 408
11 407 407.5 Too High 408
12 407 407.25 Too Low 407.5
13 407.25 407.375 Too Low 407.5
14 407.375 407.4375 Too High 407.5
15 407.375 407.4063 Too Low 407.4375
16 407.4063 407.4219 Too High 407.4375
17 407.4063 407.4141 Too High 407.4219
18 407.4063 407.4102 Too High 407.4141
19 407.4063 407.4082 Too High 407.4102
20 407.4063 407.4072 Too High 407.4082

 
You’re 2-ten-thousandths of a foot off. You get the job, a great way to start the new year.

 
…mrt
©¿©¬

Populating Class Properties

Depending on the requirements, I’ll choose a method for populating my custom collection classes using from data from a worksheet.

In this example, I’m using a list of the best selling albums of all time.

My Album class has properties as follows:

Public Artist As String
Public Album As String
Public Released As Date
Public Genre As String
Public Sales As Long

The code in my main routine is:

Sub test()
    Dim albs As Albums, alb As Album
 
    Set albs = New Albums
    albs.FillFromSheet Sheet1
End Sub

Filling the collection is just a matter of reading each row and popping the contained values into the right property.
The difficulty is knowing which columns relate to what properties. It’s a mapping problem – mapping columns to properties.

I could make an assumption about the positions of the columns and assume each is a known index.

Public Sub FillFromSheet(wks As Worksheet)
    Const cFirstRow = 2
    Dim i As Long, obj As Album
 
    With wks
        For i = cFirstRow To .Cells(Rows.Count, 1).End(xlUp).Row
            Set obj = New Album
            obj.Artist = .Cells(i, 1)
            obj.Album = .Cells(i, 2)
            obj.Released = .Cells(i, 3)
            obj.Genre = .Cells(i, 4)
            obj.Sales = .Cells(i, 5)
 
            Me.Add obj
        Next
    End With
End Sub

I don’t really like this because I’ve been taught and have tried to follow the approach that offsets/indices like these should be defined as constants.
I’ll modify my FillFromSheet code.

Const cArtistCol = 1, cAlbumCol = 2, cReleasedCol = 3, cGenreCol = 4, cSalesCol = 5
...
    obj.Artist = .Cells(i, cArtistCol)
    obj.Album = .Cells(i, cAlbumCol)
    obj.Released = .Cells(i, cReleasedCol)
    obj.Genre = .Cells(i, cGenreCol)
    obj.Sales = .Cells(i, cSalesCol)

Seems roughly better, but this too has problems. It can be a pain when you want to change the column order or insert a new column.
You’d have to go through the code and update all of the numbers.
While this is ok for 5, as I have here, changing out 50 columns is a chore.

So, I use an Enum block to determine column positions.
At the top of the class module:

Private Enum AlbumCols
    Artist = 1  ' =1 is necessary, otherwise Enum starts at zero
    Album
    Released
    Genre
    Sales
End Enum

And another modification to my FillFromSheet code:

    obj.Artist = .Cells(i, AlbumCols.Artist)
    obj.Album = .Cells(i, AlbumCols.Album)
    obj.Released = .Cells(i, AlbumCols.Released)
    obj.Genre = .Cells(i, AlbumCols.Genre)
    obj.Sales = .Cells(i, AlbumCols.Sales)

That works well. If I rearrange my columns on the worksheet, the only code change needed is a swap of items in the Enum block – a painless and quick update!

But that’s only if the changes to column order are in my control.
What if an end user changes the column order? Do I really want them poking around in my Enum code too?
I’d usually stop now and decide that if the end user starts screwing with column positions, it’s fair enough that the code should break.
However, if I wanted to go the extra mile, I’d have to find the column index by searching for the text in the column headers.

Excel 2007 (and 2003 to an extent) has a feature called Tables, otherwise known as ListObjects.

My code in the main routine changes to:

Set albs = New Albums
albs.FillFromTable Sheet1.ListObjects("Best_Selling_Albums")

… and the code in my collection class is:

Public Sub FillFromTable(tbl As ListObject)
    Const cArtistCol = "Artist", cAlbumCol = "Album", cReleasedCol = "Released"
    Const cGenreCol = "Genre", cSalesCol = "Sales (millions)"
 
    Dim i As Long, obj As Album, row As ListRow, col As New Collection
 
    With tbl.HeaderRowRange: For i = 1 To .Count: col.Add i, .Columns(i): Next: End With
 
    For Each row In tbl.ListRows
        Set obj = New Album
        obj.Artist = row.Range(, col(cArtistCol))
        obj.Album = row.Range(, col(cAlbumCol))
        obj.Released = row.Range(, col(cReleasedCol))
        obj.Genre = row.Range(, col(cGenreCol))
        obj.Sales = row.Range(, col(cSalesCol))
 
        Me.Add obj
    Next
End Sub

In the preceding code I created a collection of key-value pairs. The key is the column label, the value is column index.
I use that as a lookup when populating my properties.

That’s just a few approaches to reading a table of values into a collection.
I’m interested in feedback. How do you map column positions to properties?

Finding Almost This Friday

I have a utility that determines the date of the upcoming Friday. It used to look like this

dtReturn = Date + 8 - Weekday(Date, vbFriday)

It has served me well for many years. Until this year. This year, the US celebrated Veterans Day on November 11. While I’m work, the bankers are all at home rolling around in their money. Since my bank would not process any ACH transactions that day, I paid my employees the preceding Thursday. But my function didn’t say “return this Friday unless it happens to be a banking holiday” but now it does.

Public Function ThisFriday() As Date
   
    Dim dtReturn As Date
    Dim i As Long
   
    'if i'm testing, hardcode the date
   If Len(Dir(gsDEBUGDATE)) > 0 Then
        Stop
        dtReturn = #2/18/2011#
    Else
        'return the next friday
       dtReturn = Date + 8 - Weekday(Date, vbFriday)
       
        'if there are no checks on that date, look for
       'a nearby date with checks
       If Not gclsEmployees Is Nothing Then
            If Not gclsEmployees.HasChecks(dtReturn) Then
                For i = 1 To 7
                    If gclsEmployees.HasChecks(dtReturn + i) Then
                        dtReturn = dtReturn + i
                        Exit For
                    ElseIf gclsEmployees.HasChecks(dtReturn - i) Then
                        dtReturn = dtReturn - i
                        Exit For
                    End If
                Next i
            End If
        End If
    End If
   
    ThisFriday = dtReturn
   
End Function

And the HasChecks property is just a loop

Public Property Get HasChecks(dtCheck As Date) As Boolean
   
    Dim bReturn As Boolean
    Dim clsEmployee As CEmployee
   
    bReturn = False
   
    For Each clsEmployee In Me
        If Not clsEmployee.CheckByDate(dtCheck) Is Nothing Then
            bReturn = True
            Exit For
        End If
    Next clsEmployee
   
    HasChecks = bReturn
   
End Property

Man I hate when real life gets in the way of good code. I need to get rid of that arrow antipattern, but otherwise I’d say this is solved for all time. Jinx.

Procedure Flow and Raising Errors

I have a pretty simple procedure that I run a few times every other week. Someone emails their timesheet to me, I open the timesheet in Excel, and I want a quick way to save it to the proper directory. GMail stores the timesheet in my Temp directory and navigating through the file structure is a pain. I wrote this procedure and it includes the following conditions:

  • Is there an active workbook
  • Is that workbook a timesheet
  • Has the timesheet already been saved
  • Does the path exist

That results in four levels of nested ‘If’ statements. I like nested statements. I think it makes the code very readable and easy to follow.

Public Sub SaveDownloadedTimesheet()
   
    Dim wb As Workbook
    Dim dtThisFriday As Date
    Dim sNewPath As String
    Dim sNewName As String
   
    Const sPAYROLLPATH As String = "\\Server\Share\Accounting\Payroll\Processing\"
    Const sCOPY As String = "Copy of "
       
    On Error Resume Next
        Set wb = ActiveWorkbook
    On Error GoTo 0
   
    If Not wb Is Nothing Then
        If IsTimesheet(wb) Then
            If IsTemp(wb) Then
                dtThisFriday = ThisFriday
                sNewPath = sPAYROLLPATH & Format(dtThisFriday, "mmdd") & "\"
               
                If Len(Dir(sNewPath, vbDirectory)) > 0 Then
                    sNewName = Replace(wb.Name, sCOPY, "", 1, 1)
                    wb.SaveAs sNewPath & sNewName
                    wb.Close False
                End If
            End If
        End If
    End If
   
End Sub

The code works great except when it doesn’t. When there’s a problem, I don’t get any kind of message telling which one of my conditionals failed. It’s usually that the path doesn’t exist, but today I was absolutely freaking positive that the path existed and it still wasn’t saving. Of course I was wrong, but that’s not the point.

One solution is to add an Else clause to each If statement with a message box. There’s nothing wrong with that solution, but I don’t find it aesthetically pleasing. Did I just say “aesthetically pleasing”? My goodness. It adds eight lines of code and, to me, ruins the flow of the program.

An alternative is to raise errors. That might look like this:

Public Sub SaveDownloadedTimesheet()
   
    Dim wb As Workbook
    Dim dtThisFriday As Date
    Dim sNewPath As String
    Dim sNewName As String
   
    Const sPAYROLLPATH As String = "\\Server\Share\Accounting\Payroll\Processing\"
    Const sCOPY As String = "Copy of "
       
    On Error GoTo ErrHandler
   
    On Error Resume Next
        Set wb = ActiveWorkbook
    On Error GoTo ErrHandler
   
    If wb Is Nothing Then Err.Raise 9999, , "No active workbook"
    If Not IsTimesheet(wb) Then Err.Raise 9999, , "Active workbook is not a timesheet"
    If Not IsTemp(wb) Then Err.Raise 9999, , "Timesheet already saved"
   
    dtThisFriday = ThisFriday
    sNewPath = sPAYROLLPATH & Format(dtThisFriday, "mmdd") & "\"
   
    If Len(Dir(sNewPath, vbDirectory)) = 0 Then Err.Raise 9999, , "Can't find folder for " & Format(dtThisFriday, "mmdd")
   
    sNewName = Replace(wb.Name, sCOPY, "", 1, 1)
    wb.SaveAs sNewPath & sNewName
    wb.Close False
   
ErrExit:
    Exit Sub
   
ErrHandler:
    MsgBox Err.Description
    Stop
    Resume
   
End Sub

I don’t like that as much as nicely nested Ifs, but I like it better than a bunch of Else clauses. Most of my conditions are nicely grouped, so it’s not difficult to see what I’m checking. And it tells me the problem, so that’s nice. Which do you prefer? Do you use a different method?

Working with the MAXIF() and LARGEIF() functions

Oh, Wait! Excel doesn’t have those functions. We’ll have to roll our own. We’re going to use the Double Unary tip found here. In Excel, 1*TRUE = 1, 0+TRUE = 1, TRUE^1 = 1, and minus-minus TRUE (–TRUE, the double unary) = 1. It’s two sign changes, and thought to be the best way to turn an array of TRUE/FALSE into ones/zeros. Multiplying an array of parameters by an array of ones and zeros leaves only the TRUE parameters non-zero.

Back to our list of the 122 major teams in professional sports, the Boss has decided that sorting by name length, then alphabetically, isn’t going to do. He wants it by name length by stadium size. So, thanks to Wikipedia, your data looks like this, Column(F) is arena size:

  D E F
1 Anaheim Ducks 13 17,174
2 Arizona Cardinals 17 63,400
3 Arizona Diamondbacks 20 48,633
4 Atlanta Braves 14 50,097
5 Atlanta Falcons 15 71,228
6 Atlanta Hawks 13 18,729
7 Baltimore Orioles 17 45,363
8 Baltimore Ravens 16 71,008
9 Boston Bruins 13 17,565
10 Boston Celtics 14 18,624

Using fzz’s comment about the ROW() function, we’ll sort the length by G1=LARGE($E$1:$E$122,ROWS($G$1:$G1)) filled down. The table looks much like before:

  D E F G
1 Anaheim Ducks 13 17,174 29
2 Arizona Cardinals 17 63,400 22
3 Arizona Diamondbacks 20 48,633 22
4 Atlanta Braves 14 50,097 22
5 Atlanta Falcons 15 71,228 21
6 Atlanta Hawks 13 18,729 21
7 Baltimore Orioles 17 45,363 21
8 Baltimore Ravens 16 71,008 20
9 Boston Bruins 13 17,565 20
10 Boston Celtics 14 18,624 20

We know there are only three teams of name length 22. A formula – -($E$1:$E$122=22), using the double unary operation, gives us an array of 119 zeros, and just 3 ones that line up on 22. If we multiply that array by the array of arena capacities (F1:F122) we have 119 multiplications by zero, and three multiplications by 1. A MAX() function or an equivalent LARGE(,1) function on that array product returns the capacity of largest stadium hosting a team name 22 letters long. Those equations formulas, array entered, would look like this:

  • {=LARGE(- -($E$1:$E$122=22)*$F$1:$F$122,1)}
  • {=MAX(- -($E$1:$E$122=22)*$F$1:$F$122)}

We want to modify those for different teams. Changing the =22 to =$G1 is a start, and we’ve gone about as far as we can with our ‘MAXIF().’ It’s going to return the same thing three times. A scheme using COUNTIF() starting from the top and working down will improve the ‘LARGEIF()’. Using mixed references, $G$1:$G1 will grow as we fill down. COUNTIF($G$1:$G2,$G2) will only count one 22, COUNTIF($G$1:$G3,$G3) will count two 22s, and COUNTIF($G$1:$G4,$G4) will count 3, and that’s all there are. This array-entered equationformula then, filled down, is ‘LARGEIF()’:

  • {=LARGE(- -($E$1:$E$122=$G1)*$F$1:$F$122,COUNTIF($G$1:$G1,$G1))}

In I1 filldown =MATCH(H1,$F$1:$F$122,0) and in J1 filldown =INDEX(D:D,I1). Your table should look like this:

  D E F G H I J K
1 Anaheim Ducks 13 17,174 29 45,389 52 Los Angeles Angels of Anaheim  
2 Arizona Cardinals 17 63,400 22 19,980 95 Portland Trail Blazers  
3 Arizona Diamondbacks 20 48,633 22 19,356 63 Minnesota Timberwolves  
4 Atlanta Braves 14 50,097 22 18,144 30 Columbus Blue Jackets  
5 Atlanta Falcons 15 71,228 21 43,651 89 Philadelphia Phillies  
6 Atlanta Hawks 13 18,729 21 19,596 42 Golden State Warriors  
7 Baltimore Orioles 17 45,363 21 18,203 83 Oklahoma City Thunder  
8 Baltimore Ravens 16 71,008 20 68,756 69 New England Patriots  
9 Boston Bruins 13 17,565 20 67,164 49 Jacksonville Jaguars  
10 Boston Celtics 14 18,624 20 65,857 108 Tampa Bay Buccaneers  

Column(I) shows the indices reordered by size of the arena.

There are some problems with this approach. It only works with positive numbers, such as our capacities. Negative numbers will be less than a FALSE, and a zero will be returned in their place. And if ever capacities are equal, it will always return only the first (same old problem). Nicely here, hockey and basketball held in the same arena draw to different capacity. The file is available at http://wl.filegenie.com/~JMOprof/LargeIF.xls

…mrt
©¿©¬

Get a Table from a Web Page with an XML Request

I’m tired of trying to remember the library names and the syntax for doing this. I’m putting here so when I search for XMLRequest, Web Request, HTTPRequest, I can find it.

Sub GetData()

    Dim oHttp As MSXML2.XMLHTTP
    Dim sHtml As String
    Dim hDoc As HTMLDocument
    Dim hTable As HTMLTable
    Dim hRow As HTMLTableRow
    Dim hCell As HTMLTableCell
    Dim rStart As Range
           
    Const sURL As String = "http://www.contextures.com/xlsampledata01.html"
   
    Set oHttp = New MSXML2.XMLHTTP
    Set hDoc = New HTMLDocument
    Set rStart = Sheet1.Range("A1")
   
    'Send the web request
   oHttp.Open "GET", sURL
    oHttp.send
   
    'Give it enough time to process
   Do
        DoEvents
    Loop Until oHttp.readyState = 4
   
    'put the web page into an HTML Document
   hDoc.body.innerHTML = oHttp.responseText
   
    'Find the right table and write it to a sheet
   For Each hTable In hDoc.all.tags("TABLE")
        If hTable.Rows(0).Cells(0).innerText = "OrderDate" Then
            For Each hRow In hTable.Rows
                For Each hCell In hRow.Cells
                    rStart.Offset(hRow.RowIndex, hCell.cellIndex).Value = hCell.innerText
                Next hCell
            Next hRow
        End If
    Next hTable
   
End Sub

It sends a request to the sample data page over at Contextures, jams that into an HTML Doc, then writes the table out to a sheet.

Embedding Excel Web App objects in your web page

Hi everyone!

As you may have seen on some blogs, Microsoft now enables you to include any Excel file -or parts thereof- in your web pages and blog posts.

For example (yes you can make changes to the cells, they are not retained):

I have written an article that explains how this is done.

Even more: I have also created a demo where you can enter information in a web form (a textbox), which in turn updates information in the embedded Excel web app file.

Enjoy the read: Embedding Excel files on your website

Regards,

Jan Karel Pieterse
www.jkp-ads.com

UK Excel User Conference

If you’re in or around London in late January, don’t miss the UK Excel User Conference. It’s Wednesday, January 25th at Skills Matter
Big Ben, Westminster
Here’s the agenda.

Check out this gem from the agenda:

15:30 – 16:30
VBA to C : Pratfalls and Perils
- Stories based on a c++ newby’s efforts to convert 10K lines of VBA UDFs to
C++ XLLs.

If that doesn’t sound interesting to you, you’re just not trying.

Pay the fee here and somebody live blog the event. I can’t make it because I’ll be in Cancun that week but I’ll still be thinking about VBA.

The Twelve Days of Excel

When you carol along with The Twelve Days of Christmas do you think of spreadsheets? I do, and that certainly says all you need to know about my musical ability. The folks at PNC Wealth Management do also. Annually they issue their Christmas Price Index, with the 2011 version here. Spreading the data out, The Twelve Days of Christmas looks like this:

A B C D E F G H I J K L M N O P
1   1st
Day
2nd
Day
3rd
Day
4th
Day
5th
Day
6th
Day
7th
Day
8th
Day
9th
Day
10th
Day
11th
Day
12th
Day
Total
Quantity
Unit Cost Bundled Cost
2 Drummers Drumming                       12 12 $ 219.16 $ 2,629.90
3 Pipers Piping                     11 11 22 $ 220.69 $ 2,427.60
4 Lords-a-leaping                   10 10 10 30 $ 476.67 $ 4,766.70
5 Ladies Dancing                 9 9 9 9 36 $ 699.34 $ 6,294.03
6 Maids-a-milking               8 8 8 8 8 40 $ 7.25 $ 58.00
7 Swans-a-swimming             7 7 7 7 7 7 42 $ 900.00 $ 6,300.00
8 Geese-a-laying           6 6 6 6 6 6 6 42 $ 27.00 $ 162.00
9 Golden Rings         5 5 5 5 5 5 5 5 40 $ 129.00 $ 645.00
10 Calling Birds       4 4 4 4 4 4 4 4 4 36 $ 129.99 $ 519.96
11 French Hens     3 3 3 3 3 3 3 3 3 3 30 $ 50.00 $ 150.00
12 Turtle Doves   2 2 2 2 2 2 2 2 2 2 2 22 $ 62.50 $ 125.00
13 Partridge 1 1 1 1 1 1 1 1 1 1 1 1 12 $ 15.00 $ 15.00
14 Pear Tree 1 1 1 1 1 1 1 1 1 1 1 1 12 $ 169.99 $ 169.99

PNC provides the Bundled Cost. We have to work backwards to find the Unit Cost. In other words, eight total maids-a-milking cost $58. The data sources are at the Wiki link. The poor dairy lasses are working minimum wage.

For DDoE, PNC’s table is a good way to review the concept of named ranges. Define Christmas as =Sheet1!$B$2:$M$14, _12th_Day as =Sheet1!$M$2:$M$14, Bundled_Cost as =Sheet1!$P$2:$P$14, Maids as =Sheet1!$B$6:$P$6, Quantity as =Sheet1!$N$2:$N$14, and Unit_Cost as =Sheet1!$O$2:$O$14.

The total number of gifts can be =SUM(Christmas) or =SUM(Quantity), being 376 (more on this later). The cost of the 12th Day of Christmas can be =SUM(Bundled_Cost) or = SUMPRODUCT(_12th_Day,Unit_Cost), being $24,263.18. The total expenditure of your True Love is =SUMPRODUCT(Quantity,Unit_Cost), or $101,119.84. A very generous person, your True Love.

We can use the Intersection Operator, a space, as =Maids Quantity, to find that we need 40 work hours from them. The value of named ranges is in the legibility it brings to your spreadsheet. You can expand this until you cover all days and all gifts. The work is in the defining of the names, the benefit is in the presentation.

PNC says the total gifts number 364. Surely a partridge and a pear tree count as two gifts in your household, no? Happy Holidays!

…mrt
©¿©¬

A Case for Value as a Default Property

Charles Williams had an interesting post1 the other day about the Text, Value, and Value2 properties of the Range object. In it, he wrote:

So .Value2 really should be the default, and is definitely the one to use 99% of the time.

I disagree. I never use Value2.

The Value2 property returns a Double data type (64-bit floating point). The Value property returns the same thing except when the cell is formatted as Date or Currency. When it’s currency, Value returns a Currency data type (64-bit scaled integer), and when it’s a date, Value returns a Date date type (another 64-bit floating point). Who cares (besides Charles and me) and what does it mean?

Let’s imagine that we’re writing our own spreadsheet program to get a better understanding of what Excel is doing. I don’t actually know what Excel does under the hood, so this is all conjecture. In our spreadsheet application that we’re writing from scratch, we’ll store all numbers in memory as IEEE double precision floating point number, or Doubles. There is a case when the user may want more calculation accuracy than floating point allows and is willing to sacrifice a little precision (I’m looking at you fellow accountants). To offer that feature, we’ll allow the user to identify certain numbers as scaled integers with 15 digits to the left of the decimal and four digits to the right. Let’s call that data type Currency. We’re still going to store the value as a Double, but we’ll do the conversion prior to any calculations to ensure the accuracy.

As spreadsheet writers, we have a decision to make. I said we’d allow the user to identify certain numbers as Currency, but how exactly do they identify it. We could add a property called DataType to the Range class. OK, but how does the user set that property? Most Range properties are set in the Format Cells dialog box. We have a few tabs on that dialog already, so let’s add another one called Data Type. The user can set the data type and we’ll do the floating point to scaled integer conversion when the DataType property is set to Currency. That’s when the usability people come in.

The usability people begin by hurling epithets at us regarding our lack of sex lives and penchant for role playing games. They say that normal people (i.e. non-programmers) don’t have any idea what a “data type” is and if we make them learn, our sales will go down by 14.1% (a totally made up number because marketing people can’t do math). Surely, they go on, there is a better way to identify Currency values. Then they get out the corporate programming guidelines and remind us that no dialog box can have more than six tabs. We already have six tabs on the Format Cells dialog, so we can’t add another (Tools – Options hasn’t been invented yet).

Time to compromise. One of our bright, young interns suggests that we make Currency a format. As programmers, we are incensed. We already have an Accounting format and a Currency format will be confusing. And besides, Currency is a data type, not a format. Due to our lack of persuasiveness or the caffeine withdrawals, we agree to add it is a format. Because of the Currency debacle we go from two levels to three levels for data display. The bottom level is the raw Double that’s stored in memory, just like before. The top level is the text representation of the number with all the commas and periods and such, just like before. We need to add a pre-processor level. At this level, we’ll check to see if the Range object has it’s NumberFormat property set to Currency and, if so, we’ll convert from floating point to scaled integer. Once converted, we’ll send to the formatting layer to add the text goodies.

The deed is done and we all feel a little dirty. We code the changes and send them over to the VBA folks. The guy building the object model already has a Value property and a Text property. Now that we’ve added another layer, he doesn’t know which value to return for the Value property. There’s really not much debate – it has to be the value that comes from the pre-processor. If people “format” a value as Currency, they’re going to expect a scaled integer from the Value property even if they don’t know what the hell a scaled integer is. Object model guy then asks if he should expose the raw value. Of course. We’re all about exposing ourselves. We call in the dullest intern on staff to come up with the name and, true to form, he produces Value2. Positively inspired.

And scene.

You may disagree with some of the decision made in the above dramatic reenactment, but you have to realize they’re all connected. I personally disagree with calling Currency a format. But whether we call it format or a data type, doesn’t change the fact that we have to convert it. You can disagree with the decision to store all numbers as Doubles, and just store Currency as a scaled integer in memory. That doesn’t really change how the user identifies Currency values and you can’t simply store everything as currency because then everything would be slow.

Back to my assertion that Value is the appropriate default (or, put another way, that Value points to the right layer). I contend that if you “format” something as Currency or Date, that you do so knowingly and for a specific purpose. One of the knocks on Value is that you lose precision for Currency because Currency only goes four digits out to the right. I say you don’t lose precision. The precision that’s there is a myth. It’s only there because we had to convert the number to a float. When we chose Currency, we consciously forfeited all rights to precision beyond the 10,000th place (or should have). The fact that we can convert that number to a Double and see what looks like precision, doesn’t make it there.

For the Date type, I have a different argument. It’s not that there’s any problem with the data manipulation when converting from Double to Date, it’s just that it takes longer. If I’m reading in a date and spitting it back out, I agree that Value2 is probably the better choice. However, if that’s all I’m really doing I might want to address what value I’m adding. In most cases, I’m manipulating the date in such a way that I care that it’s a date. What I don’t want to do is read in a double using Value2 (avoiding the conversion overhead), then have to convert that number to a date myself to manipulate it, then convert it back to a double. In almost all cases, I want a Date or Currency typed number when I “format” it that way.

I suspect that most people who prefer Value2, in fact, disagree with the decision to confuse formatting and data types in the user interface. And if so, then we agree on that. Let me see if I can reword part of Charles’ statement so that we both agree with it: When performance matters, Value2 should be used 100% of the time there are no dates or currency, and Value should be used 100% of the time there are. I wonder if that will fly.

Footnotes
1In fact, all of Charles’ posts are interesting, so if you’re not subscribed to his RSS feed2 yet, get on it.
2If you’re still not reading blogs via RSS, be sure to say hello to my mom at the next Rotary meeting.