New Year’s Resolution: No More Offset

On Simon’s blog, sam comments regarding OFFSET vs. INDEX:

There is a huge performance hit.

You will notice hardly any difference in the Calculation times

But the recalculation times will significantly different in case of OFFSET- to the tune of 25-30 times slower compared Index.

I’ve heard this many times before, but I guess I’m just lazy. I’ve continued to use OFFSET when defining range names. Here’s the code I used to test what he said.

Sub TestVolatile()
   
    Dim wb As Workbook
    Dim aWrite(1 To 65000, 1 To 1) As Double
    Dim i As Long
    Dim clsTimer As CTimer
   
    Set clsTimer = New CTimer
   
    For i = LBound(aWrite, 1) To UBound(aWrite, 1)
        aWrite(i, 1) = i
    Next i
   
    Set wb = ActiveWorkbook
    wb.Sheets(1).Range(“A1:A65000”).Value = aWrite
   
    For i = 1 To 10
        clsTimer.StartCounter
        wb.Names.Add “MyVol”, “=OFFSET($A$1,0,0,COUNTA($A:$A),1)”
        wb.Sheets(1).Range(“B1:B10”).Formula = “=COUNTA(MyVol)”
        Debug.Print “OFFSET Calc”, clsTimer.TimeElapsed
       
        clsTimer.StartCounter
        Application.CalculateFull
        Debug.Print “OFFSET Recalc”, clsTimer.TimeElapsed
       
        clsTimer.StartCounter
        wb.Names.Add “MyVol”, “=$A$1:INDEX($A:$A,COUNTA($A:$A))”
        Debug.Print “INDEX Calc”, clsTimer.TimeElapsed
       
        clsTimer.StartCounter
        Application.CalculateFull
        Debug.Print “INDEX Recalc”, clsTimer.TimeElapsed
    Next i
   
End Sub

And here’s the results I got

I must be using a slightly different method, but the results still point to INDEX as being more efficient.

The timer code comes from Mike Woodhouse via StackOverflow.

Posted in Uncategorized

28 thoughts on “New Year’s Resolution: No More Offset

  1. I thought the issue of Offset vs Index was that Offset is a volatile function where Index isn’t
    and hence Offset is always forcing a recalculate on any sheet change, where Index doesn’t
    and that that was the biggest reason for any slowdown when using Offset

  2. Not really a real world test. To get a taste of the true performance drag using OFFSET, enter values in a cell that’s not referenced either by the MyVol name or the formula in B1:B10. For example,

    Sub foo()
      Dim i As Long, j As Long, dt As Double

      With ThisWorkbook.Worksheets(1)
        .Range(“A:A”).FormulaR1C1 = “=ROW()”
        .Range(“A:A”).Value2 = .Range(“A:A”).Value2
       
        For i = 1 To 10
          .Range(“B1:B10”).FormulaR1C1 = “=COUNTA(MyVol)”

          .Names.Add “MyVol”, “=OFFSET($A$1,0,0,COUNTA($A:$A),1)”
         
          dt = Timer
          For j = 1 To 100
            .Range(“G5”).Value2 = j
          Next j
          Debug.Print “OFFSET”, Timer – dt

          .Names.Add “MyVol”, “=$A$1:INDEX($A:$A,COUNTA($A:$A))”
         
          dt = Timer
          For j = 1 To 100
            .Range(“G5”).Value2 = j
          Next j
          Debug.Print “INDEX”, Timer – dt
        Next i

      End With

    End Sub

    On my machine, this gives the following results.

    OFFSET 2.00012499999866
    INDEX 3.19999999992433E-02
    OFFSET 1.9539999999979
    INDEX 3.18750000005821E-02
    OFFSET 1.92262499999924
    INDEX 1.64999999979045E-02
    OFFSET 1.92225000000326
    INDEX 3.21249999979045E-02
    OFFSET 1.98475000000326
    INDEX 3.16250000032596E-02
    OFFSET 1.8914999999979
    INDEX 1.63749999992433E-02
    OFFSET 1.92212499999732
    INDEX 3.19999999992433E-02
    OFFSET 1.92212499999732
    INDEX 3.16250000032596E-02
    OFFSET 1.9227499999979
    INDEX 0.015625
    OFFSET 1.93800000000192
    INDEX 3.18750000005821E-02

    IOW, OFFSET recalc time is 2 decimal orders of magnitude greater than INDEX’s. That’s because Excel doesn’t waste cycles recalculating the B1:B10 formulas calling INDEX because they don’t refer to cell G5, but Excel does recalc the B1:B10 formulas calling OFFSET even though their values don’t change due to recalc.

  3. Dicks timings are swamped by COUNTA.

    There is not much time difference between INDEX and OFFSET (apart from the volatility effect)

    Sub TestVolatile()

    Dim wb As Workbook
    Dim aWrite(1 To 65000, 1 To 1) As Double
    Dim i As Long
    Dim dTime As Double

    dTime = microtimer

    For i = LBound(aWrite, 1) To UBound(aWrite, 1)
    aWrite(i, 1) = i
    Next i

    Set wb = ActiveWorkbook
    wb.Sheets(1).Range(“A1:A65000?).Value = aWrite

    For i = 1 To 10

    wb.Names.Add “MyVol”, “=OFFSET($A$1,0,0,65000,1)”
    wb.Sheets(1).Range(“B1:B10?).Formula = “=SUM(MyVol)”
    dTime = microtimer
    Application.CalculateFull
    Debug.Print “OFFSET Calc “, microtimer – dTime

    wb.Names.Add “MyVol”, “=$A$1:INDEX($A:$A,65000)”
    wb.Sheets(1).Range(“B1:B10?).Formula = “=SUM(MyVol)”

    dTime = microtimer
    Application.CalculateFull
    Debug.Print “INDEX Calc “, microtimer – dTime
    Next i

    End Sub

    OFFSET Calc 8.16992606269196E-03
    INDEX Calc 8.23291971028084E-03
    OFFSET Calc 8.08453467470827E-03
    INDEX Calc 8.15172789589269E-03
    OFFSET Calc 8.09328378818464E-03
    INDEX Calc 8.21927109063836E-03

    and even that is probably swamped by SUM

  4. Isn’t this just a case of “use whatever fits best”
    index is OK if you don’t require dynamic column (or row) selection, or don’t go to the left or above your reference.

    Why doesn’t anyone use Address() to set up their dynamic ranges (like me)

  5. One of our main uses for OFFSET is for dynamic chart ranges. With one complex Name doing the hard work of selecting the (usually weekly rolling) data, OFFSET gives really simple formulae for the other ranges, effectively locks all the ranges to the initial range (usually the X values) and so is more reliable and not prone to error.

    I like to mix and match, using INDEX quite often (I learned it first) but for this application in charts I’d only want to use OFFSET – unless someone out there knows better?

  6. Dynamic Names are best set up using

    a) $A$1:Index($A:$A,Counta($A:$A))

    b) Even better is to define as name LastRow = Counta($A:$A) or any non blank column and then Use
    $A$1:Index($A:$A,LastRow) or $B$1:Index($B:$B,LastRow)

    c) Even better is to set a UDF called LastRow defined as
    Function LastRow()
    LastRow = Cells.Find(“*”, After:=Range(Cells(Rows.Count, Columns.Count), Cells(Rows.Count, Columns.Count)), SearchOrder:=xlByRows, SearchDirection:=xlPrevious).Row
    End Function

    And use this instead of CountA

    The UDF / Index combo will work the fastest.

  7. @Sam,

    You can omit the After argument from your LastRow statement… if omitted, VBA will use the first cell in the range by default (A1 for all the cells) and since the SearchOrder is xlPrevious, the search will automatically start from the bottom of the worksheet. Secondly, since Excel is “helpful” and will remember the argument settings from a previous run of the Find method, it is probably a good idea to specify the LookIn argument to make sure you find what you actually want to find… use xlFormulas if you want to find any filled cell, whether filled with a value or a formula (even a formula evaluating to the empty string), or use xlValues if you want to ignore formulas displaying empty strings and only search for cells with displayed values in them. So, you can shorten your statement to one of these…

    For last value or formula (even displaying empty string)
    ——————————————-
    LastUsedRow = ActiveSheet.Cells.Find(What:=”*”, SearchOrder:=xlRows, _
    SearchDirection:=xlPrevious, LookIn:=xlFormulas).Row

    For last displayed non-empty value only
    ——————————————-
    LastUsedCol = ActiveSheet.Cells.Find(What:=”*”, SearchOrder:=xlByColumns, _
    SearchDirection:=xlPrevious, LookIn:=xlValues).Column

  8. The robust equivalent for OFFSET(r,a,b,c,d) is (Excel 2003 and prior – change $1:$65536 to $1:$1048576 for Excel 2007 and later)

    INDEX($1:$65536,SUM(ROW(r),a),SUM(COLUMN(r),b))
    :INDEX($1:$65536,SUM(ROW(r),a+c-SIGN(c)+(c*d=0)*1E12),SUM(COLUMN(r),b+d-SIGN(d)))))

    I’m using SUM rather than +’s because ROW and COLUMN functions always return arrays, and that can screw up the INDEX calls. If x were a single value array, e.g., {12}, x+a would also be an array, but SUM(x,a) is just a number, not an array. The (c*d=0)*1E12 term ensures that this returns #REF! when either c or d equals 0.

    Anything one can do with the OFFSET expression one could also do with the INDEX:INDEX expression. The latter may recalculate unnecessarily when any cell in the worksheet changes, but it won’t recalc when nonprecedent cells in other worksheet change.

    Now if OFFSET were just being used in defined names and those names were only being used to specify series for charts, that’s a good and fitting use precisely because it would involve relatively few OFFSET calls. I figure the technically unnecessary recalc time would be dwarfed by the chart redraw time.

  9. I have this snippet from Harlan Grove:
    OFFSET(r,a,b,c,d) == INDEX(r,a+1,b+1):INDEX(r,a+c,b+d)
    The former is volatile, the latter isn’t. OFFSET can refer to ranges outside its first argument’s range, INDEX can’t.

  10. To add to Sam’s I’d alter it to:

    =INDEX(Sheet2!$A:$A,1):INDEX(Sheet2!$A:$A,LastRow)
    because $A$1:Index($A:$A,LastRow) yields a #REF error if row 1 is deleted. In practice my dynamic ranges have headers, so to start on row 2 I’d use

    =INDEX(Sheet2!$A:$A,2):INDEX(Sheet2!$A:$A,LastRow)
    Of course that can cause problems if all the non-header rows are deleted because it will refer to A1:A2, so I’d change the definition for LastRow to:
    =MAX(2, COUNTA(Sheet2!$A:$A))

  11. Dough…Nice technique to prevent # ref if the header row got deleted.

    if only MS simply had created a built in function called LastRow which would work like this

    =LastRow(A:A) to return a number indicating last non blank row in Col A (irrespective of blanks in between)
    =A1:LastRow(A:A) to return the Address of the last non blank row or A1

    off course till then there is one in MoreFunc.xll….

  12. To answer Ikkeman’s question, without doing any testing :) I would expect ADDRESS to be a lot slower. You first have to construct the address as a string, then use INDIRECT to interpret the string.

  13. Those are interesting solutions and substitutes for OFFSET. However, I think that we have to consider more than the volatility or non-volatility of a solution.

    1- While OFFSET might seem complicated to some users, I think it pales in comparison to many substitutes presented here.
    2- Longer formulas are more error prone and harder to read, not to mention the character limit that might become an issue.
    3- Slow recalculation is not a problem for many workbooks.

    I think we may blinded and biased by the context we work with.

    Personnally, I use OFFSET extensively, maybe too much, but that’s not an issue in most of the models I develop. So I don’t see why I should substitute it’s convenience with some hyper-robust solution twice the lenght.

    And you know what, I might have some sub-optimal code too.

    Sebastien

  14. In the past the only reason I have had to resort to Offset was at the point when a very large model broke the calculation tree limit and left behind that frustrating “Calculate” in the status bar instead of the normal “Ready”. By replacing large swathes of Indexes with the volatile Offset I have been able to force the model to demonstrate a completed calculation. I have often wondered whether this was actually an improvement or just simply aesthetically more pleasant result. In either case the models would work even though a basic F9 on the “Index” Scenario was insufficient to clear the Calculate message.

    Was there any real benefit other than appearance? (albeit at an accepted cost to both best practice and speed)

  15. This was mentioned on Simon’s blog, but another reason I sometimes prefer INDEX is that a cell that references another cell via OFFSET does not show up as a dependent in the auditing tools. Coming back to one of my own complicated workbooks a year later, Trace Dependents is a very useful tool for me, and it doesn’t work for OFFSET.

  16. @Jan.

    I’ve had the persistent ‘Calculate’ message left behind before in workbooks, a problem which I never understood. How did you know to convert to volatile functions (or that this would work)?

    Do you (or any one here really!) have any information about the calculation tree limit? Is it number of levels, overall size, or something else? In addition, are there ways to analyse the depth or size of your current calc tree for comparison?

  17. Showing completed calculations has its merits. From my perspective, OFFSET is used most often for dynamic chart ranges and interpolation. I have no problem with OFFSET used in chart ranges because there won’t be many OFFSET calls. OTOH, interpolation usually involves cell formulas like

    =IF(x<MIN(x_Range),INDEX(y_Range,1),IF(x>=MAX(x_Range),INDEX(y_Range,ROWS(y_Range)),
    FORECAST(x,OFFSET(y_Range,MATCH(x,x_Range)-1,0,2,1),OFFSET(x_Range,MATCH(x,x_Range)-1,0,2,1))))

    This can be done, arguably better, using nonvolatile udfs which look like

    =LININTERP(x,x_Range,y_Range)

    There are many implementations in the newsgroups. Much simpler/shorter/less error prone than the OFFSET approach, and Excel would only recalculate it when x, x_Range or y_Range change rather than at every recalc. When x_Range and y_Range are static and x changes rarely, recalc will be faster using such udfs than using OFFSET. The only down side is needing VBA.

  18. Why exactly IS Offset volatile?

    The most compelling reasons I can find are:
    http://blog.xlcubed.com/2011/08/warning-excel-can-get-volatile/
    Offset is volatile because, if it wasn’t then there is a danger that Excel would take so long to work out if it needs to be calculated that it might as well always calculate it.

    http://www.excelforum.com/excel-general/733923-moving-averages.html
    Is it because it can run into issues if you OFFSET back into a cell that doesn’t exist (e.g., OFFSET up 3 cells from A2)?

    …but it would be good to hear more on this. I can see how that first one might add a few milliseconds, but that’s hardly reason to introduce a much larger potential performance hit by making it – and everything that hangs off of it – recalculate.

    And in that second case, you can just as easily make an INDEX funciton return REF by deleting some of the range. So I’m not sure that’s it either.

    Any thoughts, anyone?

  19. I did see something at http://www.excelbanter.com/showthread.php?t=116045 that probably is the reason. To paraphrase,
    “The likely reason is that Excel probably bases dependencies on cell references. And a text like INDIRECT(“B1″) or the numerical arguments like OFFSET($A$1,1,1) would need to be evaluated to determine what the associated cell reference is.”

    So I guess that to make these non-volatile, MS would have to reprogram the calculation tree algorythim to determine what range INDIRECT and OFFSET refer to. What’s wrong with that approach, I wonder? Would remove far more bottlenecks worldwide than it would add, surely.

  20. I use OFFSET and Define Name to dynamically track the most recent 5 test results on a spreadsheet that expands over time… and that data is fed to rolling charts.
    =OFFSET(‘Test Results’!J6,1,COUNTA(‘Test Results’!$I$5:$DD$5)-5,1,-1)

    The problem I run into is when entering the first few results. Any time there are less columns (or rows) with data than you have indicated in your OFFSET formula, the formula will reference irrelevant cells or produce errors if it runs out of columns (or rows).
    For example, if my OFFSET is set to -5 (columns), but I only have 2 or 3 columns of data, my Offset formula will still try to reference 5 columns… so it will end up referencing column titles or whatever is to the left of my starting reference, or it will produce an error because there are no more columns remaining on the spreadsheet.

    Should I be using INDEX instead? Or maybe there’s some kind of parameter that tells OFFSET to stop at a specified column, or mixing in an IF, THEN formula? I feel like I’m missing something simple:/

  21. If you want to sum five columns to the left, you might have a formula like this

    =SUM(OFFSET(D9,0,-4,1,5))

    But that will return a #REF! error because four columns to the left of D is off the page. You can change the formula to

    =SUM(OFFSET(D9,0,MAX(-4,-COLUMN()+1),1,MIN(5,COLUMN())))

    This won’t let OFFSET go off the page.


Posting code? Use <pre> tags for VBA and <code> tags for inline.

Leave a Reply

Your email address will not be published.