Summing the Digits of a Number

Summing the digits of a number is a chore I’ve been doing alot lately. Originally I’d parse the number out over the columns. And since SUM() ignores text, I’d turn the characters into digits by applying an arithmatic identity operation, like this:

  • =- -MID($A10,COLUMN(),1)

That’s double minus signs before the MID() function. The reasons for picking that identity operation are here at XLDYNAMIC’s website, about half-way down.

And filling right. But if the numbers were of uneven length, filling down would throw a #VALUE! error for all but the longest number. Contrary to what the Help advises, I find that SUM() does not ignore error values. So I was ending up with this formula so I could fill down:

  • =IF(ISERR(–MID($A10,COLUMN(),1)),0,–MID($A10,COLUMN(),1))

That’s double ugly, and a cell-eater to boot. I did a Google search and found Microsoft Knowledge Base article 214053 on this topic. Here’s what it says:

Formula 1: Sum the Digits of a Positive Number
To return the sum of the digits of a positive number contained in cell A10, follow these steps:

  1. Start Excel 2000.
  2. Type 123456 in cell A10.
  3. Type the following formula in cell B10:
    =SUM(VALUE(MID(A10,ROW(A1:OFFSET(A1,LEN(A10)-1,0)),1)))
  4. Press CTRL+SHIFT+ENTER to enter the formula as an array formula.
  5. The formula returns the value 21.

Ignoring Step 1, I looked at Step 4 and thought, from hanging around here, that we can do better. But to do better, let’s first look at the formula from the inside out. OFFSET() returns a reference one row less then A10 is long (more on OFFSET() later). ROW() then returns an array of row numbers starting from 1 (the row of A1–It’s the 1 that’s important, not the A) to the bottom of the offset. The array has as many elements as the length of the number in A10. MID() then creates an array of each digit as text. VALUE() turns the text into numbers, and then SUM(), array entered, sums the array of values.

While my formula was double-ugly, this one is just ugly. To impove it, from the outside in:

  1. Replace SUM() with SUMPRODUCT(). The formula no longer has to be array-entered, and it works just as well.
  2. Replace VALUE() with the double minus
  3. Instead of using LEN(A10)-1 as a row offset, use LEN(A10) as a height parameter.
  4. Make the reference to A1 absolute with respect to row, allowing fill-down.

The new formula is:

  • =SUMPRODUCT(- -MID(A10,ROW(OFFSET(A$1,,,LEN(A10))),1))

Much prettier, and even not counting curly-braces, two characters shorter. The Knowledge Base goes on to give this as the formula for summing the digits of a negative number:

  • =SUM(VALUE(MID(A11,ROW(A2:OFFSET(A2,LEN(A11)-2,0)),1))) also array-entered.

This is the better version, simply entered:

  • =SUMPRODUCT(–MID(A11,ROW(OFFSET(A$2,,,LEN(A11)-1)),1))

You have to start the array at 2 (via A$2) to skip the negative sign, and then also shorten the length by one for the same reason. This one is the same length as Microsoft’s. If you want one formula for all numbers, this one has no counterpart in the Knowledge base:

  • =SUMPRODUCT(- -MID(ABS(A11),ROW(OFFSET($A$1,,,LEN(ABS(A11)))),1))

It uses the absolute value ABS() function for the obvious reason. It only works for true numbers. It will not handle long text strings as numbers, such as you may have for credit cards or international phone numbers. For those, either use the earlier one, or use SUBSTITUTE(A11,”-“,””) in place of ABS(A11). Now we’re the ones getting getting ugly.

…mrt

Posted in Uncategorized

27 thoughts on “Summing the Digits of a Number

  1. Here’s another one (array entered):

    =SUM(IFERROR(MID(A1,ROW(INDIRECT(“1:”&LEN(A1))),1)*1,0))

    It ignores everything except the numeric characters in the cell. It uses IFERROR, so it works only with Excel 2007 or later.

    I’ve always thought of this as a solution in search of a problem. What do you use it for. Michael?

  2. THis array formula works pre-2007 as well for mixed text and number strings

    =SUM(IF(ISNUMBER(–MID(A1,ROW($1:$256),1)),–MID(A1,ROW($1:$256),1),0))

  3. I’d be tempted to use Longre’s MOREFUNC add-in, as in

    =EVAL(REGEX.SUBSTITUTE(REGEX.SUBSTITUTE(x,”D”,””),”(.)”,”+[1]”))

    To sum only the decimal numerals in an arbitrary string, use

    =SUMPRODUCT(LOOKUP(MID(E1,s,1),d&””,N(d)))

    where s is defined as =ROW(INDEX($1:$6656,1,1):INDEX($1:$6656,255,1)) and d is defined as ={“”,1,2,3,4,5,6,7,8,9,”A”}.

  4. or you could use a VBA function. i don’t understand the obsession with pure formula.

  5. I agree with Rob, a UDF will work better and probably faster:

    Public Function SumDigits(Number As Variant) As Variant
    Dim aByte() As Byte
    Dim j As Long

    aByte = CStr(Number)
    For j = LBound(aByte) To UBound(aByte) – 1 Step 2
    SumDigits = SumDigits + Val(Chr(aByte(j)))
    Next j
    End Function

  6. Well, I just timed it on XP SP3, Excel 2003 SP3, and sure your UDF is faster than my formula, and probably faster than Walk’s, but ours caters for non-numeric characters. If the source is all digits as I read Micheal’s post then the last formula that he gives is faster than the UDF by some 7%.

  7. Declaring the SumDigits function as a Double and using Chr$ instead of Chr makes it about 10% faster.

  8. Hi John –

    Well, the thought was to create a Checksum number for data being created as an input file to a server. What it does in reality is say that one of these two numbers is wrong. And since that hasn’t happened yet, it is still a solution seeking a problem ;-) We’ve got belts and suspenders. How often does an output file get written wrong or an input file read wrong these days?

    Rob – amplifying Dick K’s MVP remarks about new users, I provide corporate XL support, and I find there are users who steer clear of UDF’s. That’s a plateau in their learning still in their future. They have to change their security setting, learn about modules, etc. But give them a formula, and they’re all set.

    …mrt

  9. “But give them a formula, and they’re all set.”
    Until they need to do something slightly different, and your services are again required.

    To me, this is nothing more than “Look Ma, No Macros”.
    I think it’s clever, but unsuitable for business.

  10. Rob sez: “i don’t understand the obsession with pure formula.”

    If you use a formula, you can be assured that it will actually work. As you know, Microsoft wants people to fear VBA macros and does everything possible to make the user disable macros when a file is opened. And, if you’re doing thousands of such calculations, it’s likely that there will be a noticeable performance hit if you use VBA functions rather than formulas.

  11. John – you say “you can be assured that it will actually work”. You mean “work” as in give the correct answer?

    Which version of Excel are you using, and where can I get a copy? :)

    One of the reasons I prefer a UDF to an array with convoluted logic is because it is easier to see how they work, and thus easier to see what’s going wrong, when something goes wrong.

    FWIW, I’ve posted my effort at this problem and the Sum-skip problem at: http://newtonexcelbach.wordpress.com/2009/10/05/three-udfs/

  12. Doug – and you say “You mean “work” as in give the correct answer?”

    No, he means it doesn’t work because the user has a low security setting or doesn’t understand what enabling macros means so doesn’t, or doesn’t work because the IT department don’t allow macros/scripts.

  13. The UDF also handles non-numeric characters.
    The version below is about 1/3 faster and is faster than the numerics-only SUMPRODUCT formula, but both are probably fast enough for most real-life purposes.

    I use the Byte array technique for verifying Check Characters, since many check character algorithms are based on non-numeric characters, and its fast enough to handle large ranges.

    Putting UDFs into an XLA or Automation adding packaged with an Installer generally gets around the end-user’s lack of understanding about Macro warnings (BTW when was the last time anyone saw a Macro virus?), but does not solve the problem with paranoid IT departments.

    Public Function SumDigits2(Number As Variant) As Double
        Dim aByte() As Byte
        Dim j As Long
        On Error GoTo Fail
        aByte = CStr(Number.Value2)
        For j = LBound(aByte) To UBound(aByte) – 1 Step 2
            If aByte(j)  47 And aByte(j)  58 Then
                SumDigits2 = SumDigits2 + aByte(j) – 48
            End If
        Next j
        Exit Function
    Fail:
        SumDigits2 = -1
    End Function
  14. Good morning, Charles –

    Yep, there is an elusive VB-tag bug. It treats less-than/greater-than pairs in that order as HTML markers and ignores/throws away everything between.

    In your case, it just discarded the not-equals I think. In other instances, when the less-than is lines away from the greater-than, it will shorten your code for you. Quite helpful that way. ;-)

    The workarounds are to use things like LT, LE, GT, GE and !=, or post the code not inside the VB tags. I always give a warning about angle-bracket substitutions, and trust the IDE to flag those instances. A pita, I know. It doesn’t show up in normal posts.

    ..mrt

  15. one more: =SUM(LEN(SUBSTITUTE(A1,{1,2,3,4,5,6,7,8,9,””},””))*-{1,2,3,4,5,6,7,8,9,-45})

    don’t know how it compares in speed, but is efficient in that it doesn’t depend on other ranges so only recalculates when A1 changes. lhm

  16. Lori: That’s slick. It works with positive and negative numbers, with an empty cell, and with decimal numbers. It evens adds the numbers in a string that contains non-numeric characters. Of course, whether that’s good or bad is not clear…

  17. If you are willing to accept an error being generated if a cell does not contain a real number (whether as text or as an actual number), 12AB34 as an example, and if you textual numbers are no more than 29 significant digits long, then this one-liner UDF should work…

    Function SumDigits2(S As Variant) As Long
    If S “” Then SumDigits2 = Evaluate(Format(Replace(S, “.”, “”), “0? & Replace(String(28, “x”), “x”, “+0?)))
    End Function

  18. Let’s try that formula again, this time using .ne. in place of the “not equal” sign (less than symbol followed by greater than symbol) that the Comment Processor “ate” (so if copy/pasting this, make the substition back to the “not equal” sign)…

    Function SumDigits2(S As Variant) As Long
    If S .ne. “” Then SumDigits2 = Evaluate(Format(Replace(S, “.”, “”), “0? & Replace(String(28, “x”), “x”, “+0?)))
    End Function

  19. Function SumDigits2(S As Variant) As Long
    If S NEQ “” Then SumDigits2 = Evaluate(Format(Replace(S, “.”, “”), Replace(string(len(s),“x”), “x”, “+0”)))
    End Function
  20. Hans,

    Your function will not be able to handle more than 29 total (text) digits because of a limitation in the Format function… it appears to use the Decimal sub-type of the Variant data type to process its first argument and the Decimal sub-type is limited to a maximum of 29 digits total digits providing there is no decimal point (which is the reason for the first Replace function call). For example, try this macro out with your function and you will see the problem (displayed in the Immediate Window)…

    Sub test()
      Dim X As Long, TestNumber As String
      For X = 1 To 3
        TestNumber = “123456789012345678901234567” & Mid(“890”, 1, X)
        Debug.Print “Sum “ & TestNumber & ” digits is “ & SumDigits2(TestNumber)
      Next
    End Sub

    The first two digit sums will be correct but the third one (where only a zero is being affixed to the end of the second one) will be obviously incorrect. However, I do note your elimination of the concatenation of the leading zero in favor of including it inside the second Replace function call is a good way to simplify the function further…

    Function SumDigits2(S As Variant) As Long
      If S  “” Then SumDigits2 = Evaluate(Format(Replace(S, “.”, “”), Replace(String(29, “x”), “x”, “+0”)))
    End Function
  21. Arrgh! I did it again… I forgot about the “not equal” sign problem…

    Function SumDigits2(S As Variant) As Long
      If S  “” Then SumDigits2 = Evaluate(Format(Replace(S, “.”, “”), Replace(String(29, “x”), “x”, “+0”)))
    End Function
  22. What! I did it again!! Okay, I changed the test so that I can’t screw it up again ;-)

    Function SumDigits2(S As Variant) As Long
      If Len(S) Then SumDigits2 = Evaluate(Format(Replace(S, “.”, “”), Replace(String(29, “x”), “x”, “+0”)))
    End Function
  23. Rick,

    to avoid limits and ‘impossible’ tests:

    Function SumDigits2(S As Variant) As Long
     c1 = Replace(String(29, “+”), “+”, “+0”)
     For j = 1 To Len(Replace(S, “.”, “”)) Step 29
       SumDigits2 = SumDigits2 + Evaluate(Format(Mid(Replace(S, “.”, “”), j, 29), c1))
     Next
    End Function
  24. I wrote my original function more as an interesting case because it was a one-liner solution. If you want to go to multiple lines and, especially, a loop solution, then I would use the code Charles Williams posted earlier… the Byte array method would be among the fastest possible loop solutions that could be constructed.


Posting code? Use <pre> tags for VBA and <code> tags for inline.

Leave a Reply

Your email address will not be published.