Count Active Customers

Jake wants to know, given an active date and an inactive date, how to count the customers that were active in a certain time period.

The ones we want are highlighted in yellow. The formula is

=COUNT(D2:D21)-SUMPRODUCT(($C$2:$C$21>=D25)+($D$2:$D$21<=C25))

It's easier to figure out who is not active during that date range and subtract it from the total. The formula starts by counting everyone using the COUNT function. The SUMPRODUCT function is then subtracted from that. It gives the total of all the customers who became active after our End date plus the customers who became inactive before our Start date.

Note that this formula excludes the actual Start and End date. Customer 17 isn't included because we're really looking at 7/16/2010 to 7/24/2010. If you want the formula to be inclusive, simply change the <= and >= to < and >, respectively.

If you don't like SUMPRODUCT, you can get there with COUNTIF

=COUNT(D2:D21)-(COUNTIF(C2:C21,">="&D25)+COUNTIF(D2:D21,"<="&C25))

Sometimes it's easier to turn the problem around and figure out who's excluded.

25 thoughts on “Count Active Customers

  1. subtracting is definitely the way to go… but I’ve happily cut my bad habit of combining arithmetic & logical expressions in the same Excel formula ever since Excel 2007. Ever since, I’ve used the functions countifs, sumifs, and averageifs which are much more intuitive.

  2. If some cells in C2:C21 contained 7/25/2010, the formulas offered would include them as 1 in the SUMPRODUCT result. In that way the active period is treated as >= C25 but < C26. So the date labeled End is exclusive. This would not be what a typical user would expect (they’d expect bothstart and end dates to be inclusive).

    If Start and End dates were inclusive, it’s sufficient to test that the Active Date values fall between them. The Inactive Date values are irrelevant as long as Inactive Date values are strictly larger than Active Date values. And with moderate trickery this requires only 1 pass through the array.

    =SUMPRODUCT(–(ABS(C2:C21-(C25+D25)/2)=(D25-C25)/2))

  3. @#$%&*! blog parser!

    Let me try that formula again.

    =SUMPRODUCT(–(ABS(C2:C21-(C25+D25)/2)<=(D25-C25)/2))
  4. DK:

    1) Please consider posting the data in a table or attach a file. It’s a PITA to recreate the data set! :(

    2) As long as active date is always < = inactive date, define an active customer as active date < = end date (use < to exclude the start date) and inactive date > = (or > ) start date. This, of course, is the opposite of the definition of an inactive customer, which is what you used.

    3) I would use a column to flag a customer as active / inactive. It serves multiple purposes:
    * Sum that column to get the total number of active customers.
    * Use the flag as the conditional format rule. This makes the c.f. and the numeric totals reference the same criterion/criteria.
    * Finally, it’s an easy way to validate the formula.

    fzz:

    That’s a nice way to check if a number is within a range. :)

    There’s a difference in definitions. You’ve defined a customer as active if the active date is between the start and end dates. But, from DK’s post and my experience in such scenarios, the more common definition is a customer is active if

    s/he became active before the start date and was active on the start date
    or s/he became active after the start date but before the end date

    To see that your definition does not match Dick’s, check row 3. (C25+D25)/2 is 7/20/2010. (D25-C25)/2 is 5. So, C3 – 7/20/2010 is 16 which is not < = 5. But, row 3 should be included in the active count.

  5. I discussed a similar problem a couple of months ago:
    http://roymacleanvba.wordpress.com/2010/06/08/sub-sequence-iterator-3/

    Here, the start-event and end-event are held as separate records, with records being appended to the table in date order. This simplifies data entry in a large table, since you don’t have to find the relevant record to add its end date – you just append. Anyway, I went for a VBA ‘cursor’ solution, which I think offers interesting possibilities. See also Sebastien Labonne’s comment, which is similar to your solution.

    I think that there are interesting issues with ‘temporal’ data like this. My wife deals with this sort of thing as a ‘Business Intelligence’ consultant. For example, an employee can hold different posts, successively, in an organization; holiday, sickness, etc, are booked against post, not employee. That’s fine until, temporarily, an employee holds two posts at the same time; do you end up with double-counted sickness, etc, and how do you correct this?

    /Roy

  6. Dick…you can improve your formula more by using named ranges :-)

    =COUNT(active_date)-SUMPRODUCT((active_date>=end)+(inactive_date<=start))
  7. I think this formula should also do the trick:

    =SUMPRODUCT((active_date>=(start-(inactive_date-active_date)))*(inactive_date<end+(inactive_date-active_date)))

    …but then I’ve been staring at this for such a long time that my wife has stopped talking to me. Come to think of it, when exactly did she stop talking to me…could have been years ago, in hindsight.

    Note that the second half has only a less than sign after the inactive_date range, to emulate Dick’s example. Personally I’d include a ‘less than or equal to’ clause here, to flag customer no. 17 (Sto Plains Holdings) given they started on your end date.

  8. I was wrong. If the Active Date is well before the Start date, it’s necessary to look at both start and end dates. However, I think directly counting active accounts is better than subtracting inactive accounts from all accounts. Since Start and End dates are cells, it’s possible to use

    =SUMPRODUCT((ActiveDate<=End)*(InactiveDate>Start))

    This raises the question how typical users would believe the record for customer 17 (in row 18) should be handled. I still hold that most users would expect that record to be included as active because it was active on 7/25 (the End date) and most users would expect the end date to be inclusive. The formula above is for inclusive Start and End dates. If the End date were exclusive, the formula would become

    =SUMPRODUCT((ActiveDate<End)*(InactiveDate>Start))
  9. fzz: Concur that the formula should be inclusive rather than exclusive. The only reason my example was exclusive is because I already took my screen shot before I noticed it. It was easier to explain it than to take another screen shot. The epitome of laziness.

  10. Here’s another approach that I think works, using a very different approach:

    {=SUM(- -(IF(end-active_date>inactive_date-start,end-active_date,inactive_date-start)<(inactive_date-active_date+end-start)))}

    It’s built up from this:

    =MAX(end-active_date, inactive_date-start)>(inactive_date-active_date+end-start)

    . How this works: Say you’ve got a pair of pants, and a shirt. The pants are 80cm long, and the shirt is 60cm long (sorry, too lazy to convert to inches). If you put these on, and the distance from the top of the shirt to the bottom of the pants is less than 140 cm (80cm + 60cm), then your shirt and pants must overlap. If the distance is more than 140cm, then people can see your belly button.

    The bit of the formula MAX(end-active_date, inactive_date-start) acconts for the fact that you may put your pants on your top half and your shirt on your bottom half. Basically, it measures the distance from the highest bit of clothing to the lowest bit.

  11. I anted to use SUMPRODUCT IF to use this on a conditional formula. For example, I want to know out of that total count, how many pertain to “client 1” in the vent event that Client 1 had several entries. Any advice on this?

  12. If the name of the client is in B28

    =COUNTIF($B$2:$B$21,B28)-SUMPRODUCT(($B$2:$B$21=B28)*(($C$2:$C$21>D$25)+($D$2:$D$21<=C$25)))

  13. Here’s the data set I used. And Ajax counted to 3.

    Cust No Cust Name Active Date Inactive Date
    1 Monks Diner 8/1/2010 8/18/2010
    2 ZiffCorp 7/4/2010 7/26/2010
    3 Ajax 7/7/2010 8/2/2010
    4 Cogswell Cogs 7/18/2010 8/16/2010
    5 ZiffCorp 8/9/2010 8/19/2010
    6 ZiffCorp 7/21/2010 8/22/2010
    7 Ajax 7/31/2010 8/30/2010
    8 Cogswell Cogs 7/12/2010 8/19/2010
    9 Ajax 8/11/2010 8/25/2010
    10 Monks Diner 7/24/2010 8/26/2010
    11 Cogswell Cogs 7/21/2010 8/24/2010
    12 Monks Diner 7/26/2010 8/17/2010
    13 ZiffCorp 7/8/2010 7/14/2010
    14 ZiffCorp 7/28/2010 8/29/2010
    15 Ajax 7/15/2010 8/19/2010
    16 Monks Diner 8/13/2010 8/24/2010
    17 Ajax 7/25/2010 8/7/2010
    18 Cogswell Cogs 7/2/2010 7/24/2010
    19 Cogswell Cogs 7/11/2010 7/25/2010
    20 ZiffCorp 7/11/2010 8/23/2010
  14. Is there a way to total how many minutes overlapped? Rather than just count if they overlapped or not.

  15. You want a sum of all overlapping minutes? Or you want to count only those that overlapped a certain/minimum/maximum number of minutes?

  16. If column D is greater than D25, use D25, else use column D. If column C is less than C25, use C25, else use column C. Subtract the two. If the total is less than zero, use zero, else add up the result. Multiply by 24 hours in a day and 60 minutes in an hour.

    That’s one ugly formula.

    Also, it’s an array formula so you have to enter by press Ctrl+Shift+Enter, not just enter.

  17. Following up on the previous comments. I need a formula that sums the minutes that DO NOT overlap. Is this possible? I’ve been trying to adjust the previous formula you sent to do this but I can’t seem to get it working properly.

  18. What does it mean to not overlap? Not overlap with any other entry? Can you give a short example?

  19. Sorry for the lack of specificity! My goal is to sum the total number of minutes that a person is logged into computer OUTSIDE of their allotted time and bill them for those minutes. I have an array of start and end times that they are exempt from being billed because they are able to schedule times in advance and would not be billed for those. But anytime they are logged in outside of those times they had previously scheduled, they would be billed.

    For example, a person schedules the following times to use the computer in the month of December:
    12/1/2019 23:00 to 12/2/2019 08:00
    12/12/2019 11:00 to 12/12/2019 12:00

    And they actually use the computer during the following times:
    12/1/2019 23:45 to 12/2/2019 08:06 – they should be billed for 6 minutes of use
    12/3/2019 09:00 to 12/3/2019 09:30 – They should be billed for 30 minutes of use

    Hope this clears things up!

  20. Columns F:H make three separate calculations. Column J puts all three into one monster formula.

    F2: =SUMPRODUCT((D2<$A$2:$A$3)*(E2>$A$2:$A$3)*($A$2:$A$3-D2))*24*60

    This says if Actual Start is less than Scheduled Start and Actual End is greater than Scheduled Start, return the difference between Scheduled Start and Actual Start. This is billable minutes before the scheduled time.

    G2: =SUMPRODUCT((E2>$B$2:$B$3)*(D2<$B$2:$B$3)*(E2-$B$2:$B$3))*24*60

    This says if Actual End is greater than Scheduled End and Actual Start is less than Scheduled End, return the difference between Actual End and Scheduled End. This is billable minutes after the scheduled time.

    H2: =(SUMPRODUCT(((D2<$A$2:$A$3)*(E2<$A$2:$A$3))+((D2>$B$2:$B$3)*(E2>$B$2:$B$3)))-(COUNT($A$2:$A$3)-1))*(E2-D2)*24*60

    This says if Actual Start and End are before Scheduled Start OR Actual Start and End are after Scheduled End, count them up. If that count is the same number as schedules, then there is no overlap with any scheduled time, so add up the total minutes.

    Everything is multiplied by 24 hours per day and 60 minutes per hour to convert to minutes.

    The monster formula is:

    J2: =(SUMPRODUCT((D2<$A$2:$A$3)*(E2>$A$2:$A$3)*($A$2:$A$3-D2))+SUMPRODUCT((E2>$B$2:$B$3)*(D2<$B$2:$B$3)*(E2-$B$2:$B$3))+(SUMPRODUCT(((D2<$A$2:$A$3)*(E2<$A$2:$A$3))+((D2>$B$2:$B$3)*(E2>$B$2:$B$3)))-(COUNT($A$2:$A$3)-1))*(E2-D2))*24*60


Posting code? Use <pre> tags for VBA and <code> tags for inline.

Leave a Reply

Your email address will not be published.