Filter and Sort Classes

Following up on Finding Comparable Objects and Filling Classes and Finding Specific Instances, today we look at filtering collection classes and putting the results into a new instance and sorting collection classes. For reference, here’s the controlling sub:

Sub GetComperablesByState()
   
    Dim clsCompanies As CCompanies
    Dim clsWisky As CCompanies
    Dim vaOutput As Variant
    Dim clsFocus As CCompany
   
    ‘Fill all companies
   Set clsCompanies = New CCompanies
    clsCompanies.Fill Sheet1.Range(“A2:D201”)
   
    ‘identify the one I care about
   Set clsFocus = clsCompanies.FindByName(“Monarch”)
   
    ‘get only those companies in the same state
   Set clsWisky = clsCompanies.FilterByState(clsFocus.State)
   
    ‘sort them by sales
   clsWisky.SortBySales
   
    ‘get an array of the 3 below and 3 above
   vaOutput = clsWisky.WriteComparables(clsFocus, 3)
   
    ‘write it to a range
   Sheet1.Range(“G1”).Resize(UBound(vaOutput, 1), UBound(vaOutput, 2)).Value = vaOutput
   
End Sub

The FilterByState property takes a subset of clsCompanies (all the companies) and puts them into a different instance of CCompanies called clsWisky. FilterByState takes one String argument for the state on which to filter.

Public Property Get FilterByState(sState As String) As CCompanies
   
    Dim clsReturn As CCompanies
    Dim clsCompany As CCompany
   
    Set clsReturn = New CCompanies
   
    For Each clsCompany In Me
        If clsCompany.State = sState Then
            clsReturn.Add clsCompany
        End If
    Next clsCompany
   
    Set FilterByState = clsReturn
   
End Property

There’s nothing too complicated about this one. First, a new CCompanies variable is created (clsReturn) that will hold all the CCompany instances that have the correct State property. I loop through all the companies and compare the State property to my argument. If there’s a match, that CCompany instance is added to clsReturn. At the end, clsReturn is assigned to FilterByState (the property name) so that it is returned to the calling procedure. Back in the controlling procedure clsWisky will consist of the eight CCompany instances that have a State property of “WI” (the State that Monarch is in).

Next I sort clsWisky by the Sales property.

Public Sub SortBySales(Optional lLow As Long, Optional lHigh As Long)
         
          Dim clsPivot As CCompany
          Dim clsSwapLow As CCompany, clsSwapHigh As CCompany
          Dim lTempLow As Long
          Dim lTempHigh As Long
           
10        If lLow = 0 Then lLow = 1
20        If lHigh = 0 Then lHigh = Me.Count
         
30        lTempLow = lLow
40        lTempHigh = lHigh
             
50        Set clsPivot = Me.Company((lTempLow + lTempHigh) 2)
         
60        Do While lTempLow < = lTempHigh
         
70            Do While Me.Company(lTempLow).Sales < clsPivot.Sales And lTempLow < lHigh
80                lTempLow = lTempLow + 1
90            Loop
             
100           Do While (clsPivot.Sales < Me.Company(lTempHigh).Sales And lTempHigh > lLow)
110               lTempHigh = lTempHigh – 1
120           Loop
             
130           If lTempLow < lTempHigh Then
140               Set clsSwapLow = Me.Company(lTempLow)
150               Set clsSwapHigh = Me.Company(lTempHigh)
160               mcolCompanies.Remove lTempHigh
170               mcolCompanies.Remove lTempLow
180               If lTempLow > Me.Count Then
190                   mcolCompanies.Add clsSwapHigh, CStr(clsSwapHigh.CompanyID)
200               Else
210                   mcolCompanies.Add clsSwapHigh, CStr(clsSwapHigh.CompanyID), lTempLow
220               End If
230               mcolCompanies.Add clsSwapLow, CStr(clsSwapLow.CompanyID), , lTempHigh – 1
240           End If
         
250           If lTempLow < = lTempHigh Then
260               lTempLow = lTempLow + 1
270               lTempHigh = lTempHigh – 1
280           End If
             
290       Loop
         
300       If (lLow < lTempHigh) Then Me.SortBySales lLow, lTempHigh
310       If (lTempLow < lHigh) Then Me.SortBySales lTempLow, lHigh

End Sub

I started with this QuickSort algorithm and adapted it to my needs. I’ll be the first to admit that sorting beyond a bubble sort takes a lot of brain cycles for me. QuickSort is recursive, i.e. it calls itself. It works by finding the middle element (the pivot) of a list (a collection in our case) and putting all lesser values before the pivot and all greater values after it. After the first pass, the collection is treated as two lists: the part that’s less than the pivot and the part that’s more. Each of those two subsets are then sorted individually in the same manner.

For example, given the list {5,3,2,1,4}, the sort goes like this:

1. {1,2,4,5,3} The middle element (2) is found and every element that is greater is put after it and every element that is less is put before it. Going through the elements, 5 is greater so put after, 3 is greater so put after, 1 is less so put before, 4 is already after so leave it. Now we have two subsets that need to be sorted individually {1} and {4,5,3}.

1.1 {1} One element so don’t do anything

1.2 {4,3,5} Find the middle element (5). Four is less so leave it before 5. Three is less, so move it before 5. That leaves only a list before 5 {4,3} and nothing after.

1.2.1 {3,4} The “middle” element was 4. Three is less, so move it before 4.

By continually dividing my list and sorting the subsets, I get a fully sorted list at the end. I’m not sure this graphic really helps, but here it is anyway.

Back to the SortBySales method. If no lLow and lHigh are passed into the method (the first pass), then they are set to 1 and Me.Count, respectively representing the whole list (lines 10 and 20). In line 50, I use integer division to find the middle element. In lines 70-90, I start looping through the elements and find the first element whose sales are greater than clsPivot’s. I do the same thing in 100-120 to find the first element whose sales are less. Assuming I find two elements, those two are swapped putting the lesser before clsPivot and the greater after.

To swap objects in a collection, I store the two elements in variables (140-150), remove them both from the collection (160-170), then add the stored variables back into the collection in the proper place (180-230). In 250-280, I move my TempLow one to the right and TempHigh one to the left. TempLow will be the lower bound of my “greater than” subset (Pass 1.2) and TempHigh will be the upper bound of my “less than” subset (Pass 1.1). Finally (300-310), I run the procedure again with my new boundaries and keep going until TempHigh is as low as it can go and TempLow is as high as it can go.

At this point in my controlling procedure, I have the following variables initialized:
clsCompanies – a CCompanies instance containing all of the companies in the same order as they are on the spreadsheet
clsFocus – a CCompany instance containing the first company with “Monarch” in the name
clsWisky – a CCompanies instance containing all companies with a State property of WI, sorted by the Sales property

In the final installment, I’ll create an array from clsWisky and write it to a range.

You can download SortFilterClass.zip

Posted in Uncategorized

8 thoughts on “Filter and Sort Classes

  1. Thank you once again for a very helpful and timely post. Your posts seem to arrive right in time for me to make use of them! I adapted your sort algorithm to sort a class I have which contains roughly 250 items. Unfortunately, when I run it, I get the run time error 28 that I am out of stack space. I have 8 gigs of ram on my computer, so it boggles my mind that I’d be out of anything! My reading online seems to indicate it may be due to recursive functions such as this one making too many calls.

    Is there some fix to this? Or at least some guidelines as to how many items I can use this function on before running into this problem?

    Thanks so much!
    Michael

  2. Wow 250? That’s less than I thought. Memory isn’t necessarily the problem. First, Excel can only address so much (2GB?) so you can add all you want and it won’t help. Since QuickSort is recursive, Excel is holding every function call (and all the objects) in memory until it’s completely done. If you’re QuickSorting integers, you can sort a lot more items than if you’re doing classes because integers take up so much less space.

    You should probably do a BubbleSort, which is not recursive. The irony is that BubbleSort is slower than QuickSort. So QuickSort is better for larger data sets – except that you run out of stack space. Here’s a BubbleSort algorithm that you can adapt

    Public Sub BubbleSortBySales()
       
        Dim i As Long, j As Long
        Dim clsTemp As CCompany
       
        For i = 1 To Me.Count – 1
            For j = i To Me.Count
                If Me.Company(j).Sales < Me.Company(i).Sales Then
                    Set clsTemp = Me.Company(j)
                    mcolCompanies.Remove (j)
                    mcolCompanies.Add clsTemp, CStr(clsTemp.CompanyID), i
                End If
            Next j
        Next i
       
    End Sub
  3. Dick,
    Great series of posts. I’ve been able to immediately put to work many of the object-oriented aspects I’ve been learning from you and Rob. Objects are making my programming life much better. Many thanks from a grateful reader.

    Michael,
    I tested Dick’s code with 1000 companies and ran it without a problem. I’m guessing that you’ve got a bug in your code which is preventing the recursive calls from properly terminating. I believe that the maximum size of the stack should be around log2(Count). So with 200 companies the maximum stack size should be roughly 8. I don’t know just how big Excel’s maximum stack size is, but I do believe it’s well over a thousand. Memory shouldn’t be an issue, since the objects are sorted in place and you really only need extra memory for the temp object.

    Hope this helps,
    Josh

  4. Scratch what I said about the maximum stack size of 8 – not sure what I was thinking. In any case, you still shouldn’t be anywhere near Excel’s recursion limit.

    Josh

  5. Thanks JoshG. I just generated 1000 entries in the example workbook and ran this code

    Sub testsort()
       
        Dim clsComps As CCompanies
        Dim clsComp As CCompany
       
        Set clsComps = New CCompanies
        clsComps.Fill Sheet1.Range(“A2:D1001”)

        clsComps.SortBySales
       
        For Each clsComp In clsComps
            Debug.Print clsComp.CompanyID, clsComp.Sales
        Next clsComp
       
    End Sub

    and got no error. Micheal, if you want to send me what you have at dkusleika@gmail.com, I’ll be happy to take a look at it.

  6. Consider using an ADO recordset as a container object, which has Filter and Sort methods built in and support filtering/sorting on multiple attributes too.

  7. Sorry for my slow reply after all of your very helpful and quick responses. I’ve figured out a bit of my problem. I am not trying to sort the company classes he very properly created. Instead I am applying this to a personal project I’ve been working on. My classes have filled the stack before (and I forgot until re-reading your comments!) because they are a bit recursive in their instantiation. So simply creating one of my classes adds quite a bit to the stack. I am quite certain it is a design flaw of my project, but I do not know how to get around it.

    Here are how my classes are structured. I am coding a project relating to class schedules for our department (I’m a mathematics professor and not a coder–I know, I should stick to my area of real expertise. But this is a fun hobby). I have a main class called “Schedule.” It’s members are collections of Instructors, Courses, and few other pieces of information like a calendar and such. So the Schedule class creates collections of Instructors and Courses. When it initializes the Course classes, it (surprise!) assigns an Instructor to the course. So the Course class has an Instructor collection as a member. And I add the Instructor to that collection. But guess what? The Instructor teaches the Course, so the Instructor has as a member a collection of Courses the Instructor is teaching. So I add the Course to the collection of Courses for the Instructor. So do you see my predicament? The Course has an Instructor which has Courses which have Instructors who have Courses and so on… So in the end I “could” write expressions like

    theSchedule.Instructors(1).Courses(1).Instructors(1).Courses(1).OtherCourseProperty

    Now I never reference that deeply. But I do on occasion do one of the following

    theSchedule.Instructors(1).Courses(1).OtherCourseProperty
    theSchedule.Courses(1).Instructors(1).OtherInstructorProperty

    So I suspect the root of my stack overflow isn’t the sorting, but the instantiation and structure of my classes being too recursive. Any advice here?

    Thanks again for your very helpful feedback and blog. It is wonderfully written for even an outsider like me to benefit from!

    Michael


Posting code? Use <pre> tags for VBA and <code> tags for inline.

Leave a Reply

Your email address will not be published.