Subjective Sorting

I have a list of objects. Oh, I don’t know, let’s say it’s the top 5 grossing movies from 1988.

I want to rank these movies by how much I like them. To do this, I want to compare only two movies at a time until they’re in the proper order. Any comparison sort should do the trick and I like QuickSort. With five items, I really don’t need to compare two at a time as I can easily do the comparison in my head. But, as usual, there’s more to the story. I want it to work if the list is 100 items. That’s not so easy to do in your head. It’s also not so easy to do two at a time because that’s a lot of comparing.

Another element to this is that I want to crowdsource it. I don’t want to make all the comparisons myself. Rather, I want to do a few comparisons and have other people do a few comparisons. In the end, I would have enough comparisons to see the groups preference. This is where QuickSort really falls down. It’s no good for only doing partial comparisons or for having more than one judgment in the mix. But it’s a worthy exercise, so I did it.

Sub RankQuickSort()
   
    Dim aItems() As Variant
    Dim rList As Range
    Dim rCell As Range
    Dim i As Long
   
    Set rList = wshList.Range("B1", wshList.Range("B1").End(xlDown))
   
    ReDim aItems(1 To rList.Cells.Count)
   
    For Each rCell In rList
        i = i + 1
        aItems(i) = rCell.Value
    Next rCell
   
    QuickSort aItems, LBound(aItems), UBound(aItems)
   
    For i = LBound(aItems) To UBound(aItems)
        Debug.Print aItems(i)
    Next i
   
End Sub

Public Sub QuickSort(ByRef vArray As Variant, lLow As Long, lHigh As Long)
     
  Dim sPivot As String
  Dim sSwap As String
  Dim lTmpLow As Long
  Dim lTmpHigh As Long
   
  lTmpLow = lLow
  lTmpHigh = lHigh
   
  sPivot = vArray((lLow + lHigh) \ 2)
 
  Do While lTmpLow <= lTmpHigh
 
     Do While (FirstIsLess(vArray(lTmpLow), sPivot) And lTmpLow < lHigh)
        lTmpLow = lTmpLow + 1
     Loop
     
     Do While (FirstIsLess(sPivot, vArray(lTmpHigh)) And lTmpHigh > lLow)
        lTmpHigh = lTmpHigh - 1
     Loop

     If lTmpLow < lTmpHigh Then
        sSwap = vArray(lTmpLow)
        vArray(lTmpLow) = vArray(lTmpHigh)
        vArray(lTmpHigh) = sSwap
    End If
   
    If lTmpLow <= lTmpHigh Then
        lTmpLow = lTmpLow + 1
        lTmpHigh = lTmpHigh - 1
     End If
 
  Loop
 
  If lLow < lTmpHigh Then QuickSort vArray, lLow, lTmpHigh
  If lTmpLow < lHigh Then QuickSort vArray, lTmpLow, lHigh
 
End Sub

Public Function FirstIsLess(sFirst, sLast) As Boolean
   
    Dim lResp As Long
   
    If sFirst = sLast Then
        lResp = vbNo
    Else
        lResp = MsgBox(sFirst & " is better than " & sLast, vbYesNo)
    End If
   
    FirstIsLess = (lResp = vbYes)
   
End Function

Sub RankQuickSort()

Dim aItems() As Variant

Dim rList As Range

Dim rCell As Range

Dim i As Long

Set rList = wshList.Range("B1", wshList.Range("B1").End(xlDown))

ReDim aItems(1 To rList.Cells.Count)

For Each rCell In rList

i = i + 1

aItems(i) = rCell.Value

Next rCell

QuickSort aItems, LBound(aItems), UBound(aItems)

For i = LBound(aItems) To UBound(aItems)

Debug.Print aItems(i)

Next i

End Sub

Public Sub QuickSort(ByRef vArray As Variant, lLow As Long, lHigh As Long)

Dim sPivot As String

Dim sSwap As String

Dim lTmpLow As Long

Dim lTmpHigh As Long

lTmpLow = lLow

lTmpHigh = lHigh

sPivot = vArray((lLow + lHigh) \ 2)

Do While lTmpLow <= lTmpHigh

Do While (FirstIsLess(vArray(lTmpLow), sPivot) And lTmpLow < lHigh)

lTmpLow = lTmpLow + 1

Loop

Do While (FirstIsLess(sPivot, vArray(lTmpHigh)) And lTmpHigh > lLow)

lTmpHigh = lTmpHigh - 1

Loop

If lTmpLow < lTmpHigh Then

sSwap = vArray(lTmpLow)

vArray(lTmpLow) = vArray(lTmpHigh)

vArray(lTmpHigh) = sSwap

End If

If lTmpLow <= lTmpHigh Then

lTmpLow = lTmpLow + 1

lTmpHigh = lTmpHigh - 1

End If

Loop

If lLow < lTmpHigh Then QuickSort vArray, lLow, lTmpHigh

If lTmpLow < lHigh Then QuickSort vArray, lTmpLow, lHigh

End Sub

Public Function FirstIsLess(sFirst, sLast) As Boolean

Dim lResp As Long

If sFirst = sLast Then

lResp = vbNo

Else

lResp = MsgBox(sFirst & " is better than " & sLast, vbYesNo)

End If

FirstIsLess = (lResp = vbYes)

End Function

RankQuickSort is the starting procedure where I fill the array and kick off the recursive QuickSort procedure. QuickSort is a recursive procedure that sorts the array. The difference between this procedure and a normal QuickSort procedure is that the comparison is made by a human. Instead of code like

sPivot < vArray(lTmpHigh)

1	sPivot < vArray(lTmpHigh)

, the FirstIsLess procedure is called that presents a message box for the user to decide which is less and which is more.

That works well because I’m the only one judging and there are only a small number of judgments to make. As the list grows and I want to distribute the work to more people, I don’t think QuickSort is the answer. Even with one person, there is a possibility for inconsistency. I could say that Coming to America is better than Twins, Twins is better than Big, and Big is better than Coming to America. I need a method to avoid that.

I thought a “voting” system would work. Instead of swapping one list item for the other like in a comparison sort, I would simply award points to the better one and sort on the points. As it turned out, and as I’ll explain in the next post, it takes far too many comparisons for that to work reliably.

You can download VoteRank.zip

8 thoughts on “Subjective Sorting”

Hmmm… did somebody just see “The Social Network”?

What you invented, Dick, is a technique commonly known as Pairwise Comparisons.

http://en.wikipedia.org/wiki/Pairwise_comparison

Taking the example of movies, have you seen http://www.flickchart.com ? It reranks after each pairwise comparison – when you rate a film above one that’s higher in your current list then the preferred one is promoted to be one place above the other. That’s all it appears to do (or appeared to when I looked at it) and I hate it.

Actually, I hated it so much that I seriously contemplated writing an alternative that would apply something like Elo ratings. (http://en.wikipedia.org/wiki/Elo_rating_system). Then I woke up and got sensible.

No sorting is possible if Coming to America > Twins, Twins > Big, Big > Coming to America. The set wouldn’t be well-ordered.

If you have a lot of values, A..Z with A > B, B > C and C > A, then if the first pivot value were B, values A and C would be partitioned into different subsets. A and C will NEVER be compared to the other, and the values other than B in the same subset as either A or C will NEVER be compared to C or A, respectively.

Set partitioning sort algorithms like quicksort will never have a problem with lack of transitive pairwise ordering because any pair of values will be compared at most once, and many pairs of values will never be compared. The final ordering depends on the exact sequence of pairwise comparisons, which depends on the original ordering and the process used to select pivot values.

If you award points and quicksort on points, then the EXPECTED number of comparisons is N log(N), but pathologies in the data (original ordering and pivot value selection) could result in N^2 comparisons. However, no commonly used sort algorithm will ever need more than N^2 comparisons. Not all that much for a computer to handle if N = 100 or even N = 200.

What if you just had people compare between two items at a time like when you go to the eye doctor? A or B?, B or C? Ect.

This way you would only need to know which is better between two items (a much easier decision for people to make). Decision making would also be sped up and you could compile more user comparisons.

From here it would be simple enough to rank the overall scores (i.e. how many people chose A when presented verses how many people chose C wehn presented).

The only issue might be when items tie.

Whatever it’s called, you’ve clearly chosen the best movie.

What if you build a sorted list, one movie at a time, using a binary search to find the correct position of each successive movie? Similar to QuickSort’s divide-and-conquer approach, you keep paring away half of the list until you find where it goes.

As an example, let’s say you’ve got 9 movies ranked so far, and you want to find out where Movie10 goes. Take the middle of the list, Movie5, and compare it to Movie10. Let’s say Movie10 is better. Pick the middle of Movies 1-4 and compare. Let’s say Movie10 is worse than Movie3. The middle of 3 and 5 is 4. Now we just need to compare Moviie10 to Movie4. If it’s better, it gets the new 4-spot. If it’s worse, it gets the new 5-spot. A collection object might work best for this, as it allows insertion before or after items.

The nice thing about this is that to add a new item to the list, you’ll only need around Log2(N) comparisons, where N is the number of items in the current list. For example, if your list is 50 items long, it would take about 6 comparisons to find where it goes. To build a whole list of 50 items, you’d need about 237 total comparisons.

Does this post come with subtitles?

Posting code? Use <pre> tags for VBA and <code> tags for inline.

Daily Dose of Excel

Haphazardly Posted Excel Information and Other Stuff

8 thoughts on “Subjective Sorting”

Leave a Reply