Ranking Values in the Wilcoxon Signed Rank Test

I’m working on a demonstration activity for the Wilcoxon Signed Rank Test and I’m stuck on a problem of ranking values in a table. For a list of values, say L = [5, 8, 13, 4, 7], I want to create a matching column ranking their distance from zero, which in the example would be R = [2, 4, 5, 1, 3] (i.e. 4 is closest to zero, then 5, then 7, and so on). I had hoped that “sort([1...5], L)” would give R, but it actually produces [4,1,5,2,3], which lists the positions in L in rank order (i.e. the smallest number is in position 4 in L, the second smallest in position 1, and so on). It feels as if I’m close, but I’m not sure what to try next to produce the column I need. Can anyone nudge me in the right direction?
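For readers who want to see the distinction outside the calculator, here is a minimal Python sketch (not the calculator's own syntax) of the two lists described above. The names `order` and `ranks` are made up for illustration; `order` is the list of positions in ascending order of value, and `ranks` is the inverse of that permutation.

```python
L = [5, 8, 13, 4, 7]

# 1-based positions of L's elements, listed in ascending order of value.
# This is the list the question reports getting: [4, 1, 5, 2, 3].
order = sorted(range(1, len(L) + 1), key=lambda i: L[i - 1])

# The rank of each element is the inverse permutation:
# the element at position `pos` receives rank `rank`.
ranks = [0] * len(L)
for rank, pos in enumerate(order, start=1):
    ranks[pos - 1] = rank

print(order)  # [4, 1, 5, 2, 3]
print(ranks)  # [2, 4, 5, 1, 3]
```

The sketch assumes all values are distinct; tied values would receive consecutive ranks here.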

Someone may be able to spot something that we’re both missing, but in the meantime you can achieve what you want by running two sorts - so


should do the trick.

Many thanks for the rapid response @pirsquared. Unfortunately, when I apply the double sort to the example L = [5, 8, 13, 4, 7], the result is [2, 4, 3, 1, 5]. That seems close to the expected result of [2, 4, 5, 1, 3] but that may be a coincidence, as the difference is more pronounced when I use a larger data set. It does feel that something like that should work?

Ah, yes! Sorry, didn’t spot that.

Again, someone might have a more elegant solution but I think this works:

R=[[1...length(L)][sort(L)=i][1] for i=L]
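As a rough cross-check, the same idea can be sketched in Python: for each value in L, take the first (1-based) position at which it appears in the sorted list. The function name `first_position_ranks` is hypothetical, and this is only an analogue of the expression above, not a translation the calculator can run.

```python
def first_position_ranks(values):
    """Rank each value by its first 1-based position in the sorted list."""
    ordered = sorted(values)
    # list.index returns the first match, so tied values share the lowest rank.
    return [ordered.index(v) + 1 for v in values]

print(first_position_ranks([5, 8, 13, 4, 7]))  # [2, 4, 5, 1, 3]
```

Because the first matching position is used, duplicated values all receive the same (lowest) rank.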

Excellent, that works! I still have a bit more adjustment to do for the Wilcoxon ranking (exclude zero values and rank identical values in a different way), which I’ll try myself, but may be back to tap further into your expertise if that’s okay. In the meantime, again thanks!

…Sorry @pirsquared, I’m back! I was rather hoping that I knew enough to adjust your solution

R=[[1...length(L)][sort(L)=i][1] for i=L]

to the form required for the Wilcoxon test, but alas that hasn’t worked out. Applying your ranking to L=[8,13,13,15,19,19,19] produces R=[1,2,2,4,5,5,5], but with Wilcoxon the required values would be R=[1,2.5,2.5,4,6,6,6], i.e. where there are duplicates the ranking is “r+(n-1)*0.5”, where “r” is the original rank value and “n” is the number of values with that ranking. I can identify the ranking duplicates and their frequency, but I can’t work out how to replace the rankings in the original list with their revised values. Happy to be directed to a tutorial on this material to improve my understanding!
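To make the tie adjustment concrete, here is a small Python sketch of the “r+(n-1)*0.5” rule, where r is the lowest rank a value would get and n is how many times it occurs. The function name `average_ranks` is made up for illustration, and this is not the calculator expression the thread is after.

```python
from collections import Counter

def average_ranks(values):
    """Assign tied values the mean of the ranks they jointly occupy."""
    ordered = sorted(values)
    counts = Counter(values)  # n: how many times each value occurs
    # r is the first 1-based position of the value in the sorted list;
    # r + (n - 1) * 0.5 is the average of the ranks r, r+1, ..., r+n-1.
    return [ordered.index(v) + 1 + (counts[v] - 1) * 0.5 for v in values]

print(average_ranks([8, 13, 13, 15, 19, 19, 19]))
# [1.0, 2.5, 2.5, 4.0, 6.0, 6.0, 6.0]
```

For a full signed rank test the input would presumably be the absolute differences with zeros removed first; that step is left out here.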

Here’s a workaround:


Many thanks Daniel, that looks perfect! …and such a rapid response!! Hopefully, I’ll be able to finish this activity without further pleas for help, but let’s see. :wink: