Abstract. Ranking methods are fundamental tools in data analysis, applied across recommendation systems, social choice, and decision-making. While methods such as HodgeRank leverage topological properties of preference data, their empirical performance on real-world cyclic datasets remains understudied. This paper presents a comprehensive comparison of four ranking methods—HodgeRank, Borda Count, Bradley-Terry, and Spectral Ranking—across three datasets with varying cyclicity levels (55% to 96%). Performance is evaluated using pairwise accuracy (PA); stability is validated via 5-fold cross-validation; statistical significance of differences is assessed via McNemar's test with Bonferroni correction; and top-k agreement is quantified using Jaccard similarity J and Kendall's τ. Key findings reveal a non-monotonic relationship between cyclicity and HodgeRank performance: PA peaks at moderate cyclicity (0.851 at β₁ = 82%) but degrades at both low (0.574 at 55%) and high (0.791 at 96%) extremes. Bradley-Terry achieves the highest average accuracy (0.767) with strong cross-validation stability. On highly cyclic data (SUSHI3), three methods—Bradley-Terry, Borda, and Spectral—reach perfect top-10 consensus (J = 1.000), while HodgeRank diverges from all three (J = 0.250, τ = −0.283), a difference statistically confirmed (McNemar χ² = 119.4, p < 0.001) despite aggregate PA differing by only 5.8 percentage points. These results demonstrate that aggregate accuracy metrics alone are insufficient for deployment decisions in ranking systems.
Keywords: ranking methods, HodgeRank, cyclicity, preference aggregation, pairwise comparison, Bradley-Terry model, Borda Count, combinatorial Hodge theory.