This function acts as a drop-in replacement for the base rank() function with the added options to:
- Rank categorical factors based on frequency instead of alphabetically
- Rank in descending or ascending order
Usage
smartrank(
x,
sort_by = c("alphabetical", "frequency"),
desc = FALSE,
ties.method = "average",
na.last = TRUE,
verbose = TRUE
)
Arguments
- x
A numeric, character, or factor vector
- sort_by
Sort ranking either by "alphabetical" or "frequency". Default is "alphabetical".
- desc
A logical indicating whether the ranking should be in descending (TRUE) or ascending (FALSE) order. When input is numeric, ranking is always based on numeric order.
- ties.method
A character string specifying how ties are treated, see ‘Details’; can be abbreviated.
- na.last
A logical or character string controlling the treatment of NAs. If TRUE, missing values in the data are put last; if FALSE, they are put first; if NA, they are removed; if "keep", they are kept with rank NA.
- verbose
A logical flag controlling whether informational messages (such as those shown in the Examples) are printed.
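Since ties.method and na.last are described in the same terms as base rank(), a short base R sketch may help show what the options do. It only calls base rank(); that smartrank() passes these arguments through unchanged for numeric input is an assumption based on the "drop-in replacement" description, not something verified here.

```r
# Base R only: how ties.method changes the ranks of tied values
x <- c(10, 20, 10, 30)
rank(x, ties.method = "average")
#> [1] 1.5 3.0 1.5 4.0
rank(x, ties.method = "min")
#> [1] 1 3 1 4
rank(x, ties.method = "first")
#> [1] 1 3 2 4

# Base R only: how na.last places or drops missing values
y <- c(3, NA, 1)
rank(y, na.last = TRUE)     # NA ranked last
#> [1] 2 3 1
rank(y, na.last = FALSE)    # NA ranked first
#> [1] 3 1 2
rank(y, na.last = NA)       # NA removed from the result
#> [1] 2 1
rank(y, na.last = "keep")   # NA kept in place with rank NA
#> [1]  2 NA  1
```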
Note
When sort_by = "frequency", ties based on frequency are broken by alphabetical order of the terms.
When sort_by = "frequency" and input is character, ties.method is ignored. Each distinct element level gets its own rank, and each rank is 1 unit away from the next, irrespective of how many duplicates there are.
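The frequency rule described in this note can be sketched in base R. The helper below, rank_by_frequency(), is a hypothetical illustration and not the package's implementation: it orders the distinct values by count, breaks count ties alphabetically, and gives each distinct value consecutive integer ranks. Missing values and the ties.method, na.last, and verbose arguments are not handled.

```r
# Hypothetical helper (not part of the package): illustrates the rule above
rank_by_frequency <- function(x, desc = FALSE) {
  counts <- table(x)              # frequency of each distinct value
  values <- names(counts)
  freq <- as.integer(counts)
  # Primary key: frequency (negated for descending order);
  # secondary key: the value itself, so count ties break alphabetically
  ord <- order(if (desc) -freq else freq, values)
  ranks <- seq_along(values)      # one rank per distinct value, 1 unit apart
  names(ranks) <- values[ord]
  unname(ranks[as.character(x)])  # map each element back to its value's rank
}

fruits <- c("Apple", "Orange", "Apple", "Pear", "Orange")
rank_by_frequency(fruits)
#> [1] 2 3 2 1 3
rank_by_frequency(fruits, desc = TRUE)
#> [1] 1 2 1 3 2
```

Note that in the descending case only the frequency key is reversed; the alphabetical tie-break stays ascending, which is why "Apple" still ranks ahead of "Orange".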
Examples
## CATEGORICAL INPUT -----------------------
fruits <- c("Apple", "Orange", "Apple", "Pear", "Orange")
# rank alphabetically
smartrank(fruits)
#> [1] 1.5 3.5 1.5 5.0 3.5
# rank based on frequency
smartrank(fruits, sort_by = "frequency")
#> smartrank: Sorting a categorical variable by frequency: ignoring ties.method
#> [1] 2 3 2 1 3
# rank based on descending order of frequency
smartrank(fruits, sort_by = "frequency", desc = TRUE)
#> smartrank: Sorting a categorical variable by frequency: ignoring ties.method
#> [1] 1 2 1 3 2
## NUMERICAL INPUT -----------------------
# rank numerically
smartrank(c(1, 3, 2))
#> [1] 1 3 2
# rank numerically based on descending order
smartrank(c(1, 3, 2), desc = TRUE)
#> [1] 3 1 2
# always rank numerically, irrespective of sort_by
smartrank(c(1, 3, 2), sort_by = "frequency")
#> smartrank: Sorting a non-categorical variable. Ignoring `sort_by` and sorting numerically
#> [1] 1 3 2