# Q: Convert a factor to integers based on frequency in R

Given a factor vector:

``````
vec <- factor(c('a','b','b','c','b','c'))
vec
# [1] a b b c b c
# Levels: a b c
``````

I would like to get a new vector:

``````
vec_new
# [1] 3 1 1 2 1 2
``````

The level with the highest frequency should be converted to the smallest integer. Any help is appreciated, thank you.

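``````
x2 <- rev(sort(table(x)))           # counts, largest to smallest
names(x2) <- names(sort(table(x)))  # relabel with the smallest-to-largest names
levels(x) <- x2[order(names(x2))]   # assign new labels in the order of the level names
x
# [1] 3 1 1 2 1 2
# Levels: 3 1 2
``````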

We first sort the frequency table and reverse it with rev(sort(table(x))), which gives the counts from largest to smallest. Next we rename that vector with the names of the smallest-to-largest frequency table, so that the least frequent level is paired with the largest count. Lastly, we assign the new levels in the order of the level names. The result is still a factor; as.integer(as.character(x)) turns it into a plain integer vector.
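Note that the code above uses the counts themselves as labels, and these coincide with the desired ranks only because the frequencies in this example happen to be 1, 2 and 3. A minimal sketch of a variant for arbitrary frequencies follows, assuming ties can be broken in favour of the level that comes first; the helper name `freq_rank` is made up for illustration and is not part of the original answers.

```r
# Hypothetical helper: map each element to the frequency rank of its level,
# with 1 for the most frequent level.
freq_rank <- function(f) {
  f <- factor(f)
  counts <- as.vector(table(f))                  # frequency of each level
  # Rank levels by descending count; ties go to the level that comes first.
  ranks <- rank(-counts, ties.method = "first")
  as.integer(ranks)[as.integer(f)]               # look up each element's rank
}

vec <- factor(c('a','b','b','c','b','c'))
freq_rank(vec)
# [1] 3 1 1 2 1 2

# Counts of 5, 2 and 1 still map to the ranks 1, 2 and 3
freq_rank(rep(c('x','y','z'), times = c(5, 2, 1)))
# [1] 1 1 1 1 1 2 2 3
```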

Another option courtesy of @RichardScriven:

``````
s <- sort(table(x))
x <- factor(vec, labels = rev(s), levels = names(s))
``````

Data

``````
vec <- letters[c(1,2,2,3,2,3)]
x <- factor(vec)
x
# [1] a b b c b c
# Levels: a b c
``````

Not sure if there's a more efficient approach, but you can find out how frequently the different levels of the factor occur with table(vec), and then manually set the order of the levels with factor(vec, levels = c("b", "c", "a")). Calling as.integer() on the result gives the desired codes.
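A short sketch of this manual approach, using the `vec` from the question. The variable names `vec_ordered`, `vec_auto` and `vec_new` are only illustrative, and the last lines show one possible way to derive the level order from the table instead of typing it by hand.

```r
vec <- factor(c('a','b','b','c','b','c'))

table(vec)   # counts: a = 1, b = 3, c = 2

# Rebuild the factor with the levels listed from most to least frequent.
# Assigning to levels(vec) directly would rename the levels, not reorder them.
vec_ordered <- factor(vec, levels = c("b", "c", "a"))

# Integer codes follow the level order
vec_new <- as.integer(vec_ordered)
vec_new
# [1] 3 1 1 2 1 2

# The same level order derived from the frequency table
vec_auto <- factor(vec, levels = names(sort(table(vec), decreasing = TRUE)))
as.integer(vec_auto)
# [1] 3 1 1 2 1 2
```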

Just to throw in another one-liner:

``````
as.numeric(reorder(vec, -ave(as.numeric(vec), vec, FUN = length)))
# [1] 3 1 1 2 1 2
``````

First, ave computes, for every element, the frequency of its level; the result is negated so that the ordering comes out right afterwards. Then reorder calculates the mean of -ave(.) for each level and sorts the factor levels by it in increasing order (that's why we used -ave(.)). Finally, the factor is converted to numeric.
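To make the three steps easier to follow, the one-liner can be unpacked as below. This is only an illustrative breakdown, with `vec` as the factor from the question; the intermediate names `freq` and `reordered` are not part of the original answer.

```r
vec <- factor(c('a','b','b','c','b','c'))

# Step 1: for each element, the frequency of its own level
freq <- ave(as.numeric(vec), vec, FUN = length)
freq
# [1] 1 3 3 2 3 2

# Step 2: sort the levels by the mean of -freq (increasing),
# which puts the most frequent level first
reordered <- reorder(vec, -freq)
levels(reordered)
# [1] "b" "c" "a"

# Step 3: the integer codes of the reordered factor
as.numeric(reordered)
# [1] 3 1 1 2 1 2
```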
