Q:Convert factor into integer based on frequency in R

I have a factor vector:

vec <- factor(c('a','b','b','c','b','c'))
[1] a b b c b c
Levels: a b c

I would expect a new vector:

vec_new
[1] 3 1 1 2 1 2

The level with the higher frequency should be converted to the smaller integer. Any help is appreciated, thank you.

answer1:
x2 <- rev(sort(table(x)))
names(x2) <- names(sort(table(x)))
levels(x) <- x2[order(names(x2))]
x
[1] 3 1 1 2 1 2
Levels: 3 1 2

We first build the frequency table sorted from largest to smallest with rev(sort(table(x))). Next we overwrite its names with the names of the table sorted from smallest to largest, so the least frequent level is paired with the largest value and the most frequent level with the smallest. Lastly, we index by order(names(x2)) to put the values back in the alphabetical order of the levels of x, and assign them as the new levels.
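The intermediate objects can be inspected one step at a time. This is only a sketch of the same steps; the names `counts_inc`, `new_levels` and `vec_new` are illustrative and not from the answer. It also shows one way to get a plain integer vector, since the result of the steps above is still a factor:

```r
vec <- letters[c(1, 2, 2, 3, 2, 3)]
x <- factor(vec)

counts_inc <- sort(table(x))        # increasing counts: a = 1, c = 2, b = 3
x2 <- rev(counts_inc)               # decreasing counts: b = 3, c = 2, a = 1
names(x2) <- names(counts_inc)      # re-paired:         a = 3, c = 2, b = 1
new_levels <- x2[order(names(x2))]  # alphabetical:      a = 3, b = 1, c = 2
levels(x) <- new_levels             # relabel a, b, c as "3", "1", "2"

# x is still a factor; go through as.character() to recover the labels,
# because as.integer(x) alone would return the internal codes 1 2 2 3 2 3.
vec_new <- as.integer(as.character(x))
vec_new
# [1] 3 1 1 2 1 2
```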

Another option courtesy of @RichardScriven:

s <- sort(table(x)) 
x <- factor(vec, labels = rev(s), levels = names(s))
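Both snippets above take the new labels from the counts themselves, which coincide with the ranks 1, 2, 3 only because the counts in this example happen to be 1, 2 and 3. Below is a minimal sketch of a variant that labels by rank instead, assuming that is the intent. The sample data (counts 1, 4, 3) and the names `f` and `vec_new` are illustrative and not from the original answer:

```r
# Illustrative data whose counts are not simply 1, 2, 3: a = 1, b = 4, c = 3
vec <- c("a", "b", "b", "c", "b", "c", "b", "c")

s <- sort(table(vec))  # increasing frequency: a, c, b

# Pair the levels (least to most frequent) with ranks counted down from
# length(s), so the most frequent level gets the label 1.
f <- factor(vec, levels = names(s), labels = rev(seq_along(s)))

vec_new <- as.integer(as.character(f))
vec_new
# [1] 3 1 1 2 1 2 1 2
```

How ties between equally frequent levels are broken depends on the ordering returned by sort().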

Data

vec <- letters[c(1,2,2,3,2,3)]
x <- factor(vec)
x
[1] a b b c b c
Levels: a b c

answer2:

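Not sure if there's a more efficient approach, but you can find out how frequently the different levels of the factor occur with table(vec), and then manually reorder the levels by decreasing frequency with vec <- factor(vec, levels = c("b", "c", "a")), after which as.integer(vec) gives the desired codes. (Assigning levels(vec) <- c("b", "c", "a") directly would relabel the existing levels rather than reorder them.)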
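The manual step can also be derived from the table itself. This is a sketch, assuming the reordering is done with factor(x, levels = ...); the names `freq`, `ordered_levels`, `vec_reordered` and `vec_new` are illustrative:

```r
vec <- factor(c("a", "b", "b", "c", "b", "c"))

freq <- table(vec)                                      # a = 1, b = 3, c = 2
ordered_levels <- names(sort(freq, decreasing = TRUE))  # "b" "c" "a"

# factor(x, levels = ...) changes the order of the levels and keeps each
# element's value; `levels(x) <- ...` would rename the levels instead.
vec_reordered <- factor(vec, levels = ordered_levels)

# The integer codes now follow the frequency order.
vec_new <- as.integer(vec_reordered)
vec_new
# [1] 3 1 1 2 1 2
```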

answer3:

Just to throw in another one-liner:

as.numeric(reorder(vec, -ave(as.numeric(vec), vec, FUN = length)))
# [1] 3 1 1 2 1 2

First, you calculate the frequency of each factor level with ave (negated, so that the ordering comes out right afterwards), then you reorder the factor levels with reorder. The latter computes the mean of -ave(.) for each level and re-sorts the factor levels in increasing order of that mean (that's why we used -ave(.)). Finally, as.numeric returns the integer codes of the reordered factor.
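Split into its three steps, the one-liner looks roughly like this. It is a sketch on the question's data; the names `neg_freq`, `reordered` and `vec_new` are illustrative:

```r
vec <- factor(c("a", "b", "b", "c", "b", "c"))

# Step 1: per-element group size, negated so that frequent levels sort first.
neg_freq <- -ave(as.numeric(vec), vec, FUN = length)
neg_freq
# [1] -1 -3 -3 -2 -3 -2

# Step 2: reorder() sorts the levels by the mean of neg_freq within each level.
reordered <- reorder(vec, neg_freq)
levels(reordered)
# [1] "b" "c" "a"

# Step 3: the integer codes of the reordered factor are the result.
vec_new <- as.numeric(reordered)
vec_new
# [1] 3 1 1 2 1 2
```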

r