Q: Combining columns based on group

I have a dataframe like this:

         POLY_KEY_I     SP1     SP2
0   FS01080100SM001  POAPRA  TOXRYD
1   FS01080100SM001     NaN     NaN
2   FS01080100SM001   OXRYD  SYMOCC
3   FS01080100SM001  EUPESU  POAPRA
4   FS01080100SM001  BOUGRA  KOEPYR
5   FS01080100SM002  POAPRA  EUPESU
6   FS01080100SM002  POAPRA     NaN
7   FS01080100SM002  POAPRA  KOEPYR

I want to group by POLY_KEY_I and then combine SP1 and SP2 into a single column within each group.

My desired output would be something like:

         POLY_KEY_I      SP
0   FS01080100SM001  POAPRA
1   FS01080100SM001  TOXRYD
2   FS01080100SM001     NaN
3   FS01080100SM001     NaN
4   FS01080100SM001   OXRYD
5   FS01080100SM001  SYMOCC
6   FS01080100SM001  EUPESU
7   FS01080100SM001  POAPRA
8   FS01080100SM001  BOUGRA
9   FS01080100SM001  KOEPYR 
10  FS01080100SM002  POAPRA
11  FS01080100SM002  EUPESU
12  FS01080100SM002  POAPRA
13  FS01080100SM002     NaN
14  FS01080100SM002  POAPRA
15  FS01080100SM002  KOEPYR

Answer 1:

You can use melt to reshape from wide to long, like this:

In [10]: pd.melt(df, id_vars='POLY_KEY_I', value_name='SP')
Out[10]: 
         POLY_KEY_I variable      SP
0   FS01080100SM001      SP1  POAPRA
1   FS01080100SM001      SP1     NaN
2   FS01080100SM001      SP1   OXRYD
3   FS01080100SM001      SP1  EUPESU
4   FS01080100SM001      SP1  BOUGRA
5   FS01080100SM002      SP1  POAPRA
6   FS01080100SM002      SP1  POAPRA
7   FS01080100SM002      SP1  POAPRA
8   FS01080100SM001      SP2  TOXRYD
9   FS01080100SM001      SP2     NaN
10  FS01080100SM001      SP2  SYMOCC
11  FS01080100SM001      SP2  POAPRA
12  FS01080100SM001      SP2  KOEPYR
13  FS01080100SM002      SP2  EUPESU
14  FS01080100SM002      SP2     NaN
15  FS01080100SM002      SP2  KOEPYR
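
Note that the melt output above lists every SP1 value before any SP2 value, while the desired output in the question interleaves SP1 and SP2 row by row. The following is a minimal sketch of one way to get that ordering, assuming the original row order should be kept: it carries the original index through the melt and sorts on it. The name `long_df` and the temporary `index` column are illustrative choices, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "POLY_KEY_I": ["FS01080100SM001"] * 5 + ["FS01080100SM002"] * 3,
    "SP1": ["POAPRA", np.nan, "OXRYD", "EUPESU", "BOUGRA",
            "POAPRA", "POAPRA", "POAPRA"],
    "SP2": ["TOXRYD", np.nan, "SYMOCC", "POAPRA", "KOEPYR",
            "EUPESU", np.nan, "KOEPYR"],
})

long_df = (
    df.reset_index()                          # keep the original row number as a column named "index"
      .melt(id_vars=["index", "POLY_KEY_I"],  # wide -> long; SP1/SP2 names land in "variable"
            value_name="SP")
      .sort_values(["index", "variable"])     # original row first, then SP1 before SP2
      .drop(columns=["index", "variable"])    # drop the helper columns
      .reset_index(drop=True)                 # renumber rows 0..15
)

print(long_df)
```

Sorting on the original row number rather than on POLY_KEY_I keeps the groups in the order they appear in the input, and NaN values are retained because melt does not drop them.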

python  pandas