Q:How to go from relative dates to absolute dates in DataFrame columns

I have a pandas DataFrame containing forward prices for future maturities, quoted on multiple different trading months ('trade date'). Trade dates are given in absolute terms ('January'). The maturities are given in relative terms ('M+1').

How can I convert the maturities into an absolute format? For example, for trade date 'January' the maturity 'M+1' should say 'February'.

Here is example data:

import pandas as pd
import numpy as np

data_keys = ['trade date', 'm+1', 'm+2', 'm+3']
data = {'trade date':['jan','feb','mar','apr'],
        'm+1':np.random.randn(4),
        'm+2':np.random.randn(4),
        'm+3':np.random.randn(4)}

df = pd.DataFrame(data)
df = df[data_keys]

Starting data:

  trade date       m+1       m+2       m+3
0        jan -0.446535 -1.012870 -0.839881
1        feb  0.013255  0.265500  1.130098
2        mar  0.406562 -1.122270 -1.851551
3        apr -0.890004  0.752648  0.778100

Result:

The result should have feb, mar, apr, may, jun and jul as columns, with NaN wherever a trade date has no quote for that month.

Answer 1:

The starting DataFrame:

  trade date       m+1       m+2       m+3
0        jan -1.350746  0.948835  0.579352
1        feb  0.011813  2.020158 -1.221110
2        mar -0.183187 -0.303099  1.323092
3        apr  0.081105  0.662628 -0.703152

Solution:

  • Define a list of all possible absolute dates you will encounter, in chronological order. Do the same for relative dates.
  • Create a function to act on groups coming from df.groupby. The function will convert the column names of each group appropriately to an absolute format.
  • Apply the function.
  • Pandas concatenates the groups, aligning them on the union of the column names and filling the gaps with NaN.

Code:

abs_in_order = ['jan','feb','mar','apr','may','jun','jul','aug']
rel_in_order = ['m+0','m+1','m+2','m+3','m+4']

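def rel2abs(group, abs_in_order, rel_in_order):
    # Every row in a group shares the same trade date.
    abs_date = group['trade date'].unique()[0]
    # Position of the trade month in the chronological list.
    i = abs_in_order.index(abs_date)
    # Pair 'm+0' with the trade month, 'm+1' with the next month, and so on.
    namesmap = dict(zip(rel_in_order, abs_in_order[i:i + len(rel_in_order)]))
    # Return a renamed copy rather than mutating the group in place.
    return group.rename(columns=namesmap)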

grouped = df.groupby(['trade date'])
df = grouped.apply(rel2abs, abs_in_order, rel_in_order)

Pandas may not keep the columns in chronological order. To restore it, select the columns in the order given by abs_in_order:

order = ['trade date'] + abs_in_order
cols = [e for e in order if e in df.columns]
df[cols]

Result:

  trade date       feb       mar       apr       may       jun       jul
0        jan -1.350746  0.948835  0.579352       NaN       NaN       NaN
1        feb       NaN  0.011813  2.020158 -1.221110       NaN       NaN
2        mar       NaN       NaN -0.183187 -0.303099  1.323092       NaN
3        apr       NaN       NaN       NaN  0.081105  0.662628 -0.703152
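
As an alternative to the groupby approach, the same result can be sketched by reshaping to long format, shifting each month position by the relative offset, and pivoting back. This is only a sketch under a few assumptions: the fixed example prices are made up for illustration, each trade date appears once (otherwise pivot raises an error), the relative labels all have the form 'm+N', and abs_in_order is long enough that no maturity runs past the end of the list (it does not wrap around the year). It may also be useful on newer pandas releases, where the way groupby.apply treats the grouping column has changed.

```python
import pandas as pd

abs_in_order = ['jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug']

# Illustrative prices (made up) so the result is easy to check by eye.
df = pd.DataFrame({
    'trade date': ['jan', 'feb', 'mar', 'apr'],
    'm+1': [1.0, 2.0, 3.0, 4.0],
    'm+2': [10.0, 20.0, 30.0, 40.0],
    'm+3': [100.0, 200.0, 300.0, 400.0],
})

# Long format: one row per (trade date, relative maturity) pair.
long = df.melt(id_vars='trade date', var_name='rel', value_name='price')

# 'm+2' -> 2, then shift the trade month's position by that many months.
offset = long['rel'].str.replace('m+', '', regex=False).astype(int)
start = long['trade date'].map(abs_in_order.index)
long['maturity'] = [abs_in_order[i] for i in start + offset]

# Back to wide format, one column per absolute maturity month.
wide = long.pivot(index='trade date', columns='maturity', values='price')

# pivot sorts labels alphabetically, so restore chronological order.
months = [m for m in abs_in_order if m in wide.columns]
wide = wide.reindex(index=df['trade date'].tolist(), columns=months)

print(wide)
```

Here the trade date ends up as the index rather than a column; calling reset_index() on the result would turn it back into a column.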

Answer 2:

Your question doesn't contain enough information to answer it.

You say that the prices are quoted on dates given in absolute terms ('January').

January is not a date, but 2-Jan-2015 is.

What is your actual 'date', and what is its format (e.g. text, datetime.date, pd.Timestamp)? You can check with type(date), where date is whatever object represents your quote date.

The easiest solution is to get your trade dates into pd.Timestamps and then add an offset:

trade_date = pd.Timestamp('2015-1-15')

>>> trade_date + pd.DateOffset(months=1)
Timestamp('2015-02-15 00:00:00')
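
Applied to the month labels from the question, the offset idea might look like the sketch below. The year (2015), the choice of the first day of each month, and the helper names months and month_number are assumptions made for illustration, since the question only gives month names. Month names are mapped through a plain list rather than parsed, so the result does not depend on the locale.

```python
import pandas as pd

months = ['jan', 'feb', 'mar', 'apr', 'may', 'jun',
          'jul', 'aug', 'sep', 'oct', 'nov', 'dec']
month_number = {name: i + 1 for i, name in enumerate(months)}

year = 2015  # assumption: the question does not say which year
trade_months = ['jan', 'feb', 'mar', 'apr']

# Turn each month label into a real date (first day of the month).
trade_dates = pd.DatetimeIndex(
    [pd.Timestamp(year=year, month=month_number[m], day=1) for m in trade_months]
)

# 'm+1' maturities: one calendar month after each trade date.
maturity_m1 = trade_dates + pd.DateOffset(months=1)

# Back to month labels, if that is the format needed for column names.
labels = [months[ts.month - 1] for ts in maturity_m1]
print(labels)

# Unlike a fixed list of month names, real dates roll over the year end.
december = pd.Timestamp(year=year, month=12, day=1)
print(december + pd.DateOffset(months=2))
```

Working with real dates also keeps quotes from different years distinguishable, which plain month labels cannot do.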

pandas