Learning vectorbt 09: Walk-Forward Optimization

Backtesting with rolling windows

Fetching price data and plotting it

import numpy as np
import pandas as pd
import vectorbt as vbt

# `price` is the daily close series loaded beforehand (the loading code is not shown here)
print(price)

date
2017-01-03 00:00:00+00:00    2.028
2017-01-04 00:00:00+00:00    2.046
2017-01-05 00:00:00+00:00    2.043
2017-01-06 00:00:00+00:00    2.035
2017-01-09 00:00:00+00:00    2.040
                             ...
2022-12-26 00:00:00+00:00    2.612
2022-12-27 00:00:00+00:00    2.642
2022-12-28 00:00:00+00:00    2.649
2022-12-29 00:00:00+00:00    2.635
2022-12-30 00:00:00+00:00    2.649
Name: Close, Length: 1459, dtype: float64

price.vbt.plot().show_svg()


Splitting the data with rolling_split and plotting the splits

split_kwargs = dict(
    n=30,
    window_len=365 * 2,  # each window has 365 * 2 = 730 rows, including the 180 from set_lens; at about 250 trading days per year that is roughly 3 years
    set_lens=(180,),
    left_to_right=False
)  # 30 windows of 730 rows each, with the last 180 rows reserved for testing

def roll_in_and_out_samples(price, **kwargs):
    return price.vbt.rolling_split(**kwargs)

roll_in_and_out_samples(price, **split_kwargs, plot=True, trace_names=['in-sample', 'out-sample']).show_svg()


(in_price, in_indexes), (out_price, out_indexes) = roll_in_and_out_samples(price, **split_kwargs)

print(in_price.shape, len(in_indexes))  # in-sample
print(out_price.shape, len(out_indexes))  # out-sample

(550, 30) 30  # 30 columns match n=30; 550 rows = 365 * 2 - 180
(180, 30) 30  # 180 rows match set_lens
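The shapes above follow from how each window is cut in two. Below is a minimal plain-Python sketch, assuming (as the shapes suggest) that with left_to_right=False the fixed-length set is taken from the right end of the window and the remainder becomes the in-sample part. The helper name split_window is hypothetical and not part of vectorbt.

```python
def split_window(rows, test_len):
    """Split one window into (in_sample, out_sample).

    Assumption: the last `test_len` rows form the out-of-sample set and
    everything before them is in-sample.
    """
    if not 0 < test_len < len(rows):
        raise ValueError("test_len must be between 1 and len(rows) - 1")
    return rows[:-test_len], rows[-test_len:]


window = list(range(365 * 2))            # 730 row positions of one window
in_sample, out_sample = split_window(window, 180)
print(len(in_sample), len(out_sample))   # 550 180
```

The out-of-sample rows always come after the in-sample rows, so parameters are never tuned on data from the period they are later tested on.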

With the following parameters instead:

split_kwargs = dict(
    n=30,
    window_len=250,  # one year of trading days
    set_lens=(70,),  # 70 days reserved as the test set
    left_to_right=False
)


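As the plot shows, the in-sample and out-of-sample parts of each window together cover exactly one trading year (250 rows: 180 in-sample plus 70 out-of-sample).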

print(in_indexes)


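How is the start date of each window determined?
The plot suggests that the window starts are evenly spaced:

start offset = (total rows - rows per window) / (number of windows - 1)
=>
diff = (1459 - 365 * 2) / (30 - 1) ≈ 25.14

To verify:
print(in_indexes[0][0])
print(in_indexes[1][0])
print(in_indexes[0][25:27])  # find where in_indexes[1][0] falls inside in_indexes[0]; expect position 25 (0-based), i.e. the 26th element

2017-01-03 00:00:00+00:00
2017-02-14 00:00:00+00:00
DatetimeIndex(['2017-02-14 00:00:00+00:00', '2017-02-15 00:00:00+00:00'], dtype='datetime64[ns, UTC]', name='split_0', freq=None)

So in_indexes[1][0] is the 26th element of window 0: the two window starts, in_indexes[0][0] and in_indexes[1][0], are 25 trading days apart, which is diff rounded to the nearest whole row.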
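The spacing argument can be checked for every window at once. This is a minimal sketch, assuming the starts are evenly spaced between row 0 and the last row where a full window still fits, then rounded to whole rows. It reproduces the offset of 25 observed above, but it is not taken from vectorbt's source, and window_starts is a hypothetical helper.

```python
def window_starts(total_rows, window_len, n):
    """Evenly spaced window start positions, rounded to whole rows.

    Assumption: starts run from 0 to total_rows - window_len inclusive.
    """
    if n < 2:
        return [0]
    last_start = total_rows - window_len
    return [round(i * last_start / (n - 1)) for i in range(n)]


starts = window_starts(total_rows=1459, window_len=365 * 2, n=30)
print(starts[:3], starts[-1])  # [0, 25, 50] 729
```

Because 729 / 29 is not a whole number, consecutive starts are 25 or 26 rows apart depending on how the rounding falls.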

Buy-and-hold performance

pf_kwargs = dict(
    direction='both',  # long and short
    freq='d'
)

def simulate_holding(price, **kwargs):
    pf = vbt.Portfolio.from_holding(price, **kwargs)
    return pf.sharpe_ratio()

in_hold_sharpe = simulate_holding(in_price, **pf_kwargs)
print(in_hold_sharpe)

split_idx
0     0.954906
1     0.685842
2     0.868976
...
27    0.383577
28    0.334316
29    0.088459
Name: sharpe_ratio, dtype: float64

These are the Sharpe ratios of the 30 in-sample windows. Because the holding strategy buys once and never sells, each value can be read as the Sharpe ratio of the price series itself over that window.
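To make "the Sharpe ratio of the price series itself" concrete, here is a minimal sketch using only the standard library. It assumes simple per-period returns, a zero risk-free rate, the sample standard deviation, and 365 periods per year for annualization. These are assumptions for illustration, so the numbers need not match vectorbt's output exactly. The sample prices are made up.

```python
import math
import statistics


def sharpe_ratio(prices, periods_per_year=365):
    """Annualized Sharpe ratio of a price series with a zero risk-free rate.

    Assumption: simple per-period returns, sample standard deviation,
    and `periods_per_year` periods in a year.
    """
    returns = [b / a - 1 for a, b in zip(prices, prices[1:])]
    mean = statistics.mean(returns)
    std = statistics.stdev(returns)  # sample standard deviation (ddof=1)
    return mean / std * math.sqrt(periods_per_year)


rising = [2.00, 2.02, 2.01, 2.05, 2.04, 2.08]   # made-up prices
falling = rising[::-1]
print(sharpe_ratio(rising) > 0, sharpe_ratio(falling) < 0)  # True True
```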

from_holding

close = pd.Series([1, 2, 3, 4, 5])
pf = vbt.Portfolio.from_holding(close)
pf.final_value()

500  # on day 1 the 100 of cash buys 100 / 1 = 100 shares; on day 5 the position is worth 100 * 5 = 500
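The arithmetic behind that result can be written out directly. This is a minimal sketch, assuming a starting cash of 100, no fees or slippage, and fractional shares. holding_final_value is a hypothetical helper, not a vectorbt function.

```python
def holding_final_value(close, init_cash=100.0):
    """Final value of buying with all cash at the first price and holding.

    Assumption: no fees or slippage, fractional shares allowed.
    """
    shares = init_cash / close[0]
    return shares * close[-1]


print(holding_final_value([1, 2, 3, 4, 5]))  # 500.0
```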

Dual moving average crossover

windows = np.arange(10, 50)  # not shown in the original; inferred from the output below (windows 10 to 49)

def simulate_all_params(price, windows, **kwargs):
    fast_ma, slow_ma = vbt.MA.run_combs(price, windows, r=2, short_names=['fast', 'slow'])  # build every (fast, slow) window pair
    entries = fast_ma.ma_crossed_above(slow_ma)  # enter when the fast MA crosses above the slow MA
    exits = fast_ma.ma_crossed_below(slow_ma)  # exit when the fast MA crosses below the slow MA
    pf = vbt.Portfolio.from_signals(price, entries, exits, **kwargs)  # simulate every signal column
    return pf.sharpe_ratio()  # Sharpe ratio of each backtest

# Simulate all params for in-sample ranges
in_sharpe = simulate_all_params(in_price, windows, **pf_kwargs)  # Sharpe ratio of every pair in every window

print(in_sharpe.shape)
(23400,)  # C(40, 2) = 780 pairs x 30 windows

print(in_sharpe)
fast_window  slow_window  split_idx
10           11           0            1.202383
                          1            1.358301
                          2            1.629335
                          3            1.760235
                          4            1.258015
                                         ...
48           49           25           0.610420
                          26           0.738654
                          27           0.602987
                          28           0.689741
                          29           0.778645
Name: sharpe_ratio, Length: 23400, dtype: float64
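What the entry and exit signals mean can be shown on a tiny series. This is a minimal pure-Python sketch, assuming simple moving averages and a simplified crossing rule: above on this bar, not above on the previous bar. The helpers sma and crossed_above are hypothetical illustrations and may differ from vectorbt in how missing values and ties are handled.

```python
def sma(values, window):
    """Simple moving average; None until `window` values are available."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window:i + 1]) / window)
    return out


def crossed_above(a, b):
    """True at bar i when a is above b now but was not on the previous bar."""
    out = [False] * len(a)
    for i in range(1, len(a)):
        if None in (a[i], b[i], a[i - 1], b[i - 1]):
            continue  # not enough history yet
        out[i] = a[i] > b[i] and a[i - 1] <= b[i - 1]
    return out


prices = [5, 4, 3, 2, 3, 4, 5, 4, 3, 2]        # made-up prices
fast, slow = sma(prices, 2), sma(prices, 3)
entries = crossed_above(fast, slow)             # fast MA crosses above slow MA
exits = crossed_above(slow, fast)               # fast MA crosses below slow MA
print([i for i, e in enumerate(entries) if e])  # [5]
print([i for i, e in enumerate(exits) if e])    # [8]
```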

Best-Sharpe parameter pair for each window

def get_best_index(performance, higher_better=True):
    if higher_better:
        return performance[performance.groupby('split_idx').idxmax()].index
    return performance[performance.groupby('split_idx').idxmin()].index

in_best_index = get_best_index(in_sharpe)

print(in_best_index)  # best parameter pair per window
MultiIndex([(21, 35,  0),
            (21, 35,  1),
            (11, 14,  2),
            ...
            (45, 48, 28),
            (45, 48, 29)],
           names=['fast_window', 'slow_window', 'split_idx'])

def get_best_params(best_index, level_name):
    return best_index.get_level_values(level_name).to_numpy()

in_best_fast_windows = get_best_params(in_best_index, 'fast_window')
in_best_slow_windows = get_best_params(in_best_index, 'slow_window')
in_best_window_pairs = np.array(list(zip(in_best_fast_windows, in_best_slow_windows)))

print(in_best_window_pairs)  # best parameter pairs, with the window index dropped
[[21 35]
 [21 35]
 [11 14]
 ...
 [45 48]
 [45 48]]
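The selection step amounts to "group by window, keep the pair with the highest Sharpe ratio". Here is a minimal dictionary-based sketch of that idea, with made-up scores. best_params_per_split is a hypothetical stand-in for the pandas groupby above, not a replacement for it.

```python
def best_params_per_split(scores, higher_better=True):
    """Map each split index to the (fast, slow) pair with the best score.

    `scores` maps (fast_window, slow_window, split_idx) -> Sharpe ratio.
    """
    pick = max if higher_better else min
    by_split = {}
    for (fast, slow, split), value in scores.items():
        by_split.setdefault(split, []).append((value, (fast, slow)))
    return {split: pick(cands)[1] for split, cands in sorted(by_split.items())}


toy_scores = {  # made-up numbers, for illustration only
    (10, 11, 0): 1.2, (21, 35, 0): 1.9,
    (10, 11, 1): 0.4, (21, 35, 1): -0.3,
}
print(best_params_per_split(toy_scores))  # {0: (21, 35), 1: (10, 11)}
```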

Plotting how the best parameters change across windows

pd.DataFrame(in_best_window_pairs, columns=['fast_window', 'slow_window']).vbt.plot().show_svg()


Buy-and-hold on the test set

out_hold_sharpe = simulate_holding(out_price, **pf_kwargs)

print(out_hold_sharpe)
split_idx
0     0.601067
1     0.850085
2    -0.844581
...
28   -1.233394
29   -0.335148
Name: sharpe_ratio, dtype: float64

Parameter sweep on the test set

# Simulate all params for out-sample ranges
out_sharpe = simulate_all_params(out_price, windows, **pf_kwargs)

print(out_sharpe)
fast_window  slow_window  split_idx
10           11           0            0.495589
                          1           -1.111064
                          2           -2.526381
                          3           -1.415069
                          4           -1.979918
                                         ...
48           49           25          -1.560619
                          26          -3.410102
                          27          -1.743216
                          28          -1.648796
                          29          -0.405472
Name: sharpe_ratio, Length: 23400, dtype: float64

def simulate_best_params(price, best_fast_windows, best_slow_windows, **kwargs):
    fast_ma = vbt.MA.run(price, window=best_fast_windows, per_column=True)
    slow_ma = vbt.MA.run(price, window=best_slow_windows, per_column=True)
    entries = fast_ma.ma_crossed_above(slow_ma)
    exits = fast_ma.ma_crossed_below(slow_ma)
    pf = vbt.Portfolio.from_signals(price, entries, exits, **kwargs)
    return pf.sharpe_ratio()

# Use best params from in-sample ranges and simulate them for out-sample ranges
# in_best_fast_windows and in_best_slow_windows are the best pairs found on the in-sample data (the blue ranges in the split plot)
# Net effect: pick the best (fast, slow) pair on historical data, then apply it to the period that follows
out_test_sharpe = simulate_best_params(out_price, in_best_fast_windows, in_best_slow_windows, **pf_kwargs)

print(out_test_sharpe)
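The key property of this step is that the parameters are fixed before the out-of-sample data is touched. Here is a minimal dictionary-based sketch of that pairing, with made-up scores. walk_forward_scores is a hypothetical helper and does not reflect how simulate_best_params works internally.

```python
def walk_forward_scores(in_best, out_scores):
    """Out-of-sample score of the pair chosen in-sample, per split.

    `in_best` maps split_idx -> (fast, slow); `out_scores` maps
    (fast, slow, split_idx) -> out-of-sample Sharpe ratio.
    """
    return {split: out_scores[(fast, slow, split)]
            for split, (fast, slow) in in_best.items()}


in_best = {0: (21, 35), 1: (10, 11)}   # chosen on in-sample data only
out_scores = {                          # made-up numbers
    (10, 11, 0): 0.5, (21, 35, 0): 0.1,
    (10, 11, 1): -1.1, (21, 35, 1): 0.7,
}
print(walk_forward_scores(in_best, out_scores))  # {0: 0.1, 1: -1.1}
```

In this toy data the pair that won in-sample is not the best one out-of-sample in either window, which is exactly the kind of gap walk-forward testing is meant to expose.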


Median and best Sharpe ratio of each strategy on the training and test sets, plotted

cv_results_df = pd.DataFrame({
    'in_sample_hold': in_hold_sharpe.values,
    'in_sample_median': in_sharpe.groupby('split_idx').median().values,
    'in_sample_best': in_sharpe[in_best_index].values,
    'out_sample_hold': out_hold_sharpe.values,
    'out_sample_median': out_sharpe.groupby('split_idx').median().values,
    'out_sample_test': out_test_sharpe.values
})

color_schema = vbt.settings['plotting']['color_schema']

cv_results_df.vbt.plot(
    trace_kwargs=[
        dict(line_color=color_schema['blue']),
        dict(line_color=color_schema['blue'], line_dash='dash'),
        dict(line_color=color_schema['blue'], line_dash='dot'),
        dict(line_color=color_schema['orange']),
        dict(line_color=color_schema['orange'], line_dash='dash'),
        dict(line_color=color_schema['orange'], line_dash='dot')
    ]
).show_svg()
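The median columns summarize how a typical parameter pair performs in each window, which gives a baseline for judging the selected pair. Here is a minimal standard-library sketch of that per-window median, with made-up scores. median_per_split is a hypothetical helper that mirrors the groupby('split_idx').median() calls above.

```python
import statistics


def median_per_split(scores):
    """Median score across all parameter pairs, for each split index."""
    by_split = {}
    for (_fast, _slow, split), value in scores.items():
        by_split.setdefault(split, []).append(value)
    return {split: statistics.median(vals)
            for split, vals in sorted(by_split.items())}


toy_scores = {  # made-up numbers
    (10, 11, 0): 1.0, (10, 12, 0): 3.0, (11, 12, 0): 2.0,
    (10, 11, 1): -1.0, (10, 12, 1): 0.5, (11, 12, 1): 0.0,
}
print(median_per_split(toy_scores))  # {0: 2.0, 1: 0.0}
```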


黄金矿工
