pandas.DataFrameからRSIを求める

テクニカル指標を用いたらもっと収益が改善するんじゃない?という話になったので、

RSI(相対力指数)を算出したい(移動平均線出せれば簡単そうだった)

用いたデータ

USD/JPY(期間:2005/1/3 ~ 2020/3/9)

CSVファイルで取得

処理の流れ

まずはデータフレーム作成

import glob
import pandas as pd
from dateutil.relativedelta import relativedelta
import numpy as np
import scipy.optimize as sco
import sys
import matplotlib.pyplot as plt
import os
import math

def read_and_pre_process_asset_df(asset_file_path : str) -> pd.DataFrame :
    asset_name_ext = os.path.basename(asset_file_path)
    asset_name = os.path.splitext(asset_name_ext)[0]

    asset_df = pd.read_csv(asset_file_path, encoding = "utf-8", usecols = ['日付け','終値'])

    asset_df = asset_df.rename(columns = {'日付け' : 'Date', '終値' : asset_name})
    asset_df['Date'] = pd.to_datetime(asset_df['Date'], format = '%Y年%m月%d日')
    asset_df['Date'] = asset_df['Date'].dt.date

    return asset_df

def merge_df(df_list : list, on : str = 'Date') -> pd.DataFrame :
    total_df = df_list[0]
    for df in df_list[1:]:
        total_df = pd.merge(df, total_df, on = on)

    return total_df

def pre_process_total_df(total_df : pd.DataFrame) -> pd.DataFrame :
    total_df = total_df.sort_values(['Date'])
    total_df = total_df.set_index('Date')
    total_df = total_df.astype(str)\
                .applymap(lambda x: x.replace(',', ''))\
                .astype(float)

    return total_df

asset_file_paths = glob.glob('data/USDJPY.csv') #  dataディレクトリ直下のcsvファイル読み込み
asset_df_list = [read_and_pre_process_asset_df(asset_file_path)
                 for asset_file_path in asset_file_paths]
asset_total_df = merge_df(asset_df_list)
asset_total_df = pre_process_total_df(asset_total_df)

過去に作った関数を使うと一瞬!便利!

データフレームの内容はこんな感じ

            USDJPY
Date              
2005-01-03  102.72
2005-01-04  104.61
2005-01-05  104.09
2005-01-06  105.07
2005-01-07  104.80
...            ...
2020-03-04  107.53
2020-03-05  106.17
2020-03-06  105.31
2020-03-08  103.93
2020-03-09  102.36

データフレーム作成後前処理

RSIを求める上では、n日間での上昇幅の平均と下落幅の平均をそれぞれ出さないといけないので、その処理を行う

historical_price = asset_total_df
historical_diff = historical_price.diff().dropna()
up, down = historical_diff.copy(),historical_diff.copy()

print (historical_diff)
print (up)
print (down)
            USDJPY
Date              
2005-01-04    1.89
2005-01-05   -0.52
2005-01-06    0.98
2005-01-07   -0.27
2005-01-10   -0.60
...            ...
2020-03-04    0.39
2020-03-05   -1.36
2020-03-06   -0.86
2020-03-08   -1.38
2020-03-09   -1.57

[3961 rows x 1 columns]
            USDJPY
Date              
2005-01-04    1.89
2005-01-05   -0.52
2005-01-06    0.98
2005-01-07   -0.27
2005-01-10   -0.60
...            ...
2020-03-04    0.39
2020-03-05   -1.36
2020-03-06   -0.86
2020-03-08   -1.38
2020-03-09   -1.57

[3961 rows x 1 columns]
            USDJPY
Date              
2005-01-04    1.89
2005-01-05   -0.52
2005-01-06    0.98
2005-01-07   -0.27
2005-01-10   -0.60
...            ...
2020-03-04    0.39
2020-03-05   -1.36
2020-03-06   -0.86
2020-03-08   -1.38
2020-03-09   -1.57

[3961 rows x 1 columns]

変数名が以前のコードをそのまま代用していたのでわかりずらかった、ので変えた

.copyはそのままコピーしてくれる

.mask

pandasでデータフレームの要素を条件付きで変換したいときにこれを使った

historical_price = asset_total_df
historical_diff = historical_price.diff().dropna()
up, down = historical_diff.copy(),historical_diff.copy()

up = up.mask(up < 0, 0)
down = down.mask(down > 0, 0)
print (up)
print (down)
            USDJPY
Date              
2005-01-04    1.89
2005-01-05    0.00
2005-01-06    0.98
2005-01-07    0.00
2005-01-10    0.00
...            ...
2020-03-04    0.39
2020-03-05    0.00
2020-03-06    0.00
2020-03-08    0.00
2020-03-09    0.00

[3961 rows x 1 columns]
            USDJPY
Date              
2005-01-04    0.00
2005-01-05   -0.52
2005-01-06    0.00
2005-01-07   -0.27
2005-01-10   -0.60
...            ...
2020-03-04    0.00
2020-03-05   -1.36
2020-03-06   -0.86
2020-03-08   -1.38
2020-03-09   -1.57

[3961 rows x 1 columns]

pandas.rollng

up_sma = up.rolling(14).mean()
down_sma = down.rolling(14).mean()

print (up_sma)
print (down_sma)
              USDJPY
Date                
2005-01-04       NaN
2005-01-05       NaN
2005-01-06       NaN
2005-01-07       NaN
2005-01-10       NaN
...              ...
2020-03-04  0.230000
2020-03-05  0.230000
2020-03-06  0.220000
2020-03-08  0.220000
2020-03-09  0.114286

[3961 rows x 1 columns]
              USDJPY
Date                
2005-01-04       NaN
2005-01-05       NaN
2005-01-06       NaN
2005-01-07       NaN
2005-01-10       NaN
...              ...
2020-03-04 -0.392857
2020-03-05 -0.485714
2020-03-06 -0.547143
2020-03-08 -0.645000
2020-03-09 -0.757143

[3961 rows x 1 columns]

移動平均もこれ一つで簡単!

SRIを求める

RS = up_sma / down_sma
RSI = 100.0 - (100.0 / (1.0 + RS))

print (RSI)
               USDJPY
Date                 
2005-01-04        NaN
2005-01-05        NaN
2005-01-06        NaN
2005-01-07        NaN
2005-01-10        NaN
...               ...
2020-03-04  36.926606
2020-03-05  32.135729
2020-03-06  28.677840
2020-03-08  25.433526
2020-03-09  13.114754

[3961 rows x 1 columns]

できた