pandas.DataFrameからRSIを求める
テクニカル指標を用いたらもっと収益が改善するんじゃない?という話になったので、
RSI(相対力指数)を算出したい(移動平均線出せれば簡単そうだった)
用いたデータ
USD/JPY(期間:2005/1/3 ~ 2020/3/9)
CSVファイルで取得
処理の流れ
まずはデータフレーム作成
import glob import pandas as pd from dateutil.relativedelta import relativedelta import numpy as np import scipy.optimize as sco import sys import matplotlib.pyplot as plt import os import math def read_and_pre_process_asset_df(asset_file_path : str) -> pd.DataFrame : asset_name_ext = os.path.basename(asset_file_path) asset_name = os.path.splitext(asset_name_ext)[0] asset_df = pd.read_csv(asset_file_path, encoding = "utf-8", usecols = ['日付け','終値']) asset_df = asset_df.rename(columns = {'日付け' : 'Date', '終値' : asset_name}) asset_df['Date'] = pd.to_datetime(asset_df['Date'], format = '%Y年%m月%d日') asset_df['Date'] = asset_df['Date'].dt.date return asset_df def merge_df(df_list : list, on : str = 'Date') -> pd.DataFrame : total_df = df_list[0] for df in df_list[1:]: total_df = pd.merge(df, total_df, on = on) return total_df def pre_process_total_df(total_df : pd.DataFrame) -> pd.DataFrame : total_df = total_df.sort_values(['Date']) total_df = total_df.set_index('Date') total_df = total_df.astype(str)\ .applymap(lambda x: x.replace(',', ''))\ .astype(float) return total_df asset_file_paths = glob.glob('data/USDJPY.csv') # dataディレクトリ直下のcsvファイル読み込み asset_df_list = [read_and_pre_process_asset_df(asset_file_path) for asset_file_path in asset_file_paths] asset_total_df = merge_df(asset_df_list) asset_total_df = pre_process_total_df(asset_total_df)
過去に作った関数を使うと一瞬!便利!
データフレームの内容はこんな感じ
USDJPY Date 2005-01-03 102.72 2005-01-04 104.61 2005-01-05 104.09 2005-01-06 105.07 2005-01-07 104.80 ... ... 2020-03-04 107.53 2020-03-05 106.17 2020-03-06 105.31 2020-03-08 103.93 2020-03-09 102.36
データフレーム作成後前処理
RSIを求める上では、n日間での上昇幅の平均と下落幅の平均をそれぞれ出さないといけないので、その処理を行う
historical_price = asset_total_df historical_diff = historical_price.diff().dropna() up, down = historical_diff.copy(),historical_diff.copy() print (historical_diff) print (up) print (down)
USDJPY Date 2005-01-04 1.89 2005-01-05 -0.52 2005-01-06 0.98 2005-01-07 -0.27 2005-01-10 -0.60 ... ... 2020-03-04 0.39 2020-03-05 -1.36 2020-03-06 -0.86 2020-03-08 -1.38 2020-03-09 -1.57 [3961 rows x 1 columns] USDJPY Date 2005-01-04 1.89 2005-01-05 -0.52 2005-01-06 0.98 2005-01-07 -0.27 2005-01-10 -0.60 ... ... 2020-03-04 0.39 2020-03-05 -1.36 2020-03-06 -0.86 2020-03-08 -1.38 2020-03-09 -1.57 [3961 rows x 1 columns] USDJPY Date 2005-01-04 1.89 2005-01-05 -0.52 2005-01-06 0.98 2005-01-07 -0.27 2005-01-10 -0.60 ... ... 2020-03-04 0.39 2020-03-05 -1.36 2020-03-06 -0.86 2020-03-08 -1.38 2020-03-09 -1.57 [3961 rows x 1 columns]
変数名が以前のコードをそのまま代用していたのでわかりずらかった、ので変えた
.copy
はそのままコピーしてくれる
.mask
pandas
でデータフレームの要素を条件付きで変換したいときにこれを使った
historical_price = asset_total_df historical_diff = historical_price.diff().dropna() up, down = historical_diff.copy(),historical_diff.copy() up = up.mask(up < 0, 0) down = down.mask(down > 0, 0) print (up) print (down)
USDJPY Date 2005-01-04 1.89 2005-01-05 0.00 2005-01-06 0.98 2005-01-07 0.00 2005-01-10 0.00 ... ... 2020-03-04 0.39 2020-03-05 0.00 2020-03-06 0.00 2020-03-08 0.00 2020-03-09 0.00 [3961 rows x 1 columns] USDJPY Date 2005-01-04 0.00 2005-01-05 -0.52 2005-01-06 0.00 2005-01-07 -0.27 2005-01-10 -0.60 ... ... 2020-03-04 0.00 2020-03-05 -1.36 2020-03-06 -0.86 2020-03-08 -1.38 2020-03-09 -1.57 [3961 rows x 1 columns]
pandas.rollng
up_sma = up.rolling(14).mean() down_sma = down.rolling(14).mean() print (up_sma) print (down_sma)
USDJPY Date 2005-01-04 NaN 2005-01-05 NaN 2005-01-06 NaN 2005-01-07 NaN 2005-01-10 NaN ... ... 2020-03-04 0.230000 2020-03-05 0.230000 2020-03-06 0.220000 2020-03-08 0.220000 2020-03-09 0.114286 [3961 rows x 1 columns] USDJPY Date 2005-01-04 NaN 2005-01-05 NaN 2005-01-06 NaN 2005-01-07 NaN 2005-01-10 NaN ... ... 2020-03-04 -0.392857 2020-03-05 -0.485714 2020-03-06 -0.547143 2020-03-08 -0.645000 2020-03-09 -0.757143 [3961 rows x 1 columns]
移動平均もこれ一つで簡単!
SRIを求める
RS = up_sma / down_sma RSI = 100.0 - (100.0 / (1.0 + RS)) print (RSI)
USDJPY Date 2005-01-04 NaN 2005-01-05 NaN 2005-01-06 NaN 2005-01-07 NaN 2005-01-10 NaN ... ... 2020-03-04 36.926606 2020-03-05 32.135729 2020-03-06 28.677840 2020-03-08 25.433526 2020-03-09 13.114754 [3961 rows x 1 columns]
できた