At the moment I’m writing my master’s thesis on ML applied to Bitcoin trading strategies. I collected technical, on-chain, and sentiment data and built an RNN that tries to predict the price of BTC one day ahead.
I now have two open questions, and I hope you can help me:
How can I apply the finished model to new real-life data? Most of the time we only check how well the model performs on the test set, but once I am satisfied I want to apply it to new data (e.g. I’m keeping a held-out validation set, and I want to apply the model in another notebook). How does that work in detail? I think I have to save the model and apply it to each new sequence of data as it arrives; is this correct? Is there any code that shows this process?
If I want to predict the price 7 days ahead, I think there are two options. I can keep the daily sequence data and label each day with the price 7 days later (i.e. shift the target column back by 7 rows), or I can compress the daily data into weekly data and label each week with the price one week later. Am I right about these two solutions?
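To make the two options concrete, here is a rough sketch in plain Python. The prices are made up, and `direct_labels` and `weekly_labels` are hypothetical helper names used only for illustration. With pandas, option 1 would typically be written with `Series.shift(-7)` on the price column.

```python
HORIZON = 7  # days ahead


def direct_labels(prices, horizon=HORIZON):
    """Option 1: keep daily rows; label day t with the price at t + horizon."""
    return [(prices[t], prices[t + horizon]) for t in range(len(prices) - horizon)]


def weekly_labels(prices, horizon=HORIZON):
    """Option 2: compress to weekly closes, then label with the next week."""
    weekly = prices[horizon - 1::horizon]  # last price of each full week
    return list(zip(weekly[:-1], weekly[1:]))


daily = [float(100 + d) for d in range(21)]  # 21 made-up daily closes
opt1 = direct_labels(daily)
opt2 = weekly_labels(daily)
print(len(opt1), "daily samples, first:", opt1[0])
print(len(opt2), "weekly samples, first:", opt2[0])
```

Both options give a valid 7-day target. Option 1 keeps one sample per day (only the last 7 rows lose their label), while option 2 leaves roughly one seventh as many samples to train on.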
I have very little experience with Python and machine learning models, but you definitely point out my biggest frustration with a lot of "tutorials" and notebooks on running ML for trading. Most of them, like 99%, don’t actually tell you how to make a future forecast on unseen data. That is the actual thing I want to do as a real trader, hahaha.
Nonetheless, after searching Google, GitHub repos, and medium.com, I finally found one way. I’m not sure if it’s the correct way, but it’s the only way I’ve been able to implement so far. Basically, you append future rows to your data frame (for example, to your test set). If your model predicts just one step ahead, add one row; for seven steps, add seven new rows filled with zeros. This assumes your predictions aren’t “windowing”, because otherwise the zeros would end up being used as inputs to predict days 2-7.
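For the first question, the save-then-reload workflow can be sketched roughly as below. This is only an illustration under assumptions: the "model" is a stand-in weighted sum rather than an RNN, the weights and prices are made up, and `fit_scaler`, `predict_next`, and `model_artifact.json` are hypothetical names. With a Keras model, the persistence step would normally be `model.save(...)` in the training notebook and `keras.models.load_model(...)` in the other one. Whatever scaling was fitted on the training data has to be saved and reused as well.

```python
import json
import os
import tempfile

WINDOW = 3  # hypothetical sequence length the model was trained on


def fit_scaler(values):
    """Compute the mean and standard deviation on the TRAINING data only."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return {"mean": mean, "std": var ** 0.5}


def predict_next(window, artifact):
    """Stand-in for the trained RNN: a weighted sum over a scaled window."""
    s = artifact["scaler"]
    scaled = [(v - s["mean"]) / s["std"] for v in window]
    scaled_pred = sum(w * x for w, x in zip(artifact["weights"], scaled))
    return scaled_pred * s["std"] + s["mean"]  # back to price units


# --- training notebook: fit, then persist everything inference needs ---
train_prices = [100.0, 102.0, 101.0, 105.0, 107.0, 110.0]
artifact = {
    "window": WINDOW,
    "scaler": fit_scaler(train_prices),
    "weights": [0.2, 0.3, 0.5],  # placeholder weights, not a trained model
}
path = os.path.join(tempfile.mkdtemp(), "model_artifact.json")
with open(path, "w") as f:
    json.dump(artifact, f)

# --- inference notebook: reload and score the most recent window ---
with open(path) as f:
    loaded = json.load(f)

new_prices = [108.0, 111.0, 112.0, 115.0]  # data that arrived after training
latest_window = new_prices[-loaded["window"]:]
forecast = predict_next(latest_window, loaded)
print(f"next-day forecast: {forecast:.2f}")
```

The key point is that the second notebook never refits anything. It loads the saved artifact, takes the last `window` observations of the new data, applies the saved scaling, and calls the model.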
For your second question, I think the medium.com article is probably the approach you’re looking for. I believe the author predicts one day ahead, fills the data frame with the predicted value, and then uses that value to make the next prediction. Though, like I said, I’m far from an expert, and I have a feeling you know way more than me, so maybe your approach is also fine.
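That predict-then-feed-back idea is usually called recursive (or iterated) forecasting. A minimal sketch, assuming a single price series and a stand-in one-step model (the mean of the window, not a trained RNN); `one_step_model` and `recursive_forecast` are hypothetical names:

```python
def one_step_model(window):
    """Stand-in for a trained one-day-ahead model: mean of the window."""
    return sum(window) / len(window)


def recursive_forecast(history, steps, window_size, model=one_step_model):
    """Predict `steps` days ahead by feeding each prediction back as input."""
    values = list(history)  # copy so the caller's data is untouched
    forecasts = []
    for _ in range(steps):
        pred = model(values[-window_size:])
        forecasts.append(pred)
        values.append(pred)  # the prediction becomes an input for the next step
    return forecasts


history = [100.0, 102.0, 104.0]
week_ahead = recursive_forecast(history, steps=7, window_size=3)
print([round(p, 2) for p in week_ahead])
```

One caveat with this approach is that errors compound, because later steps are predicted from earlier predictions rather than from real data. If the model also uses other inputs (on-chain or sentiment features), those need future values too, which is one reason training directly on a 7-day target can be simpler.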