Hello,
I'm just starting the book, and while working through the Jupyter notebook for chapter 2 I got an error when running pd.read_excel('message_types.xlsx', sheet_name='messages').
Notebook: machine-learning-for-trading/01_parse_itch_order_flow_messages.ipynb at master · stefan-jansen/machine-learning-for-trading · GitHub
section 4.3.1 Load Message Types
The first error said "No module xlrd", because version 0.22.0 of pandas tries to use xlrd to parse .xlsx files. I installed xlrd, but its latest version no longer supports .xlsx.
There is a bug report about this, with a tip to pass the engine argument with the value openpyxl:
BUG: Cannot read XLSX files with xlrd version 2.0.0 #38410  Closed.
What I did to get things to work:
1.) Upgraded pandas and installed openpyxl:
pip install pandas --upgrade
pip install openpyxl --upgrade
2.) Added engine='openpyxl' to pandas.read_excel(), like so:
pd.read_excel('message_types.xlsx', sheet_name='messages', engine='openpyxl')
This worked for me, though I'm not sure it was the right thing to do. I'm moving on, but thought I'd share in case anyone else is about to start and runs into this.
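For anyone who would rather not hard-code the engine, here is a minimal sketch of the same fix that picks the engine from the file extension. It is not from the notebook: the names pick_excel_engine, read_message_types, and ENGINE_BY_SUFFIX are my own and purely illustrative, and it assumes a pandas version recent enough to accept engine='openpyxl', with openpyxl installed.

```python
from pathlib import Path

# Hypothetical mapping, not part of pandas or the book's notebook.
# It pairs a file extension with a pandas Excel engine that can read it.
ENGINE_BY_SUFFIX = {
    ".xlsx": "openpyxl",  # xlrd 2.0+ no longer reads .xlsx
    ".xlsm": "openpyxl",
    ".xls": "xlrd",       # legacy binary format, still handled by xlrd
}


def pick_excel_engine(path):
    """Return the engine name to pass to pd.read_excel for this file."""
    suffix = Path(path).suffix.lower()
    if suffix not in ENGINE_BY_SUFFIX:
        raise ValueError(f"Unsupported Excel extension: {suffix!r}")
    return ENGINE_BY_SUFFIX[suffix]


def read_message_types(path="message_types.xlsx", sheet_name="messages"):
    """Load the sheet with an explicitly chosen engine (assumed helper)."""
    # Imported here so pick_excel_engine can be used without pandas installed.
    import pandas as pd

    return pd.read_excel(path, sheet_name=sheet_name,
                         engine=pick_excel_engine(path))
```

Calling read_message_types() from the notebook's directory should be equivalent to the read_excel call with engine='openpyxl' above; the only difference is that the engine choice is spelled out in one place.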
[EDIT]: I'm using the Docker image and noticed I was in the backtest environment. I'm not sure whether that was the cause, or whether I needed to be in ml4t instead.
- Adrian
