Hello,
I'm just starting the book, and while going through the Jupyter notebook for chapter 2 I got an error when running pd.read_excel('message_types.xlsx', sheet_name='messages').
Notebook: machine-learning-for-trading/01_parse_itch_order_flow_messages.ipynb at master · stefan-jansen/machine-learning-for-trading · GitHub
Section: 4.3.1 Load Message Types
The first error said "No module xlrd", because that version of pandas (0.22.0) tried to use xlrd to parse .xlsx files. I installed xlrd, but its latest version no longer supports .xlsx.
There is a bug report about this, with tips on passing the engine argument with the value openpyxl:
BUG: Cannot read XLSX files with xlrd version 2.0.0 #38410 (closed).
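For anyone wanting to confirm which versions are in play before changing anything, here is a minimal sketch that uses only the standard library (Python 3.8+). The helper name installed_versions is made up for illustration and is not part of the notebook.

```python
from importlib import metadata


def installed_versions(packages=("pandas", "xlrd", "openpyxl")):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in the active environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

If xlrd shows 2.0.0 or later, that matches the bug report: from that release on, xlrd only reads legacy .xls files.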
What I did to get things to work:
1.) Upgraded pandas and installed openpyxl:
pip install pandas --upgrade
pip install openpyxl --upgrade
2.) Added engine='openpyxl' to pandas.read_excel(), like:
pd.read_excel('message_types.xlsx', sheet_name='messages', engine='openpyxl')
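If the same notebook loads both old and new workbooks, the engine choice could be wrapped in a small helper. This is only a sketch: excel_engine_for and read_excel_compat are hypothetical names, the suffix-to-engine mapping is my assumption, and it presumes a pandas version recent enough to accept engine='openpyxl'.

```python
from pathlib import Path

# Assumed mapping: openpyxl for .xlsx-style workbooks, xlrd for legacy .xls.
ENGINE_BY_SUFFIX = {".xlsx": "openpyxl", ".xlsm": "openpyxl", ".xls": "xlrd"}


def excel_engine_for(path):
    """Return the pandas engine name to use for the given workbook path."""
    suffix = Path(path).suffix.lower()
    try:
        return ENGINE_BY_SUFFIX[suffix]
    except KeyError:
        raise ValueError(f"no engine known for {suffix!r} files") from None


def read_excel_compat(path, **kwargs):
    """Call pandas.read_excel with an engine picked from the file suffix."""
    import pandas as pd  # imported lazily so excel_engine_for works without pandas

    return pd.read_excel(path, engine=excel_engine_for(path), **kwargs)
```

With that in place, the call from the notebook would become read_excel_compat('message_types.xlsx', sheet_name='messages').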
This worked for me, though I'm not sure it was the right thing to do. I'm moving on, but wanted to share in case anyone else is just getting started and runs into this.
[EDIT]: I'm using the Docker image and noticed I was in the backtest environment. Not sure whether that was the cause, or whether I needed to be in ml4t instead.
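A quick way to check which environment a notebook kernel is actually running in is sketched below. It assumes a conda-based setup, where activating an environment sets the CONDA_DEFAULT_ENV variable; describe_environment is a made-up helper name.

```python
import os
import sys


def describe_environment():
    """Collect basic facts about the interpreter running this code."""
    return {
        "python": sys.executable,  # path to the running interpreter
        "prefix": sys.prefix,      # root directory of the active environment
        # Set by conda on activation; None when conda is not in use.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running this in a notebook cell shows whether the kernel points at backtest or ml4t.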
- Adrian