This is probably a basic problem for you, since it comes up in the first Jupyter notebook, but I have struggled with it for two days.
I tried to run …\02_market_and_fundamental_data\01_NASDAQ_TotalView-ITCH_Order_Book\01_parse_itch_order_flow_messages.ipynb, and it fails with errors.
Output of cell #31 (the error is the "Cannot serialize the column" message):
Start of Messages
03:02:31.65 0
Start of System Hours
04:00:00.00 241,258
Start of Market Hours
09:30:00.00 9,559,279
09:44:09.23 25,000,000 00:00:52.34
Cannot serialize the column [primary_market_maker]
because its data contents are not [string] but [integer] object dtype
L
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 214749 entries, 0 to 214748
Data columns (total 7 columns):
Column Non-Null Count Dtype
I haven’t yet solved the issue, but just putting it out there that I’m also experiencing the same error message when running through the notebook.
"KeyError: 'No object named P in the file'"
However, the output from the script that calls store_messages() does show that it is parsing P messages:
Start of Messages
03:02:31.65 0
Start of System Hours
04:00:00.00 241,258
Start of Market Hours
09:30:00.00 9,559,279
09:44:09.23 25,000,000 00:00:43.43
S
R
H
Y
L
Cannot serialize the column [primary_market_maker]
because its data contents are not [string] but [integer] object dtype
L
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 214749 entries, 0 to 214748
Data columns (total 7 columns):
 #   Column                    Non-Null Count   Dtype
---  ------                    --------------   -----
 0   stock_locate              214749 non-null  int64
 1   tracking_number           214749 non-null  int64
 2   timestamp                 214749 non-null  timedelta64[ns]
 3   mpid                      214749 non-null  object
 4   primary_market_maker      214749 non-null  object
 5   market_maker_mode         214749 non-null  object
 6   market_participant_state  214749 non-null  object
dtypes: int64(2), object(4), timedelta64[ns](1)
memory usage: 11.5+ MB
None
S 1
U 1
Q 1
C 1
I 1
V 1
P 1
E 1
X 1
R 1
F 1
D 1
A 1
L 1
Y 1
H 1
J 1
Name: count, dtype: int64
J 1
V 1
S 3
H 8885
Q 8887
R 8887
Y 8926
C 9176
P 108412
L 214749
E 364951
F 836655
I 1072326
X 1086393
U 2132765
D 9044692
A 10094291
dtype: int64
Duration: 00:00:45.49
For whatever reason, the P data is not being appended to the HDF5 file, perhaps as a consequence of this error in the output:
Cannot serialize the column [primary_market_maker]
because its data contents are not [string] but [integer] object dtype
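One thing that might be worth trying (a sketch, not a confirmed fix): the message says the object-dtype column primary_market_maker holds integers where strings are expected, so casting every object column to str before the data is appended to the store could let the L table serialize. The two helper names and the sample frame below are hypothetical and only mimic the column names from the info() output above. Where exactly to call the cast inside store_messages() is an assumption on my part.

```python
import pandas as pd


def python_types_per_object_column(df):
    """Map each object-dtype column to the set of Python type names it holds."""
    return {
        col: {type(v).__name__ for v in df[col]}
        for col in df.columns
        if df[col].dtype == object
    }


def cast_object_columns_to_str(df):
    """Return a copy in which every object-dtype column contains only strings."""
    out = df.copy()
    for col in out.columns:
        if out[col].dtype == object:
            out[col] = out[col].map(str)
    return out


# Hypothetical miniature of the 'L' messages: an object column that
# actually contains integers, which is what the error message describes.
sample = pd.DataFrame({
    "stock_locate": [1, 2, 3],
    "mpid": ["ABCD", "EFGH", "IJKL"],
    "primary_market_maker": pd.Series([1, 0, 1], dtype=object),
})

# Diagnose: which Python types hide behind each object column?
print(python_types_per_object_column(sample))

# Work around: make the object columns uniformly string-valued.
fixed = cast_object_columns_to_str(sample)
print(fixed["primary_market_maker"].tolist())
```

If the diagnosis helper reports anything other than str for a column, that column is a candidate for the cast. Numeric and timedelta columns are left untouched, so only the columns that trip the serializer change.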