Error in Chapter 2 Code

Hi all.

This is probably a basic problem in the first Jupyter notebook, but I have struggled with it for two days.

I tried to run …\02_market_and_fundamental_data\01_NASDAQ_TotalView-ITCH_Order_Book\01_parse_itch_order_flow_messages.ipynb, and it produces the following errors.

  1. In cell #31 (the error is the two lines starting with "Cannot serialize"):

    Start of Messages
    03:02:31.65 0

    Start of System Hours
    04:00:00.00 241,258

    Start of Market Hours
    09:30:00.00 9,559,279
    09:44:09.23 25,000,000 00:00:52.34
    Cannot serialize the column [primary_market_maker]
    because its data contents are not [string] but [integer] object dtype
    L
    <class 'pandas.core.frame.DataFrame'>
    RangeIndex: 214749 entries, 0 to 214748
    Data columns (total 7 columns):
     #   Column                    Non-Null Count   Dtype
    ---  ------                    --------------   -----
     0   stock_locate              214749 non-null  int64
     1   tracking_number           214749 non-null  int64
     2   timestamp                 214749 non-null  timedelta64[ns]
     3   mpid                      214749 non-null  object
     4   primary_market_maker      214749 non-null  object
     5   market_maker_mode         214749 non-null  object
     6   market_participant_state  214749 non-null  object

    D 9044692
    A 10094291
    dtype: int64
    Duration: 00:00:54.44

  2. In cell #34:
    KeyError: 'No object named P in the file'

I tried to run this notebook on Windows 11 with Python 3.11.2 and on Ubuntu 22.04.2 LTS with Python 3.10. It failed on both.

Could anyone help me out?

Thank you.

Also, in the Chapter 3 example, “META” should replace “FB”.

I haven’t solved the issue yet, but I’m seeing the same error message when running the notebook:

"KeyError: 'No object named P in the file'"

However, the output from the script that uses the store_messages() function does show that it is parsing P messages:

 Start of Messages
	03:02:31.65	           0

 Start of System Hours
	04:00:00.00	     241,258

 Start of Market Hours
	09:30:00.00	   9,559,279
	09:44:09.23	  25,000,000	00:00:43.43
S
R
H
Y
L
Cannot serialize the column [primary_market_maker]
because its data contents are not [string] but [integer] object dtype
L
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 214749 entries, 0 to 214748
Data columns (total 7 columns):
 #   Column                    Non-Null Count   Dtype          
---  ------                    --------------   -----          
 0   stock_locate              214749 non-null  int64          
 1   tracking_number           214749 non-null  int64          
 2   timestamp                 214749 non-null  timedelta64[ns]
 3   mpid                      214749 non-null  object         
 4   primary_market_maker      214749 non-null  object         
 5   market_maker_mode         214749 non-null  object         
 6   market_participant_state  214749 non-null  object         
dtypes: int64(2), object(4), timedelta64[ns](1)
memory usage: 11.5+ MB
None
S    1
U    1
Q    1
C    1
I    1
V    1
P    1
E    1
X    1
R    1
F    1
D    1
A    1
L    1
Y    1
H    1
J    1
Name: count, dtype: int64
J           1
V           1
S           3
H        8885
Q        8887
R        8887
Y        8926
C        9176
P      108412
L      214749
E      364951
F      836655
I     1072326
X     1086393
U     2132765
D     9044692
A    10094291
dtype: int64
Duration: 00:00:45.49

For whatever reason, the P messages are not being appended to the HDF5 file, perhaps as a consequence of this error in the output:

Cannot serialize the column [primary_market_maker]
because its data contents are not [string] but [integer] object dtype
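
One way to test this hypothesis is to look for object columns that mix Python types, which is what the serialization message complains about, and to cast them to strings before writing. The following is a minimal sketch that uses only pandas. The helper names mixed_type_columns and cast_object_columns_to_str are hypothetical and not part of the notebook, and the sample values are made up for illustration. Whether this also makes the P table appear depends on how store_messages() handles the failed write, which is not verified here.

```python
import pandas as pd


def mixed_type_columns(df):
    """Return object-dtype columns whose non-null values have more than one Python type."""
    mixed = {}
    for col in df.select_dtypes(include="object").columns:
        type_names = {type(value).__name__ for value in df[col].dropna()}
        if len(type_names) > 1:
            mixed[col] = sorted(type_names)
    return mixed


def cast_object_columns_to_str(df):
    """Return a copy of df in which every object-dtype column holds only strings."""
    out = df.copy()
    for col in out.select_dtypes(include="object").columns:
        out[col] = out[col].map(str)
    return out


# Made-up sample that mimics the reported problem: one column mixes str and int.
sample = pd.DataFrame({
    "stock_locate": [1, 2, 3],
    "mpid": ["ABCD", "EFGH", "IJKL"],
    "primary_market_maker": ["Y", 0, "N"],
})

print(mixed_type_columns(sample))   # {'primary_market_maker': ['int', 'str']}
cleaned = cast_object_columns_to_str(sample)
print(mixed_type_columns(cleaned))  # {}
```

In the notebook, a cast like this would go just before the DataFrame is written to the store. Calling astype(str) on only the affected column would be a narrower alternative.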