04_alpha_factor_research/01_feature_engineering.ipynb

Under "Create monthly return series", I've replicated the code using Bloomberg data and have a few questions:

  1. Why would we compute returns this way when it ignores dividends? The returns calculated with the code in the notebook are very different from the total returns of the same assets in the Bloomberg data.
  2. The notebook states, "Finally, we normalize returns using the geometric average", but I can't see any averaging in the code provided.
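
To make the first question concrete, here is a minimal sketch of the gap I mean, using made-up numbers. The `prices` and `dividends` series are hypothetical (not Bloomberg or notebook data), and it assumes the price series is an unadjusted close; if the series were already dividend-adjusted, the gap would not appear.

```python
import pandas as pd

# Hypothetical month-end closing prices (unadjusted) and the cash
# dividends paid during each month. Illustrative values only.
prices = pd.Series([100.0, 102.0, 101.0, 105.0])
dividends = pd.Series([0.0, 1.0, 0.0, 0.5])

# Price return: what pct_change() yields on an unadjusted price series.
price_return = prices.pct_change()

# Total return: adds the dividend received to the end-of-month price.
total_return = (prices + dividends) / prices.shift(1) - 1

# The difference is the dividend divided by the previous month's price.
gap = total_return - price_return
print(pd.DataFrame({"price": price_return,
                    "total": total_return,
                    "gap": gap}))
```

In months with no dividend the two returns coincide, so any persistent difference comes from the dividend-paying months compounding over longer lags.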

outlier_cutoff = 0.01
data = pd.DataFrame()
lags = [1, 2, 3, 6, 9, 12]
for lag in lags:
    data[f'return_{lag}m'] = (monthly_prices
                              .pct_change(lag)
                              .stack()
                              .pipe(lambda x: x.clip(lower=x.quantile(outlier_cutoff),
                                                     upper=x.quantile(1-outlier_cutoff)))
                              .add(1)
                              .pow(1/lag)
                              .sub(1)
                              )
data = data.swaplevel().dropna()
data.info()
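
On the second question, my best reading is that the `.add(1).pow(1/lag).sub(1)` chain is where the geometric averaging would happen: it converts a return over `lag` months into an equivalent per-month rate. Below is a minimal check of that reading with hypothetical monthly returns (the `monthly` values are made up, and the clipping step is left out for simplicity).

```python
import numpy as np

# Hypothetical monthly returns for a 3-month window (illustrative only).
monthly = np.array([0.02, -0.01, 0.03])
lag = len(monthly)

# Return over the whole window, i.e. what pct_change(lag) would produce.
cumulative = np.prod(1 + monthly) - 1

# The notebook's chain: .add(1).pow(1/lag).sub(1)
per_month = (1 + cumulative) ** (1 / lag) - 1

# Geometric mean of the monthly growth factors, computed directly.
geo_mean = np.exp(np.mean(np.log1p(monthly))) - 1

print(per_month, geo_mean)
```

If this reading is right, the two printed values agree, and compounding `per_month` over `lag` months recovers the cumulative return. So the "average" is implicit in the `1/lag` exponent rather than in a call to a mean function.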

Thanks