Zipline in Google Colab!

How to install Zipline in Google Colab?

Sorry for the late response. You can run the following to install TA-Lib:

%%bash
wget http://prdownloads.sourceforge.net/ta-lib/ta-lib-0.4.0-src.tar.gz
tar -xzf ta-lib-0.4.0-src.tar.gz
cd ta-lib/
./configure
make
make install

and then install Zipline using pip:

%pip install zipline-reloaded

After this,

import zipline
zipline.__version__

should show the latest version (I just checked).

Evening Everyone!

After a month or so of learning on the laptop, I realized that it wasn’t quite up to the challenge of some of these models. I figured that working through all of ML4T on Google Colab would be a faster way to get access to more powerful hardware, so I started moving everything over Saturday evening. I’ve been able to clear up the straightforward library discrepancies in many of the examples, and I’m also getting Quandl data pulled (I think?).

I am really stuck on getting zipline to execute without throwing an error on the first line. The error is: … assets-7.sqlite’ doesn’t exist. Below are all the specifics of what I’m doing in chapter 4: 04_single_factor_zipline

I worked my way through the #205 pickling error on the laptop, so I’m just trying to get zipline to work with the workaround described in the GitHub comments: comment out the first line and add the workaround at the bottom. (I have tried it both ways in Colab with no luck.)

All my digging into this is turning me upside down. The zipline docs suggest it is related to the zipline_root directory (I admit to being lost on how that gets implemented, let alone resolved, in Colab) and that the resolution is to not import with:

zipline ingest -b quandl

But I also know that zipline-reloaded is the latest and greatest, and the docs say to do it that way…
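
For what it's worth, here is a minimal sketch of how the data directory gets resolved, assuming Zipline follows its documented behavior of reading the ZIPLINE_ROOT environment variable and falling back to ~/.zipline when it is not set. The helper name resolve_zipline_root is made up for illustration and is not part of Zipline.

```python
import os
from pathlib import Path


def resolve_zipline_root(environ=None):
    """Return the directory Zipline is expected to use for its data.

    Assumption: ZIPLINE_ROOT overrides the default of ~/.zipline.
    Illustrative helper only, not part of the Zipline API.
    """
    environ = os.environ if environ is None else environ
    root = environ.get("ZIPLINE_ROOT")
    if root is None:
        # Default location; on Colab the home directory is /root.
        return Path.home() / ".zipline"
    return Path(root)


# Show where ingested bundles would be looked up in this session.
print(resolve_zipline_root() / "data")
```

If that assumption holds, the ingest step and the backtest need to see the same ZIPLINE_ROOT (or both leave it unset), otherwise they would look in different folders.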

Hope this makes sense. Happy to clarify / add / break / fix

I really think that if I can get this working, I’ll be off and running at full speed in Colab. I’m still working through some file handling issues around reading and writing the .h5 files, but I’m confident I’ll get that sorted and will post what I’ve found.

Thanks,
Doug


ValueError Traceback (most recent call last)
in ()
6 capital_base=10_000_000,
7 before_trading_start=before_trading_start,
----> 8 bundle="quandl",
9 )

6 frames
/usr/local/lib/python3.7/dist-packages/zipline/utils/sqlite_utils.py in verify_sqlite_path_exists(path)
31 def verify_sqlite_path_exists(path):
32 if path != ":memory:" and not os.path.exists(path):
---> 33 raise ValueError("SQLite file {!r} doesn't exist.".format(path))
34
35

ValueError: SQLite file '/root/.zipline/data/quandl/2021-11-02T01;00;05.475419/assets-7.sqlite' doesn't exist.
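
One way to see what the ingest actually wrote is to list the assets-*.sqlite files in each timestamped folder of the bundle. This is only a sketch, assuming the layout shown in the traceback (<root>/data/<bundle>/<timestamp>/); list_asset_dbs is a made-up helper name, not a Zipline function.

```python
from pathlib import Path


def list_asset_dbs(bundle_dir):
    """Map each ingestion folder of a bundle to its assets-*.sqlite files.

    bundle_dir is e.g. /root/.zipline/data/quandl; each sub-folder is
    assumed to be one ingestion, named after its timestamp.
    """
    bundle_dir = Path(bundle_dir)
    if not bundle_dir.is_dir():
        return {}
    return {
        run.name: sorted(p.name for p in run.glob("assets-*.sqlite"))
        for run in sorted(bundle_dir.iterdir())
        if run.is_dir()
    }


# Path taken from the traceback; prints nothing if the folder is missing.
for run, dbs in list_asset_dbs("/root/.zipline/data/quandl").items():
    print(run, dbs or "no assets database found")
```

A folder with no assets database, or with a different number than the one in the error message, would at least narrow down where to look next.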

%%bash
wget http://prdownloads.sourceforge.net/ta-lib/ta-lib-0.4.0-src.tar.gz
tar -xzf ta-lib-0.4.0-src.tar.gz
cd ta-lib/
./configure
make
make install

%pip install zipline-reloaded

### Have tried it with and without upgrading pip, setuptools and wheel
!pip install -U pip setuptools wheel

import zipline
zipline.__version__

%matplotlib inline
import matplotlib.pyplot as plt
import seaborn as sns

sns.set_style('whitegrid')

%load_ext zipline

Below is what I have tried, both ways. Here the %%zipline magic is shown commented out, with the extra code at the bottom to start the execution.

#%%zipline --start 2015-1-1 --end 2018-1-1 --output single_factor.pickle --no-benchmark --bundle quandl

from zipline.api import (
    attach_pipeline,
    date_rules,
    time_rules,
    order_target_percent,
    pipeline_output,
    record,
    schedule_function,
    get_open_orders,
    calendars
)
from zipline.finance import commission, slippage
from zipline.pipeline import Pipeline, CustomFactor
from zipline.pipeline.factors import Returns, AverageDollarVolume
import numpy as np
import pandas as pd

MONTH = 21
YEAR = 12 * MONTH
N_LONGS = N_SHORTS = 25
VOL_SCREEN = 1000

class MeanReversion(CustomFactor):
    """Compute ratio of latest monthly return to 12m average,
    normalized by std dev of monthly returns"""
    inputs = [Returns(window_length=MONTH)]
    window_length = YEAR

    def compute(self, today, assets, out, monthly_returns):
        df = pd.DataFrame(monthly_returns)
        out[:] = df.iloc[-1].sub(df.mean()).div(df.std())

def compute_factors():
    """Create factor pipeline incl. mean reversion,
    filtered by 30d Dollar Volume; capture factor ranks"""
    mean_reversion = MeanReversion()
    dollar_volume = AverageDollarVolume(window_length=30)
    return Pipeline(columns={'longs': mean_reversion.bottom(N_LONGS),
                             'shorts': mean_reversion.top(N_SHORTS),
                             'ranking': mean_reversion.rank(ascending=False)},
                    screen=dollar_volume.top(VOL_SCREEN))

def exec_trades(data, assets, target_percent):
    """Place orders for assets using target portfolio percentage"""
    for asset in assets:
        if data.can_trade(asset) and not get_open_orders(asset):
            order_target_percent(asset, target_percent)

def rebalance(context, data):
    """Compute long, short and obsolete holdings; place trade orders"""
    factor_data = context.factor_data
    record(factor_data=factor_data.ranking)

    assets = factor_data.index
    record(prices=data.current(assets, 'price'))

    longs = assets[factor_data.longs]
    shorts = assets[factor_data.shorts]
    divest = set(context.portfolio.positions.keys()) - set(longs.union(shorts))

    exec_trades(data, assets=divest, target_percent=0)
    exec_trades(data, assets=longs, target_percent=1 / N_LONGS)
    exec_trades(data, assets=shorts, target_percent=-1 / N_SHORTS)

def initialize(context):
    """Setup: register pipeline, schedule rebalancing,
    and set trading params"""
    attach_pipeline(compute_factors(), 'factor_pipeline')
    schedule_function(rebalance,
                      date_rules.week_start(),
                      time_rules.market_open(),
                      calendar=calendars.US_EQUITIES)
    context.set_commission(commission.PerShare(cost=.01, min_trade_cost=0))
    context.set_slippage(slippage.VolumeShareSlippage())

def before_trading_start(context, data):
    """Run factor pipeline"""
    context.factor_data = pipeline_output('factor_pipeline')
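
As a side note on what the MeanReversion factor above computes: the expression in compute() is a per-asset z-score of the latest monthly return against that asset's own history in the window. Here is a small standalone sketch with toy numbers (the input array is invented for illustration; rows are days, columns are assets), checked against the equivalent NumPy formula.

```python
import numpy as np
import pandas as pd

# Toy window: 4 rows of trailing monthly returns for 2 assets (made-up data).
monthly_returns = np.array([
    [0.01, 0.04],
    [0.02, 0.02],
    [0.03, 0.00],
    [0.06, -0.02],
])

df = pd.DataFrame(monthly_returns)
# Same expression as in MeanReversion.compute: latest value minus the
# column mean, divided by the column standard deviation (pandas uses ddof=1).
score = df.iloc[-1].sub(df.mean()).div(df.std())

# Equivalent NumPy form, used here only as a cross-check.
expected = (monthly_returns[-1] - monthly_returns.mean(axis=0)) / monthly_returns.std(axis=0, ddof=1)

print(score)
```

In this toy window the first asset ends above its average and gets a positive score, while the second ends below its average and gets a negative one. Since the pipeline takes longs from bottom(N_LONGS), the lowest scores are the ones that get bought.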

This is what I found on how to do this in Colab…

import os
os.environ['QUANDL_API_KEY'] = 'my super secret API Key'
!zipline ingest -b quandl

This is the run code that causes the error

from zipline import run_algorithm
result = run_algorithm(
    start=pd.Timestamp("2015-1-1", tz="UTC"),
    end=pd.Timestamp("2018-1-1", tz="UTC"),
    initialize=initialize,
    capital_base=10_000_000,
    before_trading_start=before_trading_start,
    bundle="quandl",
)

I’m planning to include some instructions for Google Colab in a 3rd edition, given how popular it is. I have not yet had time to test this environment because it’s not covered in the 2nd edition, so I don’t have an answer at this point. My apologies.

Install quandl:

!pip install quandl

Then run the following:

!QUANDL_API_KEY=<your_own_API_key> zipline ingest -b quandl
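
The one-liner above sets the key only for that single command. A rough Python equivalent, in case you want to inspect the exit code and output of the ingest, could look like the sketch below. run_with_api_key is a made-up helper, and the demo call uses a stand-in command that just echoes the variable, so it runs even without Zipline installed.

```python
import os
import subprocess
import sys


def run_with_api_key(api_key, cmd=("zipline", "ingest", "-b", "quandl")):
    """Run cmd with QUANDL_API_KEY set only for that child process."""
    env = dict(os.environ, QUANDL_API_KEY=api_key)
    return subprocess.run(cmd, env=env, capture_output=True, text=True)


# Stand-in command (not the real ingest): print the key the child received.
demo = run_with_api_key(
    "dummy-key",
    cmd=(sys.executable, "-c", "import os; print(os.environ['QUANDL_API_KEY'])"),
)
print(demo.returncode, demo.stdout.strip())
```

With the default cmd, a non-zero returncode or anything in stderr would be worth reading before moving on to run_algorithm.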