ValueError: 'date' is both an index level and a column label, which is ambiguous

tsferro · November 3, 2021, 3:40pm

When attempting to run the last cell in the Chapter 8 notebook “03_backtesting_with_backtrader”:

pf.create_full_tear_sheet(returns,
                          transactions=transactions,
                          positions=positions,
                          benchmark_rets=benchmark.dropna())

I am getting the following error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
in
2 transactions=transactions,
3 positions=positions,
----> 4 benchmark_rets=benchmark.dropna())

~\anaconda3\lib\site-packages\pyfolio\tears.py in create_full_tear_sheet(returns, positions, transactions, market_data, benchmark_rets, slippage, live_start_date, sector_mappings, bayesian, round_trips, estimate_intraday, hide_positions, cone_std, bootstrap, unadjusted_returns, style_factor_panel, sectors, caps, shares_held, volumes, percentile, turnover_denom, set_context, factor_returns, factor_loadings, pos_in_dollars, header_rows, factor_partitions)
    197 
    198     positions = utils.check_intraday(estimate_intraday, returns,
--> 199                                      positions, transactions)
    200 
    201     create_returns_tear_sheet(

~\anaconda3\lib\site-packages\pyfolio\utils.py in check_intraday(estimate, returns, positions, transactions)
    298                               'ons from transactions. Set estimate_intraday' +
    299                               '=False to disable.')
--> 300                 return estimate_intraday(returns, positions, transactions)
    301             else:
    302                 return positions

~\anaconda3\lib\site-packages\pyfolio\utils.py in estimate_intraday(returns, positions, transactions, EOD_hour)
    350     # Cumulate transaction amounts each day
    351     txn_val['date'] = txn_val.index.date
--> 352     txn_val = txn_val.groupby('date').cumsum()
    353 
    354     # Calculate exposure, then take peak of exposure every day

~\anaconda3\lib\site-packages\pandas\core\frame.py in groupby(self, by, axis, level, as_index, sort, group_keys, squeeze, observed, dropna)
   6725             squeeze=squeeze,
   6726             observed=observed,
-> 6727             dropna=dropna,
   6728         )
   6729 

~\anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in __init__(self, obj, keys, axis, level, grouper, exclusions, selection, as_index, sort, group_keys, squeeze, observed, mutated, dropna)
    566                 observed=observed,
    567                 mutated=self.mutated,
--> 568                 dropna=self.dropna,
    569             )
    570 

~\anaconda3\lib\site-packages\pandas\core\groupby\grouper.py in get_grouper(obj, key, axis, level, sort, observed, mutated, validate, dropna)
    803             if gpr in obj:
    804                 if validate:
--> 805                     obj._check_label_or_level_ambiguity(gpr, axis=axis)
    806                 in_axis, name, gpr = True, gpr, obj[gpr]
    807                 exclusions.add(name)

~\anaconda3\lib\site-packages\pandas\core\generic.py in _check_label_or_level_ambiguity(self, key, axis)
   1636                 f"{label_article} {label_type} label, which is ambiguous."
   1637             )
-> 1638             raise ValueError(msg)
   1639 
   1640     @final

ValueError: 'date' is both an index level and a column label, which is ambiguous.

I tried renaming the index of the “transactions” df to an arbitrary name, but the same error message is generated:
transactions.index.names = ['Date_Time'] # workaround

This is confusing because “transactions” is the only df among the 4 create_full_tear_sheet arguments that has “date” as an index name, and none of the columns in any of the 4 dfs is called “date”. (Is a column label different from a column name?)

Any thoughts are much appreciated.

mresquivias · May 1, 2022, 9:05am

Have you found any solution?
I am facing the same issue.
I realized that if you comment out either transactions or positions, the code works, but the results are biased.

I think there is something inconsistent between these 2 DataFrames.

Best,

DavidXu · July 26, 2022, 3:14am

Me too. Waiting online…

CyCTW · November 4, 2023, 2:20pm

The problem is in the pyfolio package. The error means that a DataFrame has an index level whose name is the same as one of its column labels.
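The ambiguity can be reproduced without pyfolio. The snippet below is only a minimal sketch: the toy data and the "AAPL" column are made up, and the frame merely stands in for the intermediate txn_val that appears in the traceback. It gives a frame an index level named 'date', adds a column also named 'date', and then groups by that label.

```python
import pandas as pd

# Hypothetical toy data standing in for pyfolio's intermediate txn_val.
idx = pd.to_datetime(["2021-01-04 10:00", "2021-01-04 15:00", "2021-01-05 11:00"])
txn_val = pd.DataFrame({"AAPL": [100.0, -40.0, 25.0]}, index=idx)

txn_val.index.name = "date"            # index level called 'date'
txn_val["date"] = txn_val.index.date   # column also called 'date'

# Grouping by the string 'date' can now refer to either the index level
# or the column, so pandas refuses to guess and raises ValueError.
try:
    txn_val.groupby("date").cumsum()
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

This also suggests why the columns of the DataFrames passed in never show a "date" column: the clashing column is created inside pyfolio, on line 351 of the traceback.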

To fix this, I had to patch the pyfolio source code with the following steps:

Open utils.py in the package source. (E.g. my package is at ~/.local/lib/pyfolio/utils.py)
Go to line 351, where you will find code like this:

# Cumulate transaction amounts each day
txn_val['date'] = txn_val.index.date
txn_val = txn_val.groupby('date').cumsum()

Rename the date column to something else.
For example:

# Cumulate transaction amounts each day
txn_val['date_tmp'] = txn_val.index.date
txn_val = txn_val.groupby('date_tmp').cumsum()
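A possible variation, shown here only as a sketch on made-up toy data (the "AAPL" column and the values are hypothetical): instead of adding a helper column at all, the array of calendar dates can be passed straight to groupby. On this toy frame both approaches give the same per-day cumulative sums. I have not run this variant against pyfolio's own tests, and it still requires editing the installed package.

```python
import pandas as pd

# Hypothetical toy data standing in for pyfolio's intermediate txn_val.
idx = pd.to_datetime(["2021-01-04 10:00", "2021-01-04 15:00", "2021-01-05 11:00"])
txn_val = pd.DataFrame({"AAPL": [100.0, -40.0, 25.0]}, index=idx)
txn_val.index.name = "date"

# Variant 1: the renamed helper column from the steps above.
with_tmp = txn_val.copy()
with_tmp["date_tmp"] = with_tmp.index.date
cum_renamed = with_tmp.groupby("date_tmp").cumsum()

# Variant 2: group by the array of calendar dates directly, so no column
# is added and no label can clash with the index name.
cum_direct = txn_val.groupby(txn_val.index.date).cumsum()

print(cum_renamed)
print(cum_direct)
```

Either way, the point is that the grouping key must not share a label with the index level named 'date'.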

Hope it helps!