Pandas Applying Multicolumnindex To Dataframe

The situation is that I have a few files with time-series data for various stocks, each with several fields. Each file contains time, open, high, low, close, volume. The goal is to get t

Solution 1:

I think you can first collect all the file paths into a list, then use a list comprehension to read each file into a DataFrame, and concat the DataFrames by columns (axis=1). If you add the keys parameter, you get a MultiIndex in the columns:

Files:

a.csv, b.csv, c.csv

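import glob

import pandas as pd

# Sort the paths so the file order is deterministic and lines up with the keys below
files = sorted(glob.glob('files/*.csv'))
dfs = [pd.read_csv(fp) for fp in files]

# One key per file; the keys become the outer level of the column MultiIndex
eqty_names_list = ['hk1', 'hk2', 'hk3']
df = pd.concat(dfs, keys=eqty_names_list, axis=1)

print(df)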
  hk1       hk2       hk3      
    a  b  c   a  b  c   a  b  c
0   0  1  2   0  9  6   0  7  1
1   1  5  8   1  6  4   1  3  2
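
A hard-coded key list only works if the keys are in the same order as the files, and glob.glob does not guarantee any particular order. A minimal sketch of one way to avoid the mismatch, assuming each file is named after its stock (hk1.csv, hk2.csv, which is an assumption and not stated in the question). The helper name concat_with_file_keys is hypothetical, and the demo writes throwaway files so it can run on its own:

```python
import tempfile
from pathlib import Path

import pandas as pd


def concat_with_file_keys(folder):
    """Read every CSV in `folder` and concatenate column-wise, keyed by file stem."""
    # Sort the paths so the result does not depend on directory listing order
    paths = sorted(Path(folder).glob('*.csv'))
    dfs = [pd.read_csv(p) for p in paths]
    # Assumption: the file name without extension is the stock name
    keys = [p.stem for p in paths]
    return pd.concat(dfs, keys=keys, axis=1)


# Demo with temporary files (illustrative data only)
sample = {
    'hk1': [[0, 1, 2], [1, 5, 8]],
    'hk2': [[0, 9, 6], [1, 6, 4]],
}
with tempfile.TemporaryDirectory() as tmp:
    for name, rows in sample.items():
        frame = pd.DataFrame(rows, columns=['a', 'b', 'c'])
        frame.to_csv(Path(tmp) / f'{name}.csv', index=False)
    demo = concat_with_file_keys(tmp)

print(demo)
```

Because each key is taken from the same path that produced the DataFrame, a key cannot end up attached to the wrong file.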

Finally, swap the two column levels with swaplevel and sort the columns with sort_index:

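# Move the field names (a, b, c) to the outer level and the stock names to the inner level
df.columns = df.columns.swaplevel(0, 1)
# Sort the columns so that all stocks for the same field sit next to each other
df = df.sort_index(axis=1)
print(df)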
    a           b           c        
  hk1 hk2 hk3 hk1 hk2 hk3 hk1 hk2 hk3
0   0   0   0   1   9   7   2   6   1
1   1   1   1   5   6   3   8   4   2
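
The whole approach can also be tried without any files. A minimal sketch, assuming the three DataFrames hold the same values as the output above; the names frames, wide and by_field are illustrative only. It passes a dict to concat, so the dict keys serve as the outer column level, and it uses DataFrame.swaplevel with axis=1 as an alternative to assigning to df.columns:

```python
import pandas as pd

# In-memory stand-ins for the three CSV files (values taken from the output above)
frames = {
    'hk1': pd.DataFrame({'a': [0, 1], 'b': [1, 5], 'c': [2, 8]}),
    'hk2': pd.DataFrame({'a': [0, 1], 'b': [9, 6], 'c': [6, 4]}),
    'hk3': pd.DataFrame({'a': [0, 1], 'b': [7, 3], 'c': [1, 2]}),
}

# With a dict, concat uses the dict keys as the outer level of the columns
wide = pd.concat(frames, axis=1)

# Put the field on the outer level and the stock on the inner level, then sort
by_field = wide.swaplevel(0, 1, axis=1).sort_index(axis=1)
print(by_field)

# Selecting an outer-level label returns that field for every stock
print(by_field['a'])
```

With the field on the outer level, by_field['a'] returns a DataFrame with one column per stock, which is convenient for comparing a single field across all stocks.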
