Merge in R large number of data tables with different row count vertically with additional column for each original list in the merged list -


i've ohlc+volume data aapl, goog, txn etc (all constituents of s&p 500) in individual data tables loaded in r environment. data tables has 7 columns, not has equal number of rows.

i need merge all tables, 1 top of another. distinguish between various companies in merged table, need additional column have name of original tables rows taken original table. no list of tables merged exists. need merge existing tables in r workspace. apologies being unclear.

also performance might issues; cross more 3 mn rows 8 rows. tried plyr , reshape, of no avail.

as example:

table1: named aapl     col1 col2 1. 2    4 2. 3    11   table2: named goog     col1 col2 1. 10    12 2. 14    19 3. 11    15    merged table: named output     col1 col2  col3 1. 2    4     aapl 2. 3    11    aapl 3. 10   12    goog 4. 14   19    goog 5. 11   15    goog 

sample of actual data:

 > str(astrazen) 'data.frame':   3440 obs. of  7 variables:  $ date  : int  20130628 20130628 20130628 20130628 20130628 20130628 20130628 20130628 20130628 20130628 ...  $ time  : int  916 917 918 919 920 921 922 923 924 925 ...  $ close : num  660 658 654 657 654 ...  $ high  : num  660 660 655 657 656 ...  $ low   : num  653 654 654 652 652 ...  $ open  : num  654 660 655 654 655 ...  $ volume: int  522 265 320 498 417 138 135 975 132 126 ...`  > head(astrazen)       date time  close   high    low   open volume 1 20130628  916 659.95 659.95 652.55 654.00    522 2 20130628  917 658.00 659.95 654.20 659.95    265 3 20130628  918 654.00 655.00 654.00 655.00    320 4 20130628  919 656.65 656.65 652.50 654.00    498 5 20130628  920 653.65 655.85 651.95 655.00    417 6 20130628  921 654.00 654.50 654.00 654.50    138 

it's not clear how you're going have list of names in op, assuming it's literally list of names, can this:

library(data.table) aapl = data.table(col1 = c(2:3), col2 = c(4,11)) goog = data.table(col1 = c(10:12), col2 = c(1:3))  sp = list('aapl', 'goog') rbindlist(lapply(sp, function(name) get(name)[, col3 := name])) #   col1 col2 col3 #1:    2    4 aapl #2:    3   11 aapl #3:   10    1 goog #4:   11    2 goog #5:   12    3 goog 

edit

if you'd list of data.frame objects in environment, , above, can do:

all.objects = sapply(ls(), function(x) class(get(x))) sp = names(all.objects[all.objects == 'data.frame']) rbindlist(lapply(sp, function(name) data.table(get(name))[, col3 := name])) 

Comments