Merge in R large number of data tables with different row count vertically with additional column for each original list in the merged list -
i've ohlc+volume data aapl, goog, txn etc (all constituents of s&p 500) in individual data tables loaded in r environment. data tables has 7 columns, not has equal number of rows.
i need merge all tables, 1 top of another. distinguish between various companies in merged table, need additional column have name of original tables rows taken original table. no list of tables merged exists. need merge existing tables in r workspace. apologies being unclear.
also performance might issues; cross more 3 mn rows 8 rows. tried plyr , reshape, of no avail.
as example:
table1: named aapl col1 col2 1. 2 4 2. 3 11 table2: named goog col1 col2 1. 10 12 2. 14 19 3. 11 15 merged table: named output col1 col2 col3 1. 2 4 aapl 2. 3 11 aapl 3. 10 12 goog 4. 14 19 goog 5. 11 15 goog sample of actual data:
> str(astrazen) 'data.frame': 3440 obs. of 7 variables: $ date : int 20130628 20130628 20130628 20130628 20130628 20130628 20130628 20130628 20130628 20130628 ... $ time : int 916 917 918 919 920 921 922 923 924 925 ... $ close : num 660 658 654 657 654 ... $ high : num 660 660 655 657 656 ... $ low : num 653 654 654 652 652 ... $ open : num 654 660 655 654 655 ... $ volume: int 522 265 320 498 417 138 135 975 132 126 ...` > head(astrazen) date time close high low open volume 1 20130628 916 659.95 659.95 652.55 654.00 522 2 20130628 917 658.00 659.95 654.20 659.95 265 3 20130628 918 654.00 655.00 654.00 655.00 320 4 20130628 919 656.65 656.65 652.50 654.00 498 5 20130628 920 653.65 655.85 651.95 655.00 417 6 20130628 921 654.00 654.50 654.00 654.50 138
it's not clear how you're going have list of names in op, assuming it's literally list of names, can this:
library(data.table) aapl = data.table(col1 = c(2:3), col2 = c(4,11)) goog = data.table(col1 = c(10:12), col2 = c(1:3)) sp = list('aapl', 'goog') rbindlist(lapply(sp, function(name) get(name)[, col3 := name])) # col1 col2 col3 #1: 2 4 aapl #2: 3 11 aapl #3: 10 1 goog #4: 11 2 goog #5: 12 3 goog edit
if you'd list of data.frame objects in environment, , above, can do:
all.objects = sapply(ls(), function(x) class(get(x))) sp = names(all.objects[all.objects == 'data.frame']) rbindlist(lapply(sp, function(name) data.table(get(name))[, col3 := name]))
Comments
Post a Comment