I am looking for a solution to compute a weighted sum of several variables by groups with data.table. I hope the example is clear enough.

    require(data.table)

    dt <- data.table(matrix(1:200, nrow = 10))
    dt[, gr := c(rep(1, 5), rep(2, 5))]
    dt[, w := 2]

    # Error: object 'w' not found
    dt[, lapply(.SD, function(x) sum(x * w)),
       .SDcols = paste0("V", 1:4)]

    # Error: object 'w' not found
    dt[, lapply(.SD * w, sum),
       .SDcols = paste0("V", 1:4)]

    # This works without groups
    dt[, lapply(.SD, function(x) sum(x * dt$w)),
       .SDcols = paste0("V", 1:4)]

    # This does not work by groups
    # (dt$w is the full column, not the current group's slice)
    dt[, lapply(.SD, function(x) sum(x * dt$w)),
       .SDcols = paste0("V", 1:4), keyby = gr]

    # The expected result
    dt[, list(V1 = sum(V1 * w),
              V2 = sum(V2 * w),
              V3 = sum(V3 * w),
              V4 = sum(V4 * w)), keyby = gr]

    ### From Arun's answer
    dt[, lapply(.SD[, paste0("V", 1:4), with = FALSE],
                function(x) sum(x * w)), by = gr]

Final attempt (copying Roland's answer :))

Copying @Roland's excellent answer:

    print(dt[, lapply(.SD, function(x, w) sum(x * w), w = w), by = gr][, w := NULL])

Still not the most efficient one (second attempt):
Following @Roland's comment, it is indeed faster to do the operation on all columns and then remove the unwanted ones (as long as the operation itself is not time consuming, which is the case here).

    dt[, {lapply(.SD, function(x) sum(x * w))}, by = gr][, w := NULL][]

For some reason, w does not seem to be found when I don't use {}. No idea why, though.
Old (inefficient) answer:

(Subsetting can be costly if there are many groups.)

You can do this without using .SDcols, removing the unwanted column while passing .SD to lapply, as follows:

    dt[, lapply(.SD[, -1, with = FALSE], function(x) sum(x * w)), by = gr]
    #    gr  V1  V2  V3  V4
    # 1:  1  20 120 220 320
    # 2:  2  70 170 270 370

.SDcols builds .SD without the w column. So it is not possible to multiply by w, because w then does not exist within the scope of the .SD environment.
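For what it's worth, here is a minimal sketch of one more variant that keeps .SDcols and hands w to the function as an explicit argument. It assumes a reasonably recent data.table release, where a column named directly in j stays visible even when .SDcols is set; I have not checked it against the version this answer was written for. The names cols and res are just for illustration.

```r
library(data.table)

# Same example data as in the question
dt <- data.table(matrix(1:200, nrow = 10))
dt[, gr := c(rep(1, 5), rep(2, 5))]
dt[, w := 2]

cols <- paste0("V", 1:4)  # illustrative name, not from the question

# w is named in j outside of .SD, so (assuming a recent data.table)
# it is available per group even though .SDcols excludes it from .SD
res <- dt[, lapply(.SD, function(x, w) sum(x * w), w = w),
          by = gr, .SDcols = cols]

# For this data, V1 is 30 for gr == 1 and 80 for gr == 2
print(res)
```

Because w never becomes part of .SD here, there is no leftover w column to drop afterwards.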