The [SO question](http://stackoverflow.com/a/26345246/559784) that triggered the thought: instead of having to do

```R
DT[, length(unique(.)), by=.]
```

we could do

```R
DT[, n_unique(.), by=.]
```

This would be especially fast for data.tables, because we wouldn't have to subset the entire data.table just to count the number of unique values. Here's a quick benchmark:

```R
require(data.table)
x = sample(1e2, 1e7, TRUE)
system.time(ans1 <- length(unique(x)))  # 0.667 seconds
system.time(ans2 <- length(attr(data.table:::forderv(x, retGrp=TRUE), 'starts')))  # 0.1 seconds
```

In addition, we could also internally optimise calls to `length(unique(.))` into `n_unique(.)`.
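For illustration, here's a minimal R-level sketch of what `n_unique` could look like, wrapping the same `forderv` trick as the benchmark above. This is an assumption about the implementation, not data.table's API; the real version would presumably live in C:

```R
require(data.table)

# Hypothetical n_unique(): counts distinct values without materialising
# unique(x). With retGrp=TRUE, the internal forderv() attaches a 'starts'
# attribute holding the first index of each group in sorted order, so its
# length is exactly the number of unique values.
n_unique <- function(x) {
  o <- data.table:::forderv(x, retGrp = TRUE)
  length(attr(o, "starts", exact = TRUE))
}

# Grouped usage, mirroring the example at the top (column names g and v
# are illustrative):
# DT <- data.table(g = sample(letters, 1e6, TRUE), v = sample(1e3, 1e6, TRUE))
# DT[, n_unique(v), by = g]
```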