49 commits
6fee38c
added condition
venom1204 Jul 8, 2025
9a89e90
news entry
venom1204 Jul 8, 2025
0e39296
Update tables.R
venom1204 Jul 10, 2025
a04d3ed
Merge branch 'master' of https://github.com/Rdatatable/data.table int…
venom1204 Jul 12, 2025
f1cc4c8
updates tests and comments
venom1204 Jul 12, 2025
0c39e3b
Merge branch 'issue_2606' of https://github.com/Rdatatable/data.table…
venom1204 Jul 12, 2025
088bb3e
..
venom1204 Jul 12, 2025
1ae5f7d
updated tes
venom1204 Jul 12, 2025
bb830f8
updaed testst
venom1204 Jul 13, 2025
2a54898
..
venom1204 Jul 13, 2025
4f5ee7f
,,,
venom1204 Jul 13, 2025
04cf5ff
added test for uncovered line
venom1204 Jul 13, 2025
405d693
..
venom1204 Jul 13, 2025
26bd48a
Merge branch 'master' of https://github.com/Rdatatable/data.table int…
venom1204 Jul 13, 2025
597a5b3
..
venom1204 Jul 13, 2025
dccca8f
Merge branch 'master' into issue_2606
MichaelChirico Jul 14, 2025
b69930e
updated tests
venom1204 Jul 15, 2025
1b02090
Merge branch 'master' of https://github.com/Rdatatable/data.table int…
venom1204 Jul 15, 2025
acdb19d
..
venom1204 Jul 15, 2025
f39df47
updated tests
venom1204 Jul 15, 2025
d79b465
Merge branch 'master' into issue_2606
MichaelChirico Jul 15, 2025
91b20a9
ws style
MichaelChirico Jul 15, 2025
9398673
test style
MichaelChirico Jul 15, 2025
9629fa5
ws style
MichaelChirico Jul 15, 2025
8f4d931
use implicit coercion
MichaelChirico Jul 15, 2025
c4cff93
no need to agend[1]=NULL, just drop if from the overwrite
MichaelChirico Jul 15, 2025
ce726c2
i am wrong (need to remove the item for non-list entries)
MichaelChirico Jul 15, 2025
653d798
Use mapply()
MichaelChirico Jul 15, 2025
e01e674
Use vapply_1c
MichaelChirico Jul 15, 2025
63694aa
i18n fix
MichaelChirico Jul 15, 2025
8978cf2
shrink diff with extra newline
MichaelChirico Jul 15, 2025
dec9834
Merge branch 'master' into issue_2606
venom1204 Jul 16, 2025
c757efa
added atime test
venom1204 Jul 16, 2025
1ab6068
table()
venom1204 Jul 16, 2025
e335ccd
..
venom1204 Jul 16, 2025
affa183
..
venom1204 Jul 16, 2025
e6aebc8
tests.R
venom1204 Jul 16, 2025
9746a05
final
venom1204 Jul 16, 2025
9018240
..
venom1204 Jul 16, 2025
a4c7e7a
Merge branch 'issue_2606' of https://github.com/Rdatatable/data.table…
venom1204 Jul 18, 2025
f6ee96d
Merge branch 'master' into issue_2606
venom1204 Jul 21, 2025
ca6637c
Merge branch 'master' of https://github.com/Rdatatable/data.table int…
venom1204 Jul 21, 2025
66024e6
Merge branch 'issue_2606' of https://github.com/Rdatatable/data.table…
venom1204 Jul 21, 2025
16cec98
updated format
venom1204 Jul 21, 2025
5291335
Merge branch 'master' of https://github.com/Rdatatable/data.table int…
venom1204 Jul 22, 2025
c0e3db7
Merge branch 'master' of https://github.com/Rdatatable/data.table int…
venom1204 Aug 18, 2025
074a46a
merged
venom1204 Aug 18, 2025
e63b553
atime test update
venom1204 Aug 18, 2025
d6aa6b5
..
venom1204 Aug 18, 2025
14 changes: 14 additions & 0 deletions .ci/atime/tests.R
@@ -286,5 +286,19 @@ test.list <- atime::atime_test_list(
Slow = "548410d23dd74b625e8ea9aeb1a5d2e9dddd2927", # Parent of the first commit in the PR (https://github.com/Rdatatable/data.table/commit/548410d23dd74b625e8ea9aeb1a5d2e9dddd2927)
Fast = "c0b32a60466bed0e63420ec105bc75c34590865e"), # Commit in the PR (https://github.com/Rdatatable/data.table/pull/7144/commits) that uses a much faster implementation

+"tables() default performance unchanged in #7141" = atime::atime_test(
+N = as.integer(10^seq(1, 4)),
+setup = {
+test_env <- new.env()
+for (i in seq_len(N)) {
+assign(paste0("dt_", i), data.table::data.table(a = 1), envir = test_env)
+assign(paste0("vec_", i), 1:10, envir = test_env)
+}
+},
+expr = data.table::tables(env = test_env, silent = TRUE),
+
+Slow = "7c59daaed1836db57747d92494b1ce96612bbf80", # Parent of the first commit in the PR (https://github.com/Rdatatable/data.table/commit/7c59daaed1836db57747d92494b1ce96612bbf80)
+Fast = "6fee38c89200e10dcc10a6f2057ab784f9a011e7"), # Commit in the PR (https://github.com/Rdatatable/data.table/pull/7141/commits)
+
tests=extra.test.list)
# nolint end: undesirable_operator_linter.
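
Note on the benchmark scale: the `setup` block in this test can be reproduced at a small scale without atime. The following is a rough sketch, assuming plain `data.frame` objects as stand-ins for `data.table` so that it runs in base R; `build_test_env` is a hypothetical helper name and is not part of the PR.

```r
# Hypothetical helper mirroring the atime `setup` block; data.frame stands in
# for data.table::data.table so the sketch needs no packages.
build_test_env <- function(N) {
  test_env <- new.env()
  for (i in seq_len(N)) {
    assign(paste0("dt_", i), data.frame(a = 1), envir = test_env)
    assign(paste0("vec_", i), 1:10, envir = test_env)
  }
  test_env
}

# The sizes exercised by the performance test.
N_values <- as.integer(10^seq(1, 4))
print(N_values)

# Each environment holds 2 * N objects; only half of them are tables.
env_small <- build_test_env(N_values[1])
all_objects <- mget(ls(env_small), envir = env_small)
n_objects <- length(all_objects)
n_tables <- sum(vapply(all_objects, is.data.frame, logical(1)))
print(c(objects = n_objects, tables = n_tables))
```

With the real package, the timed expression would be the same call as in the test, `data.table::tables(env = test_env, silent = TRUE)`. Since `recursive` defaults to `FALSE`, the test seems intended to confirm that the default code path is no slower than before.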
2 changes: 2 additions & 0 deletions NEWS.md
@@ -69,6 +69,8 @@

15. New function `isoyear()` has been implemented as a complement to `isoweek()`, returning the ISO 8601 year corresponding to a given date, [#7154](https://github.com/Rdatatable/data.table/issues/7154). Thanks to @ben-schwen and @MichaelChirico for the suggestion and @venom1204 for the implementation.

+16. `tables()` now supports a `recursive=TRUE` argument to detect `data.table` objects nested within plain lists, such as those produced by `split()` or manual list construction, [#2606](https://github.com/Rdatatable/data.table/issues/2606). The recursive search skips data.frame and data.table objects to avoid descending into list-columns. Nested data.tables are reported with intuitive R-like names using `$` and `[[ ]]` notation. Thanks to @MichaelChirico for the suggestion and @venom1204 for the implementation.
+
### BUG FIXES

1. `fread()` no longer warns on certain systems on R 4.5.0+ where the file owner can't be resolved, [#6918](https://github.com/Rdatatable/data.table/issues/6918). Thanks @ProfFancyPants for the report and PR.
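
Note on the naming convention in item 16: the `$` and `[[ ]]` names reported for nested tables are ordinary R access paths, so in simple cases they can be evaluated to retrieve the object. A small base-R illustration, assuming name strings like those in the PR's tests and using `data.frame` stand-ins; the `fetch` helper is hypothetical and not part of data.table.

```r
mixed <- list(data.frame(x = 1), y = data.frame(z = 2))
nested <- list(l1 = list(l2 = data.frame(c = 3)))

# Hypothetical helper: evaluate a reported name as an R expression
# in the caller's environment.
fetch <- function(path, envir = parent.frame()) {
  eval(parse(text = path), envir = envir)
}

first <- fetch("mixed[[1]]")    # unnamed element, positional path
second <- fetch("mixed$y")      # named element, `$` path
deep <- fetch("nested$l1$l2")   # two levels of named lists
```

A list name that is not syntactically valid (for example, one containing spaces) would presumably need backticks before it could be parsed this way, since the diff builds the path with a plain `paste0()`.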
56 changes: 47 additions & 9 deletions R/tables.R
@@ -19,30 +19,68 @@ type_size = function(DT) {
}

tables = function(mb=type_size, order.col="NAME", width=80L,
-env=parent.frame(), silent=FALSE, index=FALSE)
+env=parent.frame(), silent=FALSE, index=FALSE, recursive=FALSE)
{
# Prints name, size and colnames of all data.tables in the calling environment by default
mb_name = as.character(substitute(mb))
if (isTRUE(mb)) { mb=type_size; mb_name="type_size" }
names = ls(envir=env, all.names=TRUE) # include "hidden" objects (starting with .)
-obj = mget(names, envir=env) # doesn't copy; mget is ok with ... unlike get, #5197
-w = which(vapply_1b(obj, is.data.table))
-if (!length(w)) {
+objs = mget(names, envir=env) # doesn't copy; mget is ok with ... unlike get, #5197
+found_items = list()
+if (recursive) {
+agenda = mapply(function(obj, name) list(obj=obj, name=name), objs, names, SIMPLIFY=FALSE, USE.NAMES=FALSE)
+visited_env = new.env(hash=TRUE)
+
+while (length(agenda)) {
+current_item = agenda[[1L]]
+agenda[[1L]] = NULL
+x = current_item$obj
+x_name = current_item$name
+if (is.data.table(x)) {
+found_items[[length(found_items) + 1L]] = list(name=x_name, obj=x)
+next
+}
+if (is.list(x) && !is.data.frame(x)) {
+# Cycle detection
+addr = address(x)
+if (exists(addr, envir=visited_env, inherits=FALSE)) next
+assign(addr, TRUE, envir=visited_env)
+
+item_names = names(x)
+children_to_add = vector("list", length(x))
+for (i in seq_along(x)) {
+child_name = if (!is.null(item_names) && nzchar(item_names[i])) {
+paste0(x_name, "$", item_names[i])
+} else {
+paste0(x_name, "[[", i, "]]")
+}
+children_to_add[[i]] = list(obj=x[[i]], name=child_name)
+}
+agenda = c(rev(children_to_add), agenda)
+}
+}
+} else {
+w = which(vapply_1b(objs, is.data.table))
+if (length(w)) {
+found_items = lapply(w, function(i) list(name=names[i], obj=objs[[i]]))
+}
+}
+if (!length(found_items)) {
if (!silent) catf("No objects of class data.table exist in %s\n", if (identical(env, .GlobalEnv)) ".GlobalEnv" else format(env))
return(invisible(data.table(NULL)))
}
-info = data.table(NAME=names[w], NROW=0L, NCOL=0L, MB=0.0, COLS=list(), KEY=list(), INDICES=list())
-for (i in seq_along(w)) { # avoid rbindlist(lapply(DT_names)) in case of a large number of tables
-DT = obj[[w[i]]]
+info = data.table(NAME=vapply_1c(found_items, `[[`, "name"), NROW=0L, NCOL=0L, MB=0.0, COLS=list(), KEY=list(), INDICES=list())
+for (i in seq_along(found_items)) { # avoid rbindlist(lapply(DT_names)) in case of a large number of tables
+DT = found_items[[i]]$obj
set(info, i, "NROW", nrow(DT))
set(info, i, "NCOL", ncol(DT))
if (is.function(mb)) set(info, i, "MB", as.integer(mb(DT)/1048576L)) # i.e. 1024**2
if (!is.null(tt<-names(DT))) set(info, i, "COLS", tt) # TODO: don't need these if()s when #5526 is done
if (!is.null(tt<-key(DT))) set(info, i, "KEY", tt)
if (index && !is.null(tt<-indices(DT))) set(info, i, "INDICES", tt)
}
-if (!is.function(mb)) info[,MB:=NULL]
-if (!index) info[,INDICES:=NULL]
+if (!is.function(mb)) info$MB = NULL
+if (!index) info$INDICES = NULL
if (!order.col %chin% names(info)) stopf("order.col='%s' not a column name of info", order.col)
info = info[base::order(info[[order.col]])] # base::order to maintain locale ordering of table names
if (!silent) {
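
Note on the traversal: the recursive branch above is an agenda-driven walk over plain lists that builds an access path for each element as it goes. Below is a minimal standalone sketch of that idea in base R, under stated assumptions: `find_nested` is a hypothetical name, `data.frame` stands in for `data.table`, and the `address()`-based visited check from the diff is omitted. Children are prepended in their original order here, whereas the diff prepends `rev(children_to_add)`; the final result in `tables()` is sorted by `order.col` either way.

```r
# Hypothetical helper, not part of data.table: walks plain lists and returns
# the access path of every element for which `is_target` is TRUE.
find_nested <- function(objs, is_target = is.data.frame) {
  # Each agenda entry pairs an object with the path used to reach it.
  agenda <- unname(Map(function(obj, name) list(obj = obj, name = name),
                       objs, names(objs)))
  found <- character()
  while (length(agenda)) {
    item <- agenda[[1L]]
    agenda[[1L]] <- NULL # pop the head of the agenda
    x <- item$obj
    if (is_target(x)) {
      found <- c(found, item$name)
      next # do not descend into a match, so list-columns are never searched
    }
    if (is.list(x) && !is.data.frame(x)) {
      nms <- names(x)
      children <- lapply(seq_along(x), function(i) {
        path <- if (!is.null(nms) && nzchar(nms[i])) {
          paste0(item$name, "$", nms[i])      # named element
        } else {
          paste0(item$name, "[[", i, "]]")    # unnamed element
        }
        list(obj = x[[i]], name = path)
      })
      # Prepend children so they are visited before the remaining siblings.
      agenda <- c(children, agenda)
    }
  }
  found
}

objs <- list(
  lst_named = list(inner = data.frame(a = 1)),
  mixed = list(data.frame(x = 1), y = data.frame(z = 2)),
  nested = list(l1 = list(l2 = data.frame(c = 3))),
  top = data.frame(t = 1),
  vec = 1:3
)
res <- find_nested(objs)
print(res)
```

Using an explicit agenda instead of a recursive function call keeps the walk independent of R's expression nesting limit, which may be one reason the man page describes the search as iterative.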
34 changes: 34 additions & 0 deletions inst/tests/tests.Rraw
@@ -21620,3 +21620,37 @@ local({
test(2338.9, {fwrite(dd, f, forceDecimal=FALSE); fread(f)}, di)
})

+
+#2606 Recursive tables() naming convention
+local({
+lst_named <- list(inner=data.table(a=1))
+lst_unnamed <- list(data.table(b=2))
+nested <- list(l1=list(l2=data.table(c=3)))
+mixed <- list(data.table(x=1), y=data.table(z=2))
+mixed_nested <- list(A=list(data.table(p=1), q=data.table(q=2)))
+out <- tables(recursive=TRUE)$NAME
+expected <- c(
+"lst_named$inner", "lst_unnamed[[1]]", "nested$l1$l2",
+"mixed[[1]]", "mixed$y",
+"mixed_nested$A[[1]]", "mixed_nested$A$q")
+test(2339.1, out, sort(expected))
+})
+local({
+dt <- data.table(val=42)
+e <- new.env()
+e$dt <- dt
+e$self <- e # possible infinite loop if we're not careful
+test(2339.2, tables(recursive=TRUE, env=e)$NAME, "dt")
+})
+local({
+test_obj <- local({
+common_list <- list(dt_inner=data.table(d=4))
+outer_list <- list(first=common_list, unique=data.table(e=5))
+outer_list$second <- outer_list$first
+outer_list
+})
+out <- tables(recursive=TRUE)$NAME
+test(2339.3, length(out), 2L)
+test(2339.4, "test_obj$unique" %in% out)
+test(2339.5, sum(grepl("\\$dt_inner$", out)), 1L)
+})
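
Note on test 2339.2: it appears to rely on the fact that an environment is not a list, so the walker never descends into `e$self` and cannot loop. A quick base-R check of that assumption, with a `data.frame` standing in for the `data.table`:

```r
e <- new.env()
e$dt <- data.frame(val = 42)  # stand-in for a data.table
e$self <- e                   # self-reference, as in test 2339.2

# The recursive branch only descends when is.list(x) is TRUE.
env_is_list <- is.list(e$self)
same_env <- identical(e$self, e)
print(c(env_is_list = env_is_list, same_env = same_env))
```

Under that reading, only `dt` is reported for `e`, which matches the expected value in the test.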
15 changes: 13 additions & 2 deletions man/tables.Rd
@@ -5,8 +5,9 @@
Convenience function for concisely summarizing some metadata of all \code{data.table}s in memory (or an optionally specified environment).
}
\usage{
-tables(mb=type_size, order.col="NAME", width=80,
-env=parent.frame(), silent=FALSE, index=FALSE)
+tables(mb=type_size, order.col="NAME", width=80L,
+env=parent.frame(), silent=FALSE, index=FALSE,
+recursive=FALSE)
}
\arguments{
\item{mb}{ a function which accepts a \code{data.table} and returns its size in bytes. By default, \code{type_size} (same as \code{TRUE}) provides a fast lower bound by excluding the size of character strings in R's global cache (which may be shared) and excluding the size of list column items (which also may be shared). A column \code{"MB"} is included in the output unless \code{FALSE} or \code{NULL}. }
@@ -15,6 +16,9 @@ tables(mb=type_size, order.col="NAME", width=80,
\item{env}{ An \code{environment}, typically the \code{.GlobalEnv} by default, see Details. }
\item{silent}{ \code{logical}; should the output be printed? }
\item{index}{ \code{logical}; if \code{TRUE}, the column \code{INDICES} is added to indicate the indices associated with each object, see \code{\link{indices}}. }
+\item{recursive}{ \code{logical}; if \code{TRUE}, \code{tables} will perform a full,
+iterative search into list objects to find nested data.tables.
+Defaults to \code{FALSE} for backward compatibility. }
}
\details{
Usually \code{tables()} is executed at the prompt, where \code{parent.frame()} returns \code{.GlobalEnv}. \code{tables()} may also be useful inside functions where \code{parent.frame()} is the local scope of the function; in such a scenario, simply set it to \code{.GlobalEnv} to get the same behaviour as at prompt.
@@ -32,5 +36,12 @@ DT = data.table(A=1:10, B=letters[1:10])
DT2 = data.table(A=1:10000, ColB=10000:1)
setkey(DT,B)
tables()
+
+# Finding data.tables nested in a list
+dt_list <- list(a = data.table(x=1:5), b = data.table(y=6:10))
+# By default, nested tables are not shown:
+tables()
+# Use recursive=TRUE to find them:
+tables(recursive = TRUE)
}
\keyword{ data }