Skip to content

slice_sample() does not interact correctly with set.seed() when working with an SQLite database #466

@stephenashton-dhsc

Description

@stephenashton-dhsc

I'm not too sure if this issue sits within dbplyr or RSQLite, but the slice_sample() function does not appear to be taking account of a seed set via set.seed() when using an SQLite database.

If I set a seed, then run slice_sample(), then later set the same seed, and run slice_sample() again, the sampled table is different.

For comparison, when undertaking these sample operations using a Microsoft SQL Server database, the expected behaviour is observed (i.e., the two sampled tables are identical).

library(DBI)
library(RSQLite)
library(dplyr)
library(dbplyr)


tbl <- tibble(n = 1:30)
conn <- dbConnect(SQLite(), ":memory:")
dbWriteTable(conn, "test_table", tbl)

set.seed(100)
sample1 <- slice_sample(tbl(conn, "test_table"), n = 3)

set.seed(100)
sample2 <- slice_sample(tbl(conn, "test_table"), n = 3)

identical(sample1, sample2)

This has been raised on dbplyr as tidyverse/dbplyr#1344

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions