[R-Forge #5072] Merging using character and factor results in by column becoming NA #499

@arunsrinivasan

Submitted by: Abiel Reinhart; Assigned to: Arun; R-Forge link

When using merge() with all.x=TRUE or all=TRUE, if the column passed to by= is character in the first data.table (x) and factor in the second (y), then rows of x that have no match in y get their by column value set to NA.

Here's an example. Note how the country value "US" from x has become NA in the result.

require(data.table)
x <- data.table(country="US")
y <- data.table(country="USA")
y$country <- factor(y$country)
merge(x, y, by="country", all=TRUE)
#    country
#1:      NA
#2:     USA

This will not occur if the merging column is a factor in the x input but a character in the y input.

x <- data.table(country="US")
y <- data.table(country="USA")
x$country <- factor(x$country)
merge(x, y, by="country", all=TRUE)
#    country
#1:      US
#2:     USA
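Until this is fixed, one possible workaround is to make the by column the same type on both sides before merging. The sketch below is only an assumption about what avoids the problem, not the fix itself: it converts the factor column in y to character so that both inputs are character.

```r
library(data.table)

x <- data.table(country = "US")
y <- data.table(country = factor("USA"))

# Workaround sketch: coerce the factor key in y to character,
# so both by columns have the same type before merging.
y[, country := as.character(country)]

res <- merge(x, y, by = "country", all = TRUE)
print(res)
```

This changes the type of y's column by reference; work on copy(y) if the factor needs to be preserved.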

Labels: High, bug, joins
