Closed
Description
fread is having difficulty with a tab-separated file in which some fields are only partially quoted. To illustrate (str_works confirms that the embedded quotes are the cause):
str_fails = 'L1\tsome\tunquoted\tstuff\nL2\tsome\t"half" quoted\tstuff\nL3\tthis\t"should work"\tok thought'
str_works = gsub('"', '', str_fails)
fread(str_works, sep='\t', header=F, skip=0L)
#   V1   V2          V3         V4
#1: L1 some    unquoted      stuff
#2: L2 some half quoted      stuff
#3: L3 this should work ok thought
fread(str_fails, sep='\t', header=F, skip=0L)
# Error in fread(str_fails, sep = "\t", header = F, skip = 0L) :
#   Expected sep (' ') but ' ' ends field 3 on line 1 when detecting types: L2 some "half" quoted stuff
Would it be possible to add a quote argument to work around this, or alternatively to detect embedded quotes automatically in the quote-classification code?
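For comparison, here is a rough sketch of the behaviour being requested, using base R's read.table, which already accepts a quote argument. This is only an illustration of the intended result (quotes kept as ordinary characters), not a claim about how fread should implement it; the variable name df is just for this example.

```r
# Same reproduction string as in the failing fread call
str_fails <- 'L1\tsome\tunquoted\tstuff\nL2\tsome\t"half" quoted\tstuff\nL3\tthis\t"should work"\tok thought'

# quote = "" disables quote handling, so double quotes are read as
# ordinary characters and only the tab is treated as a field separator
df <- read.table(text = str_fails, sep = "\t", quote = "",
                 header = FALSE, stringsAsFactors = FALSE)

# All three lines parse into four columns; the quotes stay in V3
print(dim(df))
print(df$V3)
```

With quoting disabled the embedded quotes survive in the data, so they can be stripped afterwards with gsub if they are not wanted.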