This repository was archived by the owner on Mar 30, 2020. It is now read-only.

Description
Hi Matt (@mbookman),
So to continue our discussion from #10 (comment), I understand the REST interface here:
https://www.googleapis.com/discovery/v1/apis/genomics/v1alpha2/rest
But this is too cumbersome for bioinformaticians who just want a turn-key solution for running their tools. The examples are great, but each should have a simplified companion version, which would broaden the audience. That includes support for multiple input files, which can be done now even if the backend does not support it directly. It would also help to include examples of connected pipelines as workflows and of nested pipelines - and yes, there are several ways :)
So each example should come with a pipeline definition like the one below, kept in a file that a program (Python, R, Java, etc.) picks up and adapts to the REST interface. The user provides only the necessary information, and the parser translates the generalized names and fills in the remaining required fields on its own:
    Pipeline:
      name: 'fastqc'
      CPU: 1
      RAM: 3.75 GB
      disks:
        name: 'datadisk'
        mountPoint: '/mnt/data'
        size: 500 GB
        persistent: true
      docker:
        image: 'gcr.io/PROJECT_ID_ARGUMENT/fastqc'
        cmd: ( 'mkdir /mnt/data/output && '
               'fastqc /mnt/data/input/* --outdir=/mnt/data/output/' )
      inputParameters:
        name: inputFile + [idx : 1...len(INPUT)]
        location:
          path: 'input/'
          disk: 'datadisk'
      outputParameters:
        name: 'outputPath'
        location:
          path: 'output/*'
          disk: 'datadisk'
    pipelineArgs:
      RAM: 1 GB
      disks:
        name: 'datadisk'
        size: DISK_SIZE_ARGUMENT
        persistent: true
      inputs:
        inputFile + [idx : 1...len(INPUT)]
      outputs:
        path: OUTPUT_ARGUMENT
      logging:
        path: LOGGING_ARGUMENT
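To make the idea concrete, here is a rough Python sketch of what such a parser might do with the definition above. It is only an illustration under several assumptions: the simplified definition is written as a plain dict rather than read from a file, the helper names (`parse_size_gb`, `expand_inputs`, `build_request`) are hypothetical, and the target field names (`ephemeralPipeline`, `minimumRamGb`, `localCopy`, `gcsPath`, and so on) are modeled on my reading of the v1alpha2 interface and should be checked against the discovery document. Disk type and the resource overrides under `pipelineArgs` are left out to keep it short.

```python
import re

# Simplified definition, mirroring the proposal above as a plain dict
# (assumption: a real tool would load this from a file).
SIMPLE_FASTQC = {
    "name": "fastqc",
    "CPU": 1,
    "RAM": "3.75 GB",
    "disks": {"name": "datadisk", "mountPoint": "/mnt/data", "size": "500 GB"},
    "docker": {
        "image": "gcr.io/PROJECT_ID_ARGUMENT/fastqc",
        "cmd": "mkdir /mnt/data/output && "
               "fastqc /mnt/data/input/* --outdir=/mnt/data/output/",
    },
    "inputParameters": {
        "name": "inputFile",
        "location": {"path": "input/", "disk": "datadisk"},
    },
    "outputParameters": {
        "name": "outputPath",
        "location": {"path": "output/*", "disk": "datadisk"},
    },
}


def parse_size_gb(text):
    """Turn a size such as '3.75 GB' into a float number of gigabytes."""
    match = re.fullmatch(r"\s*([0-9]*\.?[0-9]+)\s*GB\s*", str(text))
    if match is None:
        raise ValueError(f"unsupported size: {text!r}")
    return float(match.group(1))


def expand_inputs(prefix, input_urls):
    """Expand 'inputFile + [idx : 1...len(INPUT)]' into inputFile1..inputFileN."""
    return {f"{prefix}{idx}": url for idx, url in enumerate(input_urls, start=1)}


def build_request(simple, project_id, input_urls, output_url, logging_url):
    """Build a request body from the simplified definition.

    The output field names are assumptions modeled on the v1alpha2 interface.
    """
    disk = simple["disks"]
    in_spec = simple["inputParameters"]
    out_spec = simple["outputParameters"]
    inputs = expand_inputs(in_spec["name"], input_urls)
    image = simple["docker"]["image"].replace("PROJECT_ID_ARGUMENT", project_id)

    pipeline = {
        "projectId": project_id,
        "name": simple["name"],
        "docker": {"imageName": image, "cmd": simple["docker"]["cmd"]},
        "resources": {
            "minimumCpuCores": simple["CPU"],
            "minimumRamGb": parse_size_gb(simple["RAM"]),
            "disks": [{
                "name": disk["name"],
                "mountPoint": disk["mountPoint"],
                "sizeGb": int(parse_size_gb(disk["size"])),
            }],
        },
        # One input parameter per expanded name, all sharing the same location.
        "inputParameters": [
            {"name": name, "localCopy": dict(in_spec["location"])}
            for name in inputs
        ],
        "outputParameters": [
            {"name": out_spec["name"], "localCopy": dict(out_spec["location"])}
        ],
    }
    args = {
        "projectId": project_id,
        "inputs": inputs,
        "outputs": {out_spec["name"]: output_url},
        "logging": {"gcsPath": logging_url},
    }
    return {"ephemeralPipeline": pipeline, "pipelineArgs": args}
```

Calling `build_request(SIMPLE_FASTQC, "my-project", ["gs://bucket/a.fq", "gs://bucket/b.fq"], "gs://bucket/out/", "gs://bucket/logs/")` would give two input parameters, `inputFile1` and `inputFile2`, without the user ever spelling them out, which is the multiple-file convenience I am asking for.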
Let me know what you think.
Thanks,
Paul