- Directory with R data files storing counts of L1 target loci throughout each protein-coding Ensembl gene.
- Directory with R data files storing counts of L1 target loci in exonic ranges of each protein-coding Ensembl gene.
- Directory containing analyses of simulation results.
- Notebook to estimate the probability of driver, passenger, and null (no effect) L1 insertions across lung, colon, and brain cancer.
- Notebook for checking that the L1 target site annotation data is valid.
- Notebook for comparing the probability of L1 insertion in each Ensembl gene when only exonic insertions are considered versus insertions across the whole gene.
- Counts the number of L1 target loci of each snap-velcro category across the whole range of each Ensembl gene.
- Notebook that uses the hg38 GFF3 file to count the L1 target sites of each snap-velcro (SV) category separately in known coding and non-coding regions of the genome, and to compute the coding fraction of each chromosome.
- Equivalent to count_cds_sites_for_each_chromosome.ipynb, but for exonic regions rather than only coding regions.
- Uses the table exann.rda (output by sim-develop/src/sim_exon_annotation.ipynb) to determine the number of sites in the exonic regions of each gene in Ensembl v86.
- Notebook to compute the distribution of L1 target loci between exonic and non-exonic regions of the human genome (hg38).
- Notebook to compute the distribution of L1 target loci between gene and non-gene regions of the human genome (hg38).
- Notebook for computing probability of insertion for each Ensembl gene, using the counts computed by count_all_sites_for_each_gene.ipynb and count_exon_sites_for_each_gene.ipynb.
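
As a rough illustration of the last step, per-gene site counts can be turned into insertion probabilities by weighting each snap-velcro category and normalizing over all genes. This is a minimal sketch, not the notebook's actual code: the function name, the category labels, and the weight values are all hypothetical, and normalizing over genes alone gives the probability conditional on the insertion landing in some gene.

```python
from typing import Dict

# Hypothetical relative insertion weights per snap-velcro category.
# The labels and values are placeholders, not taken from the repository.
CATEGORY_WEIGHTS: Dict[str, float] = {
    "cat_a": 4.0,
    "cat_b": 2.0,
    "cat_c": 1.0,
    "cat_d": 1.0,
}


def insertion_probabilities(
    site_counts: Dict[str, Dict[str, int]],
    weights: Dict[str, float] = CATEGORY_WEIGHTS,
) -> Dict[str, float]:
    """Map each gene to its share of the total weighted target-site count.

    site_counts maps gene ID -> {category: number of target sites}.
    """
    # Weighted number of target sites per gene.
    weighted = {
        gene: sum(weights[cat] * n for cat, n in counts.items())
        for gene, counts in site_counts.items()
    }
    total = sum(weighted.values())
    if total == 0:
        raise ValueError("no target sites in any gene")
    # Normalize so the probabilities sum to 1 across genes.
    return {gene: w / total for gene, w in weighted.items()}


# Toy input with made-up gene IDs and counts.
example_counts = {
    "GENE_A": {"cat_a": 1, "cat_c": 2},
    "GENE_B": {"cat_b": 1},
}
example_probs = insertion_probabilities(example_counts)
```

Running the same function once on whole-gene counts and once on exonic counts would give the two sets of probabilities that the comparison notebook contrasts, assuming both count tables share the same gene IDs.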