biotools

This repository comprises various standalone scripts for manipulating DNA sequence data. They are summarised here; for details, see each script's own documentation using its help argument.

annotation_distances.py reports the number of bases between each consecutive pair of annotations for each record in a genbank format file
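As an illustration of the kind of calculation described, here is a minimal sketch of computing the gap between consecutive annotations. It is not taken from annotation_distances.py: the function name, the `(name, start, end)` tuple layout, and the 0-based half-open coordinates are all assumptions made for this example, and a negative distance is used here to indicate overlapping annotations.

```python
def consecutive_distances(features):
    """Return (first, second, distance) for each consecutive pair of features.

    `features` is an iterable of (name, start, end) tuples with 0-based,
    half-open coordinates (an assumption of this sketch). A negative
    distance means the two annotations overlap.
    """
    # Sort by position so that "consecutive" means adjacent on the sequence
    ordered = sorted(features, key=lambda f: (f[1], f[2]))
    return [(a[0], b[0], b[1] - a[2]) for a, b in zip(ordered, ordered[1:])]


if __name__ == "__main__":
    example = [("COX1", 0, 100), ("ATP8", 195, 250), ("COX2", 110, 200)]
    for first, second, distance in consecutive_distances(example):
        print(first, second, distance)
```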

translate.py translates nucleotide sequences in a fasta into amino acids
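To show what translation involves, the following is a minimal sketch rather than the code in translate.py. It assumes the standard genetic code (NCBI translation table 1) and reading frame 1; the script itself may support other tables and frames. The names `CODON_TABLE` and `translate` are chosen for this example only.

```python
from itertools import product

# Standard genetic code (NCBI table 1), with codons ordered by T, C, A, G
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq, table=CODON_TABLE):
    """Translate a nucleotide string in frame 1.

    Codons containing ambiguous bases become "X"; any trailing partial
    codon is ignored.
    """
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    codons = (seq[i:i + 3] for i in range(0, usable, 3))
    return "".join(table.get(codon, "X") for codon in codons)


if __name__ == "__main__":
    print(translate("ATGGCCTAA"))
```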

backtranslate.py takes as input aligned amino acid sequences and their corresponding unaligned nucleotide sequences and returns a nucleotide alignment
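The general idea can be sketched as follows: each residue in the aligned protein consumes the next codon of the unaligned nucleotide sequence, and each gap becomes three gap characters. This is an illustrative sketch, not the implementation in backtranslate.py; the function name and the choice to ignore trailing nucleotides (such as a stop codon) are assumptions.

```python
def backtranslate(aligned_protein, nucleotides, gap="-"):
    """Thread an unaligned nucleotide sequence through an aligned protein.

    Assumes the nucleotide sequence starts in frame with the first residue.
    Nucleotides left over after the last residue are ignored.
    """
    n_residues = sum(1 for aa in aligned_protein if aa != gap)
    if len(nucleotides) < 3 * n_residues:
        raise ValueError("nucleotide sequence is too short for the protein")

    aligned, pos = [], 0
    for aa in aligned_protein:
        if aa == gap:
            # A gap in the protein alignment spans one whole codon
            aligned.append(gap * 3)
        else:
            aligned.append(nucleotides[pos:pos + 3])
            pos += 3
    return "".join(aligned)


if __name__ == "__main__":
    print(backtranslate("M-A", "ATGGCC"))
```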

consensus.py outputs the majority rule consensus of an input alignment
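A majority rule consensus can be sketched as below. This is not the code in consensus.py: the function name, the strict greater-than-50% threshold, and the use of "N" for columns without a majority are assumptions made for illustration, and the script may treat gaps and ties differently.

```python
from collections import Counter


def majority_consensus(sequences, threshold=0.5, ambiguous="N"):
    """Return the per-column majority character of equal-length sequences.

    A character is used only if its frequency in the column is strictly
    greater than `threshold`; otherwise `ambiguous` is emitted.
    """
    if not sequences:
        return ""
    if len({len(seq) for seq in sequences}) != 1:
        raise ValueError("all sequences in an alignment must be the same length")

    consensus = []
    for column in zip(*sequences):
        char, count = Counter(column).most_common(1)[0]
        consensus.append(char if count / len(column) > threshold else ambiguous)
    return "".join(consensus)


if __name__ == "__main__":
    print(majority_consensus(["ACGT", "ACGA", "ACTA"]))
```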

extract_genes.pl extracts the nucleotide sequences corresponding to selected annotations from the records of a genbank file

filter_fasta_by_fasta.py filters the sequences in a fasta based on their presence in a second fasta file
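One way to picture this kind of filtering is shown below. It is a sketch under stated assumptions, not the script's own logic: sequences are matched on the first whitespace-delimited word of the header, and the names `parse_fasta`, `seq_id`, `filter_by_reference` and `keep_present` are hypothetical. The script may instead match on full headers or on the sequences themselves.

```python
def parse_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def seq_id(header):
    """Return the first whitespace-delimited word of a header."""
    parts = header.split()
    return parts[0] if parts else ""


def filter_by_reference(records, reference, keep_present=True):
    """Keep records whose ID is (or is not) present in the reference records."""
    reference_ids = {seq_id(header) for header, _ in reference}
    return [
        (header, seq)
        for header, seq in records
        if (seq_id(header) in reference_ids) == keep_present
    ]


if __name__ == "__main__":
    query = list(parse_fasta(">a desc\nACGT\n>b\nGG\nCC\n>c\nTT\n".splitlines()))
    ref = list(parse_fasta(">b\nAAAA\n>c\nCCCC\n".splitlines()))
    print(filter_by_reference(query, ref))
```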

filter_fasta_by_sintax.py filters the sequences in a fasta according to the taxonomy in a sintax-format file

filterfasta.pl filters the sequences in a fasta according to a text file of required headers or a specified search term

fixgb.py corrects genbank file LOCUS lines to work properly in biopython

gb_to_fasta.pl extracts the sequences from a genbank file and outputs them as a fasta file

get_genbanks.py downloads genbank-format files from NCBI GenBank by accession number

get_NCBI_taxonomy.py retrieves taxonomy information from the NCBI Taxonomy database by taxid

MBC_refmatcher.py integrates the results of searching an OTU/ASV file against a local set of references into the OTU/ASV file and a read mapping table

muldemux.py wraps cutadapt over many files and indices

reducealign.py removes redundant columns from an alignment

split_fasta.pl splits a multifasta into individual fastas for each sequence

split_fasta_by_label.py splits a multifasta into separate fastas according to labels in the sequence headers

split_genbank.pl splits a genbank-format file with multiple records into individual files for each record

subset_fasta.py randomly subsets a given proportion or number of sequences from a fasta
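Random subsetting by either a count or a proportion can be sketched as follows. This is illustrative only and does not reproduce subset_fasta.py; the function name, the `number`, `proportion` and `seed` parameters, the rounding of the proportion, and the preservation of input order are all assumptions of this example.

```python
import random


def subset_records(records, number=None, proportion=None, seed=None):
    """Randomly choose records without replacement, keeping input order.

    Exactly one of `number` or `proportion` must be given.
    """
    if (number is None) == (proportion is None):
        raise ValueError("give exactly one of number or proportion")

    records = list(records)
    if proportion is not None:
        if not 0 <= proportion <= 1:
            raise ValueError("proportion must be between 0 and 1")
        number = round(proportion * len(records))
    if not 0 <= number <= len(records):
        raise ValueError("number must be between 0 and the number of records")

    # Sample indices rather than records so the original order can be kept
    rng = random.Random(seed)
    chosen = set(rng.sample(range(len(records)), number))
    return [record for i, record in enumerate(records) if i in chosen]


if __name__ == "__main__":
    records = [(f"seq{i}", "ACGT") for i in range(10)]
    print(subset_records(records, number=3, seed=1))
```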

subset_gb_by_taxonomy.pl extracts sequences from a genbank-format file according to taxonomy

trimalign.py removes sequences from an alignment according to regular expressions, optionally dropping gap-only columns afterwards
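The two steps described, removing sequences by regular expression and then dropping columns that contain only gaps, might look like the sketch below. It is not the code in trimalign.py: it assumes the expressions are matched against sequence names, that the alignment is held as a dict of equal-length strings, and that "-" is the gap character. All names are chosen for this example.

```python
import re


def trim_alignment(alignment, patterns, drop_gap_columns=False, gap="-"):
    """Remove sequences whose name matches any pattern.

    `alignment` maps sequence names to equal-length aligned strings.
    If `drop_gap_columns` is true, columns that are entirely gaps in the
    remaining sequences are removed afterwards.
    """
    regexes = [re.compile(pattern) for pattern in patterns]
    kept = {
        name: seq
        for name, seq in alignment.items()
        if not any(regex.search(name) for regex in regexes)
    }

    if drop_gap_columns and kept:
        length = len(next(iter(kept.values())))
        # Keep a column if at least one remaining sequence has a non-gap there
        keep_cols = [
            i for i in range(length) if any(seq[i] != gap for seq in kept.values())
        ]
        kept = {name: "".join(seq[i] for i in keep_cols) for name, seq in kept.items()}
    return kept


if __name__ == "__main__":
    aln = {"sp1": "A-C-", "sp2": "A-G-", "outgroup_x": "ATGC"}
    print(trim_alignment(aln, [r"^outgroup"], drop_gap_columns=True))
```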

Deprecated scripts

isPCR.py performs a very basic attempt at in silico PCR - cutadapt is better than this script

rename_newick_with_classifiers.pl renames a phylogeny using a table - phylabel.R in the [phylostuff](https://github.com/tjcreedy/phylostuff) repository is better than this

rename_newick_with_gb.pl renames a phylogeny using metadata parsed from a genbank file
