Note that the threshold parameter sets the minimum number of base pairs a fragmented sequence must have in order to be extracted.
It is not used when fragmented = FALSE.
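To make the role of threshold concrete, here is a minimal base R sketch of a length filter of this kind. It is not TOAST's internal code: the vector frag_seqs and the helper keep_long_enough are hypothetical names used only for illustration, and the inclusive comparison (>=) is an assumption based on the word "minimum".

```r
# Hypothetical fragmented sequences keyed by sample name (illustration only)
frag_seqs <- c(
  sample_A = paste(rep("ACGT", 100), collapse = ""),  # 400 bp
  sample_B = paste(rep("ACGT", 50), collapse = "")    # 200 bp
)

# Hypothetical helper: keep sequences with at least `threshold` base pairs
keep_long_enough <- function(seqs, threshold = 300) {
  seqs[nchar(seqs) >= threshold]
}

kept <- keep_long_enough(frag_seqs, threshold = 300)
names(kept)  # only sample_A passes the 300 bp cutoff
```

With a threshold of 300, the 200 bp fragment is dropped and only the 400 bp fragment would be written out.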
Running the above will generate a directory of FASTA files, one for each BUSCO.
What if you want to add another sample?
Fear not: simply repeat the above with the same directory!
The files will be appended to include any additional samples. In this example, say we forgot three outgroup taxa.
# Locations of the BUSCO full_table.tsv files, one per additional sample
tsvLocations <- c("~/Documents/Branchiops/Busco_results/Antarctodrilus_proboscidea_TRI_1_15_NORM.fasta/run_arthropoda_odb10/full_table.tsv",
                  "~/Documents/Branchiops/Busco_results/Haenopis_sanguisuga.fasta/run_arthropoda_odb10/full_table.tsv",
                  "~/Documents/Branchiops/Busco_results/Theromyzon_tessulatum.fasta/run_arthropoda_odb10/full_table.tsv")
# Locations of the matching FASTA files, in the same order as the tsv files
fastaLocations <- c("~/Documents/Branchiops/Transcriptomes/Real_Data/Branch_fastas-1/Antarctodrilus_proboscidea_TRI_1_15_NORM.fasta",
                    "~/Documents/Branchiops/Transcriptomes/Real_Data/Branch_fastas-1/Haenopis_sanguisuga.fasta",
                    "~/Documents/Branchiops/Transcriptomes/Real_Data/Branch_fastas-1/Theromyzon_tessulatum.fasta")
# The additional "Genus_species" names, in the same order
SampleIDs <- c("Antarctodrilus_proboscidea", "Haenopis_sanguisuga", "Theromyzon_tessulatum")
As you can see, this part is the same. Now just run the extractBuscos function again using the same directory:
extractBuscos(tsvLocations, fastaLocations, ed, SampleIDs,complete=TRUE, threshold=300)
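One way to confirm that the new samples were appended is to count the FASTA records in each file before and after the second run. The sketch below uses base R only; count_fasta_records is a hypothetical helper, not a TOAST function, and the file-extension pattern is an assumption about how the extracted files are named. The demo builds a small temporary directory so the block runs on its own.

```r
# Hypothetical helper: count FASTA header lines (those starting with ">") per file
count_fasta_records <- function(dir, pattern = "\\.(fa|fasta|fna)$") {
  files <- list.files(dir, pattern = pattern, full.names = TRUE)
  counts <- vapply(files,
                   function(f) sum(startsWith(readLines(f), ">")),
                   integer(1))
  names(counts) <- basename(files)
  counts
}

# Self-contained demo on a temporary directory with two made-up BUSCO files
demo_dir <- file.path(tempdir(), "busco_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c(">Sample_one", "ACGT", ">Sample_two", "ACGA"),
           file.path(demo_dir, "busco_1.fasta"))
writeLines(c(">Sample_one", "TTGA"),
           file.path(demo_dir, "busco_2.fasta"))

count_fasta_records(demo_dir)
```

For a real run, pass the extraction directory instead of demo_dir (assuming ed in the call above holds that path). After adding three samples, each per-BUSCO count should rise by at most three, depending on which BUSCOs were recovered in the new taxa.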
4. Alignment
TOAST can call MAFFT to quickly align all of these BUSCO sequences (or any folder full of FASTA files!).
All you need to do is point TOAST to the directory with the unaligned FASTA files and to a new directory you would like to write the alignments into.
# Note: MAFFT is multithreaded, so you can speed things up by changing the thread count to suit your machine
MafftOrientAlign(extract_dir = "~/Documents/Branchiops/Transcriptomes/Extracted_Buscos", mafft_dir = "~/Documents/Branchiops/Transcriptomes/Mafft_aligned", threads = 12)
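Before launching a long alignment run, it can help to check that MAFFT is installed and to pick a thread count that fits the machine. The following is a minimal sketch using only base R and the bundled parallel package. It assumes the mafft executable is expected on the system PATH, which this section does not state, and the variable names mafft_path, n_cores, and threads are illustrative.

```r
# Sys.which() returns an empty string when the executable is not on the PATH
mafft_path <- Sys.which("mafft")
if (!nzchar(mafft_path)) {
  message("mafft was not found on the PATH; install it or adjust PATH before aligning.")
}

# detectCores() can return NA on some platforms, so fall back to a single thread
n_cores <- parallel::detectCores()
threads <- if (is.na(n_cores)) 1L else max(1L, n_cores - 1L)
threads
```

Leaving one core free keeps the machine responsive during alignment; the resulting value can be supplied as the threads argument shown above.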
That's it! Now you are ready to concatenate some data or start filtering based on missing data patterns or gene trees!