Whole Genome Resequencing Pipeline v2.0
New in v2.0: the SAMtools and Picard dependencies have been replaced by built-in MATLAB capabilities. The pipeline now requires only BWA and GATK.
Automates processing of single-end whole genome resequencing (WGRS) data: pre-installed dependencies map reads from FASTQ files to a reference and realign indels. BWA must be installed and available on the system path, and GenomeAnalysisTK.jar must be available on the MATLAB path. If no arguments are provided, the user is prompted for one or more FASTQ files of reads and a reference FASTA. Developers are encouraged to adapt this template to their needs. The pipeline steps are:
(0a) FM-index reference (BWA index)
(0b) Create FASTA index (Internal fai)
(0c) Create sequence dictionary (Internal dict)
(1) Map reads (BWA mem)
(2) Convert SAM to BAM (MATLAB sam2bam)
(3) Sort BAM (MATLAB bamsort)
(4) Index BAM (MATLAB BioMap)
(5) Discover indels (GATK RealignerTargetCreator)
(6) Realign indels (GATK IndelRealigner)
(7) Cleanup
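For readers adapting the template, the external tool calls behind steps 0a, 1, 5 and 6 can be sketched as plain command lines. The Python sketch below only builds and prints those commands; it is not the pipeline's MATLAB code. The function name build_commands, the output file naming scheme, the sample name and the read-group tags are all assumptions made for illustration. The GATK arguments follow the GATK 3.x command-line style, which is where RealignerTargetCreator and IndelRealigner are available.

```python
import shlex
from pathlib import Path


def build_commands(fastq, reference, gatk_jar="GenomeAnalysisTK.jar", sample="sample1"):
    """Return (step, argv, stdout_file) tuples for the external tool calls.

    Steps 0b, 0c and 2-4 run inside MATLAB in the pipeline itself,
    so they are not represented here.
    """
    # Hypothetical naming scheme: "reads.fastq.gz" -> "reads"
    stem = Path(fastq).name.split(".")[0]
    sam = stem + ".sam"
    sorted_bam = stem + ".sorted.bam"      # assumed result of steps 2-4
    intervals = stem + ".intervals"
    realigned = stem + ".realigned.bam"

    # GATK expects read groups in its input; these tag values are assumptions.
    # The backslash-t sequences are left literal for bwa to expand.
    read_group = "@RG\\tID:{0}\\tSM:{0}\\tPL:ILLUMINA".format(sample)

    gatk = ["java", "-jar", gatk_jar]
    return [
        ("0a", ["bwa", "index", reference], None),
        # bwa mem writes SAM to stdout, so the target file is tracked separately.
        ("1", ["bwa", "mem", "-R", read_group, reference, fastq], sam),
        ("5", gatk + ["-T", "RealignerTargetCreator", "-R", reference,
                      "-I", sorted_bam, "-o", intervals], None),
        ("6", gatk + ["-T", "IndelRealigner", "-R", reference,
                      "-I", sorted_bam, "-targetIntervals", intervals,
                      "-o", realigned], None),
    ]


if __name__ == "__main__":
    for step, argv, stdout_file in build_commands("reads.fastq", "ref.fasta"):
        redirect = " > " + shlex.quote(stdout_file) if stdout_file else ""
        print("({}) {}{}".format(step, shlex.join(argv), redirect))
```

Keeping each command as an argument list rather than a single string avoids quoting problems with the read-group value when the commands are handed to a process launcher.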
Cite As
Turner Conrad (2024). Whole Genome Resequencing Pipeline v2.0 (https://www.mathworks.com/matlabcentral/fileexchange/46078-whole-genome-resequencing-pipeline-v2-0), MATLAB Central File Exchange. Retrieved .
Platform Compatibility
Windows macOS Linux
Categories
- Industries > Biotech and Pharmaceutical > Genomics and Next Generation Sequencing
- Sciences > Physics > Biological Physics