If the total quality of the A’s (40 + 42 + 50) is more than 60% of the column total (40 + 42 + 50 + 16 + 24), then an A will be called for the consensus. This consensus A will have a quality of (40 + 42 + 50) − (16 + 24) = 92 if using Total or 50 if using Highest.

A more complicated example for Highest Quality consensus calling using Total: Assume a column contains 2 A’s with qualities of 30 and 25, 1 G with quality 30 and 1 T with quality 15. Because the total quality of the A’s is 55 out of 100 for the column, it does not exceed the 60% threshold needed to call an A. With the G included, the total quality is 30 + 25 + 30 = 85, which is higher than the 60% threshold, so a consensus call of R will be made. The quality assigned to this R will be the sum of the qualities of the bases that agree with the consensus call minus those of the bases that disagree, which is 30 + 25 + 30 − 15 = 70.

When reads have mapping qualities (confidence that the entire read is mapped to the correct location), the mapping quality is combined with the base call quality to form the quality used during consensus calling. The log scale qualities are combined as probabilities, so a very rough rule of thumb is that the combined quality will be approximately equal to the minimum of the mapping quality and the base call quality, except when the two values are very close, in which case the combined quality will be slightly smaller. Gaps are assigned a quality score equal to half the minimum base call quality on either side of the gap.

For example, suppose a column has a C with quality 41 and mapping quality 3, a gap with adjacent base calls of quality 41 and 30 with mapping quality 240, and a C with quality 41 and mapping quality 20. The two C’s will have combined quality scores of approximately 3 and 20 respectively. The gap will have an effective combined quality score of 30 ∕ 2 = 15. So the consensus call will be a C with quality 20 + 3 − 15 = 8.

For alignments or contigs with a reference sequence, the If no coverage call setting can be used to control what character the consensus sequence should use when the reference sequence has no coverage. A ‘?’ represents an unknown character, potentially a gap. If Ref is selected, then the consensus is assigned whatever character the reference sequence has at that position. Note that if any sequence in the alignment/contig has an internal gap in it, that is still considered valid coverage at that position, and this setting will not apply.

Choose Call N if Quality below to change consensus bases to N’s if the quality is below the threshold that you set. This is particularly useful for exporting sequences to file formats which do not preserve quality (for example FASTA).

To work with the consensus sequence in a downstream analysis, you must first Extract it from your alignment. To do this, click on Consensus to select the entire sequence, then click Extract to extract it to a new sequence document. Alternatively, go to Tools → Generate Consensus Sequence.
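The arithmetic in the worked examples can be reproduced with a short Python sketch. This is only an illustration inferred from the examples in the text, not the software's actual implementation: the function names `combine_quality` and `call_consensus` are hypothetical, the independent-error formula used to combine base call and mapping quality is an assumption consistent with the "combined as probabilities" description, and gaps are not handled.

```python
import math

# IUPAC nucleotide ambiguity codes, keyed by the set of bases they stand for.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M", frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V", frozenset("ACGT"): "N",
}


def combine_quality(base_q, mapping_q):
    """Combine two Phred-scaled qualities, treating them as independent
    error probabilities (an assumed formula, not taken from the source)."""
    p_base = 10 ** (-base_q / 10)
    p_map = 10 ** (-mapping_q / 10)
    p_err = p_base + p_map - p_base * p_map  # probability that either is wrong
    return -10 * math.log10(p_err)


def call_consensus(column, threshold=0.6):
    """column: list of (base, quality) pairs over A/C/G/T only (no gaps).
    Returns (consensus_call, quality) using summed ("Total") qualities."""
    totals = {}
    for base, q in column:
        totals[base] = totals.get(base, 0) + q
    column_total = sum(totals.values())

    chosen, agree = set(), 0
    # Add bases from highest to lowest total quality until the
    # accumulated quality exceeds the threshold share of the column.
    for base, total in sorted(totals.items(), key=lambda kv: -kv[1]):
        chosen.add(base)
        agree += total
        if agree > threshold * column_total:
            break

    # Quality = agreeing total minus disagreeing total.
    return IUPAC[frozenset(chosen)], agree - (column_total - agree)


# Second worked example: two A's (30, 25), one G (30), one T (15).
print(call_consensus([("A", 30), ("A", 25), ("G", 30), ("T", 15)]))  # ('R', 70)

# Base call quality 41 combined with mapping qualities 3 and 20.
print(round(combine_quality(41, 3)), round(combine_quality(41, 20)))  # 3 20
```

Ties between bases with equal total quality are broken arbitrarily here; how the real software resolves them is not stated in the text.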