peposeb
Hi,

I have analysed the quality of the provided BAM file using samtools coverage, and these are the results for the 22 autosomes, X, Y and MT:
+--------------------------------------------------------------------------------------+
|#rname|startpos|endpos    |numreads  |covbases  |coverage|meandepth|meanbaseq|meanmapq|
+--------------------------------------------------------------------------------------+
|     1|       1| 249250621|  39698878| 225028132|   90.28|    23.13|    35.80|   57.20|
|     2|       1| 243199373|  40703396| 238071939|   97.89|    24.23|    35.80|   57.60|
|     3|       1| 198022430|  32635857| 194720328|   98.33|    23.93|    35.80|   58.90|
|     4|       1| 191154276|  30715632| 187478725|   98.08|    23.29|    35.80|   58.50|
|     5|       1| 180915260|  29762289| 177582884|   98.16|    23.87|    35.80|   57.90|
|     6|       1| 171115067|  28090145| 167283988|   97.76|    23.81|    35.80|   58.70|
|     7|       1| 159138663|  26390568| 155178474|   97.51|    24.04|    35.80|   57.40|
|     8|       1| 146364022|  24329314| 142751964|   97.53|    24.12|    35.80|   57.80|
|     9|       1| 141213431|  20507162| 119760807|   84.81|    21.09|    35.80|   55.00|
|    10|       1| 135534747|  22575454| 131199951|   96.80|    24.17|    35.80|   58.00|
|    11|       1| 135006516|  22757704| 131077971|   97.09|    24.49|    35.80|   58.70|
|    12|       1| 133851895|  22257726| 130324598|   97.36|    24.12|    35.80|   58.80|
|    13|       1| 115169878|  15686601|  95538261|   82.95|    19.73|    35.80|   58.70|
|    14|       1| 107349540|  15191368|  88234896|   82.19|    20.54|    35.80|   58.30|
|    15|       1| 102531392|  14412175|  81644554|   79.63|    20.42|    35.80|   56.30|
|    16|       1|  90354753|  15226381|  78840853|   87.26|    24.46|    35.80|   54.50|
|    17|       1|  81195210|  14535702|  77744772|   95.75|    25.98|    35.80|   57.40|
|    18|       1|  78077248|  12522267|  74630980|   95.59|    23.27|    35.80|   58.70|
|    19|       1|  59128983|  10793535|  55766402|   94.31|    26.46|    35.70|   58.00|
|    20|       1|  63025520|  10808057|  59486621|   94.39|    24.90|    35.80|   58.90|
|    21|       1|  48129895|   6266763|  35074202|   72.87|    18.87|    35.80|   56.80|
|    22|       1|  51304566|   6798499|  34865192|   67.96|    19.26|    35.80|   56.20|
|     X|       1| 155270560|  13114389| 150884891|   97.18|    12.14|    35.80|   55.90|
|     Y|       1|  59373566|   2448518|  22964629|   38.68|     5.95|    35.80|   31.20|
|    MT|       1|     16569|     68919|     16569|  100.00|   607.71|    35.70|   55.20|
+--------------------------------------------------------------------------------------+

I would like to mention the following issues:
  • Low coverage of some chromosomes (as low as 39% for chromosome Y)
  • Mean depth slightly lower than expected (30x)
It seems that the provided fastq files are already filtered and trimmed (I expected otherwise), and these results might be a consequence of that process.

Are these normal results for the 30x WGS product? Can I have the raw (untrimmed) fastq files?
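
To put a single number on "slightly lower than expected", here is a minimal Python sketch that takes the endpos and meandepth columns for the 22 autosomes from the table above and computes a length-weighted mean depth. The names AUTOSOMES and weighted_mean_depth are made up for this illustration, and comparing against a nominal 30x is only an assumption about how the product is specified.

```python
# (endpos, meandepth) per autosome, copied from the samtools coverage table above.
AUTOSOMES = {
    "1": (249250621, 23.13), "2": (243199373, 24.23), "3": (198022430, 23.93),
    "4": (191154276, 23.29), "5": (180915260, 23.87), "6": (171115067, 23.81),
    "7": (159138663, 24.04), "8": (146364022, 24.12), "9": (141213431, 21.09),
    "10": (135534747, 24.17), "11": (135006516, 24.49), "12": (133851895, 24.12),
    "13": (115169878, 19.73), "14": (107349540, 20.54), "15": (102531392, 20.42),
    "16": (90354753, 24.46), "17": (81195210, 25.98), "18": (78077248, 23.27),
    "19": (59128983, 26.46), "20": (63025520, 24.90), "21": (48129895, 18.87),
    "22": (51304566, 19.26),
}


def weighted_mean_depth(rows):
    """Mean depth across sequences, weighting each one by its length."""
    total_length = sum(length for length, _ in rows.values())
    total_depth = sum(length * depth for length, depth in rows.values())
    return total_depth / total_length


autosomal_depth = weighted_mean_depth(AUTOSOMES)
print(f"length-weighted autosomal mean depth: {autosomal_depth:.2f}x")
print(f"fraction of a nominal 30x: {autosomal_depth / 30:.0%}")
```

Weighting by length matters here because the short chromosomes (21, 22) have the lowest depth and would otherwise pull a plain average down more than they should.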
brett_lowry

03 May 2020,

Try sending an e-mail to contact@dantelabs.com. They are sometimes slow (a week or two) to respond, but they "mostly" do respond.

There's also their customer care Facebook page: https://www.facebook.com/groups/558973438185946/

- Brett

Randy H

Like samtools idxstats and similar tools, samtools coverage computes coverage against the declared length of each sequence in the reference model. But nearly 50% of the Y chromosome is blanked out in alignment reference models, either because of the PAR regions shared with X or because of highly variable, repetitive regions of seemingly no use that the aligner cannot map to. Some other chromosomes have blanked-out (N) areas as well, but nowhere near as much as Y.

Generally, for a male, the X and Y depth of coverage should be about the same, at roughly 50% of the autosomal depth. This is because there is only one copy of each of these chromosomes instead of two, so statistically they get sequenced half as much (randomly). Sometimes we have seen Y drop to 50% of that, as appears to be the case here. It may be that Y at 25% of the autosomal depth is the norm, and that samples with higher values are showing enhanced Y coverage for as yet unexplained reasons.
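
As a rough check of those ratios against the table in the first post, here is a small Python sketch. It assumes that the meandepth column is averaged over the full declared length (endpos), blanked-out positions included, so it also rescales depth to the bases that actually received reads (covbases). ROWS, depth_over_covered and the other names are invented for this example.

```python
# name: (endpos, covbases, meandepth), copied from the table in the first post.
ROWS = {
    "1": (249250621, 225028132, 23.13), "2": (243199373, 238071939, 24.23),
    "3": (198022430, 194720328, 23.93), "4": (191154276, 187478725, 23.29),
    "5": (180915260, 177582884, 23.87), "6": (171115067, 167283988, 23.81),
    "7": (159138663, 155178474, 24.04), "8": (146364022, 142751964, 24.12),
    "9": (141213431, 119760807, 21.09), "10": (135534747, 131199951, 24.17),
    "11": (135006516, 131077971, 24.49), "12": (133851895, 130324598, 24.12),
    "13": (115169878, 95538261, 19.73), "14": (107349540, 88234896, 20.54),
    "15": (102531392, 81644554, 20.42), "16": (90354753, 78840853, 24.46),
    "17": (81195210, 77744772, 25.98), "18": (78077248, 74630980, 23.27),
    "19": (59128983, 55766402, 26.46), "20": (63025520, 59486621, 24.90),
    "21": (48129895, 35074202, 18.87), "22": (51304566, 34865192, 19.26),
    "X": (155270560, 150884891, 12.14), "Y": (59373566, 22964629, 5.95),
}


def depth_over_covered(length, covbases, meandepth):
    """Rescale a depth averaged over `length` to the covered bases only."""
    return meandepth * length / covbases


# Length-weighted mean depth of the autosomes (every row except X and Y).
autosomes = [row for name, row in ROWS.items() if name not in ("X", "Y")]
autosomal_depth = (
    sum(length * depth for length, _, depth in autosomes)
    / sum(length for length, _, _ in autosomes)
)

x_ratio = ROWS["X"][2] / autosomal_depth
y_ratio = ROWS["Y"][2] / autosomal_depth
y_covered_depth = depth_over_covered(*ROWS["Y"])

print(f"X / autosomes: {x_ratio:.2f}")
print(f"Y / autosomes: {y_ratio:.2f}")
print(f"Y depth over covered bases only: {y_covered_depth:.2f}x")
```

The rescaled Y figure should be read with care: the Y row also has a much lower meanmapq than the others, so part of what is counted there may be ambiguously placed reads.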

qual.iobio.io is a web-based tool that uses samtools idxstats and follows the methods described in http://bit.ly/304ciw0. You can use that for comparison. It takes seconds to run, as idxstats grabs the data already summarized in the BAM index file (.bai).
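
If you would rather do the idxstats arithmetic yourself, here is a hedged Python sketch of the idea, not of what qual.iobio.io actually runs. samtools idxstats prints one tab-separated line per sequence (name, length, mapped reads, unmapped reads), so depth can be approximated as mapped reads times read length divided by sequence length. The function name, the 150 bp read length and the sample input are all assumptions: the read counts are borrowed from the numreads column of the table in the first post, and the unmapped column is a placeholder.

```python
def depth_from_idxstats(idxstats_text, read_length):
    """Estimate per-sequence depth from `samtools idxstats` output text.

    read_length is an assumed average read length in bases; trimmed or
    soft-clipped reads make this estimate too high.
    """
    depths = {}
    for line in idxstats_text.splitlines():
        if not line.strip():
            continue
        name, length, mapped, _unmapped = line.split("\t")
        length = int(length)
        if length == 0:
            # The final "*" row holds unplaced reads and has no length.
            continue
        depths[name] = int(mapped) * read_length / length
    return depths


# Placeholder input in idxstats layout; real output from your BAM will differ.
SAMPLE = (
    "1\t249250621\t39698878\t0\n"
    "Y\t59373566\t2448518\t0\n"
    "*\t0\t0\t0\n"
)

depths = depth_from_idxstats(SAMPLE, read_length=150)
for name, depth in depths.items():
    print(f"{name}: about {depth:.1f}x")
```

For a real file, the text would come from running `samtools idxstats` on your indexed BAM and passing its output to the function.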
