
What does the heterozygosity computed by --site-pi in vcftools actually mean?

2022-07-14 10:33:38


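--site-pi: per-site nucleotide diversity (π), which for a biallelic site is the expected heterozygosity. (Compute the allele frequencies p and q; under Hardy-Weinberg equilibrium the probability of a heterozygote is 2pq. The vcftools values below are consistent with a finite-sample correction, PI = 2pq × n/(n−1) for n sampled chromosomes, so they run slightly above plink's E(HET).)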
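The relationship between 2pq and the PI column can be checked by hand. The sketch below is an illustration, not vcftools' source code: it assumes a biallelic site with no missing genotypes and treats π as the fraction of distinct chromosome pairs that carry different alleles. The function names are made up for this example; the counts 9/105/293 are the GENO field of oar3_OAR1_17218 in the plink output of section 001.

```python
from math import comb

def allele_counts(hom_a1, het, hom_a2):
    """Turn diploid genotype counts (A1A1, A1A2, A2A2) into allele counts."""
    return 2 * hom_a1 + het, 2 * hom_a2 + het

def expected_het(n1, n2):
    """Hardy-Weinberg expected heterozygosity, 2pq."""
    n = n1 + n2
    return 2 * (n1 / n) * (n2 / n)

def pairwise_pi(n1, n2):
    """Fraction of distinct chromosome pairs that differ at the site."""
    return n1 * n2 / comb(n1 + n2, 2)

n1, n2 = allele_counts(9, 105, 293)   # 123 and 691 alleles, n = 814
n = n1 + n2
two_pq = expected_het(n1, n2)
pi = pairwise_pi(n1, n2)

print(f"2pq           = {two_pq:.4f}")   # compare with E(HET) for this SNP
print(f"pi            = {pi:.6f}")       # compare with PI at position 17218
print(f"2pq * n/(n-1) = {two_pq * n / (n - 1):.6f}")
```

Counting pairs and applying the n/(n−1) factor to 2pq are the same calculation, which is why the two tools agree to about three decimal places with 407 samples and would diverge more in a small sample.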

 

001. Computing the expected heterozygosity of each site with plink

root@DESKTOP-1N42TVH:/home/test3# ls
result.map  result.ped
root@DESKTOP-1N42TVH:/home/test3# plink --file result --hardy
PLINK v1.90b6.26 64-bit (2 Apr 2022)           www.cog-genomics.org/plink/1.9/
(C) 2005-2022 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to plink.log.
Options in effect:
  --file result
  --hardy

16007 MB RAM detected; reserving 8003 MB for main workspace.
.ped scan complete (for binary autoconversion).
Performing single-pass .bed write (442957 variants, 407 people).
--file: plink-temporary.bed + plink-temporary.bim + plink-temporary.fam
written.
442957 variants loaded from .bim file.
407 people (0 males, 0 females, 407 ambiguous) loaded from .fam.
Ambiguous sex IDs written to plink.nosex .
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 407 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is exactly 1.
--hardy: Writing Hardy-Weinberg report (founders only) to plink.hwe ... done.
root@DESKTOP-1N42TVH:/home/test3# ls
plink.hwe  plink.log  plink.nosex  result.map  result.ped
root@DESKTOP-1N42TVH:/home/test3# head plink.hwe                                      ## expected heterozygosity is the E(HET) column
 CHR                           SNP     TEST   A1   A2                 GENO   O(HET)   E(HET)            P
   1               oar3_OAR1_17218  ALL(NP)    G    A            9/105/293    0.258   0.2565            1
   1               oar3_OAR1_20658  ALL(NP)    C    A           25/123/259   0.3022   0.3347      0.05401
   1               oar3_OAR1_28296  ALL(NP)    A    G           17/140/250    0.344   0.3361       0.7679
   1               oar3_OAR1_31152  ALL(NP)    G    A          103/185/119   0.4545   0.4992      0.07405
   1               oar3_OAR1_38175  ALL(NP)    A    G           14/119/274   0.2924    0.296       0.8667
   1               oar3_OAR1_38264  ALL(NP)    A    G           39/191/177   0.4693   0.4425       0.2626
   1                      s64199.1  ALL(NP)    A    G            6/101/300   0.2482   0.2391       0.5385
   1               oar3_OAR1_52919  ALL(NP)    G    A           98/198/111   0.4865   0.4995         0.62
   1               oar3_OAR1_55363  ALL(NP)    A    G           70/166/171   0.4079   0.4692      0.00837
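The O(HET) and E(HET) columns can be recomputed from the GENO column alone, which is a quick way to confirm what plink reports. This is a minimal sketch, assuming the nine whitespace-separated columns shown above; the helper names are hypothetical, and the sample rows are copied from the head of plink.hwe rather than read from disk.

```python
SAMPLE_HWE = """\
 CHR                           SNP     TEST   A1   A2                 GENO   O(HET)   E(HET)            P
   1               oar3_OAR1_17218  ALL(NP)    G    A            9/105/293    0.258   0.2565            1
   1               oar3_OAR1_20658  ALL(NP)    C    A           25/123/259   0.3022   0.3347      0.05401
   1               oar3_OAR1_28296  ALL(NP)    A    G           17/140/250    0.344   0.3361       0.7679
"""

def recompute_hwe_row(line):
    """Recompute observed and expected heterozygosity from one .hwe data line."""
    chrom, snp, test, a1, a2, geno, o_het, e_het, p_value = line.split()
    hom_a1, het, hom_a2 = (int(count) for count in geno.split("/"))
    n_samples = hom_a1 + het + hom_a2
    p = (2 * hom_a1 + het) / (2 * n_samples)   # frequency of allele A1
    return {
        "snp": snp,
        "o_het": het / n_samples,              # observed heterozygote fraction
        "e_het": 2 * p * (1 - p),              # 2pq under Hardy-Weinberg
        "o_het_reported": float(o_het),
        "e_het_reported": float(e_het),
    }

def recompute_hwe(text):
    """Apply recompute_hwe_row to every data line, skipping the header."""
    rows = []
    for line in text.splitlines():
        if not line.strip() or line.split()[0] == "CHR":
            continue
        rows.append(recompute_hwe_row(line))
    return rows

rows = recompute_hwe(SAMPLE_HWE)
for row in rows:
    print(f'{row["snp"]}: O(HET)={row["o_het"]:.4f} E(HET)={row["e_het"]:.4f}')
```

To run it on the real file, pass `open("plink.hwe").read()` instead of the embedded sample.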

 

002. Computing heterozygosity with --site-pi in vcftools

root@DESKTOP-1N42TVH:/home/test3# ls
plink.hwe  plink.log  plink.nosex  result.map  result.ped
root@DESKTOP-1N42TVH:/home/test3# plink --file result --recode vcf-iid --out result  ## convert to VCF format
PLINK v1.90b6.26 64-bit (2 Apr 2022)           www.cog-genomics.org/plink/1.9/
(C) 2005-2022 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to result.log.
Options in effect:
  --file result
  --out result
  --recode vcf-iid

16007 MB RAM detected; reserving 8003 MB for main workspace.
.ped scan complete (for binary autoconversion).
Performing single-pass .bed write (442957 variants, 407 people).
--file: result-temporary.bed + result-temporary.bim + result-temporary.fam
written.
442957 variants loaded from .bim file.
407 people (0 males, 0 females, 407 ambiguous) loaded from .fam.
Ambiguous sex IDs written to result.nosex .
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 407 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is exactly 1.
442957 variants and 407 people pass filters and QC.
Note: No phenotypes present.
--recode vcf-iid to result.vcf ... done.
root@DESKTOP-1N42TVH:/home/test3# ls                       ## conversion output
plink.hwe  plink.log  plink.nosex  result.log  result.map  result.nosex  result.ped  result.vcf
root@DESKTOP-1N42TVH:/home/test3# vcftools --vcf result.vcf --site-pi --out vcf_pi  ## compute per-site pi

VCFtools - 0.1.16
(C) Adam Auton and Anthony Marcketta 2009

Parameters as interpreted:
        --vcf result.vcf
        --out vcf_pi
        --site-pi

Warning: Expected at least 2 parts in INFO entry: ID=PR,Number=0,Type=Flag,Description="Provisional reference allele, may not be based on real reference genome">
After filtering, kept 407 out of 407 Individuals
Outputting Per-Site Nucleotide Diversity Statistics...
After filtering, kept 442957 out of a possible 442957 Sites
Run Time = 12.00 seconds
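For readers who want to see what a per-site π calculation looks like on VCF genotypes, here is a rough sketch. It is not vcftools' implementation: it assumes tab-separated records with a GT field, and it simply skips missing alleles (`.`), which is an assumption about how missing data should be handled. The two VCF records are made-up toy data, not lines from result.vcf.

```python
def site_pi_from_vcf_line(line):
    """Return (chrom, pos, pi) for one VCF data line, using only the GT field."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos = fields[0], int(fields[1])
    gt_index = fields[8].split(":").index("GT")

    # Count every called allele across all samples.
    counts = {}
    for sample in fields[9:]:
        genotype = sample.split(":")[gt_index]
        for allele in genotype.replace("|", "/").split("/"):
            if allele != ".":                      # assumption: drop missing alleles
                counts[allele] = counts.get(allele, 0) + 1

    n = sum(counts.values())
    if n < 2:
        return chrom, pos, float("nan")

    # pi = pairs of chromosomes that differ / all pairs of chromosomes
    total_pairs = n * (n - 1) // 2
    identical_pairs = sum(c * (c - 1) // 2 for c in counts.values())
    return chrom, pos, (total_pairs - identical_pairs) / total_pairs

# Toy records (hypothetical): four diploid samples each.
TOY_LINES = [
    "1\t100\trs1\tA\tG\t.\t.\t.\tGT\t0/0\t0/1\t1/1\t0/1",
    "1\t200\trs2\tC\tT\t.\t.\t.\tGT\t0/0\t0/0\t./.\t0/1",
]

results = [site_pi_from_vcf_line(line) for line in TOY_LINES]
for chrom, pos, pi in results:
    print(chrom, pos, pi)
```

In the first toy record the alleles split 4 and 4 among 8 chromosomes, so 16 of the 28 pairs differ; in the second, one sample is missing and only 6 chromosomes are counted.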

 

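Comparison and verification (the two files list the sites in the same order; position 52854 is SNP s64199.1):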

root@DESKTOP-1N42TVH:/home/test3# ls
plink.hwe  plink.log  plink.nosex  result.log  result.map  result.nosex  result.ped  result.vcf  vcf_pi.log  vcf_pi.sites.pi
root@DESKTOP-1N42TVH:/home/test3# head vcf_pi.sites.pi
CHROM   POS     PI
1       17218   0.256861
1       20658   0.335135
1       28296   0.336546
1       31152   0.499841
1       38175   0.296318
1       38264   0.443061
1       52854   0.239393
1       52919   0.500104
1       55363   0.469786
root@DESKTOP-1N42TVH:/home/test3# head plink.hwe
 CHR                           SNP     TEST   A1   A2                 GENO   O(HET)   E(HET)            P
   1               oar3_OAR1_17218  ALL(NP)    G    A            9/105/293    0.258   0.2565            1
   1               oar3_OAR1_20658  ALL(NP)    C    A           25/123/259   0.3022   0.3347      0.05401
   1               oar3_OAR1_28296  ALL(NP)    A    G           17/140/250    0.344   0.3361       0.7679
   1               oar3_OAR1_31152  ALL(NP)    G    A          103/185/119   0.4545   0.4992      0.07405
   1               oar3_OAR1_38175  ALL(NP)    A    G           14/119/274   0.2924    0.296       0.8667
   1               oar3_OAR1_38264  ALL(NP)    A    G           39/191/177   0.4693   0.4425       0.2626
   1                      s64199.1  ALL(NP)    A    G            6/101/300   0.2482   0.2391       0.5385
   1               oar3_OAR1_52919  ALL(NP)    G    A           98/198/111   0.4865   0.4995         0.62
   1               oar3_OAR1_55363  ALL(NP)    A    G           70/166/171   0.4079   0.4692      0.00837
root@DESKTOP-1N42TVH:/home/test3# tail vcf_pi.sites.pi
20      51060804        0.499742
20      51081995        0.438815
20      51083803        0.454147
20      51104501        0.50018
20      51110559        0.50037
20      51114511        0.498073
20      51119355        0.42517
20      51137793        0.437948
20      51138395        0.454147
20      51139507        0.498073
root@DESKTOP-1N42TVH:/home/test3# tail plink.hwe
  20           oar3_OAR20_51060804  ALL(NP)    G    A           96/198/113   0.4865   0.4991         0.62
  20           oar3_OAR20_51081995  ALL(NP)    A    G           39/186/182    0.457   0.4383       0.4296
  20           oar3_OAR20_51083803  ALL(NP)    A    G           46/191/170   0.4693   0.4536       0.5139
  20           oar3_OAR20_51104501  ALL(NP)    G    A          103/189/115   0.4644   0.4996       0.1648
  20           oar3_OAR20_51110559  ALL(NP)    G    A          101/196/110   0.4816   0.4998       0.4875
  20           oar3_OAR20_51114511  ALL(NP)    A    G           92/194/121   0.4767   0.4975       0.4253
  20           oar3_OAR20_51119355  ALL(NP)    A    G           34/181/192   0.4447   0.4246       0.4135
  20           oar3_OAR20_51137793  ALL(NP)    G    A           40/183/184   0.4496   0.4374       0.6504
  20           oar3_OAR20_51138395  ALL(NP)    G    A           48/187/172   0.4595   0.4536       0.8278
  20           oar3_OAR20_51139507  ALL(NP)    A    G           92/194/121   0.4767   0.4975       0.4253
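One detail in the output above is worth checking numerically: PI is always a little larger than E(HET), and at position 51104501 it is 0.50018, which raw 2pq can never reach because 2pq ≤ 0.5. The sketch below, under the assumption that vcftools applies the n/(n−1) sample correction, converts PI back to 2pq and compares it with plink's E(HET). The rows are copied from the tail output; n = 814 follows from 407 diploid samples with a genotyping rate of exactly 1.

```python
N_SAMPLES = 407                   # "407 people" in the plink log
N_CHROMOSOMES = 2 * N_SAMPLES     # diploid, no missing genotypes

# (position, PI from vcf_pi.sites.pi, E(HET) from plink.hwe)
ROWS = [
    (51060804, 0.499742, 0.4991),
    (51081995, 0.438815, 0.4383),
    (51104501, 0.50018, 0.4996),
]

def pi_to_expected_het(pi, n_chromosomes):
    """Undo the n/(n-1) correction to recover plain 2pq."""
    return pi * (n_chromosomes - 1) / n_chromosomes

diffs = []
for pos, pi, e_het in ROWS:
    recovered = pi_to_expected_het(pi, N_CHROMOSOMES)
    diffs.append(abs(recovered - e_het))
    print(f"{pos}: PI={pi} -> 2pq={recovered:.4f}  plink E(HET)={e_het}")

# plink prints four significant digits, so agreement within 1e-4 is the best we can expect
print("max difference:", max(diffs))
```

The remaining differences are within plink's rounding to four significant digits, so the two tools report the same quantity up to the sample-size correction.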

 

Source: https://www.cnblogs.com/liujiaxin2018/p/16476665.html
