
Counting GC content and per-chromosome length in the sheep (ARS-UI_Ramb_v2.0) reference genome with Python

2022-08-17 08:32:00

Tags: count ARS ratio dict1 python NC v2.0 test home


 

001. Method 1

root@PC1:/home/test# ls
a.fasta  test.py
root@PC1:/home/test# head -n 5 a.fasta              ## reference genome file
>NC_056054.1 Ovis aries strain OAR_USU_Benz2616 breed Rambouillet chromosome 1, ARS-UI_Ramb_v2.0, whole genome shotgun sequence
CCTGAGATAACACATTTCTGACTTCTGCCAATTTTGCTGAGAAGCCCAAGGACTGTTGAAAATCAAGAAACACCCAAGAG
CAGCCCGGGCCGTTTATGTCTACATACGTGAGCAGGCTCAATGGAGCATGGATAGTGTCCCTCTGGGCATGGCTGCTGGG
TCAGTCCTCACCTCCCGCCCAGGGTCTCTCCTGAGCCACCTTCCTCccccagggcagagggagaaCACCAGGACCCACAC
TGAAGCCTCTCATTGGTGTGACCCTCAGGAGGCATGTCTGGTCTGGGGTTAGACAGAGCCTGTATCAGAGGGGCTGAGAG
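root@PC1:/home/test# cat test.py                     ## test script
#!/usr/bin/python
# Run with Python 3: the GC ratio below relies on true division.

# Collect the sequence lines of each FASTA record, keyed by the header ID.
dict1 = dict()

with open("a.fasta", "r") as in_file:
    for i in in_file:
        i = i.strip()
        if not i:
            continue                      # skip blank lines instead of failing on i[0]
        if i[0] == ">":
            key = i.split(" ")[0]         # e.g. ">NC_056054.1"; the ">" is kept
            dict1[key] = []
        else:
            dict1[key].append(i.upper())  # uppercase so soft-masked bases are counted

print("chr" + "\ta" + "\tt" + "\tc" + "\tg"  + "\tgc_ratio"  + "\tlength_chr")
for i,j in dict1.items():
    j = "".join(j)                        # full sequence of one chromosome
    a = j.count("A")
    t = j.count("T")
    c = j.count("C")
    g = j.count("G")
    n = j.count("N")
    gc_ratio = (c + g)/(len(j) - n)       # GC fraction over non-N bases
    print("{0}\t{1}\t{2}\t{3}\t{4}\t{5}\t{6}".format(i, a, t, c, g, gc_ratio, len(j)))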
root@PC1:/home/test# python test.py               ## program output
chr     a       t       c       g       gc_ratio        length_chr
>NC_056054.1    82403250        82167753        57066962        56974237        0.40931875266539836     278617202
>NC_056055.1    73870206        73664071        51359104        51305177        0.4103312258098626      250202058
>NC_056056.1    65030736        65146643        47958587        47951134        0.42421580443997026     226089100
>NC_056057.1    36042375        36096251        24711270        24727203        0.4066429731145337      121578099
>NC_056058.1    31324976        31390024        22730579        22772709        0.4204768791019869      108220788
>NC_056059.1    35575151        35517658        23688161        23687727        0.3999021614967201      118469697
>NC_056060.1    29406577        29709861        21031276        21124704        0.4162631922148832      101274418
>NC_056061.1    27346922        27467584        18457301        18520064        0.40283921219995616     91792871
>NC_056062.1    28059178        28042984        19541289        19535207        0.4105594344480041      95179658
>NC_056063.1    25741738        25783717        17454908        17477108        0.40403698599974086     86459471
>NC_056064.1    16751668        16822146        14486707        14485976        0.4632183158075184      62547497
>NC_056065.1    22959990        23016479        17190305        17236381        0.42817580976766395     80403655
>NC_056066.1    23370039        23543434        18267514        18329848        0.43823489490914563     83511835
>NC_056067.1    17929002        18169794        15175937        15240424        0.45728466069771134     66516657
>NC_056068.1    23961907        23980053        17269787        17324890        0.41914328300049347     82538637
>NC_056069.1    21123795        21200255        14793713        14779101        0.41132272472969056     71897364
>NC_056070.1    20984288        21100700        15531825        15549410        0.42480305427273457     73167223
>NC_056071.1    19199396        19362319        14687520        14734225        0.432777987469305       67984460
>NC_056072.1    16963178        17145902        13196413        13255057        0.4367772419504116      60561550
>NC_056073.1    14560862        14480238        11216357        11192760        0.43554951381448986     51451717
>NC_056074.1    13094151        13171511        10608847        10638185        0.4471864297991606      47514194
>NC_056075.1    14656953        14747140        11051402        11054996        0.4291630223443221      51512491
>NC_056076.1    18129864        18107164        13099588        13103028        0.4196471075331563      62440644
>NC_056077.1    11202652        11267006        10074078        10086500        0.47291734439377725     42630236
>NC_056078.1    12899395        12904427        9529300         9530632         0.42484032878746614     44863754
>NC_056079.1    13069877        13124355        9405991         9451636         0.4185760014919695      45052359
>NC_056080.1    42505821        42516400        29047656        29098348        0.4061376328441594      143171725
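
The script above joins every chromosome into one string before counting, so the whole genome is held in memory at once. A minimal sketch of the same calculation done line by line is shown below. It is not part of the original post: the function names `fasta_stats` and `summarize` and the tiny demo input are assumptions made for illustration, and the GC ratio keeps the same definition as above, (C + G) / (length - N).

```python
import io


def summarize(name, counts, length):
    """Build one result row; gc_ratio is (C + G) / (length - N), as in the script above."""
    denom = length - counts["N"]
    gc_ratio = (counts["C"] + counts["G"]) / denom if denom else float("nan")
    return (name, counts["A"], counts["T"], counts["C"], counts["G"], gc_ratio, length)


def fasta_stats(handle):
    """Yield one row per FASTA record, keeping only running counts in memory."""
    name, counts, length = None, None, 0
    for line in handle:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        if line.startswith(">"):
            if name is not None:
                yield summarize(name, counts, length)
            name = line.split()[0]  # keeps the leading ">" like the script above
            counts = dict.fromkeys("ATCGN", 0)
            length = 0
        elif name is not None:
            seq = line.upper()  # count soft-masked (lowercase) bases too
            for base in counts:
                counts[base] += seq.count(base)
            length += len(seq)
    if name is not None:
        yield summarize(name, counts, length)


# Tiny made-up input, not taken from the sheep assembly.
demo = io.StringIO(">seq1 demo record\nACGTNN\nacgt\n>seq2\nGGCC\n")
rows = list(fasta_stats(demo))
for row in rows:
    print("\t".join(str(x) for x in row))
```

To run it on a real file, pass an open file object instead of the `io.StringIO` demo, for example `with open("a.fasta") as fh: rows = list(fasta_stats(fh))`. As a quick sanity check on the first row of the output above: a + t + c + g = 278612202, which is 5000 less than length_chr, and (c + g) / 278612202 ≈ 0.40932, in line with the reported gc_ratio.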

 

Reference: https://www.jianshu.com/p/a7b20c2af042

 

Source: https://www.cnblogs.com/liujiaxin2018/p/16593610.html
