LOCUS       HSINSU       4992 bp    DNA             PRI       30-MAR-1995
DEFINITION  Human gene for preproinsulin, from chromosome 11. Includes a highly
            polymorphic region upstream from the insulin gene containing
            tandemly repeated sequences.
ACCESSION   V00565
VERSION     V00565.1  GI:33930
KEYWORDS    germ line; insulin; repetitive sequence; signal peptide; tandem
            repeat.
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryota; Animalia; Metazoa; Chordata; Vertebrata; Mammalia;
            Theria; Eutheria; Primates; Haplorhini; Catarrhini; Hominidae.
REFERENCE   1  (bases 1925 to 3715)
  AUTHORS   Bell,G.I., Pictet,R.L., Rutter,W.J., Cordell,B., Tischer,E. and
            Goodman,H.M.
  TITLE     Sequence of the human insulin gene
  JOURNAL   Nature 284 (5751), 26-32 (1980)
  MEDLINE   80120725
REFERENCE   2  (bases 1928 to 3651)
  AUTHORS   Ullrich,A., Dull,T.J., Gray,A., Brosius,J. and Sures,I.
  TITLE     Genetic variation in the human insulin gene
  JOURNAL   Science 209 (4456), 612-615 (1980)
  MEDLINE   80236313
REFERENCE   3  (bases 1 to 2227)
  AUTHORS   Bell,G.I., Selby,M.J. and Rutter,W.J.
  TITLE     The highly polymorphic region near the human insulin gene is
            composed of simple tandemly repeating sequences
  JOURNAL   Nature 295 (5844), 31-35 (1982)
  MEDLINE   82125365
REFERENCE   4  (bases 1 to 4992)
  AUTHORS   Bell,G.I.
  TITLE     Direct Submission
  JOURNAL   Submitted (01-APR-1982) to the EMBL/GenBank/DDBJ databases
COMMENT     This entry is assembled from and replaces the previous entries
            and , and contains other new data. Some sequence and feature data
            have been adapted from the Los Alamos sequence data base entry
            HUMINS1. The immediate translation product of the gene is
            preproinsulin. The signal peptide facilitates membrane transit of
            the insulin precursor, and is cleaved off in the process. In the
            resulting proinsulin molecule, the peptide chains A and B are
            joined by the connecting peptide C, which is believed to help in
            the formation of the disulphide bridges required for insulin.
            (See [1].).
            NCBI gi: 33930
FEATURES             Location/Qualifiers
     source          1..4992
                     /organism="Homo sapiens"
     misc_feature    1340..1823
                     /note="polymorphic region (tandem repeats)"
     conflict        2101
                     /note="G is AGG in [2]"
                     /citation=[2]
     misc_feature    2156..2161
                     /note="TATAAA (Hogness) box"
     mRNA            2186..2227
                     /note="preproinsulin mRNA (part 1)"
     prim_transcript 2186..3615
                     /note="preproinsulin primary transcript"
     intron          2228..2406
                     /note="intron"
     allele          2401
                     /note="T can be A (see [2])"
     mRNA            2407..2610
                     /note="preproinsulin mRNA (part 2)"
     CDS             join(2424..2610,3397..3542)
                     /note="NCBI gi: 758088"
                     /codon_start=1
                     /product="preproinsulin"
                     /translation="MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCG
                     ERGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKRGIVEQCCTSICSL
                     YQLENYCN"
     sig_peptide     2424..2495
                     /note="peptide pre (signal peptide)"
     mat_peptide     2496..2585
                     /note="proinsulin peptide B"
     mat_peptide     join(2586..2610,3397..3476)
                     /note="proinsulin peptide C (part 1) (2610 is 1st base in
                     codon)"
     intron          2611..3396
                     /note="intron"
     conflict        3068
                     /note="G is GG in [2]"
                     /citation=[2]
     allele          3229
                     /note="G can be C (see [2])"
     mRNA            3397..3615
                     /note="preproinsulin mRNA (part 3)"
     mat_peptide     3477..3539
                     /note="proinsulin peptide A"
     allele          3551
                     /note="T can be C (see [2])"
     allele          3564
                     /note="A can be C (see [2])"
     conflict        3638..3639
                     /note="TT is T in [2]"
                     /citation=[2]
     unsure          4381..4441
                     /note="sequence unconfirmed"
BASE COUNT      849 a   1553 c   1755 g    835 t
ORIGIN
        1 ctcgaggggc ctagacattg ccctccagag agagcaccca acaccctcca ggcttgaccg
       61 gccagggtgt ccccttccta ccttggagag agcagcccca gggcatcctg cagggggtgc
      121 tgggacacca gctggccttc aaggtctctg cctccctcca gccaccccac tacacgctgc
      181 tgggatcctg gatctcagct ccctggccga caacactggc aaactcctac tcatccacga
      241 aggccctcct gggcatggtg gtccttccca gcctggcagt ctgttcctca cacaccttgt
      301 tagtgcccag cccctgaggt tgcagctggg ggtgtctctg aagggctgtg agcccccagg
      361 aagccctggg gaagtgcctg ccttgcctcc ccccggccct gccagcgcct ggctctgccc
      421 tcctacctgg gctcccccca tccagcctcc ctccctacac actcctctca aggaggcacc
      481 catgtcctct ccagctgccg ggcctcagag cactgtggcg tcctggggca gccaccgcat
      541 gtcctgctgt ggcatggctc agggtggaaa gggcggaagg gaggggtcct gcagatagct
      601 ggtgcccact accaaacccg ctcggggcag gagagccaaa ggctgggtgt gtgcagagcg
      661 gccccgagag gttccgaggc tgaggccagg gtgggacata gggatgcgag gggccggggc
      721 acaggatact ccaacctgcc tgcccccatg gtctcatcct cctgcttctg ggacctcctg
      781 atcctgcccc tggtgctaag aggcaggtaa ggggctgcag gcagcagggc tcggagccca
      841 tgccccctca ccatgggtca ggctggacct ccaggtgcct gttctgggga gctgggaggg
      901 ccggaggggt gtaccccagg ggctcagccc agatgacact atgggggtga tggtgtcatg
      961 ggacctggcc aggagagggg agatgggctc ccagaagagg agtgggggct gagagggtgc
     1021 ctggggggcc aggacggagc tgggccagtg cacagcttcc cacacctgcc cacccccaga
     1081 gtcctgccgc cacccccaga tcacacggaa gatgaggtcc gagtggcctg ctgaggactt
     1141 gctgcttgtc cccaggtccc caggtcatgc cctccttctg ccaccctggg gagctgaggg
     1201 cctcagctgg ggctgctgtc ctaaggcagg gtgggaacta ggcagccagc agggagggga
     1261 cccctccctc actcccactc tcccaccccc accaccttgg cccatccatg gcggcatctt
     1321 gggccatccg ggactgggga caggggtcct ggggacaggg gtccggggac agggtcctgg
     1381 ggacaggggt gtggggacag gggtctgggg acaggggtgt ggggacaggg gtgtggggac
     1441 aggggtctgg ggacaggggt gtggggacag gggtccgggg acaggggtgt ggggacaggg
     1501 gtctggggac aggggtgtgg ggacaggggt gtggggacag gggtctgggg acaggggtgt
     1561 ggggacaggg gtcctgggga caggggtgtg gggacagggg tgtggggaca ggggtgtggg
     1621 gacaggggtg tggggacagg ggtcctgggg ataggggtgt ggggacaggg gtgtggggac
     1681 aggggtcccg gggacagggg tgtggggaca ggggtgtggg gacaggggtc ctggggacag
     1741 gggtctgagg acaggggtgt gggcacaggg gtcctgggga caggggtcct ggggacaggg
     1801 gtcctgggga caggggtctg gggacagcag cgcaaagagc cccgccctgc agcctccagc
     1861 tctcctggtc taatgtggaa agtggcccag gtgagggctt tgctctcctg gagacatttg
     1921 cccccagctg tgagcaggga caggtctggc caccgggccc ctggttaaga ctctaatgac
     1981 ccgctggtcc tgaggaagag gtgctgacga ccaaggagat cttcccacag acccagcacc
     2041 agggaaatgg tccggaaatt gcagcctcag cccccagcca tctgccgacc cccccacccc
     2101 gccctaatgg gccaggcggc aggggttgac aggtagggga gatgggctct gagactataa
     2161 agccagcggg ggcccagcag ccctcagccc tccaggacag gctgcatcag aagaggccat
     2221 caagcaggtc tgttccaagg gcctttgcgt caggtgggct cagggttcca gggtggctgg
     2281 accccaggcc ccagctctgc agcagggagg acgtggctgg gctcgtgaag catgtggggg
     2341 tgagcccagg ggccccaagg cagggcacct ggccttcagc ctgcctcagc cctgcctgtc
     2401 tcccagatca ctgtccttct gccatggccc tgtggatgcg cctcctgccc ctgctggcgc
     2461 tgctggccct ctggggacct gacccagccg cagcctttgt gaaccaacac ctgtgcggct
     2521 cacacctggt ggaagctctc tacctagtgt gcggggaacg aggcttcttc tacacaccca
     2581 agacccgccg ggaggcagag gacctgcagg gtgagccaac cgcccattgc tgcccctggc
     2641 cgcccccagc caccccctgc tcctggcgct cccacccagc atgggcagaa gggggcagga
     2701 ggctgccacc cagcaggggg tcaggtgcac ttttttaaaa agaagttctc ttggtcacgt
     2761 cctaaaagtg accagctccc tgtggcccag tcagaatctc agcctgagga cggtgttggc
     2821 ttcggcagcc ccgagataca tcagagggtg ggcacgctcc tccctccact cgcccctcaa
     2881 acaaatgccc cgcagcccat ttctccaccc tcatttgatg accgcagatt caagtgtttt
     2941 gttaagtaaa gtcctgggtg acctggggtc acagggtgcc ccacgctgcc tgcctctggg
     3001 cgaacacccc atcacgcccg gaggagggcg tggctgcctg cctgagtggg ccagacccct
     3061 gtcgccagcc tcacggcagc tccatagtca ggagatgggg aagatgctgg ggacaggccc
     3121 tggggagaag tactgggatc acctgttcag gctcccactg tgacgctgcc ccggggcggg
     3181 ggaaggaggt gggacatgtg ggcgttgggg cctgtaggtc cacacccagt gtgggtgacc
     3241 ctccctctaa cctgggtcca gcccggctgg agatgggtgg gagtgcgacc tagggctggc
     3301 gggcaggcgg gcactgtgtc tccctgactg tgtcctcctg tgtccctctg cctcgccgct
     3361 gttccggaac ctgctctgcg cggcacgtcc tggcagtggg gcaggtggag ctgggcgggg
     3421 gccctggtgc aggcagcctg cagcccttgg ccctggaggg gtccctgcag aagcgtggca
     3481 ttgtggaaca atgctgtacc agcatctgct ccctctacca gctggagaac tactgcaact
     3541 agacgcagcc tgcaggcagc cccacacccg ccgcctcctg caccgagaga gatggaataa
     3601 agcccttgaa ccagccctgc tgtgccgtct gtgtgtcttg ggggccctgg gccaagcccc
     3661 acttcccggc actgttgtga gcccctccca gctctctcca cgctctctgg gtgcccacag
     3721 gtgccaacgc caggcaggcc cagcatgcag tggctctccc caaagcggcc atgcctgttg
     3781 gctgcctgct gcccccaccc tgtggctcag ggtccagtat gggagcttcg ggggtctctg
     3841 aggggccagg gatggtgggg ccactgagaa gtgactctgt cagtagccga cctggagtcc
     3901 ccagagacct tgttcaggaa agggaatgag aacattccag caattttccc cccacctagc
     3961 cctcccaggt tctattttta gagttatttc tgatggagtc cctgtggagg gaggaggctg
     4021 ggctgaggga gggggtcctg cagggcgggg ggctgggaag gtggggagag gctgccgaga
     4081 gccacccgct atccccagct ctgggcagcc ccgggacagt cacacaccct ggcctcgcgg
     4141 cccaagctgg cagccgtctg cagccacagc ttatgccagc ccaggtccag ccagacacct
     4201 gagggaccca ctggtgcctt ggaggaagca ggagaggtca gatggcacca tgagctgggg
     4261 caggtgcagg gaccgtggca gcacctggca gggcctcaga acccatgcct tgggcacccc
     4321 ggccatgagg ccctgaggat tgcagcccaa gagaagcagg gaacgccagg gccacagggg
     4381 cagagaccag gccagggtcc cttgcggccc ttagcccacc ccctcccagt aagcaggggc
     4441 tgcttggcta ggcttccttt tgctacagac ctgctgctca cccagaggcc cacgggccct
     4501 agtgacaagg tcgttgtggc tccaggtcct tgggggtcct gacacagagc ctcttctgca
     4561 gcacccctga ggacagggtg ctccgctggg cacccagcct agtgggcaga cgagaaccta
     4621 ggggctgcct gggcctactg tggcctggga ggtcagcggg tgaccctagc taccctgtgg
     4681 ctgggccagt ctgcctgcca cccaggccaa accaatctgc acctttcctg agagctccac
     4741 ccagggctgg gctggggatg gctgggcctg gggctggcat gggctgtggc tgcagaccac
     4801 tgccagcttg ggcctcgagg ccaggagctc accctccagc tgccccgcct ccagagtggg
     4861 ggccagggct gggcaggcgg gtggacggcc ggacactggc cccggaagag gagggaggcg
     4921 gtggctggga tcggcagcag ccgtccatgg gaacacccag ccggccccac tcgcacgggt
     4981 agagacaggc gc