Understanding gene size

Father’s Day gifts from David

David has a terminal deletion of chromosome 22 caused by a balanced translocation.  Like nearly everyone with 22q13 deletion syndrome, he is missing a lot more than one gene.  What, exactly, does that mean?

DNA and genes

Each gene is made up of many “bases”.  DNA has two strands (strings) that grip each other tightly. Imagine a bunch of magnets threaded onto a string like pearls. Now make two of these in your mind and hold them near each other.  Slowly bring them close together.  When they get near, the north poles of magnets from one string will start to find the south poles from the other string. When the magnets come together, opposite poles will grab each other.  Anywhere north faces north, or south faces south, that pair will repel each other until one flips around and the opposites unite.  DNA is made of chemical strings that have opposite poles. These opposites find their mate and the two DNA strands lock together. Each time a north meets a south you get a “base pair”.

Magnets can only make one type of partnership (north attracted to south).  DNA actually has two kinds of partnerships from four chemical bases.  The bases are abbreviated T, A, G and C.  T and A attract each other.  G and C attract each other.  If you make a string like this: -T-A-G-G-C-A-, the matching string will always look like this: -A-T-C-C-G-T-.  That is, the strings stick to each other in this way:

-T-A-G-G-C-A-
 | | | | | |
-A-T-C-C-G-T-

Voilà! You have a small strand of DNA.  This miniature DNA has 6 base pairs.  The order of the base pairs describes the protein that this segment of DNA makes.  The lower strand is a kind of mirror of the upper strand. If you know what is on one strand you can always figure out the other strand. Thus, we now know a bunch of properties of DNA:  1) The sequence of base pairs describes how to make a protein, 2) DNA is strongly stuck to itself, 3) DNA keeps a mirror copy of itself available at all times, and 4) the length of the DNA can be measured by counting the number of base pairs.  There is a lot more to learn about DNA, but this is enough to discuss gene size.
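For readers who like to see the matching rule spelled out, here is a minimal Python sketch of it. The names `PAIRS` and `matching_strand` are made up for this illustration. The sketch writes both strands left to right, the way the drawing does, and ignores details such as strand direction.

```python
# Pairing rules from the text: T pairs with A, and G pairs with C.
PAIRS = {"T": "A", "A": "T", "G": "C", "C": "G"}


def matching_strand(strand: str) -> str:
    """Return the partner strand, base by base, in the same left-to-right order."""
    return "".join(PAIRS[base] for base in strand)


upper = "TAGGCA"
lower = matching_strand(upper)

print(upper)                      # TAGGCA
print(lower)                      # ATCCGT
print(len(upper), "base pairs")   # 6 base pairs
```

Running the function on the lower strand gives back the upper strand, which is the "mirror copy" property in action.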

Big genes are easier to find

In a previous posting I explained that 95% of all people with 22q13 deletion syndrome are missing at least 1 Mbase from their chromosome (see Understanding deletion size).  1 Mbase means 1,000,000 (1 million) base pairs along the two parallel strands of DNA.  Genes are segments of the long strings, like chapters in a book.  And, like many books, some chapters are long and some are short.  There are 32 genes in the distal 1 Mbase of 22q13, many of which influence brain function. Chromosome deletion syndromes are inherently difficult to study because so many genes are involved.  It is hard enough to study and understand the impact of losing a single gene.  It is much harder to study and understand 22q13 deletion syndrome, where many genes are missing.

This problem with studying multiple genes is not unique to 22q13 deletion syndrome.  It shows up in neuropsychiatric disorders like autism and schizophrenia, each of which has hundreds of associated “risk factor” genes.  Autism, for example, results from various combinations of these many genes (see review by Gratten et al., 2014).  Chromosomal deletions are known to operate in a similar way (see contiguous gene syndrome). Each missing gene weakens the normal operation of the brain.  No one gene needs to be “dominant” for the combined loss to be devastating, especially when so many brain-related genes are missing at once.

Not everyone thinks of 22q13 deletion syndrome this way.  Much of the current thinking about the genes lost in 22q13 deletion syndrome focuses on one or two genes that code for synaptic proteins.  The term “synaptopathy” has been used a lot recently, but that word originates from the study of the inner ear, where researchers were able to demonstrate clearly the relationship between synaptic function and hearing loss (Sergeyenko et al., 2013).  The same does not hold true for 22q13 deletion syndrome. Synapses are involved, but the synapse may not be the primary site of dysfunction (see Is 22q13 deletion syndrome a ciliopathy?). For many years no one thought primary cilia were important. Now, ciliopathies are a recognized type of brain dysfunction despite the fact that synapses are also involved. Science often goes off in a wrong direction; it is part of the process.

There is another reason that synaptic genes have taken the spotlight.  The synaptic genes of 22q13 are relatively large genes.  Defects of these genes are simply easier to notice.  If we look at the history of 22q13 deletion syndrome, the first cases were discovered in people with very large deletions and with the most “severe” phenotype.  As the research in 22q13 deletion syndrome advanced, smaller and smaller deletions were identified and studied.  At the moment, the only gene getting any attention is a large gene that has a large effect when mutated, even though mutations do not necessarily tell you what happens when a gene is deleted (see When missing a gene is a good thing).  So, why does size matter?

Pie chart: genes lost in a 1 Mbase deletion of 22q13, sorted by their sizes (mRNA size).

The pie chart shows the 32 genes missing in 95% of patients with 22q13 deletion syndrome.  They are in order of size. The largest gene is SBF1 and the second largest is SHANK3.  The genes continue in descending order of size in a counter-clockwise direction.  Although the reality is a bit more complex, it is generally true that the likelihood of a gene mutation depends on the gene’s size.  This pie graph shows that the 10 largest genes account for half of the “protein-coding” DNA in the first 1 Mbase.  To put it another way, you are twice as likely to incur a mutation of SHANK3 as a mutation of MAPK8IP2, simply because SHANK3 is twice as large. SHANK3 is 16 times larger than SYCE3.  So, when studying mutations, SHANK3 can show up more often simply because it is big.
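The arithmetic behind that comparison can be sketched in a few lines of Python. The numbers below are relative sizes only, taken from the two ratios stated above (SHANK3 is twice the size of MAPK8IP2 and 16 times the size of SYCE3). They are not measured mRNA lengths, and the names `relative_size` and `relative_mutation_chance` are made up for this illustration. The sketch also assumes every base is equally likely to mutate, which is the simplification used in this post.

```python
# Relative sizes only, derived from the ratios stated in the text.
# These are NOT measured mRNA lengths.
relative_size = {"SHANK3": 16.0, "MAPK8IP2": 8.0, "SYCE3": 1.0}


def relative_mutation_chance(gene_a: str, gene_b: str) -> float:
    """How many times more likely a random mutation hits gene_a than gene_b,
    assuming every base is equally likely to mutate."""
    return relative_size[gene_a] / relative_size[gene_b]


print(relative_mutation_chance("SHANK3", "MAPK8IP2"))  # 2.0
print(relative_mutation_chance("SHANK3", "SYCE3"))     # 16.0
```

Under this simple model the chance of a hit scales directly with size, so a gene can dominate a list of observed mutations without being any more important than its smaller neighbors.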

As I noted above, no one knows what a complete deletion of SHANK3 might do on its own. A gene can have a severe phenotype when mutated, but might do little or no harm when missing altogether.  SHANK3 may have some contribution to 22q13 deletion syndrome, but its relative contribution is very poorly understood.  There are other 22q13 genes that have severe consequences after mutation, usually when both copies are mutated.  We have discussed some of these previously (Can 22q13 deletion syndrome cause cancer?, Can 22q13 deletion syndrome cause ulcerative colitis? and Is 22q13 deletion syndrome a ciliopathy?).  Another gene is SBF1; mutations of SBF1 cause Charcot-Marie-Tooth disease type 4B3. The phenotype includes intellectual disability.  MAPK11 and MAPK12 are involved in responses to oxidative stress, and are likely important to recovery from infection and brain trauma. SBF1 is large, but MAPK11 is much smaller. SCO2 is one of the smallest genes, yet it is implicated in a series of severe, including fatal, syndromes (DiMauro et al., 2012).  What happens when all of these genes are deleted together?  You get 22q13 deletion syndrome.

The take-home message is that certain genes are more likely to come under the microscope (literally and figuratively) simply because they are larger genes. Being large makes a gene easier to study (usually), but it does not necessarily confer importance. When a gene gets popularized in the scientific literature, lots of papers are published on that one gene, at least for a while.  Scientists will focus on genes that get them grants and publications. That is how science typically works, even if it is not necessarily the best approach to finding effective treatments that families really need. The direction of science can be influenced by patient groups, but choosing the right direction requires a deep understanding of the science (the current state of research), science (the discipline) and scientists (who do science).


Previous posts:
Gene deletions versus mutations: sometimes missing a gene is better.
Is 22q13 deletion syndrome a ciliopathy?
Understanding deletion size
Can 22q13 deletion syndrome cause ulcerative colitis?
Can 22q13 deletion syndrome cause cancer?
22q13 deletion syndrome – an introduction


6 thoughts on “Understanding gene size”

  1. Thank you, Curtis. I do my best to make genetics understandable for other parents. Parents are the real experts in 22q13 deletion syndrome, but they get overwhelmed by the science. The basic principles are not that difficult if they are presented properly. Having done both, I can honestly say that raising a child with 22q13 deletion syndrome is a lot harder than learning the science!



  2. Thanks Andy, very well done, as it filled in areas I didn’t understand very well. I have a question though, driven by a question someone asked me that I couldn’t answer, or at least wasn’t confident I knew the precise answer:

    1. The mirror image you discussed above-how does that work since only certain bases can sit next to other bases? It is confusing as there is the three of four holes make a codon part of this structure. Could you elaborate on this? This seems like the perfect place to show that part of the DNA “rules” as it would help to visualize it for those of us who stumble with the visualization. Similarly, some stumble on the shape of a chromosome vs. a DNA double helix.




    • Richard,

      Any arrangement (any order) of bases can be put on a string. That is how the body codes for each of the thousands of proteins it makes. However, once you make a string, the second (mirror) string has to follow the rules about matching bases. My drawing shows the two strings in a straight line. In the nucleus, the two strings are twisted into a corkscrew (helix) shape. Since there are two strings, you get a double helix. The Wikipedia page has some nice drawings and explanations (https://en.wikipedia.org/wiki/DNA). There is a nice description of just the structure in this YouTube video (


      The sequences of base pairs that make up codons are another part of the story…for a future blog.



  3. I enjoyed reading the science, with regards to the deletions. It appeared, in your descriptions, that the larger the size of the structure the more open it is to mutate, or have deletions. This seems simplistic: the part I have grasped–and I have the basics of the structure–my gut response is that the surface area is the vulnerable part which makes a difference.

    Does this depend on strictly the inheritable part of genetic science, or does environment make its impact? I was interested since they interviewed Steve Silberman, on Fresh Air, NPR radio (http://www.npr.org/sections/health-shots/2015/09/02/436742377/neurotribes-examines-the-history-and-myths-of-the-autism-spectrum), about his book on whether science is doing all that society needs done, in order to make the person with a disability welcome or able to lead a full life. This interview seemed to point out that understanding the science can be wonderfully informative, but questions still, the purport–of whether the information will be used in a good way, to accomplish what is needed–for families. Granted, these two topics are not interchangeable for each concern, but I understand the need to further integrate people who are affected by these genetic deletions and mutations, into the everyday world.

    I would like to say this is a helpful blog and very encouraging. My brother who experienced Down Syndrome, passed away at the age of 42. His background was that as a resident from a progressive area (Northern Virginia/ DC) he had managed a full, active life, until pancreatic cancer caused his death.

    I credit the various families who helped him, and others of his friends with challenges, by being creative and supportive of early education and many of the community whose insistence on an adaptive environment for their children to be involved in, passage of laws for education, and social activity groups with the rigor of regular social events, Special Olympics, and community. This steady and helpful involvement would challenge and encourage abilities to be developed: to do as much as each person was ultimately able to do.

    David’s progress with his walking at a relatively young age, reflects that encouragement and assistance to learn and grow. Thank you for the information and encouraging discourse.
    Kathy Beckwith


  4. Kathy,

    Thank you for the nice comments. You are right that I simplified the relationship between gene size and impact. For general understanding, the relationship between number of base pairs and probability of mutation is not far off. I used mRNA sizes, which approximate total exon length. There are advanced algorithms that adjust for regions of genes that are highly conserved. That is, these regions do not change much, if at all, between species. The inference is that changing these regions will adversely impact the resultant protein’s performance. I also did not discuss non-coding DNA regions, which are filled with promoters, enhancers and inhibitors. Still, I think the graphs are informative and explain some of the logic we should use to interpret the frequency of mutations.

    Environment has impact in a number of ways. However, with syndromic autism, the primary impact seems to be from inherited genetic defects. Certainly, it is the case with 22q13 deletion syndrome. Whether or not any of this genetic information is used to the benefit of families speaks to the whole purpose of my arm22q13 blog. The blog is written with the intent to address issues that matter. I am a member of both the scientific community and the family community. All of what I do is aimed at bringing benefits to families from the science. Sadly, neither the vocal scientists nor the parent-run organization have done a good job making sure that the science benefits the greatest number of patients and the most severely affected families.

    I am sad for the loss of your brother. My son David is part of the community. Although he requires full-time attention, he is comfortable traveling on public transportation, attending events and shopping in the local stores. He does prefer being at home watching TV much of the time, but so do I!

    David benefited from his special education program, although I am quite unhappy about how much we had to fight to protect his right to a free and appropriate education. The Greater Washington, DC region is lucky to have sophisticated school systems, and talented and dedicated teachers. Through this education, David gained skills and independence that bring life-long benefits to him and his caregivers.

    David was encouraged to grow, but he also took it upon himself to work hard and learn as much as he could. David is an athlete and hero. His success was ultimately through his own hard work. He is an amazing man.


