The Protection of Privacy: Mission Impossible?


Abstract—Genetic databases containing information specific to the genomes of individuals will prove to be a vital tool for the future of biomedical research and medicine.  There exists a wealth of information to be discovered upon mapping and analyzing a person’s genome, and it is imperative to set forth adequate guidelines in accordance with patients’ and subjects’ rights when collecting and using this data.  With the integration of computational genomics and genetic databases paving the way for improved personalized healthcare, this discussion-based article examines novel ethical concerns associated with the compilation of large-scale DNA databases and suggests policies for protecting privacy and confidentiality in the coming years.

In recent years, the creation of genetic databases and the applications of computational genomics have revolutionized the field of biomedicine.  With advances in technology, our knowledge of biological mechanisms and the genomes of different organisms has increased at an impressive rate.  Although the integration of this new material with modern medicine has not been completely brought to fruition, it certainly paves the way for the promising future of biomedicine.  It is the common goal of researchers, scientists, and donors alike that information amassed in scientific studies will result in a greater good, benefiting human health universally.  Nonetheless, there exist many ethical obligations associated with this developing frontier of medicine.  The incorporation of computers as analytic tools and the formation of large databases require new considerations in the realms of privacy and protection.  It is essential that we first sort out the moral and ethical concerns involved in large-scale DNA work in order for us to appropriately further our understanding of the human genome and improve health care for the long term.

Broadly, bioinformatics is the scientific field that deals with “the use of information technology to acquire, store, manage, share, analyze, represent and transmit genetic data” (Goodman and Cava 2008).  This general definition captures the breadth of the field and its applications.  Bioinformatics uses high-throughput machinery, namely computers, to sort through large quantities of biological information in a relatively short period of time.  The advantages of using computers for DNA analysis and management are plentiful.  A genome, the entirety of the information contained in an organism’s DNA, can be written out as a long sequence of letters, each representing a chemical base.  At this level genomes are not conceptually complex, but they become complicated because of the sheer volume of information involved.  In fact, even a seemingly simple organism such as the bacterium Escherichia coli possesses about 5 million base pairs (Moody 2004).  Algorithms have therefore been specially designed for gene identification.  Given that many genomes contain a large amount of “junk DNA,” or non-coding DNA, it is useful to be able to sort through this material and locate genes.  Gene finding is a prime example of a process rendered much more efficient by computers.  Within this large framework of DNA, it is also essential for biomedical analysis that errors in DNA sequence be detected.  The invention of ultra-high-throughput machines not only increased the speed at which analysis could be performed, but also reduced the cost of sequencing genomes.  In the 1990s, the abundance of relatively affordable personal computers, combined with sequence-analysis programs such as the Basic Local Alignment Search Tool (BLAST) newly accessible via the internet, made data mining more feasible for a global community of scientists.  It comes as no surprise that the computer has been labeled the “single most important tool” for studying genomes (Moody 2004).
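To make the idea of gene finding concrete, the following is a minimal sketch of a naive open reading frame (ORF) scan, written in Python.  It is an illustration only, not the method used by any of the tools or authors cited here: it assumes a candidate gene is simply an ATG start codon followed, in the same reading frame, by a stop codon, and it examines only one strand.  The function name `find_orfs` and the parameter `min_codons` are hypothetical, and real gene finders rely on far richer statistical models.

```python
# Naive ORF scan: an illustrative simplification, not a production gene finder.
# Assumption: a candidate gene starts at ATG and ends at the first in-frame stop codon.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence, min_codons=3):
    """Return (start, end, orf) tuples for ORFs on the forward strand.

    start and end are 0-based indices into the sequence; end is exclusive
    and includes the stop codon. min_codons counts the codons before the
    stop codon, including the ATG start.
    """
    seq = sequence.upper()
    orfs = []
    for frame in range(3):  # the three forward reading frames
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, seq[start:i + 3]))
                start = None  # close it and keep scanning this frame
    return orfs


if __name__ == "__main__":
    # Made-up toy sequence, chosen only to show the scan at work.
    for start, end, orf in find_orfs("CCATGAAATTTGGGTAACC"):
        print(start, end, orf)
```

Even this toy version hints at why computers are indispensable: the same scan applied to a bacterial genome of several million bases would be impractical by hand.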

The emergence of bioinformatics in recent years has been coupled with an increasing amount of tissue collection and genetic database construction.  Genetic databases are formed from the “collection, storage and use of physical tissue (usually blood, but by no means exclusively so), genotype and other biological information derived from that tissue, and a variety of personal data from populations of various sizes” (Tutton and Corrigan 2004).  In the last eleven years, a substantial number of large-scale database construction projects have begun in countries such as Iceland, the United Kingdom, Estonia, Canada, Latvia, Singapore, and the United States (Tutton and Corrigan 2004).  While some of these projects target contributors from specific regions, others, such as those in Estonia, Iceland and Singapore, aim to become national databases that collect genetic material from every citizen (Kaye 2004).

As with all biomedical research, the topic of consent is frequently examined by bioethicists and policy-makers.  For genetic studies, it is similarly imperative that prospective database subjects are made aware of both the implications and the consequences before deciding whether they wish to participate.  According to Oonagh Corrigan, a sociologist concerned with developments in genetic research and their implications for human participants, certain issues associated with genetic databases pose new questions regarding the limits of existing consent models, and may require the rewriting of today’s guidelines.  The main problem with current models is that they do not give adequate attention to the future uses of donor materials.  Often, tissue donated to one genetic study is used again in secondary studies with different aims.  In order to protect individuals from exploitation, Corrigan rightly argues that more concrete consent laws must be established that specify the long-term uses of donated samples, and she calls for protection that goes beyond the mere reading of informed consent forms (2003).  Although attempts to improve these laws have since been made, further progress is required to achieve better, more complete protection for individuals.

Computational genomics clearly shares many of the same ethical issues as genetics and other fields involving clinical research.  However, it is more important to examine the ways in which compiling collections of DNA differs from “Genethics,” in order to determine appropriate ethical guidelines and distinctions.  Using computers to decode the information present in one’s DNA can have unforeseeable consequences.  As a result, it is difficult to set stringent regulations in advance without knowing the outcome of such studies (Tavani 2006).  Bioinformatics also raises several new issues of privacy and confidentiality when it comes to building genetic databases, three of which are outlined in a recent journal publication by Kenneth Goodman and Anita Cava:  (1) Using digital devices to store and transmit large volumes of genetic material increases the risk of “inappropriate disclosure,” more so than other media.  (2) Collecting data for genetic databases may enable researchers to draw conclusions about certain communities within a population.  Any correlations drawn between genomic analysis and the behavioral traits of a certain subgroup must take ethical considerations into account, so as not to attach a stigma to research participants and their respective communities.  The authors underscore the importance of recognizing that incorporating individuals’ data into genetic analysis has consequences for broader populations, not only for the study participants.  (3) Data mining can uncover information about individuals that they may not feel comfortable divulging, without it being immediately clear that such information has been revealed (2008).  The use of computers for mapping and analyzing genomes thus raises novel ethical issues, which must be taken into account when compiling genetic databases.  More importantly, the questions these issues raise concerning privacy and protection need to be “identified and elucidated” to help determine “which kinds of normative policies should be adopted” (Tavani 2006).

Computational genomics, genetic databases, and modern medicine share the goal of improving the future of healthcare for everyone.  To accomplish this, genetic databases are continually being constructed from large pools of information.  The result is “a repository of information that can be used as a research tool” in that genetic mapping and computational analysis can be performed to learn more about “the interactions between genes, environment and lifestyle that are thought to be responsible for common diseases” (Kaye 2004).  Being informed that one’s DNA indicates, or makes one more susceptible to, a malady is beneficial in that it allows for careful planning and discussion between a physician and a patient (Moor 1999).  Furthermore, understanding that one is considered to be at high risk for a certain condition, or genetically predisposed to a disease, allows for the implementation of preventative care.  In turn, this information could be integrated into a personal treatment plan, in which physicians develop a course of treatment that best suits a particular individual, informed by how others with a similar genetic makeup responded to a particular course of medication.  Physicians and geneticists, with pharmaceutical research and genetic databases at their disposal, should be better prepared to predict how a patient will respond to specific drugs.  This is particularly true for cancer treatments, as cancer has been described as “the quintessential disease of the genome” (Moody 2004).  Information from the complete human genome was studied in an unprecedented fashion beginning with the Sanger Institute’s Cancer Genome Project in 2001.  This ongoing study has begun to show how cancer types differ, and how best to “develop new, more specific drugs for improved treatment” (Moody 2004).  Certainly, the interplay of genes, lifestyle and medicine will be studied more thoroughly in the coming years, and will improve healthcare in many respects.

The future of these interrelated fields has generated considerable excitement in the scientific community, but there are also several reasons to proceed with caution.  Many ethicists have discussed hypothetical situations that could arise as consequences of advances in our knowledge of genetic information.  Such speculation proves useful when outlining rules and regulations for creating genetic databases and using the gathered information.  It is imperative that data amassed from genetic testing and analysis is not used in any way “to discriminate against individuals to deny them health benefits, educational programs, [or] employment opportunities” (Moor 1999).  Since DNA sequencing and genomic analysis may reveal, sometimes accidentally and undesirably, information encoded by one’s genetic material, it is necessary to keep these records private and inaccessible to anyone not granted approval by the patient.

It has also been suggested that discrimination against individuals included in such sizeable databases may be facilitated by researchers not having personal relationships with participants, though I would argue that familiarity with individuals may be more likely to lead to intrusions of privacy.  This is evidenced by one estimate that as many as “80% of the invasions of privacy within a hospital come from one employee improperly reading another employee’s medical record” (Moor 1999).  To combat this, anyone employed by a hospital, healthcare facility, or data collecting agency should be thoroughly informed of all HIPAA laws and the privacy rights reserved for patients.  After all, this “culture” of respect for others, their privacy and equality, is a central component of healthcare.  While some view the notion of nondisclosure as “an ideal,” there is no denying that “it has been, and should still be, central to the patient-physician relationship” as well as the various relationships among others in health-related positions (Lowrance 1997).  With the increasing use of electronic medical records, it is now more important than ever that (1) strong access control and authorization govern record-keeping systems, (2) identifiable patient information, such as one’s name or address, is hidden when possible, and (3) oversight and audit practices continually review the actions of individuals and organizations (Cooper and Collman 2005).
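The three safeguards listed above can be sketched in a few lines of Python.  The following is a minimal illustration under stated assumptions, not a description of any system discussed by Cooper and Collman: the class `RecordStore`, the role names, the field names, and the use of a keyed hash as a pseudonym are all hypothetical choices made for this example.  It should also be kept in mind that removing a name or address does not make genetic data anonymous, since the genotype itself may identify a person.

```python
import hashlib
import hmac
from datetime import datetime, timezone

# Hypothetical policy: which roles may see identifying fields.
IDENTIFYING_FIELDS = {"name", "address"}
ROLE_ACCESS = {"treating_physician": "identified", "researcher": "deidentified"}


class RecordStore:
    """Toy record store combining access control, de-identification and auditing."""

    def __init__(self, secret_key):
        self._key = secret_key      # key used to derive pseudonyms
        self._records = {}
        self.audit_log = []         # (3) every access attempt is recorded here

    def add(self, patient_id, record):
        self._records[patient_id] = dict(record)

    def _pseudonym(self, patient_id):
        # Keyed hash, so the pseudonym cannot be recomputed without the key.
        digest = hmac.new(self._key, patient_id.encode(), hashlib.sha256)
        return digest.hexdigest()[:16]

    def read(self, user, role, patient_id):
        access = ROLE_ACCESS.get(role)
        granted = access is not None and patient_id in self._records
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "role": role,
            "patient": self._pseudonym(patient_id),
            "granted": granted,
        })
        if access is None:
            # (1) access control: unknown roles are refused
            raise PermissionError(f"role {role!r} may not read records")
        record = dict(self._records[patient_id])
        if access == "deidentified":
            # (2) hide identifying fields and substitute a pseudonym
            for field in IDENTIFYING_FIELDS:
                record.pop(field, None)
            record["pseudonym"] = self._pseudonym(patient_id)
        return record


if __name__ == "__main__":
    store = RecordStore(b"demo-key-not-for-real-use")
    store.add("p1", {"name": "Ann", "address": "1 Main St", "marker_x": "AG"})
    print(store.read("alice", "researcher", "p1"))
    print(len(store.audit_log), "audit entries")
```

Logging the attempt before deciding whether to refuse it is deliberate: an audit trail that records only successful reads would miss exactly the improper access attempts that oversight is meant to catch.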

Clarifying and formalizing laws regarding our right to privacy will significantly enhance both our means of protection and our understanding of it.  Increasing the penalties for violating such laws may emphasize their significance, but is not sufficient in and of itself.  It would be wise to revisit laws concerning both inappropriate use and disclosure in light of the ethical considerations associated with genetics and computing.  In a world where caste systems continue to exist, and instances of genocide have been recently documented, an ethically sound system of respect and privacy is crucially important for the future of both healthcare and human relations (Moor 1999).



Cancer Genome Project.  2008. (accessed November 9, 2008).

Cooper, Ted and Jeff Collman. “Managing Information Security and Privacy in Healthcare Data Mining: State of the Art.” In Medical Informatics: Knowledge Management and Data Mining in Biomedicine, eds. H. Chen, S. S. Fuller, C. Friedman, and W. Hersh, 97-137. New York: Springer Science + Business Media, Inc., 2005.

Corrigan, Oonagh. “Informed consent: the contradictory ethical safeguards in pharmacogenetics.” In Tutton and Corrigan, Genetic Databases: Socio-ethical issues in the collection and use of DNA, 78-96.

Goodman, Kenneth W. and Anita Cava.  2008.  Bioethics, Business Ethics, and Science: Bioinformatics and the Future of Healthcare.  Cambridge Quarterly of Healthcare Ethics 17, no. 4: 361-372. ?fromPage=online&aid=2151384/ (accessed October 12, 2008).

Kaye, Jane. “Abandoning informed consent: the case of genetic research in population collections.” In Tutton and Corrigan, Genetic Databases: Socio-ethical issues in the collection and use of DNA, 117-138.

Lowrance, William H. 1997.  Privacy and Health Research: A Report to the U.S. Secretary of Health and Human Services. PHR1.htm (accessed November 3, 2008).

Moody, Glyn. 2004. Digital Code of Life: How Bioinformatics is Revolutionizing Science, Medicine and Business.  Hoboken: Wiley and Sons, Inc.

Moor, James H. “Using Genetic Information While Protecting the Privacy of the Soul.” In Tavani, Ethics, Computing, and Genomics, 109-119.

Tavani, Herman T., ed. Ethics, Computing, and Genomics. Boston: Jones and Bartlett Publishers, 2006.

Tavani, Herman T. “Ethics at the Intersection of Computing and Genomics.” In Tavani, Ethics, Computing, and Genomics, 5-26.

Tutton, Richard and Oonagh Corrigan, eds. Genetic Databases: Socio-ethical issues in the collection and use of DNA. New York: Routledge, 2004.

Tutton, Richard and Oonagh Corrigan.  “Introduction: public participation in genetic databases.”  In Tutton and Corrigan, Genetic Databases: Socio-ethical issues in the collection and use of DNA, 1-18.