The Struggle to Build a Massive ‘Biobank’ of Patient Data

Mar 19, 2018 · 66 comments
Counter Measures (Old Borough Park, NY)
For crying out loud, today the minute you are admitted into a hospital, you are essentially a guinea pig! That is after they ask you for your insurance! Avoid them, like grim death!
Chris (SW PA)
1.4 billion is tiny money compared to what gets spent elsewhere. This is one cheap country, a nation of tightwad skinflints. Especially when it comes to something that may actually help people who aren't wealthy. Shame on us for being such passive serfs.
Peter Melzer (C'ville, VA)
If you have ever written an application for an NIH research grant, the first advice you get is that your research must be hypothesis driven. Fishing expeditions are definitely not going to be funded. What does this madman of a director do? Spend billions of dollars on a gigantic fishing expedition with no other hypothesis than that with millions of people sequenced something good will come of it. Of course, everybody and their mother wants a piece of the action and a slice of the pie. According to the article the genomes of hundreds of thousands of people have already been sequenced. Where are the great ground-breaking discoveries? This seems like taxpayers' dollars wasted on a pipe dream. Discoveries are made in the lab, not conjured up behind some administrator's desk. NIH has been led like a heavy-handed, top-down Soviet-style bureaucracy rewarding heroes of labor, quota five-times fulfilled. The panels of experts evaluating research grant applications have been sidelined. Institute directors and their associates handpick grant applications for funding behind closed doors. Discovery and innovation are contracted out. Leaders trade chairs between government and elite research universities revolving in a turnstile. Research universities syphon a third and more of the research awards off for indirect cost recovery. President Eisenhower was right when he predicted a university-government complex, like the military-industrial complex, where particular interests hold sway.
JY (IL)
Why one million subjects? How is that number determined? It would be bad if it is part of the "big data" craze. Can't do smart, do big. Perhaps there are sound scientific reasons for one million subjects. But I am skeptical, considering we are still mystified by economic cycles despite that there is data about every single purchase from a burger to a house to a company.
Toms Quill (Monticello)
Just wait until the genetic equivalent of Cambridge Analytica gets its hands on that data, and there will be such a company, combined with ever-larger insurance mergers, by then Amazon and Apple will be in on it, and Trump will scrap preexisting conditions. And you thought health care now was extortion?
Passion for Peaches (Left Coast)
I wonder why anyone would agree to participate. My medical group recently did a hard sell on me to recruit me for a breast cancer study. All it would have involved initially is a blood draw and a cheek scrape for a genetic panel, as well as filling out a few forms. That’s how they sold it to patients. But what alarmed me, and made me refuse, is that by joining the study I would have agreed to allow them free access to my medical records for five years. Full access. (That fact was buried toward the end of the disclosure materials, by the way.) Why would I open the door to my personal medical records? I don’t know what they would do with my information.
Richard Sorensen (Missouri)
Nothing new here, other countries have already done it. The US is lagging far behind because it's lacking a national healthcare system. It's doubtful that this will find anything that the other biobanks somehow missed. It's pathetic that the country is burning all this money on research but refuses to provide basic medical services to its population. TLDR fix the healthcare system first!
Bing Ding Ow (27514)
Rich, Vermont refuses to approve Bernie Sanders' "single payer" theories. Try to force others to follow him -- there will be political fights, never to be forgotten.
Harriet Katz (Albany Ny)
Is there assurance the participants' identities will be protected? Is there a concern that under the Freedom of Information law or other mechanism, insurance companies might gain access to genetic predispositions and increase the rates of insurance coverage for descendants?
Nancy (NY)
This is a staggering waste of money. It almost borders on scandalous. The money would be far better spent in other ways: 1. Basic science 2. Pilot projects to really find out whether this is even feasible 3. Public health 4. Delivering good health care to people who can't even afford to buy insulin. Francis Collins should go. He has been at NIH far too long. He is controlled by people who may benefit from this project, but the public surely won't.
KK13 (Orlando)
Do you know how much the country spends through DOD budget? $580+ Billion per year, and you think $1.8B for 10 years is staggering waste of money?!
AV (Jersey City)
I would like to participate. How will candidates be chosen? I didn't see that in the article.
Dan Frazier (Santa Fe, NM)
The money would be better spent trying to get ordinary people to live healthier lives using the wealth of information that has already been gathered in other studies. This research project is assured of only one thing: lining the pockets of the researchers. The article says, "Dr. Atul Butte, director of the Institute for Computational Health Sciences at the University of California, San Francisco, hopes to find the earliest signs of disease, especially of Type 2 diabetes." He is quoted as asking, “Do you go back and forth from diabetes for a while?” and “Is it preventable?” We already know the answer to this question. Dr. Michael Greger, in his best-selling book, "How Not to Die," writes, "Type 2 diabetes can be prevented, arrested, and even reversed with a healthy-enough diet. Unfortunately, doctors don't tend to educate their patients about diabetes prevention. Only about one in three prediabetic patients reports ever being told by their doctors to exercise or improve their diets." He goes on to write, "Tonah, a 65-year old Native American, had been on insulin for type 2 diabetes for the last twenty-seven years. He was told by his doctor that Native Americans were "genetically predisposed" to the disease..." After switching to a plant-based diet, Tonah turned his life around in less than two weeks. "His nerve pain diminished dramatically, to the point where it no longer kept him up at night. He lost 30 pounds in a matter of months, and no longer needed insulin."
Mercutio (Marin County, CA)
Some of the very large sum of money intended for this project would be better spent at the present time addressing widely recognized health problems already among us. These include cancer, diabetes, cardiovascular and pulmonary diseases, debilities of aging, and neurodegenerative diseases. These are increasingly important problems we must apply more resources to as we look ahead at our aging population. Factor in the thorny issues of data security and of subtle and not-so-subtle racism and sexism in the uses of the data for advancing diagnostics and therapies, and the potentials for misuse and/or misapplication of data from a vast DNA fishing expedition multiply. Let's face it. Federal funds for research are not unlimited. And though it's tough, we must seek a wise and appropriate funding balance between Manhattan Projects and existing lines of research for advancing knowledge and fighting scourges that in the years ahead will probably afflict more and more of us. We must be judicious with public resources and not be too distracted by shiny objects when the fight against known health problems affecting millions needs to be intensified.
aek (New England)
Every person needs to realize that their data will not be de-identified, regardless of assurances given otherwise. That means including, but not limited to, employers discriminating based on anticipated/real healthcare costs, insurance premiums based on risk skyrocketing, government surveillance risks based on biomed demographics, and a whole host of yet unknown ways to use and abuse the data. No way should any minor's genetic material be offered up to this effort until they come of age, have full and appropriate genetic counseling and make fully informed and coercion-free decisions about whether to contribute.
vulcanalex (Tennessee)
I doubt that this will really work that well, since many groups won't be doing anything near this. Who wants such an intrusion into their personal lives? Even diabetics don't like to monitor their eating or insulin.
ZL (WI)
I took part in it because I'm too healthy to be hurt by data leaks. Voluntary exams of this kind are not as harmful as the ones insurers and employers may mandate in the near future. I hope at least we can keep our democracy so that when it comes, they will not kill us and harvest our organs based on our health condition, even if privacy leaks and discrimination are almost inevitable given the power that corporate giants have.
SW (Los Angeles)
Manipulating data will mean future bad medical policy....for example the current standard of care for Type 2 diabetes, supplying insulin to diabetics rather than teaching them to manage their blood sugar by controlling their diet. IF those type 2 diabetics are alive in 40 years, they are going to have serious medical issues that could have been avoided.
vulcanalex (Tennessee)
Teaching is fine, mandating is not. Fix the base issue if you can, not remove freedom.
Craig A (Florida)
It’s difficult to take the Federal Government seriously in matters like this, when they still insist marijuana is so harmful. Meanwhile, I will summarize what the NIH will find over a decade of research and save taxpayers billions : 1. Fried foods are bad for you. 2. Sugar is bad for you. 3. Exercise is good for you at any age and can counter some of the effects of poor eating. 4. Green vegetables are good for you. 5. Moderation is desirable in consumption of alcohol. 6. Most human beings are too lazy to pay attention to their diet. 7. Yoga and meditation are really good for you. 8. Americans are largely too undisciplined to regularly practice yoga, meditation or self-control. There you go America.
David Rosen (Oakland CA)
This kind of work makes a great deal of sense and will very likely yield substantial insights. However the absurdity is that our mentality is still so immature that we are simply unable to organize this kind of undertaking on a comprehensive level. Including everyone would obviously yield considerably more meaningful data. It would simply be a matter of organizing a universal process for capturing health data. I say "simply" but of course nothing is simple when so many people are focused on adversity, often in the guise of "practicality". So we end up with a much more limited and fragmented process.
Rodrick Wallace (Manhattan)
What if a good part of our health problems are social, economic, and political -- results of gross imbalance in power and of actual deprivations and insecurities? This proposed program provides a smokescreen against seeing these other determinants of health that defy biochemical reductionism.
Golfer (Chicago, IL)
Indeed they are. http://journals.sagepub.com/doi/abs/10.1177/0022146510383498
Richard Sorensen (Missouri)
In case there was any doubt: DNA data, which they are collecting, is 100% traceable to the individual. Despite any assurances to the contrary, there is no way to anonymize DNA.
NYC Nomad (NYC)
Discovering useful connections between genes, behavior, environment and health presents an enormous signal to noise challenge. Whether or not one might eventually get a sample size large enough to find meaningful correlations distracts us from the harm done by delay. The needle in a haystack approach distracts us from clear and present dangers of familial syndromes that affect enormous numbers of Americans: heart disease, lung disease, cancer, diabetes, and Alzheimer's for a start. If we give priority to efforts on kindreds and diseases that appear to have genetic linkage, we can also improve our chances of identifying genetic, behavioral or environmental modifiers that aid prevention. By selecting the Icelandic population, DeCode reduced the genetic complexity of its search. And still they face enormous hurdles. So why should we cast aside knowledge we already have? The majority of leaps forward in understanding the genetics of human disease have come from targeted efforts. There's little justification for NIH to wander in the woods.
Joy B (North Port, FL)
Since a lot of people have had their DNA already done by separate genealogy programs, maybe the government could tap into those. Medical records to accompany each would be the hard part.
Palmer (PA)
Those commercial genealogy tests are not the complete genome.
Yer Mom (everywhere)
Does anyone else worry that one of these days AI will figure out that messy, imperfect, destructive humans are just not worth the trouble?
heysus (Mount Vernon)
Germ warfare, cell mutation, bio infiltrates. All issues that could happen when the "enemy" gets ahold of all of this info. While not necessarily a waste of time, it is dangerous. A fool would choose to join.
Daniel (Brooklyn, NY)
$1.4 billion over TEN YEARS is too much to spend on health research, but in FY2017, the proposed budget for the Department of Defense (from a Democratic President, no less) was $582.7 billion. Just for ONE YEAR. The priorities on display are shocking.
Mark (Rocky River, Ohio)
NIH and CDC have long resisted anything of this nature that would provide contrary hypotheses to their own and Pharma interests. For sure, they would never stray from the "greater good" notions that they proffer, despite any evidence that some may be helped. For decades, despite considerable evidence that there is a genetic predisposition to environmental triggers found in compounds like vaccines, there is no effort to fund such work. I'd participate in providing my data if I could count on the sanctity of "first do no harm" being the driving force. I have a better chance of visiting Pluto if I would expect that the oath would be kept.
Maynard Dyson (Fort Worth, Texas)
This represents a basic falsity, that if you gather enough information you must discover something. There is no specific hypothesis driving this. The proponents say that once this information is gathered researchers will come forward with significant questions that they couldn't consider before. The problem is that the new questions that this makes possible will for an individual tell him his risk for a particular disease is 1.2 times as great as if he had a different gene and ate differently. What does he or the health care system do with that? Data banks are best when they are formed with some idea about what they are studying. Maynard Dyson, Physician
Karen Steinberg (Atlanta)
Maynard Dyson is right that most of the associations found will be of a magnitude (1.1-3.0) too small to predict any health outcome. One thing that can't be predicted is innovation. I hope that enough innovation will occur to allow the vast amount of data to be analyzed and integrated to create information from data. We certainly aren't there now.
Richard Sorensen (Missouri)
In fact, gathering data blindly, without first formulating a hypothesis, is toxic, because it can lead to after-the-fact bias, aka false 'discoveries'.
Donald Johnson (Colorado)
I would volunteer for the N.I.H. project if I thought Big Government is honest, competent and cost effective. It's not. Since the private sector already is well into accomplishing what the N.I.H. is attempting, our broke government should spend its money helping people who are in need, not enriching those who are smart enough to do DNA studies and advance their careers without government funding. As Europeans and many Americans know, we can't trust government to do a better job of protecting our privacy or helping us live healthy and productive lives than the private sector. As for seeking diversity, let's not be politically correct. A million samples in several projects undoubtedly will provide the information needed about diversity. Big numbers count in research, but huge numbers are ridiculous and expensive. Already, N.I.H. is proving it's not capable of managing such a huge project. Connect the dots. Cut our losses while we can.
Oh please (minneapolis, mn)
I did volunteer for the project as I trust the government more than most private entities.
Mark (Rocky River, Ohio)
Do a bit more research into the revolving door nature of those leaving those government agencies for private sector Pharma, etc. and you may reach a different conclusion. Both sides are susceptible.
Harold412 (Massachusetts)
So did I volunteer. I hope my participation provides information that will help save lives or ameliorate suffering.
Ted (California)
I can't see why any American should participate in this project. The risks to participants are clear and frightening, while the benefits are dubious and uncertain. In the worst case, Republicans get their dream of eradicating the Affordable Care Act. As Republicans are very unlikely to enact privacy protections that interfere with important donors in the medical-industrial complex, insurers and employers will have free rein to mine the data and discriminate against or penalize people with actual or potential health problems. In the best case, the medical-industrial complex will use the data to develop highly profitable drugs and devices, for which they charge the very people who donated that data extortionate prices. Depending on their wealth and insurance, those treatments may be unavailable to them. Executives and shareholders will reap the benefits, while everyone else will bear the costs. As long as the United States has a unique health care system focused on the wealth of executives and shareholders rather than the health of patients, Americans should "just say no" if they're ever asked to participate in a program like this. At worst, their data will be used against them. At best, they'll be making an unjustified donation to the executives and investors in a medical-industrial complex that too often fails patients.
Kiki (California)
Two words: Henrietta Lacks
Bing Ding Ow (27514)
When applying to West Point, during the physical exam, it was discovered that my nephew has Type 1 diabetes. Such persons are immediately rejected. Now, we are concerned it will affect his future employability. It was Joe Biden, the Big Democrat, who talked big to get the taxes for this program. As usual, left out -- what is being done to protect the privacy of those participating? Facebook, anyone? (40% of Americans are not on FB.) I refuse to participate in another government goof (e.g., Larry Nassar, Parkland/FBI, "you can keep your doctor," ad. inf.)
Howard (Omaha)
Well, I would be more thankful that they found he had type 1 diabetes and are now treating it. Hard to be employed when you are in diabetic ketoacidosis or have suffered complications from untreated diabetes.
LA Lawyer (Los Angeles)
This is an outstanding project, and one that differs from the usual NIH undertaking where the government funds the research, and when it is successful, private pharma takes over and makes all of the profit despite the public investment in the research. Yes, there are privacy issues, but many of us are willing to compromise on our privacy for the benefit of future generations. And having been at many dinner parties where guests blabber on about their health issues and the medications they take, I wonder how serious most people really are about health-related privacy.
Megan (Santa Barbara)
Problem: what about experiences, and their effects on health? We know that Adverse Childhood Experiences (ACEs) impact lifetime physical health. These effects may be epigenetic (and hence picked up on genetic tests) or they may be structural (organs or brain networks or hormone systems which fail to grow correctly). Why not poll for ACEs as well, and thereby factor in the most shaping force to health that we are currently aware of?
Mike McGuire (San Leandro, CA)
Wouldn't this enable massive genetic discrimination were the raw information to fall into the wrong hands - like insurance companies' hands? First for "only" one million people, then for the rest of us as the effort is expanded. One thing we should have learned in the Computer Age is that even when all safeguards are taken, your personal information is never totally safe once someone else gets it.
ABT (Citizen of the world)
Yes - under no circumstances would I participate in this study.
Karen Steinberg (Atlanta)
The Genetic Information Non-discrimination Act of 2008 (Pub.L. 110-233, 122 Stat 881) was enacted to prevent discrimination on the basis of genetic information. But after watching Trump flout rules and norms of civil behavior, I wouldn't count on anything anymore.
Jay in Seattle (WA)
Worth noting: HR 1313 would pretty much gut GINA, the nondiscrimination law (which, btw, *already* doesn't cover life insurance health insurance discrimination based on genetic info), and allow employers to demand genetic test results: https://www.congress.gov/bill/115th-congress/house-bill/1313
LG (NYC)
What about patient privacy? Who will have access to the results- i.e. could potential insurers, employers, etc. use information uncovered in this study to discriminate against participants down the line based upon genetic predisposition? I'd be a willing participant if doing so didn't jeopardize many things down the line, but without robust safeguards in place this is a minefield.
Sari Hoerner (Seattle)
Ambitious projects typically entail enormous complexity, and this is no exception. This is a mash-up of big data, genomics, and myriad ethical and social issues. If NIH and its partners can pull it off, the discoveries would shape human health for generations to come. However, the road blocks (financial, practical, sociocultural, and technical) are formidable and the track record of the Institute on past projects of similar magnitude (Cancer Biomedical Informatics Grid, National Children's Study) did not bear full fruit despite massive federal investments. As a researcher who has worked with some of the organizations mentioned in this article, I am watching All Of Us with a mix of optimism and skepticism. Bringing disparate collaborators together on something of this scale definitely takes time, effort, patience, and "conference calls upon conference calls" in order to gain alignment around goals, design, methods, and data governance. I'm willing to be patient a while longer, knowing the potentially transformative impact of the results.
meow (los angeles)
I wonder how many people will go sign up after reading this article? I am one of them! While I generally prefer anonymity and have privacy concerns, I realize I've already given blood and saliva samples to several other studies as well as 23andme so why not this one? The ten year scope is particularly appealing to me. Yes, I too am concerned about safety of our data, but my healthcare providers already have this info online, also in a database, so why not give it to science? After all, a 'trove of health information like nothing the world has ever seen' that 'should provide new insight into who gets sick and why, and how to prevent and treat chronic diseases' is pretty cool!
Grace (Portland)
Mister Grolsch is correct that a massive relational database is the appropriate repository for this data. However, it’s difficult for high-level executives to understand that data models like this can take decades to develop. Most of the gigantic computer systems that we take for granted (banking, taxes, social security/Medicare, transportation, and even intrusive consumer databases) have been evolving since the 1980s and 1990s. No one can design a comprehensive data model all at once. The rocky beginning of ACA enrollment websites is an excellent example: those should have had a five-year lead time. Then you have to figure out how to integrate existing non-uniform data into the database. If you’re collecting data from end-users on apps or browsers using rapidly changing SDKs and open source tools, there’s even more unanticipated work to be done. Finally of course there are privacy and security issues. Executives looking to make money from such projects (unless they were actively involved in Y2K work or have other hands-on experience) generally dismiss long-term (five to ten-year) development schedules and force developers to work faster. A sensible strategy is to treat the overall vision as long-term and divide the project into carefully integrated phases. I see here a worthwhile undertaking with unimaginable potential, but wonder whether the current business/IT climate has the maturity and perspective needed to pull it off without a lot of pain and waste.
Mister Grolsch (Prospect, Kentucky)
Perhaps the federal effort would be best spent creating a relational database that could tap each of the existing private sector collections. Surely the programming talent needed for such a database is available and the harmonization of standards -- or the recognition of deviations from a standard -- ought to be at least conceptually not difficult to do.
BlueMountainMan (Saugerties, NY)
Could we not turn to private entities such as 23 and Me & Ancestry.com for DNA that has already been sequenced? That should bring the cost down considerably. I’d give permission to share my data.
LPK (Pittsburgh)
Unfortunately, that would not work. 23 and Me and Ancestry do not do complete genomic sequencing, such as a project like this requires.
Bobbi Bowman (France)
But they did most of what this study proposed and they asked us a lot of health questions and I think it is silly to overlook a privately funded data collection close to what is now being described as desirable. It's inexcusable that the original (before ancestry-only) project was basically shut down by the US govt. for no good reason. They had the 1 million you want because they paid for the money gap. And then it was shut down effectively because WE shouldn't be allowed to know this information without having a doctor in the middle. It was outrageous then and it still is. The data, for the most part, is already there and collected at a heavily subsidized cost. To ignore this existing database is criminal.
Ma (Atl)
"While supporters say the results will be well worth the money and effort, others have begun to question whether All of Us is just too ambitious, too loaded with cumbersome bureaucracy — and too duplicative of smaller programs that are moving much more quickly." This program does duplicate efforts by smaller institutions - both private and public (through grants to universities). We don't need the NIH involved in this at that price tag. More importantly, the NIH is hugely bureaucratic. Even Obama knew this when he tried to start up a new agency for innovation outside the NIH. Although, that too was destined for the same problems of bureaucracy. And the NIH won that one too - the Dept. of Innovation resides within the NIH and received 2 billion. The purpose of government is to provide security and services that the private sector would or could not provide. That's not the situation in this case.
DBA (Liberty, MO)
This is a wonderful idea. I'm currently in the third year of a five-year Harvard medical study of supplements to enhance cardiovascular health. But given what has gone on recently with Cambridge Analytica (no relation, thank God) and Facebook, it would concern me who's guarding all this data.
J (USA)
When I started visiting GW Medical Faculty Drs. I signed a permission to be included in various tests. I was included in one. Then someone allied with the project took the data on a jump stick to the cafeteria where her purse was stolen. Those included in the test were notified of the loss/theft of data. I tried then to get out of all further testing thru GW but to this day don't know whether I have successfully done so.
Larry Beacon (Amherst, MA)
Are our biodata safe?
Miner49er (Glenview IL)
No. Nothing is safe from the MegaState. Even if confidentiality is assured, it will be breached and the data will be misused, probably to the detriment of donors.
Christine Arrington (Larchmont, NY)
My understanding is that the late Jon Huntsman, Sr., was in the process of gathering data on 15 million people from around the world--their entire DNA sequenced (which he could have done at his hospital in Salt Lake City), their ancestry records going back as far as available (which he could have done at various ancestry databases, many of them in Salt Lake City), and their health records from birth until death. I was told that he was traveling around the world gathering data from others who had pieces of those three and then, when needed, trying to supplement that by offering to sequence the DNA.
Jim (Colorado)
Seems like a good story, but you don't need to "travel around the world gathering data" these days. The mental image of this has him stopping by offices and laboratories to stuff papers in his briefcase--rather comical.
Bing Ding Ow (27514)
In case anyone cares about the facts about Mr. Huntsman -- https://healthcare.utah.edu/huntsmancancerinstitute/research/cancer-gene... He had cancer. He had a deep interest in the topic. His church keeps very good records on families. He was also very competent, which cannot be said about Big Government.
Jeffrey (St. Louis)
From a machine learning perspective, this is not exactly trivial, but isn't out of the world hard either. The looming task at hand is instead the gathering and storing of all the data. Why can't independent scientists, or a group of geographically dispersed scientists, perform research upon a smaller but effectively equivalent sample size?
LPK (Pittsburgh)
Exactly! Francis is going to end up getting beaten to the punch again, as happened with the human genome project