Microsoft Finds Cancer Clues in Search Queries

Jun 08, 2016 · 126 comments
LJ (NYC)
Let's put the other silent killer, ovarian cancer, in their sights please.
M-aroJ (Los Angeles)
People actually use Bing!?
Kim Bellard (Ohio)
Fascinating, and should open up our thinking about how we diagnose things. See: http://kimbellardblog.blogspot.com/2016/06/this-actually-is-test.html
Ann C (KY)
Wonder how many Google searches there will be tonight along the lines of "why is Google gathering my 'Google searches' medical queries and who are they selling it to?" As always, follow the money!
todd g (virginia)
resistance is futile...you will be assimilated
c smith (PA)
This story fails Journalism 101. Where are the indicators and/or "clues" mentioned in the headline? Any curious reader wants to know HOW the data mining worked! The author needed to do a lot better than: "...researchers declined to offer specific details..." Waste of time.
yuzu (ÜT: 37.858605,-122.251791)
Seems like insurance companies would be very interested in these searches.
Little Albert (Canada)
Well, okay, I suppose there may be some benefit in trying to mine whatever was the source for this research. I think a more promising approach to earlier identification is a more effective dialogue between patients and service providers with technical infrastructure to support engagement that does not require face-to-face visits, tools to extract data drivers for clinical decision support functionality from both structured and text-based information, and financial incentives for providers to 'digest' this information and perform useful activities. I think this is a more promising approach. Even if we ignore the privacy issues for a moment - I have concerns about encouraging the public (and health service organizations) to believe that quality, safety and sustainability can be promoted in any major way through application of data mining algorithms to unstructured data extracted from interactions with the web. Given scarce health care $$ and a scarce supply of people who can initiate clinically relevant and appropriate actions from information supplied, I would be more inclined to figure out how to get higher-grade information into the hands of providers, rather than trying to figure out how to get information of dubious provenance and quality into the hands of providers. And re privacy - if I were to search under "diagnostic criteria for mixed personality disorder with narcissistic and antisocial features" in this brave new big data world, could that harm me???
ACW (New Jersey)
I think a lot of people don't understand how to crunch the numbers.
One thing I forgot to mention in previous comments: It helps to draw a Venn diagram to visualize the following.
Say you have a sample of 100 people who were definitely diagnosed with pancreatic cancer and who also had symptoms, before diagnosis, serious enough to get them to turn to search engines.
Of that 100, 50 searched for abdominal bloating; 30, for light-colored stools and diarrhea; 20, for fatigue and weakness, all of which may be symptoms of pancreatic cancer but also of other conditions.
Your real gold is in the subset - where those circles overlap.
Steve (California)
Thank you for illustrating and making this analysis much clearer than what was depicted. I agree with the other commenters that the journalist (a Pulitzer Prize winning reporter) could have explained this more thoroughly.
Pontifikate (san francisco)
Interesting that the NY Times picks of these comments do not include ONE of the very many here who have found the article to be a tease. Like others have said here, the obligation of a journalist is to answer the questions his/her writings will engender. This journalist did not do that and it has obviously rightly frustrated a number of readers.

Where is the new public editor? We need the Times to train its journalists better.
reckel (Washington)
The Journal of Oncology Practice article is now freely available. http://jop.ascopubs.org/content/early/2016/06/02/JOP.2015.010504.full
DILLON (BLANDING UTAH)
Let's look a little closer here - the big picture here is not about cancer - it's about a total, complete, and constant surveillance. It’s all very creepy. Your searches can be followed and determine if you have cancer. The follow on is that anyone will be able to know anything they want about you - anything. Dr. Horvitz, thank you very much, has tapped into a method to lay each and every person completely transparent to the power of technology. Combine this with the ambitions of people like NSA, FBI, CIA etc, we are looking at an entirely new paradigm of society.
BluRod (Tucson)
It's the World Wide Web - you should know that when you start browsing medical chats, yes?
james reed (Boston)
Isn't something other than early detection applicable here? The reason some succumb to the disease and some not relates not just to early detection, because to some degree or another, all patients seek treatment when they have symptoms. What separates the ones who survive the diagnosis from those who do not has to do with the aggressiveness of the tumor and its location. The ones who survive have a rather indolent tumor.
CLF (Minnesota)
If you haven’t specifically opted out of getting targeted ads, marketers constantly use data on your internet searches to send advertisements for products that might interest you. This is the same thing but with a more sophisticated algorithm behind it. As a public service, a future NIH could do what marketers do and send targeted ads – “see your doctor” – to individuals without personally identifying them. Creepy, yes, but it might also be a silver lining to the loss of privacy we already have.
Joy (Sacramento, CA)
Has anybody else realized what the real story is here? i.e. Microsoft is tracking every search you are doing on the internet, and tabulating that information. And for anybody who swallows their little "oh, don't worry, we won't link the searches to YOU specifically", or "Gosh, we're only trying to HELP!" -- send me a note so I can sell you some surefire stock tips for a nominal fee. This is one of the most chilling articles I've come across in regards to just how intrusive Microsoft has become, all in the name of "helping" us. Get a clue, people! Big Brother Mike has only your very best interests at heart, he just wants to help you. Uh huh.
GBrown (Rochester Hills, MI)
After taking my young son to multiple doctors, receiving several horribly wrong and illogical diagnoses, I would like to see software that considers health history, current medications and environmental factors providing doctors with recommended tests and a list of probable diagnosis codes.

My son had acid reflux which caused an ulcer in his esophagus and he refused to swallow his own saliva but one doctor concluded he had constipation and recommended enemas. A ridiculous diagnosis based on flawed logic and limited knowledge. A Google search did a better job of diagnosing "difficulty swallowing" than 3 doctors.
Barbyr (Northern Illinois)
This is simply an opening salvo in a "We're here to help you" campaign for the government and big business (and who knows what other, heretofore unknown malicious entities?) to be able to match data with real people.

First they are, be still my heart, increasing your chances of surviving pancreatic cancer another three months. What's not to like? People become inured to such "helpful" blatant intrusions of privacy. Next thing you know the government is going to help wean you off internet porn. This is a very real possibility. Think about it and talk to me in 10 years.

This progression of creeping civil and Constitutional rights violations seems an immutable law of the universe these days.
Bruce (Portland, Ore)
I concur. As with most technology, it can be used to do seemingly good to great things in this world. On the other side, it's about money and rights. How can it be used to make money? Connecting doctors with patients or drug companies or denying medical coverage or employment to someone whose search queries indicate they or someone in their family might have some disease. Just more rights taken away and now they're blatant about it.
Sue (California)
This must be connected to something that was puzzling me yesterday. I've been googling my son's symptoms for a couple of months. I always get the same results, but it gives me the illusion of doing something. Yesterday, Google suddenly started leading off with several results involving pancreatic cancer. The doctors have already ruled it out, but it did startle me. It must have something to do with this news story.
Dave S (New Jersey)
Big data and evidence based medicine will in time lead to earlier and more accurate diagnosis and treatment of many conditions. That these soft algorithms were useful for something as difficult to diagnose as pancreatic cancer is certainly a sign for hope.
Rishi (New York)
In a book published in 2003, BALANCE, GICORP, a relation between decline in intelligence and serious diseases indicates how some of these diseases can be predicted much ahead of time.
Penny (New York)
How did they know that someone actually HAD a diagnosis of pancreatic cancer?
NoFun (Nashville)
Great question. I work on similar research efforts and I can tell you that it is not trivial to find cancer cases among a group of people using their medical records, let alone their random internet queries. Color me skeptical of these results.
will (texas)
http://www.nytimes.com/2016/04/15/health/thyroid-tumor-cancer-reclassifi...

Who says they are right? They do! But maybe they will start to see things as they are.
Frederick (California)
So there is a slim chance Microsoft techs might be able to deduce a cancerous condition of a given network user. All that is required is access to all data on all users of the network? In this case that would be the internet. Is the purpose of this article to laud the intelligence or 'can-do' attitudes of really smart database admins? I guess I am not grasping the intent of this piece.
BobF (Park Ridge, IL)
The comment about five-year survival rates seems misleading to me. If cancer is found earlier, of course the five-year survival rate will increase, but since there is apparently no good treatment for pancreatic cancer, it is certainly possible that the patient doesn't get a single day of extra life.
Pups224 (NYC)
This is not a major finding. Pancreatic cancer is still there and killing. The one year mortality rate may go from 97 to 93 percent. Big deal. It's also an awful and painful way to die.
What is needed is a gene that if expressed signifies the probability that a person will get pancreatic cancer. Or, an immune modulator that works on adenocarcinoma of the pancreas and cures it.
Until that time, there is no cure.
Barbara Pines (Germany)
The phrases "early detection of the disease can prolong life . . " and "the study suggests that early screening can increase the five-year survival rate of pancreatic cancer patients . . . " caught my eye.

They're not the same thing, as has already been pointed out in other articles. The patient can, well, survive longer after early diagnosis without necessarily having a longer life. Imagine that John Reader was diagnosed with a cancer (not necessarily pancreatic) in late 2011 and died last week. He would not have reached the 5-year survival rate. Had he been diagnosed in May of 2009 and died last week, he would have survived seven years after diagnosis and been counted into the happy statistics with years to spare. Reason to celebrate?

That is not to say that early detection doesn't save lives - when it prevents metastasis it often does. We just have to understand where the concept of the five- (or ten-) year survival rate is useful to us.
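The John Reader example can be worked through as a small calculation. This is only a sketch: the exact dates below are assumptions, since the comment gives only "late 2011", "May of 2009" and "last week", and `years_survived` is a hypothetical helper name.

```python
from datetime import date


def years_survived(diagnosed: date, died: date) -> float:
    """Approximate number of years between diagnosis and death."""
    return (died - diagnosed).days / 365.25


# Assumed dates, chosen only to match the comment's rough timeline.
death = date(2016, 6, 1)            # "last week", relative to June 2016
late_diagnosis = date(2011, 11, 1)  # "late 2011"
early_diagnosis = date(2009, 5, 1)  # "May of 2009"

for label, diagnosed in [("late", late_diagnosis), ("early", early_diagnosis)]:
    years = years_survived(diagnosed, death)
    # The date of death is identical in both cases; only the diagnosis date moves.
    print(f"{label} diagnosis: {years:.1f} years after diagnosis, "
          f"counted as five-year survivor: {years >= 5}")
```

The date of death is the same in both runs, so the change in the five-year statistic comes entirely from the earlier diagnosis date.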
Gregitz (Was London, now in the American Southwest)
Bravo. Conduct a study on information submitted by the public, come to conclusions and then apparently choose not to share the details of the information gleaned from the populace. This, all whilst trumpeting the apparent success and brilliance of those conducting the study, which more or less falls along the lines of common sense. Especially egregious as the article itself states the condition is quite lethal, with early detection described as crucial.

This is woeful journalism, or click-bait masquerading as journalism. It's becoming a very sad top-down world, and NYT is becoming an active player in it.
RetiredGuy (Georgia)
Recently, two different doctors have remarked to me that the Internet can not be trusted to provide good information when it comes to anything medical. I guess stories like this tend to refute this attitude.
Sue (California)
RetiredGuy, doctors always say that, but then they use it too.
Scientist (Boston)
That may be true for many sites, but I bet they use the Mayo Clinic website just like I do.
paul (blyn)
Here we go again....we can send a man to the moon but must rely on this voodoo approach to help cure panc. cancer.

Unreal...
Lala (France)
This is crazy. Many people search the internet on behalf of friends and relatives who don't know how to, so how do these researchers know that the person searching actually got the disease? Did they match the names with the obituaries?
Also, while a great idea, many diseases share a large set of symptoms, so unless they bring forth proof that the set of symptoms they selected leads directly and only to cancer, one can question everything. Maybe the research itself is less flawed than this article suggests.
A. Davey (Portland)
Microsoft's ethicist (if there is one) needs to ponder what obligations Microsoft has to users whose queries suggest they have pancreatic cancer. Doesn't Microsoft have a duty to inform them?

Or is Microsoft already a step ahead of us, having buried an automatic opt-out of such notifications in a recent update to its user "agreement"?
W (Houston, TX)
Here's a link to a blog about the study, including some of the symptoms that were not described in this article.

https://blogs.microsoft.com/next/2016/06/07/how-web-search-data-might-he...
itsmildeyes (Philadelphia)
You know how the NRA doesn't want its members to register guns because (according to their thinking) then the government will have an easy list of who owns guns? I don’t know, in that same vein, I guess I'm kind of weirded out thinking anybody's got a list of my thoughts and interests. I'm not sure I'm buying the 'anonymized' business, either. The commercial aspects of searches are already creepy - you know, when you're looking to buy underpants, do a quick search, and next thing underpants ads are popping up while you're reading the New York Times.

Sorry, but I'd like to opt out until I can give this some more thought.

Oh, and obviously, what the heck? What were the symptoms people were searching? (My husband died of prostate cancer. His brother died of pancreatic cancer. I didn't think there was anything funny about cancer until I read about 'Cortana' helping me out. I have an iPhone, though; for me it would be, 'Siri, do you think I have cancer?' To which she would reply, 'OK, check it out.')
Concerned Reader (Boston)
So you want Google/Bing to invest many billions of dollars to create a search engine service that you can use for free, but not for any way for them to make money?

Do you understand how business works?
itsmildeyes (Philadelphia)
Well, like you I pay for the New York Times through my subscription. Then, since you kind of suggested it, I looked (for free, according to you) at "A Brief History of the Internet & Related Networks" at internetsociety.org. If I can understand this, it looks like there are various funding sources; initially, it would seem the internet fell under the auspices of U.S. and European defense and science agencies (so, here anyway, the federal government). This article continues "The bulk of the system today is made up of private networking facilities in educational and research institutions, businesses and in government organizations across the globe."

I take your point (and you're 100% correct, I'm a poor student of capitalism), but it still seems to me it's not a strictly commercial enterprise. And, no matter who's running it, I don't think it's unreasonable of me to expect some level of privacy.

And while I'm at it, I'm not sure it necessarily follows that the commercial tech companies are necessarily the proper gatekeepers and directors of all aspects of the zeitgeist. An example of this would be Mr. Gates's promotion of the so-called Common Core curriculum. I mean, it could have been successful, I guess, but from what I understand there was not necessarily pedagogical study to back it up and it seems to have become something of a mess.

Regarding the cost of the search engines, I assume they are folded into the purchase price of my devices and provider services.
AB (Wisconsin)
Intrusive at worst and misleading at best. Anyone who wondered if privacy was being invaded by these big search companies now can fully understand that yes, it is.
Concerned Reader (Boston)
It should have been blindingly obvious to you that search engines keep track of your searches. For one thing--you can lookup your history.
Jessica (New York)
I've done several very extensive searches of medical conditions for friends and older family members. For ages after a web search on terminal kidney disease for a step-parent, ads for adult diapers trolled across the screen! Research for a friend with shingles leads to ads for itchy skin. I get the famous example of cholera deaths in London being traced to one polluted source, but that seems like a simpler triangulation..... Of course, by the act of commenting on this article, we are probably setting ourselves up for years of targeted and interested watching on the part of big Pharma.....
Bill (NJ)
"Microsoft Finds Cancer Clues in Search Queries", but keeps them secret? Sounds like the same approach to reporting Windows OS hacks, secrecy.
j (nj)
The problem is that pancreatic cancer shares symptoms with many other benign conditions. A backache or stomachache are usually indicative of nothing serious, but both have been associated with pancreatic cancer. However intriguing the findings, matching a search with those with a diagnosed disease is questionable research. My husband had a backache and stomachache for which he saw his doctor. The backache was quickly dismissed and so was the stomachache, because he drank at least 5 or 6 cups of coffee daily and was otherwise young and healthy. Unfortunately, he had pancreatic cancer and he died 31 days later.
Charles - Clifton, NJ (Clifton, NJ)
There are some interesting social ramifications to this work (the security aspect is separate, as it is on the internet in general. We want information to be available, and also want confidentiality and so forth). That people are using the internet to determine possible medical problems can imply that medical care could be expanded by offering additional diagnostic services to people.

When a patient consults his or her doctor for a physical, the doctor can listen to the patient for other signs. The patient might recall episodes of blurry vision, for example, something he or she might forget. But internet searches can reveal these symptoms earlier on, and in a more timely manner.

This work suggests a web service to which people subscribe; they can report concerns immediately rather than wait for physicals. HIPAA regulations apply, so there are privacy aspects to this service; we have them for financial transactions. The person's doctor also subscribes to this service, receiving information about his and her clients. A semantic/rules-based system can aid the doctor.

Thus the system transcends data-driven technology because it becomes a directed system that knows who the people are; it monitors the health of specific individuals.

John Markoff wrote a great book about J.C.R. Licklider and the foundation of ARPANET. We have come a long way in a short time.
Ian stuart (Frederick MD)
Those who conclude that this shows that you shouldn't do anything online because "they" will have access are making a bad mistake. What this illustrates is the need for data protection. Take a look at the Swedish law where any access to data concerning individuals requires a warrant and a compelling argument for access. In the US the major problem is that any law enforcement or security agency seems to be able to get access to individuals' data. Do I want my search queries to be freely available? No. Do I want to know if it might save my life? Yes!
A. Davey (Portland)
"The study suggests that early screening can increase the five-year survival rate of pancreatic patients to 5 to 7 percent, from just 3 percent."

Folks, polio vaccine this is not.
nap (nyc)
That's the survival rate after diagnosis. So of course early screening can increase the survival rate since it will detect more cancers at an earlier stage.
Richard Frauenglass (New York)
Absolutely correct. But if you had one of those groups to choose from, which would you rather be in, the 5% or the 3%? Choose wisely.
gracia (florida)
When I searched the symptoms it was for my dog who later passed away. As a data analyst I see flaws in the initial assumption that all searches would be related to people.
Shirley J. Grainger-Inselburg (Norwich, VT)
I agree. As a former medical school librarian, I am frequently asked to research medical topics for friends and family. In recent months, those online searches have included symptoms for both dogs and cats, as well as people. Also, as a certified water operator, I routinely conduct online searches on health hazards from various contaminants found in public water systems for others concerned about their water supply. It would not indicate that my own water system was contaminated.
A. Davey (Portland)
Why isn't Microsoft releasing the search terms or the symptoms?

It's simple. That is proprietary information Microsoft has mined from the data careless users have thrown away while using a "free" Microsoft service.

Chances are Microsoft will now turn around and monetize its findings by charging the public for important health information.

Big Data thrives by turning user-supplied data into gold. In a just world, Microsoft would have to pay royalties to the users who supplied the search terms that are the foundation of this announcement.
Luke Ward (Washington, DC)
Concerned Reader (Boston)
You are not the customer of Google or Bing. You are the product being sold to advertisers, and this has always been true.

Most users of Google/Bing are happy with this tradeoff because the search results are "free" to them. What you advocate would require you to fork over money for search results in order for you to become the customer. The market has rejected this approach.
Spilled Ink (Baton Rouge, LA)
Monetize is the key word for all the comments with a similar question. Does Google give away its search engine? Does Monsanto give away seeds? Do altruism and exploitation mix?
also MD (Zurich)
The article in question is buried behind a paywall, and viewing costs $30. This is a sad state of affairs for scientific publications, which are overwhelmingly funded by taxpayers' dollars yet cannot be accessed by the very same taxpayers. At the Swiss Medical Weekly (www.smw.ch of which I am the editor-in-chief), we use the Platinum Open Access model which mandates that all research papers are free for reading, and no charges are invoiced to the authors: the costs of publication are defrayed by philanthropy. See http://blog.smw.ch/scientific-publishing-in-the-times-of-open-access
Paula Robinson (Peoria, Illinois)
Excellent approach!
MedLibrarian (North Salt Lake, UT)
"What is the NIH Public Access Policy? The NIH Public Access Policy ensures that the public has access to the *published results of NIH-funded research*. It requires scientists to submit final peer-reviewed journal manuscripts that arise from NIH funds to PMC upon acceptance for publication. The Policy requires that these final peer-reviewed manuscripts be accessible to the public on PMC to help advance science and improve human health." Source- http://www.ncbi.nlm.nih.gov/pmc/about/faq/

NIH isn't the only government agency with such a mandate.

Kudos to the SMW for employing such an open access policy as well...
also MD (Zurich)
The problem is that conventional Open Access (as mandated by the NIH) provides a perverse incentive for journals to publish junk (author-pays model). This can only be eliminated by uncoupling the financial rewards from the publication. See my editorial linked above for my detailed reasoning.
Paul (Montclair, NJ)
What a terrible empty tease of an article. What were the common early queries? What are the warning signs? This article reveals nothing more than its headline.
ACW (New Jersey)
Click on the blue words 'pancreatic cancer' in the first paragraph. It won't tell you the common early queries in this study, but it will give you lots of information about pancreatic cancer, including a list of symptoms. Some symptoms may be more strongly indicative than others of pancreatic cancer (as opposed to some other condition), and that association is what the researchers are looking for.
dEs JoHnson (Forest Hills)
Paul: When did the NYT become a search engine?
chris (Belgium)
snap - how many times have I heard not to Google your health condition!
KB (New Haven, CT)
That's because, according to this article, everyone is Binging their symptoms.
Const (NY)
I’ve had friends who have been diagnosed with this horrible disease. The symptoms are vague, if any are present at all, and are more likely to be caused by some benign condition rather than pancreatic cancer. There is no way that consulting Drs. Google or Bing is going to catch pancreatic cancer at the earliest of stages. As for symptoms, abdominal pain, pain in the back, jaundice and persistent itchiness are all possible signs of pancreatic cancer. They are also signs of many other diseases. Catching pancreatic cancer early is going to require some yet to be discovered lab test; not a search engine algorithm.
ACW (New Jersey)
'There is no way that consulting Drs. Google or Bing is going to catch pancreatic cancer at the earliest of stages.'
That is not what the article says.
What the researchers are looking for is the reliability of a link between particular symptoms and pancreatic cancer. If people subsequently diagnosed with pancreatic cancer almost always have itching severe enough to prompt them to consult Dr Google, then it's a far more reliable indicator of the need to get tested than is, say, gas severe enough to prompt a Google search, if only half the persons who have pancreatic cancer searched for that symptom before diagnosis.
The researchers are, as I understand it, working backward. Draw a Venn diagramme (the kind with overlapping circles) and you'll get it.
The set is 'everyone who's definitely got the disease', and the subset is 'how many of the larger set had a particular symptom'. Not 'everyone who has this potential symptom' including those who haven't got the disease.
Nor does the research suggest an internet search equals a definite diagnosis - simply, preliminarily, that data-crunching is a method of investigation that may show useful patterns. Some symptoms may be more closely associated with pancreatic cancer than others, e.g., perhaps half of the subsequently diagnosed cases report severe gas, but 90% report itching - then the data show that itching is a more reliable indication that pancreatic cancer should be tested for.
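The "working backward" count described above can be sketched in a few lines. Everything here is hypothetical: the ten-person cohort is invented so that the shares match the comment's 90% and 50% figures, and `share_searching` is an illustrative helper, not anything from the study.

```python
# Hypothetical cohort: one set of searched symptoms per person who was
# later diagnosed. The data are made up to mirror the comment's figures.
diagnosed_searches = [
    {"itching", "gas"},
    {"itching", "gas"},
    {"itching", "gas"},
    {"itching", "gas"},
    {"itching"},
    {"itching"},
    {"itching"},
    {"itching"},
    {"itching"},
    {"gas"},
]


def share_searching(symptom: str, cohort: list[set[str]]) -> float:
    """Fraction of the diagnosed cohort who searched for the given symptom."""
    return sum(symptom in searches for searches in cohort) / len(cohort)


for symptom in ("itching", "gas"):
    share = share_searching(symptom, diagnosed_searches)
    print(f"{symptom}: searched by {share:.0%} of diagnosed cases")
```

This counts only within the set of diagnosed people, as the comment describes; it says nothing about how often people without the disease search for the same symptoms.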
Gloria (nyc)
So what are the early signs of pancreatic cancer?!?!
ACW (New Jersey)
Click through the blue words 'pancreatic cancer' in the first paragraph, for risk factors and signs.
Kathleen (Anywhere)
This is interesting and has the potential to be helpful, but is also kinda scary; for example, prospective employers might be willing to pay a lot to avoid hiring someone who has a high probability of a potentially high-cost and/or disabling disease. And online searches aren't always on the behalf of the searcher, but may also be conducted by family members or friends trying to understand what another person is going through, or even just by someone who has randomly read or viewed a media report.
Lauren (NYC)
There's no way an employer can access a potential employee's search data at this point. And trust me, I've worked at huge data companies. Even the most successful companies can barely deal with their OWN data. I don't see HR departments, which usually have the most tech-averse people in a company, swinging anything like this even if it was legal (and it's not).
BAndrews (Chicago)
Perhaps not, but what if I could, for instance, access your google search history by using your Google userid? The search history of individuals at large is unnecessary but what if I had a reason to target just one person, perhaps an ex-spouse, a congressman or the CEO of a company. That would indeed be of value.
Michele
Would have been nice if the article included a list of the pertinent search terms. I'm pretty sure everyone who reads this article will want to know, and it seems weird to omit them.
DSK (Madison WI)
As a data scientist who has mined healthcare claims data for the last two decades, I am excited about this research. I hope that this source of Big Data is made available to researchers for similar rich analyses.
Mountain Dragonfly (Candler NC)
Thanks Microsoft....don't share with the 40,000 people who will die from pancreatic cancer this year your findings, which MIGHT get them to a doctor when there MIGHT be a way to fight it. Instead, worry about the health of Microsoft as a financial institution that has finally discovered something unique in meta-data (yup, that same stuff the NSA looked at that caused such a fuss) that would be groundbreaking. Shore up your financial and proprietary walls before you save lives. This is corporate America at its best?
Susan (New York, NY)
To those commenting that want to know the symptoms, click where it says "pancreatic cancer" in blue in the first paragraph. It will take you to an area that shows the symptoms.
rnahouraii (charlotte)
Medical research doesn't work this way -- it's blinded placebo studies, not data mining.
noname (nowhere)
Eh? Research is research, it works in many different ways. "Blinded placebo studies", i.e. double-blind placebo-controlled studies, are only for new drugs in the final phase before FDA approval. Maybe 1% of medical research.
JK (Texas)
Kudos for the researchers thinking "outside the box". I would bet that this will turn out to be a promising adjunct to the so-called Evidence Based Medicine that is so in vogue now.
PaulB (Cincinnati, Ohio)
This article cries out for a sidebar on the warning signs of pancreatic cancer. Please.
John (Here)
"A logical next step would be to figure out what to do with that search information."

I'm betting the insurance industry would volunteer a few possibilities.
W (Houston, TX)
I looked up this paper online--it is not available through my medical library, except for purchase. Even the free Abstract does not get close to mentioning what the warning signs were that the researchers found with their searches. Given that this is a fairly obscure journal (h-index 25) that is not available in many libraries, I'm not convinced that the data are that significant.
ReaderAbroad (a)
This is an excellent idea and should be expanded.

About 10 years ago, I predicted that a person I know would develop Alzheimer's. I am sad to say, she has.

Here are the criteria I began to notice: a substantial number of such patients were control freaks (I do not mean that in the pejorative, but I also sort of do).

The internet can be a great tool in crowd sourcing data.
H (B)
You know, I've noticed that too. If people become very rigid where each task has to be done just so, and "it's my way or the highway", I think that's an early symptom.

Not sure what to think about people who are like that their whole life!
Rob79 (NorCA)
They intentionally didn't list the symptoms because they knew the first thing people would do is get anxious and search on them, distorting the results of their work.
NS (NC)
All I had to do was google "microsoft researchers pancreatic cancer" to get details on some of the early warning signs. From the Microsoft blog:

"Pancreatic cancer — the fourth leading cause of cancer death in the United States – was in many ways the ideal subject for the study because it typically produces a series of subtle symptoms, like itchy skin, weight loss, light-colored stools, patterns of back pain and a slight yellowing of the eyes and skin that often don’t prompt a patient to seek medical attention." https://blogs.microsoft.com/next/2016/06/07/how-web-search-data-might-he...

I don't know why NYTimes articles are often so thin and as a result sometimes misleading. I usually have to go to other news outlets to get the full story.
Concerned Citizen (Anywheresville)
And an earlier post listed totally DIFFERENT early symptoms of pancreatic cancer...suggesting that really, we don't know what might indicate pancreatic cancer or not.

Itchy skin, back pain, diarrhea (or light colored stools -- pretty much the OPPOSITE?), upset stomach, etc. That could be anything or nothing.
ml1357brown
This study appears flawed in many respects.
A 5-15% positive rate is not very good. More crucially, it is well known that retrospective data fitting of this nature is susceptible to "over-fitting," basically just finding chance associations that will not replicate on prospective data. This is why the flu study mentioned in the article flopped. I agree that there is great promise in exploiting "big data" for health management, but this is not a good example of how to do it.
Ron (Texas)
This article is empty. What are the signs they refer to?
Jaurl (US)
@Ron. Apparently many folks are missing the point of the article. This is a novel concept with interesting implications. This is not an article about the discovery of new early signs of pancreatic cancer. If you want to know what the early signs of pancreatic cancer are, do a search.
Steve Brown (Springfield, Va)
I am not sure why all this research into cancer, even as we know if cancer were eliminated, that would only increase life expectancy by about three years. It is certain that there are many other life adversities, that if addressed, would yield a more robust increase in life expectancy. But I guess researching cancer packs a certain cachet absent elsewhere.
McS (portland, me)
So, what are the findings? Why do half the story?
Candace (London, Canada)
Worked backwards from users who were diagnosed with pancreatic cancer - isn't that a big problem in using search data for early detection? How effective would this be if you started with all of the searches on symptoms (as another reader has commented, not given here) made on Bing? That is, how many people search the same symptoms on Bing and don't have cancer?
ACW (New Jersey)
I think you misunderstand. While there will undoubtedly be a lot of people who have the symptom (or something close to it) and turn out not to have pancreatic cancer, the question is how many people turn out to have pancreatic cancer who do not have the symptom. Say, the symptom is persistent gas unrelated to diet. Then the question is not 'how many people have gas and then turn out to have pancreatic cancer' - because there will always be a lot more people who get gas but not cancer. The question is, 'how many people with pancreatic cancer have gas before diagnosis' - if a close association can be shown, that severe gas is almost always a symptom, then it's a good indication of the need to test rather than just to write it off.
Paula Robinson (Peoria, Illinois)
Yet, how would a researcher identify any of those people-- who searched, what for, and what condition they ended up with?!

What's needed is a large-scale longitudinal study whereby ALL people of a given age were randomly sampled, with a host of symptoms checked for, and followed to see what later developed.

That way, you'd know both what % of people with disease X had symptom Y, but also what % of people with symptom Y had the disease.

I'm sure they chose not to list the symptoms because most people with those symptoms neither have nor will develop pancreatic cancer. But they should have included them, along with the statistical disclaimers in bold print!
John Smith (Crozet, VA)
Big Brother dons a white lab coat? Do we really want people sitting around analyzing our every internet search? Really? Think about it!
ACW (New Jersey)
The data were anonymized. That is, they began by finding people who had in fact been diagnosed with pancreatic cancer; then searched backward to queries about symptoms, looking for patterns. However, they didn't put names to those data points. They're looking for patterns.
Trust me: no one cares about your waffles.
John Smith (Crozet, VA)
The minute somebody says "trust me," I instinctively don't! Every tool can be used for good or for ill. While your goals and intentions may be laudable, I fear that others may turn similar techniques to less noble purposes.
Madeline Conant (Midwest)
They located specific individual people who were diagnosed with cancer and then tracked their past searches? This is being anonymized? Yeah, right. Maybe they didn't put their names in the final study, but the researchers knew exactly who these people were.

Never believe it when people in authority tell you that information they get from you is "confidential."
Cathy (Michigan)
Some readers wanted to know the early signs of pancreatic cancer. My dad has it and his earliest symptom was unusual amounts of gas. He talked with his doctor, who told him it was normal for an elderly person. In retrospect, it's obvious that the gas wasn't normal for him and should have been looked into. I don't know if that early symptom is well known in the medical community. If a search engine analysis helps raise awareness, that could help doctors avoid brushing off that symptom in the future.
KMH (USA)
NYT did NOT want to pay $30 for the article? This is not a public service. On the contrary, now a lot of readers will unnecessarily worry if they have the early symptoms. Bad move to provide partial information. Ironically, a study that uses consumer search models to reach broad conclusions blocks the public from getting the bottom line unless they pay upfront. Of course we cannot blame a money-hungry medical publication for cashing in on this opportunity.
Samsara (The West)
When I was studying journalism, we were taught the responsibilities of a reporter included answering the questions a reader would logically ask about the information in a particular story.

Many of the NY Times reporters apparently did not get the same training, because --all too often-- they produce articles with information gaps so large one could drive an airplane through them.

I join the other commentators in asking why in the world some of the chief early symptoms of pancreatic cancer were not included in this piece? Obviously, the researchers know what they are, perhaps better than most.

Of course this is the same newspaper that can use thousands of gallons of ink and forests of newsprint on "horse race" election coverage that makes no mention of the candidates' stands on issues critical to the electorate and the future of our country.

This is a sad time in the history of journalism which, at its best, can be a noble and essential component of a democratic nation.
Cathy (Hopewell Junction NY)
"The research focused on searches conducted on Bing, Microsoft's search engine..."

There's a flaw right there, limiting the search to one set of search engine results. It is as if the question of pancreatic cancer is ancillary to the process of determining if Microsoft has data to be mined usefully.

Information is the gold rush of our present era, with people trying to figure out ways to extract the dust from the silt. The problem is our lives are the silt that these companies are panning.

I wish I could trust that the result would be altruistic and beneficent.
hen3ry (New York)
If I can't afford the treatment all the offers in the world for care will do me no good. Furthermore, I may not want treatment or I may be looking for a friend or a family member or I may be looking out of curiosity. If we really want people to go to their doctors we need to fix our current healthcare system. Stop the narrow networks, the copays, the deductibles, the constant claim denials, the dropping of doctors which interrupts treatments, etc. In other words, simplify our healthcare system so we don't have to worry about being ill, paying, or going bankrupt. And stay out of my computer searches thank you very much!
jpduffy3 (New York, NY)
We are heading down a very slippery slope here. If we start monitoring search queries and draw inferences from them, we are starting a process of becoming thought police. Do we really want to go down that road?
billdaub (Home)
This is exactly what I thought when I read this article.
Nosovicki (NY)
Big data is not necessarily intrusive. For example, they can infer complex rules from anonymous on-line requests and distribute them as an off-line DB. That way, your questions will be absolutely private, while answers will be both targeted and as accurate as if you asked them online.
justwright (fort walton beach)
This research could provide a huge benefit; however, The Journal of Clinical Oncology Practice only allows access by paid subscription. Please allow this information to be released. Thanks.
Barbyr (Northern Illinois)
It's none of your business whether or not I have pancreatic cancer. I did not ask for your help, and I greatly resent your intrusion into my personal life. Butt out, now and forever.
Concerned Reader (Boston)
Why then would you use a service that told you in its Terms of Service that they would keep your data?
George Quinn (Pound RIdge, NY)
This seems like something that could possibly be the future of medical diagnosis, able to catch diseases, not just pancreatic cancer, before they become serious. The question is then, of course, one of privacy. An opt-in program would likely not be unethical in any way, but Microsoft's M.O. is usually opt-out, where a user has to dig through arcane options menus to be able to find a vaguely-worded field which allows them not to be tracked. And sometimes Microsoft doesn't even provide an option for that. That type of behavior preys on the technologically inept and naive, who are a large part of Microsoft's consumer base. This might seem like innocuous data to collect, but I wouldn't be surprised to see Microsoft selling it to ad companies. I would feel uncomfortable seeing targeted medication advertisements, and I suspect other people would as well.

Taking it to the logical extreme, is it ethical to forcibly collect this data, in an attempt to lower mortality rates? I'd suspect it is, but the backlash would be immense. Before charging on blindly, we must ask ourselves these ethical questions to assure that we don't stumble into having our securities voluntarily violated. And I, for one, wouldn't trust Microsoft to do it right.
John (Here)
Thankfully, there are alternatives to Bing and Google that do not track your searches, like DuckDuckGo. Unfortunately, their search results are not up to par when compared with Google or Bing.

Nevertheless, your point is spot on.
fact or friction? (maryland)
On the one hand, really cool. On the other hand, really creepy. Tech companies track everything - absolutely everything - you do online. And, they keep that data forever. The result is they know not just everything you do online, but absolutely everything about you.
Jane (East Granby and Niantc, CT)
That's right, and it's about time people realized that fact and altered their media behaviors accordingly. Any online actions you take, either through data search, use of social media or cellular communications are "out there", perhaps forever. This Data 101 class appears to have been skipped by the majority of users.
Don't give your Big Brother anything he could rat you out to your parents for....
Richard J. Siegel (New York City)
As important a medical breakthrough as there has ever been. With plenty of healthy competition from Google, IBM, others.
W (Houston, TX)
Um, so what are the warning signs? The least the article could have done is enumerate what the researchers found as warning signs.
memosyne (Maine)
I hope this all succeeds. Symptoms have a way of being ignored by patients until they are clearly unpleasant. People say: "Oh it was just that rich dessert." It has been axiomatic that Pancreatic Cancer is asymptomatic until too late. Early symptoms could help docs find and treat Pancreatic Cancer before it becomes a death sentence.
I happen to believe that patient awareness is very very important: knowledge of early symptoms can really help.
If this works out I hope an information campaign will ensue.
Concerned Citizen (Anywheresville)
But the alternative is a nation of hypochondriacs, who run to the doctor every time they have gas or indigestion.

It would have been helpful if the article had listed serious signs of pancreatic cancer.
Lynn (Greenville, SC)
"From there, they worked backward, looking for earlier queries that could have shown that the Bing user was experiencing symptoms before the diagnosis."

So what were the early queries? What were the early symptoms? Tell us what they found out.
Lynn (Greenville, SC)
So are the researchers now busy trying to find a way to enrich themselves by selling their discovery - the one they made using our information without paying us for it?
MissHunter (NYC)
I don't think the NYTimes (or Microsoft) wants to be responsible for tens of thousands of people freaking out because they all experienced blurred vision in their left eyeball at some point in the last ten years (for example)
Margo (New york)
Where are the results? This article describes process. I was looking for some of the early warning signs of this cancer as I continue on my health education path.
Reed Erskine (Bearsville, NY)
You can google "early symptoms of Pancreatic cancer" if you are "looking for early warning signs".
John (Philadelphia)
New onset diabetes mellitus, unexplained weight loss, diarrhea, fatty food intolerance, clots in legs.
Ian stuart (Frederick MD)
The whole point of the study was to find out if there were symptoms that were better indicators of actual pancreatic cancer than the existing ones.