Google makes an estimated 95% of its revenue from online advertising through programs such as AdWords and AdSense.  Obviously, they’re going to do a phenomenal job connecting web searches with related advertising, which requires a lot of tracking of the what, when, and where of web searches.  One of the byproducts they have released to the public is Google Trends.  Since my thesis centers on gastrointestinal illness, I plugged in “rotavirus” when I first heard about it.

Google Trends tracking for use of "rotavirus" in the U.S.

Seasonal Variation in Google Search Term "rotavirus".

This shows a surprisingly seasonal trend.  Fourier analysis of the search-term series (using data gleaned through digitization for the 2004-2006 seasons, the only information available the last time I did this) showed Cp peaks at periods of 204 and 345 days.
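For anyone who wants to try this kind of analysis, here is a minimal sketch of picking out dominant periods from an evenly sampled search-interest series with a plain discrete Fourier transform. It is not the code I used for the numbers above. The function names, the weekly sampling interval, and the synthetic series are all assumptions for illustration, and no real Google Trends data is included.

```python
import cmath
import math


def power_spectrum(values, sample_spacing_days):
    """Return (period_in_days, power) pairs for an evenly sampled series.

    Hypothetical helper: a direct O(n^2) discrete Fourier transform,
    which is fine for a few hundred weekly samples.
    """
    n = len(values)
    mean = sum(values) / n
    # Remove the mean so the zero-frequency term does not dominate.
    centered = [v - mean for v in values]
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        period_days = n * sample_spacing_days / k
        spectrum.append((period_days, abs(coeff) ** 2))
    return spectrum


def dominant_periods(values, sample_spacing_days=7.0, top_n=2):
    """Return the top_n periods (in days) with the largest spectral power."""
    spectrum = power_spectrum(values, sample_spacing_days)
    spectrum.sort(key=lambda pair: pair[1], reverse=True)
    return [period for period, _ in spectrum[:top_n]]


# Synthetic weekly series (assumed, not real data): three full cycles of a
# 364-day seasonal signal sampled every 7 days.
series = [50 + 30 * math.cos(2 * math.pi * (7 * week) / 364.0)
          for week in range(156)]
print(dominant_periods(series, top_n=1))
```

Note that the only periods this can report are the total record length divided by a whole number, so with just a few seasons of data the peaks are coarse and should be read as approximate.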

Why?  I can think of two reasons.  One: sensationalist media coverage warns of upcoming rotavirus epidemics because it is a predictable seasonal disease, which increases interest in it as a search term.  Two: doctors diagnose rotavirus as a possible cause of their patients’ gastroenteritis, and those patients become more interested in learning about it.

If the second case is true, then you have a real-time, easily accessible proxy for the number of rotavirus diagnoses in a given region (in the above case, the entire U.S.).  Instead of requiring the headache of a giant linked database where doctors, nurses, and patients enter disease incidence, you only need people to type the search term “rotavirus” into a Google search box, and you have “logged” an inquiry about the disease.

Google, thankfully, has now caught on to the link between interest in searching for an illness and the level of the disease with Google Flu Trends.

This is incredibly awesome.  But there is a lot more to be done.  One major advancement would be moving away from the artificial and meaningless boundaries of states and pushing for a more geographically relevant mapping.  For example, with today’s travel patterns, a flu epidemic in New York City is a lot more relevant to Los Angeles than one in Arizona.  A good use of someone’s time might be developing meaningful boundaries based on interactions between people in those locations.  I don’t know if this has been done, but something similar has been proposed.

Additionally, this demonstrates how Google’s existing web infrastructure could be used to implement an easy and efficient method of geographically databasing existing diseases.  How easy would it be to ask a nurse to open a website (say, something like diseasetracker.google.org) and literally just type in the diagnosis (e.g. breast cancer)?  Or, better, ask patients to log in from their homes to get more spatially relevant data.  The tracker immediately logs the location associated with the IP address along with the diagnosis.  Instant illness-related GIS.  Privacy shouldn’t be a concern.  Google could obscure the location of the IP address to within 10 miles to protect patient identity.  Even with only 10-mile resolution we would have a significantly better system in place to track clusters of disease.
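To make the 10-mile idea concrete, here is a rough sketch of how a reported location could be snapped to the center of a grid cell about 10 miles on a side before it is stored. Everything here is hypothetical: the function name, the 69-miles-per-degree approximation, and the assumption that the IP address has already been resolved to a latitude and longitude by some other service. It says nothing about how Google actually handles location data.

```python
import math

# Approximate length of one degree of latitude, in miles (assumption).
MILES_PER_DEGREE_LAT = 69.0


def coarsen_location(lat, lon, cell_miles=10.0):
    """Snap a point to the center of a grid cell roughly cell_miles wide.

    Hypothetical helper: every point inside the same cell returns the same
    coordinates, so the stored value never reveals more than the cell.
    """
    lat_step = cell_miles / MILES_PER_DEGREE_LAT
    coarse_lat = (math.floor(lat / lat_step) + 0.5) * lat_step

    # A degree of longitude shrinks with the cosine of the latitude. Using
    # the already-coarsened latitude keeps the step identical for every
    # point in the same row of cells.
    miles_per_degree_lon = MILES_PER_DEGREE_LAT * math.cos(
        math.radians(coarse_lat)
    )
    lon_step = cell_miles / max(miles_per_degree_lon, 1e-6)
    coarse_lon = (math.floor(lon / lon_step) + 0.5) * lon_step

    return round(coarse_lat, 4), round(coarse_lon, 4)


# Two nearby points in Los Angeles fall in the same cell; a point in
# New York City does not.
la_a = coarsen_location(34.00, -118.25)
la_b = coarsen_location(34.02, -118.24)
nyc = coarsen_location(40.71, -74.01)
print(la_a == la_b, la_a == nyc)
```

Snapping to a fixed cell, rather than adding random noise to each report, means that repeated reports from the same address cannot be averaged to recover the original location.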

What will they come out with next?