28 GB Of Raw Data Went Into California Watch’s Award-Winning “Decoding Prime” Series

Of all the winners announced this week for the 63rd annual George Polk Award, California Watch’s “Decoding Prime” series is the one that catches my eye.

California Watch, a project of the Center for Investigative Reporting, is only in its third year of existence after launching in 2009. The organization is joined by long-established names on the winners list like The New York Times, The Wall Street Journal, The Boston Globe, ABC 20/20, Bloomberg and The Associated Press.

So how does one brand-new organization compete with years of legacy? To start, try 51 million patient records, or about 28 gigabytes of raw data. That’s how much information was analyzed for the yearlong series of investigative stories, which revealed a pattern at a California-based hospital chain of billing Medicare for numerous rare medical conditions in order to collect high-paying bonuses.

For example, out of 468 cases of “autonomic nerve disorder” reported in California in 2010, 360 were reported by Prime Hospitals, 90 times more often than the average hospital. The reason for such an ambiguous diagnosis? The Medicare reimbursement rate for autonomic nerve disorder can be up to $12,500, whereas simply calling it a “fainting spell” reimburses only $7,000. This is just one of many examples of unusual diagnoses used to tap into Medicare’s gold mine.
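For readers who want to check the arithmetic, here is a minimal Python sketch that uses only the figures quoted above. The variable names are my own, and the reimbursement gap is an upper bound, since the higher rate is described as "up to" $12,500.

```python
# Figures quoted in this post; variable names are illustrative.
total_cases = 468        # "autonomic nerve disorder" cases in California, 2010
prime_cases = 360        # cases reported by Prime Hospitals
rate_disorder = 12_500   # reimbursement for autonomic nerve disorder (up to)
rate_fainting = 7_000    # reimbursement for a "fainting spell"

# Share of the statewide cases that came from the one chain.
prime_share = prime_cases / total_cases

# Largest possible per-case difference between the two diagnoses.
extra_per_case = rate_disorder - rate_fainting

print(f"Share of statewide cases: {prime_share:.1%}")
print(f"Maximum extra reimbursement per case: ${extra_per_case:,}")
```

This only restates the numbers in the post. The "90 times more often" comparison depends on per-hospital case counts that are not given here, so the sketch does not try to reproduce it.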

Pulitzer Prize-winning computer-assisted reporting specialist Stephen K. Doig partnered with California Watch to analyze the more than 51 million patient records of hospital and emergency room admissions from 2005 through 2010. Doig wrote up a synopsis of his involvement in the investigation:

This kind of evidence would be impossible to gather from a warehouse full of file drawers filled with millions of pieces of paper; finding the telltale patterns in a mountain of documents is beyond the human attention span. But in this age of electronic public records, seasoned reporters who know how to use powerful computer tools can see not only the trees, but the whole forest. As an investigative reporter, it’s wonderful to use such tools to uncover problems that otherwise might remain hidden. But as a taxpayer, I often wish government agencies would be doing the same kind of analysis.

A few other highlights from that writeup:

  • The power of the series came from pairing solid, fact-based data analysis with tips from sources.
  • The team used the extensive and well-documented collection of public data maintained by California’s Office of Statewide Health Planning and Development.
  • During the year of the investigation, Doig wrote about 120 SAS programs to analyze the more than 51 million records.
  • For one analysis, Doig used SAS to count how often more than 6,400 diagnosis codes were used in the cases of about 750,000 individual patients, creating a huge database of more than 14.6 gigabytes.
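To give a feel for the kind of counting described in that last bullet, here is a rough sketch in Python rather than SAS. It is not Doig's code. The hospital names, patient IDs and diagnosis codes are invented toy data, and the helper functions are hypothetical names of my own.

```python
from collections import Counter, defaultdict

# Hypothetical toy records: (hospital, patient_id, diagnosis_code).
# Names and codes are invented for illustration, not taken from the state data.
records = [
    ("Hospital A", 1, "D1"),
    ("Hospital A", 2, "D1"),
    ("Hospital A", 3, "D2"),
    ("Hospital B", 4, "D2"),
    ("Hospital B", 5, "D3"),
    ("Hospital C", 6, "D1"),
    ("Hospital C", 7, "D3"),
    ("Hospital C", 8, "D3"),
]


def code_counts_by_hospital(rows):
    """Count how often each diagnosis code appears at each hospital."""
    counts = defaultdict(Counter)
    for hospital, _patient_id, code in rows:
        counts[hospital][code] += 1
    return counts


def rate_per_hospital(counts, code):
    """Share of each hospital's cases that carry the given code."""
    return {
        hospital: per_code[code] / sum(per_code.values())
        for hospital, per_code in counts.items()
    }


counts = code_counts_by_hospital(records)
rates = rate_per_hospital(counts, "D1")

for hospital, rate in sorted(rates.items()):
    print(f"{hospital}: {rate:.0%} of cases coded D1")
```

A hospital whose rate for one code sits far above everyone else's is the sort of outlier the series went looking for. At the scale of the real dataset, a dictionary in memory would likely need to give way to a database or a tool like SAS.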


Read Doig’s full writeup, check out the full series (60+ articles) and see some of the data analysis. Congrats to California Watch and all the other winners.
