
database

Tool of the Day: Google Refine


When it comes to working with and presenting data, Google reigns supreme. We’ve covered Google’s Chart Wizard, Google’s Public Data Explorer, and even ways to run a news website using Google Docs (with WordPress). Another of Google’s powerful data tools, Google Refine, lets users work with “messy” data sets and transform them into something amazing. Check out Part 1 of the Google Refine screencast.

Unlike Google’s general web-based data services, Google Refine is a standalone desktop application. Formerly known as Freebase Gridworks, the Google Refine tool has been used by the Chicago Tribune, data.gov.uk, and most famously by ProPublica for its “Dollars for Docs” investigation series from October 2010. Once you download and install the Google Refine tool, you interact with it through your web browser. You can create a new project from scratch, or you can import data sets from files stored on your computer. Once your data is imported, the real power of the tool comes through.


You can use facets and filters to create subsets of data, as well as format strings of data which match your search patterns. For example, if you see the term “as soon as possible” and “ASAP” in the same data set, you can reformat both data strings to match each other. For more complicated queries, you can use the Google Refine Expression Language (GREL) to create regular expressions and isolate substrings of data to separate columns.
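To make the idea concrete, here is a rough sketch in plain Python (not GREL, and not how Google Refine works internally) of the two operations described above: collapsing variants like “as soon as possible” and “ASAP” into one value, and using a regular expression to pull a substring into its own column. The sample rows, the column names `deadline`, `contact` and `email`, and the helper functions are all invented for illustration.

```python
import re
from collections import Counter

# Hypothetical sample rows; the column names are invented for this sketch.
rows = [
    {"deadline": "ASAP", "contact": "Jane Doe <jane@example.com>"},
    {"deadline": "as soon as possible", "contact": "John Roe <john@example.com>"},
    {"deadline": "As Soon As Possible ", "contact": "No email on file"},
]

# Map each known variant (trimmed and lowercased) to one canonical spelling.
CANONICAL = {"asap": "ASAP", "as soon as possible": "ASAP"}

# Matches an email address wrapped in angle brackets.
EMAIL_PATTERN = re.compile(r"<([^<>@\s]+@[^<>\s]+)>")


def normalize(value):
    """Return the canonical spelling for a known variant, else the trimmed value."""
    key = value.strip().lower()
    return CANONICAL.get(key, value.strip())


def extract_email(value):
    """Return the bracketed email address in value, or an empty string if none."""
    match = EMAIL_PATTERN.search(value)
    return match.group(1) if match else ""


def clean(rows):
    """Normalize the deadline column and split the email out into a new column."""
    cleaned = []
    for row in rows:
        new_row = dict(row)
        new_row["deadline"] = normalize(row["deadline"])
        new_row["email"] = extract_email(row["contact"])
        cleaned.append(new_row)
    return cleaned


def facet(rows, column):
    """Count the distinct values in a column, roughly what a text facet shows."""
    return Counter(row[column] for row in rows)
```

Running `facet(rows, "deadline")` before and after `clean(rows)` shows the three spellings collapsing into a single “ASAP” group, which is the kind of tidy-up a facet makes easy to spot.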

Once you’re done with formatting your data, Google Refine lets you export your work in a number of different formats, including as an Excel spreadsheet, an HTML table, or as JSON data, which you can change to match a wiki-style format. Google Refine also lets you hook into open web services, such as Google’s Language Detection Service or the open map service Nominatim.

Google Refine is a free download and is available for Windows, Mac, and Linux.

7 Places To Look For Database Journalism Stories

There’s a joke in reporting that one person’s an anecdote and three’s a trend. It’s not really funny, though, because too many stories rely on this metric to prove something’s happening or happened. There’s a better way; it just takes some digging, maybe a FOIA request, and some minimal database skills (which is another topic, but if you’re really serious, look into IRE’s training or, if you’re still in school, take a computer-assisted reporting course, which your school ought to require).

By analyzing databases on topics on your beat, you can find the real trends and back them up with statistics. Your job as a journalist is to make those numbers and statistics meaningful. (But don’t force the story; sometimes the data doesn’t support your hypothesis. It hurts, but it happens.)

Here are a few places you can find data that will help you support your stories with facts instead of trends.

Data.gov: This site will probably just overwhelm you with the sheer quantity of information. The hard part will be picking through what’s there for what’s relevant. But you can find some interesting federal government data, including everything from military marriage trends to consumer spending to climate change, if you dig. You can sort by the type of data, the department that collected it, the category, location, topic, and more. At least try a few searches to see what’s what, and whether it leads to or fits in any of your stories.


7 Innovative online maps

The technology that is paired with online maps is constantly improving, which means the ways media organizations are using them have become more diverse. Check out a few online maps that are furthering what’s possible with map mashups.

 

Ratio Finder

This eye-catchingly designed map analyzes Foursquare check-ins and visualizes them by gender. Visitors can use the site to compare where male and female users check in and what type of businesses they are most likely to check in to. The site is available for San Francisco and New York.

 

IfItWereMyHome.com

This site allows the visitor to compare the standard of living in the United States to pretty much any other country around the world and see how they differ. For example, if Germany were your home instead of the U.S., you would statistically consume less oil, have fewer babies, and have lots more free time, according to the site. Each page includes a map that shows a scale image of the country overlaid on top of a map of the United States.

 

Home and Away: Iraq and Afghanistan Casualties

Behind CNN’s flashy interactive map is a sobering message: the large number of casualties in the two war-torn countries. The dual maps and accompanying charts show data like the hometowns of the deceased, where they were killed, and when.

 

Products of Slavery

This map of the locations where child labor happens around the world presents a complex issue in a way that is very simple and easy to understand. Site visitors can view the top 25 countries where products are made with child labor and also toggle between the map view and several graph views.

 

MurderMap

Much like the homicide databases produced by the Los Angeles Times and Stamen Design, MurderMap aims to visualize homicides in London. Visitors can toggle the map by murder weapon and click each marker to view more information about the victim.

 

Mapping America: Every City, Every Block

This New York Times map, which displays census data on race in America, is most notable for showing just how many neighborhoods are clearly divided by race. For example, Manhattan’s 95th Street has mostly White residents on one side and Black and Hispanic residents on the other. Los Angeles’ Santa Monica Boulevard creates a similar divide: a large percentage of residents who live north of the boulevard are White, while the majority of those who live south of it are Hispanic, as evidenced by the colored dots.

 

What’s in a Surname?

National Geographic elevates the word cloud with this map that shows popular surnames by location. “Smith” is a popular last name in most of the country — especially in the eastern United States — while Garcia and Hernandez are popular in the West and Southwest, according to the map.

5 sites to 'follow the money' in politics

by Ethan Klapper

With the midterm elections just around the corner, here are some great resources for journalists who cover government and politics to track campaign finance, lobbying and related information.

1. Federal Election Commission

It might not be the prettiest site, but the campaign finance data you see somewhere else on the Web likely originates here. This site is useful because of the sheer amount of data dumps it offers from its disclosure data catalog. Seven sets of data are offered here, ranging from “Lobbyist/Registrant Committee Statement of Organization” to “Administrative Fines.” Of course, you’ll also find “Candidate Summary” which contains general financial information about candidates.

2. Influence Explorer

A project of the Sunlight Foundation, Influence Explorer crunches the FEC data and makes it digestible for the average user. It displays a number of attractive, colorful graphs detailing the source of a politician’s political contributions. Users can also sort by company or industry and look at lobbying information.

3. OpenSecrets

While not as attractive as Influence Explorer, OpenSecrets offers more features. With OpenSecrets, you’re able to track where members of certain congressional committees receive their donations, by industry. The site also features a lobbying disclosure database and information about political action committees. It also tells you, by cycle, who ran the most and least expensive campaigns. OpenSecrets is a project of the Center for Responsive Politics.

4. Follow The Money

While FEC data is useful for tracking those seeking federal office (House, Senate, presidency), it does not exist for candidates seeking state or local elective office. Follow The Money, a project of the National Institute on Money in State Politics, aggregates the campaign finance data from local jurisdictions across the country and presents it in an easy-to-use format. It also offers a handy API and some widgets.

5. LegiStorm

Journalists love this site, while Capitol Hill staffers notoriously hate it. Why? With LegiStorm, you can look up the salary of everyone who works on Capitol Hill, from the staff assistant to a first-term congressman to the chief of staff to a powerful senator. Financial disclosure forms for senators, members of Congress and staff are available. In another database, you can search foreign trips that were funded by private organizations. Even more databases have information about lobbying and foreign gifts. LegiStorm is a for-profit website.

The sites here offer lots of information useful to both application developers and journalists on deadline. What’s your favorite site? Please share in the comments.

Get Schooled: 6 Education-themed news databases

After the release of the Los Angeles Times’ teacher database that presented information on the effectiveness of hundreds of area teachers, many journalists’ eyes were opened to the possibility of using databases for education reporting.

The Times’ project is one of many education-related databases produced in the last several years that have transformed publicly available documents into useful and usable resources for readers. The examples below have much in common: they offer search fields for school, ZIP code, etc., keep their interfaces simple, and eschew the clunky tables that are often used in online news reporting but are usually hard to follow or absorb.

Chicago Tribune: 2009 Illinois School Report Cards

The Tribune’s project is a straightforward examination of area schools that contains searchable information such as class size, test scores, and household income, the kind of information parents and other concerned folks are likely to search for.

The Washington Post: Fixing D.C.’s Schools

This 2007 project from The Post is still a model for other news databases. Its easy-to-use interface makes searching the tons of available data on student and teacher proficiency, crimes, health code violations and more a cinch.

The Los Angeles Times: California Schools Guide

“Grading the Teachers” isn’t the first education-related database created by the Times. In 2008, the Times debuted Schools Guide with test scores, enrollment data, and a slew of other information on hundreds of L.A.-area schools.

USA Today: The Smokestack Effect – Toxic Air and America’s Schools

A schools-related database doesn’t have to focus exclusively on education. This 2008 project by USA Today illustrated how industrial pollution affected nearly 128,000 schools around the nation. Each school is ranked by percentile, indicating how many other schools have worse pollution problems than the selected school.

The New York Times: Diversity in the Classroom

This NY Times database takes a unique approach and examines the effect of immigration on American classrooms. The available charts show how the number of students of color has changed over the years, and readers can drill down by state, county, and school district.

The New York Times: New York School Test Scores

Like the previously mentioned databases, this one has the standard search tools for county and ZIP code, but it is also notable in that it links to the largest schools in the region on the topmost page. This allows a large percentage of viewers to go directly to the test score information for that school. Once a school is selected, there is a plethora of information about how that school’s students fared on standardized tests.

Databases like the ones mentioned here take lots of time, effort, and resources to develop, so why should you endeavor to create one? Well, besides the millions of potential page views, education-related databases provide a public service: they give schools, teachers, parents and students information that is hard to find anywhere else.

 