Excerpt: Google: The Missing Manual

Every journalist knows how helpful Google can be. But a new guide teaches some even more useful tricks to using the search engine.

By Sarah Milstein and Rael Dornfest - June 4, 2004

Most of the time, when you run a Google search, it simply works. You type in The Simpsons, press Enter, and you've got all the character bios, episode guides, and Bart hood ornaments you could ever want. That's the beauty of Google.

But what about when you want to find something trickier? Say you're writing a story on deadline for tomorrow, and you need a quote from an expert in negotiations. You have two potential sources but neither of them is returning your calls. A friend, however, recently raved about a negotiations trainer his company brought in from a place called something like Watershed Consultants. So you run a search for Watershed Consultants... and Google gives you 120,000 results—all about saving ecosystems and revitalizing rivers.

Your friend is on vacation, and your story can't wait. What do you do?

As you may know, a Google search for Watershed Consultants gives you every page that mentions both those terms. If you want to find only pages that contain the exact phrase, put quotes around it, like this: "Watershed Consultants". That narrows your results from 120,000 to about 670, not an insane number to sift through. But if "Watershed Consultants" is not the exact name of the firm, you're out of luck. You could try the same trick with "Watershed Consulting," but again, you'd need an exact name match.

Instead of quotes, you could add terms to your query to narrow things down. Since you're looking for an expert in negotiation, and your friend works in Washington, D.C., rebuild your search like this: Watershed Consultants negotiation DC. But that still leaves you with more than 8,000 results. Time to tell Google to ignore the pages with words you don't want. Use the minus sign before extraneous terms (you can have up to ten words in a Google query), like this: Watershed Consultants negotiation DC -water -river -ecological -aquatic -environmental -conservation. Now you're netting a very manageable 86 results—and the first one is Watershed Associates, a Washington firm that consults on negotiating skills. A watershed moment, you might say.
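For readers who script their searches, the narrowing steps above can be sketched in a few lines of Python. This is only an illustration: the helper names (build_query, search_url, MAX_QUERY_WORDS) are made up for this example, the ten-word limit is simply the figure quoted above, and the search URL layout is an assumption rather than something the book describes.

```python
from urllib.parse import urlencode

MAX_QUERY_WORDS = 10  # limit quoted in the article; an assumption, not a checked fact


def build_query(include, exclude=()):
    """Join wanted terms and minus-prefixed unwanted terms into one query string."""
    words = list(include) + [f"-{term}" for term in exclude]
    if len(words) > MAX_QUERY_WORDS:
        raise ValueError(f"query has {len(words)} words; the limit is {MAX_QUERY_WORDS}")
    return " ".join(words)


def search_url(query):
    """Percent-encode the query into a search URL (the URL layout is an assumption)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


query = build_query(
    ["Watershed", "Consultants", "negotiation", "DC"],
    exclude=["water", "river", "ecological", "aquatic", "environmental", "conservation"],
)
print(query)
print(search_url(query))
```

The example query uses exactly ten words (four wanted, six excluded), which is why the sketch counts the minus terms toward the limit.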

Here are some more Google tips to help you search smarter and faster.

• Wildcards. Wildcards are special symbols—usually an asterisk (*) but sometimes a question mark (?)—that you add to a term to indicate that you want the search to include variants of the term. The wildcard stands in for the possibilities. For example, if you're not sure whether the Culture Club singer was Boy George or Boy Gorge, you might search for Boy G* to see how other people have completed the word.

But Google doesn't let you include a wildcard as part of a word like that. Which, frankly, is a drag.

Google does, however, offer full-word wildcards. While you can't insert an asterisk for part of a word, you can throw one into a phrase and have it substitute for a word. Thus, searching for "chicken with its * cut off" could find: "chicken with its head cut off," "chicken with its hair cut off," "chicken with its electricity cut off," and so on. (A single asterisk stands in for just one word. To set wildcards for more words, simply include more asterisks: "three * * mice" leads to "three blind fat mice," "three very tough mice," and so on.)

The full-word wildcard can come in handy for filling in the blanks and when your memory fails. For example, you've always wondered exactly what Debbie Harry was singing in the first line of "Heart of Glass." You think it might have been, "Once I had a lung and it was a gas," but you're not sure. Maybe it was "Once I had a lunch and it was a gas." Type in "Once I had a * and it was a gas"; Google gives you 416 links suggesting the lyric is actually "Once I had a love…." In short, the asterisk combined with quote marks can be good for finding quotations, song lyrics, poetry, and other phrases.

The full-word wildcard is also cool when you want the answer to a question. For example, if you're wondering how often Halley's comet appears, you can use the asterisk to stand in for your X factor by running this query: Halley's comet appears every * years. If you type your query as a question ("How often does Halley's comet appear?"), then Google searches for instances of the question, which is a nice way to find other people with a thin knowledge of astronomy.
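To see the one-asterisk-per-word rule in action without running a search, here is a rough Python sketch that mimics it on plain text. It is not how Google works internally; the function name wildcard_to_regex is invented for this example, and "word" is assumed to mean a run of letters, digits, or underscores.

```python
import re


def wildcard_to_regex(phrase):
    """Compile a phrase in which each standalone * matches exactly one word."""
    parts = [r"\w+" if token == "*" else re.escape(token) for token in phrase.split()]
    return re.compile(r"\b" + r"\s+".join(parts) + r"\b", re.IGNORECASE)


pattern = wildcard_to_regex("chicken with its * cut off")
samples = (
    "like a chicken with its head cut off",     # one word fills the blank
    "chicken with its cut off",                 # nothing fills the blank
    "chicken with its very long tail cut off",  # three words, one asterisk
)
for text in samples:
    print(bool(pattern.search(text)), "-", text)
```

Only the first sample matches, because a single asterisk cannot stand in for zero words or for several.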

• The cache. As Google tracks web pages, it keeps copies of them in a repository called a cache. In a Google results listing, the page title link takes you to the current site, but if you click the Cached link, Google takes you to the copy it made when it recorded the page. Google rerecords most pages every few weeks. This time difference is significant because if a page has changed recently, you can still see a slightly older version, which might include the nugget you're looking for or some info you remember from a previous visit. (Webmasters can set up a site so that Google won't cache it. As a result, you might not be able to reach a previous version of every page you find in a list of Google results. In such cases you simply won't see a Cached link.)

Google's cache is also handy when a page you need has been deleted or its link is broken. Just click the Cached link, and Google takes you into its time machine.

Google's cache feature is notorious for bringing deleted web pages back from the grave. For example, in early 2003, Microsoft accidentally published activation codes on its site that let people use its software. Googlers are still hailing the cache feature for helping them find those codes for a couple of weeks after Microsoft pulled down the offending pages.

But the cache isn't a cure-all for web staleness. First of all, a cached page only lasts until Google rerecords the live page, usually every few weeks. Second, cached pages often include dead links. So if you're reading a hot article on a cached page, and it flows to a second page, clicking the Next Page link may get you nowhere. And third, sometimes Google updates the cache before it updates the snippet, so your result listing may include some text you want but that isn't even in the cache anymore. Consider yourself forewarned.

(The Internet Archive's Wayback Machine, http://web.archive.org, is a public archive of the web. Unlike Google, however, it keeps track of web sites in perpetuity, making it a kind of permanent cache. It's a great resource when you need to find a site that's been defunct for more than a few weeks and has therefore fallen off Google's radar.)

• Searching by Town. Google Local is a hybrid of the Google index and standard phone book data. When you include in your search an address with city and state or ZIP code, Google crosschecks its index against various online Yellow Pages, generating a batch of results from your specified area only. Because it incorporates its own relevance rankings, too, Google sometimes lists a place ten miles away from you before a place only two miles away. Still, it's a super-handy search tool.

You can run a Google Local search two ways:

• From the regular search box. Just type in your search terms and an address with city and state or ZIP code, and Google includes a few local links, signaled by a compass icon, at the top of your regular results. Click the link that says "Local results for..." to get a full listing.

• From the Google Local page at http://local.google.com/lochp. When you run a search here, you get a full page of results listings.

• Patents, Tracking IDs, and Other Numeric Goodies. Hardly anyone knows this, but Google lets you search for numbers on the web. And not just any numbers, but package tracking IDs, U.S. patent numbers, universal product codes, FCC equipment ID tags, flight numbers, FAA airplane registration numbers, vehicle identification numbers, and maps by area code. When it comes up with a match for your number, it shows you a special listing at the top of your results page.

The numeric service is new and includes just the quirky searches described above. Still, when you need to look up those numbers, this feature can save you a mess of clicking around the complex web sites of delivery services and government agencies. Here's how to run the specific searches:

• UPS, FedEx, and U.S. Postal Service tracking numbers. Looking up package tracking numbers and finding out whether your Lands' End long underwear is stuck in a warehouse in Kentucky has long been a major benefit of the web. The process just got easier. Simply type your tracking number in a blank search box, and Google provides a link to a web page with your item's transit history.

• Patent numbers. If you look up patent numbers regularly, or ever, you know the U.S. Patent and Trademark Office has a nice, thorough web site that makes you jump through a lot of hoops to find a patent by number. Stave off a few gray hairs by using Google to look them up instead. Just preface the number with the word patent, like this: patent 5123123.

• Universal product codes (UPCs). For some basic information on consumer products, like their manufacturer, try looking up the UPC, like this: 036000250015 (no need to include UPC first). Most of the time, you can find UPCs under an item's barcode.

• Federal Communications Commission equipment ID numbers. If you're an engineer at a wireless phone company, and you want inside info on a competitor's product, check out the FCC's database. To get there, type fcc into Google, followed by the ID number, like this: fcc G9H2-7930.

• Flight numbers. Want to find out if your cousin's flight from Ottawa is on time? Check flight status by typing in the airline and flight number, like this: usair 50.

• Federal Aviation Administration airplane registration numbers. If you're the head of a startup airline, and you're considering buying a used plane from one of the big industry players, this feature is for you! Just type in the registration number directly, like this: n233aa, and Google gives you a link to the FAA site with some details about the manufacturer and history of that plane.

• Vehicle identification numbers (VINs). If you're buying a used car, you can use the VIN to learn more about that individual auto's history (the VIN is usually on a small metal tag at the bottom edge of the windshield). Type in a number, like this: JH4NA1157MT001832, and Google provides a link to the Carfax info for that car.

• Maps by area code. Type in an area code, like 212, and the top of your Google results will include a link for a Mapquest map of that region. The maps generally cover a larger area than the area code, but they can give you a sense of whether 609 is in New Jersey or Idaho.
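If you want a feel for how differently shaped these numbers are, the sketch below guesses which special search a query resembles, using the sample queries from the list above. Every pattern here is a loose assumption made for illustration (for instance, that a VIN is 17 letters and digits without I, O, or Q); none of it reflects Google's actual matching rules, and package tracking numbers are left out because their formats vary by carrier.

```python
import re

# Rough, illustrative patterns only; assumptions, not Google's actual rules.
# Order matters: the first pattern that matches the whole query wins.
PATTERNS = [
    ("patent", r"patent\s+\d+"),
    ("fcc equipment id", r"fcc\s+\S+"),
    ("upc", r"\d{12}"),
    ("vin", r"[A-HJ-NPR-Z0-9]{17}"),
    ("faa registration", r"n\d{1,5}[a-z]{0,2}"),
    ("area code", r"\d{3}"),
    ("flight", r"[a-z]+\s+\d{1,4}"),
]


def guess_number_search(query):
    """Return a label for the special search the query resembles, or None."""
    cleaned = query.strip()
    for label, pattern in PATTERNS:
        if re.fullmatch(pattern, cleaned, re.IGNORECASE):
            return label
    return None


for sample in ("patent 5123123", "036000250015", "fcc G9H2-7930", "usair 50",
               "n233aa", "JH4NA1157MT001832", "212", "The Simpsons"):
    print(f"{sample!r}: {guess_number_search(sample)}")
```

An ordinary query such as The Simpsons falls through every pattern and comes back as None, which corresponds to a plain search with no special listing.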

• Advanced Search. Google keeps track of text in the body of a page, in the URL, in the links to other pages, and in the title (which is different from the URL). On the Advanced Search form (which you can reach from Google's home page or any results listing), the Occurrences pop-up menu lets you tell Google when you're looking for results from a specific place on a Web page. Here's when you might want to use these options:

• In the title. A web page's URL is not the same thing as its title. A URL is an address that your computer can read, and sometimes you can read it, too (for example, www.npr.org). But often, URLs are super-long and contain a slew of characters and symbols that make no sense unless you're a droid. In those cases, it's useful when a page has a separate, readable title that a Webmaster has written to help you understand what's on that page. The first line of a Google result is usually a page's title, not its URL.

A word that's mentioned in the title of a page is more likely to indicate what's on that page than a word that shows up randomly in the text. For example, a page called "File-sharing for fun and profit" is more likely to explain how to go about file-sharing than a page that simply mentions it as part of another discussion. Use this feature to get a smaller, more focused list of results.

• In the text. Asking Google to ignore titles, URLs, and links is useful when you want to search for keywords or phrases that are likely to show up all over the place. For example, if you want only sites that discuss those bumpkins known as yahoos, and you don't want pages from Yahoo.com or links to that site, use this feature to filter out references to the Web site.

• In the URL. Want to find out how many sites have already used the word "sneaker" in their URLs? Here's the place to check. Happily, this feature does not limit you to simple Web addresses, like www.sneaker-nation.com; it also produces more complex results, like www.cynosure.com.au/isp/sneaker. (Searching for a term within a URL only yields results with whole words. In the example above, Google would give you back www.sneaker-fetish.com or www.sneaker.fetish.com, but not www.sneakerfetish.com.)

• In the links. This feature simply searches for the text in hyperlinks that connect pages. It's useful in two situations. First, if you want to find out what pages have links to a certain person, phrase, or site, the "in links to the page" option can give you a rough idea. (The text of a link may have nothing to do with the page it links to. Most commonly, you see sentences like, "To read about Barry Bonds, click here." If "here" is the text for the link, your search for "Barry Bonds" isn't going to bring up this page.)

Second, the links search can help you find a person's email address, because on most web pages, an email address is a link. If the person's name is part of the email address, or if the page says something like, "For more information, email Brad Pitt," you're in business.

• Domain. The domain feature lets you restrict your search to a single site or to a domain (like .edu or .com). The site restriction is useful when you want to look up specific keywords on a site that has no search function (or that has a lousy one). And sometimes a site search turns up goodies you simply can't seem to reach through a regular onsite search. To see the difference, try running a query with NYTimes.com as the site, and then try the same search terms on the New York Times site.

(The site search doesn't search related web sites. For example, if you want to search all of Google's sites, restricting your query to Google.com means you'll miss anything in www.answers.google.com, http://labs.google.com, and so on. To make sure you hit those sites, too, try searching for Google in the URL, described above.)

This feature also lets you limit your search results to a particular domain, which can help you sift out sites that want to sell you things. For instance, if you search for "organic coffee" and limit your results to the .org domain, you'll get lots of sites that address organic farming and fair trade issues, but few that hawk beans.

The domain option also lets you rule out a particular site or domain—handy if your results are peppered with one site or domain that you know doesn't contain the info you want.
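The Advanced Search choices described above can also be typed straight into the search box as operators. The excerpt doesn't name them, so treat the mapping in this Python sketch as an assumption: intitle:, intext:, inurl:, and inanchor: for the Occurrences options, and site: (or -site: to exclude) for the domain option. The function name restricted_query is made up for the example, and operator support may differ from what the form offers.

```python
# Assumed search-box equivalents of the Occurrences menu options.
OCCURRENCE_OPERATORS = {
    "title": "intitle:",
    "text": "intext:",
    "url": "inurl:",
    "links": "inanchor:",
}


def restricted_query(terms, where=None, site=None, exclude_site=None):
    """Prefix each term with an occurrence operator and add site restrictions."""
    prefix = OCCURRENCE_OPERATORS[where] if where else ""
    parts = [prefix + term for term in terms]
    if site:
        parts.append(f"site:{site}")
    if exclude_site:
        parts.append(f"-site:{exclude_site}")
    return " ".join(parts)


print(restricted_query(['"organic coffee"'], site=".org"))
print(restricted_query(["sneaker"], where="url"))
print(restricted_query(["yahoos"], where="text", exclude_site="yahoo.com"))
```

The three calls mirror the examples in the text: organic coffee limited to .org, sneaker in the URL, and yahoos in the page text with Yahoo.com ruled out.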

This is excerpted from Google: The Missing Manual, by Sarah Milstein and Rael Dornfest. Copyright © 2004 by O'Reilly Media, Inc., and published by O'Reilly Media, Inc. Excerpted with the permission of the publisher. You can buy Google: The Missing Manual at Amazon.com.


