blogs by Gio

You can Google it

  • Posted in 🖱 cyber

The other day I had a quick medical question (“if I don’t rinse my mouth out enough at night will I die”), so I googled the topic as I was going to bed. Google showed a couple search results, but it also showed Answers in a little dedicated capsule. This was right on the heels of the Yahoo Answers shutdown, so I poked around to see what Google’s answers were like. And those… went in an unexpected direction.

Should I rince my mouth after using mouthwash? Why is it bad to swallow blood? Can a fly live in your body? What do vampires hate? Can you become a vampire? How do you kill a vampire?

So, Google went down a little rabbit trail. Obviously these answers were scraped from the web, and included sources like which is, apparently, a Wiccan resource for information that is “astrological, metaphysical, or paranormal in nature.” So possibly not the best place to go for medical advice. (If you missed it, the context clue for that one was the guide on vampire killing.)

There are lots of funny little stories like this where some AI misunderstood a question. Like this case where a porn parody got mixed in the bio for a fictional character, or that time novelist John Boyne used Google and accidently wrote a video recipe into his book. (And yes, it was a Google snippet.) These are always good for a laugh.

Wait, what’s that? That last one wasn’t funny, you say? Did we just run face-first toward the cold brick wall of reality, where bad information means people die?

Well, sorry. Because it’s not the first time Google gave out fatal advice, nor the last. Nor is there any end in sight. Whoops!

So, quick background.

These direct query responses — a kind of semantic search — are what Google calls Featured Snippets. Compared to traditional indexing, semantic search is actually a relative newcomer to web search technology.

In 2009 Wolfram launched Wolfram|Alpha, a website that, instead of searching indexed web pages, promised to answer plain-english queries with computational knowledge: real, scientific data backed by scientific sources.

Wolfram was a pioneer in the field, but other companies were quick to see the value in what Semantic Search, or we might now call responses. Right on the heels of Wolfram|Alpha, Microsoft replaced Live Search with, which was what they called a “Decision Engine.” In the 2009 press release, CEO Steve Ballmer describes the motivation behind the focus on direct responses:

Today, search engines do a decent job of helping people navigate the Web and find information, but they don’t do a very good job of enabling people to use the information they find. … Bing is an important first step forward in our long-term effort to deliver innovations in search that enable people to find information quickly and use the information they’ve found to accomplish tasks and make smart decisions.

Today, Google has this video showing the evolution of the search engine, starting with the original logo and a plain index of websites, and ending with a modern-day search page, complete with contextual elements like photos, locations, user reviews, and definitions. You know, what we’re familiar with today.

Google describes its goal for search as being to “deliver the most relevant and reliable information available” and to “make [the world’s information] universally accessible and useful”. Google is particularly proud of featured snippets and responses, and paraded them out in Hashtag Search On 21’:

One contradiction stands out to me here: at 00:45, a Google spokesperson uses “traffic Google sent to the open web” as a metric Google is proud of. But, of course, featured snippets are a step away from that; they answer questions quickly, negating the need for someone to visit a site, and instead keeping them on a Google-controlled page. This has been a trend with new Google tech, the most obvious example being AMP (which is bad).

This seems subtle, but it really is a fundamental shift in the dynamic. As Google moves more and more content into its site, it stops being an index of sources of information and starts asserting information itself. It’s the Wikipedia-as-a-source problem; “Google says” isn’t enough, can’t be enough.

But for all Google’s talk about providing good information ond preventing spam, Google search results (and search results in general) are actually very bad at present. Most queries about specific questions result in either content scraped from other sites or just a slurry of SEO mush, the aesthetics of “information” with none of the substance. (For my non-technical friends, see the recipe blog problem.)

As I’ve said before, there needs to be a manual flag Google puts on authoritative sources. It’s shameful how effective spam websites are on technical content. Official manuals on known, trusted domains should rise to the top of results. This doesn’t solve the semantic search problem, but it’s an obvious remedy to an obvious problem, and it’s frankly shameful that Google and others haven’t been doing it for years already.

I saw Michael Seibel recently sum up the current state of affairs:

Michael Seibel (@mwseibel): A recent small medical issue has highlighted how much someone needs to disrupt Google Search. Google is no longer producing high quality search results in a significant number of important categories.

Health, product reviews, recipes are three categories I searched today where top results featured clickbait sites riddled with crappy ads. I’m sure there are many more. Feel free to reply to the thread with the categories where you no longer trust Google Search results.

I’m pretty sure the engineers responsible for Google Search aren’t happy about the quality of results either. I’m wondering if this isn’t really a tech problem but the influence of some suit responsible for quarterly ad revenue increases.

The more I think about this, the more it looks like classic short term thinking. Juice ad revenue in the short run. Open the door to complete disruption in the long run…

Trying to use algorithms to solve human problems🔗

But I’m not going to be too harsh on Google’s sourcing algorithm. It’s probably a very good information sourcing algorithm. The problem is that even a very good information sourcing algorithm can’t possibly work in anything approaching fit-for-purpose here.

The task here is “process this collection of information and determine both which sources are correct and credible and what those sources’ intended meanings are.” This isn’t an algorithmic problem. Even just half of that task — “understanding intended meaning” — is not only not something computers are equipped to do but isn’t even something humans are good at!

By its very nature Google can’t help but “take everything it reads on the internet at face value” (which, for humans, is just basic operating knowledge). And so you get the garbage-in, garbage-out problem, at the very least.

Google can’t differentiate subculture context. Even people are bad at this! And, of course, Google can “believe” wrong information, or just regurgitate terrible advice.

But the problem is deeper than that, because the whole premise that all questions have single correct answers is wrong. There exists debate on points of fact. There exists debate on points of fact! We haven’t solved information literacy yet, but we haven’t solved information yet either. The interface that takes in questions and returns single correct answers doesn’t just need a very good sourcing function, it’s an impossible task to begin with. Not only can Google not automate information literacy, the fact that they’re pretending they can is itself incredibly harmful.

But the urge to solve genuinely difficult social, human problems with an extra layer of automation pervades tech culture. It essentially is tech culture. (Steam is one of the worst offenders.) And, of course, nobody in tech ever got promoted for replacing an algorithm with skilled human labor. Or even for pointing out that they were slapping an algorithm on an unfit problem. Everything pulls the other direction.

Dangers of integrating these services into society🔗

Of course, there’s a huge incentive to maximize the number of queries you can respond to. Some of that can be done with reliable data sourcing (like Wolfram|Alpha does), but there are a lot of questions whose answers aren’t in a feasible data set. And, according to Google, 15% of daily searches are new queries that have never been made before, so if your goal is to maximize how many questions you can answer (read: $$$), human curation isn’t feasible either.

But if you’re Google, you’ve already got most of the internet indexed anyway. So… why not just pull from there?

Well, I mean, we know why. The bad information, and the harm, and the causing of preventable deaths, and all that.

And this bad information is at its worst when it’s fed into personal assistants. Your Alexas, Siris, Cortanas all want to do exactly this: answer questions with actionable data.

The problem is the human/computer interface is completely different with voice assistants than it is with traditional search. When you search and get a featured snippet, it’s on a page with hundreds of other articles and a virtually limitless number of second opinions. The human has the agency to do their own research using the massively powerful tools at their disposal.

Not so with a voice assistant. They have the opportunity to give zero-to-one answers, which you can either take or leave. You lose that ability to engage with the information or do any followup research, and so it becomes much, much more important for those answers to be good.

And they’re not.

Let’s revisit that “had a seizure, now what” question, but this time without the option to click through to the website to see context.

Oh no.

And of course these aren’t one-off problems, either. We see these stories regularly. Like just back in December, when Amazon’s Alexa told a 10 year old to electrocute herself on a wall outlet. Or Google again, but this time killing babies.

The insufficient responses🔗

Let’s pause here for a moment and look at the response to just one of these incidents: the seizure one. Google went with the only option they had (other than discontinuing the ill-conceived feature, of course): case-by-case moderation.

Now, just to immediately prove the point that case-by-case moderation can’t deal with a fundamentally flawed problem like this, they couldn’t even fix the seizure answers

See, it wasn’t fixed by ensuring future summaries of life-critical information would be written and reviewed by humans, because that’s a cost. A necessary cost for the feature, in this case, but that doesn’t stop Google from being unwilling to pay it.

And, of course, the only reason this got to a Google engineer at all is that the deadly advice wasn’t followed, the person survived, and the incident blew up in the news. Even if humans could filter through the output of an algorithm that spits out bad information (and they can’t), best-case scenario we have a system where Google only lies about unpopular topics.

COVID testing🔗

And then we get to COVID.

Now, with all the disinformation about COVID, sites like Twitter and YouTube have taken manual steps to try to specifically provide good sources of information when people ask, which is probably a good thing.

But even with those manual measures in place, when Joe Biden told people to “Google COVID test near me” in lieu of a national program, it raised eyebrows.

Now, apparently there was some effort to coordinate manually sourcing of reliable information for COVID testing, but it sounds like that might have some issues too:

So now, as Kamala Harris scolds people for even asking about Google alternatives in an unimaginably condescending interview, we’re back in the middle of it. People are going to use Google itself as the authoritative source of information, because the US Federal government literally gave that as the only option. And so there will be scams, and misinformation, and people will be hurt.

But at least engineers at Google know about COVID. At least, on this topic, somebody somewhere is going to try to filter out the lies. For the infinitude of other questions you might have? You’ll get an answer, but whether or not it kills you is still luck of the draw.

Related Reading🔗


1. Google has some bad summarization telling people that throwing batteries into the ocean is good.
2. News articles were written about this.
3. Bing's summarization interprets these articles as advice to ... throw batteries in the ocean!


(Apparently none of this is AI... yet.)


Howdy! If you found my writing worthwhile, the best thing you can do to support me is to share an article you found interesting somewhere you think people will appreciate it. Thanks as always for reading!