Humans aren't even very good at simple tasks we care a lot about, and when we're apathetic, we're absolutely terrible. It's not going to take a nearly omniscient, sci-fi-level AI to displace people in most customer service jobs.
For example, here's a not-too-atypical customer service interaction I had last week with a human who was significantly worse than a mediocre AI. I scheduled an appointment for an MRI. The MRI is for a jaw problem which makes it painful to talk. I was hoping that the scheduling would be easy, so I wouldn't have to spend a lot of time talking on the phone. But, as is often the case when dealing with bureaucracy, it wasn't easy.
Here are the steps it took.
I present this not because it's a bad case, but because it's a representative one1. In this case, my dentist's office was happy to do whatever was necessary to resolve things, but UW Health refused to talk to them until I repeatedly suggested that talking to my dentist would be the easiest way to resolve things. Even then, I'm not sure it helped much. This isn't even all that bad a case, since I was able to convince the intransigent party to cooperate. The bad cases are when both parties refuse to talk to each other, and each claims that the situation can only be resolved when the other party contacts them, resulting in deadlock. The good cases are when both parties are willing to talk to each other and work out whatever problems come up. Having a non-AI phone tree or web app that exposes simple scheduling would be far superior to the human customer service experience here. An AI chatbot that's a light wrapper around the API a web app would use would be worse than a normal website, but still better than human customer service. An AI chatbot that's more than just a light wrapper would blow away the humans who do this job for UW Health.
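To make the "light wrapper" idea concrete, here's a minimal sketch of the kind of scheduling API a web app could call directly and a chatbot could wrap. Every name, slot, and endpoint here is hypothetical; UW Health exposes no such API, and this is only meant to show how small the surface area of "simple scheduling" actually is.

```python
# Hypothetical scheduling backend -- illustrative only. A web app would
# call these methods directly; a "light wrapper" chatbot would translate
# free-form chat into the same calls.
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Slot:
    slot_id: str
    start: datetime


class SchedulingAPI:
    """Toy in-memory stand-in for a real scheduling service."""

    def __init__(self):
        self._slots = {
            "mri-001": Slot("mri-001", datetime(2024, 1, 8, 9, 0)),
            "mri-002": Slot("mri-002", datetime(2024, 1, 8, 14, 30)),
        }
        self._booked = {}

    def list_open_slots(self, procedure: str) -> list[Slot]:
        # Return every slot that hasn't been booked yet.
        return [s for s in self._slots.values()
                if s.slot_id not in self._booked]

    def book(self, slot_id: str, patient: str, note: str = "") -> bool:
        # Booking succeeds only if the slot exists and is still open.
        if slot_id in self._slots and slot_id not in self._booked:
            self._booked[slot_id] = (patient, note)
            return True
        return False


api = SchedulingAPI()
open_slots = api.list_open_slots("MRI")
ok = api.book(open_slots[0].slot_id, "patient-123",
              note="jaw pain, please minimize phone calls")
```

The point of the note field is that even this trivial interface handles the "it's painful for me to talk on the phone" constraint better than the phone-based process did.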
The case against using computers instead of humans is that computers are bad at handling error conditions, can't adapt to unusual situations, and behave according to mechanical rules, which can often generate ridiculous outcomes, but that's precisely the situation we're in right now with humans. It already feels like dealing with a computer program. Not a modern computer program, but a compiler from the 80s that tells you that there's at least one error, with no other diagnostic information.
UW Health sent a form with impossible instructions to my dentist. That's not great, but it's understandable; mistakes happen. However, when they got the form back and it wasn't correctly filled out, instead of contacting my dentist they just threw it away. Just like an 80s compiler. Error! The second time around, they told me that the form was incorrectly filled out. Error! There was a human on the other end who could have noted that the form was impossible to fill out. But like an 80s compiler, they stopped at the first error and gave it no further thought. This eventually got resolved, but the error messages I got along the way were much worse than I'd expect from a modern program. Clang (and even gcc) give me much better error messages than I got here.
Of course, as we saw with healthcare.gov, outsourcing interaction to computers doesn't guarantee good results. There are some claims that market solutions will automatically fix any problem, but those claims don't always work out.
That's an ad someone ran for a few months on Facebook to try to find a human at Google to help them, because every conventional technique at their disposal had failed. Google has perhaps the most advanced ML in the world, they're as market-driven as any other public company, and they've mostly tried to automate away service jobs like first-level support because support doesn't scale. As a result, the most reliable methods of getting support at Google are
If you don't have direct access to one of these methods, running an ad is actually a pretty reasonable solution. (1) and (2) don't always work, but they're more effective than not being famous and hoping a blog post will hit HN, or being a paying customer. The point here isn't to rag on Google, it's just that automated customer service solutions aren't infallible, even when you've got an AI that can beat the strongest go player in the world and multiple buildings full of people applying that same technology to practical problems.
While replacing humans with computers doesn't always create a great experience, good computer-based systems for things like scheduling and referrals can already be much better than the average human at a bureaucratic institution2. With the right setup, a computer-based system can be better at escalating thorny problems to someone who's capable of solving them than a human-based system. And computers will only get better at this. There will be bugs. And there will be bad systems. But there are already bugs in human systems. And there are already bad human systems.
I'm not sure if, in my lifetime, technology will advance to the point where computers can be as good as helpful humans in a well designed system. But we're already at the point where computers can be as helpful as apathetic humans in a poorly designed system, which describes a significant fraction of service jobs.
When ChatGPT was released in 2022, the debate described above from 2015 happened again, with the same arguments on both sides. People are once again saying that AI (this time, ChatGPT and LLMs) can't replace humans because a great human is better than ChatGPT. They'll often pick a couple of examples of ChatGPT saying something extremely silly, "hallucinating", but if you ask a human to explain something, even a world-class expert, they often hallucinate a totally fake explanation as well.
Many people on the pessimist side argued that it would be decades before LLMs could replace humans, for the exact reasons we noted were false in 2015. People made this argument after multiple industries had already seen massive cuts in the number of humans they needed to employ due to pre-LLM "AI" automation, and many of them even made it after companies had laid people off and replaced them with LLMs. I commented on this at the time, using the same reasoning I used in this 2015 post, before realizing that I'd already written down this line of reasoning in 2015. But cut me some slack; I'm just a human, not a computer, so I have a fallible memory.
Now that it's been a year since ChatGPT was released, the AI pessimists who argued that LLMs wouldn't displace human jobs for a very long time have been proven even more wrong by layoff after layoff in which customer service orgs were cut to the bone and mostly replaced by AI. AI customer service seems quite poor, just like human customer service. But human customer service isn't improving, while AI customer service is. For example, here are some recent customer service interactions I had as a result of bringing my car in to get the oil changed, the tires rotated, and a third thing done (long story).
Overall, how does an LLM compare? It's probably significantly better than this dude, who acted like an archetypical stoner who doesn't want to be there and doesn't want to do anything, and the LLM will be cheaper as well. However, the LLM will be worse than a web interface that lets me book the exact work I want and write a note to the tech who's doing the work. For better or for worse, I don't think my local tire / oil change place is going to give me a nice web interface that lets me book the exact work I want any time soon, so this guy is going to be replaced by an LLM and not a simple web app.
Thanks to Leah Hanson and Josiah Irwin for comments/corrections/discussion.
I wonder if a deranged version of the law of one price applies: the law of one level of customer service. However good or bad an organization is at customer service, it will create or purchase automated solutions that are equally good or bad.
At Costco, the checkout clerks move fast and are helpful, so you don't have much reason to use the automated checkout. But then the self-checkout machines tend to be well-designed; they're physically laid out to reduce the time it takes to feed a large volume of stuff through them, and they rarely get confused and deadlock, so there's not much reason not to use them. At a number of other grocery chains, the checkout clerks are apathetic, move slowly, and will make mistakes unless you remind them of what's happening. It would make sense to use self-checkout at those places, except that the self-checkout machines aren't designed particularly well and are configured so that they frequently get confused and require intervention from an overloaded checkout clerk.
The same thing seems to happen with automated phone trees, as well as in both of the examples above. UW Health has an online system to automate customer service, but they went with Epic as the provider, and as a result it's even worse than dealing with their phone support. And it's possible to get a human on the line if you're a customer on some Google products, but that human is often no more helpful than the automated system you'd otherwise deal with.