Yes, "AI" is coming for your job... but not like that

Erik Johnson
Oct 7, 2024

AI is on the rise, and with it, fears of massive job losses. But if job losses happen, it won’t be because of sentient robots taking over tasks from humans. Instead, it will be for reasons more mundane, yet scarier. Let’s start with how these models actually work:

I. How these “AI” models work (more or less)

  1. Ingest training data
    First you need data - lots and lots of data. AI companies are very cagey about the specifics of their models, but their training sets run to *petabytes* (thousands of terabytes / millions of gigabytes) of ingested and stored material. Primarily this data is acquired by scraping whatever is freely available on the internet, augmented by private or purchased data stores. The data is then vectorized - strings are turned into numbers - and stored as points in complicated structures with thousands of dimensions.

  2. Supervised training
    Once you have your data, you need humans to train the model. An example of training is having humans label examples and non-examples - think of CAPTCHA questions asking you to pick the squares that contain cats.

    Humans also craft responses to prompts. So you have teams of people creating prompts and responses - for example, for the prompt “What is a cat?”, a person might write “A cat is a four-legged feline that is typically kept as a pet” and that response is vectorized and stored in the LLM as closely related to the input string.

    Once there is enough labeled data and sample responses, you can have the LLM generate a prompt, generate a set of sample responses, and have the human trainer rank or rate those responses. For example, “What do cats eat?” could generate the responses: “Cats are carnivores who eat a variety of foods like rodents, fish, and others”, “Cats are domesticated pets who largely eat dry kibble or wet food”, “Cats eat lasagna”, and “Cats eat soft beds”, to which a human might rate these answers as “excellent”, “good”, “Bad - this is true for a specific fictional cat named Garfield, but not cats in general” and “Awful”.

    Note that at this step the model is reliant on a large number of human trainers, so it can only improve on “human time”.
  3. Unsupervised training
    Once you have enough labeled data, sample responses, and rankings or ratings of those responses, then the fun starts. You can train a reward model that predicts how well a computer-generated response would be scored by a human trainer. Then you have your model sample a prompt, generate a set of responses, and score those responses. This is where the magic happens: you’re working on computer time and can run millions and millions of cycles of prompt, response, score, update model - the kind of computation that allowed computers to conquer competitive games like Chess and Go! (A toy sketch of this loop appears just after this list.)

    Of course, it’s not quite that easy. Unlike those games, there is no objective measure for whether a response is “good”. So you still will want to dip in, assess the results with human evaluators, and re-train areas where the model goes astray or to improve capabilities for specific types of responses. 
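To make the loop concrete, here is a deliberately toy sketch in Python of the shape described above. Every name in it is an invented stand-in - real systems use large neural networks for both the response generator and the reward model, and the update step uses reinforcement-learning algorithms that aren’t shown - but it illustrates why the loop can run with no human in it:

```python
import random

def reward_model(prompt: str, response: str) -> float:
    """Stand-in for a model trained on the human ratings collected in step 2."""
    return random.random()  # a real reward model is a neural network, not a dice roll

def generate_responses(prompt: str, n: int = 4) -> list[str]:
    """Stand-in for the LLM sampling several candidate responses to a prompt."""
    return [f"candidate answer {i} to: {prompt}" for i in range(n)]

def training_step(prompts: list[str]) -> None:
    for prompt in prompts:
        candidates = generate_responses(prompt)
        scored = sorted(((reward_model(prompt, c), c) for c in candidates), reverse=True)
        best_score, best = scored[0]
        # A real system would now nudge the model toward responses like `best`
        # and away from the low-scoring ones; here we just report the choice.
        print(f"{prompt!r}: reinforcing {best!r} (predicted score {best_score:.2f})")

# Everything above runs on "computer time", so the loop can repeat millions of times
# with no human trainer involved - which is the leap from step 2 to step 3.
training_step(["What do cats eat?", "What is a cat?"])
```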

But what’s key to note is that at no point is the model “learning” concepts - it’s learning to predict which outputs are likely to go with which inputs. These models are engaging in pattern-matching, not deliberate thought. (For an expanded look at this, see Two Thought Experiments on AI.)

II. When are these models useful?

  1. You want natural-language processing, either parsing human-entered input or producing human-readable output. 
    These models are pretty good at parsing input, but are excellent at producing human-like output. This can actually be a danger, because the models produce output that sounds plausible, regardless of its accuracy - more on this later.

  2. The correct body of data exists in the training set.
    If there’s a large body of known knowledge about a topic, and that body of knowledge is on the internet (aka likely to have been scraped), then there’s a good chance you’ll get a useful result. 
  3. You know enough to catch and fix errors (or the task is not high-stakes)
    Of course, there’s no guarantee of a correct result, so you’ll need to have the knowledge to error-check the plausible-seeming output, or it has to be something where you don’t particularly care how accurate the result is (for example, generating lorem-ipsum style copy that will be rewritten later). Many high-profile AI fails come from skipping this step.
  4. You can gain significant efficiencies.
    Reading documentation or doing research can take more time than using an LLM “copilot”. I’ve found these models not to be useful when writing articles because the output is so generic that I delete it and start over. However, sometimes generating that initial draft or outline can save many hours of agonizing over a blank page. 

III. When do things go wrong?

Well, the big issue to keep in mind is that these LLMs do *not* have some of the qualities that we have come to expect from computer programs. Namely, they are neither accurate nor predictable.

Accurate. Remember, these models are not thinking about questions in a conceptual way. They are looking at an input vector and seeing whether there’s an output vector that’s likely to go with it. So if you ask a question that superficially “looks like” another type of question, you’ll often get wildly wrong output to a question that’s merely “similar”. (EX: sun bathe -> sun gaze, farmer / sheep)

Sun bathing, sun gazing, what's the difference?
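You can see the “looks like” problem in miniature with any off-the-shelf embedding model. The sketch below is only illustrative - it uses the open-source sentence-transformers library and a small general-purpose model, not whatever is running inside ChatGPT - but it shows how superficially similar phrases land close together in vector space, regardless of whether the right answers about them are similar:

```python
# Illustration only: an open-source embedding model, not the one inside any chatbot.
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("all-MiniLM-L6-v2")

phrases = [
    "sun bathing on the beach",  # harmless
    "sun gazing at the sky",     # will damage your eyes
    "filing a tax return",       # unrelated control phrase
]
vectors = model.encode(phrases)

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# The first two phrases sit much closer together than either does to the control
# phrase - and that closeness is most of what the model has to go on.
print("bathing vs gazing:", cosine(vectors[0], vectors[1]))
print("bathing vs taxes: ", cosine(vectors[0], vectors[2]))
```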


Predictable. Because there is some randomness inherent in the process, but also because similar-meaning inputs can be expressed with many different words, it’s very hard to know whether a particular input will lead to a particular output. Famously, ChatGPT has difficulty saying whether the word “strawberry” contains two “r”s or three because of how it breaks the string into tokens before vectorizing it, and depending on how the question is asked, it can be either immediately correct or obstinately wrong.

Here we get a correct response followed immediately by a terrible one.
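Part of the reason is tokenization: the model never sees “strawberry” as a string of individual characters the way the tiny sketch below does. The token split shown is an assumption for illustration - the real tokenizer’s segmentation may differ - but any split into multi-character chunks hides the individual letters:

```python
word = "strawberry"
print(word.count("r"))  # 3 - trivial when you can operate on individual characters

# What the model "sees" is closer to a sequence of sub-word chunks like this.
# (This particular split is illustrative, not actual tokenizer output.)
illustrative_tokens = ["str", "aw", "berry"]
print(illustrative_tokens)

# No single chunk contains the answer, so the model falls back on pattern-matching
# against how similar-looking questions are usually answered - which is how you get
# a confident "2" one moment and a confident "3" the next.
```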


IV. But isn’t “AI” inevitable?


Machine learning and natural language processing have been part of the landscape for years now and are growing even more important. But the idea of generalized “AI” like ChatGPT, Claude, Llama, etc. becoming an essential part of the tech business is not nearly as sure a thing, despite the massive amounts of money and hype trying to convince us all otherwise.

To truly make an impact, these technologies have some important challenges to overcome: 

  1. Legal - It’s undeniable that these generalized models are trained on massive amounts of copyrighted material. If it turns out that this use violates copyright law (or requires the model owners to pay royalties), it’s hard to see how any of these models survive. While there are many powerful interests lined up behind these models, there are many powerful interests opposing them as well - Hollywood, The New York Times, Disney, etc. - who are well-versed in copyright law and have large and aggressive legal teams. Lawsuits are currently working their way through the court system, and any adverse judgment could end or severely hamper development of these models.

  2. Technical - Everyone knows the next iPhone is going to be slimmer, have a better camera, and have longer battery life. The same is not true of the next generation of the AI models. Because of the intensive data needs, it’s possible that all the human-generated content on the internet has already been incorporated into these models and there is simply not enough available data to gain significant improvements for future generations.

    But can’t we use AI to generate data that can train future generations? Surprisingly, no! There is a phenomenon called “model collapse” where using a model’s output to train future generations causes the model to diverge catastrophically from coherent responses. See, for example, this New York Times article on a handwriting AI. (A toy simulation of the effect appears just after this list.)
from “When AI’s Output is a Threat to AI Itself”, NYTimes (Aug 26, 2024)

  3. Economic - This is the biggest and most serious challenge. Even if these models are legal and even if they continue to improve, will they ever be profitable?
    Aside from the exorbitant training costs of these models (estimated at tens to hundreds of millions of dollars), the per-unit costs are orders of magnitude larger than people expect. Every time an LLM is queried, it uses a relatively massive amount of compute power compared to standard tech platforms. So the millions of "free" users of these platforms represent a very real cost. NVIDIA launching new open-source models may represent a race to the bottom for AI companies trying to generate profit, because as a chip maker, NVIDIA doesn't need to make its models profitable - it's in the position of selling shovels during a Gold Rush!

    We've seen a long line of tech companies come in with sky-high promises of profitability, only to crash back to reality.

    Uber has only managed to achieve profitability twice - in 2018 and 2023 - but still carries a valuation that seems disconnected from the underlying mechanics of the business model (it’s a cab service!). WeWork famously was valued at $47 billion before crashing into bankruptcy. And the less said about the hype and investment going into “the Metaverse”, or “blockchain”, the better.

    Because ultimately, that’s what this is all about. New technology has its costs subsidized to a great extent by venture capital or other investment in an attempt to build market share, but at the end of the day, it has to generate more revenue than it costs. The fact that OpenAI needed a massive fundraise recently and will likely need another by July of next year means profitability (if it's even possible) seems very far away.
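(Back to the “model collapse” point from the Technical challenge above: here is a toy simulation of the effect. It has nothing to do with LLMs specifically - each “generation” just fits a simple bell curve to output sampled from the previous generation, favoring its most typical outputs the way a language model does - but it shows the basic failure mode: with no real data to anchor it, the model loses diversity and drifts.)

```python
# Toy illustration of model collapse: every generation is trained only on output
# produced by the previous generation, never on fresh real-world data.
import random
import statistics

mean, stdev = 0.0, 1.0  # "generation 0" matches the real data

for generation in range(1, 9):
    # The new generation's "training data" is sampled from the old model...
    samples = [random.gauss(mean, stdev) for _ in range(200)]
    # ...and, like an LLM, the old model over-produces its most typical outputs,
    # so the extremes are under-represented. Here we keep only the middle half.
    typical = sorted(samples)[50:150]
    mean, stdev = statistics.fmean(typical), statistics.stdev(typical)
    print(f"gen {generation}: mean={mean:+.3f} stdev={stdev:.3f}")

# The spread collapses toward zero within a handful of generations - the model ends
# up "knowing" only a narrow caricature of what the original data looked like.
```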



Ominous rumblings on the horizon

V. The REAL danger to your job

So the real danger to your job from “AI” is not your role being replaced by a computer program (either directly or indirectly). Rather, the risk we face is over-investment in fundamentally unprofitable products that never “pay off”, leading to future job cuts.

Let's see who the REAL culprit is...


The closest analog to our current situation seems to be the dot-com boom of the late 1990s and early 2000s. E-commerce was the new hot trend, and massive amounts of investment poured into companies that proved to be overvalued and unable to reach profitability. The crash did not happen because selling things on the internet was a bad idea, just as the crash we’re headed for is not because AI isn’t a real technology. Amazon, eBay, and many other still-profitable companies found ways to leverage the new technology - the crash happened because there were too many companies that just slapped “the internet” onto their business model without having a real plan.

We’re not heading towards Skynet, we’re heading towards Pets.com.

VI. But wait! It gets worse

Sometimes you hear something so wild and out-of-pocket from someone who should know better that it completely changes your view of the situation. As an example, back in 2017 I read an article on self-driving cars that included a product lead saying “The self-driving cars are already good enough to be on the roads - we just need to make people into better drivers.” Hearing that blew my mind and immediately turned me off from the hype coming from that sector. The idea that this person thought it would be easier to *improve the driving skills of everyone in the world* than to execute on their roadmap… the mind boggles.

So for “AI”, I honestly was more positive towards the technology until January of 2024, when OpenAI’s Sam Altman said this:

This is... not inspiring (U.S. News & World Report)

This was on stage at Davos, from the man at the helm of the most well-resourced AI company, who is probably best-positioned to know what’s possible. And he’s looking at what needs to happen to make the next generation of “AI” work, the challenges that they face, and it’s leading him to say: “How hard is it to get fusion energy up and running?”

Because that’s the real hidden cost of this technology - it’s not the dollar amount, it’s the massive amount of energy and water needed to power the endless data centers that keep these models running. AI has knocked both Google and Microsoft off of their “net zero by 2030” promises - knocked them off so badly that their emissions are UP compared to when they initially made those commitments.

And even more than our jobs, that’s the real tragedy of this latest hype cycle. Much like blockchain and crypto before it, the industry seems to be going all-in on a new technology that doesn’t seem to provide nearly enough value to be profitable, much less to counterbalance the environmental harms. The industry needs to move away from an “extraction” model, where profits are won by getting in early and selling into the hype, or by pillaging existing resources that cannot be replaced, toward a more “sustainable” model, where profits are genuinely built on value delivered to the customer, using resources that can be replenished. If we can’t make that shift, then we’re going to have a lot more to worry about in the future than just our jobs.
