Three leading AI language models (ChatGPT-5, Gemini 2.5, and Claude 4.5) gave different answers when asked which professions are most at risk of automation. Economists warn that the popular "AI susceptibility indices" that policymakers and employers already rely on may be far less reliable than commonly believed, the Wall Street Journal reports.
That is the conclusion of economists Michelle Yin and Hoa Vu of Northwestern University and Claudia Persico of American University. In a preliminary working paper, the researchers asked three AI models to assess which professions are most vulnerable to AI and frequently received conflicting answers.
Claude rated the accounting profession as highly vulnerable, while Gemini rated it significantly lower. The models also disagreed on the vulnerability of advertising managers and senior executives. ChatGPT and Gemini were the most consistent with each other, yet even they diverged in roughly a quarter of cases.
Some of the discrepancies can be attributed to differences between the models themselves, but the economists also identified another factor: the assessments were influenced by which professionals already use AI. Early adopters, such as financial analysts, work extensively with neural networks and thereby generate more data on which future AI models are trained. This, in turn, shapes how the models evaluate those professions.
AI susceptibility indices are constructed in three ways: manually, with experts assessing how much AI speeds up particular work tasks; through surveys of employees on AI platforms; or by querying large language models (LLMs) directly. Manual assessments can be quite subjective, and surveys reflect the opinions of users of a single platform, who do not necessarily represent the labor market as a whole. Nevertheless, these indices are widely used in analytical briefs, consulting reports, and documents prepared to justify policy decisions.
Discrepancies between versions of a rapidly evolving technology are not surprising in themselves. It also remains unclear whether AI models assess susceptibility to automation better or worse than other methods do. The problem, according to the study's authors, is that some policymakers and employers may take such assessments at face value.
As a first step, the economists argue, researchers should rely on the responses of several AI models rather than just one and explicitly report the uncertainty of the results. Ultimately, they believe, more accurate answers will come from surveys of how AI is actually being deployed in the economy and which tasks it is used for. "Personally, I wouldn't rely on a single indicator to decide: 'I need to change jobs' or 'My child needs to change their major,'" Yin said.
The study shows that even the most advanced AI systems cannot yet provide a definitive forecast of the labor market's future, bb.lv writes. The authors believe such assessments should be treated with caution and not used as the sole reference when choosing a profession, shaping personnel policy, or making economic decisions.