I'd love to see this analysis done for ChatGPT, which has a much bigger 'consumer' market share.
I'm also very wary of their analysis method, given classifiers-gonna-classify. We already see it in their example of someone asking why their game is crashing, which gets bucketed into the Computer & Mathematical occupation. I'm guessing the original question came from a game player rather than a game developer, so can you really call this an occupational task? Sure it's in that domain, I guess, but in a completely different context. If I'm asking a question about how to clean my dishwasher, that's hardly in repairman or industrial occupations.
Still, it's cool they're doing this.
0xDEAFBEAD 32 days ago [-]
If you look at ChatGPT search volume, you can see massive dips during the summer when school is out:
Which suggests that the most common use is as a tutor / cheating on homework.
anshumankmr 32 days ago [-]
This ignores people who open ChatGPT.com or use the app
mbreese 32 days ago [-]
But it ignores them equally throughout the year… It’s not an exact measure, but that doesn’t mean it’s not a useful metric. So long as the measurement is unbiased and captures enough of the traffic, it can still be useful.
The troughs in that graph are all during prime US school/college vacation times: summer, winter, and spring breaks. And the magnitude of the fall corresponds to how long the breaks typically are. To me, that makes a lot of sense.
og_kalu 32 days ago [-]
Yeah but it's old. There was no dip in 2024, only a steady increase.
throwaway0123_5 30 days ago [-]
It seems like there was a dip in mid-May and that it didn't go back up until mid/late August? That corresponds pretty closely to US summer breaks. Also a huge spike in early December followed by a huge drop, final exams and then winter break?
What I see is it going from 58 to 40 (the scale is ???), and it's only continued to rise over time. So that may be a common use (~30%), but it's not the only use.
Most of those kids will continue to use it as they graduate, having embedded it in their workflow (unfortunately many will probably fully outsource all thinking to it, having learned a lot less since it did it all for them).
og_kalu 32 days ago [-]
That dip didn't exist in 2024. Site visits just increased steadily throughout last year with no summer dips.
fragmede 32 days ago [-]
Yeah. Reminds me of the ancient OkCupid data analysis blogs (and not the creepy one by sleep8). The group I'm surprised not to see represented in their analysis is "personal", where people I know use ChatGPT as a therapist/life coach/SMS analysis & editor. And of course they crucially but understandably left off the denominator: 35% of a million requests is different than 35% of a billion. And also, how many of the conversations had 1 message, indicating "just testing", vs 10 or 100 messages?
> 35% of a million requests is different than 35% of a billion.
Not statistically.
alwa 32 days ago [-]
A mentor I respect memorably explained to young me that “it doesn’t matter how big the pot of soup, you can use the same size spoon to taste it.”
beefnugs 32 days ago [-]
Sorry, but that mentor has a small practical imagination. A pot can be so large that the top 3 feet you can reach with that spoon could be all oil.
alwa 32 days ago [-]
True! Consistency and representativeness matter, in soup samples as in social samples!
Is the soup smooth or lumpy? Striated or uniform? For that matter a soup could (and often does) involve huge soup bones that give it important parts of its flavor, but never show up directly in a spoonful. And you might need something different from a spoon to convincingly rule out some specific rare lumpy ingredient.
The didactic value of sampling the soup pot goes well beyond its basic function: correcting the beginner's misperception that a sample's statistical power is directly related to population size :)
skeeter2020 32 days ago [-]
to push this analogy too far, that's because you didn't stir it well, not because the spoon is too small.
prepend 32 days ago [-]
Have to sample to see if it’s stirred well enough.
brookst 32 days ago [-]
No, you can model whether stirring actions should create a representative sample
Terretta 32 days ago [-]
Not with immiscible layered stratified flow…
“You're gonna need a bigger spoon!”
ggm 32 days ago [-]
35% of a million students in the USA is very different to 35% of a billion students across the USA, Europe and Africa.
Since there aren't a billion students in the USA, 35% of them is an impossibility.
If you scale your population above some recognized boundary you aren't sampling in the same space any more. After all, the local star density within 1 AU tends very strongly to 1. That's not indicative of the actual star density in the Milky Way.
olddustytrail 32 days ago [-]
Yes statistically. What do you think "statistically" means?
layman51 32 days ago [-]
What do you mean by “statistically”? The end results would be like three orders of magnitude apart. Wouldn’t the desired sample size depend on the size of the population itself?
og_kalu 32 days ago [-]
>Wouldn’t the desired sample size depend on the size of the population itself?
No. The most important thing is the distribution of the sample. You have to make sure it isn't obviously biased in some way (e.g. surveying only students at a university and extrapolating to the entire population of the country). Beyond that, the desired sample size levels off quickly.
5000 (assuming the same distribution) won't be any more or less accurate for 10M than it is for 1M.
Of course, if you just ask everyone (or almost everyone) then you no longer need to worry about distribution, but yeah.
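To make that concrete, here's a quick back-of-the-envelope check (a sketch in Python, assuming a simple random sample of a proportion; the numbers are illustrative):

    import math

    def margin_of_error(n, N, p=0.5, z=1.96):
        # 95% margin of error for a sample proportion, with the
        # finite population correction for population size N
        se = math.sqrt(p * (1 - p) / n)
        fpc = math.sqrt((N - n) / (N - 1))
        return z * se * fpc

    print(margin_of_error(5000, 10**6))  # ~0.0138, i.e. about +/-1.4%
    print(margin_of_error(5000, 10**9))  # ~0.0139, a billion: same answer

The population term barely moves the error bar; the sample size (and how the sample was drawn) is what matters.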
raylad 32 days ago [-]
Very wary. Not weary.
skeeter2020 32 days ago [-]
why not both?
Artemon 32 days ago [-]
lmao
frankfrank13 32 days ago [-]
This more or less confirms what I imagine most of us thought: AI is mostly used by engineers, or for engineering tasks. Makes sense. I wonder how much traffic comes from automated tasks (Copilot, etc). Every time I read a report like this I do wonder if we'll ever see an ROI on LLMs. HUNDREDS of billions of dollars of spend, and 3 years in it's still primarily the same crowd using it, and it has yet to create a "killer" app beyond ChatGPT and Copilot-style IDEs. And it's not like people aren't trying! Look at the recent YC batches, it's all AI-for-industry. Idk man, I fear the economic reality on the backside of this kind of spend.
steveBK123 32 days ago [-]
I think the problem is the data.
Software engineering is a weird niche that is both a high paying job and something you can almost self-teach from widely available free online content. If not self-teach, you can rely on free online content for troubleshooting, examples, etc.
A lot of other industries/jobs are more of an apprenticeship model, with little data and even less freely available on open internet.
esperent 32 days ago [-]
> something you can almost self-teach from widely available free online content.
I think you massively underestimate just how much data is online for everything, especially once you include books which are freely available on every possible subject (illegally, perhaps, but if Meta can download them for free then so can everyone else).
There's less noise for many other subjects than for software engineering; there are often just a couple of competing ways to do everything rather than hundreds. There might be just one coursebook rather than thousands of tutorials. But the data for self-teaching is absolutely there.
satellite2 32 days ago [-]
Consider two fields with vast amounts of literature: medicine and law.
Medicine faces two key challenges. First, while research follows the scientific method, much of what makes a good doctor—intuition, pattern recognition, and clinical judgment—is rarely documented. Second, medical data is highly sensitive, limiting access to real-world cases, images, and practice opportunities. Theory alone is not enough; hands-on experience is essential.
Law presents a different problem: unknown unknowns. The sheer volume of legal texts makes it nearly impossible to be sure you’ve found everything relevant. Even with search tools, gaps in knowledge remain a major risk.
Compounding this is the way law is actually practiced. Every judge and lawyer operates with a shared foundation of legal principles so basic they are almost never discussed. The real work happens at two higher levels: first, the process—how laws are applied, argued, and enforced in practice. Then, at a third, more abstract level, legal debates unfold about interpretation, precedent, and systemic implications. The first level is assumed, the second is routine, and only the third is where true discussion happens.
Self-teaching is easier in fields where knowledge is structured, accessible, and complete. Many subjects are not.
brookst 32 days ago [-]
Really fantastic comment. I would add one criterion to where self-teaching is easier: rapidly testable hypotheses.
roncesvalles 32 days ago [-]
A significant chunk of human knowledge is not publicly accessible. You cannot self-teach how to make a modern aircraft, jet engine, nuclear reactor, radar tech, advanced metallurgy etc.
Similarly, I would wager most of the useful economics and financial theory that humans have come up with is only known to hedge or prop trading firms.
For some subjects, the entire journal-published academic body of knowledge for it is probably some useless fraction of the whole and university academia is operating nowhere close to the cutting edge. People are probably doing PhDs today on theses that some defense contractor or HFT firm already discovered 20 years ago.
Even things like specialized medical knowledge, I would wager is largely passed down through mentor-mentee tradition and/or private notes as opposed to textbooks. It's unlikely that you can teach yourself how to do surgery just from textbooks. I once had a pathologist's report use a term for a skin condition that was quite literally ungoogleable. The skin condition itself was fairly ordinary, but the term used was outright esoteric and yet probably used on a daily basis by that pathologist. Where did he learn it from?
Not everything is on the Internet.
willturman 32 days ago [-]
Taylor Wilson built a nuclear reactor at his home when he was 14. People build jet engines and put them on modern model aircraft every day.
If the instructions aren’t immediately available, the internet provides connections and forums to find anything your heart desires.
Information wants to be free.
Arbitraging micro-opportunities (or far more likely, deploying insider information masked as HFT or some secret sauce arbitrage) is not economically useful.
BeetleB 32 days ago [-]
The difference is in the cost of the equipment.
Sure, you can learn all about power electronics by yourself. But have some ideas you want to implement? Hundreds to tens of thousands of dollars.
kanbankaren 32 days ago [-]
> Software engineering is a weird niche that is both a high paying job and something you can almost self-teach
If you meant programming, I agree it could be self-taught, but not SE. SE is the set of techniques, practices, and tools that we have collected over decades for producing multi-versioned software that meets a certain reliability rating. Not all of these are freely available online.
elicksaur 32 days ago [-]
Unless you are talking about people who are actual licensed engineers, this is a distinction without a difference.
Thing is, I’ve never met someone in software with a professional license.
kanbankaren 32 days ago [-]
I didn't mean the professional license, rather the ensemble of practices, tools, etc. It is practiced in safety-critical domains.
BoorishBears 32 days ago [-]
I'm self-taught and had a job in the autonomous vehicle industry writing software that included safety-critical functionality.
I had about 12 YoE at the time, and my manager didn't realize I didn't have a degree until after I was hired. Apparently it hadn't affected my offer, and he was more impressed than anything.
You say:
> SE is the set of techniques, practices, tools that we have collected over decades for producing multi-versioned software that meets a certain reliability rating. Not all of these is freely available online.
The same way there's no single guide on the internet on how to be the kind of engineer who builds reliable or extensible software, I don't think there's a guide hiding in the average CS curriculum.
Most of it consists of getting repetitions building software that involves the least predictable building block in all of software engineering (people), in all its various forms: from users, to other developers, to yourself (in the future), to "stakeholders", etc.
Learning how to predict and account for the unpredictability in all the people who will intersect with some facet of your software is the closest I've seen to a "universal method" for creating software that meets the criteria you defined.
And honestly I'd be concerned if someone told me you can just be taught some blessed set of tools and practices to get around it... that sounds a lot like not having actually internalized why they work in the first place, and the "why" is arguably more valuable than the tools and practices themselves.
mistrial9 31 days ago [-]
this is a challenging point of view.. on one side, "a job in the autonomous vehicle industry writing safety-critical software" sounds like one of the most slave-ish jobs in the world. This person had 100 other people checking every tiny result, plus automated testing frameworks and hundreds of pages of "guidelines" .. in other words, the least creative and most guard-checked software possible.
On the other hand, an open and level playing field does not exist in the thirty-some odd years of open markets software development. No one since Seymour Cray has done complete systems design, really.. it is turtles all the way down. You have to get hardware to run on, and the software environment is going to have been defined for that.. CPU architectures and programming languages. People who write whole systems generally do it in teams.
The arrogant and self-satisfied tone of the corporate worker-bee says that there is no such thing as real software engineering skills?
like defining "health" or other broad topics.. the closer the topic is examined, the more holes in the arguments. I am glad I never punched a time clock for Elon Musk, however, all things considered.
BoorishBears 31 days ago [-]
You write too poorly to be this condescending.
mistrial9 31 days ago [-]
this is my real reaction to the post .. but conversation here could be more inquiring.. to find insight. My bad.. no happiness
BoorishBears 31 days ago [-]
Digesting your thoughts before vomiting out a reaction is allowed.
optimalsolver 32 days ago [-]
There are plenty of self-taught people in the open source space making highly reliable software.
UncleEntity 32 days ago [-]
...who don't get hired at "real" jobs because they can't produce a bubble sort in 15 minutes on a whiteboard.
I feel very fortunate that the core blender devs had the patience to put up with my stupid amateur mistakes while I learned the skills to become a helpful contributor back in the day.
taurknaut 32 days ago [-]
The vast majority of people learn this on the job. This is certainly not taught in schools (or is only barely scratched as a topic).
steveBK123 32 days ago [-]
Sure. And that's why SWEs will be fine in the world of AI, as the rote work is more easily automated.
The contrast is that for a lot of other jobs, the rote tasks are not routinely solvable with free online content in text form.
BeetleB 32 days ago [-]
I'll bite. Can you list specific things not freely available online?
bakuninsbart 32 days ago [-]
I would agree that the products coming out so far lack imagination, but hard disagree on the impact. LLMs have completely transformed multiple industries already. In SWE, I would estimate that junior positions shrank by 70-80%, but even that is less extreme than what is going on in other industries.
In marketing, the entire low-end to mid-tier market is gone. Instead of having teams working on projects for small to mid-sized companies, there's now a single Senior managing projects with the help of LLMs. I know multiple agencies who cut staff by 80-90% without dropping revenue.
Translation (of books, articles, subtitles) was never well paid, even for very complex and demanding work. My partner did it a bit on the side, mostly justifying the low pay with some moral blah about spreading knowledge across cultures... With LLMs you can completely cut out the grunt part. You define the hard parts (terms that don't translate well), round out the edges and edge out the fluff, and every good translator becomes two to ten times more productive. Since work is usually paid by the page, people in the industry got a very decent (at least temporary) pay jump, I would imagine around 100%.
Support is probably the biggest one though. It is important to remember that outsourcing to India only works for English-speaking countries. And even that isn't super cheap.
Here in Germany, if you don't have back-up wealth, it is your constitutional right to get some support from the state (~1400 euro), but you are obligated to find a job as soon as possible, and they will try to help you find a role. Support was always one of the biggest industries to funnel people towards. I talked to a friend working there, and according to them the industry has basically stopped advertising new positions; the only ones left are in financial services. The rest went all in on LLMs and just employ a fraction of the support staff to deal with things that escalate far enough.
And that's not even touching on all the small things. How much energy is spent on creating pitch decks, communicating proposals, writing documentation etc? It probably goes up as far as 50% of work in large orgs, and even if you can just save 5% of your time by using LLMs to phrase or organize, there is a decent ROI for companies to pay for them.
advael 32 days ago [-]
I think a lot of this is because the economic pressure is weak right now both on the side of labor and consumers, due to decades of severe upward wealth transfer. A lot of these companies are not improving or even maintaining their productivity or quality of service, and while there are probably some productivity gains for engineers, I suspect based on what I'm seeing that this is going to burn a lot of people out, as there is significant social pressure both from peers and employers to exaggerate this somewhat. People can have too heavy a workload for a decent amount of time before breaking.
There's just no countervailing force to make these decisions that immediately painful for them. Sectors are monopolized, people are tired and desperate, tech workers are in a basically unprecedented bout of instability.
The situation is super dark from a lot of angles, but I don't think it's really "the overwhelming usefulness of AI" that's to blame here. As far as I can tell, the biggest thing these technologies are doing is providing a cover story for private-equity-style guttings of various knowledge work verticals for short-term profit, which was kind of inevitable given that's been happening across the board in the larger economy, it's just another pretense that works for different verticals.
There are cases where LLMs seem really genuinely useful (Mostly ones that are for and by SWEs, like generating documentation or smoothing some ramp processes in learning new libraries or languages) and those don't seem to be "transformative" at scale yet, unless we count "transforming" many products into buggier products that are more brittle and frustrating to interact with
dorgo 32 days ago [-]
>I know multiple agencies who cut staff by 80-90% without dropping revenue.
I'm finding it hard to reconcile this with my own experiences. My whole team (5 people) left last year (for better pay, I guess) and the marketing agency in Germany I'm working for had to replace them with freelancers. To offset the cost they fired the one guy who was hired to push the whole LLM/AI topic.
We managed to fill one junior position by offering 10k+ more than in their last job. The firm would love to hire people to replace the freelancers.
We had to cut costs lately, but mostly that meant closing the kitchen, which wasn't used due to the work-from-home policy.
Definitely don't see any staff reduction due to automation / LLM use. They still pay (external) people 60€ per written text/article, because clients don't like LLM-written text.
torginus 32 days ago [-]
Actually I have interacted with multiple translators in multiple industries and I haven't seen any disruption (although I agree with your statement that it was never well paid)
- Simultaneous translation at political/economic events still needs a person, as it ever did
- LLMs are nowhere near the level to be able to translate fine literature at a high enough quality to be publishable
- Translating software is still very hard, as the translator usually needs a ton of context/reference for commonly used terminology - we partnered with a machine translation company, and what they produced sucked balls.
I have friends who work as translators, and we make use of translation services as a company, and I haven't seen the work going away.
achierius 32 days ago [-]
> I would estimate that junior positions shrank by 70-80%
This just isn't true, it's nowhere close.
tonyedgecombe 31 days ago [-]
>LLMs have completely transformed multiple industries already.
If this were true we would see the results in productivity and unemployment stats. We don't though; so far the effect hasn't registered.
physicsguy 32 days ago [-]
We're trying to use it for industrial apps. It's been over a year of R&D, with some good but often mixed results. Adherence to prompts is a big issue for us. It's most useful not as a chatbot but for giving explained descriptions of what the user is seeing, so they don't need to dig down into 20 graphs and past history. That necessitates being able to refer to things with URIs, which works 95% of the time, but the 5% is killer since it's difficult to detect issues and it leads to dead links.
frankfrank13 32 days ago [-]
I tried to build a BIG E2E automation pipeline, along the lines of "replace a team of 5 with this one simple task". And as I was doing it, all I could think was: just use ChatGPT. Sure, it can't actually automate what you're doing, but it'll get you there 80% as fast as fully automated, with 90% less risk of error/nonsense at the end. Ironically, this company blocks all LLM websites; they even block GitHub on their employees' computers.
tamersalama 32 days ago [-]
I'm curious about your approach and the nature of those industrial apps. Is it more recommender agents accessing available sources (through URIs), or more like explainers? Would be great to connect: https://shorturl.at/xdOee
groby_b 32 days ago [-]
Claude is mostly used by software engineers. That's an important distinction to make.
I love Claude, but let's not ignore that in the LLM race, they're not exactly the leading player.
throwaway2037 32 days ago [-]
Can I ask a dumb question as an LLM newbie? What is it about Claude that makes it so good at basic software engineering tasks? Do you think it was fine-tuned to be good at these tasks? No joke/trolling: a bunch of people have posted on HN in the last 6 months about creating MVPs (Minimum Viable Products) -- usually web apps -- using Claude. As a non-web-app programmer, I think this is amazing progress!
NervousRing 32 days ago [-]
I think it understands the context better and it was possibly fine-tuned better. I have been using GPT since 3, and while the replies have obviously gotten more accurate, it still makes weird assumptions at times, whereas Claude seems to "get it" more often. In tasks other than coding, I've found GPT to be more detailed by default, and yet Claude seems to hit the mark better.
psytrancefan 32 days ago [-]
IMO it was just the strongest model for a while for programming. It got the answers right more often than not.
It is faster than the reasoning/chain of thought models. With current o1 and DeepSeek though I haven't logged into Claude in a few weeks.
I have no inside knowledge but I am kind of expecting Sonnet chain of thought any day now and I am sure that will be incredible.
FergusArgyll 32 days ago [-]
This is gonna sound strange but:
Anthropic's LLMs always (always? at least since 2) have a distinctive "personality". I obv don't know how to quantify it or what "it" really is, but if you've used it you might know what I mean. Maybe that "personality" is conducive to SWE?
frankfrank13 32 days ago [-]
Fair, but do you think OpenAI and Gemini are going to be directionally similar? How much of OpenAI's traffic is from Copilot and other related tools, for example? My local IDE probably generates more queries a day than (pick a profession, idk, nurse? insurance sales? construction worker?) does in a month!
psytrancefan 32 days ago [-]
I would be pretty shocked if the Anthropic reasoning model is not mind blowing and doesn't take the lead back.
flessner 32 days ago [-]
But "AI" tools have more or less seeped into every mainstream product... this is a strong "defensive move" for companies in anticipation of more to come.
We aren't leaving MS Office or Adobe because they already pushed out some minimal innovation. But what about the products you don't even know about? For lawyers, doctors, logistics, sales, marketing, woodworkers, handymen? In Europe or Asia?
A new product bringing true innovation could easily push out a legacy business on "shiny new thing" (AI) appeal and better UX alone. A lot of software in these areas simply hasn't improved in 10 years - with a great idea and a dedicated team it's a landslide waiting to happen.
tzury 32 days ago [-]
Claude is indeed far more familiar to software engineers.
Google's Gemini integration into their Docs/Sheets/Slides and Gmail will perhaps show different demographics in a few months, and that is before we hear from OpenAI.
frankfrank13 32 days ago [-]
You may be right, but I doubt it. I suspect similar usage metrics for Gemini and OpenAI
Ancalagon 32 days ago [-]
Maybe spend and better models will help this (I've not used the deep research models, so maybe we are there already). But even in day-to-day coding, the LLMs are great helpers, yet give them anything more than a slightly complicated prompt and it seems like these models become completely helpless. You just constantly need a human in the loop because these models are too dumb or lack the context to understand the big picture.
Maybe these models will get better as they’re given more context and can understand the full stack but for now they cannot.
And this is just with code where it already has billions of examples. Nevermind any less data-rich fields. The models still need to get smarter.
ravmachre 32 days ago [-]
I don't think it's necessarily because of a lack of generalizability. We (SWEs) built it, so we naturally have the most intimate knowledge of how to dogfood/use it. And so the cycle intensifies (use, provide feedback, improve). There are many positive examples of LLMs being useful in document-based workflows in other domains as well!
frankfrank13 32 days ago [-]
Maybe! But you could say the inverse of lots of things that SWEs built. SWEs built the Bloomberg Terminal! SWEs built CRMs! I think it's at least possible that LLMs are VERY useful for SWEs and a small number of other professions, but unlikely to massively scale beyond that.
dleink 32 days ago [-]
If you were on the early internet talking to someone about music or woodworking or whatever, you could reasonably assume they were a tech person because it was not simple to get online. It took a minute for it to spread.
salynchnew 32 days ago [-]
Daniel Rock has done some interesting work on the ROI of AI in general (also, I believe two of his papers are referenced in this study). Note that this doesn't explicitly restrict itself to covering LLMs, but... still a very interesting body of work.
My term for this is "Whitey's goin' to the data center". We are looking at an arms race, where there really is genuine new technology and it will make a difference - but at the 1-2% per annum of an economy level. Compounded over fifty years that is geopolitical dominance, yes, but it's not "machines of loving grace" level growth.
We already have thousands of geniuses working across our economies and teaching our youth. The best of our minds have every year or so been given a global stage in Nobel speeches. We still ignore their arses and will ignore it when AI tells us to stop fighting or whatever.
The real issue here is that wafer-scale chips give 900,000 cores, and nothing but embarrassingly parallel code can use them - and frankly no coder I know writes code like that. We have to rethink our whole approach now that Moore's law is over. Only AI has anything like the ability to use the processing power being built today - the rest of us could stick to cores from 2016 and nothing would change.
Throwing hundreds of billions at having a bad way to program 1 million cores, because we have not rethought software and businesses to cope, seems wrong - both because "Whitey" can spend it on better things and because it is an opportunity. Imagine being 900,000 times faster than your competitors - what does that even mean?
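For the non-coders: "embarrassingly parallel" means work that splits into fully independent pieces, which is roughly the only shape of work that soaks up a million cores. A toy sketch in Python (purely illustrative):

    from multiprocessing import Pool

    def score(record):
        # each record is processed with no communication between
        # workers, so adding cores scales almost linearly
        return sum(ord(c) for c in record)

    if __name__ == "__main__":
        with Pool() as pool:  # one worker per core, in principle
            results = pool.map(score, ["alpha", "beta", "gamma"] * 1000)
        print(len(results))  # 3000

Most real programs aren't shaped like this, which is the whole problem.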
Edit: Trying to put it another way - there are two ways AI can help us. It can improve cancer treatments at every stage of medical care, through careful design and creation of medical AI models that can slowly ratchet up diagnosis, treatment and even research and analysis. This is human organisations harnessing and adapting around a new technology.
Or AI can become so smart it just invents a cure for cancer.
I absolutely think the first is going to happen and will benefit the denizens of the first world first. The second one requires two paradigm-shifting leaps in the same sentence. Ten years ago I would have laughed in Anthropic's face. Today I just give it a low probability multiplied by another low probability - and that is an incredible shift.
jszymborski 32 days ago [-]
I mean, are any of us shocked that folks who work with computers or are computer enthusiasts are early adopters of LLMs?
I feel like this has less to do with what LLMs are best at and more to do with which folks are most likely to spend time using a chatbot.
kanbankaren 32 days ago [-]
> I fear the economic reality on the backside of this kind of spend.
Minor nitpick. Use of the word 'spend' as a noun is not widespread and not well known.
frankfrank13 32 days ago [-]
Yeah fair, I forget HN is very international, this may read as just straight up weird
throwaway2037 32 days ago [-]
As someone who works in finance, I would disagree. I asked ChatGPT:
Is the noun spend rare?
ChatGPT said:
The noun "spend" is relatively rare compared to its more common form as a verb. While "spend" is widely used as a verb (meaning to give money or time for something), as a noun, it refers to an expenditure or the act of spending, and it’s not as commonly encountered.
In most contexts, people would use alternatives like "expenditure," "spending," or "outlay" instead of "spend" as a noun. That said, it is still used occasionally in certain contexts, especially in financial or informal language.
kanbankaren 32 days ago [-]
Well, ChatGPT is making the same point. Not well known outside the financial industry.
The majority of the audience and posters of ycombinator are not in that industry group, right?
c0redump 32 days ago [-]
“Spend” is a common term in advertising, which is arguably the single largest employer of software engineers
brap 33 days ago [-]
Seems like Anthropic has too much money on its hands and is looking for ways to spend it. It's surprising to see lean AI startups accumulate fat so quickly. Usually this sort of wheel-spinning is reserved for large corporations.
And it’s not just them. To me this trend screams “valuations are too high”, and maybe hints at “progress might start to stagnate soon”.
raldi 33 days ago [-]
Anthropic is a Public Benefit Corporation whose governance is very different from a typical company in that it doesn’t put shareholder ROI above all else. A majority of its board seats are reserved for people who hold no equity whatsoever and whose explicit mandate is to look out for humanity.
This is why I cancelled my ChatGPT subscription and moved to Claude. It's kinda silly, but I feel like the products are about equivalent for my use case, so I'd rather do business with a company that is acting in good (better?) faith.
mupuff1234 33 days ago [-]
Don't think that's silly at all.
the_sleaze_ 32 days ago [-]
Hope not - I haven't purchased a Nestle brand in years for this exact reason.
rvnx 32 days ago [-]
In case they don't get high salaries from this activity, there is also a solution. The next step in ~10 years could be to offer their services to governments in the form of "automated court decisions".
Then the people who funded / trained this "justice" out of their good hearts would actually have leverage, in terms of concrete power.
It's a much more subtle way to capture power, if you can replace the judges with your software.
saagarjha 32 days ago [-]
Anthropic pays their engineers pretty well. They're doing just fine, at least for as long as people are pouring money into their company. But that's everyone in this space, isn't it?
UncleEntity 32 days ago [-]
I guess they can get them to rewrite the US Constitution to remove that pesky "fair trial" bit and, since they would control the narrative, delete 1000+ years of common law.
Brave new world, indeed...
rafram 32 days ago [-]
Thanks but no thanks.
mostlysimilar 32 days ago [-]
That isn't silly, that's one of the only ways to exercise agency under hypercapitalism. I recently cancelled my Amazon Prime membership and got a Costco membership for the same reason. I don't get every product I want, but I'm also okay with that.
asdasdsddd 32 days ago [-]
This has to be a meme. Costco is peak hypercapitalism lol.
mostlysimilar 32 days ago [-]
Could you say more?
asdasdsddd 32 days ago [-]
It's a 500B company that undercuts everyone else with incredible efficiency, just like Amazon. It's an example of how capitalism can be great. If you really want to get out of capitalism, you can just buy directly from farmers or grow your own food.
The whole thing about no ethical consumption under capitalism is just a way to enjoy the conveniences of capitalism on a moral high ground. It's totally doable, you just might not enjoy it haha.
mostlysimilar 32 days ago [-]
I guess the angle I was coming at it from is that they pay their employees a living wage. I need to buy toilet paper from somewhere, and between Amazon and Costco I would much rather give my money to Costco.
asdasdsddd 32 days ago [-]
The secret is buying a bidet so you don't need to buy from either ever again!
UncleEntity 32 days ago [-]
Hell, just buy from Wallyworld where you get low, low prices and pseudo-socialism, with their employees on food stamps.
The camel's gotta get its nose in the tent somehow.
mppm 33 days ago [-]
I'm not sure if you are being sarcastic or not, but the practical upshot of this new "Public Benefit Corporation" thing, with or without a trust or non-profit attached, is that you can tell both the public and your investors to fuck off. The reason why all the big AI startups suddenly want to use it is because they can. Normally no sane investor would actually invest in such a structure, but right now the fear that you might be left out of the race for humanity's "last invention" is so acute that they do it anyway. But if Dario Amodei actually cared about humanity any more than Sam Altman, that would be the surprise of the year to me.
raldi 33 days ago [-]
Can you imagine a hypothetical AI company that did care about humanity, and if so, how would it look different from Anthropic?
Being available for use by militaries is incredibly irresponsible, regardless of what scope is specifically claimed, because of the inherent gravity of the situation when a military is wrong. The US military maintains a good deal of infrastructure in the US; putting into their hands an unreliable, incompetent calculator puts lives at risk.
It would be structured as a non-profit (there are no teeth to a PBC; the structure is entirely to avoid liability, and if you have no trust in the executive body of an organization, it has zero meaningful signal).
It would have a different leadership team.
It would have a leader who could steelman his own position competently. Machines of Loving Grace was less redeeming than Lenat's old stump speeches for his position, despite Amodei starting up in an industry significantly more geared for what he had to say, and Lenat having an incredibly flexible sense of morality. Its leader would not have a history of working for Chinese companies and then jingoistically begin advocating for export controls.
It would have different employees than the people I know who are working there, who have a history of picking the most unethical employers they can find, in a fashion not dissimilar to how Illumination Entertainment's "Minions" select employers.
erikerikson 32 days ago [-]
You seem to misunderstand benefit corporations. They remain committed to profit and are just as subject to their board and officers as any other corporation.
There are sane investors that prefer investing in companies that adopt these corporate structures. Based on data, those investors see public benefit corporations as more profitable and resilient. They are able to attract employees and customers that would otherwise not be interested or might be less interested.
manquer 32 days ago [-]
The attempt is commendable, but the agency problem is well understood and none of these alternative structures have really solved it.
Stock-based compensation evolved from this thesis; it's quite common in the valley, and it's why almost all OpenAI staff wanted Sam Altman back even though the nonprofit board did not.
Aligning key talent's compensation to enterprise value is only viable in unrestricted for-profit entities; any structure with limits (capped-profit, public benefit corporation, nonprofit, trust, 501(c)s, etc.) does not work as well.
Talent will then leave for a for-profit entity that can offer better compensation than a restricted one, because it shares a % of its enterprise value, which restricted entities either cannot do or cannot match in liquidity/value [1].
---
[1] This is why public companies are more valuable for RSUs/options than private companies, and why cash-flow-positive companies like Stripe still raise private money just to give liquidity to employees.
idiotsecant 33 days ago [-]
Put this and 'don't be evil' and 5 dollars in my hand and I'll give you a cup of coffee.
nonchalantsui 33 days ago [-]
Coffee for $5? That's a steal in this economy!
flurie 32 days ago [-]
The coffee is made with the assistance of AI, which means some nonzero portion will be something other than coffee, but at least it means every sip is an adventure.
amarcheschi 32 days ago [-]
This is one of the funniest takes on AI I've read; it could've been out of a video game like The Outer Worlds, with its absurd takes on crapitalism.
It's not the best choice, it's Spacer's Choice!
idiotsecant 30 days ago [-]
Isn't there an SCP where occasionally it spits out liquid magma or strange matter or something?
beepbopboopp 33 days ago [-]
The exact opposite. Relative to ChatGPT, Anthropic has an enormous "brand problem." What they should be doing is exclusive deals like this, but with large publishers on a recurring basis, while figuring out how to teach consumers who they are and how to use them best. For like 99% of the use cases all these products are at parity, and the real business gains are finding a way into consumers' lives.
Semi-relevant sidenote: ChatGPT spent $8M on a Super Bowl commercial yesterday just to show cool visualizations, instead of any emotional product use case, to an ultra-majority audience that has never had a direct experience with the product.
These companies would be best served building a marketing arm away from the main campus in a place like LA or NY to separate the gen pop story from that of the technology.
peterlk 33 days ago [-]
I disagree. I think Anthropic, like the other big players, is trying to get some of that government money. Releasing policy-adjacent papers seems like a way to alert government officials that Anthropic ought to be in the room when the money starts changing hands.
noah_buddy 33 days ago [-]
I am inclined to agree. If you’re at the precipice of automating or transforming knowledge work and the value for being the first is nearly infinite (due to “flywheel effects”), why would you dedicate any energy to studying the impact of AI on jobs directly? The thesis is everything changes.
I think AI in its current iteration is going to settle into being like a slightly worse version of Wikipedia morphed with a slightly better version of stackoverflow.
lblume 33 days ago [-]
I think that strongly underestimates the impact LLMs, especially reasoning models, have on how code is written today.
noah_buddy 33 days ago [-]
Educate me. I find them useful but they are less so when you try to do something novel. To me, it seems like fancy regurgitation with some novel pattern matching but not quite intuition/reasoning per se.
At the base of LLM reasoning and knowledge is a whole corpus of reasoning and knowledge. I am not quite convinced that LLMs will breach the confines of that corpus and the logical implications of the data there. No “eureka” discovery, just applying what we already have laying around.
lblume 32 days ago [-]
Let's say I can't fully disclose the details because it is an area I am actively working on, but I had an algorithmic problem that was already solved in an ancient paper, yet after a few hours of research I could find no open implementation of it anywhere. I thus spent quite some time re-implementing this algorithm from scratch, but it kept failing in quite a few edge cases that should have been covered by the original design.
Just to try it out, I uploaded the paper to DeepSeek-R1 and wrote a paragraph on the desired algorithm, that it should code it in Python and that the code should be as simple as possible while still working in exactly the way as described in the paper. About ten minutes later (quite a long reasoning time, but inspecting the chain of thought, it did almost no overthinking, but only reasoned about ideas I had or should have considered) it generated a perfect implementation that worked for every single test case. I uploaded my own attempt, and it correctly found two errors in my code that were actually attributable to naming inconsistencies in the original paper that the model was able to spot and fix on the fly. (The model did not output this, this I had to figure out myself.) I would have never expected AI to do that in my lifetime just two years ago.
I don't know whether that counts as "novel" to you, but before DeepSeek, I also thought that Copilot-like AI would not be able to really disrupt programming. But this one experience completely changed my view. It might be the case the model was trained on similar examples, but I find it unlikely just because the concrete algorithm cannot be found online except for the paper.
james_marks 32 days ago [-]
This fits my experience. When the information is encoded somehow already, LLM’s excel at translating to another medium.
Combined with the old “nothing new under the Sun” maxim, in that most ideas are re-hashes or new combinations of existing ideas, and you’ve got a changed landscape.
asadotzler 32 days ago [-]
clearly NOT novel as you so clearly explained, "an algorithmical problem that was already solved in an ancient paper"
lblume 32 days ago [-]
Well, of course. Realistically, I would not expect AI systems like this to be very useful for novel cutting-edge scientific results, proving mathematical theorems etc. in the next few years.
But this is not the majority of what software developers are doing and working on today. Most have a set of features or goals to implement using code satisfying certain constraints, which is what current reasoning AI models seem to be able to do very well. Of course, this test was not rigorous in any meaningful way, but it really changed my mind on the pace of this technology.
hansonkd 32 days ago [-]
I think the trap people fall in is that LLMs don't need to be novel or reason as well as a human to revolutionize society.
Plenty of value is already added just by converting unstructured data to structured data. If that is all LLMs did, they would still be a revolution in programming and human development. So much manual entry and development work has essentially evaporated overnight.
If there were never a chat-based LLM "agent", LLMs just converting arbitrary text to a structured JSON schema would be the biggest advancement in comp sci since the internet. There is nothing equivalent that existed before, except for manual extraction or rule-based hard coding.
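To make that concrete, the whole trick fits in a dozen lines (a sketch using the Anthropic Python SDK; the model alias and prompt are illustrative, and production code would validate and retry on malformed JSON):

    import json
    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    note = "mtg w/ dana tues 3pm re: q3 budget, loop in sam if free"
    msg = client.messages.create(
        model="claude-3-5-sonnet-latest",
        max_tokens=200,
        system='Reply with JSON only, matching {"title": str, "attendees": [str], "when": str}.',
        messages=[{"role": "user", "content": "Extract the event from this note: " + note}],
    )
    event = json.loads(msg.content[0].text)  # unstructured scribble -> structured record
    print(event["attendees"])  # e.g. ["dana", "sam"]

That used to take a rules engine or manual extraction.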
Judging LLMs based on some criteria of creativity or intuition from a chat is missing the forest for the trees.
BeetleB 32 days ago [-]
> find them useful but they are less so when you try to do something novel.
Well over 90% of work out there is not novel. It just needs someone to do it.
Because that research helps you understand your market and where the value generation is. This can expose where to better invest.
aqueueaqueue 32 days ago [-]
A lot of assumptions there. Why isn't Ford the only motor company?
And if the flywheel is that AI begets AI exponentially in an infinite loop then those share certificates you own probably won't be worth much. The AI won.
Coincidentally, Anthropic's mission is AI safety.
tinyhouse 33 days ago [-]
Understanding who is using your product is wheel spinning?
SpicyLemonZest 33 days ago [-]
I don't see it. This is just an analysis of how Anthropic customers are using the product and what investment areas seem most promising in the future - why wouldn't they want that?
nerdponx 33 days ago [-]
It's clearly more than an interesting tech blog post written by one of the data guys in their spare time. It's an "initiative".
That said, this doesn't seem like completely superfluous "fat" like what Mozilla does. It seems very much targeted at generating interesting bits of content marketing and headlines, which should contribute to increasing Anthropic's household brand-name recognition vs. other players like OpenAI, as well as making them seem like a serious, trustworthy institution rather than a rapacious startup that has no interest in playing nice with the rest of society. That is: it's a good marketing tool.
My guess is that they developed it internally for market research, and realized that the results would make them look good if published. Expect it to be "sunset" if another AI winter approaches.
sofixa 33 days ago [-]
On the contrary, this is very important information to have, in order to understand your customer base and how sticky you are with them, what features you need to focus on, etc etc
castigatio 32 days ago [-]
We live in a world where there's a lot of talk about how AI might impact societies and economies - but little actual data. To me it seems very worthwhile to try to add 'any' data to that discussion and track how things change over time. Are reports of economic or labour trends pointless? Should companies not track how people use their products? I don't think it costs Anthropic much to do this - it's work for a couple of people to analyze their database.
amazingamazing 33 days ago [-]
Idk, the models themselves are quickly becoming a commodity. It makes sense to spend money figuring out go-to-market rather than just improving the models themselves.
laidoffamazon 32 days ago [-]
I would argue this is within their overall objective. It’s not like Stripe creating a publisher (??)
blackeyeblitzar 32 days ago [-]
They only have like 500 employees. And you could argue this is part of their stated mission.
Artemon 32 days ago [-]
only?
throwaway954399 32 days ago [-]
And yet they don't have the resources to let job applicants know when their application was unsuccessful. You just get an email after you applied saying: "We may not reach out unless we think you are a strong fit for the role you applied to. In the meantime, we truly appreciate your patience throughout our hiring process." They also tell you not to use AI in the application.
AnEro 33 days ago [-]
Everyone in the comments seems to dislike the article or see it as a waste of time. I just don't think we are the audience they wanted for this: I think they want to show the average business owner the realistic potential, and to show the public (via the journalists who will distill this later) that they are aware of the impacts and what to expect.
I don't read it as "fear AI"; I read it as "change is happening because of AI".
ajmurmann 33 days ago [-]
They also call out how it's more used for augmentation rather than full automation which will address some concerns the public has.
s_dev 32 days ago [-]
Tools empower those with knowledge further than those without knowledge. The fact that people were concerned that laymen were simply going to be able to take on experienced programmers at their day jobs was farcical.
neom 32 days ago [-]
I'm not a dev but I thought the issue from a senior dev perspective was AI on AI, not layman on AI?
ajmurmann 32 days ago [-]
It seems like you're getting downvoted, but I think you touch on something important. IMO, right now there are two limitations on AI replacing experienced developers:
a) It's not good enough at programming. It sometimes goes down rabbit holes and cannot get out. In other cases it comes up with ridiculously complicated solutions that could be solved much simpler.
b) Making assumptions instead of gathering requirements.
I suspect that a) will get better over time. I also suspect that b) can be addressed by a pre-programmed prompt-flow that uses an LLM to gather requirements from a PM and ask probing questions to get a well-defined scope and agree on how edge cases should be handled (a rough sketch of what I mean is below). It doesn't seem far-fetched that an AI would also be able to call out small requirement changes that might allow for much simpler/faster solutions.
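Something like this (a sketch using the Anthropic Python SDK; the model alias, prompts, and loop bound are all illustrative assumptions):

    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    history = [{"role": "user", "content": "Feature request: let users export reports."}]

    # The model either asks the PM one probing question or emits a final spec.
    for _ in range(5):
        reply = client.messages.create(
            model="claude-3-5-sonnet-latest",
            max_tokens=300,
            system="You gather software requirements. Ask ONE clarifying "
                   "question at a time; once scope and edge cases are pinned "
                   "down, answer with a single line starting 'SPEC:'.",
            messages=history,
        ).content[0].text
        if reply.startswith("SPEC:"):
            print(reply)  # well-defined scope, agreed edge-case handling
            break
        history.append({"role": "assistant", "content": reply})
        history.append({"role": "user", "content": input(reply + "\n> ")})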
neom 32 days ago [-]
That is also what I think will happen. I mean, these tools right now are just capturing market share with funding rounds and hacks, surely? Wrapping any foundational model and trying to scope it down is always going to suck, but once people start to truly nail training in only the information required to do that job (I described o3-high-mini or whatever it's called as a dumb finance bro with no depth to my wife), and then couch the task LLMs within orchestration LLMs, surely things improve?
SilasX 32 days ago [-]
I honestly don’t know who the audience could be, other than “people who like to tell others they’re in the know because they read AI companies’ press releases.”
At no point do I see an actual elevator pitch/tl;dr/summary of what the frak this index actually is, except that it’s part of some effort to track AI adoption. It just rains down figures about which industries are using how much AI without first grounding the new concept they’re introducing.
When you say you have a new economic index, you need to give me a number, how I should interpret that number, and where it comes from. I don’t see that.
GDP: a measure of a country's total economic output, made by adding up end-product purchases.
CPI: the general price level, taken as a weighted average of prices throughout the economy.
Big Mac index: how expensive goods are in a country relative to the US, by reference to the local cost of a Big Mac converted through the exchange rate.
Here I expect something like “the economic output-weighted fraction of production taken over by AI”, but instead it’s just a list of AI adoption by industry.
Why introduce an index and not headline with a definition of an index? Which audience prefers that?
ajmurmann 33 days ago [-]
Awesome that they are releasing this paper and the associated data. I hope they'll do this regularly, so that changes can be tracked.
One thing I hope they'll correct going forward is inclusion of API usage. Anecdotally, I only use Anthropic models via Cursor. So none of that usage shows up in here. I'd expect that specialized tools/interfaces like Cursor will grow and thus more usage will shift to API. It would be a shame to miss out on that in the data set.
cruffle_duffle 32 days ago [-]
Tools like Cursor have to be a huge chunk of their traffic. I mean, Cursor and Sonnet are like two peas in a pod.
Even if they don't train on the data, they could break it down by user agent / API client ID and infer something about Cursor traffic.
bwhiting2356 32 days ago [-]
I expect more 'automation' to happen through the API than 'augmentation'.
rsanek 33 days ago [-]
Haven't they committed to not training on / using data submitted through the API?
rafaelmn 33 days ago [-]
Feels like a page of graphs where the Anthropic team discovers that Claude is the best coding model and is used mostly by devs. And they have no penetration in the general population compared to OpenAI.
CL_ergo 32 days ago [-]
I laughed out loud at their chart showing that Sonnet had a higher share of coding questions, whereas Opus had more writing.
If they just looked at their product, they'd see that the model description literally says Opus is better for writing. If you advertise one of your models as geared for task X, the insight that people use it more for task X isn't really an insight.
mlinhares 33 days ago [-]
I read it that way as well. I think this mixes up "there is no penetration in that market" with "we have not been able to get these people to use our tools".
hall0ween 33 days ago [-]
Does OpenAI have similar data available to compare to?
iagooar 32 days ago [-]
I used to use Claude.ai as my go-to LLM for everything. But then my conversations around taxes and finance very frequently got patronized by the LLM, and even flagged. All legal stuff! It's just that my personal tax situation is a bit more complex than other people's because of businesses I run and geographic complications (living in more than one country, etc).
It got to the point where I was forced to go to ChatGPT if I wanted to just be left alone and get my answers. Then o1, o1 pro, o3-mini and Deep Research dropped and I have almost no reason to go back to Claude anymore. These days my main use case is using it as part of Cursor for code generation / co-piloting. But that's it.
If Anthropic wants to get me back, they should treat me as an adult again.
Havoc 32 days ago [-]
That tracks well with my subjective experience.
At day job - finance/office stuff - essentially zero traction despite everyone having enterprise AI subs & brainstorming sessions about use cases etc.
Then go home & do some hobby coding and suddenly it's next level useful.
It's not that the one is harder than the other, but rather that many jobs don't have an equivalent of a code base. The AI could, I think, grok parts of the job, but typing up the relevant content and what is required would take much longer than doing the task. There is nothing to copy & paste for a quick win in the same way as code.
slama 33 days ago [-]
> Overall, we saw a slight lean towards augmentation, with 57% of tasks being augmented and 43% of tasks being automated.
I'd like to see a comparison to the data 6 months ago, before Sonnet 3.5. I suspect the automation rate will track up over time, but that may mostly be captured by API usage which isn't in the dataset.
xnx 33 days ago [-]
I wouldn't expect Anthropic to have any special knowledge/insight in this area aside from the data they have on usage of their own tools. As such, I'd be much more interested to see some "Google Trends"-like data about Anthropic usage. Unfortunately, since AI is dynamic and competitive industry, I don't think anyone will be sharing information like that for a long time.
827a 33 days ago [-]
Honestly, simply releasing a graph showing the trendline of their own metrics on e.g. inference would speak volumes. They have this data already, 100%; it's on some Grafana dashboard the on-calls are watching every day. My suspicion is that some AI providers (OAI/Google) are watching these metrics go up and to the right quite consistently, but Anthropic's aren't doing that.
> I would say it is trust worthy because if it were found to be gamed then Anthropic’s reputation would crater.
But on the other hand, how would we find out that they've gamed the numbers, if they were gamed? Unless you work at Anthropic and have abnormally high ethics/morals, or otherwise private insight into their business, it sounds like we wouldn't be able to find out regardless.
_nalply 33 days ago [-]
I do wonder how much of the population is using AI.
On page 7 of the paper there's the diagram "Minimum fraction of tasks in use". On the left side about 75% of occupations use at least one task, and on the right side the maximum is some occupation that uses slightly more than 95% of the tasks.
Cool.
Here I start to wonder how they got that graph.
At the start of section 3. Methods and analysis on page 4 it's said:
> To understand how AI systems are being used for different economic tasks, we leverage Clio [Tamkin et al., 2024], an analysis tool that uses Claude [Anthropic, 2024] to provide aggregated insights from millions of human-model conversations. We use Clio to classify conversations across occupational tasks, skills, and interaction patterns, revealing breakdowns across these different categories. All analyses draw from conversation data collected during December 2024 and January 2025.
So this means they use real people's chats to make these estimations. I don't know Clio, but perhaps they did this: they sample chats from individuals, and some individuals never chatted while some individuals delegated all their work to Claude. But I wonder how they estimated the total number of tasks of an individual.
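If I had to guess, the per-conversation classification step looks something like the sketch below (pure speculation on my part, not Clio itself; ask_llm stands in for whatever model call they use, and the task list is made up):

    # Rough guess at the shape of the classification step (not Clio itself).
    TASKS = [  # O*NET-style task statements; examples, not the real taxonomy
        "Develop, test, and maintain software applications",
        "Prepare tax returns for individuals or small businesses",
        "Write and edit marketing copy",
    ]

    def classify(conversation: str, ask_llm) -> str:
        # ask_llm: any text-in/text-out model call (hypothetical stand-in)
        menu = "\n".join(f"{i}: {t}" for i, t in enumerate(TASKS))
        reply = ask_llm(
            "Which ONE occupational task best matches this conversation? "
            f"Answer with the number only.\n{menu}\n---\n{conversation}"
        )
        return TASKS[int(reply.strip())]

Even with something like that, the denominator question (how many tasks an individual has in total) stays open.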
I am sure these answers are found by really going deep and reading the cited sources and running some experiments yourself, but I can't be bothered, sorry.
Again, I really wonder how much of total population use AI? How much? How do parts of population differ? Can this be found out at all?
andai 33 days ago [-]
I don't know about older people, but I'd wager about 100% of young people based on the hushed whispers echoing around the public library.
ramon156 32 days ago [-]
> While Claude.ai data contains some non-work conversations, we used a language model to filter this data to only contain conversations relevant to an occupational task, which helps to mitigate this concern.
That's nice. My main prompt has a hint suggesting when I refer to work. I do this so Claude can assume my tech-stack. Of course I exclude or mask confidential data, but it's still nice this stuff gets filtered.
corry 33 days ago [-]
IMO it's a mistake to get too caught up in the (admittedly self-described) goal of modelling AI's economic impacts.
Instead, this is a super rare and valuable look into who/what/how folks are doing with Claude across millions of conversations, nicely categorized by function and task.
The economic impact data (i.e. wages) that they might overlay onto that usage data is a separate thing that -- of course -- is more subjective and likely to be part of some PR machinery about the public value of AI etc.
But as to sharing the raw usage data itself - we should applaud it! What a useful window into how this stuff is being used in the real-world.
Will OpenAI release similar data? Why or why not? I hope they will. It elevates the discussion for everyone, and frankly would be 'good business' if it gets people thinking about who/how AI could be used at their organization with more granularity.
33 days ago [-]
ryao 32 days ago [-]
Their promise not to train on the conversations seemed to imply that conversations were private. It is disconcerting that they were not. The extent to which you need to look for loopholes in privacy guarantees these days is amazing.
neom 32 days ago [-]
Their privacy policy is EXTREMELY easy to interpret: https://www.anthropic.com/legal/privacy
They use personal data “to improve the Services and conduct research.” Your chat interactions (that is, your "Inputs and Outputs") are included in the data they collect. And: "If you include personal data in your Inputs, we will collect that information and this information may be reproduced in your Outputs."
You don't need to look for loopholes, it's spelled out plainly.
falafels 32 days ago [-]
I listened to an interview a while back with the researchers who did this analysis. They developed privacy-preserving techniques, avoiding having to read user conversations directly.
This is Anthropic we're talking about, they're rightly recognized as the 'ethical' AI company.
ryao 32 days ago [-]
There would be even better privacy if they published the weights, so that the model could be run by third parties with no-log policies or, even better, locally.
Cheer2171 32 days ago [-]
Words I typed into a service I paid for are now in a huggingface repo. Fuck that.
I've stopped believing any of this PR from AI companies that are trying to justify their huge valuations with some nebulous societal impacts that we have yet to see any trace of. Stop talking about the damn thing and show me the thing.
gmaster1440 33 days ago [-]
the entire premise of this economic index is that they're showing you actual usage and insights from millions of anonymized claude conversations.
ActionHank 33 days ago [-]
As PR
acomms 33 days ago [-]
Is there a way to do it as not PR?
ActionHank 32 days ago [-]
Yes, publish a paper or the stats.
Putting it on your company blog is marketing, always.
The only exception is PornHub insights, because they don't need to advertise and the people reading are there for the insights.
It's pretty great that the AI bros have gotten quieter now. It was frankly exhausting.
Instead of getting robots that do the laundry and clean the kitchen we got robots that do token work in a showroom at a BMW factory.
All the knowledge surfaced through LLMs was already mostly available online, they just make it more cohesive. It is better search.
Devs have figured out that creating a login page over and over is not a job, and that is now somewhat automatable.
Also everyone hates the name Devin now.
picafrost 33 days ago [-]
I think Anthropic does a good job highlighting the efforts they take to anonymize data. It’s a marketing risk even to be open about the use of this data.
There are things we say or write openly without caring who hears or reads it. Things we share with friends and family. Things we share with our closest friend, partner, or therapist. Finally there is our private heart which holds the things we’re not comfortable sharing with anyone.
I worry that LLMs are sufficiently anthropomorphic but not "real" enough to be privy to these latter thoughts. In the wrong hands this data is catastrophic at the individual level.
hn_acc1 32 days ago [-]
(not an experienced AI person, just a software dev who's used it a bit. Don't know all the players, every term for every model, etc.)
We have GitHub Copilot and Augment available for making suggestions inside VS Code. I don't think either is Anthropic - but I'm sure Anthropic offers a similar feature. I wonder if they count EVERY suggestion offered as a "use"? Sometimes it really helps, but it makes plenty of suggestions I ignore. Does it essentially treat every keystroke as a use, then, since it updates / re-suggests sometimes with every keystroke?
advael 32 days ago [-]
Anthropic pivoting to doing economics? Seems about as legit as most economics
Probably an overall smart move, since claiming to be doing economics sometimes leads to being positioned to make policy favorable to oneself
trash_cat 32 days ago [-]
I think it's not economics per se, but a responsible outlook on society as a whole. I think it is what sets Anthropic apart, as they do focus more on how AI affects society. That is why they have an emphasis on AI safety.
33 days ago [-]
rm_-rf_slash 33 days ago [-]
I’m not surprised most of their usage is for coding tasks, but I wonder if that reflects a kind of selection bias.
HN and programming subreddits rave about Claude for coding, so it’s possible that a lot of developers use Claude for coding, but the average AI use case may weight differently on ChatGPT or Grok.
In my experience, if ChatGPT can’t solve a coding problem, I try again on Claude. Although this happens less frequently since upgrading to o1-pro and o3-mini-high. And I haven’t used Claude for anything else.
bbor 32 days ago [-]
All these comments, and not a single person wondering what exactly a "shampooer" is... IMO that's an easter egg to see if we're paying attention!
logicchains 32 days ago [-]
"As we predicted, there wasn’t evidence in this dataset of jobs being entirely automated"
- but then later mention they didn't include API queries in the data, only Free and Pro queries on the website. Most "full automation" type queries would use the API, not the web interface (and nowadays probably wouldn't use Claude anyway due to how expensive its API is compared to Deepseek R1 or O3 Mini).
danvoell 33 days ago [-]
I agree this is the biggest issue. Too many folks are waving off the potential behind this stuff: "this might replace the person beneath me, but..." As a one-person manufacturing business, I've augmented 5 positions that I otherwise would have needed to hire for. The end game of this is 10% of the population being able to produce the goods for 100% of the population. That's the problem statement.
rmah 32 days ago [-]
This is already almost true. Only 8% of the American workforce is in manufacturing, and about 29% of China's (and about 1/7th of that is exports to the US). I'd guess that somewhere around 12% to 15% is needed to manufacture all goods for an advanced economy. Another 2% or so for agriculture. Not really much more to go.
33 days ago [-]
Cheer2171 32 days ago [-]
Words I typed into a paid service are now public on a huggingface repo. I don't care if it is anonymized. Fuck that. I am deleting my account.
cosmojg 32 days ago [-]
Uh, this isn't true at all. Did you actually look at the repo[1]? Only the metadata (i.e., LLM-generated task names and interaction classifications) has been made available.
[1] https://huggingface.co/datasets/Anthropic/EconomicIndex
People don't react to data privacy stuff with rationality, they don't care to understand anonymization. For 95% of people it's only gut instinct, and that's all it will ever be.
albert_e 33 days ago [-]
Interesting
are there any other good reads on the economic impact of AI that are not just hype or marketing but more considered analyses of data / indicators?
33 days ago [-]
Dahmonium1 32 days ago [-]
I wonder whether a higher degree of use by a particular professional group is detrimental/threatening to that occupational group or rather to the professional groups that use it to a lesser extent.
I also wonder if a number for success rate of the tasks makes for a more complete picture.
33 days ago [-]
jtrn 32 days ago [-]
It would be interesting to know how much of this effect is also explained by factors other than income and profession, and even types of work.
I work both as a software developer and a psychologist, and I love tinkering in the shop with welding and mechanics. It is extremely obvious that using AI is more available and appropriate when coding, as you're often in front of a very capable computer with a good interface to interact with. When I am a psychologist, it's not as fitting to bring out a computer and input prompts. And when I'm working in the shop, it's more of a hassle to grab the phone and ask a question.
Types of work and knowledge work, obviously, are ripe for integration with AI tools, but I think the pure ease of use/availability is a major factor. Sometimes two seconds of extra work to do something is the difference between not doing it and doing it.
I'm a heavy user of dictation and voice-assisted features on mobile phones, but it just doesn't cut it when you have to fight with the phone to select text and copy-paste. (Tapping selected text to copy is so temperamental, and the contextual menu is still so inconsistent after you've selected text! I select the text and wait for the tooltip to appear, but it only does so if it feels like it.)
Anyways, "ease of use for a given profession" vs "Actual usage" is also important, is my point...
[Edit for spelling]
Philpax 32 days ago [-]
Have you tried ChatGPT's advanced voice mode, and if so, what did you think of it?
jtrn 32 days ago [-]
I have used it a lot, and I love it, but it's very limited with regard to which situations it's useful in. It's way too sensitive to sound, so it stops way too often when answering if there is noise in the room (as there often is in a shop).
It's also often not useful because it's more work to spell out every other thing that dictation is not good for.
For instance, if I want to ask "What does the ICD-10 code for F320 stand for?", it might transcribe it as "What does IceDen code for F3. 120. stand for?"
When I have to start messing around with the keyboard anyway, it's doubly slow compared to just typing on a physical keyboard.
Many times when I need input, the thing in question is a technical term. This is as true in psychology as in coding. So it must have a way to correctly understand uncommon terms, for instance, a predictable way to spell out or ask for clarification. Same with regard to coding terms. What is the chance that it correctly understands "Explain #include <stdio.h> syntax"?
That said, it's awesome as long as the question uses common and predictable words. It's just surprising how often it uses uncommon terms. Thus, it's awesome, but limited. The best use case is when I think of a topic while walking the dog that I want more information on. Then I can have a cool conversation with it while walking.
On another note: It went completely off the rails for me a month ago and stopped giving useful information after it created a memory that I "want short, concise, factual, and to-the-point responses," which is true, but it went from informative to almost giving me the silent treatment, answering so shortly that it was useless. I feel it never got completely back to normal after removing that memory.
srcreigh 32 days ago [-]
This is based on # of conversations started. That's it. It's just bad data for comparing AI use across professions that spend vastly different amounts of time sitting at a computer, not in meetings.
itkovian_ 33 days ago [-]
Incredible how low usage is among lawyers. Does anyone have any intuition on why?
aithrowawaycomm 33 days ago [-]
Part of it is selection bias; Claude is much less general-audience than ChatGPT. But any lawyers using LLMs in 2025 deserve to be disbarred:
"A Major Law Firm's ChatGPT Fail" https://davidlat.substack.com/p/morgan-and-morgan-order-to-s...
"Lawyer cites six cases made up by ChatGPT" https://arstechnica.com/tech-policy/2023/05/lawyer-cited-6-f...
"AI 'hallucinations' by ChatGPT end up costing B.C. lawyer" https://www.msn.com/en-ca/news/world/ai-hallucinations-creat...
The list goes on and on. Maybe there's a bespoke RAG solution that works...maybe.
ghxst 32 days ago [-]
> But any lawyers using LLMs in 2025 deserve to be disbarred
In what year would you think it will be acceptable and why?
LLMs are tools, I don't see anything wrong with using them in any occupation as long as the user is aware of the limitations.
mistrial9 32 days ago [-]
no - some judge wrote to his family member recently: "I am seeing all these great briefs now", followed by a novice discussion of AI use. This is anecdotal (and recent), but it says to me that non-lawyers, with care, are writing their own legal papers across the USA and doing it well. This fits with other anecdotes here in coastal California for ordinary law uses.
93po 32 days ago [-]
I think they're especially likely to hallucinate when asked to cite sources, as in they're mostly prone to making up sources. A lot of the work my lawyer friend has asked of ChatGPT or Claude requires it to cite stuff, and my friend has said it has just made up case law that isn't real. So while it's useful as a launching point and can in fact be helpful and find real case law, you still have to double-check every single thing it says with a fine-tooth comb, so its productivity impact is much lower than with code, where you can clearly see whether the output works immediately.
drewbeck 33 days ago [-]
My guess is because hallucinations in a legal context can be fatal to a case, possibly even career-ending; there have been some high-profile cases where judges have ripped into lawyers pretty destructively.
cbg0 33 days ago [-]
Because LLMs make things up and the lawyer is liable for using that made up information.
jeffbee 33 days ago [-]
Lawyers are selected for critical thinking skills and they aren't vulnerable to AI hype the way relatively poorly educated computer guys are.
"Claude is fully capable of acting as a Supreme Court Justice right now."
startupsfail 32 days ago [-]
Examples given in Figure 1 are very strange; they seem to put questions like "how do I make my game run" or "make my blog post better" into occupations/productive work?
keybored 33 days ago [-]
Independent of whatever this org does, it makes perfect sense for economists to start to track when wage labor can be gotten rid of (in line with the AI Hype).
linux_devil 33 days ago [-]
Assuming that AI adoption is more prevalent in the technical domain, could this be one of the reasons why it is leaning towards computer and technical usage?
Not really any surprises here, if you’ve been following this stuff. I’d be much more interested in understanding what is holding up penetration of Claude and the rest into medicine, finance, and law.
This may just be my ignorance, but it seems that distributed version control is a highly valuable technology which hasn’t penetrated that well into law. If this is true—my evidence is only anecdotal, talking with lawyers—then it should provide partial insight that translates into the problem of LLM adoption.
dzonga 32 days ago [-]
AI, just like crypto, exposes narrow-minded tech bros who live in some tech utopia where tech rules the world.
Before, it was smart contracts will replace lawyers & contracts, DeFi will replace traditional finance.
Now it's AI will replace jobs - because it can autocomplete JavaScript and guess the next sequence of English / {{whatever}} lang words.
Hell, AI won't even replace CRUD software engineers who make software based on some business rules.
vonneumannstan 33 days ago [-]
It's a public benefit research company. Expecting it to behave like a normal corporation is missing the point...
cma 33 days ago [-]
It's nice that their public benefit charter terms are actually public, unlike Bluesky's, where bsky can have anything in there, even just minor benefits stuff, and still be pretty much a normal corporation.
asadotzler 32 days ago [-]
B corps are normal corporations.
33 days ago [-]
80-beats 33 days ago [-]
[dead]
saranshsharma 33 days ago [-]
[dead]
rvz 33 days ago [-]
AI companies: But don't worry, AI chatbots will create a UBI utopia where we will not do any work, a future that will give us more time back and where we would never have to pay for work or food again! /s
Here's the reality: You are getting displaced.
Companies like Anthropic and OpenAI screaming about AGI are repeatedly lying to you as they raise more money while Meta (who are laying off staff today), Salesforce (announced layoffs as well) [0], Klarna (not hiring), etc are admitting this in front of us (and laughing at all of us).
Do you get it now? I'm giving you a 5-year head start on their plan before it becomes a complete catastrophe for the market. [1]
[0] https://news.ycombinator.com/item?id=42975813
[1] https://www.weforum.org/publications/the-future-of-jobs-repo...
I don't think I can eat the output of some AI chatbot...
cma 33 days ago [-]
Pretty sure things are gonna go far beyond chatbots into robots
mola 33 days ago [-]
Probably robotic avatars like those Tesla bots
You can be a servant for the billionaire class without leaving your home! Actually, your makeshift hovel; you can't afford housing.
keybored 33 days ago [-]
It’s time to take up socialism, folks.
bbor 32 days ago [-]
Yup. If you disagree that's fine, I'm 99% sure you just disagree about the words -- it's time to take up whatever "real" democracy means to you! A democracy where increased productivity doesn't make people worry that they'll paradoxically have less. A democracy where power is neither hereditary nor selfish, but rather a civic sacrifice. A democracy where human flourishing is the goal, not a side hustle.
throwpoaster 33 days ago [-]
I wish Anthropic luck.
The company seems to be operating in a classic failure mode: being more concerned with its industry than its competitors and customers.
Where I could be wrong: the CEO is technical. However, most of what I hear from them is about industry and social impact instead of product.
33 days ago [-]
phillipcarter 33 days ago [-]
> however most of what I hear from them is about industry and social impact instead of product
Have you considered that, since they are a public benefit corporation staffed with people who left OpenAI in part over its more capitalistic pursuits, this is by design?
throwpoaster 33 days ago [-]
I am saying they are choosing to operate inside a failure mode, not that they are doing so accidentally.
What are you referring to?
https://www.404media.co/ceo-reminds-everyone-eightsleep-pod-...
The didactic value of sampling the soup pot goes well beyond its basic function: correcting the beginner's misperception that a sample's statistical power is directly related to population size :)
“You're gonna need a bigger spoon!”
Since there aren't a billion students in the USA, 35% of them is an impossibility.
If you scale your population above some recognized boundary, you aren't sampling in the same space any more. After all, the local star density out to 1 AU tends very strongly to 1. That's not indicative of the actual star density in the Milky Way.
No, the most important thing is the distribution of the sample. You have to make sure it isn't obviously biased in some way (i.e., you're only surveying students at a university for extrapolation to the entire population of the country). Beyond that, the desired sample size levels off quickly.
5000 (assuming the same distribution) won't be any more or less accurate for 10M than it is for 1M.
Of course, if you just ask everyone or almost everyone, then you no longer need to worry about distribution, but yeah.
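If you want to convince yourself, here's a quick simulation (all numbers invented): draw the same 5,000-person sample from a million-person and from a billion-person population and compare the error.

    import random

    def mean_abs_error(pop_size, true_rate=0.35, n=5000, trials=200):
        # Individuals 0..cutoff-1 have the trait, the rest don't.
        cutoff = int(pop_size * true_rate)
        errs = []
        for _ in range(trials):
            draw = random.sample(range(pop_size), n)  # unbiased, no replacement
            est = sum(i < cutoff for i in draw) / n
            errs.append(abs(est - true_rate))
        return sum(errs) / trials

    for pop in (10**6, 10**9):
        print(f"population {pop:>13,}: mean abs error ~{mean_abs_error(pop):.4f}")

Both come out around 0.005: the spoon doesn't care how big the pot is, only how well stirred it is.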
Software engineering is a weird niche that is both a high paying job and something you can almost self-teach from widely available free online content. If not self-teach, you can rely on free online content for troubleshooting, examples, etc.
A lot of other industries/jobs are more of an apprenticeship model, with little data and even less freely available on open internet.
I think you massively underestimate just how much data is online for everything, especially once you include books which are freely available on every possible subject (illegally, perhaps, but if Meta can download them for free then so can everyone else).
There's less noise for many other subjects than for software engineering; there are often just a couple rather than hundreds of competing ways to do everything. There might be just one coursebook rather than thousands of tutorials. But the data for self-teaching is absolutely there.
Medicine faces two key challenges. First, while research follows the scientific method, much of what makes a good doctor—intuition, pattern recognition, and clinical judgment—is rarely documented. Second, medical data is highly sensitive, limiting access to real-world cases, images, and practice opportunities. Theory alone is not enough; hands-on experience is essential.
Law presents a different problem: unknown unknowns. The sheer volume of legal texts makes it nearly impossible to be sure you’ve found everything relevant. Even with search tools, gaps in knowledge remain a major risk.
Compounding this is the way law is actually practiced. Every judge and lawyer operates with a shared foundation of legal principles so basic they are almost never discussed. The real work happens at two higher levels: first, the process—how laws are applied, argued, and enforced in practice. Then, at a third, more abstract level, legal debates unfold about interpretation, precedent, and systemic implications. The first level is assumed, the second is routine, and only the third is where true discussion happens.
Self-teaching is easier in fields where knowledge is structured, accessible, and complete. Many subjects are not.
Similarly, I would wager most of the useful economics and financial theory that humans have come up with is only known to hedge or prop trading firms.
For some subjects, the entire journal-published academic body of knowledge for it is probably some useless fraction of the whole and university academia is operating nowhere close to the cutting edge. People are probably doing PhDs today on theses that some defense contractor or HFT firm already discovered 20 years ago.
Even things like specialized medical knowledge, I would wager is largely passed down through mentor-mentee tradition and/or private notes as opposed to textbooks. It's unlikely that you can teach yourself how to do surgery just from textbooks. I once had a pathologist's report use a term for a skin condition that was quite literally ungoogleable. The skin condition itself was fairly ordinary, but the term used was outright esoteric and yet probably used on a daily basis by that pathologist. Where did he learn it from?
Not everything is on the Internet.
If the instructions aren’t immediately available, the internet provides connections and forums to find anything your heart desires.
Information wants to be free.
Arbitraging micro-opportunities (or far more likely, deploying insider information masked as HFT or some secret sauce arbitrage) is not economically useful.
Sure, you can learn all about power electronics by yourself. But have some ideas you want to implement? Hundreds to tens of thousands of dollars.
If you meant programming, I agree it could be self-taught, but not SE. SE is the set of techniques, practices, and tools that we have collected over decades for producing multi-versioned software that meets a certain reliability rating. Not all of these are freely available online.
Thing is, I’ve never met someone in software with a professional license.
I had about 12 YoE at the time, and my manager didn't realize I didn't have a degree until after I was hired. Apparently it hadn't affected my offer, and he was more impressed than anything.
You say:
> SE is the set of techniques, practices, and tools that we have collected over decades for producing multi-versioned software that meets a certain reliability rating. Not all of these are freely available online.
The same way there's no single guide on the internet on how to be the kind of engineer who builds reliable or extensible software, I don't think there's a guide hiding in the average CS curriculum.
Most of it consists of getting repetitions building software that involves the least predictable building block in all of software engineering (people), in all its various forms: from users, to other developers, to yourself (in the future), to "stakeholders", etc.
Learning how to predict and account for the unpredictability in all the people who will intersect with some facet of your software is the closest I've seen to a "universal method" for creating software that meets the criteria you defined.
And honestly I'd be concerned if someone told me you can just be taught some blessed set of tools and practices to get around it... that sounds a lot like not having actually internalized why they work in the first place, and the "why" is arguably more valuable than the tools and practices themselves.
On the other hand, an open and level playing field has not existed in the thirty-some-odd years of open-market software development. No one since Seymour Cray has done complete systems design, really... it is turtles all the way down. You have to get hardware to run on, and the software environment is going to have been defined for that: CPU architectures and programming languages. People who write whole systems generally do it in teams.
The arrogant and self-satisfied tone of the corporate worker-bee says that there is no such thing as real software engineering skills?
like defining "health" or other broad topics... the closer the topic is examined, the more holes in the arguments. I am glad I never punched a time clock for Elon Musk, however, all things considered.
I feel very fortunate that the core blender devs had the patience to put up with my stupid amateur mistakes while I learned the skills to become a helpful contributor back in the day.
The contrast is that for a lot of other jobs, the rote tasks are not routinely solvable with free online content in text form.
In marketing, the entire low-end to mid-tier market is gone. Instead of having teams working on projects for small to mid-sized companies, there's now a single Senior managing projects with the help of LLMs. I know multiple agencies who cut staff by 80-90% without dropping revenue.
Translation (of books, articles, subtitles) was never well paid, even for very complex and demanding work. My partner did it a bit on the side, mostly justifying the low pay with some moral blah about spreading knowledge across cultures... With LLMs you can completely cut out the grunt part. You define the hard parts (terms that don't translate well), round out the edges and edge out the fluff, and every good translator becomes two to ten times more productive. Since work is usually paid by the page, people in the industry got a very decent (at least temporary) pay jump, I would imagine around 100%.
Support is probably the biggest one though. It is important to remember that outsourcing to India only works for English-speaking countries. And even that isn't super cheap. Here in Germany, if you don't have back-up wealth, it is your constitutional right to get some support from the state (~1400 euro), but you are obligated to find a job as soon as possible, and they will try to help you find a role. Support was always one of the biggest industries to funnel people towards. I talked to a friend working there, and according to them the industry has basically stopped advertising new positions; the only ones left are in financial services. The rest went all in on LLMs and just employ a fraction of the support staff to deal with things that escalate enough.
And that's not even touching on all the small things. How much energy is spent on creating pitch decks, communicating proposals, writing documentation, etc.? It probably goes up as far as 50% of work in large orgs, and even if you can just save 5% of your time by using LLMs to phrase or organize, there is a decent ROI for companies to pay for them.
There's just no countervailing force to make these decisions immediately painful for them. Sectors are monopolized, people are tired and desperate, tech workers are in a basically unprecedented bout of instability.
The situation is super dark from a lot of angles, but I don't think it's really "the overwhelming usefulness of AI" that's to blame here. As far as I can tell, the biggest thing these technologies are doing is providing a cover story for private-equity-style guttings of various knowledge work verticals for short-term profit, which was kind of inevitable given that's been happening across the board in the larger economy, it's just another pretense that works for different verticals.
There are cases where LLMs seem really genuinely useful (Mostly ones that are for and by SWEs, like generating documentation or smoothing some ramp processes in learning new libraries or languages) and those don't seem to be "transformative" at scale yet, unless we count "transforming" many products into buggier products that are more brittle and frustrating to interact with
I'm finding it hard to reconcile this with my own experiences. My whole team (5 people) left last year (for better pay, I guess) and the marketing agency in Germany I'm working for had to substitute them with freelancers. To offset the cost, they fired the one guy who was hired to push the whole LLM/AI topic. We managed to fill one junior position by offering 10k+ more than in their last job. The firm would love to hire people to replace the freelancers. We had to cut costs lately, but mostly they closed the kitchen, which wasn't used due to the work-from-home policy. I definitely don't see any staff reduction due to automation / LLM use. They still pay (external) people 60€ per written text/article, because clients don't like LLM-written stuff.
- Synchronous translation at political/economic events still needs a person, as it ever did
- LLMs are nowhere near the level to be able to translate fine literature at a high enough quality to be publishable
- Translating software is still very hard, as the translator usually needs a ton of context/reference for commonly used terminology - we partnered with a machine translation company, and what they produced sucked balls
I have friends who work as translators, and we make use of translation services as a company, and I haven't seen the work going away.
This just isn't true, it's nowhere close.
If this was true we would see the results in productivity and unemployment stats. We don't though, so far the effect hasn't registered.
I love Claude, but let's not ignore that in the LLM race, they're not exactly the leading player.
It is faster than the reasoning/chain of thought models. With current o1 and DeepSeek though I haven't logged into Claude in a few weeks.
I have no inside knowledge but I am kind of expecting Sonnet chain of thought any day now and I am sure that will be incredible.
Anthropic's LLMs always (always? at least since Claude 2) have had a distinctive "personality". I obv don't know how to quantify it or what "it" really is, but if you've used it you might know what I mean. Maybe that "personality" is conducive to SWE?
We aren't leaving MS Office or Adobe because they already pushed out some minimal innovation. But what about the products you don't even know about? For lawyers, doctors, logistics, sales, marketing, wood workers, handymen? In Europe or Asia?
A new product bringing true innovation could easily push out a legacy business through the "shiny new thing" (AI) and better UX alone. A lot of software in these areas simply hasn't improved for 10 years - with a great idea and a dedicated team, it's a landslide waiting to happen.
Google Gemini integration into their docs/sheets/slides and Gmail perhaps will show different demographics in a few months, and that is yet before we heard from OpenAI.
Maybe these models will get better as they’re given more context and can understand the full stack but for now they cannot.
And this is just with code where it already has billions of examples. Nevermind any less data-rich fields. The models still need to get smarter.
https://www.danielianrock.com/research
We already have thousands of geniuses working across our economies and teaching our youth. The best of our minds have every year or so been given a global stage in Nobel speeches. We still ignore their arses and will ignore it when AI tells us to stop fighting or whatever.
The real issue here is that wafer-scale chips give 900,000 cores, and nothing but embarrassingly parallel code can use them - and frankly no coder I know writes code like that - we have to rethink our whole approach now that Moore's law is over. Only AI has anything like the ability to use the processing power being built today - the rest of us could stick to cores from 2016 and nothing would change.
Throwing hundreds of billions at having a bad way to program 1 million cores because we have not rethought software and businesses to cope seems wrong - both because "Whitey" can spend it on better things but also because it is an opportunity - imagine being 900,000 times faster than your competitors - what does that even mean?
Edit: Trying to put it another way - there are two ways AI can help us - it can improve cancer treatments at every stage of medical care, through careful design and creation of medical AI models that can slowly ratchet up diagnosis, treatment and even research and analysis. This is human organisations harnessing and adapting around a new technology
Or AI can become so smart it just invents a cure for cancer.
I absolutely think the first is going to happen and will benefit the denizens of the first world first. The second one requires two paradigm-shifting leaps in the same sentence. Ten years ago I would have laughed in Anthropic's face. Today I just give it a low probability multiplied by another low probability - and that is an incredible shift.
I feel like this has less to do with what LLMs are best at and more to do with which folks are mostly likely to spend time using a chat bot.
Minor nitpick. Use of the word 'spend' as a noun is not widespread and not well known.
The majority of audience and posters of ycombinator are not in that industry group, right?
And it’s not just them. To me this trend screams “valuations are too high”, and maybe hints at “progress might start to stagnate soon”.
https://www.anthropic.com/news/the-long-term-benefit-trust
https://time.com/6983420/anthropic-structure-openai-incentiv...
Then the people who funded / trained this "justice" out of the goodness of their hearts would actually have leverage, in terms of concrete power.
It's a much more subtle way to capture power, if you can replace the judges with your software.
Brave new world, indeed...
The whole thing about no ethical consumption under capitalism is just a way to enjoy the conveniences of capitalism on a moral high ground. It's totally doable, you just might not enjoy it haha.
The camel's gotta get its nose in the tent somehow.
It wouldn't specifically brag about doing it, while leaving out that they were specifically dealing with Palantir, because they know what they're doing is unethical: https://www.anthropic.com/news/expanding-access-to-claude-fo...
Being available for use by militaries is incredibly irresponsible, regardless of what scope is specifically claimed, because of the inherent gravity of the situation when a military is wrong. The US military maintains a good deal of infrastructure in the US; putting into their hands an unreliable, incompetent calculator puts lives at risk.
It would be structured as a non-profit (there are no teeth to a PBC; the structure is entirely to avoid liability, and if you have no trust in the executive body of an organization, it has zero meaningful signal).
It would have a different leadership team.
It would have a leader who could steelman his own position competently. Machines of Loving Grace was less redeeming than Lenat's old stump speeches for his position, despite Amodei starting up in an industry significantly more geared for what he had to say, and Lenat having an incredibly flexible sense of morality. Its leader would not have a history of working for Chinese companies and then jingoistically begin advocating for export controls.
It would have different employees than the people I know who are working there, who have a history of picking the most unethical employers they can find, in a fashion not dissimilar to how Illumination Entertainment's "Minions" select employers.
There are sane investors that prefer investing in companies that adopt these corporate structures. Based on data, those investors see public benefit corporations as more profitable and resilient. They are able to attract employees and customers that would otherwise not be interested or might be less interested.
What is "the agency problem"?
In modern management compensation theory (https://saylordotorg.github.io/text_introduction-to-economic... ), this is key to why executive compensation has increased much faster than workers' in the last 50 years.
The stock-based compensation mix evolved from this thesis; it is quite common in the valley, and is why almost all OpenAI staff wanted Sam Altman back even though the non-profit board did not.
Aligning key talent's compensation to enterprise value is only viable in unrestricted for-profit entities; any other structure with limits (capped profit, public benefit corporation, non-profit, trust, 501(c)s, etc.) does not work as well.
Talent will then leave for a for-profit entity that can offer better compensation than a restricted entity can, because it shares a % of its enterprise value, which restricted ones either cannot do or cannot offer with the same liquidity/value [1].
---
[1] This is why public companies are more valuable for RSUs/options than private companies, and why cash-flow-positive companies like Stripe still raise private money just to give liquidity to employees.
It's not the best choice, it's Spacer's Choice!
Semi-relevant sidenote: ChatGPT spent $8M on a Super Bowl commercial yesterday just to show cool visualizations, instead of any emotional product use case, to an audience the vast majority of which has never had a direct experience with the product.
These companies would be best served building a marketing arm away from the main campus in a place like LA or NY to separate the gen pop story from that of the technology.
I think AI in its current iteration is going to settle into being like a slightly worse version of Wikipedia morphed with a slightly better version of stackoverflow.
At the base of LLM reasoning and knowledge is a whole corpus of reasoning and knowledge. I am not quite convinced that LLMs will breach the confines of that corpus and the logical implications of the data there. No “eureka” discovery, just applying what we already have laying around.
Just to try it out, I uploaded the paper to DeepSeek-R1 and wrote a paragraph on the desired algorithm, that it should code it in Python and that the code should be as simple as possible while still working in exactly the way as described in the paper. About ten minutes later (quite a long reasoning time, but inspecting the chain of thought, it did almost no overthinking, but only reasoned about ideas I had or should have considered) it generated a perfect implementation that worked for every single test case. I uploaded my own attempt, and it correctly found two errors in my code that were actually attributable to naming inconsistencies in the original paper that the model was able to spot and fix on the fly. (The model did not output this, this I had to figure out myself.) I would have never expected AI to do that in my lifetime just two years ago.
I don't know whether that counts as "novel" to you, but before DeepSeek, I also thought that Copilot-like AI would not be able to really disrupt programming. But this one experience completely changed my view. It might be the case the model was trained on similar examples, but I find it unlikely just because the concrete algorithm cannot be found online except for the paper.
Combine that with the old "nothing new under the Sun" maxim (most ideas are re-hashes or new combinations of existing ideas) and you've got a changed landscape.
But this is not the majority of what software developers are doing and working on today. Most have a set of features or goals to implement using code satisfying certain constraints, which is what current reasoning AI models seem to be able to do very well. Of course, this test was not rigorous in any meaningful way, but it really changed my mind on the pace of this technology.
Plenty of value is already added just by converting unstructured data to structured data. If that is all LLMs did, they would still be a revolution in programming and human development. So much manual entry and development work has essentially evaporated overnight.
If there had never been a chat-based LLM "agent", LLMs just converting arbitrary text to a structured JSON schema would still be the biggest advancement in comp sci since the internet. There is nothing equivalent that existed before, except for manual extraction or rule-based hard coding.
Judging LLMs based on some criteria of creativity or intuition from a chat is missing the forest for the trees.
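To make the structured-output point concrete, here's a minimal sketch (complete() is a stand-in for whatever chat-completion call you use; real pipelines add retries and proper schema validation):

    import json

    def extract(text: str, complete) -> dict:
        prompt = (
            "Reply with ONLY a JSON object matching this schema:\n"
            '{"vendor": string, "date": "YYYY-MM-DD", "total": number}\n\n'
            f"Extract those fields from:\n{text}"
        )
        data = json.loads(complete(prompt))  # fails loudly if the model strays
        assert set(data) == {"vendor", "date", "total"}
        return data

Before LLMs, that function was either a human or a pile of brittle regexes.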
Well over 90% of work out there is not novel. It just needs someone to do it.
And if the flywheel is that AI begets AI exponentially in an infinite loop then those share certificates you own probably won't be worth much. The AI won.
Coincidentally, Anthropic's mission is AI safety.
That said, this doesn't seem like completely superfluous "fat" like what Mozilla does. It seems very much targeted at generating interesting bits of content marketing and headlines, which should contribute to increasing Anthropic's household brand-name recognition vs. other players like OpenAI, as well as making them seem like a serious, trustworthy institution rather than a rapacious startup that has no interest in playing nice with the rest of society. That is: it's a good marketing tool.
My guess is that they developed it internally for market research, and realized that the results would make them look good if published. Expect it to be "sunset" if another AI winter approaches.
I don't read it as fear AI, I read change is happening because of AI.
I suspect that a) will get better over time. I also suspect that b) can be addressed by a pre-programmed prompt-flow that uses an LLM to gather requirements from a PM and ask probing questions to get a well-defined scope and agree on how edge cases should be handled. It doesn't seem far-fetched that an AI would also be able to call out small requirement changes that might allow for much simpler/faster solutions.
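A sketch of what such a prompt-flow could look like, as a bounded clarification loop (ask_llm and ask_pm are hypothetical stand-ins: one calls the model, the other surfaces a question to the PM and returns the answer):

    def gather_requirements(request: str, ask_llm, ask_pm, max_rounds=5) -> str:
        spec = request
        for _ in range(max_rounds):
            q = ask_llm(
                "You are scoping a software task. Current spec:\n" + spec +
                "\nAsk ONE question that resolves the biggest ambiguity or "
                "edge case, or reply DONE if the spec is unambiguous."
            )
            if q.strip().upper() == "DONE":
                break
            spec += f"\nQ: {q}\nA: {ask_pm(q)}"
        return spec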
At no point do I see an actual elevator pitch/tl;dr/summary of what the frak this index actually is, except that it’s part of some effort to track AI adoption. It just rains down figures about which industries are using how much AI without first grounding the new concept they’re introducing.
When you say you have a new economic index, you need to give me a number, how I should interpret that number, and where it comes from. I don’t see that.
GDP: measure of a country's total economic output, computed by adding up end-product purchases.
CPI: general price level by taking a weighted average of prices throughout the economy
Big Mac index: how expensive goods are in a country relative to the US by reference to the local cost of a Big Mac, converted through the exchange rate.
Here I expect something like “the economic output-weighted fraction of production taken over by AI”, but instead it’s just a list of AI adoption by industry.
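Something like this toy calculation would do, even if every number has to be estimated (the ones below are invented):

    # Output-weighted AI task share; all figures are made up for illustration.
    industries = {
        # name: (share of total economic output, fraction of tasks using AI)
        "software":        (0.05, 0.40),
        "legal":           (0.03, 0.05),
        "manufacturing":   (0.10, 0.08),
        "everything else": (0.82, 0.03),
    }
    index = sum(out * ai for out, ai in industries.values())
    print(f"output-weighted AI task share: {index:.1%}")  # 5.4%

One number, a clear interpretation, and obvious data requirements. That's what I'd call an index.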
Why introduce an index and not headline with a definition of an index? Which audience prefers that?
One thing I hope they'll correct going forward is inclusion of API usage. Anecdotally, I only use Anthropic models via Cursor. So none of that usage shows up in here. I'd expect that specialized tools/interfaces like Cursor will grow and thus more usage will shift to API. It would be a shame to miss out on that in the data set.
Even if they don’t train on the data they could break it down by user agent / API client ID and infer something about cursor traffic.
https://openrouter.ai/rankings
It seems like the most popular choice for API access.
Unless they are trying to mislead competitors (who don't look at their own numbers...), they have no reason at all to game those numbers there.
But, we found out that OpenAI is/was gaming benchmarks (https://news.ycombinator.com/item?id=42761648) and that seems to be forgotten history now - so I don’t know.
[1] https://huggingface.co/datasets/Anthropic/EconomicIndex
are there any other good reads on the Economic impact of AI that is not just hype or marketing but more considered analysis of data / indicators?
I work both as a software developer and a psychologist, and I love tinkering in the shop with welding and mechanics. It is extremely obvious that using AI is more available and appropriate when coding, as you're often in front of a very capable computer with a good interface to interact with. When I am a psychologist, it's not as fitting to bring out a computer and input prompts. And when I'm working in the shop, it's more of a hassle to grab the phone and ask a question.
Types of work and knowledge work, obviously, are ripe for integration with AI tools, but I think the pure ease of use/availability is a major factor. Sometimes two seconds of extra work to do something is the difference between not doing it and doing it.
I'm a heavy user of dictation and voice-assisted features on mobile phones, but it just doesn't cut it when you have to fight with the phone to select text and copy-paste. (The clicking of selected text to copy is so temperamental, and why the hell is the contextual menu so inconsistent after you've selected text still! I selected the text and waited for the tooltip to appear, but it only does so if it feels like it still.)
Anyways, "ease of use for a given profession" vs "Actual usage" is also important, is my point... [Edit for spelling]
Dictation is also often not useful because of all the things it's bad at that then have to be spelled out. For instance, if I want to ask "What does the ICD-10 code F32.0 stand for?", it might transcribe it as "What does IceDen code for F3. 120. stand for?" Once I have to start messing around with the keyboard anyway, it's twice as slow as just typing on a physical keyboard.
Many times when I need input, the thing in question is a technical term. This is as true in psychology as in coding. So it must have a way to correctly handle uncommon terms, for instance a predictable way to spell them out or ask for clarification. Same with coding terms: what is the chance that it correctly understands "Explain #include <stdio.h> syntax"?
That said, it's awesome as long as the question uses common and predictable words. It's just surprising how often a question involves uncommon terms. So it's awesome, but limited. The best use case is when I think of a topic I want more information on while walking the dog; then I can have a cool conversation with it as I walk.
On another note: it went completely off the rails for me a month ago and stopped giving useful information after it created a memory that I "want short, concise, factual, and to-the-point responses," which is true, but it went from informative to almost giving me the silent treatment, answering so tersely that it was useless. I feel it never got completely back to normal even after I removed that memory.
"A Major Law Firm's ChatGPT Fail" https://davidlat.substack.com/p/morgan-and-morgan-order-to-s...
"Lawyer cites six cases made up by ChatGPT" https://arstechnica.com/tech-policy/2023/05/lawyer-cited-6-f...
"AI 'hallucinations' by ChatGPT end up costing B.C. lawyer" https://www.msn.com/en-ca/news/world/ai-hallucinations-creat...
The list goes on and on. Maybe there's a bespoke RAG solution that works...maybe.
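If such a solution exists, it presumably has the shape of "retrieve from a trusted index, and refuse any citation that doesn't resolve in it." A rough sketch, where case_law_index, llm, and the toy citation regex are all hypothetical stand-ins for real components:

    import re

    def extract_citations(text: str) -> list[str]:
        # Toy pattern for "Name v. Name" citations; real Bluebook parsing is far harder.
        return re.findall(r"[A-Z][\w.]* v\. [A-Z][\w.]*", text)

    def draft_with_verified_citations(question: str, case_law_index, llm) -> str:
        """Hypothetical RAG loop: the model may only cite cases found in a trusted index."""
        cases = case_law_index.search(question, top_k=5)        # retrieve real cases
        context = "\n\n".join(f"{c.citation}: {c.summary}" for c in cases)
        draft = llm.complete(
            "Answer using ONLY the cases below; cite nothing else.\n\n"
            f"{context}\n\nQuestion: {question}"
        )
        for cite in extract_citations(draft):                   # reject hallucinated cases
            if case_law_index.lookup(cite) is None:
                raise ValueError(f"Unverifiable citation: {cite}")
        return draft

Even then, the verification step only catches citations that don't exist; it can't catch a real case cited for a proposition it doesn't support.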
In what year do you think it will be acceptable, and why?
LLMs are tools; I don't see anything wrong with using them in any occupation as long as the user is aware of their limitations.
"Claude is fully capable of acting as a Supreme Court Justice right now."
I see only a few images available to download.
This may just be my ignorance, but it seems that distributed version control is a highly valuable technology which hasn’t penetrated that well into law. If this is true—my evidence is only anecdotal, talking with lawyers—then it should provide partial insight that translates into the problem of LLM adoption.
Before, it was "smart contracts will replace lawyers and contracts" and "DeFi will replace traditional finance."
Now it's "AI will replace jobs," because it can autocomplete JavaScript and guess the next sequence of English / {{whatever}}-language words.
Hell, AI won't even replace CRUD software engineers who build software around some business rules.
Here's the reality: You are getting displaced.
Companies like Anthropic and OpenAI screaming about AGI are repeatedly lying to you as they raise more money, while Meta (laying off staff today), Salesforce (which announced layoffs as well) [0], Klarna (not hiring), etc. are admitting this right in front of us (and laughing at all of us).
Do you get it now? I'm giving you a five-year head start on their plan before it becomes a complete catastrophe for the market. [1]
[0] https://news.ycombinator.com/item?id=42975813
[1] https://www.weforum.org/publications/the-future-of-jobs-repo...
The company seems to be operating in a classic failure mode: more concerned with its industry than with its competitors and customers.
See the first few points here: https://brief.bismarckanalysis.com/p/27-insights-from-three-...
Where I could be wrong: the CEO is technical. However, most of what I hear from them is about industry and social impact rather than product.
Have you considered that, since they are a public benefit corporation staffed with people who left OpenAI partly over its more capitalistic pursuits, this is by design?