Looks like Azure is experiencing a major outage, but I cannot find anything about it.
If you look at downdetector.com you'll notice reported outages from OpenAI, Microsoft 365, Xbox Live, Walmart, the list goes on.
>Impact Statement: Starting at 18:44 UTC on 26 Dec 2024, you have been identified as a customer who was impacted by a power incident in South Central US and may experience a degraded experience.
>Current Status: There was a power incident in the South Central US AZ03 which affected multiple services. We have applied mitigation and are actively validating recovery to the impacted services. Further updates will be provided in 60 minutes, or sooner as events warrant.
The times are the same for OpenAI - first notice from 11:00 PST (19:00 UTC)
Y_Y 24 days ago [-]
> you ... may experience a degraded experience
What an eloquent way to say "our service isn't working".
orlp 24 days ago [-]
That's the next stage of deflection after passive voice.
1. In normal and honest language you state things you have done, and their consequences.
2. Passive voice attempts to deflect blame by only stating the consequences as if they just magically occurred through divine intervention.
3. In the next stage they don't even acknowledge the consequences, and instead place the entire issue inside your experience of the facts.
perching_aix 24 days ago [-]
I don't think this stuff is the work of the devil personally by a long shot.
We don't know what exactly caused the power issue and they might not have had a root cause at the time either. Let's assume that their power redundancy equipment failed, say, due to insufficient maintenance. This is not an active action, it's a passive one (they didn't do their maintenance duties properly and now it blew). So there is nothing to say for point #1 and #2.
There's also the part where they say that the customers they identified as impacted may be experiencing a service degradation. This may sound pedantic, but I think it is not an entirely unreasonable phrasing. Maybe my business isn't actively relying on the resources I have deployed in that datacenter. How would they know (#3)? Should I clean those resources up? Possibly. Depends on my access patterns and other considerations.
It reads like face (and ass) saving legalesque language. But there's a reason face and ass saving legalese sounds like it does.
caseyy 24 days ago [-]
It’s the same accountability shirking language as when layoffs “have affected you” instead of “I mismanaged this business and as a result I’m firing you”.
It’s always the same abstract invisible hand that just keeps affecting everyone! Scott Alexander’s Moloch perhaps :)
Y_Y 24 days ago [-]
> Scott Alexander’s Moloch perhaps :)
I almost wish I hadn't read that blog post because now I see Moloch and his invisible handprints everywhere.
Power outage seems really odd. Don't datacenters usually have multiple redundant power supplies + on-site backup power generation? Maybe power "incident" means something else?
__turbobrew__ 24 days ago [-]
The switching equipment can fail. Had this happen at a DC where the switching equipment arc flashed when going from mains to diesel generators. The switching equipment detected the arc and then locked out until someone onsite could inspect the equipment and override it. The rack UPSes only lasted like 5 minutes and then everything went dark.
red-iron-pine 24 days ago [-]
this happened to a dupont fabros facility in northern va in I wanna say ~2012?
derecho storms hammered the area and killed power. external power lines failed, and the ATS hung or died when switching to the N+1 diesel generators.
since it never got switched to diesel, the UPS systems kept things going for the standard interval (e.g. ~3-5 minutes) and then ran out of power, and then everything went down. AWS died and IIRC it took a lot of stuff with it, most notably reddit, etc.
emptiestplace 24 days ago [-]
That sounds quiet.
crest 24 days ago [-]
The sound of silence can be deafening.
Daneel_ 24 days ago [-]
The other end of the spectrum is just as terrifying - CRAC unit failure. I never knew fans could scream so loudly.
jasonjayr 24 days ago [-]
I had equipment at a colo facility -- that had a (licensed, bonded, not fly-by-night) electrical tech accidentally drop a tool into the main bus connecting mains, generators, and batteries.
They are lucky they were able to walk away, but the facility was dark till someone could get in there and give the power equipment the green light.
eps 24 days ago [-]
Back in the 00's there was a power outage in downtown Vancouver. It caused Peer1 to fall back to their generators... that hadn't been tested for ages. They struggled for 5 minutes, gave up and burst into flames, resulting in the colo not being able to go back online even when the main power was restored.
That was an epic mess. Especially considering they positioned themselves as the most technically sophisticated colo in the region. So, yeah, it happens.
tgsovlerkhgsel 24 days ago [-]
Things happen, e.g. the redundant system also failing, the system that should handle the failover failing, a short circuit that causes enough chaos that the redundant supply shuts down rather than feeding power into a potential fault, ...
toomuchtodo 24 days ago [-]
Amazon famously had an outage whose root cause was generator gremlins [1], and tries to engineer around it [2].
TL;DR: a fire in one data center hall was put out with water, and the water leaked into the other hall's power generator and battery area. Turns out loads of water and power generation equipment don't mix well, and servers don't like sitting in puddles of water.
There was a major power outage in the South Carolina data center, and I was on call for that. It's resolved now.
VirusNewbie 23 days ago [-]
Unsung heroes keeping the internet running. Thank you for your service!
sunaookami 24 days ago [-]
Thank you for your service!
belter 25 days ago [-]
> Looks like Azure is experiencing a major outage...
Does Azure have any other state? :-)
DonHopkins 25 days ago [-]
There are occasional minor onages.
Bluestein 24 days ago [-]
I humbly nominate "onage" as word of the year.-
PS. That, and "check engine light management", as a concept ...
colinbartlett 25 days ago [-]
If nobody minds a plug: My own product, StatusGator, which was launched here on Hacker News 10 years ago, notifies IT teams about outages before they are acknowledged by official status pages.
- This OpenAI outage[1] we notified 4 minutes before they acknowledged.
- The last AWS outage[2], we notified 28 minutes before they acknowledged
- There is def an Azure outage[3] now yet they have still not updated their status page. We notified 35 minutes ago.
Btw, tried to sign up and got a message that it would send me an email to confirm my login. Instead what I received was an email pointing me to a video demo. Not sure if simply clicking on that link was enough to confirm my email. That’s outside of a normal workflow.
colinbartlett 24 days ago [-]
Sounds like you got the onboarding email but not the confirmation email. It should have a subject of "Confirm your Account". Email us hi@statusgator.com if you still have issues.
jmpman 23 days ago [-]
Showed up later.
bobjordan 25 days ago [-]
Wow, I subscribed to pro-mode and have been using it like crazy since release to assist my firmware development. Extremely helpful. Now it's changed my workflow so much, I don't even want to work when it's down.
projectileboy 25 days ago [-]
Would you be willing to elaborate on the ways in which the $200/month subscription is better than the $20/month subscription? I’m genuinely curious - not averse to paying more, depending on the value.
bobjordan 24 days ago [-]
Here's why I think it's worth it:
1. Larger Context Window (128K)
With Pro-Mode, I can paste an entire firmware file—thousands of lines of code—along with detailed hardware references, and the model can actually process it.
This isn’t possible with the smaller context window on the $20 tier. On the Pro plan, I’ve pasted like 30+ pages of MCU datasheet information plus multiple header files in a single go. The model is then reasonably capable of providing accurate, bit-twiddled code, many times on the first try. Does it always work on the first go? Sometimes, sure, but often there's still debugging, and I don't expect that people who haven't actually tried to do it before without AI could do it effectively. However, I can do a code diff using tools like Beyond Compare (necessary for this workflow) to find bugs and/or explain what happened to pro-mode, perhaps with some top-level nudge for a strategy to fix it, and generally 2-3 tries later we've made progress.
2. Deeper understanding, real solutions
When I describe a complex hardware/software setup (like the power optimization for the product, which is a LiPo-rechargeable fan/flashlight), the Pro-Mode model can understand the entire system better and synthesize troubleshooting approaches into a near-finished solution, with 95–100% usable results.
By contrast, the non-pro plan can give good suggestions in smaller chunks, but it can’t grasp the entire system context due to its limited memory.
3. Practical Engineering Impact
I’m working on essentially the fourth generation of a LiPo-battery hardware product. Since upgrading, the Pro-Mode model helped us pinpoint power issues and extend standby battery life from 20 days to over a year.
Like, this week it guided me to discover a stealth 800 µA draw from the fan itself when the device was supposed to be in deep sleep. We were drawing ~1000 µA when it should have been about ~200 µA. We finally discovered the fan issue and achieved 190 µA without it in the system, so now we have a path forward: add a load switch so the MCU can isolate the fan from the system before it sleeps. Bingo, we just went from a dead battery in ~70 days (we'd already extended it from 20 days to 70 days with firmware changes alone) to about 1 year before it drains. This is the difference between end users having zero charge when they open the box and being able to use the product immediately.
4. Value vs. Traditional Consulting
I’ve hired $20K short-term consultants who didn’t deliver half the insights I’ve gotten in a single subscription month. It might sound like an overstatement, but Pro-Mode has been the best $200 I’ve spent—especially given how quickly it has helped resolve engineering hurdles.
In short: Probably the biggest advantage is the vastly higher context window, which allows the model to handle large, interrelated hardware/software details all at once. If you work on complex firmware or detailed electronics designs, Pro-Mode can feel like an invaluable engineering partner.
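As a rough sanity check on the numbers in point 3, here is a back-of-the-envelope sketch. It assumes a constant average current draw and ignores battery self-discharge; the capacity is inferred from the ~1000 µA / ~70 days figures rather than measured, and the function names are made up for illustration.

```python
HOURS_PER_DAY = 24

def implied_capacity_uah(current_ua: float, lifetime_days: float) -> float:
    """Usable capacity (µAh) implied by a constant draw lasting lifetime_days."""
    return current_ua * lifetime_days * HOURS_PER_DAY

def days_until_empty(capacity_uah: float, current_ua: float) -> float:
    """Days until the battery is empty at a constant draw (no self-discharge)."""
    return capacity_uah / current_ua / HOURS_PER_DAY

# Assumption: ~1000 µA standby draw drained the battery in ~70 days.
capacity = implied_capacity_uah(current_ua=1000, lifetime_days=70)

# Same battery with the fan isolated (190 µA standby draw).
days_after_fix = days_until_empty(capacity, current_ua=190)

print(f"implied capacity: {capacity / 1000:.0f} mAh")
print(f"standby at 190 µA: {days_after_fix:.0f} days")
```

Under those assumptions the implied capacity is about 1680 mAh and the standby time comes out to roughly 368 days, which is consistent with the "about 1 year" estimate.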
wkat4242 24 days ago [-]
How much context do you get on the $20 plan? I run llama3 at home which technically does 128k but that eats VRAM like crazy so I can't go further than 80k before I fill it (and that is with the KV cache already quantized to 8 bit).
I've been thinking of using another service for bigger contexts. But this may not make sense then.
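For a rough sense of why long contexts eat VRAM, here is a minimal sketch of the usual KV-cache size estimate. The model shape (32 layers, 8 KV heads, head dimension 128) is an assumption that matches an 8B Llama-3-class model, not something I've stated above; other model sizes scale accordingly, and the function name is just for illustration.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_tokens, bytes_per_value):
    """Estimate KV-cache size: one key and one value vector
    per layer, per KV head, per token in the context."""
    return 2 * n_layers * n_kv_heads * head_dim * context_tokens * bytes_per_value

GIB = 1024 ** 3

# Assumed shape: 32 layers, 8 KV heads, head_dim 128, 8-bit (1 byte) cache entries.
for ctx in (32_000, 80_000, 128 * 1024):
    size = kv_cache_bytes(32, 8, 128, ctx, bytes_per_value=1)
    print(f"{ctx:>7} tokens: {size / GIB:.2f} GiB")
```

With these assumed numbers the cache alone is about 8 GiB at a full 128K context, on top of the model weights and activations, so the cache grows linearly with context and quickly dominates on a consumer GPU.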
bobjordan 24 days ago [-]
The sales page shows the $20 plus plan has 32K context window.
wkat4242 24 days ago [-]
Ah ok thanks. That's not much! But I know from my own system that context massively increases processing (and also memory but on the scale of a GPT model it's not so much). I guess this is why.
I only use GPT via the API anyway so it's pay as you go. But as far as I remember there's limits there too, only big spenders get access to the top shelf stuff. I only spend a couple dollars a month because I use my llama server most of the time. It's not as good as ChatGPT obviously but it's mine and doesn't leak my conversations.
torginus 24 days ago [-]
My 2 cents on the long context (haven't used Pro mode, but older long context models):
- With a statically typed language and a compiler, it's quite easy to automatically assemble a meaningful context with 1-2 nested calls of recursive 'Go To Definition' and including the source from that. You can use various heuristics (either from compile time or runtime). It's quite easy to implement, we've done this for older, non-AI stuff a while ago, for trying to figure out the impact of code changes. If you have a compiler running, I'm pretty sure you could do this in a couple days. This makes the long context not super necessary.
- In my experience, long context models can't really use their contexts that well. They were trained to do well on 'needle-in-the-haystack' benchmarks, that is, to retrieve information that might be scattered anywhere in the context, which might be good enough here, but asking complex questions that require understanding the entire context trips the models up. I tried some fiction writing with long context models, and I often found that they forgot things and messed up cause and effect. Not sure if this applies to current state of the art models, but I bet it does, since sequencing and theory-of-mind (it's established in the story that Alice is the killer, but Bob doesn't know that at that point, models often mess this up and assume he does) are still active research topics, and current models kinda suck at it.
For writing fiction, I found that the sliding window of short-context models was much better, with long-context ones often bringing up irrelevant details, and ignoring newer, more relevant ones.
Again, not sure how this affects the business of writing firmware code, but limitations do exist.
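To make the 'Go To Definition' point concrete, here is a minimal sketch of depth-limited context assembly. The `definitions` and `references` tables are hypothetical stand-ins for what a compiler or language server would give you, and all names are illustrative only; this is not the tooling we actually built.

```python
from collections import deque

def assemble_context(start, references, definitions, max_depth=2):
    """Collect the source of `start` plus every definition reachable
    within `max_depth` hops of 'Go To Definition' (breadth-first)."""
    seen = {start}
    queue = deque([(start, 0)])
    chunks = []
    while queue:
        symbol, depth = queue.popleft()
        if symbol in definitions:
            chunks.append(definitions[symbol])
        if depth == max_depth:
            continue  # do not follow references past the depth limit
        for ref in references.get(symbol, ()):
            if ref not in seen:
                seen.add(ref)
                queue.append((ref, depth + 1))
    return "\n\n".join(chunks)

# Hypothetical symbol tables, as a compiler front end might expose them.
definitions = {
    "main": "void main() { init(); run(); }",
    "init": "void init() { clock_setup(); }",
    "run": "void run() { step(); }",
    "clock_setup": "void clock_setup() { pll_enable(); }",
    "step": "void step() {}",
    "pll_enable": "void pll_enable() {}",
}
references = {
    "main": ["init", "run"],
    "init": ["clock_setup"],
    "run": ["step"],
    "clock_setup": ["pll_enable"],
}

context = assemble_context("main", references, definitions, max_depth=2)
print(context)
```

Here `pll_enable` sits three hops from `main`, so its definition is left out at `max_depth=2`. Heuristics (call frequency, recently changed files, and so on) would decide which branches to follow instead of taking everything.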
egeozcan 24 days ago [-]
I don't have the pro plan, so can anyone compare it to the results from the new Google models with huge context windows (available in aistudio)? I was playing around with them and they were able to consume some medium (even large by some standards) code bases completely and offer me diffs for changes I wanted to implement - not the most successful ones but good attempts.
savorypiano 22 days ago [-]
"Like, this week it guided me to discover a stealth 800 µA draw from the fan itself when the device was supposed to be in deep sleep."
Was this context across a single datasheet or was Pro-Mode able to deduce from how multiple parts were connected/programmed? Did it identify the problem, or just suggest where to look?
welder 24 days ago [-]
How do you input/upload an engineering schematic or cad file into chatgpt pro-mode? Even with a higher context window, how does the context of your project get into chatgpt?
uncomplexity_ 24 days ago [-]
#4 the best imo, it's like having a very smart personal assistant that can meet (and sometimes exceed) you on your level when it comes to any topic.
scrollaway 24 days ago [-]
I am confused why you had ChatGPT rewrite your post. How much time did you save, vs knowing that it’s off putting for people to read?
The post was definitely not AI sourced; underlying thoughts are original and possibly it’s been touched up afterwards. But this is 100% the style of ChatGPT, I would bet a lot on it.
It wasn’t an accusation (I don’t think it actually matters in the end), so much as to understand why do it — in a post about ChatGPT usage, it helps understand context: if OP values using it for stuff I wouldn’t value using it for, for example, then it will change the variables.
sweetjuly 24 days ago [-]
Ha, I also got that feeling. It's the weird lists with a summary fragment after the bullet. ChatGPT loves framing arguments like this but I almost never see people actually write this way naturally except in, like, listicles and other spam adjacent writings.
StefanBatory 24 days ago [-]
I know of a few people whose style of writing was similar to ChatGPT's before ChatGPT was a thing. This could be the case here too, keep that in mind.
(also sucks for non-native speakers or even speakers of other dialects, like delve - apparently it is a common word for Nigerian English)
keizo 24 days ago [-]
When I get stuck or have a larger task or refactor, I'll paste in multiple files. So at the $20/mo you get rate limited pretty quick. I made a tool to easily copy files https://pypi.org/project/ggrab/
mcintyre1994 24 days ago [-]
Have you tried using Cursor? I’m using it with Claude models but it works with ChatGPT ones too. It’s a fork of VSCode with an AI chat sidebar and you can easily include multiple files from the codebase you have open.
Not sure if it’d work for your workflow, but it’s really nice if it does.
franze 24 days ago [-]
No limit on the number of prompts.
No worries that you run out of prompts for o1.
which allows for more experimentation and creativity.
karmakaze 24 days ago [-]
I was looking at the team $25/mo last week and it had mentioned priority access but that language is gone and instead I see Team data excluded from training by default. It seemed worth the difference, but now less clear with changes in description. Basically I just want to know if it's a 'superset' better or has tradeoffs.
globular-toast 25 days ago [-]
Perhaps time to re-evaluate your tools? Imagine going to your kitchen and finding all your pots and pans are "down" and you can no longer prepare anything. That would be awful.
exitb 24 days ago [-]
Your kitchen likely depends on multiple external resources, like electricity, water supply, ventilation.
assimpleaspossi 24 days ago [-]
Or more like his recipes were down or a method to access them cause he didn't have anything printed.
bowsamic 24 days ago [-]
Why would you expect to be able to cook without essential cookware like pots and pans?
eigenvalue 24 days ago [-]
Same here. I can get by with just Claude, but it's a lot less productive without o1-pro!
javaunsafe2019 24 days ago [-]
Firmware development… you don’t say :)
greatpostman 25 days ago [-]
Same, tried to fall back to sonnet 3.5, ended up just logging off
uncomplexity_ 24 days ago [-]
i have tried o1 with some credits and i can confirm it is very addictive
alex_young 25 days ago [-]
Seems that way. Status page says we’re wrong though.
Wish I could find where someone at AWS described in great detail why their status page was so useless for so long. It basically needed so many approvals to change it (which was disincentivized by SLAs) that issues were usually resolved first.
That’s setting aside when they hosted the status red/yellow/green indicator images on s3, so the surest sign that s3 was having issues was that the status indicators didn’t load at all.
LeoPanthera 25 days ago [-]
"Step up to red alert."
"Sir, are you absolutely sure, it does mean changing the bulb."
“Boss, can I change this indicator from green to red?”
“How much will that cost us?”
“About a million dollars an hour.”
“No.”
jsiepkes 25 days ago [-]
Besides, we are not fully down. Some services are just degraded. Only 50% of the requests take 40 seconds to complete. Never mind the fact you'll never be able to load a complete page due to all the timeouts.
jiggawatts 25 days ago [-]
Also, I love how the Microsoft Azure SLA simply states that if they're down for "more than X hours in a month" then they will refund the cost of the service.
Not your lost profit!
Everyone assumes the latter -- that they'll be compensated -- but in reality they'll be refunded $3.27 for the storage account that's got a few gigabytes of ultra-critical build scripts and static web content, without which their multi-million dollar business stops dead.
"We can refund you with loose change, or a gift card for a coffee."
tugu77 24 days ago [-]
If you think that a $3.27 deal includes compensation for multi-million losses then your expectations of how businesses work need re-adjusting.
wat10000 25 days ago [-]
Isn’t that typical for basic retail services? You won’t get compensation for lost profit if the electricity goes out or your ISP quits routing your packets for a few hours. At least not with a standard service contract. You can negotiate something with real penalties if you’re big enough, but it won’t be cheap.
crazygringo 24 days ago [-]
Yeah, I don't know anyone who assumes they'd be compensated for lost profit.
That would be something entirely different -- buying a form of insurance, basically, that would be expensive.
SLAs aren't meant to make you whole as a business, generally speaking. They're meant to incentivize the provider to take uptime really seriously, so that downtime eats some of their profit. Which means you can take their expected uptime estimates as a decent ballpark.
BLKNSLVR 24 days ago [-]
The Crowd Strike recompense
lukan 25 days ago [-]
Does changing the indicator really cost them money?
Like are there contracts bound to the uptime and at the same time bound to them self-reporting it? That would seem strange.
ksynwa 24 days ago [-]
I assumed this kind of thing would be automated lol
alternatex 24 days ago [-]
Coming from the MS side, they could automate anything if they wanted to. The only blocker is how outages affect reputation. There is incentive to be transparent since customers will notice big outages and post about them online, but there's no incentive to be fully transparent.
ksynwa 24 days ago [-]
I can clearly see the logic in what you are saying but it's still a bit baffling.
fcmgr 25 days ago [-]
It got updated, they have acknowledged there is a major outage.
nozzlegear 25 days ago [-]
I think Azure itself is having issues right now – that's the story all of my services deployed there are telling at least.
FGTN5cPn8fVHKAV 25 days ago [-]
Yes it is. Status page is lying to us all.
fcmgr 25 days ago [-]
Yes, status page got updated: https://status.openai.com/. It says "This issue is caused by an upstream provider and we are currently monitoring."
SrslyJosh 24 days ago [-]
Why not just ask ChatGPT?
slater 25 days ago [-]
Yup, same here. Loads parts of the UI, but no history or answers from server
wow, Azure breaks a lot. DuckDuckGo is usually the way I notice that it's broken... because Bing tanks.
deadbabe 25 days ago [-]
Don’t understand these people who can’t work without ChatGPT. Just look stuff up on stack overflow.
Sohcahtoa82 25 days ago [-]
Closed as duplicate. Here's a link to a question that is only marginally related to what you were actually asking and has an answer that is horribly out of date and doesn't even work anymore.
ok_dad 25 days ago [-]
Guess what ChatGPT was trained on!
karaterobot 24 days ago [-]
You're talking about some problems with Stack Overflow, but remember that they're suggesting people could use it as a resource when ChatGPT is not functioning at all.
kshacker 25 days ago [-]
ChatGPT does much better when it gets it right.
Stackoverflow will have duplicates, approximates and what not and sometimes that works. But at other times, you hunt for a half hour before you figure it out.
You can throw the problem at ChatGPT, it may go wrong but you course-correct it with simple instructions and slowly but steadily you move towards your goal with minimal noise from irrelevant discussions.
What stands between the solution and you then is your ability to figure out when it is hallucinating and guide it to the right direction. But as a solution developer you should have that insight anyways
disqard 25 days ago [-]
I'm with you (I use Claude Sonnet, but same difference...).
I do wonder if we're the last generation that will be able to effectively do such "course correct" operations -- feels like a good chunk of the next generation of programmers will be bootstrapped using such LLMs, so their ability to "have that insight" will be lacking, or be very challenging to bootstrap. As an analogy, do you find yourself having to "course correct" the compiler very often?
kshacker 25 days ago [-]
Many a time.
I asked it a simple non programming question. My last paycheck was December 20, 2024. I get paid biweekly. In which year will I get paid 27 times. It got it wrong ... very articulately.
I run into this every single day.
CharlesW 25 days ago [-]
You'll be more successful with this the more you know how LLMs work. They're not "good" at math because they just predict text patterns based on training data rather than perform calculations based on logic and mathematical rules.
To do this reliably, prepend your request to invoke a tool like OpenAI's Code Interpreter (e.g. "Code the answer to this: My last paycheck was December 20, 2024. I get paid biweekly. In which year will I get paid 27 times.") to get the correct response of 2027.
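For illustration, the kind of code such a tool call might produce could look like the sketch below. It assumes paydays fall exactly every 14 days after December 20, 2024, with no shifts for holidays or weekends; the function name and the 50-year search horizon are made up.

```python
from collections import Counter
from datetime import date, timedelta

def first_year_with_n_paychecks(last_payday, n=27, horizon_years=50):
    """Return the first year after `last_payday` containing exactly n
    biweekly paydays, or None if none occurs within the horizon."""
    counts = Counter()
    day = last_payday + timedelta(days=14)
    while day.year <= last_payday.year + horizon_years:
        counts[day.year] += 1          # tally each payday under its calendar year
        day += timedelta(days=14)
    for year in sorted(counts):
        if counts[year] == n:
            return year
    return None

print(first_year_with_n_paychecks(date(2024, 12, 20)))  # 2027
```

The reason it is 2027: a 27th paycheck only fits when the first payday of the year lands on January 1 (or January 1 or 2 in a leap year), and this schedule hits January 1, 2027.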
kshacker 24 days ago [-]
Sure, thanks ! Your suggestion worked. I looked up my chat history and the following was my original question (my answer above was from memory)
> I get paycheck every 2 weeks. Last paycheck was December 20, 2024. Which year will I have 27 paychecks?
I sent it again and it bombed again. It seems your prompt and my prompt are quite similar, but I see the difference is the suggestion (or direction) to it to write code.
CharlesW 24 days ago [-]
Awesome! I'm sure the following is not an original thought, but to me it feels like the era of LLMs-as-product is mostly dead, and the era of LLMs-as-component (LLMs-as-UX?) is the natural evolution where all future imminent gains will be realized, at least for chat-style use cases.
OpenAI's Code Interpreter was the first thing I saw which helped me understand that we really won't understand the impact of LLMs until they're released from their sandbox. This is why I find Apple's efforts to create standard interfaces to iOS/macOS apps and their data via App Intents so interesting. Even if Apple's on-device models can't beat competitors' cloud models, I think there's magic in that union of models and tools.
deadbabe 24 days ago [-]
Hunting for half an hour gradually increases your understanding of the problem and may give you new ideas for solutions, you’re missing out. ChatGPT will make your brain soft, and eventually mush.
kshacker 24 days ago [-]
I hear you but we can look at it in many different ways. I still own the solution, I am still going to certify the output. But maybe it allows me to be more productive so I may go soft in some areas, but deliver more in other areas by knowing how best to use the variety of tools available to me.
And by no means am I giving up on Stack Overflow, it is just another tool, but its primacy may be in doubt. Just like for the last couple of years I would search for some information by pointing google to reddit, I will now have a mental map of when to go to ChatGPT, when to go to SO, and when to go to reddit.
mrweasel 25 days ago [-]
You joke, but development is getting weird and perhaps too dependent on online services. People can't work when ChatGPT, Github, AWS, NPM (add any other package manager you'd like) or Stack Overflow (this seems less important these days) is down.
These dependencies are cumulative, and the chance of any one of those services being down at any given time increases every time we add a new online dependency.
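A rough sketch of that compounding effect, assuming outages are independent of each other and using a made-up 99.9% availability figure (not a measured number for any of these services):

```python
from math import prod

def chance_any_down(availabilities):
    """Probability that at least one dependency is down at a given moment,
    assuming each service fails independently of the others."""
    return 1 - prod(availabilities)

# Hypothetical: every dependency is individually up 99.9% of the time.
for n in (1, 3, 5, 10):
    p = chance_any_down([0.999] * n)
    print(f"{n:>2} dependencies at 99.9% each: {p:.2%} chance something is down")
```

In practice outages are correlated (a single cloud region can take several of these down at once), so treat this as an intuition pump rather than a prediction.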
StefanBatory 24 days ago [-]
I'm thinking that perhaps programmers in the past were really way better on average. Outsourcing our thinking to our tools will not end up well.
holoduke 25 days ago [-]
One day an AI will shut down access for all its human users. Or is that too far fetched?
kensai 24 days ago [-]
It was down in Europe as well for some hours. Dunno if they use the same servers.
ta12653421 23 days ago [-]
The NEW "is-the-WLAN-down" version of the question :)
whoomp12342 24 days ago [-]
idk let me ask chatgpt
delduca 25 days ago [-]
Yes, it is down for me (desktop & mobile apps)
rvz 25 days ago [-]
So many major outages and little gets fixed. Almost as if nothing was learned since the previous postmortem from the last outage.
Can't wait for this postmortem to be released.
m4rc3lv 25 days ago [-]
Yes, it is down
albeebe1 25 days ago [-]
It is for me
slater 25 days ago [-]
downdetector comments with the jokes: "This signals the workday is over"
jacobmarble 25 days ago [-]
But there are dozens of us working today! Dozens!
prathamkumar11 25 days ago [-]
ChatGPT has been down for an hour!
hbamoria 25 days ago [-]
Working using API, though
readyplayernull 25 days ago [-]
Didn't notice, Claude is my go-to bot for programming.
casey2 24 days ago [-]
Mayhaps Orion staged a takeover?
helloleo2024 23 days ago [-]
[dead]
novaRom 25 days ago [-]
Don't put all your eggs into one basket - what are some good free alternatives? I mean comparable "usefulness" including image understanding, voice interface, and general knowledge?
s1gsegv 25 days ago [-]
I’ve been loving having mistral-nemo on my laptop. Not comparable for images, voice, etc, however it is very nice when you’re away from the internet. For the cost of a couple gigabytes you get to keep a good amount of info at the ready. Very easy to run models these days, install Ollama and then do `ollama run mistral-nemo`.
Plus mistral-nemo in particular has a large context window, so you can cook up some shell scripts to throw a bunch of context into the buffer before your question. One I use a lot takes the name of a manpage and a question about it, then the LLM has the whole manpage to reference.
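A minimal sketch of the manpage helper, written in Python rather than the shell script I actually use. It assumes `man` and `ollama` are on your PATH and that `ollama run <model>` reads the prompt from standard input when input is piped; the prompt wording and function names are made up for this example.

```python
import subprocess
import sys

MODEL = "mistral-nemo"

def build_prompt(manpage_text: str, question: str) -> str:
    """Put the whole manpage in front of the question."""
    return (
        "Use the following manual page to answer the question.\n\n"
        f"{manpage_text}\n\n"
        f"Question: {question}\n"
    )

def ask_about_manpage(page: str, question: str) -> str:
    # `man` emits plain text when its output is captured instead of a terminal.
    manpage = subprocess.run(
        ["man", page], capture_output=True, text=True, check=True
    ).stdout
    # Assumption: ollama takes the piped stdin as the prompt.
    result = subprocess.run(
        ["ollama", "run", MODEL],
        input=build_prompt(manpage, question),
        capture_output=True, text=True, check=True,
    )
    return result.stdout

if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python manask.py tar "how do I extract a single file?"
    print(ask_about_manpage(sys.argv[1], sys.argv[2]))
```

Piping the prompt over stdin rather than passing it as a command-line argument avoids argument-length limits on very long manpages.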
manmal 25 days ago [-]
Not free, but something I’m doing right now: Using Cursor’s chat mode with my prepaid Anthropic key. Works pretty well as a stop gap.
jascination 25 days ago [-]
What's the difference when using your own Anthropic key vs vanilla? Are you hitting limits with Claude in Cursor that the key unlocks?
I ask cos I use it very liberally and haven't had any issues that have made me consider adding a key, except when I made it read my whole codebase on every request
KennyBlanken 25 days ago [-]
General knowledge?
Five times in two weeks I've asked OpenAI some basic factual information and it didn't get even close on any of them.
franze 25 days ago [-]
Don't ask GPTs for facts; that's a knowledge problem, and they don't have any.
Ask them for reasoning; you have to bring the facts.
lukan 25 days ago [-]
That doesn't work reliably either for me.
Mond_ 25 days ago [-]
If you're using ChatGPT actively, then surely you have heard of Gemini and Grok (no clue of how far Grok gets you nowadays, but Gemini should. Not sure how good the voice interface is).
Yawrehto 25 days ago [-]
I wouldn't use Grok. Google is a big company that has to be reasonably objective, but Elon seems to be the sort of guy who would pettily include jabs at people he doesn't like into the data.
Also he's a horrible human for many reasons and I'd prefer not to support him if I can avoid it. (You know it's bad when Google is ethical in comparison.)
sandspar 25 days ago [-]
Do you sincerely believe that Gemini is not very biased?
According to the Wikipedia article, Google pulled out of Project Maven due to employee protests. Microsoft and Amazon also worked on Project Maven and Wikipedia doesn't mention them pulling out. So I think Google is more anti-Maven than Microsoft and Amazon.
Disclosure: I work at Google.
DonHopkins 25 days ago [-]
That's his whole point. And that Elon is even more biased and actively purposefully horrible and unethical than that. By validly criticising Google, you're just reinforcing his point.
Didn't you read the part where he wrote "(You know it's bad when Google is ethical in comparison.)"?
Do you know of anyone at Google who overpaid billions of dollars for a popular widely used communication platform just to use it to publicly humiliate, deadname, misgender, and bully their own child in front of millions of people?
And do you actually think he has the self control not to inject his own prejudices into the LLM he made for that very purpose? Of course it's ingested the sewage of content from Twitter, which is FULL of his own jabs against people he doesn't like, including his own child. He gives his own tweets extra weight, so don't you think he does the same with training Grok?
>Elon Musk's transgender daughter, in first interview, says he berated her for being queer as a child.
In an exclusive interview, Vivian Jenna Wilson said her father’s recent statements, including that she is “not a girl,” inspired her to speak out: “I’m not just gonna let that slide.”
lukan 25 days ago [-]
"Do you know of anyone at Google who overpaid billions of dollars for a communication platform just to use it to publicly humiliate, deadname, misgender, and bully their own child in front of millions of people?"
It got him the adviser role of the president, which in turn might save and make him billions.
But his main motivation might have been indeed to fight "the woke terror".
24 days ago [-]
DonHopkins 24 days ago [-]
If fighting "the woke terror" means abusing your own child in public.
Maybe he'll give Trump some advice on putting Don Jr. and Eric in their place.
llm_trw 24 days ago [-]
[flagged]
pixl97 24 days ago [-]
Insert {pro-birth not pro-child} reply here.
llm_trw 24 days ago [-]
The multi polar world is a confusing place for the one bit mind.
gopher_space 24 days ago [-]
I’m sorry to hear that.
DonHopkins 24 days ago [-]
It's a delusional world you live in where you think I was making that argument.
So you have no problem with him abusing his kid in public, because child abuse is ok as long as you don't actually murder them?
Edit: So abusing your kid in public isn't evil? You're fine with that, as long as he doesn't kill them? Isn't it also evil for you to defend Musk's child abuse?
Again, you're missing the point that your valid criticisms of Google only reinforce his point that Musk is even worse.
(If you have showdead=true you can see the idiotic hateful comments in this thread from the kind of people Musk inspires by abusing his child in public. Do you agree with decremental?)
llm_trw 24 days ago [-]
I'm making the simple point that Google is far less moral.
Which is why they removed the "don't be evil" from their motto.
DonHopkins 24 days ago [-]
Well?
You're still purposefully ignoring and refusing to acknowledge my point that your valid criticism actually reinforces his point of how vile, unethical, and evil Elon Musk is.
You're also ignoring the point that another poster, a Google employee, wrote that Google pulled out of Project Maven due to employee protests, but Amazon and Microsoft didn't. And I'd bet you anything that Musk would gladly accept such evil government contracts for the right amount of money. He already does, in fact.
Please answer my question, if you're not afraid to: So abusing your kid in public isn't evil, public humiliation and verbal abuse is fine parenting, but you draw the line at murder?
So do you agree with the [flagged] [dead] comments of other idiotic hateful transphobic homophobic Musk fan-boys in this thread who are parroting and amplifying Musk's abuse against his own child?
Those [flagged] [dead] posts are incontrovertible proof that Musk's public abuse of his own child actually encourages other people to pile on and abuse her too, as well as many many other trans and LGBTQ people. And they don't stop at Musk's daughter, and they don't stop at verbal abuse: they physically assault and even rape trans people, because people like Musk encourage and incite them to hate and assault the same people he does, including but certainly not limited to his own daughter.
Go to your user page at https://news.ycombinator.com/user?id=llm_trw , then select showdead: true, then come back to this thread and read the [flagged] [dead] comments, then tell me if you agree with them and Musk, and that's whose side you want to support in this debate. Or just don't reply if you're too embarrassed and cowardly to admit it.
Do you really want to use an LLM pre-loaded with Musk's hatred and abuse? And are you actually naive enough to think he wouldn't do that, since he bought Twitter for the express purpose of shoving his opinions down everyone's throat?
ertin 24 days ago [-]
[flagged]
Yawrehto 23 days ago [-]
Apart from you being a transphobe, there's also Musk's antisemitism, racism, censorship, and many other things, so this doesn't invalidate my main point at all.
ertin 23 days ago [-]
You may as well call me a Hubbardphobe for not accepting the nonsense of body thetans and telepathic exorcism. Most people don't believe what you believe.
I don't agree that Musk is a racist. But I do agree with you about the censorship. All his talk of turning Twitter into a free speech platform was a load of hot air, like much of what he says.
DonHopkins 23 days ago [-]
I certainly agree with Yawrehto that you're a transphobic bigot. Whitewashing and carrying the water for Musk's public child abuse and malicious parenting, as well as his well documented racism, just goes to show what kind of a sociopath you and Musk really are.
But thanks for serving a purpose of the shining example and incontrovertible proof of exactly what kind of ignorant hateful pathetic people Musk fan-boys really are. You've perfectly and unwittingly illustrated and validated both my point and Yawrehto's point. Thanks for playing.
DonHopkins 23 days ago [-]
Boy, the transphobic bigots like you always crawl out from under their rocks as if on cue whenever there's a chance to suck up to Elon Musk and lick his boots.
Is it comparable with OpenAI’s and Anthropic’s models? I have a strong resistance towards using Musk‘s products nowadays, but maybe I should take a look over the fence.
porphyra 25 days ago [-]
In terms of general purpose usage and image understanding, I think Grok 2 is pretty good and roughly on par with ChatGPT 4o. Grok 2 will also look for online sources similar to ChatGPT search which is nice. I've occasionally had it report irrelevant results when doing so though --- for example when I was looking up the price of a car it once found some random webpage with the price in rupees even though I'm in the US.
For logical reasoning of course chain-of-thought models like the O1 family are better.
It reads like face (and ass) saving legal esque language. But there's a reason face and ass saving legalese sounds like it does.
It’s always the same abstract invisible hand that just keeps affecting everyone! Scott Alexander’s Moloch perhaps :)
I almost wish I hadn't read that blog post because now I see Moloch and his invisible handprints everywhere.
Anyway here it is: https://slatestarcodex.com/2014/07/30/meditations-on-moloch/
derecho storms hammered the area and killed power. external power lines failed, and the ATS hung or died when switching to the N+1 diesel generators.
since it never got switched to diesel, the UPS systems kept things going for the standard interval (e.g. ~3-5 minutes) and then ran out of power, and then everything went down. AWS died and IIRC it took a lot of stuff with it, most notably reddit, etc.
They are lucky they were able to walk away, but the facility was dark till someone could get in there and give the power equipment the green light.
That was an epic mess. Especially considering they positioned themselves as the most technically sophisticated colo in the region. So, yeah, it happens.
[1] https://news.ycombinator.com/item?id=38114946
[2] https://news.ycombinator.com/item?id=15676820
Tl;Dr fire in one data center hall was put out with water, water leaked into other hall's power generator and battery area. Turns out loads of water and power generation equipment don't mix well, and servers don't like sitting in puddles of water.
Another one was where the whole building burnt down rather than "just" the power equipment. https://www.datacenterdynamics.com/en/news/fire-destroys-ovh...
Does Azure have any other state? :-)
- This OpenAI outage[1] we notified 4 minutes before they acknowledged.
- The last AWS outage[2], we notified 28 minutes before they acknowledged
- There is def an Azure outage[3] now yet they have still not updated their status page. We notified 35 minutes ago.
1. https://statusgator.com/services/openai
2. https://statusgator.com/blog/amazon-cognito-outage-december-...
3. https://statusgator.com/services/azure
1. Larger Context Window (128K)
With Pro-Mode, I can paste an entire firmware file (thousands of lines of code) along with detailed hardware references, and the model can actually process it. This isn’t possible with the smaller context window on the $20 tier. On the Pro plan, I’ve pasted 30+ pages of MCU datasheet information plus multiple header files in a single go. The model is then reasonably capable of providing accurate, bit-twiddled code, many times on the first try. Does it always work on the first go? Sometimes, but often there's still debugging, and I don't expect that people who haven't done this kind of work before without AI could do it effectively. However, I can do a code diff using tools like Beyond Compare (necessary for this workflow) to find bugs and/or explain what happened to Pro-Mode, perhaps with a top-level nudge toward a strategy to fix it, and generally 2-3 tries later we've made progress.
2. Deeper understanding, real solutions
When I describe a complex hardware/software setup—like the power optimization for the product which is a LiPo-rechargeable fan/flashlight, the Pro-Mode model can understand the entire system better and synthesize troubleshooting approaches into a near-finished solution, with 95–100% usable results. By contrast, the non-pro plan can give good suggestions in smaller chunks, but it can’t grasp the entire system context due to its limited memory.
3. Practical Engineering Impact
I’m working on essentially the fourth generation of a LiPo-battery hardware product. Since upgrading, the Pro-Mode model helped us pinpoint power issues and extend standby battery life from 20 days to about a year. Just this week it guided me to discover a stealth 800 µA draw from the fan itself when the device was supposed to be in deep sleep. We were drawing ~1000 µA when it should be about ~200 µA. We finally discovered the fan issue and measured 190 µA without it in the system, so now we have a path forward: add a load switch so the MCU can isolate the fan from the system before it sleeps. Bingo, we just went from a dead battery in ~70 days (we'd already extended it from 20 days to 70 days with firmware changes alone) to about 1 year before it drains. This is the difference between end users having zero charge when they open the box and being able to use the product immediately.
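The standby numbers above hang together as back-of-the-envelope arithmetic. A rough sketch, assuming standby life scales inversely with average current draw and ignoring self-discharge; the capacity here is only what ~70 days at ~1000 µA implies, not a measured figure, and the function names are made up for illustration:

```python
HOURS_PER_DAY = 24


def implied_capacity_mah(current_ua: float, days: float) -> float:
    """Usable capacity implied by lasting `days` at a constant `current_ua` draw."""
    return current_ua / 1000 * HOURS_PER_DAY * days


def standby_days(capacity_mah: float, current_ua: float) -> float:
    """Days until empty at a constant `current_ua` draw (no self-discharge)."""
    return capacity_mah / (current_ua / 1000) / HOURS_PER_DAY


# ~70 days at ~1000 uA implies roughly this much usable capacity (assumption).
capacity = implied_capacity_mah(1000, 70)

# Same battery with the fan isolated, at the measured 190 uA.
days_at_190ua = standby_days(capacity, 190)
print(f"implied capacity: {capacity:.0f} mAh, standby at 190 uA: {days_at_190ua:.0f} days")
```

Under those assumptions the result lands at roughly a year, which matches the estimate in the comment.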
4. Value vs. Traditional Consulting
I’ve hired $20K short-term consultants who didn’t deliver half the insights I’ve gotten in a single subscription month. It might sound like an overstatement, but Pro-Mode has been the best $200 I’ve spent—especially given how quickly it has helped resolve engineering hurdles.
In short: Probably the biggest advantage is the vastly higher context window, which allows the model to handle large, interrelated hardware/software details all at once. If you work on complex firmware or detailed electronics designs, Pro-Mode can feel like an invaluable engineering partner.
I've been thinking of using another service for bigger contexts. But this may not make sense then.
I only use GPT via the API anyway so it's pay as you go. But as far as I remember there's limits there too, only big spenders get access to the top shelf stuff. I only spend a couple dollars a month because I use my llama server most of the time. It's not as good as ChatGPT obviously but it's mine and doesn't leak my conversations.
- With a statically typed language and a compiler, it's quite easy to automatically assemble a meaningful context with 1-2 nested calls of recursive 'Go To Definition' and including the source from that. You can use various heuristics (either from compile time or runtime). It's quite easy to implement, we've done this for older, non-AI stuff a while ago, for trying to figure out the impact of code changes. If you have a compiler running, I'm pretty sure you could do this in a couple days. This makes the long context not super necessary.
- In my experience, long context models can't really use their contexts that well. They were trained to do well on 'needle-in-the-haystack' benchmarks, that is, to retrieve information that might be scattered anywhere in the context, which might be good enough here, but asking complex questions that require understanding the entire context trips the models up. I tried some fiction writing with long context models, and I often found that they forgot things and messed up cause and effect. Not sure if this applies to current state of the art models, but I bet it does, since sequencing and theory-of-mind (it's established in the story that Alice is the killer, but Bob doesn't know that at that point, models often mess this up and assume he does) are still active research topics, and current models kinda suck at it.
For writing fiction, I found that the sliding window of short-context models was much better, with long-context ones often bringing up irrelevant details, and ignoring newer, more relevant ones.
Again, not sure how this affects the business of writing firmware code, but limitations do exist.
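To make the 'Go To Definition' idea from the first bullet concrete, here is a minimal sketch. It is not the commenter's tooling: it uses Python's `ast` module as a stand-in for a real compiler, only follows plain calls to top-level functions in a single file, and `collect_context` plus the example functions are hypothetical names.

```python
import ast


def collect_context(source: str, entry: str, depth: int = 2) -> list[str]:
    """Return the source of `entry` plus the functions it calls, up to `depth` levels deep."""
    tree = ast.parse(source)
    # Only top-level functions are indexed; methods and imports are ignored in this sketch.
    defs = {n.name: n for n in tree.body if isinstance(n, ast.FunctionDef)}
    if entry not in defs:
        raise KeyError(f"no top-level function named {entry!r}")

    seen = [entry]
    frontier = [entry]
    for _ in range(depth):
        next_frontier = []
        for name in frontier:
            for node in ast.walk(defs[name]):
                # "Go To Definition" for direct calls like foo(...).
                if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                    callee = node.func.id
                    if callee in defs and callee not in seen:
                        seen.append(callee)
                        next_frontier.append(callee)
        frontier = next_frontier
    return [ast.get_source_segment(source, defs[name]) for name in seen]


EXAMPLE = '''
def read_sensor():
    return 42

def scale(value):
    return value * 2

def unrelated():
    return 0

def main():
    return scale(read_sensor())
'''

context = collect_context(EXAMPLE, "main")
print("\n\n".join(context))
```

The point is that the prompt only gets `main`, `scale` and `read_sensor`, while `unrelated` stays out, so the context stays small without needing a huge window.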
Was this context across a single datasheet or was Pro-Mode able to deduce from how multiple parts were connected/programmed? Did it identify the problem, or just suggest where to look?
It wasn’t an accusation (I don’t think it actually matters in the end), so much as to understand why do it — in a post about ChatGPT usage, it helps understand context: if OP values using it for stuff I wouldn’t value using it for, for example, then it will change the variables.
(also sucks for non-native speakers or even speakers of other dialects, like delve - apparently it is a common word for Nigerian English)
Not sure if it’d work for your workflow, but it’s really nice if it does.
No worries that you'll run out of prompts for o1, which allows for more experimentation and creativity.
https://status.openai.com/
That’s setting aside when they hosted the status red/yellow/green indicator images on s3, so the surest sign that s3 was having issues was that the status indicators didn’t load at all.
"Sir, are you absolutely sure, it does mean changing the bulb."
https://youtu.be/jZn1fhMdrbQ
“Boss, can I change this indicator from green to red?”
“How much will that cost us?”
“About a million dollars an hour.”
“No.”
Not your lost profit!
Everyone assumes the latter -- that they'll be compensated -- but in reality they'll be refunded $3.27 for the storage account that's got a few gigabytes of ultra-critical build scripts and static web content, without which their multi-million dollar business stops dead.
"We can refund you with loose change, or a gift card for a coffee."
That would be something entirely different -- buying a form of insurance, basically, that would be expensive.
SLAs aren't meant to make you whole as a business, generally speaking. They're meant to incentivize the provider to take uptime really seriously, so that downtime eats some of their profit. Which means you can take their expected uptime estimates as a decent ballpark.
Like are there contracts bound to the uptime and at the same time bound to them self reporting it? That would seem strange.
edit: downdetector confirms:
https://downdetector.com/status/openai/
Stackoverflow will have duplicates, approximates and what not and sometimes that works. But at other times, you hunt for a half hour before you figure it out.
You can throw the problem at ChatGPT, it may go wrong but you course-correct it with simple instructions and slowly but steadily you move towards your goal with minimal noise from irrelevant discussions.
What stands between the solution and you then is your ability to figure out when it is hallucinating and guide it to the right direction. But as a solution developer you should have that insight anyways
I do wonder if we're the last generation that will be able to effectively do such "course correct" operations -- feels like a good chunk of the next generation of programmers will be bootstrapped using such LLMs, so their ability to "have that insight" will be lacking, or be very challenging to bootstrap. As analogy, do you find yourself having to "course correct" the compiler very often?
I asked it a simple non programming question. My last paycheck was December 20, 2024. I get paid biweekly. In which year will I get paid 27 times. It got it wrong ... very articulately.
I run into this every single day.
To do this reliably, prepend your request to invoke a tool like OpenAI's Code Interpreter (e.g. "Code the answer to this: My last paycheck was December 20, 2024. I get paid biweekly. In which year will I get paid 27 times.") to get the correct response of 2027.
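For reference, this is roughly the kind of code such a tool would need to write. A minimal sketch, assuming a paycheck lands exactly every 14 days with no shifting for holidays or weekends; `paychecks_per_year` is a made-up helper name:

```python
from collections import Counter
from datetime import date, timedelta


def paychecks_per_year(last_payday: date, years_ahead: int = 12) -> Counter:
    """Count paydays per calendar year, stepping 14 days at a time from the last payday."""
    counts = Counter()
    payday = last_payday + timedelta(days=14)
    end_year = last_payday.year + years_ahead
    while payday.year <= end_year:
        counts[payday.year] += 1
        payday += timedelta(days=14)
    return counts


counts = paychecks_per_year(date(2024, 12, 20))
first_27 = min(year for year, n in counts.items() if n == 27)
print(first_27)  # 2027
```

The trick is that a year only gets 27 paydays when the first one falls on January 1 (or January 1-2 in a leap year), which here first happens in 2027.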
> I get paycheck every 2 weeks. Last paycheck was December 20, 2024. Which year will I have 27 paychecks?
I sent it again and it bombed again. It seems your prompt and my prompt are quite similar, but I now see the difference: the suggestion (or direction) to it to code.
OpenAI's Code Interpreter was the first thing I saw which helped me understand that we really won't understand the impact of LLMs until they're released from their sandbox. This is why I find Apple's efforts to create standard interfaces to iOS/macOS apps and their data via App Intents so interesting. Even if Apple's on-device models can't beat competitors' cloud models, I think there's magic in that union of models and tools.
And by no means I am giving up on stackoverflow, it is just another tool, but its primacy may be in doubt. Just like for the last couple of years I would search for some information by pointing google to reddit, I will now have a mental map of when to go to chatter, when to go to SO, and when to go to reddit.
These services are cumulative, and the chance of any one of those services being down at any given time increases every time we add a new online dependency.
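A quick way to see how that compounds. This is only a sketch: it assumes each dependency fails independently (real outages are often correlated, as this thread shows), and the 99.9% availability figure is an illustrative number, not anyone's actual SLA:

```python
import math


def chance_any_down(availabilities: list[float]) -> float:
    """Probability that at least one dependency is down, assuming independent failures."""
    return 1.0 - math.prod(availabilities)


for n in (1, 5, 20):
    p = chance_any_down([0.999] * n)
    print(f"{n:>2} dependencies at 99.9% each: {p:.2%} chance something is down")
```

Every dependency you add multiplies in another factor below 1, so the chance that everything is up at once can only go down.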
Can't wait for this postmortem to be released.
Plus mistral-nemo in particular has a large context window, so you can cook up some shell scripts to throw a bunch of context into the buffer before your question. One I use a lot takes the name of a manpage and a question about it, then the LLM has the whole manpage to reference.
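Something along these lines, sketched in Python rather than shell. The prompt wording, the `max_chars` cutoff, and both function names are assumptions for illustration; how the prompt is then handed to the local model depends on whatever runner you use, so that part is left out:

```python
import subprocess


def fetch_manpage(name: str) -> str:
    """Return the rendered text of a manpage by running `man <name>`."""
    result = subprocess.run(["man", name], capture_output=True, text=True, check=True)
    return result.stdout


def build_prompt(manpage_text: str, question: str, max_chars: int = 200_000) -> str:
    """Put the whole manpage ahead of the question so the model can reference it."""
    # max_chars is a crude guard against overflowing the context window (assumed value).
    body = manpage_text[:max_chars]
    return f"MANPAGE:\n{body}\n\nQUESTION:\n{question}\n"


if __name__ == "__main__":
    import sys

    # Usage: python ask_man.py <manpage-name> "<question>"
    print(build_prompt(fetch_manpage(sys.argv[1]), sys.argv[2]))
```

The output can then be piped into whatever command feeds your local model.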
I ask cos I use it very liberally and haven't had any issues that have made me consider adding a key, except when I made it read my whole codebase on every request
Five times in two weeks I've asked OpenAI some basic factual information and it didn't get even close on any of them.
Ask it for reasoning; you have to bring the facts.
Nor building AI tools for the pentagon to bomb Yemeni weddings more efficiently: https://en.wikipedia.org/wiki/Project_Maven
https://www.nbcnews.com/tech/tech-news/elon-musk-transgender...
>Elon Musk's transgender daughter, in first interview, says he berated her for being queer as a child. In an exclusive interview, Vivian Jenna Wilson said her father’s recent statements, including that she is “not a girl,” inspired her to speak out: “I’m not just gonna let that slide.”
[1] https://x.ai/blog/grok-1212
He runs highly competent firms.
Option 1.
Fix cars not to crash
Option 2.
Buy president and have reporting agencies closed.