So much respect for this guy. He is like the Neo of The Matrix, bridging the gap between humans and machines. I have so far learned the following for free from his repos/videos:
1. minGPT, nanoGPT (transformers)
2. NLP (make more series)
3. tokenizers (his youtube)
4. RNN (from his blog)
There are many domains that don't have a Karpathy, and we don't hear about them. So glad we have this guy to spread his intuitions on ML.
DoingIsLearning 38 days ago [-]
There are also different styles of teaching and learning. Karpathy always likes to start from first principles and build up the blocks incrementally.
Whereas, for example, Jeremy Howard's style resonates a lot more with how I enjoy learning: very much a "let's build it" approach, then tinkering around to gain intuition on how the things inside the box are working.
I see the benefit in both approaches and perhaps Karpathy is more methodical and robust. But I just find Howard's top-down style a lot easier to stay motivated with when I am learning on my own time.
janalsncm 38 days ago [-]
Definitely agree. I think a lot of people get hung up on the math in ML but honestly there are so many other things you could spend time on, and there are opportunity costs for everything.
So I say: build the thing, figure out where the shortcomings in your knowledge are, and continue refining. One of those things will inevitably be math. Maybe it will be signal processing one week and the fundamentals of Spark the next. And there are always interesting papers coming out.
attentionmech 38 days ago [-]
I definitely avoided ML for years just due to the math. But having a chatbot that can explain math with examples, in any style you want, definitely changed my opinion about math and ML in general. A big barrier to math is how it's written, imo, and not explained in a fun way with lots of examples. I certainly don't have a mathy brain, but I do get things when they're explained with examples (and certainly find it hard to come up with my own examples while fighting with the symbols).
attentionmech 38 days ago [-]
Will check out Jeremy's lectures. I actually use his fastbook notebooks a lot to self-study.
Karpathy's style, for me, is at the right level of abstraction to bring out curiosity in me towards the subject. After watching his lectures, I generally go on to more material, and never really stop there.
bamboozled 38 days ago [-]
"Neo of the matrix", what great analogy! Made my day and gave me a good laugh, thanks.
He is for sure a cool guy.
attentionmech 38 days ago [-]
he has earned it haha.
Yajirobe 38 days ago [-]
5. How to solve a Rubik's Cube
attentionmech 36 days ago [-]
Saw that video just now, thanks for this.
levocardia 38 days ago [-]
I tell all my friends that Andrej was the best instructor I had in grad school, even though I didn't go to Stanford--I just watched his CS231n videos on YouTube. Really thrilled that he's still making videos.
ipsum2 39 days ago [-]
He's made more than 5 videos covering basically the same topic of transformer architecture and training. Wonder what's different about this one?
karpathy 39 days ago [-]
My YouTube videos fall into two tracks:
1. technical track (all the GPT repro series)
2. general audience track
For (2), I had a 1hr video from 1 year ago, but I didn't actually expect that video to be some kind of authoritative introduction to LLMs. The history is that I was invited to give an LLM talk (to a general audience), prepared some random slides for a day, gave the talk, and then re-recorded the talk in my hotel room later in a single take, and that became the video. It was quite random and haphazard. So I wanted to loop back around more formally and do a more comprehensive intro to LLMs for a general audience; something I could for example give to my parents, or a friend who uses ChatGPT all the time and is interested in it, but doesn't have the technical background to go through my videos in (1). That's this video.
gnunez 38 days ago [-]
Great work! I love your videos; they've taught me so much. Any plans for a Mixture of Experts (MoE) video? My understanding is that starting from GPT-4, most advanced models use MoE to some extent. For example, can I take the model from your GPT-2 video and just change the feed-forward layer to an MoE layer like the one found here (1)? I guess I can just try it myself, but I enjoy the expert guidance you provide in your videos. Please don't stop! Great content!
This would be very welcome, as it brings us closer to understanding the secret sauce behind training a real, practical LLM.
1. https://github.com/mistralai/mistral-inference/blob/main/src...
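For concreteness, here is roughly what I imagine that swap looking like (a rough, untested sketch: the 4x expansion mirrors the GPT-2 style MLP, and the top-k routing is a simplification of Mistral's actual implementation, which also needs things like a load-balancing loss so all experts get used):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MoE(nn.Module):
        # Candidate drop-in replacement for a transformer block's MLP:
        # a learned gate picks the top-k experts per token, and the
        # expert outputs are mixed with the renormalized gate weights.
        def __init__(self, n_embd, n_experts=8, top_k=2):
            super().__init__()
            self.experts = nn.ModuleList([
                nn.Sequential(
                    nn.Linear(n_embd, 4 * n_embd),
                    nn.GELU(),
                    nn.Linear(4 * n_embd, n_embd),
                )
                for _ in range(n_experts)
            ])
            self.gate = nn.Linear(n_embd, n_experts, bias=False)
            self.top_k = top_k

        def forward(self, x):
            B, T, C = x.shape
            flat = x.view(-1, C)                     # (B*T, C)
            logits = self.gate(flat)                 # (B*T, n_experts)
            weights, idx = torch.topk(logits, self.top_k, dim=-1)
            weights = F.softmax(weights, dim=-1)     # renormalize over chosen experts
            out = torch.zeros_like(flat)
            for e, expert in enumerate(self.experts):
                # tokens (and which of their top-k slots) routed to expert e
                token_ids, slot = (idx == e).nonzero(as_tuple=True)
                if token_ids.numel() == 0:
                    continue
                out[token_ids] += weights[token_ids, slot, None] * expert(flat[token_ids])
            return out.view(B, T, C)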
pj_mukh 38 days ago [-]
"Something I could for example give to my parents,"
Not for nothing, Andrej's parents may be different from mine, but I definitely can't send this to my parents.
yingliu4203 38 days ago [-]
Both are the best videos on NNs and LLMs. Your kindness is highly appreciated.
gardenhedge 38 days ago [-]
I watched that one after finding it randomly! Will give the new one a watch too
carbine 38 days ago [-]
thank you SO much for all your hard work, we sincerely appreciate it
ks2048 39 days ago [-]
From the description:
I have one "Intro to LLMs" video already from ~year ago, but that is just a re-recording of a random talk, so I wanted to loop around and do a lot more comprehensive version.
I think he has videos on building GPT-2 from scratch, but this seems more high-level.
hustwindmaple1 38 days ago [-]
When he drops a vid, you don't ask questions. You watch first and then ask questions :)
lolinder 38 days ago [-]
I appreciated the question, because the video is 3hr30min long, and karpathy's answer here was helpful to know that it's not directed at me.
thomassmith65 38 days ago [-]
Among other things, this video is recent enough to discuss DeepSeek R1.
sota_pop 39 days ago [-]
Really love his “let’s build” series - I end up picking up cool Python tricks along the way, in addition to the higher-level content.
bicepjai 38 days ago [-]
I still remember how to backpropagate using plain Python lists, which was part of a CS231n project by @karpathy. The amazing thing is I did not go to Stanford.
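From memory, the spirit of it was something like this (a toy reconstruction, not the actual assignment code): forward and backward through a dot product and a squared loss, using nothing but lists and the chain rule:

    # loss = (w . x - y)^2, gradients by hand with plain Python lists
    w = [0.5, -0.3, 0.8]
    x = [1.0, 2.0, -1.0]
    y = 1.0

    # forward pass
    s = sum(wi * xi for wi, xi in zip(w, x))   # dot product
    loss = (s - y) ** 2

    # backward pass, chain rule written out by hand
    dloss_ds = 2.0 * (s - y)                   # d(s - y)^2 / ds
    dloss_dw = [dloss_ds * xi for xi in x]     # ds/dw_i = x_i
    dloss_dx = [dloss_ds * wi for wi in w]     # ds/dx_i = w_i

    # one gradient descent step on the weights
    lr = 0.1
    w = [wi - lr * g for wi, g in zip(w, dloss_dw)]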
Dinux 39 days ago [-]
Thanks Andrej. I have a pretty good understanding of how LLMs work and how they are trained, but a lot of my friends don't. These videos/talks give them 'some' idea.
arvinsim 38 days ago [-]
It frustrates me that I can't focus on these long-form videos, when they are likely much better than their soundbite-sized counterparts.
fransje26 38 days ago [-]
Focus is cultivated.
If you want to start getting your focus back under control, give meditation a try. It's a gentle tool that will help you understand how your attention works and, with training, will give you back the control you need.
browningstreet 37 days ago [-]
Serious suggestion: stand while you watch the video (YouTube on a TV helps here), and use a pomodoro timer to do it in bits.
I also find I can watch a good bit of a complicated/technical video while walking on a treadmill/stairmaster at the gym. Just enough noise of other people, and the determination to do 45-60 minutes on a fitness machine replaces the determination required to get through a video. Doing it 3 days in a row feeds the motivation cycle in me. After day 1 I'm itching to get back to the gym and get back to the video to do another "episode".
apetrov 38 days ago [-]
You don't have to do it in one go. I usually work with pen and paper, 45m-1h of video time (~2h wall time), and often redo the same exercise a couple of months later.
brianjking 37 days ago [-]
Karpathy is too good to us. These videos are so incredibly valuable, thank you for sharing.
stuckkeys 38 days ago [-]
I sat through the entire thing... my cheeks fell asleep, but well worth it. Thank you Andrej!
behnamoh 38 days ago [-]
I'm a simple man, I see a Karpathy video, I click, I watch, I enjoy. :)
demarq 39 days ago [-]
I wish there were another way to distribute video. Content eventually disappears from YouTube, for silly reasons.
I think this is important content; the more people who know how AI works under the hood, the more empowered society will be.
IncreasePosts 39 days ago [-]
You can just make a torrent of the video. It will then survive as long as you/others are willing to seed it.
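E.g., a quick sketch in Python, assuming the third-party torf library (pip install torf); the filename and tracker here are placeholders:

    from torf import Torrent

    t = Torrent(
        path="llm_deep_dive.mp4",  # placeholder filename
        trackers=["udp://tracker.opentrackr.org:1337/announce"],
        comment="Deep Dive into LLMs like ChatGPT",
    )
    t.generate()                   # hash the file into pieces
    t.write("llm_deep_dive.torrent")

A desktop client's "create torrent" menu or mktorrent on the command line works just as well.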
m_ppp 39 days ago [-]
Do you think videos disappearing is the biggest problem with YouTube from a distribution perspective?
demarq 38 days ago [-]
Durability is not a distribution problem.
m_ppp 38 days ago [-]
I mean more the question of whether you can consider it a good distribution platform if you can't be sure of its durability.
layer8 38 days ago [-]
If accessible knowledge about how LLMs work ever disappears, it won’t be due to YouTube.
Dinux 39 days ago [-]
YouTube is known for not deleting videos, and so far they never have (with some obvious exceptions).
tmp111111 39 days ago [-]
Andrej, I like you much more now than when you were at Tesla. You have been adding real value to my life and many others. Thank you.
orand 38 days ago [-]
What is the relevance of Tesla to liking someone?
cma 38 days ago [-]
It may be that Karpathy was used in the AI Day presentations to sell the idea that the cars would drive themselves in a year and that it was really important to pay $10,000 for it now (~2019?) before they raised the price.
We know from some of the leaked Musk/OpenAI emails they didn't really believe it internally and thought they would need to merge with OpenAI to do it.
brizii 39 days ago [-]
[flagged]
zachanderson 38 days ago [-]
[flagged]
lewiscarson 38 days ago [-]
I wouldn’t be so harsh on him, Karpathy really knows what he’s at when it comes to this stuff. I was sold when I read his recipe for training neural nets.