I know this is just a fun little easter egg, but features like this seem a little risky to me. It calls attention to the fact that Youtube isn't just serving me videos neutrally, it knows what is being said in those videos at the exact moment it is said. It highlights how much Google analyses the content of its videos in a way that more neutral features like automatic transcripts don't and in turn how much that can teach Google about me based on the videos I watch. It isn't that I didn't know Google had this ability, but foregrounding it underlines Google's Big Brother nature in a way that makes me want to use Youtube less and not more.
gruez 27 days ago [-]
> It calls attention to the fact that Youtube isn't just serving me videos neutrally, it knows what is being said in those videos at the exact moment it is said. It highlights how much Google analyses the content of its videos in a way that more neutral features like automatic transcripts don't and in turn how much that can teach Google about me based on the videos I watch.
How else do you think the "up next" and "recommended videos for you" feature works? Did you think they were serving up relevant videos purely out of chance? If anything a dumb rule like "highlight the subscribe button if the transcript mentions it" is much more benign than a recommendation engine that can somehow always serve you a relevant video.
slg 27 days ago [-]
>How else do you think the "up next" and "recommended videos for you" feature works? Did you think they were serving up relevant videos purely out of chance?
It is absolutely possible to serve recommendations in a content agnostic way based on other user behavior. That is how Youtube's recommendations worked, at least initially. It should be obvious that they didn't have this ability to analyze content when Youtube started almost 20 years ago.
And even in a content aware system, there is a difference between matching up keywords and understanding what those words mean. That is why an automatic transcript feels more neutral. That just transcribes what is being said. There is no implication of any analysis on the meaning of what is being said.
>If anything a dumb rule like "highlight the subscribe button if the transcript mentions it" is much more benign than a recommendation engine that can somehow always serve you a relevant video.
I'm not saying this feature is scarier than the recommendation engine. I'm saying Google revealing how much they know about the content of the video makes everything they do, including their recommendation engine, a little scarier.
fragmede 27 days ago [-]
Google-owned YouTube's ability to analyze content at scale for copyrighted music is the only reason YouTube is still around today. That the engine that does that analysis might be doing more seems like not a stretch at all. How much easier is it to detect "smash that subscribe button" than "the > 30 seconds of music is possibly a copyrighted song of all the known songs"?
8note 27 days ago [-]
Around the 20-years-ago mark, all kinds of things would happen at different times in a video, with the tooltips and end cards and what have you.
CSS animations weren't there I guess, but given a manual transcription, either by the submitter or by a person watching, YouTube definitely had the capability to do exactly this, maybe with the button being some Flash animation.
slg 27 days ago [-]
>...but given a manual transcription... youtube definitely had the capability to do exactly this...
I mean the "given a manual transcription" condition means they very much didn't have "the capability to do exactly this". The animation itself isn't the interesting part here.
xcv123 27 days ago [-]
> It is absolutely possible to serve recommendations in a content agnostic way based on other user behavior.
Yes but the quality and accuracy of that system would be inferior, with a long delay until it gathers enough collaborative filtering information to be useful.
ben_w 26 days ago [-]
> It is absolutely possible to serve recommendations in a content agnostic way based on other user behavior
Indeed, and I assume that's why the home page is full of sports and music when I'm not logged in. (Or was recently, it now shows a blank page).
I'm a very strange person: I don't have any interest in sports, and will listen to perhaps 2 or 3 bits of music every week — if I've heard a piece before, I generally find it predictable and by extension dull, unless some prior association of positive feelings can override that.
everforward 27 days ago [-]
Up Next and recommended videos don't require understanding the content of the videos. They could, as an example, find other users that watch similar videos and pick a video you haven't seen out of the things the other users have watched.
It wouldn't surprise me if they use transcripts, but neither feature strictly requires it.
I think the real risk here is users extrapolating from "YouTube can automatically respond to the content of videos". The Mr Beast thing is still blowing up, could they automatically identify illegal lotteries? Could they identify things that aren't suitable for YouTube Kids but end up there anyways? So on and so forth, YouTube's moderation is basically always under fire.
This seems like a risky mixed message to send. Being able to light up the Subscribe button doesn't imply the technical ability to do better moderation, but I also wouldn't want to be the guy that has to explain that and how it works to the Senate.
xattt 27 days ago [-]
It speaks to the fact that things that are most beneficial to Google have been finetuned to exploit eyeballs and pocketbooks. If they wanted to, or if their hand was forced, they’d be able to do the things that benefit the greater good.
Don’t get started with the “it’s a slippery slope” kind of shit. Short of living in some society where it’s every man for himself, kids shouldn’t be playing lotteries
everforward 26 days ago [-]
There are significant differences in difficulty, but I do agree with the sentiment. Grepping for “like button” in a transcript is much easier and cheaper than training an LLM to check whether a lottery is legal or not.
It also helps that there’s no counterparty risk with the like button. I don’t think anyone is trying to “trick” that feature, and it wouldn’t matter if they did.
chgs 27 days ago [-]
Many people in tech are big proponents of the “every man for himself” Randian dystopias
aeternum 27 days ago [-]
They need it anyway to enforce copyright. Simple content hashing tricks don't work because people upload slightly cropped or mirrored copyrighted content.
Understanding the content is ultimately the only way to properly enforce and discriminate IP theft vs. fair use at scale. And the Senate wants that feature as the IP lobby is strong.
everforward 26 days ago [-]
Aren’t there fancy hashing algos that are resistant to that? I feel like I remember PhotoDNA still being able to match even after a surprising array of modifications. I want to say cropping and reversing could still match, but I don’t really recall and would strongly prefer to not have “how to beat PhotoDNA” in my search history.
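For anyone wondering what "resistant to modifications" can mean in practice, here is a toy sketch. It is not PhotoDNA (I don't know how that works internally) and not anything YouTube is confirmed to use; it is a simple "average hash" over a made-up 4x4 grayscale grid, and every name in it is hypothetical. It only shows why a cryptographic hash breaks on a tiny edit while a perceptual-style hash can survive it.

```python
import hashlib

def average_hash(pixels):
    """One bit per pixel: 1 if the pixel is brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]

def hamming(a, b):
    """Number of positions where two bit lists differ."""
    return sum(x != y for x, y in zip(a, b))

def exact_hash(pixels):
    """Cryptographic hash of the raw pixel bytes (changes on any edit)."""
    return hashlib.sha256(bytes(p for row in pixels for p in row)).hexdigest()

# Made-up 4x4 grayscale image: dark left half, bright right half.
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 202, 212],
    [18, 28, 208, 218],
]
brightened = [[p + 10 for p in row] for row in original]   # slight edit
mirrored = [list(reversed(row)) for row in original]       # horizontal flip

print(exact_hash(original) == exact_hash(brightened))                 # exact match lost
print(hamming(average_hash(original), average_hash(brightened)))     # bits that changed
print(hamming(average_hash(original), average_hash(mirrored)))       # bits that changed
```

The brightened copy keeps the same average hash, but the mirrored copy flips every bit, so this particular toy would be beaten by mirroring. Real systems presumably need something more robust than this.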
aeternum 26 days ago [-]
Yes there are quite a few algos and YT already uses many of them but it only takes one person to find a workaround then publish it and it again opens the floodgates.
All the different mechanisms that YouTube has used are an interesting rabbit hole, but it's ultimately an unwinnable cat and mouse game unless you have an algo that understands the actual content.
Even the human reporting has become weaponized. Don't like the message of a video? Have a bunch of YouTube accounts flag it and it gets demonetized.
anticensor 26 days ago [-]
YouTube actually can not tolerate fair use, due to the fact they work in countries with little to no fair use, such as Germany and Japan. They can only tolerate statutory exceptions to copyright that apply to all countries at once.
3np 26 days ago [-]
> The Mr Beast thing is still blowing up, could they automatically identify illegal lotteries?
OotL; Mind sharing a link or two?
Familiar with MrB but no idea about anything blowing up. Quick search doesn't make the reference obvious.
That’s the original video that kicked it off. It’s pretty mundane as far as internet controversy goes. The lotteries were run poorly and as such were probably illegal, not exactly uncommon on YouTube. The show largely targets kids, so there’s some accusations of it effectively being gambling for kids.
After that people dug into his whole life and found out one of his employees was convicted of statutory rape and he was aware, which is a bad look for a channel mostly watched by children.
It’s like a 3/10 on internet controversy scales. Doing something common and dumb but not obviously malicious, and then just stuff that looks bad without concrete evidence anything bad actually happened.
It’s not terribly interesting, I only saw because of a comedian making fun of it.
mcphage 27 days ago [-]
> Could they identify things that aren't suitable for YouTube Kids but end up there anyways?
I sure wish they did.
HappMacDonald 26 days ago [-]
They would detect that Avenue Q looks so much like Sesame Street and put that on Youtube Kids.
mcphage 25 days ago [-]
I’d like to think that they’d be able to pick up on the song about porn…
cheeseomlit 27 days ago [-]
You don't necessarily need to analyze every word in every video for recommendations- it could be as simple as 'users who liked this video also liked these other videos'
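A minimal sketch of that idea, assuming a hypothetical table of which users liked which videos. The data, the function name, and the tie-breaking rule are all made up for illustration; nothing here is claimed to be how YouTube does it. Note that it never looks at what is in the videos.

```python
from collections import Counter

# Hypothetical data: the set of video IDs each user liked.
likes = {
    "alice": {"v1", "v2", "v3"},
    "bob": {"v1", "v2"},
    "carol": {"v2", "v4"},
    "dave": {"v1", "v3"},
}

def also_liked(video, likes, top_n=3):
    """Rank other videos by how many users liked them alongside `video`."""
    counts = Counter()
    for liked in likes.values():
        if video in liked:
            counts.update(liked - {video})
    # Sort by count (descending), then by ID so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [v for v, _ in ranked][:top_n]

print(also_liked("v1", likes))
```

With more users the counts get less noisy, which is also why a system like this needs a while before it has anything useful to say about a brand new video.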
conductr 27 days ago [-]
You can also turn on subtitles and even have them translated, it's pretty obvious Google is doing a lot with the audio part of videos
queuebert 27 days ago [-]
The simplest recommender would use videos that were commonly watched afterwards by other people.
Minor49er 27 days ago [-]
That would be fun to exploit. Just get a few hundred accounts, have them watch a Mr. Beast video followed by a video you want to boost, and watch the numbers grow
xboxnolifes 27 days ago [-]
Not sure even a few thousand accounts would make a dent in a Mr beast video count. Recommendation would go toward the 1 million accounts who watch another Mr beast video after.
Minor49er 27 days ago [-]
True, though the recommendation engine on YouTube will suggest other alternatives that are not on the same channel. There would still be a positive effect
OJFord 27 days ago [-]
I'm not saying I thought this was the case, but I can think of:
1) similarity to other users (people who watched this also watched)
2) similarity in words in title or description (or the generated transcript)
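To make option 2 concrete, here is a rough sketch using Jaccard similarity over the words in a title. The titles and function names are invented, and a real system would presumably do far more (stemming, weighting, descriptions, transcripts); this only shows that plain word overlap, with no grasp of meaning, can already pair up related videos.

```python
def words(title):
    """Lowercased set of whitespace-separated words in a title."""
    return set(title.lower().split())

def jaccard(a, b):
    """Shared words divided by total distinct words across both titles."""
    wa, wb = words(a), words(b)
    if not wa and not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)

def most_similar(query, titles):
    """Return the other title with the highest word overlap."""
    return max((t for t in titles if t != query), key=lambda t: jaccard(query, t))

# Hypothetical titles.
titles = [
    "how to tile a bathroom floor",
    "tile a kitchen floor the easy way",
    "best sourdough bread recipe",
]

print(most_similar(titles[0], titles))
```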
mathverse 27 days ago [-]
If you are from a small country and you travel a lot it absolutely throws Google off.
I never get any relevant ads or videos for me :(.
conductr 27 days ago [-]
I mostly use Youtube to watch home improvement / DIY / construction related stuff. They show me ads in Spanish. I'm probably the most stereotypical American white dude in Finance that speaks only English. What's weird is I've been using Google Search/Gmail basically since they became available and have bought their products and services for other stuff over the years and they have me pretty dialed in on their marketing elsewhere. YT specifically thinks otherwise.
johnfn 27 days ago [-]
> it knows what is being said in those videos at the exact moment it is said.
How did you think they did automated closed captioning..?
slg 27 days ago [-]
I don't know why you quoted this snippet to ask this question when I addressed it literally in the next sentence of my original comment, but I guess I need to reiterate it. Automatic transcripts/ closed captioning suggest Google heard what was said. This feature suggests Google understands what was said.
Think of it like reading a foreign language that uses the same alphabet. You can give me some French text and I can read it well enough for a French speaker to understand it. That doesn't mean that I myself understand what I read. Transcripts are the former and this feature is the latter.
johnfn 27 days ago [-]
The reason I didn't respond to the rest of your post is because I can't understand its relationship to your argument. There is no need for deeper understanding here. It's just a substring match.
slg 27 days ago [-]
Ironically you seem to be focusing too much on the exact words and phrases I used rather than the deeper meaning. So let's just get completely away from words like "knows" and "understanding" which seem to be tripping multiple people up.
> It's just a substring match.
Let's just say this is true. That is a super simple process, but what would it look like?
Step 1: Transcribe the audio into text
Step 2: Run substring match on text
The transcribing/closed captioning feature only does step 1. This shows that a step 2 is possible. I think you would have to be naive to think the capability to do this type of analysis on the transcribed text was designed for only this feature and would never be used for anything else. This feature is announcing that Youtube isn't merely creating transcripts of the audio in videos, it is running some unknown amount of analysis on that data.
As I said in my original comment, "it isn't that I didn't know Google had this ability", but this is literally a glowing sign pointing to this fact. I think the danger of Google reminding people of this has potential to outweigh the benefit of the "that's cute" reaction that this is designed to elicit.
johnfn 27 days ago [-]
> I think you would have to be naive to think the capability to do this type of analysis on the transcribed text was designed for only this feature and would never be used for anything else.
Why would you have to be naive to believe this? The subtitles, with timings, are available on the client-side already. You seem to allude that doing this work would require some sort of deep analysis work. I think it's really more like 5 lines of JS, and 4 of them are producing the fun animated gradient :P
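For the curious, a rough sketch of what "check the timed subtitles for a phrase" could look like. It is written in Python rather than JS, the cue structure and names are assumptions made up for illustration, and nothing here is confirmed to be how YouTube implements the feature.

```python
from dataclasses import dataclass

@dataclass
class Cue:
    start: float  # seconds into the video
    end: float
    text: str

def find_phrase(cues, phrase):
    """Return (start, end) for each caption cue whose text contains the phrase."""
    needle = phrase.lower()
    return [(c.start, c.end) for c in cues if needle in c.text.lower()]

# Hypothetical caption cues, as a player might already have them.
cues = [
    Cue(0.0, 4.0, "Welcome back to the channel"),
    Cue(4.0, 8.0, "Don't forget to hit the like button"),
    Cue(8.0, 12.0, "Today we're building a bookshelf"),
]

# Time ranges during which a player could animate the button.
highlights = find_phrase(cues, "like button")
print(highlights)
```

One limitation of this naive version: a phrase split across two cues would be missed, since each cue is checked on its own.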
slg 27 days ago [-]
It isn’t about this requiring “deep analysis work”. It is the difference between some analysis and no analysis. That increase from 0 to 1 is always the biggest hurdle to clear when it comes to any corporate behavior like this.
This feature is like walking into your kitchen one day to find a dead cockroach. I’m saying that is an indication of a cockroach problem while you’re effectively responding with “it’s fine, the one cockroach is dead and there is no reason to believe there are any others.”
johnfn 27 days ago [-]
In a previous post, you said this:
> This feature suggests Google understands what was said.
Running .includes() on the client does not imply that Google has any "understanding" of what was said. It only implies that they ran .includes() on the client. includes() does not "understand" anything.
The thing I really don't understand is this: the fact that Google has closed captions at all implies they do enormously more "analysis" than this minor feature could possibly require. If you understand how Google does CCs and what that means, this shouldn't have bothered you at all.
In your analogy, it's like you see a mound of a hundred thousand cockroaches, but you are worried about a dust speck in another room.
slg 27 days ago [-]
>Running .includes() on the client does not imply that Google has any "understanding" of what was said. It only implies that they ran .includes() on the client. includes() does not "understand" anything.
Do you want to have a good faith conversation about this? Because going back to debating the meaning of "understanding" after I already said this was misleading is not a good way to have a conversation.
>The thing I really don't understand is this: the fact that Google has closed captions at all implies they do enormously more "analysis" than this minor feature could possibly require. If you understand how Google does CCs and what that means, this shouldn't have bothered you at all.
Can we set up a baseline that there is a difference between content agnostic analysis and content aware analysis? Transcripts are content agnostic in that they can be produced without any comprehension of the words said. This feature is content aware in that it is looking for specific meaning in the words said. Do you not see any difference between these two?
johnfn 26 days ago [-]
> Do you want to have a good faith conversation about this? Because going back to debating the meaning of "understanding" after I already said this was misleading is not a good way to have a conversation.
Call it "understanding", call it "content aware analysis". I guarantee that their closed captioning service has much more of that quality than this new feature does.
> Can we set up a baseline that there is a difference between content agnostic analysis and content aware analysis? Transcripts are content agnostic in that they can be produced without any comprehension of the words said. This feature is content aware in that it is looking for specific meaning in the words said. Do you not see any difference between these two?
Again, I don't see it. CCs are not content agnostic: they have to have semantic understanding of the words said in order to produce accurate results. How do you think CCs differentiate between the words "to", "too" and "two" without looking at the surrounding words and having some idea of contextual usage? How do you think CCs can tell between "there" and "they're" without understanding if the speaker is referring to a person or a location? This is only the tip of the iceberg as to how CCs actually work, and more "content aware analysis" will always lead to more accurate CCs.
slg 26 days ago [-]
>Again, I don't see it. CCs are not content agnostic: they have to have semantic understanding of the words said in order to produce accurate results. How do you think CCs differentiate between the words "to", "too" and "two" without looking at the surrounding words and having some idea of contextual usage? How do you think CCs can tell between "there" and "they're" without understanding if the speaker is referring to a person or a location? This is only the tip of the iceberg as to how CCs actually work, and more "content aware analysis" will always lead to more accurate CCs.
Still can't get away from that "understanding" debate. You're also now equating an understanding of context with an understanding of meaning. An understanding of meaning isn't needed to differentiate between "to", "two", and "too" because they're all used differently in sentences. When the system encounters those, I don't think it goes to the definitions and tries to find which word makes the most meaningful sentence. Most times the specific homophone can be inferred based on things like part of speech and the part of speech can often be inferred from a sentence without knowing any meaning.
For example, would the system be able to properly handle homophones that are grammatically similar? Could it consistently transcribe sentences like "I have Celiac disease and enjoy the taste of rose water, so I prefer flower to flour in my desserts." That is an easy sentence to understand for anyone who knows the meaning of those words, but there are no grammatical or structural indications on which flower/flour to use.
But either way, that is getting way too deep in the weeds compared to where my point started. This feature calls attention to an analysis of meaning because the user sees the software reacting to the meaning of the content of the video. A transcript does not call attention to an analysis of meaning because the behavior of the software does not change based on the content of the video.
johnfn 26 days ago [-]
> But either way, that is getting way too deep in the weeds compared to where my point started.
Your first comment - the one that started all this - was, as far as I can understand, arguing that this feature indicated that Google had the capabilities to do more advanced - understanding? processing? meaning analysis? - than it had done in the past. If I keep coming back to that, well, it's because it appears to be your main point. If it's not, please correct me.
> Most times the specific homophone can be inferred based on things like part of speech and the part of speech can often be inferred from a sentence without knowing any meaning.
This is not true. I don't think I have enough responses left on HN to fully explain why homophones can not be inferred without understanding meaning, but I encourage you to go and read about how transcription works!
> For example, would the system be able to properly handle homophones that are grammatically similar?
I mean, this is easy enough for you to check. Here's some videos about flour / flower - notice how the CCs correctly determine if the word is flour or flower with almost 100% accuracy.
> This feature calls attention to an analysis of meaning because the user sees the software reacting to the meaning of the content of the video.
Are you saying you specifically think that YT is analyzing meaning from this feature, or just some generic user? I think you are smart enough to know that it's not true, but perhaps my mom might not understand that CCs require infinitely more processing power and this feature is just a drop in the bucket. (If you really still don't think it's true, definitely go read more about how CCs are made!)
slg 25 days ago [-]
>Your first comment - the one that started all this - was, as far as I can understand, arguing that this feature indicated that Google had the capabilities to do more advanced - understanding? processing? meaning analysis? - than it had done in the past. If I keep coming back to that, well, it's because it appears to be your main point. If it's not, please correct me.
Here is what I said. "It highlights how much Google analyses the content of its videos... It isn't that I didn't know Google had this ability...". My point was not that I learned about Google's capability from this feature or that this capability was new, it is that this calls attention to Google looking for meaning in the content of the video. A transcript does not call attention to Google looking for meaning regardless of how the transcripts are prepared.
>I mean, this is easy enough for you to check. Here's some videos about flour / flower - notice how the CCs correctly determine if the word is flour or flower with almost 100% accuracy.
Both those videos include the correct homophone in the title and description of the video. Choosing the correct one is not an indication of the system using the meaning of those words, it is pattern recognition. Every use of "flower" means the next usage is less likely to be "flour". The specificity of the example I used was important because it used both "flower" and "flour" in a way that can only be distinguished by the meaning of the words.
>Are you saying you specifically think that YT is analyzing meaning from this feature, or just some generic user? I think you are smart enough to know that it's not true, but perhaps my mom might not understand that CCs require infinitely more processing power and this feature is just a drop in the bucket. (If you really still don't think it's true, definitely go read more about how CCs are made!)
This feature is a glowing sign that Youtube as a company analyses the content of the videos for the meaning of what is said in those videos. You are too deep into the technical details trying to assign credit for what aspect of Youtube does the "understanding" or which "require[s] infinitely more processing power".
Think of this feature like receiving mail and you see one of the letters has already been opened. That could make you feel like your privacy was invaded in a way you wouldn't feel after receiving a postcard. And now we have spent several comments debating whether a torn envelope indicates whether anyone read the letter and whether a postcard is private.
FrequentLurker 27 days ago [-]
Google also censors all swear words in the auto transcriptions so is it content agnostic anymore?
slg 27 days ago [-]
That depends on how the transcription software is written. Are swear words filtered out or are they just never in the system's vocabulary in the first place? I assumed the latter, but fair enough point. It is possible my categorization needs more thought.
Regardless, there is in my opinion a clear distinction in sophistication between a filter and something that triggers a timed action. And that was really what my original comment was about, this feature's elevated sophistication is a conscious reminder of Google's capabilities. Normally that is out of sight and out of mind which is probably better for Google.
jncfhnb 27 days ago [-]
0 to 1? Pfft.
Try 1 to 2.
jrvieira 27 days ago [-]
so you're not so worried that they do this analysis (this is very tame compared to what they really do), but rather that they are transparent about it?
zamadatix 27 days ago [-]
You've overlooked that automatic transcription also includes the filtering of certain words based on match lists as well.
8note 27 days ago [-]
Does this feature suggest they understand what was said?
I'd expect the button to light up at random times in a video about "smash the like button" causing the button to light up, rather than in the end call to action for the video where the video author actually wants the watcher to click the button.
Similarly, I doubt it can properly understand when the video presenter implies the term "smash the like button" while instead, say, leaving an empty space where they would usually say such a thing.
If I pointed you at some French text and asked you to point at the term "le Chateau" you'd be able to do that without understanding French.
slg 27 days ago [-]
Now this is just a semantic debate about whether a computer can even "understand" anything. The system is conditionally responding based upon the content of what was said. That indicates a level of sophistication in analyzing the content of the video that a transcript does not. I don't really care if you define that as "understanding" or not.
pb7 27 days ago [-]
Does it understand it or does it have a regex for "smash that subscribe button"?
hi-v-rocknroll 27 days ago [-]
It doesn't work, at least not on my channel or on an unlisted video. I tried "Press the 'Subscribe' button", "Smash the 'Subscribe' button", and "Smash the 'Like' button."
skeaker 26 days ago [-]
It isn't rolled out to every channel yet. Back when the chapters feature was added to Youtube not every video would have it even if they had the correct syntax in their description. I happened to upload a video around that time and happened to put that syntax in my description and got it to work despite having a channel with almost 0 subscribers, making me think feature rollouts like these are essentially just random. Anecdotally, I suspect videos with these new features are also given an algorithm boost: My video where I had this feature randomly trigger early got over a million views despite that not being my intent and despite my channel being otherwise near non-existent in terms of activity.
dylan604 27 days ago [-]
do you really think that Googs has implemented a regex for this little easter egg without parsing the rest of the content in a much more meaningful manner as the original comment is suggesting? i'm in agreement that the levels of what these platforms can infer about the content you consume is beyond creepy.
root_axis 27 days ago [-]
Yes, because a regex match is cheap, easy and predictable. I don't see a reason why it needs to be any more complicated than that.
dylan604 27 days ago [-]
but that's not how Googs makes money
root_axis 27 days ago [-]
What does that have to do with anything? The text transcript for the video already exists on the client, implementing this behavior is trivial with that in mind. Whatever creepy analysis YouTube is doing (which I don't doubt they are), this feature isn't evidence of it.
dylan604 26 days ago [-]
you honestly think that Googs spent the time and money to develop automatic transcription/translation for anything other than figuring out ad sales to videos? anything else they've done with it is just bonus features from the work on how to direct ads. how is that difficult to grok?
SirMaster 26 days ago [-]
So you consider an if statement checking whether "like button" was present at that time in the CC to be the computer "understanding"?
munk-a 27 days ago [-]
Was this not already highlighted by ContentID, automatic topic tagging/recommendations, and auto-caption?
I honestly suspect this feature is just implemented as a side effect of the auto-captioning... and now I'm a bit curious if carefully crafted manual captions can trigger it to go off continuously.
root_axis 27 days ago [-]
I'm not seeing the risk. It's mostly common knowledge that YouTube analyzes video content for dozens of reasons like Content ID, flagging for sensitive content, etc (e.g. applying covid19 warning labels).
There's a whole bunch of lore about the types of words and phrases (such as slurs and profanity) that can affect algorithm rankings, monetization, or child-restriction.
Doing a regex search for "smash that like button" is trivial by comparison.
dakial1 27 days ago [-]
That's not an easter egg, that feature probably increased the number of likes in videos since it made the button easier to spot when users were looking for it (which is when it is mentioned on videos).
So that sweet engagement KPI is positively impacted and Youtube's CRO/experiment/Ab testing team gets their cookies.
There are probably various small tweaks like this in the UI.
anticensor 26 days ago [-]
Interaction reminders were actually against YouTube guidelines before this change was implemented.
hammock 27 days ago [-]
I've got bad news about your Gmail...
seanthemon 27 days ago [-]
This is why I keep a buffer of 20k emails of junk. Good luck finding out anything, I can't.
zamadatix 27 days ago [-]
Apart from whether or not it's really something new to users, it'd be interesting to know how many are actually concerned with the concept that Google/YouTube has analyzed the content of the videos it serves vs. interested in what gets done with that analysis (automatic video sections, better find in video, automatic summaries, whatever it is someone might want).
The biggest issues tend to be when "Big Brother" style companies perform that kind of analysis on your content like the Microsoft AI screenshotting backlash.
pndy 27 days ago [-]
> the Microsoft AI screenshotting backlash
I think I've missed this - what happened?
xcv123 27 days ago [-]
I don't like Google but this is not a Big Brother scenario. The videos are hosted on their own servers and are publicly available. They are not snooping on your private videos on your personal hard drive. How could they provide this service without analyzing the content of the videos? The cost of hiring humans to review and classify all of the content uploaded every day is prohibitive.
vngzs 27 days ago [-]
It's not a complete mitigation, but you can pause your video history which may decrease the amount of information Google retains over time from this sort of analytics. I haven't dug into the privacy policy and I'm not sure it mentions if this feature affects backend data collection, however.
crawshaw 27 days ago [-]
There isn't a feature a company as scrutinized as Google could launch that can't be described as "a little risky".
xbmcuser 26 days ago [-]
comments like these are the same thing as crying wolf, and one major reason why more and more people ignore warnings about corporations encroaching on privacy
1vuio0pswjnm7 27 days ago [-]
"It highlights how Google analyses the content of its videos..."
Google analyses the content of the uploaders' videos yet "YouTube search", from the end user's perspective, is seemingly searching strings in titles, maybe tags. These titles generally suck for this purpose, even containing unicode garbage. Throw in all the "recommended" crap in the "results", videos that do not contain the strings being searched, and it barely feels like search at all. The end user is sent "results" she never asked for.
stuaxo 27 days ago [-]
I mean, they have auto generated subtitles so of course they have a text to reference.
seydor 27 days ago [-]
> at the exact moment it is said.
I mean, it has autogenerated captions, translated (which i use to learn languages, fun)
zisguyislow 27 days ago [-]
[flagged]
sircastor 27 days ago [-]
I noticed a few weeks ago that when people said "subscribe" the subscribe button had a little light show that was kind of cool.
From a technical perspective I like the attention mechanism and that it's automated.
fennecfoxy 24 days ago [-]
I wonder if it triggers when someone says "un...subscribe".
dishsoap 27 days ago [-]
This has been happening for at least a year, probably more. I always assumed it was something the creator had to manually mark, never considered that it might be automatic.
HenryBemis 27 days ago [-]
My TV's YT app is useless and buggy. It doesn't have automatically generated & automatically translated subtitles. The YT app on my mi-box does. So it wouldn't be impossible to have the VTT prescan the audio and "if VTT has the phrase so-and-so then display animation lalala.gif"
(I use the VTT on my Firefox, I use IDM to download the auto-generated English language scripts for lengthy videos).
I remember listening to Chomsky saying that he prefers reading transcripts to listening to or watching speeches.
dylan604 27 days ago [-]
why would these auto generated subs be the app's purview? wouldn't it be the YT platform doing all of this work and providing a CC button if the data is available? i can't imagine this being done in real time and not just some post process that is run against new content at upload time.
HenryBemis 26 days ago [-]
It could as well be done in the few seconds between upload and the time the video becomes available, and the same applies to all translations (to all languages).
It is one of those things I don't care (at all) to know how it is technically set up :)
vijucat 27 days ago [-]
What about if I say, "Please don't smash that like button" or "Smash that like button if you're an idiot"? Just wondering if it is based on transcript-grepping or does it even consider sentence semantics? (I know, overkill. Please don't implement that, YT engineers. You've heard of "boiling the ocean to make a cup of tea", right?)
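To make the grepping-vs-semantics distinction concrete, here is a minimal sketch of both approaches. Everything in it is an assumption for illustration: the trigger phrase, the negation word list, and the function names are hypothetical, and nothing here is known to match what YouTube actually does.

```python
import re

# Hypothetical trigger phrase; the real trigger wording is not known.
TRIGGER = re.compile(r"\bsmash (that|the) like button\b", re.IGNORECASE)

# Crude negation cue: a negating word shortly before the phrase,
# with no sentence-ending punctuation in between.
NEGATION = re.compile(r"\b(don't|do not|never)\b[^.!?]{0,20}$", re.IGNORECASE)


def grep_match(sentence: str) -> bool:
    """Plain transcript-grepping: True if the phrase appears anywhere."""
    return TRIGGER.search(sentence) is not None


def negation_aware_match(sentence: str) -> bool:
    """True only if the phrase appears without a negation just before it."""
    match = TRIGGER.search(sentence)
    if match is None:
        return False
    before = sentence[: match.start()]
    return NEGATION.search(before) is None


for text in (
    "Smash that like button",
    "Please don't smash that like button",
    "Smash that like button if you're an idiot",
):
    print(text, "->", grep_match(text), negation_aware_match(text))
```

The plain grep fires on all three sentences. The negation heuristic suppresses the second one but still fires on the third, which is about as far as a keyword rule gets before it needs actual sentence semantics.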
Axsuul 27 days ago [-]
It should still highlight it to clarify which button they're referring to
arder 26 days ago [-]
If I were youtube I'd be concerned about the opposite. The whole point of youtube is you have all these different creators doing different things in different ways and youtube has stepped in and said "No, don't do it your way, say our exact phrase and we'll help you". The downstream impact of that is Youtube's thumb on the scale is going to drive everyone to one particularly unconvincing phrase.
superjan 27 days ago [-]
I have recently been wondering how it is possible that they can make features like that, and provide closed captions, but that they continue to insert ads halfway through a speaker’s sentence.
sebastiennight 27 days ago [-]
As someone who runs a startup in the video market (and most of our users use our software for marketing purposes), I think it's pretty likely they might actually use the transcript... to insert ads halfway through the speaker's sentence on purpose.
The best calls to action inside a video take advantage of the Zeigarnik effect : once they're invested in the content, people really want to find out the
BLKNSLVR 27 days ago [-]
Any company that takes financial advantage of that should just fu
netsharc 27 days ago [-]
When I caught a movie after not watching them on German TV for a few years, I noticed a terrible new thing: ads in the middle of scenes. In one very memorable one, 2 guys are waiting for a girl at the airport. One of them says "There she is!" Cut to: not the girl going through the arrivals door, but a beer ad!
I turned the TV off and walked away...
stnmtn 27 days ago [-]
One of the metrics I'm sure they want to avoid is a user clicking away. If they inserted ads only when a speaker has finished a thought, it would probably result in more click aways than ads where a user wants to hear how a sentence/thought ends.
seydor 27 days ago [-]
Maybe it's meant to be annoying so you get Premium
withinboredom 26 days ago [-]
It's literally more expensive than literally every other streaming service. Get a pihole/adblocker, it's less expensive.
waveBidder 27 days ago [-]
Well if that doesn't demonstrate their priorities I don't know what would.
xyst 27 days ago [-]
Just a YT developer Easter egg. It’s become such a cliche phrase on that platform in an effort to pump creator stats.
I think there’s a similar effect for the subscribe button. But I don’t remember which phrase triggers it
yellow_lead 27 days ago [-]
Why does it only work for some videos?
sholladay 27 days ago [-]
It is probably AI with suboptimal training, despite the massive dataset available to them.
It might even just be conventional speech-to-text, which can struggle with accents and poor acoustics, etc.
sltkr 27 days ago [-]
YouTube already generates transcripts of uploaded videos (though not super accurate ones). A simple and straightforward implementation would just pattern match in the transcript text.
No need for fancy AI, on top of whatever technology powers the transcription feature.
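A minimal sketch of what "just pattern match in the transcript text" could look like, assuming the transcript is available as timed caption cues. The `Cue` type, the `TRIGGERS` mapping and the phrases in it are hypothetical names made up for this example, not anything taken from YouTube.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """One caption cue: start time in seconds plus the caption text."""
    start: float
    text: str


# Hypothetical phrase-to-button mapping; the real trigger phrases are unknown.
TRIGGERS = {
    "like button": "like",
    "subscribe": "subscribe",
}


def find_triggers(cues: list[Cue]) -> list[tuple[float, str]]:
    """Return (start_time, button) pairs for every cue containing a phrase."""
    hits = []
    for cue in cues:
        lowered = cue.text.lower()
        for phrase, button in TRIGGERS.items():
            if phrase in lowered:
                hits.append((cue.start, button))
    return hits


cues = [
    Cue(0.0, "welcome back to the channel"),
    Cue(4.2, "smash that like button"),
    Cue(7.5, "and don't forget to subscribe"),
]
print(find_triggers(cues))  # [(4.2, 'like'), (7.5, 'subscribe')]
```

A substring check this simple would also fire on "unsubscribe", and it can only be as accurate as the captions it runs on, so videos without a usable transcript would never trigger it.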
hagbard_c 27 days ago [-]
It will be the STT they use for their automatic captioning so it should be as accurate as whatever that produces. It would be the height of inefficiency to do a separate STT run over the entire audio just for this gimmick.
aimazon 27 days ago [-]
Not all videos have transcripts.
amiga386 27 days ago [-]
> Button doesn't glow when you say "SLAP like NOW"
> Davie504 in pieces
andxor 26 days ago [-]
Underappreciated comment.
cjs_ac 27 days ago [-]
Given that YouTube uses the like and subscribe buttons as inputs for its recommendation algorithm, doesn't this feature just aid and abet content creators in gaming YouTube's system?
mrguyorama 27 days ago [-]
Google WANTS creators to embrace the algorithm. The algorithm being the only useful way to get views means google gets to shape how people make stuff for youtube. "Gaming the system" is the INTENDED result.
The best example of this is clickbait thumbnails. Nearly every creator I am subscribed to has expressed hatred for them, and yet they mostly still make them. Why?
Because, when youtube shows your thumbnail to someone, if they do anything other than click on that thumbnail, your channel/video is downranked in the algorithm (because it has "low conversion"). Every creator is in a zero sum, fully antagonistic war with every other creator to get you to click on their video instead of someone else's. If a channel has low enough "conversion" for long enough, youtube will stop showing that video to potential viewers, including your actual subscribers. For a channel to survive and make enough money to pay rent and things, you have to play the dumb algorithm games. Patreon is literally the only alternative.
Youtube prefers it this way because then they get to crowdsource more addictive and "engagement driving" content through millions of creators, rather than just a department in google (which honestly they probably still have). They didn't have to tell anyone to make "more stupid clickbait thumbnails", they did it themselves because it was encouraged by the incentives given to them. A Youtube where everyone plays the algorithms and thumbnail games literally earns more money and views than a youtube where every creator is only driven by their own desires to make good art. Google has been doing this since they bought Youtube. Early on, only videos longer than ten minutes could get ads, so every video ballooned to ten minutes and one second in length.
skybrian 27 days ago [-]
Many entertainers on YouTube do that routinely. It seems way too late to start worrying about it.
I’m wondering what an unbiased survey would even look like.
rft 27 days ago [-]
I have noticed multiple YT content creators just casually mentioning things like "like and subscribe for the algorithm" and even "leave a comment for the algorithm", which people do, sometimes just saying "comment for the algorithm". One creator I remember did a "if you watched until the end comment 'boat'" or similar, which of course is again just baiting more comments. It also nudges recurring viewers who know that spiel to watch until the end, because they get to enjoy the dopamine from being one of the people who can comment the word. This increases watch time, which apparently is again one of the things the algorithm uses.
Maybe I have just become more attentive to these things or they have become more prevalent, but I notice this blatant optimization for the Algorithm(tm) more and more. I guess this is the meta you need to play in order to stand out/make it in the torrent of content that is YT or social media in general. I also see this as an example of Goodhart's law as creators try to optimize for reach and thus likes etc. while the original intention of the YT Algorithm(tm) was (or should have been) to serve you content you will enjoy.
SllX 27 days ago [-]
Just yesterday I discovered a channel that asks for people who watched all the way to the end to replay the video at 0.25x speed to support the channel. I have zero clue if that would work and I’m not going to do that, but he makes good videos so I mean, if others want to do it, it’s whatever I guess. I no longer care about crap like that, YouTube is essentially its own medium at this point.
rcxdude 27 days ago [-]
Youtube definitely switched at some point from rewarding number of views (which incentivized short content: 10 minutes was a bit of a meme for a bit because that was the minimum for monetizing) to rewarding watch time (which incentivizes longer content: suddenly people noticed VODs of 6 hour livestreams doing numbers), as far as their recommendations and ad rewards worked. I wouldn't be surprised if they take the video speed adjustment into consideration such that the suggested action would contribute.
TheAceOfHearts 27 days ago [-]
If you type "awesome" the progress bar strobes and changes color.
ziml77 27 days ago [-]
I've only noticed this a couple of times, I guess because it needs to specifically be that phrase about "smashing" the button.
The thing that I do see on every video that reminds me to hit the like button is seeing the counter move. I'm way more likely to see that and think "yeah I do like this video, I should hit this button" when I see the counter change than I am to have a similar reaction when explicitly told by the person in the video to do it.
Ylpertnodi 26 days ago [-]
Notice the 'like' button numbers are not level.
Reminds me of vhs and cassette counters.
boneitis 27 days ago [-]
I can't be the only one who finds this and the subscribe animations distracting, to put it politely.
I'm sure many here would love to make use of some uBlock Origin rulesets to disable them?
I wouldn't bother dealing with it. But you can use something like StyleBot or Tampermonkey to override the css:
Before Animation:
<yt-smartimation class="smartimation smartimation--experiment-enabled smartimation--enable-masking">
While Animation:
<yt-smartimation class="smartimation smartimation--experiment-enabled smartimation--active-border smartimation--active-background smartimation--enable-masking">
satvikpendem 27 days ago [-]
Well, yes, this has been going on for quite a while, perhaps even a year before this HN post. It's cool that this article references it, though, but it's definitely not the end of times, as some might proclaim, especially as it legitimately increases traction for those who want to subscribe (and those who might not, but might perhaps want to, which is all YouTube cares about).
maxglute 27 days ago [-]
Meanwhile, useful features like the sleep timer are only available until Sep, and only if you're on Premium. They also had a slightly better UI experiment last month.
seydor 27 days ago [-]
I wish they used something more than that stupid minimalist icon theme. I have again and again smashed the button only to unlike a video
ziml77 27 days ago [-]
I don't understand how... You always have two thumbs there which means you can contrast the state of being just an outline with the state of being filled in.
lxgr 27 days ago [-]
It happens to me all the time. I can’t explain it either, but my hypothesis is me interpreting the highlighted “like” button as the encouraged action out of the two?
ziml77 26 days ago [-]
That makes sense now that the other reply pointed out the Join button being filled in. The like and dislike buttons are internally consistent but the entirety of the buttons down there are not consistent.
seydor 27 days ago [-]
And yet the "join" button is filled in without being joined. Let alone, the icons on the video player which have their own varied little philosophies
ziml77 26 days ago [-]
Huh... somehow I noticed that but it never really registered with me that it's inconsistent because I've never wanted to join a channel.
iwishiknewlisp 27 days ago [-]
I never noticed this, is it just when they say "smash this like button" or just the term "like button"?
vizzier 27 days ago [-]
Unsure on the specific phrases, but it has been happening for a while. Subscribe highlights with certain phrases too.
sprobertson 27 days ago [-]
It might just be the word "like" said with a certain affect, I've seen it glow in the wrong context a few times.
ydnaclementine 27 days ago [-]
I have never once hit that bell
sph 26 days ago [-]
I have only once, for KRAZAM.
I need to balance all the time spent on this site reading about using Kubernetes at scale, AI changing the world and how to bring more value to my employer with their videos.
You've never found a single content creator that put out stuff you liked so much you wanted to see everything they did? Not a single one? Do you prefer TV instead?
pino82 27 days ago [-]
If it's just about keeping a link to a handful of content creators, and you aren't a multiple-hours-per-day yt addict, there is already a nice solution for that, which universally works for all web resources and is well supported by all browsers under the sun.
I'm sure that's not ideal (e.g. not for Google trying to understand you better) for the addicts, but for the other ones, this can be the perfect solution.
nickthegreek 27 days ago [-]
I never use the bell. I just subscribe and go through my subscribed list to watch videos, and for some I add to my rss reader.
skeaker 26 days ago [-]
At that point you may as well still hit the bell, even if only to further boost your preferred creators.
rcxdude 27 days ago [-]
RSS works much better, and youtube still supports it despite hiding it well
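For anyone who wants to try the RSS route, here is a rough sketch. The feed URL pattern is an assumption based on how the per-channel feeds have commonly been addressed, the channel ID is a placeholder, and the sample feed is made up so the example runs without touching the network.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Atom namespace used by the feed's <feed>, <entry> and <title> elements.
ATOM = "{http://www.w3.org/2005/Atom}"


def feed_url(channel_id: str) -> str:
    """Build a per-channel feed URL (the URL pattern is an assumption)."""
    query = urlencode({"channel_id": channel_id})
    return "https://www.youtube.com/feeds/videos.xml?" + query


def entry_titles(feed_xml: str) -> list[str]:
    """Return the title of every <entry> in an Atom feed document."""
    root = ET.fromstring(feed_xml)
    return [
        entry.findtext(f"{ATOM}title", default="")
        for entry in root.iter(f"{ATOM}entry")
    ]


# Made-up sample feed, standing in for a real download.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example channel</title>
  <entry><title>First video</title></entry>
  <entry><title>Second video</title></entry>
</feed>"""

print(feed_url("UC_PLACEHOLDER_ID"))
print(entry_titles(SAMPLE_FEED))
```

In practice you would paste the feed URL straight into a feed reader; the parsing half is only there to show that the feed is ordinary Atom and needs no account.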
Cordiali 27 days ago [-]
I've never used a YouTube account, I've always just used the site as-is. For a (very few) accounts, that I want to know when they release a new video, I just add them to my RSS feed reader.
seydor 27 days ago [-]
Has anyone tried the other buttons?
karaterobot 27 days ago [-]
Invidious and FreeTube do not do this, so I hadn't noticed. Kind of a neat touch, although I wonder if they do any kind of analysis to figure out whether someone is using the phrase as a call to action, or if they're, you know, making fun of the speech patterns of annoying influencers.
kerkeslager 27 days ago [-]
Ugh. If they can detect when people say that in videos, I'd prefer that they used it to penalize the video in the algorithm, rather than leaning into this irritating gaming of the system.
I don't blame creators for doing this: creators have to say this to compete (this is an example of Moloch[1]). The point isn't to hurt creators who do this, it's to remove the incentive so that creators no longer have to do this. So I think the proper way to do this is to announce that videos that ask for likes or subscribes will be penalized, and only penalize videos that come out after the announcement.
But, YouTube serves advertisers, not viewers, so they'll never do anything that benefits viewers but hurts advertisers.
There is no "If they can". Both YouTube and TikTok are actively penalizing (effectively shadowbanning) videos after detecting words like rape/murder/suicide and content related to war crimes[1]. In some cases the exploiter side are governments, not advertisers. Content creators censor these words and it helps to some extent.
My computer does not have a speaker (none built in, and none attached). I watched the clip a few times. Each time, as the button glowed in his video, the button also glowed below the video in the youtube UI.
So, at first I thought that Google is parsing the text of the video being played, but is not listening to your mic.
However, several comments both here and on YT imply that Google is ALSO listening to what you say on your mic. Which, frankly, is much more scary - though I cannot currently test.
lxgr 27 days ago [-]
Are you saying they are recording what viewers are saying? That would be an extraordinary claim requiring extraordinary proof.
Other than that, why wouldn’t they know what’s being said in a video they host?
reify 27 days ago [-]
[flagged]
KennyBlanken 27 days ago [-]
They pour money into engineering and hardware resources into stuff like this but still claim that DMCA abuse (by media companies, blackmailers, griefers, and reputation management activity) is an unsolvable problem?
Yet...they can't hire more staff, implement databases to track malicious/unreliable DMCA submitters, and falsely-targeted creators for whom complaints are likely BS, and implement AI-powered tools to assist with all of this?
theshrike79 27 days ago [-]
There is no incentive for them.
Example: Over here there is a huge problem with rented e-scooters being left wherever.
Every scooter company asks you to take a photo of the parking. NOBODY ever checks those.
I did a quick test with GPT-4 and it could tell me if a scooter was parked in a considerate way about 90% of the time (10 photos, some with multiple scooters).
Could they implement this in an afternoon? Yes. Will they? No. Because it'd be used to penalise their own customers.
zimpenfish 27 days ago [-]
> Every scooter company asks you to take a photo of the parking. NOBODY ever checks those.
Only a semi-scooter company but Lime, at least in London, do sometimes check the photos because I once got dinged for "incorrect parking" (wrongly, mind, it was parked identically to the Santander bikes a few metres away and thus incapable of inconveniencing anyone.)
kklisura 27 days ago [-]
All that AI CapEx: for a button to glow! WOW. ROI will be off the charts!
stnmtn 27 days ago [-]
I think this is more like a developer was bored one afternoon and implemented this for fun, and then everyone else thought it was fun enough to put into the main codebase.
>If anything a dumb rule like "highlight the subscribe button if the transcript mentions it" is much more benign then a recommendation engine that can somehow always serve you a relevant video.
I'm not saying this feature is scarier than the recommendation engine. I'm saying Google revealing how much they know about the content of the video makes everything they do, including their recommendation engine, a little scarier.
CSS animations weren't there I guess, but given a manual transcription, either by the submitter, or by a person watching, youtube definitely had the capability to do exactly this, maybe with the button being some flash animation.
I mean the "given a manual transcription" condition means they very much didn't have "the capability to do exactly this". The animation itself isn't the interesting part here.
Yes but the quality and accuracy of that system would be inferior, with a long delay until it gathers enough collaborative filtering information to be useful.
Indeed, and I assume that's why the home page is full of sports and music when I'm not logged in. (Or was recently, it now shows a blank page).
I'm a very strange person: I don't have any interest in sports, and will listen to perhaps 2 or 3 bits of music every week — if I've heard a piece before, I generally find it predictable and by extension dull, unless some prior association of positive feelings can override that.
It wouldn't surprise me if they use transcripts, but neither feature strictly requires it.
I think the real risk here is users extrapolating from "YouTube can automatically respond to the content of videos". The Mr Beast thing is still blowing up, could they automatically identify illegal lotteries? Could they identify things that aren't suitable for YouTube Kids but end up there anyways? So on and so forth, YouTube's moderation is basically always under fire.
This seems like a risky mixed message to send. Being able to light up the Subscribe button doesn't imply the technical ability to do better moderation, but I also wouldn't want to be the guy that has to explain that and how it works to the Senate.
Don’t get started with the “it’s a slippery slope” kind of shit. Short of living in some society where it’s every man for himself, kids shouldn’t be playing lotteries
It also helps that there’s no counterparty risk with the like button. I don’t think anyone is trying to “trick” that feature, and it wouldn’t matter if they did.
Understanding the content is ultimately the only way to properly enforce and discriminate IP theft vs. fair use at scale. And the Senate wants that feature as the IP lobby is strong.
All the different mechanisms that youtube has used are an interesting rabbit hole, but it's ultimately an unwinnable cat and mouse game unless you have an algo that understands the actual content.
Even the human reporting has become weaponized. Don't like the message of a video? Have a bunch of youtube accounts flag it and it gets demonetized.
OotL; Mind sharing a link or two?
Familiar with MrB but no idea about anything blowing up. Quick search doesn't make the reference obvious.
That’s the original video that kicked it off. It’s pretty mundane as far as internet controversy goes. The lotteries were run poorly and as such were probably illegal, not exactly uncommon on YouTube. The show largely targets kids, so there’s some accusations of it effectively being gambling for kids.
After that people dug into his whole life and found out one of his employees was convicted of statutory rape and he was aware, which is a bad look for a channel mostly watched by children.
It’s like a 3/10 on internet controversy scales. Doing something common and dumb but not obviously malicious, and then just stuff that looks bad without concrete evidence anything bad actually happened.
It’s not terribly interesting, I only saw because of a comedian making fun of it.
I sure wish they did.
1) similarity to other users (people who watched this also watched)
2) similarity in words in title or description (or the generated transcript)
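The first option in the list above needs no knowledge of what is in a video, only of who watched what. A toy sketch of that idea, with invented user names, video names and watch histories, and no claim that this is how YouTube's recommender actually works:

```python
from collections import Counter


def also_watched(histories: dict[str, set[str]], video: str, top_n: int = 3) -> list[str]:
    """Rank other videos by how many users watched them alongside `video`."""
    counts = Counter()
    for watched in histories.values():
        if video in watched:
            counts.update(watched - {video})
    # Sort by count descending, then by name so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:top_n]]


# Invented watch histories, purely for illustration.
histories = {
    "user_a": {"cooking_101", "knife_skills", "bread_basics"},
    "user_b": {"cooking_101", "knife_skills"},
    "user_c": {"cooking_101", "bread_basics", "chess_openings"},
    "user_d": {"chess_openings", "endgames"},
}

print(also_watched(histories, "cooking_101"))
# ['bread_basics', 'knife_skills', 'chess_openings']
```

A brand-new video has no co-watch counts at all, so a scheme like this has nothing to say about it until enough people have watched it.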
I never get any relevant ads or videos for me :(.
How did you think they did automated closed captioning..?
Think of it like reading a foreign language that uses the same alphabet. You can give me some French text and I can read it well enough for a French speaker to understand it. That doesn't mean that I myself understand what I read. Transcripts are the former and this feature is the latter.
> It's just a substring match.
Let's just say this is true. That is a super simple process, but what would it look like?
Step 1: Transcribe the audio into text
Step 2: Run substring match on text
The transcribing/closed captioning feature only does step 1. This shows that a step 2 is possible. I think you would have to be naive to think the capability to do this type of analysis on the transcribed text was designed for only this feature and would never be used for anything else. This feature is announcing that Youtube isn't merely creating transcripts of the audio in videos, it is running some unknown amount of analysis on that data.
As I said in my original comment, "it isn't that I didn't know Google had this ability", but this is literally a glowing sign pointing to this fact. I think the danger of Google reminding people of this has potential to outweigh the benefit of the "that's cute" reaction that this is designed to elicit.
Why would you have to be naive to believe this? The subtitles, with timings, are available on the client-side already. You seem to imply that doing this work would require some sort of deep analysis work. I think it's really more like 5 lines of JS, and 4 of them are producing the fun animated gradient :P
This feature is like walking into your kitchen one day to find a dead cockroach. I’m saying that is an indication of a cockroach problem while you’re effectively responding with “it’s fine, the one cockroach is dead and there is no reason to believe there are any others.”
> This feature suggests Google understands what was said.
Running .includes() on the client does not imply that Google has any "understanding" of what was said. It only implies that they ran .includes() on the client. includes() does not "understand" anything.
The thing I really don't understand is this: the fact that Google has closed captions at all implies they do enormously more "analysis" than this minor feature could possibly require. If you understand how Google does CCs and what that means, this shouldn't have bothered you at all.
In your analogy, it's like you see a mound of a hundred thousand cockroaches, but you're worried about a dust speck in another room.
Do you want to have a good faith conversation about this? Because going back to debating the meaning of "understanding" after I already said this was misleading is not a good way to have a conversation.
>The thing I really don't understand is this: the fact that Google has closed captions at all implies they do enormously more "analysis" than this minor feature could possibly require. If you understand how Google does CCs and what that means, this shouldn't have bothered you at all.
Can we set up a baseline that there is a difference between content agnostic analysis and content aware analysis? Transcripts are content agnostic in that they can be produced without any comprehension of the words said. This feature is content aware in that it is looking for specific meaning in the words said. Do you not see any difference between these two?
Call it "understanding", call it "content aware analysis". I guarantee that their closed captioning service has much more of that quality than this new feature does.
> Can we set up a baseline that there is a difference between content agnostic analysis and content aware analysis? Transcripts are content agnostic in that they can be produced without any comprehension of the words said. This feature is content aware in that it is looking for specific meaning in the words said. Do you not see any difference between these two?
Again, I don't see it. CCs are not content agnostic: they have to have semantic understanding of the words said in order to produce accurate results. How do you think CCs differentiate between the words "to", "too" and "two" without looking at the surrounding words and having some idea of contextual usage? How do you think CCs can tell between "there" and "they're" without understanding if the speaker is referring to a person or a location? This is only the tip of the iceberg as to how CCs actually work, and more "content aware analysis" will always lead to more accurate CCs.
Still can't get away from that "understanding" debate. You're also now equating an understanding of context with an understanding of meaning. An understanding of meaning isn't needed to differentiate between "to", "two", and "too" because they're all used differently in sentences. When the system encounters those, I don't think it goes to the definitions and tries to find which word makes the most meaningful sentence. Most times the specific homophone can be inferred based on things like part of speech and the part of speech can often be inferred from a sentence without knowing any meaning.
For example, would the system be able to properly handle homophones that are grammatically similar? Could it consistently transcribe sentences like "I have Celiac disease and enjoy the taste of rose water, so I prefer flower to flour in my desserts." That is an easy sentence to understand for anyone who knows the meaning of those words, but there are no grammatical or structural indications on which flower/flour to use.
But either way, that is getting way too deep in the weeds compared to where my point started. This feature calls attention to an analysis of meaning because the user sees the software reacting to the meaning of the content of the video. A transcript does not call attention to an analysis of meaning because the behavior of the software does not change based on the content of the video.
Your first comment - the one that started all this - was, as far as I can understand, arguing that this feature indicated that Google had the capabilities to do more advanced - understanding? processing? meaning analysis? - than it had done in the past. If I keep coming back to that, well, it's because it appears to be your main point. If it's not, please correct me.
> Most times the specific homophone can be inferred based on things like part of speech and the part of speech can often be inferred from a sentence without knowing any meaning.
This is not true. I don't think I have enough responses left on HN to fully explain why homophones cannot be inferred without understanding meaning, but I encourage you to go and read about how transcription works!
> For example, would the system be able to properly handle homophones that are grammatically similar?
I mean, this is easy enough for you to check. Here's some videos about flour / flower - notice how the CCs correctly determine if the word is flour or flower with almost 100% accuracy.
https://www.youtube.com/watch?v=y8vLjPctrcU https://www.youtube.com/watch?v=xdaRvErv2Kc
> This feature calls attention to an analysis of meaning because the user sees the software reacting to the meaning of the content of the video.
Are you saying you specifically think that YT is analyzing meaning from this feature, or just some generic user? I think you are smart enough to know that it's not true, but perhaps my mom might not understand that CCs require infinitely more processing power and this feature is just a drop in the bucket. (If you really still don't think it's true, definitely go read more about how CCs are made!)
Here is what I said. "It highlights how much Google analyses the content of its videos... It isn't that I didn't know Google had this ability...". My point was not that I learned about Google's capability from this feature or that this capability was new, it is that this calls attention to Google looking for meaning in the content of the video. A transcript does not call attention to Google looking for meaning regardless of how the transcripts are prepared.
>I mean, this is easy enough for you to check. Here's some videos about flour / flower - notice how the CCs correctly determine if the word is flour or flower with almost 100% accuracy.
Both those videos include the correct homophone in the title and description of the video. Choosing the correct one is not an indication of the system using the meaning of those words, it is pattern recognition. Every use of "flower" means the next usage is less likely to be "flour". The specificity of the example I used was important because it used both "flower" and "flour" in a way that can only be distinguished by the meaning of the words.
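To make "pattern recognition" concrete, here is a toy Python sketch of picking a spelling purely from which one has already shown up in the title, description, or earlier transcript. It is an assumption for the sake of argument, not a description of how YouTube's captioning works; `pick_spelling` and the `HOMOPHONES` set are made-up names.

```python
from collections import Counter

# One homophone pair, hard-coded for illustration only.
HOMOPHONES = {"flour", "flower"}

def pick_spelling(context_text, default="flower"):
    """Pick whichever spelling already appears most often in the context.

    This only counts strings; nothing here models what either word means.
    """
    words = (w.strip(".,!?\"'").lower() for w in context_text.split())
    counts = Counter(w for w in words if w in HOMOPHONES)
    if not counts:
        return default  # no evidence either way, fall back to a fixed guess
    return counts.most_common(1)[0][0]

print(pick_spelling("How to grow a flower garden: flower care tips"))
print(pick_spelling("Sifting flour for bread. Which flour is best?"))
```

A counter like this gets single-topic videos right and has nothing to go on for a sentence that needs both spellings, which is the case my example was built around.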
>Are you saying you specifically think that YT is analyzing meaning from this feature, or just some generic user? I think you are smart enough to know that it's not true, but perhaps my mom might not understand that CCs require infinitely more processing power and this feature is just a drop in the bucket. (If you really still don't think it's true, definitely go read more about how CCs are made!)
This feature is a glowing sign that Youtube as a company analyses the content of the videos for the meaning of what is said in those videos. You are too deep into the technical details trying to assign credit for what aspect of Youtube does the "understanding" or which "require[s] infinitely more processing power".
Think of this feature like receiving mail and you see one of the letters has already been opened. That could make you feel like your privacy was invaded in a way you wouldn't feel after receiving a postcard. And now we have spent several comments debating whether a torn envelope indicates whether anyone read the letter and whether a postcard is private.
Regardless, there is in my opinion a clear distinction in sophistication between a filter and something that triggers a timed action. And that was really what my original comment was about, this feature's elevated sophistication is a conscious reminder of Google's capabilities. Normally that is out of sight and out of mind which is probably better for Google.
Try 1 to 2.
I'd expect the button to light up at random times in a video about "smash the like button" causing the button to light up, rather than in the end call to action for the video where the video author actually wants the watcher to click the button.
Similarly, I doubt it can properly understand when the video presenter implies the term "smash the like button" while instead, say, leaving an empty space where they would usually say such a thing.
If I pointed you at some french text and asked you to point at the term "le Chateau" you'd be able to do that without understanding french.
I honestly suspect this feature is just implemented as a side effect of the auto-captioning... and now I'm a bit curious if carefully crafted manual captions can trigger it to go off continuously.
There's a whole bunch of lore about the types of words and phrases (such as slurs and profanity) that can affect algorithm rankings, monetization, or child-restriction.
Doing a regex search for "smash that like button" is trivial by comparison.
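For a sense of scale, here is a minimal Python sketch of that kind of keyword trigger. It assumes the captions are already available as (start_seconds, text) pairs; the cue format, the `find_triggers` name, and the pattern itself are illustrative guesses, not anything known about YouTube's implementation.

```python
import re

# Hypothetical pattern: "smash/hit/click the/that like button", any casing.
LIKE_PATTERN = re.compile(
    r"\b(?:smash|hit|click)\s+(?:the|that)\s+like\s+button\b",
    re.IGNORECASE,
)

def find_triggers(cues):
    """Return start times (seconds) of caption cues that mention the like button.

    `cues` is an assumed format: an iterable of (start_seconds, text) pairs.
    """
    return [start for start, text in cues if LIKE_PATTERN.search(text)]

cues = [
    (0.0, "Welcome back to the channel"),
    (12.5, "This video is about the phrase itself"),
    (598.0, "If you enjoyed this, smash that like button"),
]
print(find_triggers(cues))  # [598.0]
```

A phrase split across two cues would slip past this, and it fires on any literal match regardless of context, which is about what you'd expect from a string search.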
So that sweet engagement KPI is positively impacted and Youtube's CRO/experiment/A/B testing team gets their cookies.
There are probably various small tweaks like this in the UI.
The biggest issues tend to be when "Big Brother" style companies perform that kind of analysis on your content like the Microsoft AI screenshotting backlash.
I think I've missed this - what happened?
Google analyses the content of the uploaders' videos yet "YouTube search", from the end user's perspective, is seemingly searching strings in titles, maybe tags. These titles generally suck for this purpose, even containing unicode garbage. Throw in all the "recommended" crap in the "results", videos that do not contain the strings being searched, and it barely feels like search at all. The end user is sent "results" she never asked for.
I mean, it has autogenerated captions, and translated ones too (which I use to learn languages, fun).
From a technical perspective I like the attention mechanism and that it's automated.
(I use the VTT on my Firefox, I use IDM to download the auto-generated English language scripts for lengthy videos).
I remember listening to Chomsky saying that he prefers reading transcripts to listening to or watching speeches.
It is one of those things I don't care (at all) to know how it is technically set up :)
The best calls to action inside a video take advantage of the Zeigarnik effect: once they're invested in the content, people really want to find out the
I turned the TV off and walked away...
I think there’s a similar effect for the subscribe button. But I don’t remember which phrase triggers it
It might even just be conventional speech-to-text, which can struggle with accents and poor acoustics, etc.
No need for fancy AI, on top of whatever technology powers the transcription feature.
> Davie504 in pieces
The best example of this is clickbait thumbnails. Nearly every creator I am subscribed to has expressed hatred for them, and yet they mostly still make them. Why?
Because, when youtube shows your thumbnail to someone, if they do anything other than click on that thumbnail, your channel/video is downranked in the algorithm (because it has "low conversion"). Every creator is in a zero sum, fully antagonistic war with every other creator to get you to click on their video instead of someone else's. If a channel has low enough "conversion" for long enough, youtube will stop showing that video to potential viewers, including your actual subscribers. For a channel to survive and make enough money to pay rent and things, you have to play the dumb algorithm games. Patreon is literally the only alternative.
Youtube prefers it this way because then they get to crowdsource more addictive and "engagement driving" content through millions of creators, rather than just a department in google (which honestly they probably still have). They didn't have to tell anyone to make "more stupid clickbait thumbnails", they did it themselves because it was encouraged by the incentives given to them. A Youtube where everyone plays the algorithms and thumbnail games literally earns more money and views than a youtube where every creator is only driven by their own desires to make good art. Google has been doing this since they bought Youtube. Early on, only videos longer than ten minutes could get ads, so every video ballooned to ten minutes and one second in length.
I’m wondering what an unbiased survey would even look like.
Maybe I have just become more attentive to these things or they have become more prevalent, but I notice this blatant optimization for the Algorithm(tm) more and more. I guess this is the meta you need to play in order to stand out/make it in the torrent of content that is YT or social media in general. I also see this as an example of Goodhart's law as creators try to optimize for reach and thus likes etc. while the original intention of the YT Algorithm(tm) was (or should have been) to serve you content you will enjoy.
The thing that I do see on every video that reminds me to hit the like button is seeing the counter move. I'm way more likely to see that and think "yeah I do like this video, I should hit this button" when I see the counter change than I am to have a similar reaction when explicitly told by the person in the video to do it.
I'm sure many here would love to make use of some uBlock Origin rulesets to disable them?
Before Animation: <yt-smartimation class="smartimation smartimation--experiment-enabled smartimation--enable-masking">
While Animation: <yt-smartimation class="smartimation smartimation--experiment-enabled smartimation--active-border smartimation--active-background smartimation--enable-masking">
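For anyone wanting to target the animation, a quick Python sketch that diffs the two class lists above shows which classes appear only while it is running. The class strings are copied from the two snippets; `added_classes` is just a helper name made up for this example.

```python
# Class attributes copied from the "before" and "while" snippets above.
before = "smartimation smartimation--experiment-enabled smartimation--enable-masking"
during = (
    "smartimation smartimation--experiment-enabled smartimation--active-border "
    "smartimation--active-background smartimation--enable-masking"
)

def added_classes(before_attr, during_attr):
    """Return the classes present during the animation but not before it."""
    return sorted(set(during_attr.split()) - set(before_attr.split()))

print(added_classes(before, during))
# ['smartimation--active-background', 'smartimation--active-border']
```

So the two `smartimation--active-*` classes look like the ones a cosmetic rule or userscript would need to neutralize, assuming the markup stays as observed here.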
I need to balance all the time spent on this site reading about using Kubernetes at scale, AI changing the world and how to bring more value to my employer with their videos.
https://youtu.be/ia8Q51ouA_s?si=U3WbLvRlV04fjo4J
I'm sure that's not ideal (e.g. not for Google trying to understand you better) for the addicts, but for the other ones, this can be the perfect solution.
I don't blame creators for doing this: creators have to say this to compete (this is an example of Moloch[1]). The point isn't to hurt creators who do this, it's to remove the incentive so that creators no longer have to do this. So I think the proper way to do this is to announce that videos that ask for likes or subscribes will be penalized, and only penalize videos that come out after the announcement.
But, YouTube serves advertisers, not viewers, so they'll never do anything that benefits viewers but hurts advertisers.
[1] https://slatestarcodex.com/2014/07/30/meditations-on-moloch/
[1] https://www.nrk.no/ostfold/xl/tiktok-doesn_t-show-the-war-in...
So, at first I thought that Google is parsing the text of the video being played, but is not listening to your mic.
However, several comments both here and on YT imply that Google is ALSO listening to what you say on your mic. Which, frankly, is much more scary - though I cannot currently test.
Other than that, why wouldn’t they know what’s being said in a video they host?
Yet...they can't hire more staff, implement databases to track malicious/unreliable DMCA submitters, and falsely-targeted creators for whom complaints are likely BS, and implement AI-powered tools to assist with all of this?
Example: Over here there is a huge problem with rented e-scooters being left wherever.
Every scooter company asks you to take a photo of the parking. NOBODY ever checks those.
I did a quick test with GPT-4 and it could tell me if a scooter was parked in a considerate way about 90% of the time (10 photos, some with multiple scooters).
Could they implement this in an afternoon? Yes. Will they? No. Because it'd be used to penalise their own customers.
Only a semi-scooter company but Lime, at least in London, do sometimes check the photos because I once got dinged for "incorrect parking" (wrongly, mind, it was parked identically to the Santander bikes a few metres away and thus incapable of inconveniencing anyone.)