Hackers' Pub

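I fed NotebookLM an audio file of a conversation (not music or nature sounds) and asked it a question about timing. It replied:

---
Note: The sources do not include second-level timestamps, so I will specify positions by content boundaries (source numbers) and by the "start and end phrases" that serve as editing markers.
---

So I guess that internally it transcribes the audio first and then processes the text with an LLM.

I had assumed, without any real basis, that it treated speech and text on an equal footing, the way CLIP does for images,
but if this is the architecture, then it can't pick up on things like pauses or tone of voice.
I suppose this is what you end up with when you want to make use of LLMs as an existing asset.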
