I think the theoretical existence of "reply guys" on #Mastodon has a very simple technical explanation.
The existing #activitypub implementation in #Mastodon has a critical flaw: it displays only a small subset of the replies to any given toot (status), namely those that already exist in the instance's cached federated timeline. This is a typical example of a state-of-the-art primitive algorithm (actually, there is currently no algorithm where there should be one: not a shady, manipulative algorithm, but a deterministic and transparent one).
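To illustrate the effect, here is a minimal toy model in Python. It is not Mastodon's actual code: the `Reply` class, the `visible_replies` function, the sample accounts, and the simplification that a small instance caches only replies from accounts its users follow are all assumptions made for the sake of the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reply:
    id: str
    author: str       # e.g. "alice@big.example" (made-up account)
    in_reply_to: str  # id of the status being answered


def visible_replies(status_id, local_cache):
    """Return only the replies that this instance happens to have cached."""
    return [r for r in local_cache if r.in_reply_to == status_id]


# All replies that exist on the origin server (hypothetical data).
origin_replies = [
    Reply("r1", "alice@big.example", "s1"),
    Reply("r2", "bob@mid.example", "s1"),
    Reply("r3", "carol@tiny.example", "s1"),
    Reply("r4", "dave@big.example", "s1"),
]

# Assumption: the small instance cached only replies from followed accounts.
followed = {"alice@big.example"}
small_instance_cache = [r for r in origin_replies if r.author in followed]

shown = visible_replies("s1", small_instance_cache)
missing = len(origin_replies) - len(shown)
print(f"shown locally: {len(shown)}, missing: {missing}")
```

In this toy setup the reader on the small instance sees one reply out of four, so the thread looks almost empty even though three other people have already answered.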
And if we don't see all replies (which is especially true on small, self-hosted instances, which cache a very limited subset of federated content), then we tend to answer. And because we don't see others like us, we don't know there are many like us explaining the same thing. And the poor author of the original toot sees it all, and doesn't understand why we all repeat more or less the same thing. Why don't we just fav the other replies, like we did on Xitter? Well... because we don't see them! We would need to "open original page" each time we want to reply.
To make #Mastodon a really useful tool for public discourse, we would need a chance to see all replies. Of course, we would like to apply our own sort criteria, not be manipulated by an algorithm choosing "personalized best replies for us" like on late Xitter. But really, we need a chance to see them all. Well, limitations may apply, like "only replies from the first hour" or "only the replies with the most favs and boosts". But we need a chance to see them all. Reply spam would become an issue, of course.
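What I mean by a deterministic and transparent way of choosing replies could look roughly like the sketch below. Everything in it is hypothetical (the `Reply` fields, `select_replies`, `by_engagement`, `by_time`, and the sample data); it only shows that the reader, not the platform, picks the sort key and the limits, and that the same input always gives the same order.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Reply:
    id: str
    created_at: datetime
    favs: int
    boosts: int


def select_replies(replies, posted_at, sort_key, window=None, limit=None):
    """Filter and sort replies by rules the reader chose; no hidden ranking."""
    if window is not None:
        # Keep only replies posted within `window` of the original toot.
        replies = [r for r in replies if r.created_at - posted_at <= window]
    # Tie-break on id so the same input always yields the same order.
    ordered = sorted(replies, key=lambda r: (sort_key(r), r.id))
    return ordered[:limit]  # limit=None keeps everything


def by_engagement(reply):
    return -(reply.favs + reply.boosts)  # most favs and boosts first


def by_time(reply):
    return reply.created_at  # oldest first


t0 = datetime(2024, 1, 1, 12, 0)  # when the original toot was posted
replies = [
    Reply("a", t0 + timedelta(minutes=5), favs=2, boosts=0),
    Reply("b", t0 + timedelta(minutes=20), favs=7, boosts=3),
    Reply("c", t0 + timedelta(minutes=90), favs=50, boosts=9),
    Reply("d", t0 + timedelta(minutes=40), favs=7, boosts=3),
]

# "Only replies from the first hour, the two with the most favs and boosts."
top = select_replies(replies, t0, by_engagement,
                     window=timedelta(hours=1), limit=2)
print([r.id for r in top])  # ['b', 'd']

# Or simply everything, in chronological order.
everything = select_replies(replies, t0, by_time)
print([r.id for r in everything])  # ['a', 'b', 'd', 'c']
```

The rules are plain function arguments, so anyone can check why a given reply was shown or left out. This sketch does nothing about reply spam, which would need separate handling.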
I wonder if upcoming releases of Mastodon are going to fix this issue... and to be fair, it is not an easy engineering problem, as distributed information systems are really hard. We are now merely confederated; we are not really federated.