I would expect Lemmy to show up equally in the search results if there is enough relevant content. My tiny tiny instance is already showing up in search results, so crawlers can definitely find stuff on here. It would be great if at some point we could append “lemmy” to search queries to get the good stuff, like we could with Reddit.
I’ve been toying with this idea as well. I don’t think it’s a good idea to scrape all content. This could drown out the Lemmy-original content, especially where large subreddits are concerned. Maybe an upvote threshold (only scrape if more than X upvotes) would be a good idea.
I would also scrape only the post itself, not the comments. Best to have our own organic discussions here.
Finally, it should be very clear that a bot is posting these things. Ideally the bot would also ensure it is not re-posting something that was already posted by a Lemmy user just a bit earlier.
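To make the filtering part concrete, here is a rough sketch of how I imagine it, not a working bot. Everything in it is made up for illustration: the `Post` class, `normalize_url`, `select_posts`, and the default threshold of 500 are all my own assumptions, and it does no actual scraping or posting. It only keeps posts above the upvote threshold and skips links that are already on Lemmy.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class Post:
    """Hypothetical stand-in for a scraped post (post only, no comments)."""
    title: str
    url: str
    upvotes: int


def normalize_url(url: str) -> str:
    """Reduce a URL to a comparison key: no scheme, no 'www.', no trailing slash."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/")
    key = host + path
    if parts.query:
        # Keep the query string, since it can identify the page itself.
        key += "?" + parts.query
    return key


def select_posts(candidates, already_on_lemmy, min_upvotes=500):
    """Return the candidates worth re-posting.

    candidates: iterable of Post
    already_on_lemmy: iterable of URLs that were posted on Lemmy recently
    min_upvotes: the "X" threshold; 500 is an arbitrary placeholder
    """
    seen = {normalize_url(u) for u in already_on_lemmy}
    selected = []
    for post in candidates:
        if post.upvotes <= min_upvotes:
            continue  # only scrape if MORE than X upvotes
        key = normalize_url(post.url)
        if key in seen:
            continue  # someone (or the bot itself) already posted this link
        seen.add(key)
        selected.append(post)
    return selected
```

The duplicate check here only compares links, so it would miss the same story posted under a different URL. Comparing titles as well would catch more, at the cost of some false positives.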