

Hrrmm. Webrings it is. But the search engine problem also seems like one calling out for a creative solution, so I'll try to look into it some more. Maybe there's a way to distribute which peer indexes which sites. I'd even be fine sharing some local processing power while I browse to run a local page ranking that then gets shared with peers. Maybe it could be done so that attributes of a page are measured objectively by prevalence, and then the relative positive or negative weighting of those attributes is adjusted per-user.
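Just to make that last idea concrete, here's a toy sketch of what I mean (all the attribute names and numbers are made up): peers publish objective measurements per page, and each user applies their own positive/negative weights to get a personal ranking.

```python
# Hypothetical sketch: shared attribute measurements, personal weights.
# Attribute names here are invented examples, not a real schema.
page_attributes = {
    "example.com/article": {"ad_density": 0.7, "outbound_links": 0.2, "text_ratio": 0.9},
    "example.org/post":    {"ad_density": 0.1, "outbound_links": 0.6, "text_ratio": 0.8},
}

# Per-user weights: positive means "I want more of this", negative "less"
my_weights = {"ad_density": -1.0, "outbound_links": 0.3, "text_ratio": 0.5}

def personal_score(attrs, weights):
    """Weighted sum of the shared measurements under one user's weights."""
    return sum(weights.get(k, 0.0) * v for k, v in attrs.items())

ranked = sorted(page_attributes,
                key=lambda url: personal_score(page_attributes[url], my_weights),
                reverse=True)
```

The nice part is the expensive measurement work is shared across peers, while the cheap weighting step stays local and private to each user.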
Hope it’s not annoying for me to spitball ideas in random Lemmy comments.
Took me a while to get back to this, but yeah, I agree it seems at least conceptually solid. The big barrier is that, as jarfil mentioned, you'd need at least 200 million sites indexed, so you'd need a good number of users for it to work. And those users would need to consent to running software that basically logs every page they visit. There's also a privacy concern: from the "node" an indexed result was pulled from, you can tell that the corresponding user has visited that site. Maybe that could be fixed by each user also downloading indexed site data from others beyond what they personally use, mixing their own activity indistinguishably in with everyone else's? Probably clever vulnerabilities in that too, though.
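The mixing idea in toy form (site names are placeholders): a node answers queries from one pool that combines its own history with entries replicated from peers, so serving a result no longer implies the user visited that site.

```python
# Toy sketch: combine locally-indexed sites with replicated peer entries
# so query answers can't be traced back to the node owner's browsing.
my_entries   = {"siteA", "siteB"}           # sites this user actually visited
peer_entries = {"siteC", "siteD", "siteE"}  # index data replicated from peers

served_pool = my_entries | peer_entries     # one indistinguishable pool

def lookup(matching_sites):
    """Answer a query from the combined pool; origin of each entry is hidden."""
    return sorted(served_pool & matching_sites)
```

Of course this only blurs a single query; a determined observer correlating many queries over time might still narrow things down, which is the kind of clever vulnerability I meant.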
Structurally it seems a lot like DNS. If DNS servers were willing to store embeddings of site content and make those queryable, it would seemingly accomplish the same idea, aside from putting it in the hands of DNS operators. Of course, that massively multiplies the amount of data those servers would need to store, to an impossible degree.
I still need to read up on what primitive indexing actually looks like and how much space it takes to store per site.
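A quick back-of-envelope while I figure that out (the embedding size is an assumption on my part, not something from the thread): at one 384-dimensional float32 embedding per site, 200 million sites is already hundreds of gigabytes, before any keyword index data.

```python
# Rough storage estimate for embeddings only (assumed sizes, not measured).
dims = 384               # assumed embedding dimensionality
bytes_per_float = 4      # float32
sites = 200_000_000      # the ~200M figure from the thread

per_site_bytes = dims * bytes_per_float    # 1536 bytes per site
total_gb = sites * per_site_bytes / 1e9    # ~307 GB for embeddings alone
print(round(total_gb))                     # prints 307
```

So even the "cheap" vector part is far too big for one node, which is another argument for splitting the index across peers.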