This seems like an invalid test.
One of them collected posts from Hacker News and LinkedIn profiles and then linked them by using cross-platform references that appeared in user profiles. They then stripped all identifying references from the posts and ran a large language model on them.
If I post something on LinkedIn, and then post the same thing on Hacker News, of course an LLM could match my accounts up.
Am I missing something?
From a Facebook post I made on February 17th:
There are giant AI data firms that promise they can go through massive troves of data and pull out general and specific information from them. Information that is actionable and accurate. Give it 6 million data points and it’ll find all the links and organize them for you and unmask hidden details that aren’t visible to the naked eye.
Not one of those companies is stepping up to go through the publicly released Epstein files.
Today I asked AI to tell me which phone providers were available, sorted by price and offers, and it lied all the time. When I pointed it out, the AI corrected most of it but also removed some that were accurate for some reason.
It would have been quicker if I had done that myself instead of asking AI. Oh, and it also didn’t provide all the companies.
Maybe those companies have better AI that can make no mistakes but I doubt it, I think the LLMs will lie and no one has time to check if they are correct.
There were reports of people trying to unredact the files almost immediately.
But that’s not the same, is it?
I don’t think you can do literally the same thing on the Epstein files. Maybe I’m misunderstanding what you have in mind.
In theory, using the information in the released files and the information in public sources, it should be possible to figure out who those redacted names are based on writing style and other factors. We should be able to deanonymize.
Hmm. Maybe but it is not the same problem as those discussed in OP. I also have some doubts about the paper, but that’s another story. You could try it out?
I’m not qualified to design the prompts and home users can’t really pile in 3 million+ documents.
Prompts are in the appendix: https://arxiv.org/abs/2602.16800
I don’t know how far you get on the free tier but it should be at least enough for a proof of principle; to get other people to chip in. You didn’t have qualms demanding other people should do this for free.
Mind that this is a serious GDPR violation in Europe. So there will be serious pressure on AI companies to prevent this kind of use.
And it will falsely identify people at even greater scale, because it is an imprecise and buggy tool.
Yeah, but if it falsely identifies the right people, is it really buggy?
How dare you claim that the hallucination engine hallucinates. The Billionaires have declared this heresy.
Great, we’re at a point where “researchers” are helping tech bros hurt the public interest. Could they just NOT publish this shit? Stop giving helpful tips to tyrannical oligarchs!
Academics can be stupid idiots sometimes.
Tbh I read the research article and it’s not rocket science that they were doing. Any 2nd rate FBI analyst would have come up with these ideas sooner or later to try and match anonymous profiles with verified ones using LLMs.
Researchers’ work has always been abused by others. The advancement and free distribution of knowledge should not be curtailed for fear of malicious parties.
Average people download games and apps and their phone is loaded to the hilt with bloatware. You think they care?
The average person puts their entire lives on Facebook or linkedin with their real names…they don’t give a shit.
“WeLl I hAvE nOtHiNg To HiDe”
The number of times I’ve heard this from people in the secops field is frighteningly high.
I call BS. We can’t even get AI models to determine if an AI wrote a text. This has got to be some magic statistics.
Is this the first step towards using local LLMs for anonymity? 🫠 Always rephrasing each sentence somewhat. Truly dystopian stuff
The results, especially the high numbers stated in the news article (68% recall, 90% accuracy), are overestimated, because their verification method (i.e., checking whether the LLM really detected the right account) relies on matching verified accounts with a test set of anonymous accounts whose real names they knew. They knew the real names because those people had a public link to their LinkedIn in their “anonymous” profile (which was removed for the sake of testing whether the LLM can match the two accounts). That being said: a user who uses a pseudonym but publicly links his/her account to, say, a LinkedIn account doesn’t really care about anonymity and might hand out many more ‘breadcrumbs’ to follow than a truly anonymous account.
But I still think that also in the case of a fully anonymous account, people can be fingerprinted and matched with non-anonymous identities due to language, style etc. by an LLM.
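To make the "fingerprinting by language and style" point concrete, here is a rough sketch of old-school stylometry: compare character trigram counts with cosine similarity and pick the closest known author. This is not what the paper does (they use an LLM); it is a much cruder stand-in. The function names and the toy texts are made up for illustration.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams in lowercased, whitespace-normalized text."""
    cleaned = " ".join(text.lower().split())
    return Counter(cleaned[i:i + n] for i in range(len(cleaned) - n + 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_match(anonymous_post: str, known_authors: dict[str, str]) -> tuple[str, float]:
    """Return the known author whose text is most similar to the anonymous post."""
    target = char_ngrams(anonymous_post)
    scores = {name: cosine(target, char_ngrams(text)) for name, text in known_authors.items()}
    winner = max(scores, key=scores.get)
    return winner, scores[winner]


# Hypothetical toy data, only to show the call shape.
KNOWN = {
    "alice": "tbh i reckon the whole thing is overblown, innit",
    "bob": "In my considered opinion, the methodology is fundamentally unsound.",
}
name, score = best_match("tbh i reckon this paper is overblown, innit", KNOWN)
print(name, round(score, 2))
```

With real accounts you would need far more text per author, and a high similarity score is a hint, not proof, which is exactly where the false identification worry comes in.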
Reminds me of an AI tool that could identify authorship of articles with surprisingly high accuracy, and then they peeked under the hood and realized it was just looking for the author byline at the top of the article that says “By John Doe,” where it completely failed if the article didn’t explicitly say who the author was.
I can’t believe this product, modeled after humans, would lie and cheat like humans
For those who don’t know, we’ve been living in a dystopia since the 2000s.
Yeah. I got a hunch of that a while ago, while trying some “old” scenarios of de-anonymization we used to do by hand. Just asking questions and posting pictures got surprisingly accurate results. A single picture with (to me) no significant landmark could lead to localizing a specific part of a city, and that was using a local LLM with a relatively small model, running on a 16GB VRAM 4060Ti.
It is now time to remember fondly the time where the younger people were warned by older people to not post all their stuff online, not over-share, be cautious about strangers, etc. I’m not sure when we lost that, but oh boy, it’s a festival.
Hmmm interesting. I’ve never used AI to try and find out stuff about myself. Maybe I’ll try. Just curious.
That’s how they get you

Do y’all not write differently when you’re trying to be discreet on Blind?
Brazil has 200 million ppl, how would they find someone in Rio like me?
- Filter for Brazilians who are Pink Floyd fans
- Filter for Brazilians who can speak English
- Filter for Brazilians who are left/socialist and who are on alternative social media sites.
And so on
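The filters above can be sketched as a chain of predicates, where each one shrinks the pool of candidates. Everything here is hypothetical: the `Profile` fields, the `narrow` helper, and the six made-up people are only there to show how quickly a handful of attributes cuts a crowd down.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    handle: str
    country: str
    speaks_english: bool
    likes_pink_floyd: bool
    leftist_on_alt_social: bool


def narrow(candidates, filters):
    """Apply each (label, predicate) in turn and record how many candidates survive."""
    sizes = [("start", len(candidates))]
    for label, predicate in filters:
        candidates = [p for p in candidates if predicate(p)]
        sizes.append((label, len(candidates)))
    return candidates, sizes


FILTERS = [
    ("Brazilian", lambda p: p.country == "BR"),
    ("Pink Floyd fan", lambda p: p.likes_pink_floyd),
    ("speaks English", lambda p: p.speaks_english),
    ("leftist on alt social media", lambda p: p.leftist_on_alt_social),
]

# Made-up toy population, not real data.
POPULATION = [
    Profile("a", "BR", True, True, True),
    Profile("b", "BR", True, True, False),
    Profile("c", "BR", False, True, False),
    Profile("d", "BR", False, False, False),
    Profile("e", "US", True, True, True),
    Profile("f", "BR", True, False, True),
]

remaining, sizes = narrow(POPULATION, FILTERS)
for label, size in sizes:
    print(f"{label}: {size}")
```

None of the filters is identifying on its own. It is the intersection that does the damage, and every extra detail you post adds another filter.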
Fuuuu





