
AI chatbots offer users a unique experience by generating answers directly from other sources, and that has left many wondering where exactly they get their information from.
For decades, Google has been seen as a magic answer box of sorts: enter any question and you'll be given an answer. Those answers were never actually produced by the search engine itself, though; they were written by a whole host of websites that you could choose from.
By contrast, the rise of artificial intelligence chatbots cuts out the middleman by removing the need for users to seek out the information themselves, and this has had a potentially devastating impact on countless websites across the internet.
AI has to get that information from somewhere, though, and many have been left stunned by a study that reveals the top sources that the biggest models on the market pull from right now.
Where does AI get its information from?
A new study conducted by SEO and keyword research company Semrush has revealed the top domains most frequently cited by ChatGPT, Perplexity, and Google's AI Mode and AI Overviews.
There are currently 20 entries on the list, with 19 of them ranging between 4.22% and 26.33% citation frequency, but the most cited website stands out significantly and perhaps unexpectedly at the top.

According to the data, which was collected from over 150,000 citations, forum-like social media platform Reddit is by far the most cited website by the aforementioned AI models, with a citation frequency of 40.11%.
That puts it 13.78 percentage points ahead of the next most popular destination, Wikipedia (26.33%), which isn't too surprising considering how many people added 'reddit' to their Google searches before AI boomed in popularity.
Other key sites that make up the top of the list include YouTube (23.52%), Google (23.28%), Yelp (21.01%), and Facebook (18.72%).
Why is this worrying?
The prevalence of Reddit in AI citations is worrying on two separate fronts: the frequency is likely driven by deals made between the social media platform and AI companies and, more importantly, Reddit isn't always a reliable source.
As reported by Reuters in February 2024, Google made a deal with Reddit to make all of the platform's content available for training AI models used across Gemini and Search, with the value working out at around $60 million each year.
OpenAI also announced a new partnership with Reddit just a few months after the Google deal, where "OpenAI will bring Reddit content to ChatGPT and new products, helping users discover and engage with Reddit communities."

While plenty of people head to Reddit to find answers – specifically for questions that other websites might not cover – it's not a completely reliable source, considering that all information on the website is produced by users themselves.
Where humans might be able to parse when a comment isn't genuine, AI will almost definitely struggle, to the point where it has been known to take clearly comedic comments – which there are plenty of on Reddit – at face value.
OpenAI CEO Sam Altman has previously expressed his surprise that people actually trust ChatGPT, but it's certainly worrying how much content is sourced from Reddit, given how blindly many people would take the advice and information provided to them by an AI.