Google has developed a system called “topic authority” to improve the relevance and expertise of content in Google Search and News. The system helps Google serve local news queries better by surfacing local and regional content in Top Stories and other news features.
The “topic authority” system uses a range of signals to assess a publication’s expertise in specific areas, including source notability, influence, original reporting, and source reputation.
Key Signals of Topic Authority: Notability, Influence, and Reputation
Notability of a source for a topic or location: This signal looks at how relevant a publication is to certain topics or geographical areas. The system understands and learns from user behavior, recognizing which publications users frequently turn to for specific kinds of news or information. For instance, a local newspaper might be the go-to source for news about local high school football games. Over time, the system recognizes this pattern and assigns a higher level of authority on this topic or location to the publication.
Influence and original reporting: This signal assesses a publication’s authority based on how often its original reporting is cited by other publishers. If a news outlet frequently breaks stories that are then picked up and referenced by other media, it demonstrates that the outlet is influential and authoritative in its field. In 2022, Google started adding a “Highly Cited” label to these stories to help users identify authoritative and influential sources.
Source reputation: This signal considers a source’s history of high-quality reporting and recommendations from expert sources, such as professional societies. For example, a publication with a history of producing original reporting, or one that has received numerous journalistic awards, would be considered to have a strong reputation. This reputation further solidifies the outlet’s position as an authoritative source.
Topic Weights and Authorship Percentage in Determining Authority
According to the patent “System and method for determining topic authority” (US8458196B1), “topics” and “weights” play a crucial role in this system: a topic represents the subject matter of a document, and a weight represents how strongly that topic is associated with the document.
So, there’s this cool method for figuring out which articles or documents know their stuff about certain topics. It starts by getting some info about the topics covered in a document. For each topic, there’s a “weight” that shows how much the document is really about that topic.
Now, topics can be a bunch of things – categories, big ideas, subjects, concepts – basically, whatever the document’s content is all about. For instance, let’s say you have a document that talks about cars, planes, and boats. Those are the specific topics. But in a broader sense, the document might be about ‘vehicles’ and ‘modes of transportation’ – those are the big-picture topics.
Now, onto weights. You can think of a weight like a score, showing how strongly a document is tied to a topic. It can be a percentage, a number from 0 to 1, or another representation such as a vector. So, if a document is super focused on ‘vehicles,’ that topic might get a high weight like 0.8 or 80%. All the weights for the topics in a document usually add up to 1 or 100%.
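To make the idea of weights concrete, here is a minimal Python sketch. It is not the method from the patent: the keyword map `TOPIC_KEYWORDS` and the function `topic_weights` are hypothetical names invented for illustration, and a real system would likely use a trained classifier rather than keyword matching. The sketch only shows how per-topic scores can be normalized so that they add up to 1.

```python
from collections import Counter

# Hypothetical keyword-to-topic map, used only for this illustration.
TOPIC_KEYWORDS = {
    "vehicles": {"car", "cars", "plane", "planes", "boat", "boats"},
    "travel": {"trip", "journey", "ticket"},
}


def topic_weights(text: str) -> dict[str, float]:
    """Return one weight per matched topic, normalized to sum to 1."""
    counts = Counter()
    for word in text.lower().split():
        token = word.strip(".,;:!?\"'")  # drop surrounding punctuation
        for topic, keywords in TOPIC_KEYWORDS.items():
            if token in keywords:
                counts[topic] += 1
    total = sum(counts.values())
    if total == 0:
        return {}  # no known topic found in the text
    return {topic: n / total for topic, n in counts.items()}


weights = topic_weights("cars boats planes trip")
print(weights)
```

In this toy input, three of the four matched words belong to ‘vehicles’ and one to ‘travel’, so ‘vehicles’ ends up with the larger share of the total weight.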
Also, the system collects information about who wrote the document. For every topic in there, it tracks who the authors are and what percentage of the content on that topic each author wrote. This helps the system understand which authors really know their stuff about certain topics.
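One way to picture how topic weights and authorship percentages could be combined is sketched below. The scoring rule (topic weight multiplied by the author's share, summed over documents) is an assumption made for illustration; the article does not state the formula the patent uses. The function `author_topic_scores`, the data layout, and the author names are all hypothetical.

```python
from collections import defaultdict


def author_topic_scores(documents):
    """Sum (topic weight * authorship share) per (author, topic) pair.

    Assumed scoring rule for illustration only, not the patent's formula.
    """
    scores = defaultdict(float)
    for doc in documents:
        for topic, weight in doc["weights"].items():
            shares = doc["authorship"].get(topic, {})
            for author, share in shares.items():
                scores[(author, topic)] += weight * share
    return dict(scores)


# Hypothetical documents: topic weights plus, per topic, the share of
# the content on that topic written by each author.
documents = [
    {
        "weights": {"vehicles": 0.75, "travel": 0.25},
        "authorship": {
            "vehicles": {"alice": 1.0},
            "travel": {"alice": 0.5, "bob": 0.5},
        },
    },
    {
        "weights": {"vehicles": 1.0},
        "authorship": {"vehicles": {"alice": 0.5, "bob": 0.5}},
    },
]

scores = author_topic_scores(documents)
for (author, topic), score in sorted(scores.items()):
    print(author, topic, score)
```

Under this assumed rule, an author who writes most of the content on a heavily weighted topic across several documents accumulates the highest score for that topic.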
In the context of the topic authority system, this data can be valuable in several ways. It can help to further refine the expertise of individual authors on certain topics, which can contribute to the overall evaluation of a publication’s authority on these topics. It can also aid in identifying and promoting subject matter experts within a field, thereby enhancing the richness and diversity of information surfaced in response to user queries.
For example, if an author frequently writes on a specific topic and their articles are widely read and cited, the system might recognize this author as a particularly reliable source on that subject, even if they contribute to various publications. This author-centric view can complement the overall source reputation, providing a more nuanced understanding of expertise and authority in the landscape of news and information.
In summary, the most prominent signals suggest that content creators should strive for a focused, original, high-quality approach to reporting, particularly in their areas of specialization or local relevance, to maximize their visibility and authority in Google search results.
Publications that provide great coverage about the areas and topics they know well are more likely to be recognized by the topic authority system and surface higher in search results.
How can Google detect topics?
Using our site as an example, let’s try to simulate how Google defines topics. To do this, I downloaded URLs from Google Search Console for the last 30 days.
Using artificial intelligence, I obtained a summary of the text of each page. Additionally, using Netpeak Spider, I retrieved the meta titles of each URL.
Based on the analysis of my dataset using the Claude chatbot, I obtained the following results.
Here are the most frequent bigrams from the given text columns:
the_google – 83 occurrences
search_engine – 34 occurrences
machine_learning – 31 occurrences
ai_technology – 22 occurrences
digital_age – 21 occurrences
artificial_intelligence – 20 occurrences
google_search – 19 occurrences
natural_language – 17 occurrences
big_data – 17 occurrences
language_models – 16 occurrences.
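A bigram count like the one above can also be reproduced without a chatbot. The following Python sketch is an assumption about how such a count might be computed, not the procedure Claude actually used; `top_bigrams` and the sample texts are hypothetical. It does not filter stop words, so pairs such as the_google can appear in its output.

```python
import re
from collections import Counter


def top_bigrams(texts, n=10):
    """Count adjacent word pairs across all texts and return the top n."""
    counts = Counter()
    for text in texts:
        # Lowercase and keep only alphanumeric tokens.
        tokens = re.findall(r"[a-z0-9]+", text.lower())
        # Pair each token with the one that follows it.
        counts.update(f"{a}_{b}" for a, b in zip(tokens, tokens[1:]))
    return counts.most_common(n)


# Hypothetical page summaries or meta titles.
texts = [
    "Google Search uses machine learning",
    "Machine learning powers Google Search",
]

for bigram, count in top_bigrams(texts):
    print(f"{bigram} - {count} occurrences")
```

To apply it to a real dataset, pass in the list of summaries and meta titles exported from the crawl instead of the sample `texts`.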
Here are some additional insights from the analysis:
- The vast majority of the articles discuss topics related to Google, search engines, and AI technologies. This indicates that VProexpert focuses on covering trends in these fields.
- The frequent mentions of “machine learning”, “artificial intelligence”, “big data” and related terms show that these technologies are important subjects covered by VProexpert.
- The frequent appearance of the bigrams “Google Search” and “Search Engine” is clear evidence that search optimization and ranking are a major topic for the website.
- Terms like “digital age” and “multimodality” indicate that VProexpert discusses emerging technologies and their impact in the contemporary digital era.
- Several articles discuss “language models” and “natural language”, showing VProexpert covers topics at the intersection of AI and language technologies.
- The text summary and title of many articles reference Google’s systems, tools and updates, showing Google and its products form a central focus of VProexpert’s coverage.
In summary, based on the frequent bigrams and related terms, it is clear that VProexpert covers topics mainly in the areas of:
- Search engine optimization
- Google search and Google products
- Artificial intelligence
- Machine learning
- Big data and data analytics
- Emerging technologies and their impacts.