
Google Labels AI & Bot Content in Forums

Summary

– Google updated its structured data documentation to add new properties for Discussion Forum and Q&A Page markups.
– A key new property, `digitalSourceType`, lets sites label whether content was created by a trained AI model or by a simpler automated system.
– Google introduced a `commentCount` property to declare the total number of comments, even if not all are shown in the markup.
– The `sharedContent` property for Discussion Forums now explicitly supports specific subtypes like WebPage, ImageObject, VideoObject, and reposted forum content.
– These new properties are recommended but optional, and existing structured data implementations will not break.

Google has enhanced its structured data guidelines for discussion forums and Q&A pages, introducing new optional properties that allow webmasters to provide more detailed context about their content. A key addition is the digitalSourceType property, which enables sites to specify whether a post or comment was generated by an automated system. This update gives publishers a formal method to label content originating from large language models (LLMs) or simpler algorithmic bots directly within their markup.

The property utilizes established IPTC digital source enumeration values to classify content origin. Supported values include TrainedAlgorithmicMediaDigitalSource, for content created by a trained AI model like an LLM, and AlgorithmicMediaDigitalSource, for content from a basic automated process, such as a reply bot. This approach mirrors Google’s existing use of IPTC metadata for images, now extending transparency to text-based community content. The property is recommended for several content types, including DiscussionForumPosting, Comment, Question, and Answer.
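As a rough illustration of the labeling described above, the following Python sketch builds a Comment node and attaches digitalSourceType only when the content is machine-generated. The helper name `label_source` and the `origin` flag are hypothetical, the node is trimmed to a few fields, and whether Google expects the bare enumeration name or a full schema.org URL is an assumption to verify against the documentation.

```python
import json
from typing import Optional

# Enumeration names as quoted in the article. Whether the bare name or a full
# schema.org URL is expected is an assumption to check against Google's docs.
SOURCE_TYPES = {
    "trained_model": "TrainedAlgorithmicMediaDigitalSource",  # e.g. an LLM
    "simple_bot": "AlgorithmicMediaDigitalSource",            # e.g. a reply bot
}


def label_source(node: dict, origin: Optional[str]) -> dict:
    """Return a copy of `node`, adding digitalSourceType for machine-made content.

    `origin` is a hypothetical internal flag: "trained_model", "simple_bot",
    or None for human-written content, which is left unlabeled.
    """
    labeled = dict(node)
    if origin is not None:
        labeled["digitalSourceType"] = SOURCE_TYPES[origin]
    return labeled


# Trimmed Comment node; a real one would carry more fields.
comment = {
    "@type": "Comment",
    "text": "Here is a summary of the thread so far.",
}

llm_comment = label_source(comment, "trained_model")
human_comment = label_source(comment, None)

print(json.dumps(llm_comment, indent=2))
```

Leaving human-written content unlabeled is a design choice of this sketch, since the article names values only for the two automated cases.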

Alongside source labeling, Google introduced a commentCount property. This recommended feature lets sites declare the total number of comments on a post or answer, which is particularly useful when comments are paginated or not all are included in the markup. For Q&A pages, a new formula clarifies that answerCount + commentCount should equal the total number of replies, giving search engines a more accurate measure of thread engagement.
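The counting rule can be sketched in Python as follows. The Question node below is a made-up example in which only one of three answers appears in the markup, and `total_replies` is a hypothetical helper, not part of any Google tooling.

```python
def total_replies(question: dict) -> int:
    """Total replies to a question: answerCount + commentCount.

    Missing counts are treated as zero, which is an assumption of this sketch.
    """
    return question.get("answerCount", 0) + question.get("commentCount", 0)


# Hypothetical Q&A thread: 3 answers and 2 direct comments exist on the page,
# but only one answer is included in the markup (for example, due to pagination).
question = {
    "@type": "Question",
    "name": "How do I label bot replies in structured data?",
    "answerCount": 3,
    "commentCount": 2,
    "suggestedAnswer": [
        {
            "@type": "Answer",
            "text": "Add digitalSourceType to the reply.",
            # Comments on this answer, declared even though none are marked up.
            "commentCount": 4,
        }
    ],
}

replies = total_replies(question)
print(replies)
```

The declared counts describe the full thread, so they can exceed the number of nodes actually present in the markup.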

The guidelines also expand support for shared content within forums. Instead of a generic CreativeWork, the sharedContent property now explicitly accepts specific subtypes: WebPage for shared links, ImageObject for image-centric posts, VideoObject for video posts, and, newly, DiscussionForumPosting or Comment for marking up content quoted or reposted from other threads. Accompanying code examples demonstrate how to annotate a referenced comment with its URL, author, date, and text.
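A quoted comment of the kind described above might be assembled like this in Python. This is a sketch and not Google's published example: the URLs, names, and the `quote_comment` helper are hypothetical, and using datePublished for the date is an assumption.

```python
import json


def quote_comment(url: str, author_name: str, date_published: str, text: str) -> dict:
    """Build a Comment node describing content quoted from another thread."""
    return {
        "@type": "Comment",
        "url": url,
        "author": {"@type": "Person", "name": author_name},
        "datePublished": date_published,  # assumed property for the date
        "text": text,
    }


# Hypothetical forum post that quotes a comment from a different thread.
post = {
    "@context": "https://schema.org",
    "@type": "DiscussionForumPosting",
    "headline": "Following up on the caching discussion",
    "text": "I think this earlier point deserves its own thread.",
    "sharedContent": quote_comment(
        url="https://forum.example.com/threads/123#comment-456",
        author_name="example_user",
        date_published="2024-01-15T09:30:00+00:00",
        text="Cache invalidation is the real problem here.",
    ),
}

print(json.dumps(post, indent=2))
```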

Further refinement appears in the handling of images. Google’s updated documentation advises that link preview images should be placed within the sharedContent field’s attached WebPage object, rather than in a post’s primary image field, promoting more accurate content representation.
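One way to follow this advice is to keep the preview image out of the post's own fields when the markup is generated. The Python sketch below assumes a hypothetical `build_link_post` helper and placeholder URLs.

```python
def build_link_post(headline: str, link_url: str, preview_image: str) -> dict:
    """Build a forum post that shares a link.

    The preview image goes on the shared WebPage, not on the post itself,
    so the post carries no top-level "image" field.
    """
    return {
        "@type": "DiscussionForumPosting",
        "headline": headline,
        "sharedContent": {
            "@type": "WebPage",
            "url": link_url,
            "image": preview_image,  # link preview belongs to the shared page
        },
    }


link_post = build_link_post(
    headline="Worth reading: structured data changes",
    link_url="https://news.example.com/structured-data-update",
    preview_image="https://news.example.com/images/preview.png",
)
```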

These new properties are entirely optional, and existing structured data implementations remain valid without any required changes. For platforms hosting a blend of human and machine-generated content, the digitalSourceType property offers a standardized way to disclose content origin directly to Google. The search engine has not specified whether or how this data will influence ranking or display features, stating only that it serves to indicate provenance. Sites interested in the new properties can adopt them at their own pace.

(Source: Search Engine Journal)
