
Last Updated On: September 18, 2025

This article helps moderators understand how to use Gainsight’s AI Moderation.

 

Overview

 

Community moderators are responsible for ensuring that user-generated content aligns with established guidelines and maintains a respectful, trustworthy environment. However, manually reviewing high volumes of content can be time-consuming, delay content publishing, and introduce inconsistencies in moderation.

Using AI Moderation, configured under Pre-Moderation Rules in the community settings, moderators can streamline this process. The Moderation AI Agent uses powerful AI models to evaluate content in real time, helping moderation teams screen posts faster and more consistently.

 

Why AI Moderation?

 

Moderation AI Agent helps enforce your community’s code of conduct by:

  • Automatically detecting potentially inappropriate or harmful content.
  • Flagging submissions for manual review or blocking them automatically based on risk level.
  • Complementing existing tools such as Keyword Blocker and Spam Prevention.

Moderation AI Agent can reduce the manual workload of your moderators, improve consistency, and accelerate content verification, while continuing to provide a safe and high-quality experience for all community members.

 

How Does AI Moderation Work?

 

Moderation AI Agent classifies posts and replies into Approved, Pending, or Trash, applying informative reasoning and descriptive moderator tags for human intervention, filtering, and review.

It provides fully automated moderation by immediately processing all User-Generated Content (UGC) through OpenAI's Moderation API as an initial compliance check. Content that clears this check is then validated against detailed checks for adherence to community guidelines, absence of NSFW content, PII protection, and spam detection.
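As a rough mental model, this behaves like a two-stage pipeline. The sketch below is illustrative only and is not Gainsight's implementation: the moderations.create call is OpenAI's public Moderation API, while community_checks and the fixed score it returns are assumptions made so the example can run end to end.

```python
# Illustrative two-stage moderation sketch (not Gainsight's actual code).
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment


def community_checks(post_text: str) -> float:
    """Hypothetical stand-in for the second-stage validation (code of conduct,
    NSFW, PII, spam). A real implementation would compute a risk score between
    0.0 and 1.0; a fixed placeholder is returned here."""
    return 0.2


def screen_post(post_text: str) -> float:
    # Stage 1: baseline compliance screen via OpenAI's Moderation API.
    result = client.moderations.create(
        model="omni-moderation-latest",
        input=post_text,
    ).results[0]
    if result.flagged:
        return 1.0  # fails the initial screen; treat as maximum risk

    # Stage 2: community-specific validation produces a risk score, which is
    # then mapped to Approved / Pending / Trash (see the table below).
    return community_checks(post_text)
```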

 

Community Code of Conduct

 

The code of conduct provides guidelines for the Moderation AI Agent and sets clear guardrails for moderation. This ensures that AI-powered moderation aligns with the unique needs of your community.

You can reuse your existing public-facing community rules, code of conduct, or equivalent guidelines. In addition, you may include internal training documents that community managers use during onboarding. The Moderation AI Agent evaluates content against your code of conduct to make the initial decision about what is acceptable within your community.

Note: The Community Code of Conduct can be up to 5,000 characters in length.
 

Moderation Status

 

Moderation AI Agent scores content on a scale of 0.0 - 1.0 and currently has three possible outcomes:

  • Approved (confidence score 0.0 - 0.4): Content is considered safe and appropriate. The content is approved, published, and visible in the community.
  • Pending (confidence score 0.4 - 0.7): Borderline or uncertain content that requires manual review. The content is held in the Pending status, is not published, and is not visible in the community.
  • Trash (confidence score 0.7 - 1.0): Content violates community or Gainsight moderation guidelines. Content that is Trashed, or Trashed and Reported, is not published and is not visible in the community.
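As a worked example of the ranges above, a score of 0.35 would be Approved, 0.55 would be Pending, and 0.85 would be Trashed. The sketch below assumes exact boundary handling at 0.4 and 0.7, since the documented ranges share those boundary values:

```python
def status_from_score(score: float) -> str:
    """Map an AI confidence score (0.0 - 1.0) to a moderation status.
    Treating 0.4 and 0.7 as the start of the next band is an assumption."""
    if score < 0.4:
        return "Approved"  # published and visible in the community
    if score < 0.7:
        return "Pending"   # held for manual review, not published
    return "Trash"         # not published, not visible in the community


print(status_from_score(0.35))  # Approved
print(status_from_score(0.55))  # Pending
print(status_from_score(0.85))  # Trash
```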

 

Moderator Tags

 

Moderation AI Agent adds tags to each topic or reply that it reviews, to support sorting, filtering, and other analytics across all moderated content.

 

  • Positive Case: Meets code of conduct, SFW, No PII, No Spam, Approved
  • Pending Case: Pending code of conduct, Pending NSFW, Pending PII, Pending Spam, Pending
  • Negative Case: Does not meet the code of conduct, NSFW, Contains PII, Contains Spam, Trash
  • Other Cases: Flagged by OpenAI, AI Moderator, To be reviewed by the community manager
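Purely as an illustration (this is not Gainsight's internal data model), the tag groups can be thought of as a lookup keyed by the review outcome; the tag names below come from the list above, but the dictionary structure is an assumption:

```python
# Hypothetical grouping of Moderator Tags by review outcome, mirroring the
# categories above. The dictionary layout is an assumption for illustration.
MODERATOR_TAGS = {
    "positive": ["Meets code of conduct", "SFW", "No PII", "No Spam", "Approved"],
    "pending": ["Pending code of conduct", "Pending NSFW", "Pending PII",
                "Pending Spam", "Pending"],
    "negative": ["Does not meet the code of conduct", "NSFW", "Contains PII",
                 "Contains Spam", "Trash"],
    "other": ["Flagged by OpenAI", "AI Moderator",
              "To be reviewed by the community manager"],
}
```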

 

Configure AI Moderation

 

Moderators can configure the Moderation AI Agent alongside Keyword Blockers and Moderator Approval to ensure that the AI evaluates content before it is published in the community.

To configure AI Moderation:

  1. Log in to Control.
  2. Navigate to Settings > Pre-Moderation Rules. The Pre-Moderation Rules page appears.
  3. Expand AI Moderation.
  4. Turn on the Enable AI Moderation toggle.
    Note: When AI Moderation is enabled, an AI moderator user is created. Gainsight recommends not deleting this user.

     

     

  5. (Optional) In addition to Administrator, Community Manager, Moderator, and Superuser, you can add other roles whose content is excluded from AI Moderation. To add the roles:

    1. From the Additional Excluded Roles dropdown, select the custom roles.

    2. Click Apply.

  6. In the Community Code of Conduct, enter your community guidelines.

  7. Click Save changes.

Content Moderation Widget

 

Once Moderation AI Agent is configured, any Topics or Replies that do not meet your community’s guidelines are automatically tagged with Moderator Tags. These posts are then moved to Trash and Reported.

 

You can review this content in the Content Moderation widget on the Control Home page.

For more information on how to add this widget, refer to the Overview of Control Home article.

Hey Community!!
We have recently updated the article with information on the Content Moderation Widget!


Thanks a lot for this new feature! I have to start using it ASAP 😀

At this point, I have a few questions:

  1. When we write the Code of Conduct for the AI, what kind of tone should we use, and who should we address in the text? Is it the AI, the community member, or does it not really matter?

  2. About the tags — what does “SFW” mean?

  3. The Moderation AI works with topics (opening posts) and replies. If the AI thinks that a member’s reply to a topic is suspicious and marks it as “Pending” (for example), how can we know why that decision was made, given that there are no moderation tags on the reply?


Hi ​@revote,

Thank you for reaching out to us with your query. Tagging the PM @Graeme Rycyk for more inputs on these. Thank you.


Thanks a lot for this new feature! I have to start using it ASAP 😀

At this point, I have a few questions:

  1. When we write the Code of Conduct for the AI, what kind of tone should we use, and who should we address in the text? Is it the AI, the community member, or does it not really matter?

  2. About the tags — what does “SFW” mean?

  3. The Moderation AI works with topics (opening posts) and replies. If the AI thinks that a member’s reply to a topic is suspicious and marks it as “Pending” (for example), how can we know why that decision was made, given that there are no moderation tags on the reply?

Hey ​@revote,

My name is Graeme and I am the Product Manager for the Moderation AI Agent. First, thanks for your questions here.

1. Your Code of Conduct can be written as if you were posting your community rules to your community, but it can also be written like training or onboarding documentation for a new member of the community team. The idea is that it operates the same way your human community moderators would moderate your community.

2. SFW stands for "safe for work," which means that the content is appropriate and is often used to indicate that something is suitable to view in a professional or public setting.

3. The Moderation Agent uses Moderation Tags to give your community team a fast and easy way to understand what moderation decisions have been made and why. Further to this, when a post is Trashed and Reported, a more detailed explanation of the reported reason is also included.

If you have any other questions, please feel free to reach out.

All the best,

Graeme


3. The Moderation Agent uses Moderation Tags to give your community team a fast and easy way to understand what moderation decisions have been made and why. Further to this, when a post is Trashed and Reported, a more detailed explanation of the reported reason is also included.

Where can these tags be seen? Is there a new field in Control? Currently, moderation tags appear in the opening post.


Hey ​@revote,

The Tags that the Moderation Agent adds are visible in the “Moderator Tags” section of the Moderate Topic page in Control.

Home > Content > Overview > [Topic]

See this example here:

If you need anything please let me know.

Cheers,

Graeme

 


Maybe I don’t know how to ask, but your example looks like tags from the opening post.

Will there be the same kind of field next to each and every comment?


Hey ​@revote,

Yes, they do look like the Public Tags, but Moderator Tags are only visible inside Control. If you go to the moderation view of a topic, you will see an input field labeled Moderator Tags, like in the close-up screenshot in my earlier post, or see the full-screen annotated image below. I hope that helps clear this up.
 



If you need anything please don’t hesitate to ask.

Cheers,

Graeme


Thanks, but you are still not answering my question. I don’t know how to ask differently 😃


Hi ​@revote,

Sorry about that. I think I understand the crossed wires here: by “comments”, I think you are referring to Replies, whereas I took “comments” more broadly and thought you were referring to all posts.

So Moderator Tags are not applied to Replies, only Topics. However, we will be extending and improving functionality here to give more context on the status of Replies; I cannot give a firm date on this yet.

Note: When a Reply is Trashed, there will be added context in the “Report” feature to elaborate on the reason the reply was Trashed.

I hope this clears things up.

Cheers,

Graeme


Okay, so how does this actually work?

(Sorry, I was not able to delete those tables here)

How Does AI Moderation Work?

 

Moderation AI Agent classifies posts and replies into Approved, Pending, or Trash, applying informative reasoning and descriptive moderator tags for human intervention, filtering, and review.

     

 

Moderator Tags

 

Moderation AI Agent adds tags to each topic or reply that it reviews, to support sorting, filtering, and other analytics across all moderated content.

   

 

 


Hi ​@Graeme Rycyk , 

I am testing this tool in our sandbox - it seems to work well even in Finnish 💪🏻

However, I have found some cases where synonyms like nasty, mean, vicious, and unkind were not always recognised by the tool. How can we teach the AI?


And another thing came up:

 


The upper comment/reply contains a phone number > it was trashed.
The lower one contained an email address > it was automatically moderated.

Why is the logic different? Is it because an email is more easily recognized as an email?


Hey ​@Suvi Lehtovaara,

An easy way to solve this would be to add words you want to ensure are always caught into your Code of Conduct.

Do let me know if you have any other questions or if this doesn’t work.

Kind regards,

Graeme


Hi ​@Suvi Lehtovaara,
Mohammed Fahd here; I’m the POC from documentation for this article. We recently received a request from you to access the Overview of Control Home article. Apologies for the inconvenience, as the link was wrongly tagged.
We have now updated the article link - Overview of Control Home. Please let me know if you have any trouble accessing it. Thank you.


 

 

Thanks, this is a helpful post.

So there are no Moderation tags in replies, so we don’t know what actions are taken and why.


 

 

Thanks, this is a helpful post.

So there are no Moderation tags in replies, so we don’t know what actions are taken and why.

Yes, that’s true ​@revote. And if there’s no context in the post, the reason may stay unclear.


 

 

Thanks, this is a helpful post.

So there are no Moderation tags in replies, so we don’t know what actions are taken and why.

At this moment yes, this is correct, but this is an improvement we intend to make in the near future.

Kind regards,

Graeme


Yes, that’s true ​@revote. And if there’s no context in the post, the reason may stay unclear.

Also, I didn’t know that AI deletes content automatically. I thought it just hides content that goes against the code of conduct.

Why is this a problem?

If you think about the DSA, the Digital Services Act: if we limit user communication by deleting content or even just part of it, the DSA states that the user has the right to make a note about the decision (the decision to delete content). And we are obligated to handle that note.

In theory, the decision to delete might be wrong. That means we would have to restore the content.

But since there is no version history in replies, restoring the content is not possible.

We need to consider whether we should start using Moderation AI at this point.


Yes, that’s true ​@revote. And if there’s no context in the post, the reason may stay unclear.

Also, I didn’t know that AI deletes content automatically. I thought it just hides content that goes against the code of conduct.

Why is this a problem?

If you think about the DSA, the Digital Services Act: if we limit user communication by deleting content or even just part of it, the DSA states that the user has the right to make a note about the decision (the decision to delete content). And we are obligated to handle that note.

In theory, the decision to delete might be wrong. That means we would have to restore the content.

But since there is no version history in replies, restoring the content is not possible.

We need to consider whether we should start using Moderation AI at this point.

Hey again ​@revote,

So AI Moderation doesn’t delete any content, it only Trashes and adds a Report reason for further context for content that was trashed. This is how content is removed from public viewing in our communities. If mistakes are made by the AI Moderator, say by mistakenly Trashing content, this can always be reversed by a human moderator. Our “Notify Author” flow can be applied by your community team to adhere to the DSA even when using the AI Moderator. Custom views can be set up to enable your community team to easily create a workflow to track the moderation actions of the AI Moderator.

I would be happy to jump on a call to discuss things and take more questions, if you would prefer.

Kind regards,

Graeme


So AI Moderation doesn’t delete any content, it only Trashes and adds a Report reason for further context for content that was trashed.

Okay, good. Thanks.

I just looked at this screenshot (lower example) and @Suvi Lehtovaara’s comment:

 

 


The upper comment/reply contains a phone number > it was trashed.
The lower one contained an email address > it was automatically moderated.

It is good to know that it doesn’t delete content automatically. It just hides content.


So AI Moderation doesn’t delete any content, it only Trashes and adds a Report reason for further context for content that was trashed.

 

It is good to know that it doesn’t delete content automatically. It just hides content.

To be precise, the email address was deleted. I mean, at least in our case we could not find the original content from Control.


I can confirm, AI moderation deletes content automatically. Here’s an example: it deleted the phone number and then published my test topic:

 


But at the same time, I have to say that this works perfectly 👌 I have tested it in several different ways and the AI works nicely. Nice job!

Btw, I didn’t find the phrase for this text to translate into Finnish:

 


To be precise, the email address was deleted. I mean, at least in our case we could not find the original content from Control.

Based on my test, the AI didn’t delete the email address. It just hid the post.

But it deleted the phone number when I tested it with a separate post 😀