I made a robot moderator. It models trust flow through a network built from voting patterns, and detects people and posts/comments that are accumulating a large amount of “negative trust,” so to speak.
In its current form, it is supposed to run autonomously. In practice, I have to step in and fix its boo-boos when it makes them, which happens occasionally.
I think it’s working well enough at this point that I’d like to experiment with a mode where it acts as an assistant to an existing moderation team, instead of taking its own actions. I’m thinking about making it auto-report suspect comments instead of autonomously deleting them. Other modes might be useful too, but that seems like a good place to start. Is anyone interested in trying the experiment in one of your communities? I’m pretty confident that at this point it can ease moderation load without causing many problems.
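For anyone wondering what "trust flow through a network of voting patterns" can look like in code, here is a rough sketch. It is not the bot's actual algorithm, which this post does not spell out. It assumes a simple PageRank-style propagation where only positively trusted voters pass influence along, and the names `propagate_trust`, `seed_trust`, `rounds`, and `damping` are all made up for illustration.

```python
from collections import defaultdict


def propagate_trust(votes, seed_trust, rounds=10, damping=0.85):
    """Spread trust along votes (illustrative only, not the real bot).

    votes: list of (voter, author, value) tuples, value is +1 or -1.
    seed_trust: dict mapping a few known-good users to a starting trust.
    Returns a dict mapping every user to a trust score.
    """
    users = set(seed_trust)
    for voter, author, _ in votes:
        users.add(voter)
        users.add(author)

    # Each voter's influence is split evenly across the votes they cast.
    votes_cast = defaultdict(int)
    for voter, _, _ in votes:
        votes_cast[voter] += 1

    trust = {u: seed_trust.get(u, 0.0) for u in users}
    for _ in range(rounds):
        incoming = defaultdict(float)
        for voter, author, value in votes:
            # Assumption: users with zero or negative trust carry no weight.
            weight = max(trust[voter], 0.0) / votes_cast[voter]
            incoming[author] += weight * value
        trust = {
            u: (1 - damping) * seed_trust.get(u, 0.0) + damping * incoming[u]
            for u in users
        }
    return trust


if __name__ == "__main__":
    votes = [("alice", "bob", +1), ("alice", "troll", -1)]
    print(propagate_trust(votes, seed_trust={"alice": 1.0}))
```

In this toy version, a downvote from a trusted user pushes the target's score below zero, while downvotes from untrusted users count for nothing. That is one way "negative trust" could accumulate.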
What sample size of comments or time span does the bot use to make a decision? Is it safe to assume that new troll accounts would fly under this bot’s radar?
Edit: Also, is there functionality to look up a user with this tool? I would be interested in checking some users I’ve interacted with in some political and news-related communities.
It’s the last 30 days of comments. That’s long enough to be robust, but short enough that someone can realistically rehabilitate their image with the bot by not being a jerk for 30 days, and restore their posting ability.
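A minimal sketch of that 30-day window, assuming each comment is a dict with a timezone-aware `published` timestamp (the field name and the `recent_comments` helper are hypothetical, not taken from the bot):

```python
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(days=30)  # the 30-day lookback described above


def recent_comments(comments, now=None):
    """Keep only comments published within the last 30 days."""
    if now is None:
        now = datetime.now(timezone.utc)
    cutoff = now - WINDOW
    return [c for c in comments if c["published"] >= cutoff]


if __name__ == "__main__":
    now = datetime(2024, 7, 1, tzinfo=timezone.utc)
    comments = [
        {"id": 1, "published": now - timedelta(days=5)},
        {"id": 2, "published": now - timedelta(days=31)},
    ]
    print([c["id"] for c in recent_comments(comments, now=now)])
```

Anything older than the cutoff simply drops out of the calculation, which is why 30 days of better behavior is enough to reset the picture.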
I was hoping that it would be a good tool for self-reflection and fairness in moderation. In practice, the people who get banned for being jerks are totally uninterested in revising their commenting strategy, and choose instead just to yell at me that I’m awful and my bot is unfair and it should be their right to come in and be a jerk if they want to, and banning them means I am breaking Lemmy. Then they restart one of the arguments that got them banned in the first place. I don’t know what I was thinking, expecting anything different, but that’s what happened. You can see some of it happening in these comments.
New accounts, or accounts that have been recently inactive, are a hard problem. I think I’ve got it mostly worked out now. If the bot has limited information, it won’t ban you, but it will be super-strict if you have a generally negative reception: if its unclear impression of you is negative and you also make a comment that gets downvoted, it’ll delete the comment. I think it should work fairly well, but it’s still in development. It’s hard to test, because that situation only comes up a few times a month, so I basically just have to wait a while every time I make a change.
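The rules in that paragraph can be sketched as a small decision function. The thresholds (`MIN_VOTES_FOR_CONFIDENCE`, `BAN_THRESHOLD`), the inputs, and the `decide` function are assumptions made up for illustration. The real bot's numbers and data model are not given here.

```python
from enum import Enum


class Action(Enum):
    NONE = "none"
    DELETE_COMMENT = "delete_comment"
    BAN = "ban"


# Hypothetical thresholds, not the bot's real values.
MIN_VOTES_FOR_CONFIDENCE = 50
BAN_THRESHOLD = -1.0


def decide(user_score, votes_seen, comment_score):
    """Pick an action for one comment by one user.

    user_score: the bot's overall impression of the user (negative is bad).
    votes_seen: how much voting data backs that impression.
    comment_score: net votes on the comment being evaluated.
    """
    limited_info = votes_seen < MIN_VOTES_FOR_CONFIDENCE
    if limited_info:
        # Never ban on thin evidence, but be strict per comment.
        if user_score < 0 and comment_score < 0:
            return Action.DELETE_COMMENT
        return Action.NONE
    if user_score <= BAN_THRESHOLD:
        return Action.BAN
    return Action.NONE


if __name__ == "__main__":
    print(decide(user_score=-0.3, votes_seen=4, comment_score=-6))
```

The point of the split is that a new or returning account can lose individual comments, but cannot be banned until there is enough data to be confident.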
You can check a user by searching the modlog for their username, with [email protected] as the moderator, and seeing what comes up. If you see that they’ve been banned at any point, then they are probably a reprobate of one sort or another.
deleted by creator
You can do that now, and evade human moderation in the same way.
I don’t want you to give it a try in the Santa communities, even though it would be a badly-needed test of the system. The code that’s supposed to detect and react to that doesn’t get much action. Mostly it’s been misfiring on the innocent case, and attacking innocent people because they’re new and they said one wrong thing one day. I think I fixed that, but it would be nice to test it in the other case, with some participation that I know is badly intended, and make sure it’s still capable of reacting and nuking the comments.
But no, please don’t. The remedy for that kind of thing is for admins to have to do work to find and ban you at the source, or look at banning VPNs or something, which is sad for other reasons, so I don’t want that. Just leave it until real bad people do it for real, and then the admins and I will have to work out how to get rid of them when it happens.
Thanks for the information. I took a look at the bot’s community, and for what it’s worth, I appreciate the amount of effort you put into fine-tuning it as well as being as transparent as possible.