Fighting spam in MediaWiki

Cleaning up spam over and over has resulted in a little tutorial and a presentation that I want to share.

OK, so here is some advice that can improve your spam situation on its own.

  1. Healthy community. If your wiki has enough users who care about the content, they will clean up the spam manually.
  2. Heuristics and little tricks. For example, you can set a waiting period for all newly registered users. That helps because the majority of bots register and immediately write something.
  3. QuestyCaptcha instead of any other captcha.
  4. If nothing else helps, use the behavior analysis tool called AbuseFilter.
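
The waiting-period trick from item 2 can be sketched in a few lines. This is only an illustration of the logic, not MediaWiki code: the function name can_edit and the one-day WAITING_PERIOD are assumptions made up for the example, and on a real wiki such a rule would normally be set through configuration rather than written by hand.

```python
from datetime import datetime, timedelta

# Hypothetical value: how long a new account must wait before it may edit.
WAITING_PERIOD = timedelta(days=1)


def can_edit(registered_at: datetime, now: datetime,
             waiting_period: timedelta = WAITING_PERIOD) -> bool:
    """Return True once the account is at least as old as the waiting period."""
    return now - registered_at >= waiting_period


registered = datetime(2013, 1, 1, 12, 0)

# A bot that registers and writes five minutes later is rejected.
print(can_edit(registered, datetime(2013, 1, 1, 12, 5)))

# A user who comes back two days later is allowed to edit.
print(can_edit(registered, datetime(2013, 1, 3, 12, 0)))
```

The check is deliberately dumb: it does not try to recognize spam, it only breaks the "register and immediately write" pattern that most bots follow.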


The importance of participating in the OSS community

When you’re making software that is based on open source code, it’s crucially important to participate in the life of the community. Why?

First of all, let’s remember why we decided to depend on OSS in the first place. Well, it’s better than building everything from scratch, because other great programmers have already done much of the work for us. Since this software is used by a number of people, the developers have probably fixed many bugs. In many cases open source code follows some kind of coding standard and is covered by some amount of unit tests. If the community is big and the projects this code is used in are important, then someone has already thought about security as well. Reusing that code saves us a lot of money.

This code is evolving: the developers add a lot of new features and make the software faster and more beautiful. So when we use open source code as a platform for our projects, we try hard not to patch it too much, because we want to keep receiving all these updates. This is also very understandable.

So what about the original thesis? We’re making money here, so why do we have to answer newbie questions on the forums, sort out mistakes on the mailing lists, improve documentation and sponsor events related to that software? I’ll try to offer my opinion.

First of all: the more people are interested in the software, the more people eventually participate in its creation. And as I have shown above, we treat open source developers as our employees who create great code, but we don’t have to pay them anything. Of course we want more of them to come.

Secondly, the more people are interested in the software and use it intensively, the more bugs they report. Those are our free testers, and because of that we also want to open our own code.

Thirdly, when the software is popular it’s easier to find the hackers and wizards who can work as our employees. The more of those guys there are on the market, the more candidates we have to choose from.

How does software become popular? Well, it has to be good and usable, have good documentation, and have a community that can answer any question. You can also promote it with advertising and events. As you can see, we have now come full circle.

MediaWiki research: should we show the users when the page was last modified?

The Wikimedia Foundation and the community put a lot of effort into self-research. That’s understandable: Wikipedia is unique and there is nothing to compare it with. Today I’m going to talk about one such project, a small one. The project is about the small timestamp that is currently located at the very bottom of each page. The timestamp shows when the page was last modified.

Wikipedians got interested in how the location of this timestamp can affect user behavior on the wiki. For wiki projects it is vitally important that users understand that they can edit pages. Why is that? Because most of the time it is the users who produce the value of the wiki. So the intuition behind this research was probably the following: if I can show users that the page they are viewing is not a static document but an ever-changing object, these users will be able to feel that they can edit it too.

So they tried moving the timestamp from the very bottom of the page to its upper right corner and making it clickable.

The result was that users started to click more on the article’s revision history tab. The revision history is not a particularly friendly place for a non-technical user, in my opinion, but I still believe that some users were able to understand what it is. It was very disappointing for me to see that the researchers haven’t measured the correlation between page edits and the new position of the timestamp, but they said that it was probably not very big anyway.

This experiment is one of the Editor engagement experiments, which are all related to users’ motivation to participate in Wikimedia projects.

Hello world!

My name is Yury Katkov and I want to try blogging. I work at the WikiVote! company and I’m a big fan of Semantic MediaWiki and Linked Data. My plan is to periodically post something technical about Semantic MediaWiki here: screencasts, manuals, observations and just thoughts.