Security Fundamentals: Prepare Your Wiki
This discussion is both the most important and the least technical. Planners and technical operators should not skip it to jump straight into the technologies used to secure the wiki, because it provides the foundation on which all later security discussions depend.
As with most questions of IT security, the most effective protection is to plan in advance how the application will be used. Operating a safe wiki becomes much easier if both administrators and users understand its benefits and limitations before launch.
A wiki is by definition an open system. Most if not all of its operational benefits come from the ease with which content can be added and removed, and from the wide audience that can participate in the process. The wiki is not a Content Management System or a secure shared workspace, although it can effectively and easily distribute files, images, and documents. The wiki is not a business messaging system, although it quickly and thoroughly distributes messages.
The Dry Erase Board Model
A particularly effective analogy is to compare a wiki that is used in a professional setting to a dry erase board that might hang in an office. The items have much in common: an easy way to post, edit and remove content, a permanent and easily accessed location, and a space that provides equal opportunity to different types of messages.
When teaching users how to limit the security risks of operating their wiki, many direct analogies to the dry erase board hanging in the office break room become evident:
- The dry erase board is not where you would post sensitive business information that is not appropriate for anyone who floats in and out of the break room to see.
- If there is a message that is vital for all office employees to receive in a timely manner, you would not simply scribble it on the dry erase board.
- Messages on the dry erase board might at times be informal. However, it is in a work setting, so even though it is possible to jot down inappropriate or profane material, it is not acceptable to do so. Your boss might also be awfully interested in having a talk with whoever did it.
- Users all have the ability to pick up both the marker and the eraser. Any message that is written down can be added to, changed, or replaced outright.
It is the development of these common-sense ideas that most fundamentally secures the use of a wiki. All of the security risks that we talk about from here on out are mitigated by proper planning. A wiki is a tool that allows streamlined information distribution and collaboration. It is not a replacement for common sense and properly planned ways of handling and distributing information within an organization. It is also not effective as a sole-source tool, but rather as an overview or easily accessed table of contents that points to more permanent, secure repositories of information.
Instances where outside forces are able to corrupt data within a wiki, overambitious users alter the party line, or a technical failure puts the wiki offline are all mitigated if the use of the wiki was properly planned. The fax machine did not replace the Postal Service, the spellchecker did not replace the Content Editor, and likewise the wiki does not replace established best practice methods for distributing data. It exists as a tool to expedite information sharing and enable the exchange of information over great distances.
Spam: No I don’t want to buy Viagra, thanks though.
The greatest single threat to a wiki is spam. We will limit this discussion to problems faced by public-facing systems. While intranet and internet applications alike might face an abundance of meaningless or inappropriate entries, that is a problem of content editing and is outside the scope of this discussion.
The introduction of a large number of robot-generated spam entries can cripple a wiki in two distinct ways:
- It decreases the perceived value of the information in the wiki to the audience. No one is going to take a wiki seriously when they have to navigate through 20 Viagra ads to get to the information.
- It drains performance, because large amounts of automated entries put extra load on the hardware.
A MediaWiki installation offers several tried and true methods of standing up to this challenge.
Bad Behavior is an open source set of PHP scripts that automatically blocks harvesters, spam bots, and malicious visitors before they can even see the content of the website. The scripts are widely used in public-facing MediaWiki installations. They work by requiring and then analyzing browser identification information before forms can be submitted.
Setting up and installing Bad Behavior is simple:
- Upload / unpack the archive to the /extensions directory
- Edit LocalSettings.php and append:
include( 'extensions/Bad-Behavior/bad-behavior-mediawiki.php' );
From the Bad Behavior website (http://www.bad-behavior.ioerror.us/):
Bad Behavior complements other link spam solutions by acting as a gatekeeper, preventing spammers from ever delivering their junk, and in many cases, from ever reading your site in the first place. This keeps your site’s load down, makes your site logs cleaner, and can help prevent denial of service conditions caused by spammers.
Bad Behavior also transcends other link spam solutions by working in a completely different, unique way. Instead of merely looking at the content of potential spam, Bad Behavior analyzes the delivery method as well as the software the spammer is using. In this way, Bad Behavior can stop spam attacks even when nobody has ever seen the particular spam before.
Bad Behavior works on, or can be adapted to, virtually any PHP-based Web software package. Bad Behavior is available natively for WordPress, MediaWiki, Drupal, ExpressionEngine, and LifeType, and people have successfully made it work with Movable Type, phpBB, and many other packages.
The core of Bad Behavior is free software released under the GNU General Public License.
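The gatekeeper idea described above, judging a request by how it is delivered rather than by its content, can be sketched in a few lines. This is a minimal illustration in Python, not Bad Behavior's actual rule set: the function name, the list of suspicious agent fragments, and the header checks are all assumptions made for the example.

```python
from typing import Mapping

# Hypothetical fragments for illustration only; these are NOT the rules
# that Bad Behavior itself ships with.
SUSPICIOUS_AGENT_FRAGMENTS = ("libwww-perl", "emailcollector", "webbandit")


def looks_like_bot(headers: Mapping[str, str]) -> bool:
    """Return True when the request headers look like an automated client."""
    # Header names are case-insensitive, so normalise them first.
    normalized = {name.lower(): value for name, value in headers.items()}

    agent = normalized.get("user-agent", "").strip().lower()
    if not agent:
        # No browser identification at all.
        return True
    if any(fragment in agent for fragment in SUSPICIOUS_AGENT_FRAGMENTS):
        return True
    # Assumption: a real browser always sends an Accept header.
    if "accept" not in normalized:
        return True
    return False
```

A form handler would call a check like this before accepting a submission and reject the request outright when it returns True, so the spam never reaches the content filters.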
Regular Expression Filtering
Another tactic widely used to prevent spam entries is using a set of regular expression filters to automatically block content. This is analogous to the way a mailer daemon keeps messages that contain any derivation of the word “Viagra” out of your inbox. One specific method of doing this will be provided to demonstrate the concept.
$wgSpamRegex (http://www.mediawiki.org/wiki/Manual:$wgSpamRegex) is a MediaWiki configuration setting that checks each content submission against a regular expression of disallowed terms. The expression is defined in the LocalSettings.php file.
For example, setting the variable to:
$wgSpamRegex = "/buy-viagra|adultporn|online-casino|dirare\.com|sexcluborgy\.net/";
filters out a common set of possibly malicious entries. By default, a user whose content is blocked by the filter would see:
The page you wanted to save was blocked by the spam filter. This is probably caused by a link to an external site.
The following text is what triggered our spam filter: [word/domain name which was blocked]
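The matching step itself is easy to try outside the wiki. The following is a minimal sketch in Python of the same idea: search the submitted text for any disallowed term and report what triggered the filter. The function name is hypothetical, and the case-insensitive flag is an assumption added for the example rather than a description of how MediaWiki applies the setting.

```python
import re
from typing import Optional

# Block list mirroring the example terms above. The dots are escaped so that
# "dirare.com" matches only a literal dot. IGNORECASE is an assumption made
# for this sketch.
SPAM_REGEX = re.compile(
    r"buy-viagra|adultporn|online-casino|dirare\.com|sexcluborgy\.net",
    re.IGNORECASE,
)


def find_spam(text: str) -> Optional[str]:
    """Return the first blocked term found in text, or None if it is clean."""
    match = SPAM_REGEX.search(text)
    return match.group(0) if match else None


if __name__ == "__main__":
    submission = "Great article! Visit http://dirare.com/deals for more."
    blocked = find_spam(submission)
    if blocked is not None:
        print("Blocked by the spam filter, triggered by:", blocked)
```

Testing a candidate expression against a handful of legitimate pages before deploying it helps catch patterns that are too broad and would block real contributions.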
These methods also serve to fight against embedded HTML and hidden CSS which can serve to hide spam in content. Content entry such as:
can be filtered with expressions like:
$wgSpamRegex = "/".
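As a separate illustration of the hidden-CSS case, here is a minimal Python sketch that flags markup whose inline style hides its content. The pattern and the function name are hypothetical and written for this example only; they are not taken from the MediaWiki manual, and a production filter would need to cover more hiding techniques.

```python
import re

# Hypothetical pattern: an HTML tag whose inline style attribute hides it.
HIDDEN_STYLE_REGEX = re.compile(
    r"<[^>]*style\s*=[^>]*(display\s*:\s*none|visibility\s*:\s*hidden)[^>]*>",
    re.IGNORECASE,
)


def has_hidden_markup(text: str) -> bool:
    """Return True when text contains a tag styled to be invisible."""
    return HIDDEN_STYLE_REGEX.search(text) is not None


if __name__ == "__main__":
    entry = '<div style="display:none">cheap pills http://spam.example</div>'
    print(has_hidden_markup(entry))
```

Plain prose that merely mentions a CSS property is not flagged, because the pattern only matches inside an HTML tag.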
This description serves as a primer on wiki security. There are many other best practice methods that are important to consider:
- .htaccess rules to limit robots that do not respect robots.txt
- Captchas (http://recaptcha.net/plugins/mediawiki/) that use images to require human entry before posting or user registration. (“Type the word you see here”)
- The MediaWiki SpamBlacklist plug-in (http://www.mediawiki.org/wiki/Extension:SpamBlacklist)
- Responsible content editing: A large wiki needs to have a pre-planned method of keeping a human eye on the content that is posted within it.
Cross-Site Scripting: XSS
One of the threats to users of a wiki is a link placed into an entry under the guise of a legitimate contribution that leads to malicious code hosted at the destination. A blackhat user could post what seems to be a legitimate comment and list the link as a reference for their discussion.
When considering these attacks two things are important to remember:
- There is limited harm in stealing the wiki session. A properly implemented wiki is isolated from the rest of the business infrastructure, and getting someone's wiki session only allows the grand total of posting in their name. There is no further damage to be done, and no ability for that session to reach further into the business IT infrastructure. The worst case scenario is the theft of an admin session, which would allow the changing of user permissions. A MediaWiki installation contains full archives that cannot be erased, and any changes made can be unmade.
- This is a completely non-unique threat. Any user that is able to access the wiki has an active internet connection and faces this danger any time they access web pages.
A properly implemented wiki will contain exit disclaimers and attempt to educate its users about this threat. In addition, routine maintenance of the content by editors and information gardeners limits this threat.
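Exit disclaimers depend on knowing which links leave the wiki. A minimal Python sketch of that check follows, assuming a hypothetical wiki host name (wiki.example.org) and a deliberately simple URL pattern; editors could use a list like this to decide where a disclaimer applies or which links deserve a second look.

```python
import re
from typing import List
from urllib.parse import urlparse

WIKI_HOST = "wiki.example.org"  # hypothetical host name for this sketch

# Simplified pattern: an http(s) URL up to whitespace, brackets, or quotes.
URL_PATTERN = re.compile(r"https?://[^\s\]\[<>\"]+")


def is_external(url: str, wiki_host: str = WIKI_HOST) -> bool:
    """Return True when url points to a host other than the wiki itself."""
    host = urlparse(url).hostname
    # Relative links have no hostname and therefore stay on the wiki.
    return host is not None and host.lower() != wiki_host.lower()


def external_links(wikitext: str, wiki_host: str = WIKI_HOST) -> List[str]:
    """List every URL in wikitext that leads away from the wiki."""
    return [u for u in URL_PATTERN.findall(wikitext) if is_external(u, wiki_host)]
```

This only identifies where a link goes; it says nothing about whether the destination is malicious, which is why human review of posted content remains part of the defense.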
SQL Injections: Hands off my database.
In order to prevent those more interested in the administration than the technology of the wiki from falling asleep we will put the cart before the horse in this section and give the conclusions of the threat before discussing the specific nature of it. Feel free to skip to the next section whenever your eyes start to glaze over.
The conclusion is that the topic is worth discussing and being aware of. A wiki administrator has the comfort of knowing that the defense relies on PHP itself and on a set of core functions built to filter out attempts to access the database in unintended ways. Millions of users constantly exercise these filters, and MediaWiki takes advantage of them and presents no unique threat.
The Department of Homeland Security’s National Vulnerability Database (http://nvd.nist.gov/nvd.cfm) lists no known SQL injection threats for Mediawiki version 1.11. The attacks are handled by the proper implementation of the PHP filtering code within Mediawiki.
Now for the basic tech behind the threat:
In its most basic sense Mediawiki is a set of server side scripts that interact with a database to store and retrieve information. At some point in the scripts, the PHP has a set of standard calls to the database in the language of SQL to get or write information. SQL injections seek to take advantage of the syntax of those calls in order to do unintended things.
A completely arbitrary example to illustrate the point:
A web page might need to access a database in order to log someone in. The specific syntax that the web page uses to access the database might not be known, but with enough playing around a way of taking advantage of the way SQL works could be found.
If someone supplied the login name of “admin” and the password of “kitty” the SQL query to access the database would be:
SELECT userId FROM users WHERE username = 'admin' AND password = 'kitty'
By injecting improper inputs, the SQL statement can be changed to do something detrimental. For example, if a user knew that “#” begins a comment in the database's SQL dialect (as it does in MySQL) and supplied user = “bob' #” and password = “whatever”, the web page would construct a SQL statement that looks like:
SELECT userId FROM users WHERE username = 'bob' #' AND password = 'whatever'
Because that “#” begins a comment, the database would evaluate the statement to actually read:
SELECT userId FROM users WHERE username = 'bob'
Therefore the database validates the user as bob without ever looking up a password. That in its most basic nature is a SQL injection.
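The attack can be reproduced safely against a throwaway database. The sketch below uses Python with an in-memory SQLite database rather than PHP and MySQL, so the comment marker is "--" instead of "#". The table layout, the sample account, and the function names are all hypothetical, and the plain-text password is there only to keep the illustration short.

```python
import sqlite3
from typing import Optional


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one hypothetical user."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (userId INTEGER PRIMARY KEY, username TEXT, password TEXT)"
    )
    conn.execute("INSERT INTO users (username, password) VALUES ('bob', 's3cret')")
    return conn


def unsafe_login(conn: sqlite3.Connection, username: str, password: str) -> Optional[int]:
    """VULNERABLE: builds the SQL statement by pasting user input into it."""
    query = (
        "SELECT userId FROM users WHERE username = '" + username
        + "' AND password = '" + password + "'"
    )
    row = conn.execute(query).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    conn = make_demo_db()
    # A wrong password is rejected...
    print(unsafe_login(conn, "bob", "wrong"))
    # ...but the injected comment marker removes the password check entirely.
    print(unsafe_login(conn, "bob' --", "whatever"))
```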
The defense against the attack relies on pre-processing the values supplied to SQL statements. PHP provides a solid core of ready-made functions that escape or reject malformed input whenever a script forms a SQL statement. This works in much the same way as the regular expression spam filters we already discussed. PHP that is built without proper attention to these safeguards produces websites vulnerable to attack.
The Mediawiki software follows all best practices when it comes to safeguarding its database calls.
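One common form of this defense is the parameterized query, where user input is handed to the database separately from the SQL text and so can never change the statement's structure. The sketch below shows the general technique in Python with an in-memory SQLite database. It is not MediaWiki's own code, and the table, the sample account, and the function names are hypothetical.

```python
import sqlite3
from typing import Optional


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one hypothetical user."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (userId INTEGER PRIMARY KEY, username TEXT, password TEXT)"
    )
    conn.execute("INSERT INTO users (username, password) VALUES ('bob', 's3cret')")
    return conn


def safe_login(conn: sqlite3.Connection, username: str, password: str) -> Optional[int]:
    """Pass user input as bound parameters so it is treated purely as data."""
    row = conn.execute(
        "SELECT userId FROM users WHERE username = ? AND password = ?",
        (username, password),
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    conn = make_demo_db()
    # The injection attempt is now just an odd user name that matches no row.
    print(safe_login(conn, "bob' --", "whatever"))
```

Because the quote and comment characters in the input are never interpreted as SQL, the login succeeds only when both the user name and the password match.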
Plug-ins: Time to fight the battle all over again
We will not cover a specific set of plug-ins here but rather talk about a general attitude toward adopting plug-ins.
A Web 2.0 application is a living, breathing web site. One of the biggest advantages of a Web 2.0 application is its extensibility through plug-ins. These plug-ins are not simply small tweaks that provide flavor, but rather fundamental methods of improving the application and allowing better user interaction.
The security threat that exists is not unique to a plug-in. That is to say that each plug-in could face the same challenges that we have just discussed. Each time a plug-in is adopted it needs to be examined for the same security holes that the wiki itself has.
Most plug-ins are small and will be of little risk. However, it is beneficial to ask the same questions that we asked above each time we examine one:
- Wiki Setup: Does the plug-in change the way the wiki works and make people too dependent on it? Does this violate best practice methods of storing data?
- SQL Injections: Does this plug-in touch the database and if so is it coded to apply the proper PHP filters?
- Spam: Can users leave content in this plug-in and if so does it apply our spam filters?
And so on and so forth.
Plug-ins are great, and any wiki that is hesitant to try and discard lots of them before finding the winners is handicapped. The administrators of a wiki simply need to be knowledgeable enough to make these determinations and responsible enough to ask the questions each time a plug-in is used.