Attacker Target Selection: CMS Plugins
A popular way to launch a website, among both the tech-savvy and the less sophisticated, is to use a Content Management System (CMS). CMSes such as WordPress and Joomla allow webpages to be launched with a few clicks, making them very powerful tools. Because webpage administration is so simple and automated, webmasters of CMS-managed pages may be less aware of the dangers to which they expose themselves and their visitors when installing third-party plugins. Doctoral student Marie Vasek collected data to test whether the presence of a CMS influenced the likelihood of compromise. Building on this, I worked to shed light on how the presence of CMS plugins increased or decreased the risk of compromise. This chapter therefore describes a method for automatically detecting plugins and, when possible, their versions in WordPress and Joomla installations.
Data Collection
Collecting Case and Control Data
To draw meaningful conclusions about the relationship between CMS/plugin data and compromise, both a set of compromised websites and a control set of websites were needed. Under the case-control model, websites that are “infected” are compared against websites that are as similar to the infected ones as possible without themselves being infected.
Figure 1: We join the webserver and compromise datasets to compare risk factors with outcomes.
A control set was generated by taking a random sample from the .com zone file, obtained from Verisign. The .com zone file contains all domains registered under the .com top-level domain, making it a suitable representative population of websites from which to sample. In all, 210 496 domains were sampled to generate the control set. The set of infected, or compromised, websites was drawn from two different sources.
The first set of compromised websites consisted of sites seen hosting phishing attacks. Phishing is the practice of one website impersonating another in order to dupe visitors into handing over information. A popular example is mimicking a bank’s website to trick users into entering their login information. The URLs seen propagating phishing attacks were gathered from several sources: two firms which take down phishing pages for banks, a large brand owner, PhishTank, and the Anti-Phishing Working Group. In total, 97 788 distinct URLs from 29 682 domains impersonating 1 098 different brands were observed, all of which were reported as phishing between November 20, 2012 and January 7, 2013.
The second set of compromised websites consisted of sites seen to be involved in search-redirection attacks, and came from the authors of. These websites were set up by non-criminals but hacked to redirect traffic to illicit pharmacies. They came from web search results for 218 pharmaceutical-related search terms collected between October 20, 2011 and December 27, 2012, and span 58 516 URLs.
Extracting CMS Data
To collect CMS data for the webpages in the control and compromise datasets, the HTML of the top-level webpage of every domain seen was requested. To determine the CMS used, if any, Vasek first scanned the HTML for a generator tag. The generator tag is an HTML meta element that specifies information about how the document was generated, such as the text editor used, the CMS used, and sometimes the CMS version. For example, a website running WordPress version 3.2.1 might contain the tag <meta name="generator" content="WordPress 3.2.1">. Regular expressions were used to pull CMS information from both the generator tag and common paths used in the webpage’s body.
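The generator-tag scan can be sketched in Python as follows. This is a minimal illustration, not the regular expressions actually used in the study: the pattern names, the function name detect_cms, and the assumption that the name attribute precedes content are all hypothetical.

```python
import re

# Hypothetical pattern: matches <meta name="generator" content="..."> with
# either quote style. It assumes the name attribute comes before content.
GENERATOR_RE = re.compile(
    r'<meta\s+name=["\']generator["\']\s+content=["\']([^"\']+)["\']',
    re.IGNORECASE,
)

# Hypothetical pattern: splits a generator string such as "WordPress 3.2.1"
# into a CMS name and an optional dotted version number.
CMS_RE = re.compile(
    r'^(WordPress|Joomla)!?(?:\s+(\d+(?:\.\d+)*))?',
    re.IGNORECASE,
)


def detect_cms(html):
    """Return (cms_name, version_or_None), or None if no known CMS is named."""
    tag = GENERATOR_RE.search(html)
    if tag is None:
        return None
    match = CMS_RE.match(tag.group(1).strip())
    if match is None:
        return None
    return match.group(1), match.group(2)
```

Pages that omit the generator tag return None here, which is why the text also checks common paths in the page body.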
Extracting Plugin Data
I built upon Vasek’s data collection by identifying the presence of WordPress plugins and Joomla extensions. We scanned each website’s stored HTML files for paths beginning with /wp-content/plugins/. The directory that follows indicates the corresponding plugin; e.g., a website using the WP eCommerce plugin has the /wp-content/plugins/wp-e-commerce/ path. We detected Joomla extensions in a similar manner. Extensions comprise components, modules, plugins, templates, and languages. We used regular expressions to identify each extension, such as /components/com_\w*/ for finding components.
We also tried to find versioning information for WordPress plugins, focusing on the 50 most popular plugins in the control dataset. As there is no standard way to convey version information in plugins, we relied on manual inspection and successfully identified version information for 19 of the top 50. Some WordPress plugins broadcast their version in a parameter handed to their scripts. For example, a website running version 6.1 of Google Analyticator would contain wp-content/plugins/google-analyticator/external-tracking.min.js?ver=6.1. The plugin version here is specified by a “ver” parameter handed to a JavaScript file, but a plugin will often have several such references, calling both JavaScript and CSS files. When the versions of the JavaScript and CSS files disagree, the JavaScript file’s version has priority, as CSS files seem to be versioned independently of the plugin itself more often.
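The path-based detection and the JavaScript-over-CSS rule can be sketched in Python as below. The patterns and function names are hypothetical stand-ins for the expressions actually used, and the sketch assumes plugin assets are referenced with a leading slash before wp-content.

```python
import re
from collections import defaultdict

# The directory name following /wp-content/plugins/ identifies the plugin.
WP_PLUGIN_RE = re.compile(r'/wp-content/plugins/([A-Za-z0-9_-]+)/')

# Joomla components are assumed to live under /components/com_<name>/.
JOOMLA_COMPONENT_RE = re.compile(r'/components/(com_\w+)/')

# A plugin asset (.js or .css) carrying a ?ver= query parameter.
WP_VERSIONED_ASSET_RE = re.compile(
    r'/wp-content/plugins/([A-Za-z0-9_-]+)/'
    r'[^"\'\s?]*\.(js|css)\?ver=([0-9][0-9A-Za-z.]*)'
)


def find_wp_plugins(html):
    """Return the set of WordPress plugin directory names seen in the HTML."""
    return set(WP_PLUGIN_RE.findall(html))


def find_joomla_components(html):
    """Return the set of Joomla component names (com_*) seen in the HTML."""
    return set(JOOMLA_COMPONENT_RE.findall(html))


def find_wp_plugin_versions(html):
    """Map plugin name -> version, preferring JavaScript over CSS references."""
    seen = defaultdict(dict)
    for slug, ext, version in WP_VERSIONED_ASSET_RE.findall(html):
        # Keep the first version seen for each file type.
        seen[slug].setdefault(ext, version)
    return {slug: exts.get('js', exts.get('css')) for slug, exts in seen.items()}
```

Versions extracted this way are only candidates; a reference may carry the version of a bundled library or of WordPress itself rather than of the plugin.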
Because version information pulled from script parameters can be unreliable, a list of legitimate versions is needed to check against for any given plugin. Obtaining a list of potential versions for a given WordPress plugin is relatively easy, as plugins typically have an information page at https://wordpress.org/(PLUGINNAME)/. Figure 2 shows an example of this. Unfortunately, this list may not be exhaustive, so weeding out incorrect versions is still a manual process.
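Checking an extracted version against a list of known versions might look like the Python sketch below. It rests on two assumptions not stated in the text: version strings are purely numeric and dot-separated, and “up-to-date” means equal to the highest known version. The function names are hypothetical.

```python
def parse_version(text):
    """Turn a dotted numeric version such as '6.0.2' into a comparable tuple.

    Assumption: every part is an integer; anything else raises ValueError.
    """
    return tuple(int(part) for part in text.split('.'))


def classify_version(candidate, known_versions):
    """Return 'unknown', 'up-to-date', or 'out-of-date' for a candidate version.

    A candidate missing from known_versions is flagged 'unknown' so that it
    can be reviewed manually, since the known list may not be exhaustive.
    """
    if candidate not in known_versions:
        return 'unknown'
    latest = max(known_versions, key=parse_version)
    if parse_version(candidate) == parse_version(latest):
        return 'up-to-date'
    return 'out-of-date'
```

Comparing parsed tuples rather than raw strings keeps 6.10 ordered after 6.9, which plain string comparison would get wrong.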
Identifying Risk Factors For Compromise
Vasek and Moore found that several CMSes did increase the odds of a website being compromised. Websites generated by WordPress were found to be 4.44 times more likely to be in the phishing dataset than websites created without a CMS. Similarly, WordPress websites were 17 times more likely to be in the search-redirection dataset than websites created without a CMS. Websites created with Joomla also had statistically significantly higher odds of being compromised than websites created without a CMS.
When analyzing the presence of CMS plugins, we focused on the 50 most popular WordPress plugins within the control set’s WordPress population and, similarly, on the 50 most popular Joomla extensions. WordPress servers running a top-50 plugin were found to have 21.9% greater odds of compromise, and Joomla servers running a top-50 extension 54.3% greater odds. Running popular add-on software, regardless of what it is, is a positive risk factor for compromise.
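In a case-control design, an odds ratio compares the odds of exposure (for example, running a given plugin) among compromised sites with the odds among control sites. The Python sketch below computes it for a single 2x2 table. The function name and the Wald confidence interval are illustrative assumptions, not the analysis behind the reported figures.

```python
import math


def odds_ratio(exposed_cases, unexposed_cases,
               exposed_controls, unexposed_controls, z=1.96):
    """Return (odds ratio, CI low, CI high) for a 2x2 case-control table.

    The interval is a Wald interval on the log odds ratio; z=1.96 gives an
    approximate 95% interval. All four cell counts must be positive.
    """
    a, b, c, d = (exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls)
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    ratio = (a * d) / (b * c)
    std_err = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(ratio) - z * std_err)
    high = math.exp(math.log(ratio) + z * std_err)
    return ratio, low, high
```

With made-up counts, odds_ratio(40, 60, 20, 80) gives a ratio of 40*80 / (60*20), about 2.67. On this scale, “21.9% greater odds” corresponds to an odds ratio of 1.219, and a ratio is conventionally called statistically significant when its interval excludes 1.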
Table 1 shows the statistically significant odds ratios comparing websites running the given CMS with the plugin against websites running the given CMS without it. Of the 50 most popular WordPress plugins, 15 were seen to be positive risk factors for compromise. MM Form Community was the worst offender: WordPress websites with the MM Form Community plugin were 26 times more likely to be compromised than WordPress websites without it. Four of the WordPress plugins were seen to be negative risk factors for compromise. WordPress pages with TimThumb, an image-resizing script which caused widespread compromise in August 2011, were seen to be less likely to be compromised than WordPress pages without TimThumb. Of the 50 most popular Joomla extensions, 17 were seen to be positive risk factors for compromise, and two were seen to be negative risk factors. Figure 3 plots the odds ratios for compromise based on the number of top-50 plugins present on a given page, with statistically significant odds in red.
WordPress and Joomla pages both showed an increase in the odds of compromise as the number of plugins increased. WordPress websites running two of the top 50 WordPress plugins were 1.6 times more likely to be compromised than WordPress websites running none of them. The odds grew with the number of plugins, and websites running 10 or more of the top 50 plugins were twice as likely to be compromised as WordPress websites with none of them. Similarly, Joomla webpages with three of the top 50 extensions were 1.86 times more likely to be compromised than Joomla pages with none of the top 50 extensions. The odds of compromise for Joomla pages increase drastically with every additional extension.
The rates of compromise for the top-50 WordPress plugins whose versions could be reliably collected are presented in Table 2, which compares compromise for up-to-date and out-of-date versions. For 14 of the 19 versioned plugins, rates of compromise were higher for up-to-date versions than for out-of-date ones. This trend appears to be the result of more than chance, because the statistically significant odds ratios all favored compromise in the up-to-date plugins.
Conclusions
WordPress and Joomla plugins were collected from a set of compromised websites as well as a set of control websites. The presence of plugins was found to increase the odds of compromise, as expected. Additionally, the more plugins were present on a page, the more likely it was to be compromised. Finally, websites running up-to-date plugins were more likely to be compromised than those running out-of-date versions. This seems counter-intuitive, as plugin updates are often performed to patch vulnerabilities, but it likely reflects the fact that the most up-to-date versions of plugins have a wider user base, making them bigger targets for attackers. This is consistent with the findings of Vasek and Moore that more up-to-date WordPress software is hacked more often than outdated installations.