KDnuggets News 04:21, item 3, Features

Features

From: Aleks Jakulin
Date: 27 Oct 2004
Subject: Data Mining of Political Data

We have taken the US Senate roll call voting data, which disclose how each senator voted on a particular issue. There is a lot of it: 100 senators and almost 500 issues per year. Several organizations track how "friendly" individual senators are to them, but for an ordinary voter, going through the data is too much of a hassle. Political scientists, however, regularly analyze these datasets with special-purpose models. Our objective was to check if the "usual" algorithms

It turns out that the data mining tools are quite complementary to those already used in political science. We can do lots of things:

cluster senators and identify "outliers"
identify "social" networks based on how similarly senators vote
evaluate how influential a senator is, or a whole state, or a particular group of senators; here we use information-theoretic ideas
represent senators in a 2D "ideology" space
use VRML for 3D visualizations
perform what-if analysis: who could have overturned a particular bill, or when was there disagreement inside the party
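
To make a few of the items above concrete, here is a minimal sketch in plain Python. It is not the code from the project: the toy vote table, the senator labels, the 0.75 threshold, the simple-majority rule, and all function names are assumptions made for illustration. It builds a voting-similarity network from pairwise agreement and uses mutual information between a senator's vote and the outcome as one possible information-theoretic measure of influence.

```python
from itertools import combinations
from math import log2

# Hypothetical toy roll-call data: 1 = yea, 0 = nay, None = not voting.
# Senator labels and votes are invented for illustration only.
votes = {
    "A": [1, 1, 0, 1, 0, 1],
    "B": [1, 1, 0, 1, 1, 1],
    "C": [0, 0, 1, 0, 1, 0],
    "D": [0, 1, 1, 0, 1, None],
}

def agreement(u, v):
    """Fraction of roll calls, among those where both voted, with the same vote."""
    pairs = [(a, b) for a, b in zip(u, v) if a is not None and b is not None]
    if not pairs:
        return 0.0
    return sum(a == b for a, b in pairs) / len(pairs)

def similarity_edges(votes, threshold=0.75):
    """Edges of a similarity network: pairs whose agreement meets the threshold."""
    edges = []
    for s, t in combinations(sorted(votes), 2):
        score = agreement(votes[s], votes[t])
        if score >= threshold:
            edges.append((s, t, score))
    return edges

def outcomes(votes):
    """Outcome per roll call, assuming a simple majority of votes cast (1 = passed)."""
    n_issues = len(next(iter(votes.values())))
    result = []
    for i in range(n_issues):
        cast = [v[i] for v in votes.values() if v[i] is not None]
        yeas = sum(cast)
        result.append(1 if yeas > len(cast) - yeas else 0)
    return result

def mutual_information(x, y):
    """Empirical mutual information, in bits, between two discrete sequences."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n == 0:
        return 0.0
    joint, px, py = {}, {}, {}
    for a, b in pairs:
        joint[(a, b)] = joint.get((a, b), 0) + 1
        px[a] = px.get(a, 0) + 1
        py[b] = py.get(b, 0) + 1
    return sum((c / n) * log2(c * n / (px[a] * py[b]))
               for (a, b), c in joint.items())

if __name__ == "__main__":
    for s, t, score in similarity_edges(votes):
        print(f"{s} -- {t}: agreement {score:.2f}")
    passed = outcomes(votes)
    for s in sorted(votes):
        print(f"{s}: {mutual_information(votes[s], passed):.3f} bits shared with outcome")
```

With real data, each list would hold several hundred roll calls per year, and the threshold would need tuning so that the network is neither empty nor fully connected.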

If you're interested, check it out: http://www.ailab.si/aleks/Politics/

We have used the general-purpose Orange data mining toolkit (http://www.ailab.si/orange/), which is Python-based and released under the GPL. Furthermore, to identify the blocs in the Senate, we have used the MPCA discrete probabilistic principal components modeling kit (http://cosco.hiit.fi/search/MPCA), also under the GPL. Our scripts and data are all freely available. We also have two working papers there that discuss everything in more detail.

mag. Aleks Jakulin
http://www.ailab.si/aleks/
Artificial Intelligence Laboratory,
Faculty of Computer and Information Science,
University of Ljubljana, Slovenia.
