Bitcoin tools and datasets

Bitcoin, a secure and anonymous internet currency, has recently experienced a bubble in value and attention. Here is a very useful (and free) set of data extraction scripts and datasets for analysts interested in Bitcoin.



From: Ivan Brugere
Date: Apr 8, 2013

Following the recent financial events in Cyprus, interest has again grown in the Bitcoin digital currency (bitcoin.org/).

Bitcoin allows users to securely and anonymously retain and transfer value in a decentralized P2P network, and is one of the largest open microeconomic transaction network datasets available. Recently, Bitcoin has experienced a bubble, causing its total value to quadruple (from 500M USD to 2B USD) over the past month (blockchain.info/charts/market-cap). New data may provide insights into the dynamics of these events.

I have developed a set of data extraction scripts (to my knowledge, the only ones freely available for this task), datasets (in human-readable flat file format), and a relational data model, all of which could be of great use to those in the data mining, network science, and machine learning communities interested in Bitcoin. I wanted to direct attention to these so that others' curiosity might be sparked while the buzz is high.

There has been little study of Bitcoin thus far. One recent publication looks at deanonymization and presents some standard network measures (ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6113303). However, the possibilities are largely untapped.

I have extracted a dataset from inception to April 7, 2012, containing 6.2M nodes and 37M directed, weighted, time-stamped edges. This new dataset and the extraction code are available at:

compbio.cs.uic.edu/data/bitcoin/
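
For readers who want a starting point, here is a minimal sketch of how a directed, weighted, time-stamped edge list of this kind might be streamed and summarized in Python. The column layout (source, target, timestamp, value, separated by tabs) and the names Edge, read_edges, and weighted_degrees are assumptions made for illustration only; check the files at the address above for the actual format before adapting it.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, Iterable, Iterator, NamedTuple, TextIO, Tuple


class Edge(NamedTuple):
    """One transfer of value between two nodes (hypothetical layout)."""
    source: str
    target: str
    timestamp: int   # assumed to be seconds since the Unix epoch
    value: float     # assumed to be the amount transferred


def read_edges(handle: TextIO) -> Iterator[Edge]:
    """Yield one Edge per non-empty tab-separated row, without loading the file."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue
        source, target, timestamp, value = row[:4]
        yield Edge(source, target, int(timestamp), float(value))


def weighted_degrees(
    edges: Iterable[Edge],
) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Return (total value sent per node, total value received per node)."""
    sent: Dict[str, float] = defaultdict(float)
    received: Dict[str, float] = defaultdict(float)
    for edge in edges:
        sent[edge.source] += edge.value
        received[edge.target] += edge.value
    return dict(sent), dict(received)


# Tiny made-up sample standing in for a real flat file.
SAMPLE = (
    "a\tb\t1365292800\t1.5\n"
    "b\tc\t1365296400\t0.5\n"
    "a\tc\t1365300000\t2.0\n"
)

sent, received = weighted_degrees(read_edges(io.StringIO(SAMPLE)))
print(sent)
print(received)
```

Because the edges are read one row at a time, the same functions should work on a file opened with open(path, newline="") without holding all 37M edges in memory; only the per-node totals are kept.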

If you have any questions, do not hesitate to contact me.

Ivan Brugere
ibruge2@uic.edu
University of Illinois at Chicago