KDnuggets : News : 2009 : n18 : item10 < PREVIOUS | NEXT >

Features


Subject: Web Mining on P2TagTeam Data

About P2RIC: Pollution Prevention Regional Information Center (P2RIC) strives to improve resource sharing between the programs, businesses, and agencies of EPA Region 7 (Iowa, Kansas, Missouri, Nebraska) that provide waste reduction services and expertise to business and industry. The Nebraska Business Development Center (NBDC) at the University of Nebraska at Omaha (UNO) operates the Pollution Prevention Regional Information Center. P2RIC works with partner programs to collect, share, and update pollution prevention information. P2RIC collaborates with the other eight regional information centers of the National Pollution Prevention Resource Exchange (P2Rx).

Project Background: In May 2009, P2RIC launched a social bookmarking campaign using the website www.delicious.com. We encouraged P2 professionals to sign up on www.delicious.com and use the tag P2TagTeam for relevant P2 URLs. The result was a virtual community of P2 professionals tagging URLs with P2TagTeam so that others, both inside and outside the community, could find this information by searching for the P2TagTeam tag. Users can also apply other tags alongside P2TagTeam. For example, an article about green energy might be tagged with "P2TagTeam", "Green", "Energy", "Sustainability", and "Economics".

Project Description: The next step in the project is to offer a functional specification that would allow P2RIC and an IT-web development specialist to develop a scope of work and expected project timeline. The following items are functions P2RIC would want to be able to perform as outcomes of a web-mining project:

A: Database Capture: Identify all URLs on the delicious.com site that are associated with the common tag, P2TagTeam. This will represent the complete body of references tagged by the community of practice (CoP).
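
As a rough illustration of item A, the capture step can be thought of as filtering bookmark records by the common tag. The sketch below assumes the bookmarks have already been exported into simple (url, user, tags) records. The `Bookmark` type, the `make_bookmark` and `capture_urls` helpers, and the sample data are all hypothetical, and no delicious.com API call is shown. Case-insensitive tag matching is also an assumption.

```python
from dataclasses import dataclass

COMMON_TAG = "p2tagteam"  # compared case-insensitively (an assumption)


@dataclass(frozen=True)
class Bookmark:
    """One user's bookmark of one URL (hypothetical record layout)."""
    url: str
    user: str
    tags: frozenset


def make_bookmark(url, user, tags):
    """Lower-case the tags so 'P2TagTeam' and 'p2tagteam' match."""
    return Bookmark(url, user, frozenset(t.lower() for t in tags))


def capture_urls(bookmarks, tag=COMMON_TAG):
    """Return the sorted URLs that carry `tag` in at least one bookmark."""
    return sorted({b.url for b in bookmarks if tag in b.tags})


# Made-up sample data, for illustration only.
SAMPLE = [
    make_bookmark("http://example.org/green-energy", "alice",
                  ["P2TagTeam", "Green", "Energy"]),
    make_bookmark("http://example.org/green-energy", "bob",
                  ["Renewables", "Energy"]),
    make_bookmark("http://example.org/solvents", "carol",
                  ["P2TagTeam", "Solvents"]),
    make_bookmark("http://example.org/unrelated", "dave", ["Cooking"]),
]

if __name__ == "__main__":
    for url in capture_urls(SAMPLE):
        print(url)
```

Keeping every bookmark of a captured URL, including those without the common tag, is what makes the comparisons in item B possible later.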

B: Mining on Individual URLs:

  • Identify all tags associated with any one URL and the common tag, P2TagTeam. This will allow us to search tags to help identify synonyms and build a thesaurus for the CoP taxonomy. This will also allow us to provide context, deeper meaning, relationships, and semantic web opportunities as the range of tags used broadens for any one URL.
  • Identify users associated with any one URL and the common tag, P2TagTeam. This will allow us to define subgroups of interest within the CoP.
  • Identify users associated with any one URL who have not used the common tag, P2TagTeam. This will allow us to find users of shared interest who are not currently identified as part of the CoP.
  • Identify tags associated with any one URL that do not use the common tag, P2TagTeam. This will help identify synonyms and context that may not be used within the CoP.

C: Mining on Data Relationships:

  • Sort and present any set of P2TagTeam-tagged URLs so that one can see the most referenced URLs for any given tag, or set of tags, chosen by a user (tag A + tag B + ... + tag n).
  • See all URLs associated with specific users and tags (tag A + tag B and user X + user Y).
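
The queries in items B and C could be sketched roughly as follows, again over already-captured (url, user, tags) records. All names and sample records here are hypothetical. The sketch assumes that a set of chosen tags means "all of these tags" and a set of chosen users means "any of these users", which is only one possible reading of the notation above.

```python
from collections import Counter

COMMON_TAG = "p2tagteam"

# Made-up (url, user, tags) records; tags are assumed already lower-cased.
BOOKMARKS = [
    ("http://example.org/green-energy", "alice", {"p2tagteam", "green", "energy"}),
    ("http://example.org/green-energy", "bob", {"renewables", "energy"}),
    ("http://example.org/green-energy", "carol", {"p2tagteam", "energy", "economics"}),
    ("http://example.org/solvents", "carol", {"p2tagteam", "solvents"}),
    ("http://example.org/solvents", "alice", {"p2tagteam", "solvents", "green"}),
]


def _for_url(bookmarks, url, in_cop):
    """Bookmarks of `url` that do (or do not) carry the common tag."""
    return [b for b in bookmarks
            if b[0] == url and (COMMON_TAG in b[2]) == in_cop]


def tags_for_url(bookmarks, url, in_cop=True):
    """Item B: tags used on `url`, inside or outside the CoP."""
    tags = set()
    for _, _, bookmark_tags in _for_url(bookmarks, url, in_cop):
        tags |= bookmark_tags
    return tags - {COMMON_TAG}


def users_for_url(bookmarks, url, in_cop=True):
    """Item B: users who bookmarked `url`, inside or outside the CoP."""
    return {user for _, user, _ in _for_url(bookmarks, url, in_cop)}


def rank_urls(bookmarks, tags, users=None):
    """Item C: count P2TagTeam bookmarks per URL that carry all of `tags`,
    optionally limited to `users`; most referenced URL first."""
    wanted = set(tags) | {COMMON_TAG}
    counts = Counter(
        url for url, user, bookmark_tags in bookmarks
        if wanted <= bookmark_tags and (users is None or user in users)
    )
    return counts.most_common()


if __name__ == "__main__":
    url = "http://example.org/green-energy"
    print(sorted(tags_for_url(BOOKMARKS, url)))                # CoP tags
    print(sorted(tags_for_url(BOOKMARKS, url, in_cop=False)))  # outside tags
    print(sorted(users_for_url(BOOKMARKS, url, in_cop=False))) # outside users
    print(rank_urls(BOOKMARKS, ["energy"]))
```

Comparing the two tag sets for one URL (with and without the common tag) is one simple way to surface candidate synonyms for the CoP thesaurus; deciding which candidates are true synonyms would still need human review.
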
For more information, visit http://p2ric.org/ and http://www.p2rx.org/p2tagteam/


Copyright © 2009 KDnuggets.