Tuesday, 19 March 2013

Measuring the Impact of Wikipedia for organisations (Part 3)

Previous posts in this series:
As mentioned in a previous post in this series, I have downloaded all of the Wikipedia pages that link directly to the Natural History Museum website. While this is useful in attempting to measure the impact of the NHM and Wikipedia on each other, this post is more for fun at this stage (although the data was collected for an upcoming project).

An obvious thing to do with these downloaded pages is to scan them for links, then build a graph of the interconnections between them. The script I set to this task is taking a while, so I decided to see what I could summarise about a topic (a Wikipedia page) based on the articles that page links to. In all of these examples the numbers are the number of links from the 'subject' page to the other page.
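The counting step described above could be sketched roughly as follows. This is not the script used for this post, just a minimal illustration using Python's standard library. The names `LinkCounter`, `count_links` and `format_summary` are made up for the example, and it assumes each downloaded page is available as an HTML string in which internal links look like `/wiki/Title`.

```python
from collections import Counter
from html.parser import HTMLParser


class LinkCounter(HTMLParser):
    """Tally the href target of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return  # anchors without a target are ignored
        # Assumption: internal links look like /wiki/Title, so keep only the title.
        if href.startswith("/wiki/"):
            href = href[len("/wiki/"):]
        # External links are kept as full URLs; nothing else is filtered.
        self.counts[href] += 1


def count_links(html):
    """Return a Counter mapping link target -> number of links in the page."""
    parser = LinkCounter()
    parser.feed(html)
    return parser.counts


def format_summary(counts, top=11):
    """Render the most frequent targets in the 'count | target' style used here."""
    return [f"{n} | {target}" for target, n in counts.most_common(top)]
```

Because nothing is filtered, navigation links that appear on every page (such as the site's main page) would show up in the counts alongside the article's own links.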

First up is the iconic Dippy (Diplodocus):

4 | Othniel_Charles_Marsh
3 | Carnegie_Museum_of_Natural_History
3 | Sauropod
3 | Walking_with_Dinosaurs
2 | Jurassic
2 | Diplodocidae
2 | Type_species
2 | John_Bell_Hatcher
2 | William_Jacob_Holland
2 | Diplodocid
2 | Fossil

Taken as a set, these seem to be a reasonable, high-level summary of Diplodocus. There is a mixture of information that is technical (type species, Diplodocid), cultural (Walking with Dinosaurs) and about the discovery, description and display of the fossil (Marsh, Hatcher, etc).

Let's go for another species, the Holly Blue:

3 | Lycaenidae
2 | Eurasia
2 | North_America
2 | India
2 | http://www.itis.gov/servlet/SingleRpt/SingleRpt?search_topic=TSN&search_value=188523
2 | Holly_Blue
2 | Main_Page
2 | Wikipedia:About
1 | Biological_classification
1 | Animal
1 | Arthropod

This time the information is more about the biogeography and higher taxonomy, but nevertheless can be seen as a reasonable, if subjectively limited, summary of the species.

Time for something different. First up, a member of NHM staff, Chris Stringer:

2 | Archaeology
2 | Biological_anthropology
2 | Social_anthropology
2 | Cultural_anthropology
2 | Feminist_anthropology
2 | Fellow_of_the_Royal_Society
2 | http://www.ahobproject.org/
2 | http://books.google.com.au/books?id=wTnWJGnBwgUC&printsec=frontcover&dq=Giacobini+Hominidae&hl=en&ei=jRvcS6rVJZLg7AO9_sC_Bg&sa=X&oi=book_result&ct=result&resnum=1&ved=0CDMQ6AEwAA#v=onepage&q&f=false
2 | http://books.google.com.au/books?id=Ke7_cl6tQ1EC&printsec=frontcover&dq=%22Chris+Stringer%22&hl=en&ei=JhDcS4WCF43u7APBsoiuBg&sa=X&oi=book_result&ct=result&resnum=5&ved=0CEUQ6AEwBA#v=onepage&q&f=false
2 | http://www.nhm.ac.uk/business-centre/publishing/det_humevol.html
2 | http://www.nhm.ac.uk/about-us/news/2008/march/stringer-wins-kistler-book-award.html

In short, a Fellow of the Royal Society who is an anthropologist and has written a number of books. In a purely professional sense: pretty much spot on.

So what does this kind of summary allow us to do? In a limited sense it allows us to make brief summaries of people, species and institutions that have a Wikipedia presence. But the real use comes when a large number of these analyses can be aggregated, queried and visualised. More on this another time; for now, here is a quick visualisation made by hacking the demos that come with arbor.js.

Full Screen Version
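As a rough idea of what aggregating these summaries might involve, the sketch below flattens per-page link counts into a weighted edge list that a graph library could then draw or query. It is an illustration under assumptions, not the code behind the visualisation: `build_edges` and `most_linked` are hypothetical names, and the input is assumed to be a dictionary mapping each subject page title to a `Counter` of its link targets.

```python
from collections import Counter


def build_edges(summaries):
    """Flatten {subject: Counter(target -> n)} into sorted (subject, target, n) edges."""
    edges = []
    for subject, counts in summaries.items():
        for target, n in counts.items():
            # Keep only links between pages in the downloaded set,
            # and drop links from a page to itself.
            if target in summaries and target != subject:
                edges.append((subject, target, n))
    return sorted(edges)


def most_linked(edges, top=5):
    """Rank targets by total inbound links across all subject pages."""
    inbound = Counter()
    for _subject, target, n in edges:
        inbound[target] += n
    return inbound.most_common(top)
```

Restricting edges to targets that are themselves in the downloaded set keeps the graph closed over the collected pages. Whether that is the right choice depends on what the graph is meant to show.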

Measuring the Impact of Wikipedia for organisations (Part 3) by Edward Baker is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
Based on a work at http://pblog.ebaker.me.uk/2013/03/measuring-impact-of-wikipedia-for_19.html.