Get Ready for Webometrics Repository Ranking Data Collection
Submitted on Thu, 2010-12-09 15:24
By Bram Luyten, @mire
Leuven, Belgium. Early in 2011, the CSIC Cybermetrics Lab (Spain) will harvest, analyze and publish data about the online visibility of institutional repositories (IRs) in its Top 800 Institutional Repository Ranking. The January ranking will reflect the visibility of repositories in different online search engines. The data is collected within a limited time window to get a snapshot. Here is the ranking production schedule:
Data collection: January 1-10
Analysis: January 10-24
Publication: week of the 25th of January
Why would you or your management care?
The aim of the Ranking is to support Open Access initiatives, and therefore free access to scientific publications in electronic form and to other academic material. Web indicators are used to measure the global visibility and impact of scientific repositories.
If increasing exposure for your digital research output is one of the objectives of your repository project, it follows that you should try to measure how well you are attaining this objective. Although internal metrics are the primary tools for tracking progress in this area, the ranking offers great opportunities to identify and learn from other successful repositories.
Zooming in on your own repository metrics and comparing them over time allows you to demonstrate progress toward specific targets. If online exposure is the target you want to measure, it can be very useful to look at how many pages of your repository are indexed in the most popular search engines. This is also a very important metric in the repository ranking methodology.
To do this, you can query the search engines yourself. For example, entering the "site:" operator followed by your repository's domain in Google will show you the number of repository pages indexed. By doing this every week or month, you can keep track of how the exposure of your repository grows. Handy tools, such as the SEO Quake browser add-on, can help you automate this process.
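The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration, assuming you note the indexed-page count by hand after each search; the function name growth_report and the counts below are made up for the example and do not come from any real repository or search engine API.

```python
from datetime import date


def growth_report(observations):
    """Sort (date, indexed_pages) pairs by date and add the change since the previous entry."""
    report = []
    previous_count = None
    for day, count in sorted(observations):
        # The first observation has nothing to compare against.
        change = None if previous_count is None else count - previous_count
        report.append((day, count, change))
        previous_count = count
    return report


# Made-up illustrative counts, as if read off the search results page by hand.
observations = [
    (date(2010, 11, 1), 12450),
    (date(2010, 10, 1), 11800),
    (date(2010, 12, 1), 13100),
]

for day, count, change in growth_report(observations):
    delta = "n/a" if change is None else f"{change:+d}"
    print(f"{day.isoformat()}  {count:>7}  {delta}")
```

Keeping the observations as dated pairs means the same list can later be charted or compared against the dates of repository changes.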
The repository ranking completely disregards repository usage (pageviews, downloads), simply because those data are not easily accessible to the people at the Cybermetrics Lab. However, tools like Google Analytics or the internal DSpace Solr statistics enable you to keep track of your repository usage.
Learning from the repository ranking
If internal metrics are the primary tools and already offer a wide range of options to track your repository's progress, why bother with the ranking? Although there are a few pitfalls, there is definitely an opportunity to learn from successful repositories.
When comparing your repository's ranking with its previous ranking, you could get strange results (e.g. very big jumps) because of changes in the ranking algorithm. This was the case when comparing the rankings from January and July 2010. However, for this edition, the Cybermetrics Lab has given assurances that the ranking algorithm will not be changed from the July 2010 edition.
Although it's clear from the ranking methodology that repositories with thousands of items generally score higher than very poorly populated ones, having the highest number of pages is no automatic ticket to the top. For example, although the University of Sao Paulo has only 26,166 items indexed, it's placed higher than Kyoto University's KURENAI repository with almost 100,000 items. The scores on four different indicators (size, visibility, rich files, scholar) show in which areas a repository can improve in order to raise its overall ranking.
Including your repository in the Webometrics Repository Ranking
If your repository hasn't been included in any of the previous rankings, this doesn't necessarily mean that it didn't perform well enough to make the top 800; it could also indicate that the Cybermetrics Lab is not aware of your repository. Send an email with your repository URL to email@example.com, well before the 1st of January, to ensure that your URL gets included in the data collection phase. Only repositories with an autonomous web domain or subdomain are included.
Although it will take some technical work to change your URL while ensuring proper redirects for your older URLs, it's definitely worth going through this trouble to get an autonomous web domain or subdomain.
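As a rough sketch of what "proper redirects" have to achieve, the mapping from old to new addresses can be written out in Python. The two base URLs and the function name redirect_target are hypothetical, chosen only for illustration; the actual redirects would normally be configured in your web server rather than in a script.

```python
OLD_BASE = "http://www.example.edu/repository"  # hypothetical old location (subdirectory)
NEW_BASE = "http://repository.example.edu"      # hypothetical new autonomous subdomain


def redirect_target(old_url, old_base=OLD_BASE, new_base=NEW_BASE):
    """Return the new URL for old_url, or None if it lies outside the old repository."""
    if old_url == old_base:
        return new_base + "/"
    prefix = old_base + "/"
    if old_url.startswith(prefix):
        # Keep everything after the old base so existing item links stay valid.
        return new_base + "/" + old_url[len(prefix):]
    return None


print(redirect_target("http://www.example.edu/repository/handle/123/456"))
print(redirect_target("http://www.example.edu/about"))
```

The point of the mapping is that every old item address keeps a one-to-one counterpart on the new host, so links that are already indexed or cited continue to resolve when the server answers them with a permanent (301) redirect.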
Repositories consisting only of one or several electronic journals (journal portals), devoted to non-scientific papers, or focusing on archival material are excluded.
Apart from these basics, more best practices can be consulted here.