With the growth of social bookmarking tools a new approach for metadata creation called tagging has emerged. Tagging based systems enable users to add tags, insert their own explanations to web resources to categorize these resources. Examples of sites such sites include the social bookmarking site del.icio.us, the video sharing service YouTube, and the photograph sharing website flickr.
The increased use of tags and tagging has led to the emergence of a number of methods for presenting these tags to users. A number of websites use lists to present tags to users of their services. However, a growing number of websites are using tag clouds to present tags to users. Tag clouds are visual presentations of a set of words, typically a set of tags, in which attributes of the text such as size, weight or colour can be used to represent features (e.g., frequency) of the associated terms.
There are also a number of variations on tag clouds emerging (see e.g., Tag.alicio.us, Extisp.icio.us and Facetious). Shaw  proposes a tag cloud mapped like a graph, where tags are represented as visually distributed nodes, and similarity relationships as edges between nodes. Bielenberg and Zacher  present their tag cloud in a circular form, where font size and distance to the center represent the importance of a tag, but where distance between the tags does not represent their similarity. Dubinko et al  present a further approach that visualizes the evolution of tags over time.
Despite the increasing popularity of tag clouds in particular, there have been very few studies evaluating the effectiveness of tag clouds. Hasan-Montero and Herrero-Solana  discuss an algorithm for semantically clustering tag clouds and provide an example of its use, however they do not formally evaluate whether users find any improvement. Rivadeneira et al  provide an evaluation of tag clouds for impression formation. They noted that font size in tag clouds, and to a lesser extent layout can have an effect on impression formation amongst users of tag clouds. The following section provides details of the setup of the evaluation we was carried out on such properties.
The main goal of our experiment was to investigate the effect of some of the different properties that can be utilized in presenting tags (e.g. alphabetization, using larger fonts). Our experiment used a goal-orientated task as opposed to a free browsing task. Free browsing tasks can also be supported by the presentation styles that we tested and are also a more common usage for tags, but their simulation in controlled experiments is to difficult.
Participants had to complete a series of selection tasks. For each task users were presented on screen with the name of a country and were then asked to identify the country from a list of country names presented in one of 6 different formats. Each list contained 10 items randomly selected from a list of 60 country names. Immediately after the user selected the correct country name, a new task started and the experimental system requested the user to select a different country name. The participants each had 4 practice runs on the selection task, and were then presented with 24 selections to complete as part of the experiment. This task is similar to a task that was used to assess adaptation techniques .
As was stated earlier, the list of countries were presented to each user in 6 varying formats, these formats being a horizontal list, a horizontal list ordered alphabetically, a vertical list, a vertical list ordered alphabetically, a tag cloud and a tag cloud ordered alphabetically. Each user saw each presentation type. In all cases the order in which the country names are presented was random. In the case of the tag clouds 3 different font sizes were used, with the font size for each country being selected randomly. The tag clouds typically spanned 3 lines in the evaluation. The time to complete each selection task was recorded.
Sixty-two participants, mainly students of Computer Science, took part in the experiment: 11 females and 51 males. The average age of participants was 27 with a maximum of 44 and a minimum of 23.
All of the selection tasks were completed successfully. Due to an experimental error the links to countries selected earlier in the evaluation, were highlighted as previously visited. These selections involving these links were removed from the dataset for analyses. However, analyses with and without these items were not found to be appreciably different. 1231 “clean” selections remained.
Surprisingly, the tag cloud presentations took the longest to complete of all presentation types. Selections involving lists that were alphabetized were the quickest to be completed. Also when the list is alphabetized the difference in selection time between the two list presentation types and the cloud presentation type is reduced. Participant comments included “When the countries were ordered alphabetically it was much easier” and “usually, I move my mouse on the list depending on the State First position in alphabet, even before starting to read tag lists”.
|Presentation Type||Number of Occurrences||Average Time|
|Alphabetical Horizontal List||155||2.887|
|Alphabetical Vertical List||157||2.892|
In an attempt to discover why the time to complete the task using a tag cloud was greater than using a list we looked at the font sizes that were used in the tag clouds (see Table 2). Unsurprisingly, it was found that the larger the font size the more quickly the users completed the task. However, a number of participants had some problems locating the required country with the larger font, one user commented “I really noticed that a big tag was VERY hard to find and took me a long time”. This indicates that a more careful selection of font sizes could perhaps increase the speed of completing selections in tag clouds. The variance in the number of occurrences, was due to the excluded data, otherwise these would all have been equal.
|Number of Occurences||167||153||158|
Finally, the position of the tag in the list or cloud was investigated as a factor in task completion time. In the case of all layouts, if the country that the participants were required to find was in the upper left corner of the cloud or list then it was found most quickly. This effect is usually expected on stimuli that require westernized reading (left-to-right and top-to-bottom). However, in the case of tag clouds, countries that were located on bottom right of the bottom line of the cloud, or in the middle of middle lines were located most quickly on their respective lines. This indicates that participants appear to be scanning the lists rather than reading them. However, it will take further analysis to confirm this. A similar effect was found in alphabetized lists, if a country was at the start of the list it was found most quickly, countries at the end or in the middle of the list were found the next most quickly.
For this analysis an evaluation was carried out that required participants to find information in varying formats that can be used to present tags. Information about how the tags were presented and how easily/quickly participants found the information was logged. A number of interesting phenomena were found, some expected and some unexpected.
On the final point, this is an initial finding and more analysis is required, perhaps using an eye-tracking paradigm. A number of these findings have also been found for tag clouds for the task of impression formation . Indicating that these issues are relevant for a number of tasks that tag clouds can be used for, including searching, browsing, impression formation and recognition/matching.
Bielenberg, K. and Zacher, M. Groups in Social Software: Utilizing Tagging to Integrate Individual Contexts for Social Navigation. Masters Thesis submitted to the Program of Digital Media, Unisersitat Bremen. (2006)
 Dubinko, M., Kumar, R., Magnani, J., Novak, J., Raghavan, P., and Tomkins, A. Visualizing Tags over Time. In Proceedings of the WWW 2006 Conference. (May 2006)
 Hasan-Montero, Y. and Herrero-Solana, V. Improving Tag-Clouds as a Visual Information Retrieval Interfaces. In Proceedings of International Conference on Multidisciplinary Information Sciences and Technologies. (Oct 2006)
 Rivadeneira, A.W., Gruen, D.M., Muller, M.J., and Millen, D.R. Getting Our Head in the Clouds : Toward Evaluation Studies of Tagclouds. In Proceedings of CHI 2007. (To appear: April 2007)
 Shaw, B. Utilizing Folksonomy : Similarity Metadata from the Del.icio.us System. Project Proposal. (Dec 2005) Retrieved from http://www.metablake.com/webfolk/web-project.pdf
 Tsandilas, T. and Schraefel M.C. An Empirical Assessment of Adaption Techniques. In Late Breaking Results CHI 2005. (Apr 2005)