

In document Zeeker: A topic-based search engine (Pages 114-118)

94 Testing retrieval

12.1.1 Selected test methods

Two methods were mainly used to test the engine. First, the retrieval part was tested using a trial-and-error approach, where the primary goal was to find errors in the retrieval logic and programming code. Trial-and-error was also used to see how the engine handled various potentially problematic queries. These tests revealed some errors, which were fixed before the search engine was put on-line for others to try out.

The second method used was manual user feedback. A questionnaire was constructed and sent to numerous people, asking them to participate.

Before creating the questionnaire, the information and answers valuable to evaluating the search engine's performance had to be defined. Based on general search behavior with search engines such as Google, it was concluded that users mainly use search engines in two ways: either for question answering (who is, what is, etc.) or for research (what has been written about some topic). Therefore, the questionnaire should include questions indicating how well the search engine can be used for question answering and research, respectively. To test question answering, users were asked to find answers to questions known to exist in the index, given minimal clues to go on. Research was tested by asking the users to submit queries of their own and evaluate the relevance of the results returned by the search engine.

It was also considered asking users to evaluate results from predefined queries.

This idea presented a couple of problems. First of all, users might not know anything about the chosen topic of the queries and would therefore be in no position to evaluate the relevance of the retrieved information. Furthermore, predefined queries known to give good results could be selected, biasing the results and making the questionnaire unreliable. Finally, this approach does not model the general search behavior mentioned above, so the use of predefined queries was dismissed entirely.

The devised questionnaire can be found in section B.2 of the appendix. The test results are presented and discussed in the next section.

12.2 Test discussion

diate [3] or better. This yields a test group of users mainly in the age group of 21 to 30 years, most of whom are well familiar with how search engines work, distributed equally between the two sexes.

With that in mind, the results of the questionnaire and the conclusions drawn from it will be presented.

12.2.1 Searching

Based on the described test strategy, the purpose was to test two different scenarios, namely question answering and research. Participants were asked to find information known to exist in the index and to submit their own queries.

In the following tables, the key words in the table headers refer to the questions in the questionnaire where participants were asked to find information on these key words. The information retrieval tasks were:

1. Find a single from an album called Ten.

2. Find the real name of the artist who uses the stage name The Edge.

3. Find the name of the band behind the song Lord of the Boards.

The goal of these information retrieval tasks was to find out whether or not users were able to find useful information, this being the main functionality of a search engine. Table 12.1 shows how the 24 survey participants answered that question.

Answer       Ten   The Edge   Lord of the Boards      %
Yes           21         16                   20   79.2
No             1          7                    4   16.7
Don't know     2          1                    0    4.2

Table 12.1: Did you find what we asked for?

Clearly the table shows that Zeeker Search Engine is capable of retrieving information when users are asked to find something known to exist in the index.

Finding the real name of the artist behind the stage name The Edge [4] caused problems for several of the participants, as seen in the table. The problem causing this was quickly located: it lies in the handling of upper- and lower-case letters in the index. Searching for the edge gives much more relevant results than the query The Edge. The answers to this question in the questionnaire resulted in a small but very serious bug fix.
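The fix described above amounts to applying the same case normalization at index time and at query time. A minimal sketch of this idea, in Python, is shown below; the function and index names are illustrative, not Zeeker's actual code.

```python
# Sketch of case-insensitive retrieval: lower-case every term once, in a
# single normalize() function, and use it both when building the inverted
# index and when parsing queries. All names here are hypothetical.

def normalize(term: str) -> str:
    """Lower-case a term so 'The Edge' and 'the edge' hit the same posting list."""
    return term.lower()

def index_terms(doc_id, text, inverted_index):
    # Map each normalized term to the set of documents containing it.
    for term in text.split():
        inverted_index.setdefault(normalize(term), set()).add(doc_id)

def search(query, inverted_index):
    # Normalize query terms the same way, then intersect the posting sets.
    terms = [normalize(t) for t in query.split()]
    postings = [inverted_index.get(t, set()) for t in terms]
    return set.intersection(*postings) if postings else set()

index = {}
index_terms(1, "The Edge is the guitarist of U2", index)
# Both casings now retrieve the same document:
assert search("the edge", index) == search("The Edge", index) == {1}
```

The key design point is that normalization lives in one function, so the index and the query parser can never disagree about casing.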

Even though the participants seemed to be able to find the information required, knowing how difficult it was to find seemed important. People tend to try harder when participating in a survey than when trying out a new product at leisure. Table 12.2 shows how difficult the participants found the information retrieval tasks. Again it seems that the participants did not have problems finding some of the information asked for, which is also confirmed by comment 1 in table 12.3. A number of users did, however, find it very difficult, especially when finding the band behind the song Lord of the Boards [5]. Again, the handling of upper-case and lower-case letters might be the source of this problem. Searching for lord of the boards or Lord of the Boards (as written in the questionnaire) returns Guano Apes (the correct answer) as number four in the list of results. However, if any of the stop-words in the query, i.e. of or the, are written with capital letters, Guano Apes is not in the list of results. When users were faced with these problems, several of them requested more query operators (see comments 3, 6, 7 and 8 in table 12.3), as they believed the problem was a fundamental searching problem. Besides correcting the upper-case/lower-case problem, a future version of Zeeker Search Engine will also introduce more query operators to help users get the information they need.

[3] Where Intermediate was defined as people well familiar with Google.

[4] Guitarist David Howell Evans from the band U2.
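The stop-word symptom described above is consistent with a case-sensitive membership test: "of" is dropped, but "Of" is not. The sketch below reconstructs this failure mode and its fix in Python; it is an illustration of the behavior, not Zeeker's actual code.

```python
# Hypothetical reconstruction of the stop-word bug: a case-sensitive
# membership test drops "of" but keeps "Of", so the query
# "Lord Of The Boards" carries extra terms and misses the relevant page.

STOP_WORDS = {"of", "the", "a", "an"}

def remove_stopwords_buggy(query):
    # Case-sensitive test: capitalized stop-words slip through.
    return [t for t in query.split() if t not in STOP_WORDS]

def remove_stopwords_fixed(query):
    # Lower-case each token before the membership test,
    # matching the normalization applied to index terms.
    return [t for t in query.split() if t.lower() not in STOP_WORDS]

print(remove_stopwords_buggy("Lord Of The Boards"))  # ['Lord', 'Of', 'The', 'Boards']
print(remove_stopwords_fixed("Lord Of The Boards"))  # ['Lord', 'Boards']
```

With the buggy version, the unremoved "Of" and "The" become query terms in their own right, which explains why the correct result drops out of the list entirely.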

Future versions and extensions are discussed in chapter 14.

Answer      Ten   The Edge   Lord of the Boards      %
Very easy     7          8                    2   27.0
Easy          5          3                    7   23.8
Normal        8          3                    6   27.0
Hard          1          2                    5   12.7
Very hard     0          3                    3    9.5

Table 12.2: How difficult was it to find?

Zeeker Search Engine differs from other search engines on account of its categories, so testing whether users found the categories practical and effective was necessary. Many found the categories a good additional tool when searching, whereas a large percentage (37.5%) of the participants did not find them useful, as shown in table 12.4 and supported by comments 6 and 7 in table 12.3. The main reason may be that the category filtering is not strict enough, meaning that too many web pages are clustered under categories they do not (strictly) belong to. This was also expressed by a participant (see comment 9 in table 12.3). This issue is not easy to rectify and is discussed further in chapter 14. The categories did, however, help some users, as expressed in comments 4 and 5 in table 12.3.

In order to test the research capabilities of Zeeker Search Engine, participants were asked to submit queries of their own and then asked to rate the relevance of the results. Table 12.5 shows how the participants rated the relevance of the results to their own queries. More than 80% found the retrieved information relevant or better, whereas only 4.2% found the retrieved information not relevant at all.

The tests on Zeeker Search Engine’s search capabilities have revealed that participants rate the ease of use, relevance of retrieved information and the ease of finding information very highly. At the same time, the survey also revealed a problem in the handling of upper- and lower-case letters which resulted in some poor ratings. Zeeker Search Engine already has future features planned that will hopefully make these numbers even better (see chapter 14).

[5] Song by Guano Apes.


Id   Selected comments
1    Couldn't understand it at first, but then it was very easy.
2    Please include a spelling wiz to help the user.
3    I had some difficulties when searching for The Edge, did not get any results when writing with small caps and no relevant results when writing as shown in the survey...
4    Good work. Shouldn't be case sensitive though? Bugs aside, good for finding stuff within categories, i.e. 'I like rock, show me some bands.' For a specific search I'll rather google.
5    I like the categories, they are very useful to guide the search in the right direction.
6    The engine definitely needs the use of quoted expressions: I always use queries like "the beatles" "last single" op:AND if I want to find the last single issued by the Beatles. Furthermore I just couldn't find any use for the clusters - apparently they kept suggesting a partitioning of the results that I simply had no use for.
7    The operators arent as useable compared to google's, nor as usefull. I didn't get to use the categories a single time...
8    Had problems finding Lord of the Boards. The categories were just not useful there. An operator like SONG: could be very helpful in this case.
9    ... I feel like the "filtering" possibilities are too broad. It would be great if you could somehow come up with more specific filtering for the users ...

Table 12.3: Selected comments

Answer            Ten   The Edge   Lord of the Boards      %
Very Useful         2          8                    6   22.2
Useful              5          3                    6   19.4
Somewhat Useful     7          4                    4   20.8
Not Useful         10          9                    8   37.5

Table 12.4: Did you find the categories useful?

Answer                Count   Percent
Very Relevant             5     20.8%
Quite Relevant           10     41.7%
Relevant                  5     20.8%
Somewhat Relevant         3     12.5%
Not Relevant at all       1      4.2%
Total                    24      100%

Table 12.5: How relevant were the results to your queries?

12.2.2 Overall evaluation

The participants' overall evaluation of Zeeker Search Engine is presented in tables 12.6 and 12.7. The majority of users rated the performance between good and average, and a future version will probably be able to increase performance and decrease the search engine's response time, i.e. the time it takes the engine to respond to queries. Overall, the participants found the performance acceptable and adequate.

Answer      Count   Percent
Very Good       1        4%
Good           12       50%
Average        10       42%
Bad             1        4%
Very Bad        0        0%
Total          24      100%

Table 12.6: How do you rate our search engine’s overall performance?

The overall performance statistics also reflect how likely a user is to use a topic-based search engine (like Zeeker Search Engine) in the future. In general, the participants are positive toward this kind of search engine (see table 12.7), yet some (25%) find it unlikely or very unlikely that they will use such a search engine in the future. This is of course disheartening, but the future plans and features for Zeeker Search Engine are believed to greatly improve the search engine, hopefully lowering the number of unsatisfied users.

Answer          Count   Percent
Very Likely         5     20.8%
Likely             10     41.7%
Unlikely            5     20.8%
Very Unlikely       1      4.2%
Don't Know          3     12.5%
Total              24      100%

Table 12.7: How likely are you to use this kind of search engine again?
