Examples of pooling

The same search query was run on two search engines. In this example, we consider the top 10 links of each result list; relevant links are marked in yellow. After you have judged the relevance of each link, put the relevant ones into a common pool. Some links appear on only one search engine, some on both, but each link appears in the pool only once. For this query, we have 7 relevant links in the pool.

For another query, pooled in the same way, we have 8 relevant links in the pool.
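The pooling step described above can be sketched in Python as follows. This is only a minimal illustration: the function name `build_pool` and the link names are hypothetical placeholders, not the actual search results of this example.

```python
def build_pool(*relevant_lists):
    """Merge the relevant links of several engines into one pool.

    Each link is kept only once, even if several engines returned it.
    """
    pool = []
    seen = set()
    for links in relevant_lists:
        for link in links:
            if link not in seen:
                seen.add(link)
                pool.append(link)
    return pool


# Hypothetical relevant links found in the top 10 of each engine
# (placeholders, not the real result lists).
engine_a_relevant = ["url1", "url2", "url3", "url4"]
engine_b_relevant = ["url3", "url4", "url5"]

pool = build_pool(engine_a_relevant, engine_b_relevant)
print(len(pool))  # size of the pool, used later as the recall denominator
```

The size of the pool is what recall is measured against, so a link found by both engines must not be counted twice.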

 

Example of precision and recall

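The following values are based on the above queries (q1, q2) and pools. P is precision and R is recall; recall is measured against the pool size, which is 7 for q1 and 8 for q2.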

Engine  Google                                  Lycos
Query   q1                  q2                  q1                  q2
Measure P         R         P         R         P         R         P         R
Top 1   -         -         1/1=1     1/8=0.13  1/1=1     1/7=0.14  1/1=1     1/8=0.13
Top 2   1/2=0.5   1/7=0.14  2/2=1     2/8=0.25  1/2=0.5   1/7=0.14  2/2=1     2/8=0.25
Top 3   1/3=0.33  1/7=0.14  2/3=0.67  2/8=0.25  2/3=0.67  2/7=0.29  3/3=1     3/8=0.38
Top 4   2/4=0.5   2/7=0.29  3/4=0.75  3/8=0.38  3/4=0.75  3/7=0.43  3/4=0.75  3/8=0.38
Top 5   3/5=0.6   3/7=0.43  4/5=0.8   4/8=0.5   3/5=0.6   3/7=0.43  3/5=0.6   3/8=0.38
Top 6   4/6=0.67  4/7=0.57  5/6=0.83  5/8=0.63  3/6=0.5   3/7=0.43  3/6=0.5   3/8=0.38
Top 7   5/7=0.71  5/7=0.71  6/7=0.86  6/8=0.75  3/7=0.43  3/7=0.43  3/7=0.43  3/8=0.38
Top 8   6/8=0.75  6/7=0.86  6/8=0.75  6/8=0.75  3/8=0.38  3/7=0.43  4/8=0.5   4/8=0.5
Top 9   6/9=0.67  6/7=0.86  6/9=0.67  6/8=0.75  3/9=0.33  3/7=0.43  4/9=0.44  4/8=0.5
Top 10  6/10=0.6  6/7=0.86  7/10=0.7  7/8=0.88  3/10=0.3  3/7=0.43  4/10=0.4  4/8=0.5

Please observe that recall values never fall as we consider the top 1, top 2, etc. documents: each recall value is either equal to or higher than the previous one. Precision values can go up and down. Also observe that the top 1 precision-recall values for q1 on Google are not given (-) because the top 1 link was not relevant, so no relevant link has been retrieved at that point.
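The table values can be reproduced with a short Python sketch. The function name `precision_recall_at_k` is hypothetical, and the relevance flags for Google q1 are an assumption inferred from the fractions in the table (relevant links at ranks 2, 4, 5, 6, 7 and 8), not taken from the original result list.

```python
def precision_recall_at_k(relevance, pool_size, k):
    """Return (precision, recall) for the top k links.

    relevance: one flag per ranked link, 1 = relevant, 0 = not relevant
    pool_size: number of relevant links in the pool
    """
    hits = sum(relevance[:k])  # relevant links among the top k
    return hits / k, hits / pool_size


# Assumed relevance flags for Google q1, inferred from the table.
google_q1 = [0, 1, 0, 1, 1, 1, 1, 1, 0, 0]
pool_size_q1 = 7

for k in range(1, len(google_q1) + 1):
    p, r = precision_recall_at_k(google_q1, pool_size_q1, k)
    print(f"Top {k}: P={p:.2f} R={r:.2f}")
```

For top 1 this sketch returns 0 for both values, where the table shows a dash.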

[Figures: Google precision-recall curves; Lycos precision-recall curves]

Please observe that in Assignment 1 you measure precision and recall for the top 5, top 10, top 15 ... top 30 links.

 

Example of interpolated precision

Based on the above precision-recall values:

Engine           Google, interpolated precision   Lycos, interpolated precision
Standard recall  q1         q2         Average    q1         q2         Average
0.1              0.75       1          0.88       1          1          1
0.2              0.75       1          0.88       0.75       1          0.88
0.3              0.75       0.86       0.81       0.75       1          0.88
0.4              0.75       0.86       0.81       0.75       0.5        0.63
0.5              0.75       0.86       0.81       0          0.5        0.25
0.6              0.75       0.86       0.81       -          -          -
0.7              0.75       0.86       0.81       -          -          -
0.8              0.75       0.7        0.73       -          -          -

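To make averaging possible, a missing interpolated precision value is replaced with 0 (for example, Lycos q1 at standard recall 0.5). Where neither query has a value, the average is left empty (-).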
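A minimal Python sketch of this averaging rule is given below. The function name `average_interpolated` is hypothetical, `None` is used here to mark a missing value, and leaving the average empty when every query is missing is an assumption read off the table rather than a stated rule.

```python
def average_interpolated(values):
    """Average the interpolated precision of several queries at one recall level.

    values: one entry per query; None marks a missing value.
    A missing value counts as 0. If every value is missing, the
    average is left empty (None), as assumed from the table.
    """
    if all(v is None for v in values):
        return None
    filled = [0.0 if v is None else v for v in values]
    return sum(filled) / len(filled)


# Lycos values from the table: standard recall -> (q1, q2).
lycos = {
    0.4: (0.75, 0.5),
    0.5: (None, 0.5),   # q1 has no measured recall this high
    0.6: (None, None),  # neither query reaches this recall
}

for level, values in lycos.items():
    print(level, average_interpolated(values))
```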

Please observe that the interpolated precision at a standard recall value is the highest, not the closest, measured precision "to the right", that is, among all measured points whose recall is greater than or equal to that standard recall value. This includes the precision measured exactly at the standard recall value, if your measured recall happens to be equal to it. A sequence of interpolated precision values is therefore either flat or falling, never rising.
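The "highest precision to the right" rule can be sketched in Python as follows. The function name `interpolated_precision` is hypothetical, and the relevance flags for Lycos q1 are an assumption inferred from the precision-recall table (relevant links at ranks 1, 3 and 4, pool size 7). Missing values are returned as `None` here, without replacing them with 0.

```python
def interpolated_precision(points, levels):
    """Interpolate precision at the given standard recall levels.

    points: measured (recall, precision) pairs
    levels: standard recall values
    Returns one value per level: the highest precision among all points
    whose recall is >= the level, or None if there is no such point.
    """
    result = []
    for level in levels:
        candidates = [p for r, p in points if r >= level]
        result.append(max(candidates) if candidates else None)
    return result


# Assumed relevance flags for Lycos q1, inferred from the table.
lycos_q1 = [1, 0, 1, 1, 0, 0, 0, 0, 0, 0]
pool_size = 7

# Measured (recall, precision) pairs for top 1 .. top 10.
points = []
hits = 0
for k, flag in enumerate(lycos_q1, start=1):
    hits += flag
    points.append((hits / pool_size, hits / k))

levels = [i / 10 for i in range(1, 9)]  # 0.1, 0.2, ..., 0.8
for level, value in zip(levels, interpolated_precision(points, levels)):
    print(level, value)
```

Because each level takes the maximum over a shrinking set of points, the resulting sequence cannot rise from one standard recall value to the next.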

Google vs. Lycos
Each curve is the average interpolated precision-recall curve of one search engine: one curve for Google, one curve for Lycos.


Assignment 1