<?xml version="1.0" encoding="UTF-8"?><rss
version="2.0"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
> <channel><title>Comments on: Lets Roll Our Own Boolean Query Search Engine</title> <atom:link href="http://www.skorks.com/2010/02/lets-roll-our-own-boolean-query-search-engine/feed/" rel="self" type="application/rss+xml" /><link>http://www.skorks.com/2010/02/lets-roll-our-own-boolean-query-search-engine/</link> <description>For the betterment of the software craft...</description> <lastBuildDate>Mon, 21 Nov 2011 13:57:06 +0000</lastBuildDate> <sy:updatePeriod>hourly</sy:updatePeriod> <sy:updateFrequency>1</sy:updateFrequency> <generator>http://wordpress.org/?v=3.1.2</generator> <item><title>By: Alan Skorkin</title><link>http://www.skorks.com/2010/02/lets-roll-our-own-boolean-query-search-engine/comment-page-1/#comment-3731</link> <dc:creator>Alan Skorkin</dc:creator> <pubDate>Sun, 28 Feb 2010 02:28:57 +0000</pubDate> <guid
isPermaLink="false">http://www.skorks.com/?p=1289#comment-3731</guid> <description>The way I understand it, a boolean retrieval system is called that not because you do AND, OR, NOT queries, but because each result is either found or not found (i.e. either 1 or 0). There is no fuzziness like there would be in a ranked system such as a web search engine.
All ranked retrieval systems are also boolean retrieval systems (kinda) in that you have to find the results before you can rank them by relevance.</description> <content:encoded><![CDATA[<p>The way I understand it, a boolean retrieval system is called that not because you do AND, OR, NOT queries, but because each result is either found or not found (i.e. either 1 or 0). There is no fuzziness like there would be in a ranked system such as a web search engine.</p><p>All ranked retrieval systems are also boolean retrieval systems (kinda) in that you have to find the results before you can rank them by relevance.</p> ]]></content:encoded> </item> <item><title>By: CarolN</title><link>http://www.skorks.com/2010/02/lets-roll-our-own-boolean-query-search-engine/comment-page-1/#comment-3712</link> <dc:creator>CarolN</dc:creator> <pubDate>Sat, 27 Feb 2010 18:31:48 +0000</pubDate> <guid
isPermaLink="false">http://www.skorks.com/?p=1289#comment-3712</guid> <description>I&#039;m not sure why you make the distinction &quot;the results are not ranked by relevance (being a boolean system)&quot;-- many search engines are BOTH a boolean retrieval system and a ranked retrieval system. There&#039;s nothing incompatible about doing both boolean queries and ranking, and having ranking doesn&#039;t make something NOT a boolean retrieval system...</description> <content:encoded><![CDATA[<p>I&#8217;m not sure why you make the distinction &#8220;the results are not ranked by relevance (being a boolean system)&#8221;&#8211; many search engines are BOTH a boolean retrieval system and a ranked retrieval system. There&#8217;s nothing incompatible about doing both boolean queries and ranking, and having ranking doesn&#8217;t make something NOT a boolean retrieval system&#8230;</p> ]]></content:encoded> </item> <item><title>By: Alan Skorkin</title><link>http://www.skorks.com/2010/02/lets-roll-our-own-boolean-query-search-engine/comment-page-1/#comment-3490</link> <dc:creator>Alan Skorkin</dc:creator> <pubDate>Sun, 07 Feb 2010 01:30:15 +0000</pubDate> <guid
isPermaLink="false">http://www.skorks.com/?p=1289#comment-3490</guid> <description>Ah yes, now I see it. You&#039;re right: the intersecting part of the two circles is in play for the OR NOT case but not for the NOT case (as per your truth table). Unfortunately, this doesn&#039;t remove any of the complexity of actually finding a list of documents that would satisfy either query :(.
Thanks for picking up on my error, much appreciated.</description> <content:encoded><![CDATA[<p>Ah yes, now I see it. You&#8217;re right: the intersecting part of the two circles is in play for the OR NOT case but not for the NOT case (as per your truth table). Unfortunately, this doesn&#8217;t remove any of the complexity of actually finding a list of documents that would satisfy either query :(.</p><p>Thanks for picking up on my error, much appreciated.</p> ]]></content:encoded> </item> <item><title>By: Tordek</title><link>http://www.skorks.com/2010/02/lets-roll-our-own-boolean-query-search-engine/comment-page-1/#comment-3488</link> <dc:creator>Tordek</dc:creator> <pubDate>Sat, 06 Feb 2010 18:46:16 +0000</pubDate> <guid
isPermaLink="false">http://www.skorks.com/?p=1289#comment-3488</guid> <description>If you actually draw this case out with the two intersecting circles you’ll find that the area colored by both queries is NOT the same.
&quot;hello world&quot; is not a query in my example; it&#039;s an item you search for, i.e., a document containing both &quot;hello&quot; and &quot;world&quot;.
Basic logic:
&lt;pre&gt;
p&#124;q&#124;p OR NOT q&#124; NOT q
0&#124;0&#124;1         &#124;1
0&#124;1&#124;0         &#124;0
1&#124;0&#124;1         &#124;1
1&#124;1&#124;1         &#124;0 &lt;- Different.
&lt;/pre&gt;</description> <content:encoded><![CDATA[<p>If you actually draw this case out with the two intersecting circles you’ll find that the area colored by both queries is NOT the same.</p><p>&#8220;hello world&#8221; is not a query in my example; it&#8217;s an item you search for, i.e., a document containing both &#8220;hello&#8221; and &#8220;world&#8221;.</p><p>Basic logic:</p><pre>
p|q|p OR NOT q| NOT q
0|0|1         |1
0|1|0         |0
1|0|1         |1
1|1|1         |0 &lt;- Different.
</pre>]]></content:encoded> </item> <item><title>By: Alan Skorkin</title><link>http://www.skorks.com/2010/02/lets-roll-our-own-boolean-query-search-engine/comment-page-1/#comment-3485</link> <dc:creator>Alan Skorkin</dc:creator> <pubDate>Sat, 06 Feb 2010 02:08:37 +0000</pubDate> <guid
isPermaLink="false">http://www.skorks.com/?p=1289#comment-3485</guid> <description>If you draw this case out with the two intersecting circles you&#039;ll find that the area colored by both queries is the same, which makes me think they are indeed equivalent.
“hello world” is not the trivial case. In a boolean system you can&#039;t do multi-word queries without an operator unless the tokens used during indexing are also multi-word.</description> <content:encoded><![CDATA[<p>If you draw this case out with the two intersecting circles you&#8217;ll find that the area colored by both queries is the same, which makes me think they are indeed equivalent.</p><p>“hello world” is not the trivial case. In a boolean system you can&#8217;t do multi-word queries without an operator unless the tokens used during indexing are also multi-word.</p> ]]></content:encoded> </item> <item><title>By: Tordek</title><link>http://www.skorks.com/2010/02/lets-roll-our-own-boolean-query-search-engine/comment-page-1/#comment-3480</link> <dc:creator>Tordek</dc:creator> <pubDate>Fri, 05 Feb 2010 19:04:46 +0000</pubDate> <guid
isPermaLink="false">http://www.skorks.com/?p=1289#comment-3480</guid> <description>&quot;hello OR NOT world&quot; is NOT the same as &quot;NOT world&quot;.
The trivial case: &quot;hello world&quot; will match the first query (since hello is there), but will not match the second.</description> <content:encoded><![CDATA[<p>&#8220;hello OR NOT world&#8221; is NOT the same as &#8220;NOT world&#8221;.</p><p>The trivial case: &#8220;hello world&#8221; will match the first query (since hello is there), but will not match the second.</p> ]]></content:encoded> </item> </channel> </rss>
