Pārlūkot izejas kodu

[DOCUMENTATION] French: new file: Zend_Search_Lucene-Queries.xml

git-svn-id: http://framework.zend.com/svn/framework/standard/trunk@20414 44c647ce-9c0f-0410-b52a-842ac1e357ba
kara 16 gadi atpakaļ
vecāks
revīzija
b0c65f5a1e

+ 611 - 0
documentation/manual/fr/module_specs/Zend_Search_Lucene-Queries.xml

@@ -0,0 +1,611 @@
+<?xml version="1.0" encoding="utf-8"?>
+<!-- EN-Revision: 20115 -->
+<!-- Reviewed: no -->
+<sect1 id="zend.search.lucene.query-api">
+    <title>API de construction de requêtes</title>
+    <para>
+        En plus du parsage automatique d'une requête en chaîne de caractères, il est également possible de construire
+        cette requête à l'aide de l'<acronym>API</acronym> de requête.
+    </para>
+    <para>
+        Les requêtes utilisateur peuvent être combinées avec les requêtes créées à travers l'API de requêtes.
+        Utilisez simplement le parseur de requêtes pour construire une requête à partir d'une chaîne :
+        <programlisting language="php"><![CDATA[
+$query = Zend_Search_Lucene_Search_QueryParser::parse($queryString);
+]]></programlisting>
+    </para>
+    <sect2 id="zend.search.lucene.queries.exceptions">
+        <title>Les Exceptions du parseur de requêtes</title>
+        <para>
+        Le parseur de requêtes peut générer deux types d'exceptions :
+        <itemizedlist>
+            <listitem>
+                <para>
+                    Une <classname>Zend_Search_Lucene_Exception</classname> est levée si quelque-chose d'anormal se produit dans le
+                    parseur même.
+                </para>
+            </listitem>
+            <listitem>
+                <para>
+                    Une <classname>Zend_Search_Lucene_Search_QueryParserException</classname> est levée s'il y a une erreur dans la syntaxe de la requête.
+                </para>
+            </listitem>
+        </itemizedlist>
+        C'est une bonne idée d'attraper les <classname>Zend_Search_Lucene_Search_QueryParserException</classname>s et
+        de les traiter de manière appropriée :
+        <programlisting language="php"><![CDATA[
+try {
+    $query = Zend_Search_Lucene_Search_QueryParser::parse($queryString);
+} catch (Zend_Search_Lucene_Search_QueryParserException $e) {
+    echo "Query syntax error: " . $e->getMessage() . "\n";
+}
+]]></programlisting>
+        </para>
+        <para>
+            La même technique devrait être utilisée pour la méthode find() d'un objet <classname>Zend_Search_Lucene</classname>.
+        </para>
+        <para>
+            Depuis la version 1.5, les exceptions de parsage de requête sont supprimées par défaut. Si la requête ne
+            respecte pas le langage de requêtes, elle est "tokenizée" à l'aide de l'analyseur par défaut et tous les
+            termes "tokenizés" sont utilisés dans la recherche.
+            Pour activer les exceptions, utilisez <methodname>Zend_Search_Lucene_Search_QueryParser::dontSuppressQueryParsingExceptions()</methodname>.
+            Les méthodes <methodname>Zend_Search_Lucene_Search_QueryParser::suppressQueryParsingExceptions()</methodname> et
+            <methodname>Zend_Search_Lucene_Search_QueryParser::queryParsingExceptionsSuppressed()</methodname>
+            sont également destinées à gérer le comportement de gestion des exceptions.
+        </para>
+    </sect2>
+    <sect2 id="zend.search.lucene.queries.term-query">
+        <title>Term Query</title>
+        <para>
+            Term queries can be used for searching with a single term.
+        </para>
+        <para>
+        Query string:
+        </para>
+        <programlisting language="querystring"><![CDATA[
+word1
+]]></programlisting>
+        <para>or</para>
+        <para>
+        Query construction by <acronym>API</acronym>:
+        </para>
+        <programlisting language="php"><![CDATA[
+$term  = new Zend_Search_Lucene_Index_Term('word1', 'field1');
+$query = new Zend_Search_Lucene_Search_Query_Term($term);
+$hits  = $index->find($query);
+]]></programlisting>
+        <para>
+            The term field is optional. <classname>Zend_Search_Lucene</classname> searches through all indexed fields in each document if the field is not specified:
+            <programlisting language="php"><![CDATA[
+// Search for 'word1' in all indexed fields
+$term  = new Zend_Search_Lucene_Index_Term('word1');
+$query = new Zend_Search_Lucene_Search_Query_Term($term);
+$hits  = $index->find($query);
+]]></programlisting>
+        </para>
+    </sect2>
+    <sect2 id="zend.search.lucene.queries.multiterm-query">
+        <title>Multi-Term Query</title>
+        <para>
+            Multi-term queries can be used for searching with a set of terms.
+        </para>
+        <para>
+            Each term in a set can be defined as <emphasis>required</emphasis>,
+            <emphasis>prohibited</emphasis>, or <emphasis>neither</emphasis>.
+            <itemizedlist>
+                <listitem>
+                    <para>
+                        <emphasis>required</emphasis> means that documents not matching this term will not match
+                        the query;
+                    </para>
+                </listitem>
+                <listitem>
+                    <para>
+                        <emphasis>prohibited</emphasis> means that documents matching this term will not match
+                        the query;
+                    </para>
+                </listitem>
+                <listitem>
+                    <para>
+                        <emphasis>neither</emphasis>, in which case matched documents are neither prohibited
+                        from, nor required to, match the term. A document must match at least 1 term, however, to
+                        match the query.
+                    </para>
+                </listitem>
+            </itemizedlist>
+        </para>
+        <para>
+            If optional terms are added to a query with required terms,
+            both queries will have the same result set but the optional terms may affect the score of the matched documents.
+        </para>
+        <para>
+            Both search methods can be used for multi-term queries.
+        </para>
+        <para>
+            Query string:
+        </para>
+        <programlisting language="querystring"><![CDATA[
++word1 author:word2 -word3
+]]></programlisting>
+        <itemizedlist>
+            <listitem>
+                <para>
+                    '+' is used to define a required term.
+                </para>
+            </listitem>
+            <listitem>
+                <para>
+                    '-' is used to define a prohibited term.
+                </para>
+            </listitem>
+            <listitem>
+                <para>
+                    'field:' prefix is used to indicate a document field for a search.
+                    If it's omitted, then all fields are searched.
+                </para>
+            </listitem>
+        </itemizedlist>
+        <para>or</para>
+        <para>
+        Query construction by <acronym>API</acronym>:
+        </para>
+        <programlisting language="php"><![CDATA[
+$query = new Zend_Search_Lucene_Search_Query_MultiTerm();
+$query->addTerm(new Zend_Search_Lucene_Index_Term('word1'), true);
+$query->addTerm(new Zend_Search_Lucene_Index_Term('word2', 'author'),
+                null);
+$query->addTerm(new Zend_Search_Lucene_Index_Term('word3'), false);
+$hits  = $index->find($query);
+]]></programlisting>
+        <para>
+            It's also possible to specify terms list within MultiTerm query constructor:
+            <programlisting language="php"><![CDATA[
+$terms = array(new Zend_Search_Lucene_Index_Term('word1'),
+               new Zend_Search_Lucene_Index_Term('word2', 'author'),
+               new Zend_Search_Lucene_Index_Term('word3'));
+$signs = array(true, null, false);
+$query = new Zend_Search_Lucene_Search_Query_MultiTerm($terms, $signs);
+$hits  = $index->find($query);
+]]></programlisting>
+        </para>
+        <para>
+            The <varname>$signs</varname> array contains information about the term type:
+            <itemizedlist>
+                <listitem>
+                    <para>
+                        <constant>TRUE</constant> is used to define required term.
+                    </para>
+                </listitem>
+                <listitem>
+                    <para>
+                        <constant>FALSE</constant> is used to define prohibited term.
+                    </para>
+                </listitem>
+                <listitem>
+                    <para>
+                        <constant>NULL</constant> is used to define a term that is neither required
+                        nor prohibited.
+                    </para>
+                </listitem>
+            </itemizedlist>
+        </para>
+    </sect2>
+    <sect2 id="zend.search.lucene.queries.boolean-query">
+        <title>Boolean Query</title>
+        <para>
+            Boolean queries allow to construct query using other queries and boolean operators.
+        </para>
+        <para>
+            Each subquery in a set can be defined as <emphasis>required</emphasis>,
+            <emphasis>prohibited</emphasis>, or <emphasis>optional</emphasis>.
+            <itemizedlist>
+                <listitem>
+                    <para>
+                        <emphasis>required</emphasis> means that documents not matching this subquery will not match
+                        the query;
+                    </para>
+                </listitem>
+                <listitem>
+                    <para>
+                        <emphasis>prohibited</emphasis> means that documents matching this subquery will not match
+                        the query;
+                    </para>
+                </listitem>
+                <listitem>
+                    <para>
+                        <emphasis>optional</emphasis>, in which case matched documents are neither prohibited
+                        from, nor required to, match the subquery. A document must match at least 1 subquery, however, to
+                        match the query.
+                    </para>
+                </listitem>
+            </itemizedlist>
+        </para>
+        <para>
+            If optional subqueries are added to a query with required subqueries,
+            both queries will have the same result set but the optional subqueries may affect the score of the matched documents.
+        </para>
+        <para>
+            Both search methods can be used for boolean queries.
+        </para>
+        <para>
+            Query string:
+        </para>
+        <programlisting language="querystring"><![CDATA[
++(word1 word2 word3) author:(word4 word5) -word6
+]]></programlisting>
+        <itemizedlist>
+            <listitem>
+                <para>
+                    '+' is used to define a required subquery.
+                </para>
+            </listitem>
+            <listitem>
+                <para>
+                    '-' is used to define a prohibited subquery.
+                </para>
+            </listitem>
+            <listitem>
+                <para>
+                    'field:' prefix is used to indicate a document field for a search.
+                    If it's omitted, then all fields are searched.
+                </para>
+            </listitem>
+        </itemizedlist>
+        <para>or</para>
+        <para>
+        Query construction by <acronym>API</acronym>:
+        </para>
+        <programlisting language="php"><![CDATA[
+$query = new Zend_Search_Lucene_Search_Query_Boolean();
+$subquery1 = new Zend_Search_Lucene_Search_Query_MultiTerm();
+$subquery1->addTerm(new Zend_Search_Lucene_Index_Term('word1'));
+$subquery1->addTerm(new Zend_Search_Lucene_Index_Term('word2'));
+$subquery1->addTerm(new Zend_Search_Lucene_Index_Term('word3'));
+$subquery2 = new Zend_Search_Lucene_Search_Query_MultiTerm();
+$subquery2->addTerm(new Zend_Search_Lucene_Index_Term('word4', 'author'));
+$subquery2->addTerm(new Zend_Search_Lucene_Index_Term('word5', 'author'));
+$term6 = new Zend_Search_Lucene_Index_Term('word6');
+$subquery3 = new Zend_Search_Lucene_Search_Query_Term($term6);
+$query->addSubquery($subquery1, true  /* required */);
+$query->addSubquery($subquery2, null  /* optional */);
+$query->addSubquery($subquery3, false /* prohibited */);
+$hits  = $index->find($query);
+]]></programlisting>
+        <para>
+            It's also possible to specify subqueries list within Boolean query constructor:
+            <programlisting language="php"><![CDATA[
+...
+$subqueries = array($subquery1, $subquery2, $subquery3);
+$signs = array(true, null, false);
+$query = new Zend_Search_Lucene_Search_Query_Boolean($subqueries, $signs);
+$hits  = $index->find($query);
+]]></programlisting>
+        </para>
+        <para>
+            The <varname>$signs</varname> array contains information about the subquery type:
+            <itemizedlist>
+                <listitem>
+                    <para>
+                        <constant>TRUE</constant> is used to define required subquery.
+                    </para>
+                </listitem>
+                <listitem>
+                    <para>
+                        <constant>FALSE</constant> is used to define prohibited subquery.
+                    </para>
+                </listitem>
+                <listitem>
+                    <para>
+                        <constant>NULL</constant> is used to define a subquery that is neither
+                        required nor prohibited.
+                    </para>
+                </listitem>
+            </itemizedlist>
+        </para>
+        <para>
+            Each query which uses boolean operators can be rewritten using signs notation and constructed using API. For example:
+            <programlisting language="querystring"><![CDATA[
+word1 AND (word2 AND word3 AND NOT word4) OR word5
+]]></programlisting>
+            is equivalent to
+            <programlisting language="querystring"><![CDATA[
+(+(word1) +(+word2 +word3 -word4)) (word5)
+]]></programlisting>
+        </para>
+    </sect2>
+    <sect2 id="zend.search.lucene.queries.wildcard">
+        <title>Wildcard Query</title>
+        <para>
+            Wildcard queries can be used to search for documents containing strings matching specified patterns.
+        </para>
+        <para>
+            The '?' symbol is used as a single character wildcard.
+        </para>
+        <para>
+            The '*' symbol is used as a multiple character wildcard.
+        </para>
+        <para>
+            Query string:
+            <programlisting language="querystring"><![CDATA[
+field1:test*
+]]></programlisting>
+        </para>
+        <para>or</para>
+        <para>
+            Query construction by API:
+            <programlisting language="php"><![CDATA[
+$pattern = new Zend_Search_Lucene_Index_Term('test*', 'field1');
+$query = new Zend_Search_Lucene_Search_Query_Wildcard($pattern);
+$hits  = $index->find($query);
+]]></programlisting>
+        </para>
+        <para>
+            The term field is optional. <classname>Zend_Search_Lucene</classname> searches through all fields on each document if a field is not specified:
+            <programlisting language="php"><![CDATA[
+$pattern = new Zend_Search_Lucene_Index_Term('test*');
+$query = new Zend_Search_Lucene_Search_Query_Wildcard($pattern);
+$hits  = $index->find($query);
+]]></programlisting>
+        </para>
+    </sect2>
+    <sect2 id="zend.search.lucene.queries.fuzzy">
+        <title>Fuzzy Query</title>
+        <para>
+            Fuzzy queries can be used to search for documents containing strings matching terms similar to specified term.
+        </para>
+        <para>
+            Query string:
+            <programlisting language="querystring"><![CDATA[
+field1:test~
+]]></programlisting>
+            This query matches documents containing 'test' 'text' 'best' words and others.
+        </para>
+        <para>or</para>
+        <para>
+            Query construction by API:
+            <programlisting language="php"><![CDATA[
+$term = new Zend_Search_Lucene_Index_Term('test', 'field1');
+$query = new Zend_Search_Lucene_Search_Query_Fuzzy($term);
+$hits  = $index->find($query);
+]]></programlisting>
+        </para>
+        <para>
+            Optional similarity can be specified after "~" sign.
+        </para>
+        <para>
+            Query string:
+            <programlisting language="querystring"><![CDATA[
+field1:test~0.4
+]]></programlisting>
+        </para>
+        <para>or</para>
+        <para>
+            Query construction by API:
+            <programlisting language="php"><![CDATA[
+$term = new Zend_Search_Lucene_Index_Term('test', 'field1');
+$query = new Zend_Search_Lucene_Search_Query_Fuzzy($term, 0.4);
+$hits  = $index->find($query);
+]]></programlisting>
+        </para>
+        <para>
+            The term field is optional. <classname>Zend_Search_Lucene</classname> searches through all fields on each document if a field is not specified:
+            <programlisting language="php"><![CDATA[
+$term = new Zend_Search_Lucene_Index_Term('test');
+$query = new Zend_Search_Lucene_Search_Query_Fuzzy($term);
+$hits  = $index->find($query);
+]]></programlisting>
+        </para>
+    </sect2>
+    <sect2 id="zend.search.lucene.queries.phrase-query">
+        <title>Phrase Query</title>
+        <para>
+            Phrase Queries can be used to search for a phrase within documents.
+        </para>
+        <para>
+            Phrase Queries are very flexible and allow the user or developer to search for exact phrases as well as 'sloppy' phrases.
+        </para>
+        <para>
+            Phrases can also contain gaps or terms in the same places; they can be generated by
+            the analyzer for different purposes. For example, a term can be duplicated to increase the term
+            its weight, or several synonyms can be placed into a single position.
+        </para>
+        <programlisting language="php"><![CDATA[
+$query1 = new Zend_Search_Lucene_Search_Query_Phrase();
+// Add 'word1' at 0 relative position.
+$query1->addTerm(new Zend_Search_Lucene_Index_Term('word1'));
+// Add 'word2' at 1 relative position.
+$query1->addTerm(new Zend_Search_Lucene_Index_Term('word2'));
+// Add 'word3' at 3 relative position.
+$query1->addTerm(new Zend_Search_Lucene_Index_Term('word3'), 3);
+...
+$query2 = new Zend_Search_Lucene_Search_Query_Phrase(
+                array('word1', 'word2', 'word3'), array(0,1,3));
+...
+// Query without a gap.
+$query3 = new Zend_Search_Lucene_Search_Query_Phrase(
+                array('word1', 'word2', 'word3'));
+...
+$query4 = new Zend_Search_Lucene_Search_Query_Phrase(
+                array('word1', 'word2'), array(0,1), 'annotation');
+]]></programlisting>
+        <para>
+            A phrase query can be constructed in one step with a class constructor or step by step with
+            <methodname>Zend_Search_Lucene_Search_Query_Phrase::addTerm()</methodname> method calls.
+        </para>
+        <para>
+            <classname>Zend_Search_Lucene_Search_Query_Phrase</classname> class constructor takes three optional arguments:
+        </para>
+        <programlisting language="php"><![CDATA[
+Zend_Search_Lucene_Search_Query_Phrase(
+    [array $terms[, array $offsets[, string $field]]]
+);
+]]></programlisting>
+        <para>
+            The <varname>$terms</varname> parameter is an array of strings that contains a set of
+            phrase terms. If it's omitted or equal to <constant>NULL</constant>, then an empty query
+            is constructed.
+        </para>
+        <para>
+            The <varname>$offsets</varname> parameter is an array of integers that contains offsets
+            of terms in a phrase. If it's omitted or equal to <constant>NULL</constant>, then the
+            terms' positions are assumed to be sequential with no gaps.
+        </para>
+        <para>
+            The <varname>$field</varname> parameter is a string that indicates the document field
+            to search. If it's omitted or equal to <constant>NULL</constant>, then the default field
+            is searched.
+        </para>
+        <para>
+            Thus:
+        </para>
+        <programlisting language="php"><![CDATA[
+$query =
+    new Zend_Search_Lucene_Search_Query_Phrase(array('zend', 'framework'));
+]]></programlisting>
+        <para>
+            will search for the phrase 'zend framework' in all fields.
+        </para>
+        <programlisting language="php"><![CDATA[
+$query = new Zend_Search_Lucene_Search_Query_Phrase(
+                 array('zend', 'download'), array(0, 2)
+             );
+]]></programlisting>
+        <para>
+            will search for the phrase 'zend ????? download' and match 'zend platform download', 'zend studio
+            download', 'zend core download', 'zend framework download', and so on.
+        </para>
+        <programlisting language="php"><![CDATA[
+$query = new Zend_Search_Lucene_Search_Query_Phrase(
+                 array('zend', 'framework'), null, 'title'
+             );
+]]></programlisting>
+        <para>
+            will search for the phrase 'zend framework' in the 'title' field.
+        </para>
+        <para>
+            <methodname>Zend_Search_Lucene_Search_Query_Phrase::addTerm()</methodname> takes two arguments, a
+            required <classname>Zend_Search_Lucene_Index_Term</classname> object and an optional position:
+        </para>
+        <programlisting language="php"><![CDATA[
+Zend_Search_Lucene_Search_Query_Phrase::addTerm(
+    Zend_Search_Lucene_Index_Term $term[, integer $position]
+);
+]]></programlisting>
+        <para>
+            The <varname>$term</varname> parameter describes the next term in the phrase. It must indicate the same field as previous terms, or an exception will be thrown.
+        </para>
+        <para>
+            The <varname>$position</varname> parameter indicates the term position in the phrase.
+        </para>
+        <para>
+            Thus:
+        </para>
+        <programlisting language="php"><![CDATA[
+$query = new Zend_Search_Lucene_Search_Query_Phrase();
+$query->addTerm(new Zend_Search_Lucene_Index_Term('zend'));
+$query->addTerm(new Zend_Search_Lucene_Index_Term('framework'));
+]]></programlisting>
+        <para>
+            will search for the phrase 'zend framework'.
+         </para>
+        <programlisting language="php"><![CDATA[
+$query = new Zend_Search_Lucene_Search_Query_Phrase();
+$query->addTerm(new Zend_Search_Lucene_Index_Term('zend'), 0);
+$query->addTerm(new Zend_Search_Lucene_Index_Term('framework'), 2);
+]]></programlisting>
+        <para>
+            will search for the phrase 'zend ????? download' and match 'zend platform download', 'zend studio
+            download', 'zend core download', 'zend framework download', and so on.
+        </para>
+        <programlisting language="php"><![CDATA[
+$query = new Zend_Search_Lucene_Search_Query_Phrase();
+$query->addTerm(new Zend_Search_Lucene_Index_Term('zend', 'title'));
+$query->addTerm(new Zend_Search_Lucene_Index_Term('framework', 'title'));
+]]></programlisting>
+        <para>
+            will search for the phrase 'zend framework' in the 'title' field.
+        </para>
+        <para>
+            The slop factor sets the number of other words permitted between specified words in the query phrase. If set to zero,
+            then the corresponding query is an exact phrase search. For larger values this works like the WITHIN or NEAR
+            operators.
+        </para>
+        <para>
+            The slop factor is in fact an edit distance, where the edits correspond to moving terms in the query
+            phrase. For example, to switch the order of two words requires two moves (the
+            first move places the words atop one another), so to permit re-orderings of phrases, the slop factor
+            must be at least two.
+        </para>
+        <para>
+            More exact matches are scored higher than sloppier matches; thus, search results are sorted by
+            exactness. The slop is zero by default, requiring exact matches.
+        </para>
+        <para>
+            The slop factor can be assigned after query creation:
+        </para>
+        <programlisting language="php"><![CDATA[
+// Query without a gap.
+$query =
+    new Zend_Search_Lucene_Search_Query_Phrase(array('word1', 'word2'));
+// Search for 'word1 word2', 'word1 ... word2'
+$query->setSlop(1);
+$hits1 = $index->find($query);
+// Search for 'word1 word2', 'word1 ... word2',
+// 'word1 ... ... word2', 'word2 word1'
+$query->setSlop(2);
+$hits2 = $index->find($query);
+]]></programlisting>
+    </sect2>
+    <sect2 id="zend.search.lucene.queries.range">
+        <title>Range Query</title>
+        <para>
+            <link linkend="zend.search.lucene.query-language.range">Range queries</link> are intended for searching terms within specified interval.
+        </para>
+        <para>
+            Query string:
+            <programlisting language="querystring"><![CDATA[
+mod_date:[20020101 TO 20030101]
+title:{Aida TO Carmen}
+]]></programlisting>
+        </para>
+        <para>or</para>
+        <para>
+            Query construction by API:
+            <programlisting language="php"><![CDATA[
+$from = new Zend_Search_Lucene_Index_Term('20020101', 'mod_date');
+$to   = new Zend_Search_Lucene_Index_Term('20030101', 'mod_date');
+$query = new Zend_Search_Lucene_Search_Query_Range(
+                 $from, $to, true // inclusive
+             );
+$hits  = $index->find($query);
+]]></programlisting>
+        </para>
+        <para>
+            Term fields are optional. <classname>Zend_Search_Lucene</classname> searches through all fields if the field is not specified:
+            <programlisting language="php"><![CDATA[
+$from = new Zend_Search_Lucene_Index_Term('Aida');
+$to   = new Zend_Search_Lucene_Index_Term('Carmen');
+$query = new Zend_Search_Lucene_Search_Query_Range(
+                 $from, $to, false // non-inclusive
+             );
+$hits  = $index->find($query);
+]]></programlisting>
+        </para>
+        <para>
+            Either (but not both) of the boundary terms may be set to <constant>NULL</constant>.
+            <classname>Zend_Search_Lucene</classname> searches from the beginning or
+            up to the end of the dictionary for the specified field(s) in this case:
+            <programlisting language="php"><![CDATA[
+// searches for ['20020101' TO ...]
+$from = new Zend_Search_Lucene_Index_Term('20020101', 'mod_date');
+$query = new Zend_Search_Lucene_Search_Query_Range(
+                 $from, null, true // inclusive
+             );
+$hits  = $index->find($query);
+]]></programlisting>
+        </para>
+    </sect2>
+</sect1>
+<!--
+vim:se ts=4 sw=4 et:
+-->