<?xml version="1.0" encoding="UTF-8"?>
<!-- Reviewed: no -->
<sect1 id="zend.locale.parsing">
<title>Normalization and Localization</title>
<para>
<classname>Zend_Locale_Format</classname> is an internal component used by <classname>Zend_Locale</classname>. All locale-aware classes use
<classname>Zend_Locale_Format</classname> for normalization and localization of numbers and dates. Normalization involves
parsing input from a variety of data representations, like dates, into a standardized, structured
representation, such as a PHP array with year, month, and day elements.
</para>
<para>
The exact same string containing a number or a date might mean different things to people with different customs
and conventions. Disambiguation of numbers and dates requires rules about how to interpret these strings and
normalize the values into a standardized data structure. Thus, all methods in <classname>Zend_Locale_Format</classname>
require a locale in order to parse the input data.
<note>
<title>Default "root" Locale</title>
<para>
If no locale is specified, then normalization and localization will use the standard "root" locale,
which might yield unexpected behavior if the input originated in a different locale, or if output for a
specific locale was expected.
</para>
</note>
</para>
<sect2 id="zend.locale.number.normalize">
<title>Number normalization: getNumber($input, Array $options)</title>
<para>
There are many
<ulink url="http://en.wikipedia.org/wiki/Numeral">number systems</ulink>
different from the common
<ulink url="http://en.wikipedia.org/wiki/Decimal">decimal system</ulink>
(e.g. "3.14"). Numbers can be normalized with the <code>getNumber()</code> function to obtain the standard
decimal representation. For all number-related discussions in this manual,
<ulink url="http://en.wikipedia.org/wiki/Arabic_numerals">Arabic/European numerals (0,1,2,3,4,5,6,7,8,9)</ulink>
are implied, unless explicitly stated otherwise. The options array may contain a 'locale' to define grouping
and decimal characters. The array may also have a 'precision' to round the result to the given number of
decimal places.
</para>
<example id="zend.locale.number.normalize.example-1">
<title>Number normalization</title>
<programlisting language="php"><![CDATA[
$locale = new Zend_Locale('de_AT');
$number = Zend_Locale_Format::getNumber('13.524,678',
                                        array('locale' => $locale,
                                              'precision' => 3)
                                       );
print $number; // will return 13524.678
]]></programlisting>
</example>
<sect3 id="zend.locale.number.normalize.precision">
<title>Precision and Calculations</title>
<para>
Since <code>getNumber($value, array $options = array())</code> can normalize extremely large numbers,
check the result carefully before using finite precision calculations, such as ordinary PHP math
operations. For example, if <code>(string) intval($number) != $number</code>, the value does not fit
a native PHP integer and should be processed with
<ulink url="http://www.php.net/bc">BCMath</ulink>
or
<ulink url="http://www.php.net/gmp">GMP</ulink>
instead. Most PHP installations support the BCMath extension.
</para>
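<para>
The check described above can be sketched in plain PHP as follows. This is only an illustration under
stated assumptions: <code>fitsNativeInteger()</code> is a hypothetical helper, not part of
<classname>Zend_Locale_Format</classname>, the input is assumed to be an already normalized integer
string, and the BCMath extension is assumed to be loaded.
</para>
<programlisting language="php"><![CDATA[
// Hypothetical helper, not part of Zend_Locale_Format.
// A value fits a native integer only if it survives the round trip
// through intval() unchanged.
function fitsNativeInteger($number)
{
    return (string) intval($number) === (string) $number;
}

// Assumed to be a normalized result, e.g. from getNumber()
$number = '12345678901234567890123';

if (fitsNativeInteger($number)) {
    // Small enough for ordinary PHP math (the sum itself may still overflow)
    $sum = (string) ($number + 1);
} else {
    // Arbitrary precision addition; scale 0 keeps no decimal places
    $sum = bcadd($number, '1', 0);
}
print $sum; // 12345678901234567890124
]]></programlisting>
<para>
Values with decimal places also fail the round trip, so they take the BCMath branch as well. In that
case pass a scale that matches the number of decimal places to keep.
</para>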
<para>
Also, the precision of the resulting decimal representation can be rounded to a desired length with
<code>getNumber()</code> and the option <code>'precision'</code>. If no precision is given,
no rounding occurs. Use only PHP integers to specify the precision.
</para>
<para>
If the resulting decimal representation should be truncated to a desired length instead of rounded,
the option <code>'number_format'</code> can be used instead. Define the length of the decimal
representation with the desired number of zeros. The result will then not be rounded.
So if the defined precision within <code>number_format</code> is zero, the value "1.6" will
return "1", not "2". See the following example:
</para>
<example id="zend.locale.number.normalize.precision.example-1">
<title>Number normalization with precision</title>
<programlisting language="php"><![CDATA[
$locale = new Zend_Locale('de_AT');
$number = Zend_Locale_Format::getNumber('13.524,678',
                                        array('precision' => 1,
                                              'locale' => $locale)
                                       );
print $number; // will return 13524.7
$number = Zend_Locale_Format::getNumber('13.524,678',
                                        array('number_format' => '#.00',
                                              'locale' => $locale)
                                       );
print $number; // will return 13524.67
]]></programlisting>
</example>
</sect3>
</sect2>
<sect2 id="zend.locale.number.localize">
<title>Number localization</title>
<para>
<code>toNumber($value, array $options = array())</code> can localize numbers to the following
<link linkend="zend.locale.appendix">supported locales</link>.
This function will return a localized string of the given number in a conventional format for a specific
locale. The 'number_format' option explicitly specifies a non-default number format for use with
<code>toNumber()</code>.
</para>
<example id="zend.locale.number.localize.example-1">
<title>Number localization</title>
<programlisting language="php"><![CDATA[
$locale = new Zend_Locale('de_AT');
$number = Zend_Locale_Format::toNumber(13547.36,
                                       array('locale' => $locale));
// will return 13.547,36
print $number;
]]></programlisting>
</example>
<para>
<note>
<title>Unlimited length</title>
<para>
<code>toNumber()</code> can localize numbers of unlimited length. It is not bound by PHP's integer or
float limitations.
</para>
</note>
</para>
<para>
<code>toNumber()</code> handles precision the same way as <code>getNumber()</code>. If no precision
is given, the complete localized number will be returned.
</para>
<example id="zend.locale.number.localize.example-2">
<title>Number localization with precision</title>
<programlisting language="php"><![CDATA[
$locale = new Zend_Locale('de_AT');
$number = Zend_Locale_Format::toNumber(13547.3678,
                                       array('precision' => 2,
                                             'locale' => $locale));
// will return 13.547,37
print $number;
]]></programlisting>
</example>
<para>
Using the option 'number_format', a custom format for generating a number can be defined.
The format itself has to be given in CLDR format as described below. The locale supplies the
grouping separator, the decimal separator, and other number formatting signs. German, for example,
uses ',' as the decimal separator, whereas English uses '.'.
</para>
<table id="zend.locale.number.localize.table-1">
<title>Format tokens for custom number formats</title>
<tgroup cols="4">
<thead>
<row>
<entry>Token</entry>
<entry>Description</entry>
<entry>Example format</entry>
<entry>Generated output</entry>
</row>
</thead>
<tbody>
<row>
<entry>#0</entry>
<entry>Generates a number without decimal places or grouping separators</entry>
<entry>#0</entry>
<entry>1234567</entry>
</row>
<row>
<entry>,</entry>
<entry>Inserts a grouping separator; the group size is the distance from this separator to the next separator or to the 0</entry>
<entry>#,##0</entry>
<entry>1,234,567</entry>
</row>
<row>
<entry>#,##,##0</entry>
<entry>Groups the last 3 digits, then all further digits in groups of 2</entry>
<entry>#,##,##0</entry>
<entry>12,34,567</entry>
</row>
<row>
<entry>.</entry>
<entry>Generates decimal places</entry>
<entry>#0.#</entry>
<entry>1234567.1234</entry>
</row>
<row>
<entry>0</entry>
<entry>Generates decimal places with a defined length</entry>
<entry>#0.00</entry>
<entry>1234567.12</entry>
</row>
</tbody>
</tgroup>
</table>
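<para>
To illustrate how the grouping tokens in the table are read, the following plain PHP sketch groups a
digit string by a primary and a secondary group size. It is not how
<classname>Zend_Locale_Format</classname> implements formatting: <code>groupDigits()</code> is a
hypothetical helper that assumes a non-negative integer given as a string.
</para>
<programlisting language="php"><![CDATA[
// Hypothetical helper, not part of Zend_Locale_Format.
// $primary is the size of the rightmost group, $secondary the size of all
// groups to its left (both are 3 for a pattern like '#,##0').
function groupDigits($digits, $primary, $secondary, $separator = ',')
{
    if (strlen($digits) <= $primary) {
        return $digits;
    }
    $tail = substr($digits, -$primary);
    $head = substr($digits, 0, -$primary);

    // Split the remaining digits from the right into $secondary-sized groups
    $groups = str_split(strrev($head), $secondary);
    $groups = array_reverse(array_map('strrev', $groups));

    return implode($separator, $groups) . $separator . $tail;
}

print groupDigits('1234567', 3, 3);    // '#,##0'    gives 1,234,567
print groupDigits('1234567', 3, 2);    // '#,##,##0' gives 12,34,567
print groupDigits('13547', 2, 2, '.'); // '#,#0' with '.' gives 1.35.47
]]></programlisting>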
<example id="zend.locale.number.localize.example-3">
<title>Using a self defined number format</title>
<programlisting language="php"><![CDATA[
$locale = new Zend_Locale('de_AT');
$number = Zend_Locale_Format::toNumber(13547.3678,
                                       array('number_format' => '#,#0.00',
                                             'locale' => 'de')
                                      );
// will return 1.35.47,36
print $number;
$number = Zend_Locale_Format::toNumber(13547.3,
                                       array('number_format' => '#,##0.00',
                                             'locale' => 'de')
                                      );
// will return 13.547,30
print $number;
]]></programlisting>
</example>
</sect2>
<sect2 id="zend.locale.number.test">
<title>Number testing</title>
<para>
<code>isNumber($value, array $options = array())</code> checks if a given string is a number and returns
true or false.
</para>
<example id="zend.locale.number.test.example-1">
<title>Number testing</title>
<programlisting language="php"><![CDATA[
$locale = new Zend_Locale();
if (Zend_Locale_Format::isNumber('13.445,36', array('locale' => 'de_AT'))) {
    print "Number";
} else {
    print "not a Number";
}
]]></programlisting>
</example>
</sect2>
<sect2 id="zend.locale.float.normalize">
<title>Float value normalization</title>
<para>
Floating point values can be parsed with the <code>getFloat($value, array $options = array())</code>
function. A floating point value will be returned.
</para>
<example id="zend.locale.float.normalize.example-1">
<title>Floating point value normalization</title>
<programlisting language="php"><![CDATA[
$locale = new Zend_Locale('de_AT');
$number = Zend_Locale_Format::getFloat('13.524,678',
                                       array('precision' => 2,
                                             'locale' => $locale)
                                      );
// will return 13524.68
print $number;
]]></programlisting>
</example>
</sect2>
<sect2 id="zend.locale.float.localize">
<title>Floating point value localization</title>
<para>
<code>toFloat()</code> can localize floating point values. This function will return a localized string of
the given number.
</para>
<example id="zend.locale.float.localize.example-1">
<title>Floating point value localization</title>
<programlisting language="php"><![CDATA[
$locale = new Zend_Locale('de_AT');
$number = Zend_Locale_Format::toFloat(13547.3655,
                                      array('precision' => 1,
                                            'locale' => $locale)
                                     );
// will return 13.547,4
print $number;
]]></programlisting>
</example>
</sect2>
<sect2 id="zend.locale.float.test">
<title>Floating point value testing</title>
<para>
<code>isFloat($value, array $options = array())</code> checks if a given string is a floating point value
and returns true or false.
</para>
<example id="zend.locale.float.test.example-1">
<title>Floating point value testing</title>
<programlisting language="php"><![CDATA[
$locale = new Zend_Locale('de_AT');
if (Zend_Locale_Format::isFloat('13.445,36', array('locale' => $locale))) {
    print "float";
} else {
    print "not a float";
}
]]></programlisting>
</example>
</sect2>
<sect2 id="zend.locale.integer.normalize">
<title>Integer value normalization</title>
<para>
Integer values can be parsed with the <code>getInteger()</code> function. An integer value will be returned.
</para>
<example id="zend.locale.integer.normalize.example-1">
<title>Integer value normalization</title>
<programlisting language="php"><![CDATA[
$locale = new Zend_Locale('de_AT');
$number = Zend_Locale_Format::getInteger('13.524,678',
                                         array('locale' => $locale));
// will return 13524
print $number;
]]></programlisting>
</example>
</sect2>
<sect2 id="zend.locale.integer.localize">
<title>Integer value localization</title>
<para>
<code>toInteger($value, array $options = array())</code> can localize integer values. This function will
return a localized string of the given number.
</para>
<example id="zend.locale.integer.localize.example-1">
<title>Integer value localization</title>
<programlisting language="php"><![CDATA[
$locale = new Zend_Locale('de_AT');
$number = Zend_Locale_Format::toInteger(13547.3655,
                                        array('locale' => $locale));
// will return 13.547
print $number;
]]></programlisting>
</example>
</sect2>
<sect2 id="zend.locale.integer.test">
<title>Integer value testing</title>
<para>
<code>isInteger($value, array $options = array())</code> checks if a given string is an integer value and
returns true or false.
</para>
<example id="zend.locale.integer.test.example-1">
<title>Integer value testing</title>
<programlisting language="php"><![CDATA[
$locale = new Zend_Locale('de_AT');
if (Zend_Locale_Format::isInteger('13.445', array('locale' => $locale))) {
    print "integer";
} else {
    print "not an integer";
}
]]></programlisting>
</example>
</sect2>
<sect2 id="zend.locale.numbersystems">
<title>Numeral System Conversion</title>
<para>
<classname>Zend_Locale_Format::convertNumerals()</classname> converts digits between different
<ulink url="http://en.wikipedia.org/wiki/Arabic_numerals">numeral systems</ulink>,
including the standard Arabic/European/Latin numeral system (0,1,2,3,4,5,6,7,8,9), not to be confused with
<ulink url="http://en.wikipedia.org/wiki/Eastern_Arabic_numerals">Eastern Arabic numerals</ulink>
sometimes used with the Arabic language to express numerals. Attempts to use an unsupported numeral system
will result in an exception, to avoid accidentally performing an incorrect conversion due to a spelling
error. All characters in the input that are not numerals for the selected numeral system are copied to
the output unchanged; no conversion is provided for unit separator characters. The
<classname>Zend_Locale_*</classname> components rely on the data provided by CLDR (see their
<ulink url="http://unicode.org/cldr/data/diff/supplemental/languages_and_scripts.html?sortby=date">list of scripts grouped by language</ulink>).
</para>
<para>
In CLDR and hereafter, the European/Latin numerals will
be referred to as "Latin" or by the assigned 4-letter code "Latn".
Also, the CLDR refers to these numeral systems as "scripts".
</para>
<para>
Suppose a web form collected a numeric input expressed using Eastern Arabic digits "١‎٠٠".
Most software and PHP functions expect input using Arabic/European (Latin) numerals. Fortunately, converting
this input to its equivalent Latin numerals "100" requires little effort using
<code>convertNumerals($inputNumeralString, $sourceNumeralSystem, $destNumeralSystem)</code>,
which returns <code>$inputNumeralString</code> with numerals in the script <code>$sourceNumeralSystem</code>
converted to the script <code>$destNumeralSystem</code>.
</para>
<example id="zend.locale.numbersystems.example-1">
<title>Converting numerals from Eastern Arabic scripts to European/Latin scripts</title>
<programlisting language="php"><![CDATA[
$arabicScript = "١‎٠٠"; // Arabic for "100" (one hundred)
$latinScript = Zend_Locale_Format::convertNumerals($arabicScript,
                                                   'Arab',
                                                   'Latn');
print "\nOriginal: " . $arabicScript;
print "\nNormalized: " . $latinScript;
]]></programlisting>
</example>
<para>
Similarly, any of the supported numeral systems may be converted to any other supported numeral system.
</para>
<example id="zend.locale.numbersystems.example-2">
<title>Converting numerals from Latin script to Eastern Arabic script</title>
<programlisting language="php"><![CDATA[
$latinScript = '123';
$arabicScript = Zend_Locale_Format::convertNumerals($latinScript,
                                                    'Latn',
                                                    'Arab');
print "\nOriginal: " . $latinScript;
print "\nLocalized: " . $arabicScript;
]]></programlisting>
</example>
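<para>
The idea behind such a conversion can be sketched without the library as a digit-for-digit replacement.
The snippet below is an illustration only: it uses a hard-coded table for the "Latn" and "Arab" digits
instead of CLDR data, and <code>convertDigits()</code> is a hypothetical helper rather than the
implementation of <code>convertNumerals()</code>. As described above, characters that are not digits
of the source system pass through unchanged.
</para>
<programlisting language="php"><![CDATA[
// Illustration only: a hard-coded digit table instead of CLDR data.
$latinDigits  = array();
$arabicDigits = array();
for ($i = 0; $i < 10; $i++) {
    $latinDigits[]  = (string) $i;
    // U+0660 to U+0669 are encoded in UTF-8 as the bytes D9 A0 to D9 A9
    $arabicDigits[] = "\xD9" . chr(0xA0 + $i);
}

// Hypothetical helper: replaces each digit of one system with its
// counterpart in the other; all other characters are left untouched.
function convertDigits($input, array $from, array $to)
{
    return str_replace($from, $to, $input);
}

$eastern = convertDigits('2024-05', $latinDigits, $arabicDigits);
print convertDigits($eastern, $arabicDigits, $latinDigits); // 2024-05
]]></programlisting>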
<example id="zend.locale.numbersystems.example-3">
<title>Getting 4 letter CLDR script code using a native-language name of the script</title>
<programlisting language="php"><![CDATA[
function getScriptCode($scriptName, $locale)
{
    $scripts2names = Zend_Locale_Data::getList($locale, 'script');
    $names2scripts = array_flip($scripts2names);
    return $names2scripts[$scriptName];
}
echo getScriptCode('Latin', 'en'); // outputs "Latn"
echo getScriptCode('Tamil', 'en'); // outputs "Taml"
echo getScriptCode('tamoul', 'fr'); // outputs "Taml"
]]></programlisting>
</example>
<para>
For a list of supported numeral systems call
<methodname>Zend_Locale::getTranslationList('numberingsystem', 'en')</methodname>.
</para>
</sect2>
</sect1>
<!--
vim:se ts=4 sw=4 et:
-->