<?xml version="1.0" encoding="UTF-8"?>
<!-- Reviewed: no -->
<sect1 id="zend.locale.parsing">
    <title>Normalization and Localization</title>
    <para>
        <classname>Zend_Locale_Format</classname> is an internal component used by
        <classname>Zend_Locale</classname>. All locale-aware classes use
        <classname>Zend_Locale_Format</classname> for normalization and localization of numbers and
        dates. Normalization involves parsing input from a variety of data representations, like
        dates, into a standardized, structured representation, such as a <acronym>PHP</acronym>
        array with year, month, and day elements.
    </para>
    <para>
        The exact same string containing a number or a date might mean different things to people
        with different customs and conventions. Disambiguation of numbers and dates requires rules
        about how to interpret these strings and normalize the values into a standardized data
        structure. Thus, all methods in <classname>Zend_Locale_Format</classname> require a locale
        in order to parse the input data.
        <note>
            <title>Default "root" Locale</title>
            <para>
                If no locale is specified, normalization and localization use the standard
                "root" locale. This might yield unexpected behavior if the input originated in a
                different locale, or if output for a specific locale was expected.
            </para>
        </note>
    </para>
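    <para>
        The ambiguity can be illustrated without the framework. The following is a minimal
        sketch in plain <acronym>PHP</acronym>, not the implementation used by
        <classname>Zend_Locale_Format</classname>. The helper
        <methodname>normalizeWithSeparators()</methodname> is hypothetical, and the separator
        characters are passed in by hand instead of being looked up from locale data.
    </para>
    <example id="zend.locale.parsing.sketch-1">
        <title>Sketch: one input string, two interpretations</title>
        <programlisting language="php"><![CDATA[
// Hypothetical helper for illustration only; not part of Zend Framework.
// Removes the grouping separator and turns the decimal separator into '.'.
function normalizeWithSeparators($input, $groupSeparator, $decimalSeparator)
{
    $withoutGroups = str_replace($groupSeparator, '', $input);
    return str_replace($decimalSeparator, '.', $withoutGroups);
}

$input = '1.234';

// German convention: '.' groups thousands, ',' marks decimals
print normalizeWithSeparators($input, '.', ',') . "\n"; // prints 1234

// English convention: ',' groups thousands, '.' marks decimals
print normalizeWithSeparators($input, ',', '.') . "\n"; // prints 1.234
]]></programlisting>
    </example>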
    <sect2 id="zend.locale.number.normalize">
        <title>Number normalization: getNumber($input, Array $options)</title>
        <para>
            There are many <ulink url="http://en.wikipedia.org/wiki/Numeral">number systems</ulink>
            different from the common <ulink
                url="http://en.wikipedia.org/wiki/Decimal">decimal system</ulink> (e.g. "3.14").
            Numbers can be normalized with the <methodname>getNumber()</methodname> function to
            obtain the standard decimal representation. For all number-related discussions in this
            manual, <ulink
                url="http://en.wikipedia.org/wiki/Arabic_numerals">Arabic/European numerals
                (0,1,2,3,4,5,6,7,8,9)</ulink> are implied, unless explicitly stated otherwise. The
            options array may contain a 'locale' to define grouping and decimal characters. The
            array may also have a 'precision' to round the result to a given number of decimal
            places.
        </para>
        <example id="zend.locale.number.normalize.example-1">
            <title>Number normalization</title>
            <programlisting language="php"><![CDATA[
$locale = new Zend_Locale('de_AT');
$number = Zend_Locale_Format::getNumber('13.524,678',
                                        array('locale' => $locale,
                                              'precision' => 3)
                                       );
print $number; // will return 13524.678
]]></programlisting>
        </example>
        <sect3 id="zend.locale.number.normalize.precision">
            <title>Precision and Calculations</title>
            <para>
                Since <methodname>getNumber($value, array $options = array())</methodname> can
                normalize extremely large numbers, check the result carefully before using finite
                precision calculations, such as ordinary <acronym>PHP</acronym> math operations. For
                example, if <command>(string) intval($number) != $number</command>, the value does
                not fit into a native integer, so use <ulink
                    url="http://www.php.net/bc">BCMath</ulink> or <ulink
                    url="http://www.php.net/gmp">GMP</ulink> instead. Most <acronym>PHP</acronym>
                installations support the BCMath extension.
            </para>
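            <para>
                The check described above might be wrapped as follows. This is a minimal sketch
                in plain <acronym>PHP</acronym> for integer strings only; the helper name
                <methodname>fitsNativeInteger()</methodname> is hypothetical, and the BCMath call
                is guarded because the extension may not be installed.
            </para>
            <example id="zend.locale.number.normalize.precision.sketch-1">
                <title>Sketch: checking a normalized number before native math</title>
                <programlisting language="php"><![CDATA[
// Hypothetical helper for illustration only; not part of Zend Framework.
// Returns true if the integer string survives a round trip through intval().
function fitsNativeInteger($number)
{
    return (string) intval($number) === (string) $number;
}

$small = '13524';
$large = '12345678901234567890123'; // too large for a native integer

var_dump(fitsNativeInteger($small)); // bool(true)
var_dump(fitsNativeInteger($large)); // bool(false)

// Fall back to BCMath, which calculates on numeric strings of arbitrary length
if (!fitsNativeInteger($large) && function_exists('bcadd')) {
    print bcadd($large, '1', 0) . "\n"; // prints 12345678901234567890124
}
]]></programlisting>
            </example>
            <para>
                A value that passes the check can still overflow once it is used in a
                calculation, so the result of native arithmetic deserves the same scrutiny.
            </para>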
            <para>
                Also, the resulting decimal representation can be rounded to a desired number of
                decimal places by passing the option '<property>precision</property>' to
                <methodname>getNumber()</methodname>. If no precision is given, no rounding occurs.
                Use only <acronym>PHP</acronym> integers to specify the precision.
            </para>
            <para>
                If the resulting decimal representation should be truncated to a desired length
                instead of rounded, the option '<property>number_format</property>' can be used
                instead. Define the length of the decimal representation with the desired number
                of zeros. The result will then not be rounded. So if the defined precision within
                <property>number_format</property> is zero, the value "1.6" will return "1", not
                "2". See the following example:
            </para>
            <example id="zend.locale.number.normalize.precision.example-1">
                <title>Number normalization with precision</title>
                <programlisting language="php"><![CDATA[
$locale = new Zend_Locale('de_AT');
$number = Zend_Locale_Format::getNumber('13.524,678',
                                        array('precision' => 1,
                                              'locale' => $locale)
                                       );
print $number; // will return 13524.7
$number = Zend_Locale_Format::getNumber('13.524,678',
                                        array('number_format' => '#.00',
                                              'locale' => $locale)
                                       );
print $number; // will return 13524.67
]]></programlisting>
            </example>
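            <para>
                The difference between rounding and truncating can also be seen in plain
                <acronym>PHP</acronym>. The sketch below works on an already normalized decimal
                string; <methodname>truncateDecimals()</methodname> is a hypothetical helper and
                not the way <classname>Zend_Locale_Format</classname> applies
                '<property>number_format</property>'.
            </para>
            <example id="zend.locale.number.normalize.precision.sketch-2">
                <title>Sketch: truncating versus rounding</title>
                <programlisting language="php"><![CDATA[
// Hypothetical helper for illustration only; not part of Zend Framework.
// Cuts a decimal string after $digits fraction digits, without rounding.
function truncateDecimals($number, $digits)
{
    $parts = explode('.', (string) $number, 2);
    if ($digits <= 0 || !isset($parts[1])) {
        return $parts[0];
    }
    return $parts[0] . '.' . substr($parts[1], 0, $digits);
}

print truncateDecimals('13524.678', 2) . "\n"; // prints 13524.67
print truncateDecimals('1.6', 0) . "\n";       // prints 1
print round(13524.678, 2) . "\n";              // prints 13524.68
print round(1.6) . "\n";                       // prints 2
]]></programlisting>
            </example>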
        </sect3>
    </sect2>
    <sect2 id="zend.locale.number.localize">
        <title>Number localization</title>
        <para>
            <methodname>toNumber($value, array $options = array())</methodname> can localize numbers
            for any of the <link linkend="zend.locale.appendix">supported locales</link>. This
            function returns a localized string of the given number in the conventional format of
            a specific locale. The 'number_format' option explicitly specifies a non-default number
            format for use with <methodname>toNumber()</methodname>.
        </para>
        <example id="zend.locale.number.localize.example-1">
            <title>Number localization</title>
            <programlisting language="php"><![CDATA[
$locale = new Zend_Locale('de_AT');
$number = Zend_Locale_Format::toNumber(13547.36,
                                       array('locale' => $locale));
// will return 13.547,36
print $number;
]]></programlisting>
        </example>
        <para>
            <note>
                <title>Unlimited length</title>
                <para>
                    <methodname>toNumber()</methodname> can localize numbers with unlimited length.
                    It is not related to integer or float limitations.
                </para>
            </note>
        </para>
        <para>
            <methodname>toNumber()</methodname> handles precision in the same way as
            <methodname>getNumber()</methodname>. If no precision is given, the
            complete localized number will be returned.
        </para>
        <example id="zend.locale.number.localize.example-2">
            <title>Number localization with precision</title>
            <programlisting language="php"><![CDATA[
$locale = new Zend_Locale('de_AT');
$number = Zend_Locale_Format::toNumber(13547.3678,
                                       array('precision' => 2,
                                             'locale' => $locale));
// will return 13.547,37
print $number;
]]></programlisting>
        </example>
        <para>
            The 'number_format' option defines a custom format for generating a number. The
            format has to be given in <acronym>CLDR</acronym> format as described below. The
            locale supplies the grouping separator, the decimal separator, and other number
            formatting signs. German, for example, uses ',' as the decimal separator, while
            English uses '.'.
        </para>
        <table id="zend.locale.number.localize.table-1">
            <title>Format tokens for self-defined number formats</title>
            <tgroup cols="4">
                <thead>
                    <row>
                        <entry>Token</entry>
                        <entry>Description</entry>
                        <entry>Example format</entry>
                        <entry>Generated output</entry>
                    </row>
                </thead>
                <tbody>
                    <row>
                        <entry>#0</entry>
                        <entry>Generates a number without precision and without separators</entry>
                        <entry>#0</entry>
                        <entry>1234567</entry>
                    </row>
                    <row>
                        <entry>,</entry>
                        <entry>
                            Generates a grouping separator; the group length is the distance from
                            this separator to the next separator or to the 0
                        </entry>
                        <entry>#,##0</entry>
                        <entry>1,234,567</entry>
                    </row>
                    <row>
                        <entry>#,##,##0</entry>
                        <entry>
                            Generates a first group of 3 digits and all following groups with
                            2 digits
                        </entry>
                        <entry>#,##,##0</entry>
                        <entry>12,34,567</entry>
                    </row>
                    <row>
                        <entry>.</entry>
                        <entry>Generates a precision</entry>
                        <entry>#0.#</entry>
                        <entry>1234567.1234</entry>
                    </row>
                    <row>
                        <entry>0</entry>
                        <entry>Generates a precision with a defined length</entry>
                        <entry>#0.00</entry>
                        <entry>1234567.12</entry>
                    </row>
                </tbody>
            </tgroup>
        </table>
        <example id="zend.locale.number.localize.example-3">
            <title>Using a self defined number format</title>
            <programlisting language="php"><![CDATA[
$locale = new Zend_Locale('de_AT');
$number = Zend_Locale_Format::toNumber(13547.3678,
                                       array('number_format' => '#,#0.00',
                                             'locale' => 'de')
                                      );
// will return 1.35.47,36
print $number;
$number = Zend_Locale_Format::toNumber(13547.3,
                                       array('number_format' => '#,##0.00',
                                             'locale' => 'de')
                                      );
// will return 13.547,30
print $number;
]]></programlisting>
        </example>
    </sect2>
    <sect2 id="zend.locale.number.test">
        <title>Number testing</title>
        <para>
            <methodname>isNumber($value, array $options = array())</methodname> checks if a given
            string is a number and returns <constant>TRUE</constant> or <constant>FALSE</constant>.
        </para>
        <example id="zend.locale.number.test.example-1">
            <title>Number testing</title>
            <programlisting language="php"><![CDATA[
$locale = new Zend_Locale();
if (Zend_Locale_Format::isNumber('13.445,36', array('locale' => 'de_AT'))) {
    print "Number";
} else {
    print "not a Number";
}
]]></programlisting>
        </example>
    </sect2>
    <sect2 id="zend.locale.float.normalize">
        <title>Float value normalization</title>
        <para>
            Floating point values can be parsed with the
            <methodname>getFloat($value, array $options = array())</methodname> function. A floating
            point value will be returned.
        </para>
        <example id="zend.locale.float.normalize.example-1">
            <title>Floating point value normalization</title>
            <programlisting language="php"><![CDATA[
$locale = new Zend_Locale('de_AT');
$number = Zend_Locale_Format::getFloat('13.524,678',
                                       array('precision' => 2,
                                             'locale' => $locale)
                                      );
// will return 13524.68
print $number;
]]></programlisting>
        </example>
    </sect2>
    <sect2 id="zend.locale.float.localize">
        <title>Floating point value localization</title>
        <para>
            <methodname>toFloat()</methodname> can localize floating point values. This function
            will return a localized string of the given number.
        </para>
        <example id="zend.locale.float.localize.example-1">
            <title>Floating point value localization</title>
            <programlisting language="php"><![CDATA[
$locale = new Zend_Locale('de_AT');
$number = Zend_Locale_Format::toFloat(13547.3655,
                                      array('precision' => 1,
                                            'locale' => $locale)
                                     );
// will return 13.547,4
print $number;
]]></programlisting>
        </example>
    </sect2>
    <sect2 id="zend.locale.float.test">
        <title>Floating point value testing</title>
        <para>
            <methodname>isFloat($value, array $options = array())</methodname> checks if a given
            string is a floating point value and returns <constant>TRUE</constant> or
            <constant>FALSE</constant>.
        </para>
        <example id="zend.locale.float.test.example-1">
            <title>Floating point value testing</title>
            <programlisting language="php"><![CDATA[
$locale = new Zend_Locale('de_AT');
if (Zend_Locale_Format::isFloat('13.445,36', array('locale' => $locale))) {
    print "float";
} else {
    print "not a float";
}
]]></programlisting>
        </example>
    </sect2>
    <sect2 id="zend.locale.integer.normalize">
        <title>Integer value normalization</title>
        <para>
            Integer values can be parsed with the <methodname>getInteger()</methodname> function. An
            integer value will be returned.
        </para>
        <example id="zend.locale.integer.normalize.example-1">
            <title>Integer value normalization</title>
            <programlisting language="php"><![CDATA[
$locale = new Zend_Locale('de_AT');
$number = Zend_Locale_Format::getInteger('13.524,678',
                                         array('locale' => $locale));
// will return 13524
print $number;
]]></programlisting>
        </example>
    </sect2>
    <sect2 id="zend.locale.integer.localize">
        <title>Integer value localization</title>
        <para>
            <methodname>toInteger($value, array $options = array())</methodname> can localize
            integer values. This function will return a localized string of the given number.
        </para>
        <example id="zend.locale.integer.localize.example-1">
            <title>Integer value localization</title>
            <programlisting language="php"><![CDATA[
$locale = new Zend_Locale('de_AT');
$number = Zend_Locale_Format::toInteger(13547.3655,
                                        array('locale' => $locale));
// will return 13.547
print $number;
]]></programlisting>
        </example>
    </sect2>
    <sect2 id="zend.locale.integer.test">
        <title>Integer value testing</title>
        <para>
            <methodname>isInteger($value, array $options = array())</methodname> checks if a given
            string is an integer value and returns <constant>TRUE</constant> or
            <constant>FALSE</constant>.
        </para>
        <example id="zend.locale.integer.test.example-1">
            <title>Integer value testing</title>
            <programlisting language="php"><![CDATA[
$locale = new Zend_Locale('de_AT');
if (Zend_Locale_Format::isInteger('13.445', array('locale' => $locale))) {
    print "integer";
} else {
    print "not an integer";
}
]]></programlisting>
        </example>
    </sect2>
    <sect2 id="zend.locale.numbersystems">
        <title>Numeral System Conversion</title>
        <para>
            <methodname>Zend_Locale_Format::convertNumerals()</methodname> converts digits between
            different <ulink url="http://en.wikipedia.org/wiki/Arabic_numerals">numeral
                systems</ulink>, including the standard Arabic/European/Latin numeral system
            (0,1,2,3,4,5,6,7,8,9), not to be confused with <ulink
                url="http://en.wikipedia.org/wiki/Eastern_Arabic_numerals">Eastern Arabic
                numerals</ulink> sometimes used with the Arabic language to express numerals.
            Attempts to use an unsupported numeral system will result in an exception, to avoid
            accidentally performing an incorrect conversion due to a spelling error. Characters
            in the input that are not numerals of the selected numeral system are copied to the
            output unchanged; no conversion is provided for unit separator characters.
            <classname>Zend_Locale</classname>* components rely on the data provided by
            <acronym>CLDR</acronym> (see their <ulink
                url="http://unicode.org/cldr/data/diff/supplemental/languages_and_scripts.html?sortby=date">list
                of scripts grouped by language</ulink>).
        </para>
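        <para>
            The pass-through behavior for non-numeral characters can be pictured with a plain
            <acronym>PHP</acronym> digit table. This is only a sketch under the assumption of
            UTF-8 input: <methodname>easternArabicToLatinSketch()</methodname> is a hypothetical
            helper, covers a single direction, and is not how
            <methodname>convertNumerals()</methodname> is implemented.
        </para>
        <example id="zend.locale.numbersystems.sketch-1">
            <title>Sketch: mapping digits and copying everything else</title>
            <programlisting language="php"><![CDATA[
// Hypothetical helper for illustration only; not part of Zend Framework.
// Maps Eastern Arabic digits to Latin digits using a hand-written table.
function easternArabicToLatinSketch($input)
{
    $digits = array(
        '٠' => '0', '١' => '1', '٢' => '2', '٣' => '3', '٤' => '4',
        '٥' => '5', '٦' => '6', '٧' => '7', '٨' => '8', '٩' => '9',
    );
    // strtr() replaces every listed digit and leaves all other bytes untouched
    return strtr($input, $digits);
}

print easternArabicToLatinSketch('١٠٠ km') . "\n"; // prints 100 km
]]></programlisting>
        </example>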
        <para>
            In <acronym>CLDR</acronym> and hereafter, the European/Latin numerals will
            be referred to as "Latin" or by the assigned 4-letter code "Latn".
            Also, the <acronym>CLDR</acronym> refers to these numeral systems as "scripts".
        </para>
        <para>
            Suppose a web form collected a numeric input expressed using Eastern Arabic digits
            "١‎٠٠". Most software and <acronym>PHP</acronym> functions expect input using Arabic
            numerals. Fortunately, converting this input to its equivalent Latin numerals "100"
            requires little effort using <methodname>convertNumerals($inputNumeralString,
                $sourceNumeralSystem, $destNumeralSystem)</methodname>, which returns
            <varname>$inputNumeralString</varname> with numerals in the script
            <varname>$sourceNumeralSystem</varname> converted to the script
            <varname>$destNumeralSystem</varname>.
        </para>
        <example id="zend.locale.numbersystems.example-1">
            <title>Converting numerals from Eastern Arabic scripts to European/Latin scripts</title>
            <programlisting language="php"><![CDATA[
$arabicScript = "١‎٠٠"; // Arabic for "100" (one hundred)
$latinScript = Zend_Locale_Format::convertNumerals($arabicScript,
                                                   'Arab',
                                                   'Latn');
print "\nOriginal: " . $arabicScript;
print "\nNormalized: " . $latinScript;
]]></programlisting>
        </example>
        <para>
            Similarly, any of the supported numeral systems may be converted to any other supported
            numeral system.
        </para>
        <example id="zend.locale.numbersystems.example-2">
            <title>Converting numerals from Latin script to Eastern Arabic script</title>
            <programlisting language="php"><![CDATA[
$latinScript = '123';
$arabicScript = Zend_Locale_Format::convertNumerals($latinScript,
                                                    'Latn',
                                                    'Arab');
print "\nOriginal: " . $latinScript;
print "\nLocalized: " . $arabicScript;
]]></programlisting>
        </example>
        <example id="zend.locale.numbersystems.example-3">
            <title>
                Getting the 4-letter CLDR script code using a native-language name of the script
            </title>
            <programlisting language="php"><![CDATA[
function getScriptCode($scriptName, $locale)
{
    // Map of script codes to localized script names, e.g. 'Latn' => 'Latin'
    $scripts2names = Zend_Locale_Data::getList($locale, 'script');
    // Invert the map so that the localized name becomes the key
    $names2scripts = array_flip($scripts2names);
    return $names2scripts[$scriptName];
}
echo getScriptCode('Latin', 'en'); // outputs "Latn"
echo getScriptCode('Tamil', 'en'); // outputs "Taml"
echo getScriptCode('tamoul', 'fr'); // outputs "Taml"
]]></programlisting>
        </example>
        <para>
            For a list of supported numeral systems call
            <methodname>Zend_Locale::getTranslationList('numberingsystem', 'en')</methodname>.
        </para>
    </sect2>
</sect1>
<!--
vim:se ts=4 sw=4 et:
-->
  429. -->