<?xml version="1.0"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN"
                  "http://www.oasis-open.org/docbook/xml/4.3/docbookx.dtd" [
  <!ENTITY % local.common.attrib "xmlns:xi CDATA #FIXED 'http://www.w3.org/2003/XInclude'">
  <!ENTITY version SYSTEM "version.xml">
]>
<chapter id="getting-started">
  <title>Getting started with HarfBuzz</title>
  <section id="an-overview-of-the-harfbuzz-shaping-api">
    <title>An overview of the HarfBuzz shaping API</title>
    <para>
      The core of the HarfBuzz shaping API is the function
      <function>hb_shape()</function>. This function takes a font, a
      buffer containing a string of Unicode codepoints, and
      (optionally) a list of font features as its input. It replaces
      the codepoints in the buffer with the corresponding glyphs from
      the font, correctly ordered and positioned, and with any of the
      optional font features applied.
    </para>
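    <para>
      As a minimal sketch of the optional feature list, the fragment
      below parses two OpenType feature strings with
      <function>hb_feature_from_string()</function> and passes the
      result to <function>hb_shape()</function>. The helper names
      <function>parse_example_features()</function> and
      <function>shape_with_features()</function>, and the choice of
      the <literal>smcp</literal> and <literal>liga</literal>
      features, are assumptions made for illustration only.
    </para>
    <programlisting language="C">
#include &lt;hb.h&gt;

/* Hypothetical helper: fill out[0..1] with two example features.
 * Returns the number of feature strings that parsed successfully. */
static unsigned int
parse_example_features(hb_feature_t out[2])
{
    unsigned int n = 0;
    /* "smcp" enables small caps across the whole buffer. */
    if (hb_feature_from_string("smcp", -1, &amp;out[n]))
        n++;
    /* A leading "-" disables a feature; here, standard ligatures. */
    if (hb_feature_from_string("-liga", -1, &amp;out[n]))
        n++;
    return n;
}

/* Hypothetical helper: shape buf with the example features applied. */
static void
shape_with_features(hb_font_t *font, hb_buffer_t *buf)
{
    hb_feature_t features[2];
    unsigned int count = parse_example_features(features);
    hb_shape(font, buf, features, count);
}
    </programlisting>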
    <para>
      In addition to holding the pre-shaping input (the Unicode
      codepoints that comprise the input string) and the post-shaping
      output (the glyphs and positions), a HarfBuzz buffer has several
      properties that affect shaping. The most important are the
      text-flow direction (e.g., left-to-right, right-to-left,
      top-to-bottom, or bottom-to-top), the script tag, and the
      language tag.
    </para>
    <para>
      For input string buffers, flags are available to denote when the
      buffer represents the beginning or end of a paragraph, to
      indicate whether or not to visibly render Unicode <literal>Default
      Ignorable</literal> codepoints, and to modify the cluster-merging
      behavior for the buffer. For shaped output buffers, the
      individual X and Y offsets and <literal>advances</literal>
      (the logical dimensions) of each glyph are
      accessible. HarfBuzz also flags glyphs as
      <literal>UNSAFE_TO_BREAK</literal> if breaking the string at
      that glyph (e.g., in a line-breaking or hyphenation process)
      would require re-shaping the text.
    </para>
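    <para>
      The flags described above can be sketched as follows. The
      fragment marks a buffer as holding a complete paragraph before
      shaping and, after shaping, counts the glyphs that are not
      flagged <literal>UNSAFE_TO_BREAK</literal>. Both helper names
      are hypothetical; only the <literal>hb_</literal> calls and
      <literal>HB_</literal> constants come from HarfBuzz.
    </para>
    <programlisting language="C">
#include &lt;hb.h&gt;

/* Hypothetical helper: declare that buf starts and ends a paragraph.
 * Call this before shaping. */
static void
mark_whole_paragraph(hb_buffer_t *buf)
{
    hb_buffer_set_flags(buf,
        (hb_buffer_flags_t) (HB_BUFFER_FLAG_BOT | HB_BUFFER_FLAG_EOT));
}

/* Hypothetical helper: after shaping, count the glyphs at which the
 * text could be broken without re-shaping either side. */
static unsigned int
count_safe_break_points(hb_buffer_t *buf)
{
    unsigned int glyph_count, safe = 0;
    hb_glyph_info_t *info = hb_buffer_get_glyph_infos(buf, &amp;glyph_count);
    for (unsigned int i = 0; i &lt; glyph_count; i++) {
        hb_glyph_flags_t flags = hb_glyph_info_get_glyph_flags(&amp;info[i]);
        if (!(flags &amp; HB_GLYPH_FLAG_UNSAFE_TO_BREAK))
            safe++;
    }
    return safe;
}
    </programlisting>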
    <para>
      HarfBuzz also provides methods to compare the contents of
      buffers, join buffers, normalize buffer contents, and handle
      invalid codepoints, as well as to determine the state of a
      buffer (e.g., input codepoints or output glyphs). Buffer
      lifecycles are managed and all buffers are reference-counted.
    </para>
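    <para>
      As a brief illustration of buffer state and reference counting,
      the sketch below checks whether a buffer currently holds shaped
      glyphs and takes an additional reference to a buffer. The helper
      names <function>buffer_is_shaped()</function> and
      <function>share_buffer()</function> are assumptions for this
      example. Every owner of a reference is expected to call
      <function>hb_buffer_destroy()</function> once when finished.
    </para>
    <programlisting language="C">
#include &lt;hb.h&gt;

/* Hypothetical helper: returns 1 if buf holds output glyphs, or 0 if
 * it holds Unicode input or has no content yet. */
static int
buffer_is_shaped(hb_buffer_t *buf)
{
    return hb_buffer_get_content_type(buf) == HB_BUFFER_CONTENT_TYPE_GLYPHS;
}

/* Hypothetical helper: hand out a second reference to the same buffer.
 * The returned pointer is buf itself, with its count incremented. */
static hb_buffer_t *
share_buffer(hb_buffer_t *buf)
{
    return hb_buffer_reference(buf);
}
    </programlisting>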
    <para>
      Although the default <function>hb_shape()</function> function is
      sufficient for most use cases, a variant,
      <function>hb_shape_full()</function>, is also provided that
      lets you specify which of HarfBuzz's shapers to use on a buffer.
    </para>
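    <para>
      A minimal sketch of choosing a shaper explicitly is shown below.
      It uses <function>hb_shape_list_shapers()</function> to check
      which shapers the library was built with and
      <function>hb_shape_full()</function> to restrict shaping to a
      given list. The helper names are hypothetical, and the example
      assumes that the <literal>ot</literal> shaper is present, as it
      is in typical builds.
    </para>
    <programlisting language="C">
#include &lt;string.h&gt;
#include &lt;hb.h&gt;

/* Hypothetical helper: returns 1 if this build of HarfBuzz lists a
 * shaper with the given name, or 0 otherwise. */
static int
shaper_is_available(const char *name)
{
    /* The list is NULL-terminated and owned by HarfBuzz. */
    const char **shapers = hb_shape_list_shapers();
    for (; *shapers; shapers++)
        if (strcmp(*shapers, name) == 0)
            return 1;
    return 0;
}

/* Hypothetical helper: shape buf using only the "ot" shaper.
 * Returns false if no shaper in the list could shape the buffer. */
static hb_bool_t
shape_with_ot_only(hb_font_t *font, hb_buffer_t *buf)
{
    const char *shaper_list[] = { "ot", NULL };
    return hb_shape_full(font, buf, NULL, 0, shaper_list);
}
    </programlisting>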
    <para>
      HarfBuzz can read TrueType fonts, TrueType collections, OpenType
      fonts, and OpenType collections. Functions are provided to query
      font objects about metrics, Unicode coverage, available tables and
      features, and variation selectors. Individual glyphs can also be
      queried for metrics, variations, and glyph names. OpenType
      variable fonts are supported, and HarfBuzz allows you to set
      variation-axis coordinates on font objects.
    </para>
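    <para>
      Setting variation-axis coordinates might look like the sketch
      below, which parses two axis settings and applies them to a font
      object. The helper name
      <function>set_example_variations()</function> and the axis
      values are illustrative assumptions; whether they have any
      effect depends on the axes that a particular variable font
      defines.
    </para>
    <programlisting language="C">
#include &lt;hb.h&gt;

/* Hypothetical helper: request a bold, slightly condensed instance.
 * Returns 1 on success, or 0 if a setting string fails to parse. */
static int
set_example_variations(hb_font_t *font)
{
    hb_variation_t variations[2];
    if (!hb_variation_from_string("wght=700", -1, &amp;variations[0]) ||
        !hb_variation_from_string("wdth=85", -1, &amp;variations[1]))
        return 0;
    /* Apply the design-space coordinates before shaping with font. */
    hb_font_set_variations(font, variations, 2);
    return 1;
}
    </programlisting>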
    <para>
      HarfBuzz provides glue code to integrate with various other
      libraries, including FreeType, GObject, and CoreText. Support
      for integrating with Uniscribe and DirectWrite is experimental
      at present.
    </para>
  </section>
  <section id="terminology">
    <title>Terminology</title>
    <variablelist>
      <?dbfo list-presentation="blocks"?>
      <varlistentry>
        <term>script</term>
        <listitem>
          <para>
            In text shaping, a <emphasis>script</emphasis> is a
            writing system: a set of symbols, rules, and conventions
            that is used to represent a language or multiple
            languages.
          </para>
          <para>
            In general computing lingo, the word "script" can also
            be used to mean an executable program (usually one
            written in a human-readable programming language). For
            the sake of clarity, HarfBuzz documents will always use
            more specific terminology when referring to this
            meaning, such as "Python script" or "shell script." In
            all other instances, "script" refers to a writing system.
          </para>
          <para>
            For developers using HarfBuzz, it is important to note
            the distinction between a script and a language. Most
            scripts are used to write a variety of different
            languages, and many languages may be written in more
            than one script.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term>shaper</term>
        <listitem>
          <para>
            In HarfBuzz, a <emphasis>shaper</emphasis> is a
            handler for a specific script-shaping model. HarfBuzz
            implements separate shapers for Indic, Arabic, Thai and
            Lao, Khmer, Myanmar, Tibetan, Hangul, Hebrew, the
            Universal Shaping Engine (USE), and a default shaper for
            scripts with no script-specific shaping model.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term>cluster</term>
        <listitem>
          <para>
            In text shaping, a <emphasis>cluster</emphasis> is a
            sequence of codepoints that must be treated as an
            indivisible unit. Clusters can include code-point
            sequences that form a ligature or base-and-mark
            sequences. Tracking and preserving clusters is important
            when shaping operations might separate or reorder
            code points.
          </para>
          <para>
            HarfBuzz provides three cluster
            <emphasis>levels</emphasis> that implement different
            approaches to the problem of preserving clusters during
            shaping operations.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term>grapheme</term>
        <listitem>
          <para>
            In linguistics, a <emphasis>grapheme</emphasis> is one
            of the indivisible units that make up a writing system or
            script. Often, graphemes are individual symbols (letters,
            numbers, punctuation marks, logograms, etc.) but,
            depending on the writing system, a particular grapheme
            might correspond to a sequence of several Unicode code
            points.
          </para>
          <para>
            In practice, HarfBuzz and other text-shaping engines
            are not generally concerned with graphemes. However, it
            is important for developers using HarfBuzz to recognize
            that there is a difference between graphemes and shaping
            clusters (see above). The two concepts may overlap
            frequently, but there is no guarantee that they will be
            identical.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term>syllable</term>
        <listitem>
          <para>
            In linguistics, a <emphasis>syllable</emphasis> is
            a sequence of sounds that makes up a building block of a
            particular language. Every language has its own set of
            rules describing what constitutes a valid syllable.
          </para>
          <para>
            For text-shaping purposes, the various definitions of
            "syllable" are important because script-specific shaping
            operations may be applied at the syllable level. For
            example, a reordering rule might specify that a vowel
            mark be reordered to the beginning of the syllable.
          </para>
          <para>
            A syllable consists of one or more Unicode code
            points. The definition of a syllable for a particular
            writing system might correspond to how HarfBuzz
            identifies clusters (see above) for the same writing
            system. However, it is important for developers using
            HarfBuzz to recognize that there is a difference between
            syllables and shaping clusters. The two concepts may
            overlap frequently, but there is no guarantee that they
            will be identical.
          </para>
        </listitem>
      </varlistentry>
    </variablelist>
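    <para>
      To make the cluster entry above more concrete, the following
      sketch selects one of the cluster levels on a buffer and counts
      the distinct cluster values it holds. The helper names are
      hypothetical. The counting loop assumes that cluster values are
      monotonic, so that glyphs sharing a cluster are adjacent, which
      is the case for the two <literal>MONOTONE</literal> levels.
    </para>
    <programlisting language="C">
#include &lt;hb.h&gt;

/* Hypothetical helper: select the monotone-characters cluster level.
 * Call this before shaping. */
static void
use_character_clusters(hb_buffer_t *buf)
{
    hb_buffer_set_cluster_level(buf,
        HB_BUFFER_CLUSTER_LEVEL_MONOTONE_CHARACTERS);
}

/* Hypothetical helper: count runs of items that share a cluster value. */
static unsigned int
count_clusters(hb_buffer_t *buf)
{
    unsigned int count, clusters = 0;
    hb_glyph_info_t *info = hb_buffer_get_glyph_infos(buf, &amp;count);
    for (unsigned int i = 0; i &lt; count; i++)
        if (i == 0 || info[i].cluster != info[i - 1].cluster)
            clusters++;
    return clusters;
}
    </programlisting>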
  </section>
  <section id="a-simple-shaping-example">
    <title>A simple shaping example</title>
    <para>
      Below is the simplest HarfBuzz shaping example possible.
    </para>
    <orderedlist numeration="arabic">
      <listitem>
        <para>
          Create a buffer and put your text in it.
        </para>
      </listitem>
    </orderedlist>
    <programlisting language="C">
#include &lt;hb.h&gt;

/* text is a NUL-terminated UTF-8 string (const char *).
 * A length of -1 tells HarfBuzz to use the whole string. */
hb_buffer_t *buf;
buf = hb_buffer_create();
hb_buffer_add_utf8(buf, text, -1, 0, -1);
    </programlisting>
    <orderedlist numeration="arabic">
      <listitem override="2">
        <para>
          Set the script, language, and direction of the buffer.
        </para>
      </listitem>
    </orderedlist>
    <programlisting language="C">
// If you know the direction, script, and language
hb_buffer_set_direction(buf, HB_DIRECTION_LTR);
hb_buffer_set_script(buf, HB_SCRIPT_LATIN);
hb_buffer_set_language(buf, hb_language_from_string("en", -1));

// If you don't know the direction, script, and language
hb_buffer_guess_segment_properties(buf);
    </programlisting>
    <orderedlist numeration="arabic">
      <listitem override="3">
        <para>
          Create a face and a font from a font file.
        </para>
      </listitem>
    </orderedlist>
    <programlisting language="C">
hb_blob_t *blob = hb_blob_create_from_file(filename); /* or hb_blob_create_from_file_or_fail() */
hb_face_t *face = hb_face_create(blob, 0);
hb_font_t *font = hb_font_create(face);
    </programlisting>
    <orderedlist numeration="arabic">
      <listitem override="4">
        <para>
          Shape!
        </para>
      </listitem>
    </orderedlist>
    <programlisting language="C">
hb_shape(font, buf, NULL, 0);
    </programlisting>
    <orderedlist numeration="arabic">
      <listitem override="5">
        <para>
          Get the glyph and position information.
        </para>
      </listitem>
    </orderedlist>
    <programlisting language="C">
unsigned int glyph_count;
hb_glyph_info_t *glyph_info = hb_buffer_get_glyph_infos(buf, &amp;glyph_count);
hb_glyph_position_t *glyph_pos = hb_buffer_get_glyph_positions(buf, &amp;glyph_count);
    </programlisting>
    <orderedlist numeration="arabic">
      <listitem override="6">
        <para>
          Iterate over each glyph.
        </para>
      </listitem>
    </orderedlist>
    <programlisting language="C">
hb_position_t cursor_x = 0;
hb_position_t cursor_y = 0;
for (unsigned int i = 0; i &lt; glyph_count; i++) {
    hb_codepoint_t glyphid  = glyph_info[i].codepoint;
    hb_position_t x_offset  = glyph_pos[i].x_offset;
    hb_position_t y_offset  = glyph_pos[i].y_offset;
    hb_position_t x_advance = glyph_pos[i].x_advance;
    hb_position_t y_advance = glyph_pos[i].y_advance;
    /* draw_glyph(glyphid, cursor_x + x_offset, cursor_y + y_offset); */
    cursor_x += x_advance;
    cursor_y += y_advance;
}
    </programlisting>
    <orderedlist numeration="arabic">
      <listitem override="7">
        <para>
          Tidy up.
        </para>
      </listitem>
    </orderedlist>
    <programlisting language="C">
hb_buffer_destroy(buf);
hb_font_destroy(font);
hb_face_destroy(face);
hb_blob_destroy(blob);
    </programlisting>
    <para>
      This example shows enough to get us started using HarfBuzz. In
      the sections that follow, we will use the remainder of
      HarfBuzz's API to refine and extend the example and improve its
      text-shaping capabilities.
    </para>
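    <para>
      For reference, the seven steps can be assembled into a single
      function, sketched below. The function name
      <function>shape_and_print()</function>, its return convention,
      and the output format are assumptions made for this
      illustration. It uses
      <function>hb_blob_create_from_file_or_fail()</function> so that
      an unreadable font file can be detected, and it guesses the
      segment properties rather than setting them explicitly.
    </para>
    <programlisting language="C">
#include &lt;stdio.h&gt;
#include &lt;hb.h&gt;

/* Hypothetical helper: shape text with the font stored at font_path
 * and print one line per glyph. Returns the number of glyphs, or -1
 * if the font file could not be read. */
static int
shape_and_print(const char *font_path, const char *text)
{
    /* Step 3: create a face and a font from a font file. */
    hb_blob_t *blob = hb_blob_create_from_file_or_fail(font_path);
    if (!blob)
        return -1;
    hb_face_t *face = hb_face_create(blob, 0);
    hb_font_t *font = hb_font_create(face);

    /* Steps 1 and 2: fill the buffer and set its properties. */
    hb_buffer_t *buf = hb_buffer_create();
    hb_buffer_add_utf8(buf, text, -1, 0, -1);
    hb_buffer_guess_segment_properties(buf);

    /* Step 4: shape. */
    hb_shape(font, buf, NULL, 0);

    /* Steps 5 and 6: read back and walk the glyphs. */
    unsigned int glyph_count;
    hb_glyph_info_t *glyph_info = hb_buffer_get_glyph_infos(buf, &amp;glyph_count);
    hb_glyph_position_t *glyph_pos = hb_buffer_get_glyph_positions(buf, &amp;glyph_count);
    for (unsigned int i = 0; i &lt; glyph_count; i++)
        printf("glyph %u cluster %u advance (%d, %d) offset (%d, %d)\n",
               (unsigned int) glyph_info[i].codepoint,
               (unsigned int) glyph_info[i].cluster,
               (int) glyph_pos[i].x_advance, (int) glyph_pos[i].y_advance,
               (int) glyph_pos[i].x_offset, (int) glyph_pos[i].y_offset);

    /* Step 7: tidy up. */
    hb_buffer_destroy(buf);
    hb_font_destroy(font);
    hb_face_destroy(face);
    hb_blob_destroy(blob);
    return (int) glyph_count;
}
    </programlisting>
    <para>
      On systems where HarfBuzz is installed with pkg-config support,
      a program calling this function can typically be built with
      <literal>cc example.c $(pkg-config --cflags --libs harfbuzz)</literal>;
      the file name <literal>example.c</literal> is only a placeholder.
    </para>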
  </section>
</chapter>