<?xml version="1.0"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN"
"http://www.oasis-open.org/docbook/xml/4.3/docbookx.dtd" [
<!ENTITY % local.common.attrib "xmlns:xi CDATA #FIXED 'http://www.w3.org/2003/XInclude'">
<!ENTITY version SYSTEM "version.xml">
]>
<chapter id="shaping-and-shape-plans">
<title>Shaping and shape plans</title>
<para>
Once you have your face and font objects configured as desired and
your input buffer is filled with the characters you need to shape,
all you need to do is call <function>hb_shape()</function>.
</para>
<para>
HarfBuzz will return the shaped version of the text in the same
buffer that you provided, but it will be in output mode. At that
point, you can iterate through the glyphs in the buffer, drawing
each one at the specified position or handing them off to the
appropriate graphics library.
</para>
<para>
For the most part, HarfBuzz's shaping step is straightforward from
the outside. But that doesn't mean there will never be cases where
you want to look under the hood and see what is happening on the
inside. HarfBuzz provides facilities for doing that, too.
</para>
<section id="shaping-buffer-output">
<title>Shaping and buffer output</title>
<para>
The <function>hb_shape()</function> function call takes four arguments: the font
object to use, the buffer of characters to shape, an array of
user-specified features to apply, and the length of that feature
array. The feature array can be <literal>NULL</literal>, so for the sake of
simplicity we will start with that case.
</para>
<para>
Internally, HarfBuzz looks at the tables of the font file to
determine where glyph classes, substitutions, and positioning
are defined, using that information to decide which
<emphasis>shaper</emphasis> to use (<literal>ot</literal> for
OpenType fonts, <literal>aat</literal> for Apple Advanced
Typography fonts, and so on). It also looks at the direction,
script, and language properties of the segment to figure out
which script-specific shaping model is needed (at least, in
shapers that support multiple options).
</para>
<para>
If a font has a GDEF table, then that is used for
glyph classes; if not, HarfBuzz will fall back to Unicode
categorization by code point. If a font has an AAT <literal>morx</literal> table,
then it is used for substitutions; if not, but there is a GSUB
table, then the GSUB table is used. If the font has an AAT
<literal>kerx</literal> table, then it is used for positioning; if not, but
there is a GPOS table, then the GPOS table is used. If neither
<literal>kerx</literal> nor GPOS is found, but there is a
<literal>kern</literal> table, then HarfBuzz will
use the <literal>kern</literal> table. If there is no <literal>kerx</literal>, no GPOS, and no
<literal>kern</literal>, HarfBuzz will fall back to positioning marks itself.
</para>
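<para>
As a rough illustration of the positioning fallback order described
above, the following sketch checks only whether each table is
present in a face. The helper names
<function>face_has_table()</function> and
<function>positioning_source()</function> are hypothetical, and
HarfBuzz's real decision may take more into account than table
presence, so treat the result as a diagnostic hint rather than as
what the shaper will do.
</para>
<programlisting language="C">
#include &lt;hb.h&gt;

/* Hypothetical helper: nonzero if the face has a non-empty table with this tag. */
static int
face_has_table (hb_face_t *face, hb_tag_t tag)
{
  hb_blob_t *blob = hb_face_reference_table (face, tag);
  unsigned int length = hb_blob_get_length (blob);
  hb_blob_destroy (blob);
  return length &gt; 0;
}

/* Hypothetical helper: names the positioning source suggested by the
 * documented order (kerx, then GPOS, then kern, then fallback),
 * judged by table presence alone. */
static const char *
positioning_source (hb_face_t *face)
{
  if (face_has_table (face, HB_TAG ('k','e','r','x'))) return "kerx";
  if (face_has_table (face, HB_TAG ('G','P','O','S'))) return "GPOS";
  if (face_has_table (face, HB_TAG ('k','e','r','n'))) return "kern";
  return "fallback mark positioning";
}
</programlisting>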
<para>
With a well-behaved OpenType font, you expect GDEF, GSUB, and
GPOS tables to all be applied. HarfBuzz implements the
script-specific shaping models in internal functions, rather
than in the public API.
</para>
<para>
The algorithms used for shaping can be quite involved; HarfBuzz
tries to be compatible with the OpenType Layout specification
and, wherever there is any ambiguity, HarfBuzz attempts to
replicate the output of Microsoft's Uniscribe engine, to the
extent that is feasible and desirable. See the <ulink
url="https://docs.microsoft.com/en-us/typography/script-development/standard">Microsoft
Typography pages</ulink> for more detail.
</para>
<para>
In general, though, all that you need to know is that
<function>hb_shape()</function> returns the results of shaping
in the same buffer that you provided. The buffer's content type
will now be set to
<literal>HB_BUFFER_CONTENT_TYPE_GLYPHS</literal>, indicating
that it contains shaped output, rather than input text. You can
now extract the glyph information and positioning arrays:
</para>
<programlisting language="C">
unsigned int glyph_count;
/* Both calls report the number of output glyphs through glyph_count. */
hb_glyph_info_t *glyph_info = hb_buffer_get_glyph_infos(buf, &amp;glyph_count);
hb_glyph_position_t *glyph_pos = hb_buffer_get_glyph_positions(buf, &amp;glyph_count);
</programlisting>
<para>
The glyph information array holds a <type>hb_glyph_info_t</type>
for each output glyph, in which the two fields you will normally
read are <parameter>codepoint</parameter> and
<parameter>cluster</parameter>. Whereas, in the input buffer,
the <parameter>codepoint</parameter> field contained the Unicode
code point, it now contains the glyph ID of the corresponding
glyph in the font. The <parameter>cluster</parameter> field is
an integer that you can use to help identify when shaping has
reordered, split, or combined code points; we will say more
about that in the next chapter.
</para>
<para>
The glyph positions array holds a corresponding
<type>hb_glyph_position_t</type> for each output glyph,
containing four fields: <parameter>x_advance</parameter>,
<parameter>y_advance</parameter>,
<parameter>x_offset</parameter>, and
<parameter>y_offset</parameter>. The advances tell you how far
you need to move the drawing point after drawing this glyph,
depending on whether you are setting horizontal text (in which
case you will have x advances) or vertical text (for which you
will have y advances). The x and y offsets tell you how far to
shift the glyph from the current drawing point when you draw it,
without affecting the advance; usually you will have both an x
and a y offset, regardless of the text direction.
</para>
<para>
Most of the time, you will rely on a font-rendering library or
other graphics library to do the actual drawing of glyphs, so
you will need to iterate through the glyphs in the buffer and
pass the corresponding values off.
</para>
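<para>
The iteration described above might look like the following
minimal sketch. The helper name
<function>print_glyph_positions()</function> is hypothetical, and
the <function>printf()</function> call merely stands in for
whatever drawing call your graphics library provides.
</para>
<programlisting language="C">
#include &lt;hb.h&gt;
#include &lt;stdio.h&gt;

/* Hypothetical helper: walks a shaped buffer, printing where each glyph
 * would be drawn, and returns the number of glyphs visited. */
static unsigned int
print_glyph_positions (hb_buffer_t *buf)
{
  unsigned int glyph_count;
  hb_glyph_info_t *glyph_info = hb_buffer_get_glyph_infos (buf, &amp;glyph_count);
  hb_glyph_position_t *glyph_pos = hb_buffer_get_glyph_positions (buf, &amp;glyph_count);
  hb_position_t cursor_x = 0, cursor_y = 0;

  for (unsigned int i = 0; i &lt; glyph_count; i++)
  {
    /* Offsets shift this glyph only; they do not move the cursor. */
    hb_position_t draw_x = cursor_x + glyph_pos[i].x_offset;
    hb_position_t draw_y = cursor_y + glyph_pos[i].y_offset;
    printf ("glyph %u (cluster %u) at (%d, %d)\n",
            glyph_info[i].codepoint, glyph_info[i].cluster, draw_x, draw_y);
    /* Advances move the cursor to the origin of the next glyph. */
    cursor_x += glyph_pos[i].x_advance;
    cursor_y += glyph_pos[i].y_advance;
  }
  return glyph_count;
}
</programlisting>
<para>
The positions are expressed in the font's scale units, so a real
renderer would convert them to its own coordinate system before
drawing.
</para>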
</section>
<section id="shaping-opentype-features">
<title>OpenType features</title>
<para>
OpenType features enable fonts to include smart behavior,
implemented as "lookup" rules stored in the GSUB and GPOS
tables. The OpenType specification defines a long list of
standard features that fonts can use for these behaviors; each
feature has a four-character reserved name and a well-defined
semantic meaning.
</para>
<para>
Some OpenType features are defined for the purpose of supporting
script-specific shaping, and are automatically activated, but
only when a buffer's script property is set to a script that the
feature supports.
</para>
<para>
Other features are more generic and can apply to several (or
all) scripts, and shaping engines are expected to implement
them. By default, HarfBuzz activates several of these features
on every text run. They include <literal>abvm</literal>,
<literal>blwm</literal>, <literal>ccmp</literal>,
<literal>locl</literal>, <literal>mark</literal>,
<literal>mkmk</literal>, and <literal>rlig</literal>.
</para>
<para>
In addition, if the text direction is horizontal, HarfBuzz
also applies the <literal>calt</literal>,
<literal>clig</literal>, <literal>curs</literal>,
<literal>dist</literal>, <literal>kern</literal>,
<literal>liga</literal>, and <literal>rclt</literal> features.
</para>
<para>
Additionally, when HarfBuzz encounters a fraction slash
(<literal>U+2044</literal>), it looks backward and forward for decimal
digits (Unicode General Category = Nd), and enables features
<literal>numr</literal> on the sequence before the fraction slash,
<literal>dnom</literal> on the sequence after the fraction slash,
and <literal>frac</literal> on the whole sequence including the fraction
slash.
</para>
<para>
Some script-specific shaping models
(see <xref linkend="opentype-shaping-models" />) disable some of the
features listed above:
</para>
<itemizedlist>
<listitem>
<para>
Hangul: <literal>calt</literal>
</para>
</listitem>
<listitem>
<para>
Indic: <literal>liga</literal>
</para>
</listitem>
<listitem>
<para>
Khmer: <literal>liga</literal>
</para>
</listitem>
</itemizedlist>
<para>
If the text direction is vertical, HarfBuzz applies
the <literal>vert</literal> feature by default.
</para>
<para>
Still other features are designed to be purely optional and left
up to the application or the end user to enable or disable as desired.
</para>
<para>
You can adjust the set of features that HarfBuzz applies to a
buffer by supplying an array of <type>hb_feature_t</type>
features as the third argument to
<function>hb_shape()</function>. For a simple case, let's just
enable the <literal>dlig</literal> feature, which turns on any
"discretionary" ligatures in the font:
</para>
<programlisting language="C">
hb_feature_t userfeatures[1];
unsigned int num_features = 1;
/* Enable discretionary ligatures across the whole buffer. */
userfeatures[0].tag = HB_TAG('d','l','i','g');
userfeatures[0].value = 1;
userfeatures[0].start = HB_FEATURE_GLOBAL_START;
userfeatures[0].end = HB_FEATURE_GLOBAL_END;
</programlisting>
<para>
<literal>HB_FEATURE_GLOBAL_START</literal> and
<literal>HB_FEATURE_GLOBAL_END</literal> are macros we can use
to indicate that the features will be applied to the entire
buffer. We could also have used a literal <literal>0</literal>
for the start and a <literal>-1</literal> to indicate the end of
the buffer (or have selected other start and end positions, if needed).
</para>
<para>
When we pass the <varname>userfeatures</varname> array to
<function>hb_shape()</function>, any discretionary ligature
substitutions from our font that match the text in our buffer
will get performed:
</para>
<programlisting language="C">
hb_shape(font, buf, userfeatures, num_features);
</programlisting>
<para>
Just like we enabled the <literal>dlig</literal> feature by
setting its <parameter>value</parameter> to
<literal>1</literal>, you would disable a feature by setting its
<parameter>value</parameter> to <literal>0</literal>. Some
features can take other <parameter>value</parameter> settings;
be sure you read the full specification of each feature tag to
understand what it does and how to control it.
</para>
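<para>
If you would rather not fill in each <type>hb_feature_t</type>
field by hand, one option is to parse feature strings with
<function>hb_feature_from_string()</function>. The sketch below
assumes a hypothetical helper,
<function>parse_features()</function>, that converts an array of
such strings and skips any that fail to parse.
</para>
<programlisting language="C">
#include &lt;hb.h&gt;

/* Hypothetical helper: parses num_specs feature strings into features[],
 * which must have room for num_specs entries. Strings that fail to parse
 * are skipped. Returns the number of features written. */
static unsigned int
parse_features (const char *const *specs, unsigned int num_specs,
                hb_feature_t *features)
{
  unsigned int count = 0;
  for (unsigned int i = 0; i &lt; num_specs; i++)
    /* A length of -1 means the string is NUL-terminated. */
    if (hb_feature_from_string (specs[i], -1, &amp;features[count]))
      count++;
  return count;
}
</programlisting>
<para>
For example, the strings <literal>"dlig"</literal>,
<literal>"-liga"</literal>, and <literal>"kern[3:5]=0"</literal>
would enable discretionary ligatures, disable standard ligatures,
and set <literal>kern</literal> to <literal>0</literal> with a
start of 3 and an end of 5, respectively.
</para>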
</section>
<section id="shaping-shaper-selection">
<title>Shaper selection</title>
<para>
The basic version of <function>hb_shape()</function> determines
its shaping strategy based on examining the capabilities of the
font file. OpenType font tables cause HarfBuzz to try the
<literal>ot</literal> shaper, while AAT font tables cause HarfBuzz to try the
<literal>aat</literal> shaper.
</para>
<para>
In the real world, however, a font might include some unusual
mix of tables, or one of the tables might simply be broken for
the script you need to shape. So, sometimes, you might not
want to rely on HarfBuzz's process for deciding what to do, and
just tell <function>hb_shape()</function> what you want it to try.
</para>
<para>
<function>hb_shape_full()</function> is an alternate shaping
function that lets you supply a list of shapers for HarfBuzz to
try, in order, when shaping your buffer. For example, if you
have determined that HarfBuzz's attempts to work around broken
tables give you better results than the AAT shaper itself does,
you might move the AAT shaper to the end of your list of
preferences and call <function>hb_shape_full()</function>
to get results you are happier with:
</para>
<programlisting language="C">
/* The shaper list must be NULL-terminated. */
const char *shaperprefs[] = {"ot", "default", "aat", NULL};
...
hb_shape_full(font, buf, userfeatures, num_features, shaperprefs);
</programlisting>
<para>
You may also want to call
<function>hb_shape_list_shapers()</function> to get a list of
the shapers that were built at compile time in your copy of HarfBuzz.
</para>
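<para>
A minimal sketch of that query follows; the helper name
<function>print_available_shapers()</function> is hypothetical.
The names it prints are the ones your build accepts in a shaper
list, so it is worth checking them before hard-coding a
preference order.
</para>
<programlisting language="C">
#include &lt;hb.h&gt;
#include &lt;stdio.h&gt;

/* Hypothetical helper: prints each shaper name compiled into this copy of
 * HarfBuzz and returns how many there are. */
static unsigned int
print_available_shapers (void)
{
  /* The returned array is NULL-terminated and owned by HarfBuzz. */
  const char **shapers = hb_shape_list_shapers ();
  unsigned int count = 0;
  for (; shapers[count]; count++)
    printf ("%s\n", shapers[count]);
  return count;
}
</programlisting>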
</section>
<section id="shaping-plans-and-caching">
<title>Plans and caching</title>
<para>
Internally, HarfBuzz uses a structure called a shape plan to
track its decisions about how to shape the contents of a
buffer. The <function>hb_shape()</function> function builds up the shape plan by
examining segment properties and by inspecting the contents of
the font.
</para>
<para>
This process can involve some decision-making and
trade-offs. For example, HarfBuzz inspects the GSUB and GPOS
lookups for the script and language tags set on the segment
properties, but it falls back on the lookups under the
<literal>DFLT</literal> tag (and sometimes other common tags)
if there are actually no lookups for the tag requested.
</para>
<para>
HarfBuzz also includes some work-arounds for
handling well-known older font conventions that do not follow
OpenType or Unicode specifications, for buggy system fonts, and for
peculiarities of Microsoft Uniscribe. All of that means that a
shape plan, while not something that you should edit directly in
client code, still might be an object that you want to
inspect. Furthermore, if resources are tight, you might want to
cache the shape plan that HarfBuzz builds for your buffer and
font, so that you do not have to rebuild it for every shaping call.
</para>
<para>
You can create a cacheable shape plan with
<function>hb_shape_plan_create_cached(face, props,
user_features, num_user_features, shaper_list)</function>, where
<parameter>face</parameter> is a face object (not a font object,
notably), <parameter>props</parameter> is an
<type>hb_segment_properties_t</type>,
<parameter>user_features</parameter> is an array of
<type>hb_feature_t</type>s (with length
<parameter>num_user_features</parameter>), and
<parameter>shaper_list</parameter> is a list of shapers to try.
</para>
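<para>
As a sketch of how a cached plan might be used, the hypothetical
helper below, <function>shape_with_cached_plan()</function>, takes
the segment properties from the buffer, asks for a cached plan on
the font's face, and runs it with
<function>hb_shape_plan_execute()</function>. It passes
<literal>NULL</literal> as the shaper list and ignores any
font-variation settings, so it is a simplification rather than a
drop-in replacement for <function>hb_shape()</function>.
</para>
<programlisting language="C">
#include &lt;hb.h&gt;

/* Hypothetical helper: shapes buf with a cached shape plan. The buffer's
 * segment properties (direction, script, language) must already be set.
 * Returns the result of hb_shape_plan_execute(). */
static hb_bool_t
shape_with_cached_plan (hb_font_t *font, hb_buffer_t *buf,
                        const hb_feature_t *features, unsigned int num_features)
{
  hb_segment_properties_t props;
  hb_buffer_get_segment_properties (buf, &amp;props);

  /* The plan is keyed on the face, not the font. */
  hb_shape_plan_t *plan = hb_shape_plan_create_cached (hb_font_get_face (font),
                                                       &amp;props,
                                                       features, num_features,
                                                       NULL);
  hb_bool_t ok = hb_shape_plan_execute (plan, font, buf, features, num_features);

  /* Drop our reference when we are done with the plan. */
  hb_shape_plan_destroy (plan);
  return ok;
}
</programlisting>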
<para>
Shape plans are objects in HarfBuzz, so there are
reference-counting functions and user-data attachment functions
you can
use. <function>hb_shape_plan_reference(shape_plan)</function>
increases the reference count on a shape plan, while
<function>hb_shape_plan_destroy(shape_plan)</function> decreases
the reference count, destroying the shape plan when the last
reference is dropped.
</para>
<para>
You can attach user data to a shape plan (with a key) using the
<function>hb_shape_plan_set_user_data(shape_plan, key, data, destroy, replace)</function>
function, optionally supplying a <function>destroy</function>
callback to use. You can then fetch the user data attached to a
shape plan with
<function>hb_shape_plan_get_user_data(shape_plan, key)</function>.
</para>
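<para>
For instance, you might tag each plan with a descriptive label
for debugging. In the sketch below,
<varname>label_key</varname>,
<function>label_shape_plan()</function>, and
<function>get_shape_plan_label()</function> are hypothetical
names. The sketch assumes that a failed
<function>hb_shape_plan_set_user_data()</function> call leaves
ownership of the data with the caller.
</para>
<programlisting language="C">
#include &lt;hb.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;string.h&gt;

/* Only the address of the key matters; its contents are never read here. */
static hb_user_data_key_t label_key;

/* Hypothetical helper: attaches a heap-allocated copy of label to the plan.
 * free() is registered as the destroy callback, so the copy is released
 * with the plan or when the label is replaced. */
static hb_bool_t
label_shape_plan (hb_shape_plan_t *plan, const char *label)
{
  char *copy = malloc (strlen (label) + 1);
  if (!copy)
    return 0;
  strcpy (copy, label);
  if (!hb_shape_plan_set_user_data (plan, &amp;label_key, copy, free, 1))
  {
    /* Assumption: on failure the copy is still ours to free. */
    free (copy);
    return 0;
  }
  return 1;
}

/* Hypothetical helper: returns the attached label, or NULL if none is set. */
static const char *
get_shape_plan_label (hb_shape_plan_t *plan)
{
  return hb_shape_plan_get_user_data (plan, &amp;label_key);
}
</programlisting>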
</section>
</chapter>