| 123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336 |
- <?xml version="1.0"?>
- <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN"
- "http://www.oasis-open.org/docbook/xml/4.3/docbookx.dtd" [
- <!ENTITY % local.common.attrib "xmlns:xi CDATA #FIXED 'http://www.w3.org/2003/XInclude'">
- <!ENTITY version SYSTEM "version.xml">
- ]>
- <chapter id="shaping-and-shape-plans">
- <title>Shaping and shape plans</title>
- <para>
- Once you have your face and font objects configured as desired and
- your input buffer is filled with the characters you need to shape,
- all you need to do is call <function>hb_shape()</function>.
- </para>
- <para>
- HarfBuzz will return the shaped version of the text in the same
- buffer that you provided, but it will be in output mode. At that
- point, you can iterate through the glyphs in the buffer, drawing
- each one at the specified position or handing them off to the
- appropriate graphics library.
- </para>
- <para>
- For the most part, HarfBuzz's shaping step is straightforward from
- the outside. But that doesn't mean there will never be cases where
- you want to look under the hood and see what is happening on the
- inside. HarfBuzz provides facilities for doing that, too.
- </para>
-
- <section id="shaping-buffer-output">
- <title>Shaping and buffer output</title>
- <para>
- The <function>hb_shape()</function> function call takes four arguments: the font
- object to use, the buffer of characters to shape, an array of
- user-specified features to apply, and the length of that feature
- array. The feature array can be NULL, so for the sake of
- simplicity we will start with that case.
- </para>
- <para>
- Internally, HarfBuzz looks at the tables of the font file to
- determine where glyph classes, substitutions, and positioning
- are defined, using that information to decide which
- <emphasis>shaper</emphasis> to use (<literal>ot</literal> for
- OpenType fonts, <literal>aat</literal> for Apple Advanced
- Typography fonts, and so on). It also looks at the direction,
- script, and language properties of the segment to figure out
- which script-specific shaping model is needed (at least, in
- shapers that support multiple options).
- </para>
- <para>
- If a font has a GDEF table, then that is used for
- glyph classes; if not, HarfBuzz will fall back to Unicode
- categorization by code point. If a font has an AAT <literal>morx</literal> table,
- then it is used for substitutions; if not, but there is a GSUB
- table, then the GSUB table is used. If the font has an AAT
- <literal>kerx</literal> table, then it is used for positioning; if not, but
- there is a GPOS table, then the GPOS table is used. If neither
- table is found, but there is a <literal>kern</literal> table, then HarfBuzz will
- use the <literal>kern</literal> table. If there is no <literal>kerx</literal>, no GPOS, and no
- <literal>kern</literal>, HarfBuzz will fall back to positioning marks itself.
- </para>
- <para>
- With a well-behaved OpenType font, you expect GDEF, GSUB, and
- GPOS tables to all be applied. HarfBuzz implements the
- script-specific shaping models in internal functions, rather
- than in the public API.
- </para>
- <para>
- The algorithms
- used for shaping can be quite involved; HarfBuzz tries
- to be compatible with the OpenType Layout specification
- and, wherever there is any ambiguity, HarfBuzz attempts to replicate the
- output of Microsoft's Uniscribe engine, to the extent that is feasible and desirable. See the <ulink
- url="https://docs.microsoft.com/en-us/typography/script-development/standard">Microsoft
- Typography pages</ulink> for more detail.
- </para>
- <para>
- In general, though, all that you need to know is that
- <function>hb_shape()</function> returns the results of shaping
- in the same buffer that you provided. The buffer's content type
- will now be set to
- <literal>HB_BUFFER_CONTENT_TYPE_GLYPHS</literal>, indicating
- that it contains shaped output, rather than input text. You can
- now extract the glyph information and positioning arrays:
- </para>
- <programlisting language="C">
- hb_glyph_info_t *glyph_info = hb_buffer_get_glyph_infos(buf, &glyph_count);
- hb_glyph_position_t *glyph_pos = hb_buffer_get_glyph_positions(buf, &glyph_count);
- </programlisting>
- <para>
- The glyph information array holds a <type>hb_glyph_info_t</type>
- for each output glyph, which has two fields:
- <parameter>codepoint</parameter> and
- <parameter>cluster</parameter>. Whereas, in the input buffer,
- the <parameter>codepoint</parameter> field contained the Unicode
- code point, it now contains the glyph ID of the corresponding
- glyph in the font. The <parameter>cluster</parameter> field is
- an integer that you can use to help identify when shaping has
- reordered, split, or combined code points; we will say more
- about that in the next chapter.
- </para>
- <para>
- The glyph positions array holds a corresponding
- <type>hb_glyph_position_t</type> for each output glyph,
- containing four fields: <parameter>x_advance</parameter>,
- <parameter>y_advance</parameter>,
- <parameter>x_offset</parameter>, and
- <parameter>y_offset</parameter>. The advances tell you how far
- you need to move the drawing point after drawing this glyph,
- depending on whether you are setting horizontal text (in which
- case you will have x advances) or vertical text (for which you
- will have y advances). The x and y offsets tell you where to
- move to start drawing the glyph; usually you will have both and
- x and a y offset, regardless of the text direction.
- </para>
- <para>
- Most of the time, you will rely on a font-rendering library or
- other graphics library to do the actual drawing of glyphs, so
- you will need to iterate through the glyphs in the buffer and
- pass the corresponding values off.
- </para>
- </section>
-
- <section id="shaping-opentype-features">
- <title>OpenType features</title>
- <para>
- OpenType features enable fonts to include smart behavior,
- implemented as "lookup" rules stored in the GSUB and GPOS
- tables. The OpenType specification defines a long list of
- standard features that fonts can use for these behaviors; each
- feature has a four-character reserved name and a well-defined
- semantic meaning.
- </para>
- <para>
- Some OpenType features are defined for the purpose of supporting
- script-specific shaping, and are automatically activated, but
- only when a buffer's script property is set to a script that the
- feature supports.
- </para>
- <para>
- Other features are more generic and can apply to several (or
- any) script, and shaping engines are expected to implement
- them. By default, HarfBuzz activates several of these features
- on every text run. They include <literal>abvm</literal>,
- <literal>blwm</literal>, <literal>ccmp</literal>,
- <literal>locl</literal>, <literal>mark</literal>,
- <literal>mkmk</literal>, and <literal>rlig</literal>.
- </para>
- <para>
- In addition, if the text direction is horizontal, HarfBuzz
- also applies the <literal>calt</literal>,
- <literal>clig</literal>, <literal>curs</literal>,
- <literal>dist</literal>, <literal>kern</literal>,
- <literal>liga</literal> and <literal>rclt</literal>, features.
- </para>
- <para>
- Additionally, when HarfBuzz encounters a fraction slash
- (<literal>U+2044</literal>), it looks backward and forward for decimal
- digits (Unicode General Category = Nd), and enables features
- <literal>numr</literal> on the sequence before the fraction slash,
- <literal>dnom</literal> on the sequence after the fraction slash,
- and <literal>frac</literal> on the whole sequence including the fraction
- slash.
- </para>
- <para>
- Some script-specific shaping models
- (see <xref linkend="opentype-shaping-models" />) disable some of the
- features listed above:
- </para>
- <itemizedlist>
- <listitem>
- <para>
- Hangul: <literal>calt</literal>
- </para>
- </listitem>
- <listitem>
- <para>
- Indic: <literal>liga</literal>
- </para>
- </listitem>
- <listitem>
- <para>
- Khmer: <literal>liga</literal>
- </para>
- </listitem>
- </itemizedlist>
- <para>
- If the text direction is vertical, HarfBuzz applies
- the <literal>vert</literal> feature by default.
- </para>
- <para>
- Still other features are designed to be purely optional and left
- up to the application or the end user to enable or disable as desired.
- </para>
- <para>
- You can adjust the set of features that HarfBuzz applies to a
- buffer by supplying an array of <type>hb_feature_t</type>
- features as the third argument to
- <function>hb_shape()</function>. For a simple case, let's just
- enable the <literal>dlig</literal> feature, which turns on any
- "discretionary" ligatures in the font:
- </para>
- <programlisting language="C">
- hb_feature_t userfeatures[1];
- userfeatures[0].tag = HB_TAG('d','l','i','g');
- userfeatures[0].value = 1;
- userfeatures[0].start = HB_FEATURE_GLOBAL_START;
- userfeatures[0].end = HB_FEATURE_GLOBAL_END;
- </programlisting>
- <para>
- <literal>HB_FEATURE_GLOBAL_END</literal> and
- <literal>HB_FEATURE_GLOBAL_END</literal> are macros we can use
- to indicate that the features will be applied to the entire
- buffer. We could also have used a literal <literal>0</literal>
- for the start and a <literal>-1</literal> to indicate the end of
- the buffer (or have selected other start and end positions, if needed).
- </para>
- <para>
- When we pass the <varname>userfeatures</varname> array to
- <function>hb_shape()</function>, any discretionary ligature
- substitutions from our font that match the text in our buffer
- will get performed:
- </para>
- <programlisting language="C">
- hb_shape(font, buf, userfeatures, num_features);
- </programlisting>
- <para>
- Just like we enabled the <literal>dlig</literal> feature by
- setting its <parameter>value</parameter> to
- <literal>1</literal>, you would disable a feature by setting its
- <parameter>value</parameter> to <literal>0</literal>. Some
- features can take other <parameter>value</parameter> settings;
- be sure you read the full specification of each feature tag to
- understand what it does and how to control it.
- </para>
- </section>
- <section id="shaping-shaper-selection">
- <title>Shaper selection</title>
- <para>
- The basic version of <function>hb_shape()</function> determines
- its shaping strategy based on examining the capabilities of the
- font file. OpenType font tables cause HarfBuzz to try the
- <literal>ot</literal> shaper, while AAT font tables cause HarfBuzz to try the
- <literal>aat</literal> shaper.
- </para>
- <para>
- In the real world, however, a font might include some unusual
- mix of tables, or one of the tables might simply be broken for
- the script you need to shape. So, sometimes, you might not
- want to rely on HarfBuzz's process for deciding what to do, and
- just tell <function>hb_shape()</function> what you want it to try.
- </para>
- <para>
- <function>hb_shape_full()</function> is an alternate shaping
- function that lets you supply a list of shapers for HarfBuzz to
- try, in order, when shaping your buffer. For example, if you
- have determined that HarfBuzz's attempts to work around broken
- tables gives you better results than the AAT shaper itself does,
- you might move the AAT shaper to the end of your list of
- preferences and call <function>hb_shape_full()</function>
- </para>
- <programlisting language="C">
- char *shaperprefs[3] = {"ot", "default", "aat"};
- ...
- hb_shape_full(font, buf, userfeatures, num_features, shaperprefs);
- </programlisting>
- <para>
- to get results you are happier with.
- </para>
- <para>
- You may also want to call
- <function>hb_shape_list_shapers()</function> to get a list of
- the shapers that were built at compile time in your copy of HarfBuzz.
- </para>
- </section>
-
- <section id="shaping-plans-and-caching">
- <title>Plans and caching</title>
- <para>
- Internally, HarfBuzz uses a structure called a shape plan to
- track its decisions about how to shape the contents of a
- buffer. The <function>hb_shape()</function> function builds up the shape plan by
- examining segment properties and by inspecting the contents of
- the font.
- </para>
- <para>
- This process can involve some decision-making and
- trade-offs — for example, HarfBuzz inspects the GSUB and GPOS
- lookups for the script and language tags set on the segment
- properties, but it falls back on the lookups under the
- <literal>DFLT</literal> tag (and sometimes other common tags)
- if there are actually no lookups for the tag requested.
- </para>
- <para>
- HarfBuzz also includes some work-arounds for
- handling well-known older font conventions that do not follow
- OpenType or Unicode specifications, for buggy system fonts, and for
- peculiarities of Microsoft Uniscribe. All of that means that a
- shape plan, while not something that you should edit directly in
- client code, still might be an object that you want to
- inspect. Furthermore, if resources are tight, you might want to
- cache the shape plan that HarfBuzz builds for your buffer and
- font, so that you do not have to rebuild it for every shaping call.
- </para>
- <para>
- You can create a cacheable shape plan with
- <function>hb_shape_plan_create_cached(face, props,
- user_features, num_user_features, shaper_list)</function>, where
- <parameter>face</parameter> is a face object (not a font object,
- notably), <parameter>props</parameter> is an
- <type>hb_segment_properties_t</type>,
- <parameter>user_features</parameter> is an array of
- <type>hb_feature_t</type>s (with length
- <parameter>num_user_features</parameter>), and
- <parameter>shaper_list</parameter> is a list of shapers to try.
- </para>
- <para>
- Shape plans are objects in HarfBuzz, so there are
- reference-counting functions and user-data attachment functions
- you can
- use. <function>hb_shape_plan_reference(shape_plan)</function>
- increases the reference count on a shape plan, while
- <function>hb_shape_plan_destroy(shape_plan)</function> decreases
- the reference count, destroying the shape plan when the last
- reference is dropped.
- </para>
- <para>
- You can attach user data to a shaper (with a key) using the
- <function>hb_shape_plan_set_user_data(shape_plan,key,data,destroy,replace)</function>
- function, optionally supplying a <function>destroy</function>
- callback to use. You can then fetch the user data attached to a
- shape plan with
- <function>hb_shape_plan_get_user_data(shape_plan, key)</function>.
- </para>
- </section>
-
- </chapter>
|