| 123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316 |
- <?xml version="1.0"?>
- <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN"
- "http://www.oasis-open.org/docbook/xml/4.3/docbookx.dtd" [
- <!ENTITY % local.common.attrib "xmlns:xi CDATA #FIXED 'http://www.w3.org/2003/XInclude'">
- <!ENTITY version SYSTEM "version.xml">
- ]>
- <chapter id="getting-started">
- <title>Getting started with HarfBuzz</title>
- <section id="an-overview-of-the-harfbuzz-shaping-api">
- <title>An overview of the HarfBuzz shaping API</title>
- <para>
- The core of the HarfBuzz shaping API is the function
- <function>hb_shape()</function>. This function takes a font, a
- buffer containing a string of Unicode codepoints and
- (optionally) a list of font features as its input. It replaces
- the codepoints in the buffer with the corresponding glyphs from
- the font, correctly ordered and positioned, and with any of the
- optional font features applied.
- </para>
- <para>
- In addition to holding the pre-shaping input (the Unicode
- codepoints that comprise the input string) and the post-shaping
- output (the glyphs and positions), a HarfBuzz buffer has several
- properties that affect shaping. The most important are the
- text-flow direction (e.g., left-to-right, right-to-left,
- top-to-bottom, or bottom-to-top), the script tag, and the
- language tag.
- </para>
- <para>
- For input string buffers, flags are available to denote when the
- buffer represents the beginning or end of a paragraph, to
- indicate whether or not to visibly render Unicode <literal>Default
- Ignorable</literal> codepoints, and to modify the cluster-merging
- behavior for the buffer. For shaped output buffers, the
- individual X and Y offsets and <literal>advances</literal>
- (the logical dimensions) of each glyph are
- accessible. HarfBuzz also flags glyphs as
- <literal>UNSAFE_TO_BREAK</literal> if breaking the string at
- that glyph (e.g., in a line-breaking or hyphenation process)
- would require re-shaping the text.
- </para>
-
- <para>
- HarfBuzz also provides methods to compare the contents of
- buffers, join buffers, normalize buffer contents, and handle
- invalid codepoints, as well as to determine the state of a
- buffer (e.g., input codepoints or output glyphs). Buffer
- lifecycles are managed and all buffers are reference-counted.
- </para>
- <para>
- Although the default <function>hb_shape()</function> function is
- sufficient for most use cases, a variant is also provided that
- lets you specify which of HarfBuzz's shapers to use on a buffer.
- </para>
- <para>
- HarfBuzz can read TrueType fonts, TrueType collections, OpenType
- fonts, and OpenType collections. Functions are provided to query
- font objects about metrics, Unicode coverage, available tables and
- features, and variation selectors. Individual glyphs can also be
- queried for metrics, variations, and glyph names. OpenType
- variable fonts are supported, and HarfBuzz allows you to set
- variation-axis coordinates on font objects.
- </para>
-
- <para>
- HarfBuzz provides glue code to integrate with various other
- libraries, including FreeType, GObject, and CoreText. Support
- for integrating with Uniscribe and DirectWrite is experimental
- at present.
- </para>
- </section>
- <section id="terminology">
- <title>Terminology</title>
- <para>
-
- </para>
- <variablelist>
- <?dbfo list-presentation="blocks"?>
- <varlistentry>
- <term>script</term>
- <listitem>
- <para>
- In text shaping, a <emphasis>script</emphasis> is a
- writing system: a set of symbols, rules, and conventions
- that is used to represent a language or multiple
- languages.
- </para>
- <para>
- In general computing lingo, the word "script" can also
- be used to mean an executable program (usually one
- written in a human-readable programming language). For
- the sake of clarity, HarfBuzz documents will always use
- more specific terminology when referring to this
- meaning, such as "Python script" or "shell script." In
- all other instances, "script" refers to a writing system.
- </para>
- <para>
- For developers using HarfBuzz, it is important to note
- the distinction between a script and a language. Most
- scripts are used to write a variety of different
- languages, and many languages may be written in more
- than one script.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>shaper</term>
- <listitem>
- <para>
- In HarfBuzz, a <emphasis>shaper</emphasis> is a
- handler for a specific script-shaping model. HarfBuzz
- implements separate shapers for Indic, Arabic, Thai and
- Lao, Khmer, Myanmar, Tibetan, Hangul, Hebrew, the
- Universal Shaping Engine (USE), and a default shaper for
- scripts with no script-specific shaping model.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>cluster</term>
- <listitem>
- <para>
- In text shaping, a <emphasis>cluster</emphasis> is a
- sequence of codepoints that must be treated as an
- indivisible unit. Clusters can include code-point
- sequences that form a ligature or base-and-mark
- sequences. Tracking and preserving clusters is important
- when shaping operations might separate or reorder
- code points.
- </para>
- <para>
- HarfBuzz provides three cluster
- <emphasis>levels</emphasis> that implement different
- approaches to the problem of preserving clusters during
- shaping operations.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>grapheme</term>
- <listitem>
- <para>
- In linguistics, a <emphasis>grapheme</emphasis> is one
- of the indivisible units that make up a writing system or
- script. Often, graphemes are individual symbols (letters,
- numbers, punctuation marks, logograms, etc.) but,
- depending on the writing system, a particular grapheme
- might correspond to a sequence of several Unicode code
- points.
- </para>
- <para>
- In practice, HarfBuzz and other text-shaping engines
- are not generally concerned with graphemes. However, it
- is important for developers using HarfBuzz to recognize
- that there is a difference between graphemes and shaping
- clusters (see above). The two concepts may overlap
- frequently, but there is no guarantee that they will be
- identical.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>syllable</term>
- <listitem>
- <para>
- In linguistics, a <emphasis>syllable</emphasis> is an
- a sequence of sounds that makes up a building block of a
- particular language. Every language has its own set of
- rules describing what constitutes a valid syllable.
- </para>
- <para>
- For text-shaping purposes, the various definitions of
- "syllable" are important because script-specific shaping
- operations may be applied at the syllable level. For
- example, a reordering rule might specify that a vowel
- mark be reordered to the beginning of the syllable.
- </para>
- <para>
- Syllables will consist of one or more Unicode code
- points. The definition of a syllable for a particular
- writing system might correspond to how HarfBuzz
- identifies clusters (see above) for the same writing
- system. However, it is important for developers using
- HarfBuzz to recognize that there is a difference between
- syllables and shaping clusters. The two concepts may
- overlap frequently, but there is no guarantee that they
- will be identical.
- </para>
- </listitem>
- </varlistentry>
- </variablelist>
-
- </section>
- <section id="a-simple-shaping-example">
- <title>A simple shaping example</title>
- <para>
- Below is the simplest HarfBuzz shaping example possible.
- </para>
- <orderedlist numeration="arabic">
- <listitem>
- <para>
- Create a buffer and put your text in it.
- </para>
- </listitem>
- </orderedlist>
- <programlisting language="C">
- #include <hb.h>
- hb_buffer_t *buf;
- buf = hb_buffer_create();
- hb_buffer_add_utf8(buf, text, -1, 0, -1);
- </programlisting>
- <orderedlist numeration="arabic">
- <listitem override="2">
- <para>
- Set the script, language and direction of the buffer.
- </para>
- </listitem>
- </orderedlist>
- <programlisting language="C">
- // If you know the direction, script, and language
- hb_buffer_set_direction(buf, HB_DIRECTION_LTR);
- hb_buffer_set_script(buf, HB_SCRIPT_LATIN);
- hb_buffer_set_language(buf, hb_language_from_string("en", -1));
- // If you don't know the direction, script, and language
- hb_buffer_guess_segment_properties(buffer);
- </programlisting>
- <orderedlist numeration="arabic">
- <listitem override="3">
- <para>
- Create a face and a font from a font file.
- </para>
- </listitem>
- </orderedlist>
- <programlisting language="C">
- hb_blob_t *blob = hb_blob_create_from_file(filename); /* or hb_blob_create_from_file_or_fail() */
- hb_face_t *face = hb_face_create(blob, 0);
- hb_font_t *font = hb_font_create(face);
- </programlisting>
- <orderedlist numeration="arabic">
- <listitem override="4">
- <para>
- Shape!
- </para>
- </listitem>
- </orderedlist>
- <programlisting>
- hb_shape(font, buf, NULL, 0);
- </programlisting>
- <orderedlist numeration="arabic">
- <listitem override="5">
- <para>
- Get the glyph and position information.
- </para>
- </listitem>
- </orderedlist>
- <programlisting language="C">
- unsigned int glyph_count;
- hb_glyph_info_t *glyph_info = hb_buffer_get_glyph_infos(buf, &glyph_count);
- hb_glyph_position_t *glyph_pos = hb_buffer_get_glyph_positions(buf, &glyph_count);
- </programlisting>
- <orderedlist numeration="arabic">
- <listitem override="6">
- <para>
- Iterate over each glyph.
- </para>
- </listitem>
- </orderedlist>
- <programlisting language="C">
- hb_position_t cursor_x = 0;
- hb_position_t cursor_y = 0;
- for (unsigned int i = 0; i < glyph_count; i++) {
- hb_codepoint_t glyphid = glyph_info[i].codepoint;
- hb_position_t x_offset = glyph_pos[i].x_offset;
- hb_position_t y_offset = glyph_pos[i].y_offset;
- hb_position_t x_advance = glyph_pos[i].x_advance;
- hb_position_t y_advance = glyph_pos[i].y_advance;
- /* draw_glyph(glyphid, cursor_x + x_offset, cursor_y + y_offset); */
- cursor_x += x_advance;
- cursor_y += y_advance;
- }
- </programlisting>
- <orderedlist numeration="arabic">
- <listitem override="7">
- <para>
- Tidy up.
- </para>
- </listitem>
- </orderedlist>
- <programlisting language="C">
- hb_buffer_destroy(buf);
- hb_font_destroy(font);
- hb_face_destroy(face);
- hb_blob_destroy(blob);
- </programlisting>
-
- <para>
- This example shows enough to get us started using HarfBuzz. In
- the sections that follow, we will use the remainder of
- HarfBuzz's API to refine and extend the example and improve its
- text-shaping capabilities.
- </para>
- </section>
- </chapter>
|