+ Note that the default callback will be called for any tag that isn't
+ specifically subscribed upon by the application, and can thus be called
+ in various contexts. For simplicity, the example above doesn't take
+ this into account (the <code>parent</code> could be of different
+ types, depending on the context).<br>
+ <br>
+ Note also that the default callback is not called when the parent context
+ is <code>NULL</code><code></code>. This is e.g. the case if none
+ of the "upper" tags has been subscribed upon.<br>
+
+
+<hr width="100%" size="2">
+
+<h2><a name="Other_API_functions"></a>Other API functions<br>
+ </h2>
+
+ Although the above describes the basic interface of the gedcom parser, there
+ are some other functions that allow to customize the behaviour of the library.
+ These will be explained in the current section.<br>
+
+
+<h3><a name="Debugging"></a>Debugging</h3>
+ The library can generate various debugging output, not only from itself,
+ but also the debugging output generated by the yacc parser. By default,
+ no debugging output is generated, but this can be customized using the
+following function:<br>
+
+
+<blockquote><code>void <b>gedcom_set_debug_level</b> (int level, FILE*
+trace_output)</code><br>
+ </blockquote>
+ The <code>level</code> can be one of the following values:<br>
+
+
+<ul>
+ <li>0: no debugging information (this is the
+ default)</li>
+ <li>1: only debugging information from libgedcom
+ itself</li>
+ <li>2: debugging information from libgedcom
+ and yacc</li>
+
+
+</ul>
+ If the <code>trace_output</code> is <code>NULL</code>, debugging information
+ will be written to <code>stderr</code>, otherwise the given file handle
+ is used (which must be open).<br>
+ <br>
+
+
+<h3><a name="Error_treatment"></a>Error treatment</h3>
+ One of the previous sections already described the callback to be
+registered to get error messages. The library also allows to customize
+what happens on an error, using the following function:<br>
+
+
+<blockquote><code>void <b>gedcom_set_error_handling</b> (Gedcom_err_mech
+ mechanism)</code><br>
+ </blockquote>
+ The <code>mechanism</code> can be one of:<br>
+
+
+<ul>
+ <li><code>IMMED_FAIL</code>: immediately fail
+the parsing on an error (this is the default)</li>
+ <li><code>DEFER_FAIL</code>: continue parsing
+after an error, but return a failure code eventually</li>
+ <li><code>IGNORE_ERRORS</code>: continue parsing
+ after an error, return success always</li>
+
+
+</ul>
+ This doesn't influence the generation of error or warning messages,
+ only the behaviour of the parser and its return code.<br>
+ <br>
+
+
+<h3><a name="Compatibility_mode"></a>Compatibility mode<br>
+ </h3>
+ Applications are not necessarily true to the GEDCOM spec (or use a
+different version than 5.5). The intention is that the library is
+resilient to this, and goes in compatibility mode for files written by specific
+programs (detected via the HEAD.SOUR tag). This compatibility mode
+can be enabled and disabled via the following function:<br>
+
+
+<blockquote><code>void <b>gedcom_set_compat_handling</b> (int enable_compat)</code><br>
+ </blockquote>
+ The argument can be:<br>
+
+
+<ul>
+ <li>0: disable compatibility mode</li>
+ <li>1: allow compatibility mode (this is the
+default)<br>
+ </li>
+
+
+</ul>
+ Currently, there is a beginning for compatibility for ftree and Lifelines (3.0.2).<br>
+
+<hr width="100%" size="2">
+<h2><a name="Converting_character_sets"></a>Converting character sets</h2>
+ All strings passed by the GEDCOM parser to the application are in UTF-8
+ encoding. Typically, an application needs to convert this to something
+ else to be able to display it.<br>
+ <br>
+ The most common case is that the output character set is controlled by
+the <code>locale</code> mechanism (i.e. via the <code>LANG</code>, <code>
+ LC_ALL</code> or <code>LC_CTYPE</code> environment variables), which also
+controls the <code>gettext</code> mechanism in the application. <br>
+ <br>
+ <br>
+
+ The source distribution of <code>
+gedcom-parse</code> contains an a library implementing help functions for UTF-8 encoding (<code></code>see
+the "utf8" subdirectory of the top directory). Feel free to use
+ it in your source code. It isn't installed anywhere, so you need
+to take over the source and header files in your application. Note that on
+some systems it uses libcharset, which is also included in this subdirectory.
+ <br>
+ <br>
+ Its interface contains first of all the following two help functions:<br>
+
+<blockquote>
+ <pre><code>int <b>is_utf8_string</b> (char *input);<br>int <b>utf8_strlen</b> (char *input);<br></code></pre></blockquote>The
+first one returns 1 if the given input is a valid UTF-8 string, it returns
+0 otherwise, the second gives the number of UTF-8 characters in the given
+input. Note that the second function assumes that the input is valid
+UTF-8, and gives unpredictable results if it isn't.<br>
+<br>
+For conversion, the following functions are available:<br>
+<blockquote>
+ <pre><code></code><code>char *<b>convert_utf8_to_locale</b> (char *input, int *conv_failures);<br>char *<b>convert_locale_to_utf8</b> (char *input);<br></code></pre>
+</blockquote>
+<blockquote>
+ </blockquote>
+
+ Both functions return a pointer to a static buffer that is overwritten
+ on each call. To function properly, the application must first set
+the locale using the <code>setlocale</code> function (the second step detailed
+ below). All other steps given below, including setting up and closing
+ down the conversion handles, are transparantly handled by the two functions.
+ <br>
+ <br>
+ If you pass a pointer to an integer to the first function, it will be
+set to the number of conversion failures, i.e. characters that couldn't
+be converted; you can also just pass <code>NULL</code> if you are not interested
+(note that usually, the interesting information is just whether there <i>
+were</i> conversion failures or not, which is then given by the integer
+being bigger than zero or not). The second function doesn't need this,
+because any locale can be converted to UTF-8.<br>
+ <br>
+ You can change the "?" that is output for characters that can't be converted
+ to any string you want, using the following function before the conversion
+ calls:<br>
+
+<blockquote>
+ <pre><code>void <b>convert_set_unknown</b> (const char *unknown);</code></pre>
+ </blockquote>
+ <br>
+ If you want to have your own functions for it instead of this example
+implementation, the following steps need to be taken by the application
+(more detailed info can be found in the info file of the GNU libc library
+in the "Generic Charset Conversion" section under "Character Set Handling"
+or online <a href="http://www.gnu.org/manual/glibc-2.2.3/html_chapter/libc_6.html#SEC99">
+ here</a>):<br>
+
+<ul>
+ <li>inclusion of some headers:</li>
+
+</ul>
+
+<blockquote>
+ <blockquote>
+ <pre><code>#include <locale.h> /* for setlocale */<br>#include <langinfo.h> /* for nl_langinfo */<br>#include <iconv.h> /* for iconv_* functions */<br></code></pre>
+ </blockquote>
+ </blockquote>
+
+<ul>
+ <li>set the program's current locale to what
+the user configured in the environment:</li>
+
+</ul>
+
+<blockquote>
+ <blockquote>
+ <pre><code>setlocale(LC_ALL, "");</code><br></pre>
+ </blockquote>
+ </blockquote>
+
+<ul>
+ <li>open a conversion handle for conversion
+ from UTF-8 to the character set of the current locale (once for the entire
+ program):</li>
+
+</ul>
+
+<blockquote>
+ <blockquote>
+ <pre><code>iconv_t iconv_handle;<br>...<br>iconv_handle = iconv_open(nl_langinfo(CODESET), "UTF-8");</code><br>if (iconv_handle == (iconv_t) -1)<br> /* signal an error */<br></pre>
+ </blockquote>
+ </blockquote>
+
+<ul>
+ <li>then, every string can be converted
+ using the following:</li>