<html>
<head>
<title>Using the GEDCOM parser library</title>
-
+
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
</head>
<body>
-
+
<h1 align="center">Using the GEDCOM parser library</h1>
- <br>
-
+ <br>
+
<h2>Index</h2>
-
+
<ul>
- <li><a href="#anchor">Overview</a></li>
- <li><a href="#Error_handling">Error handling</a></li>
- <li><a href="#Data_callback_mechanism">Data callback mechanism</a></li>
-
+ <li><a href="#Overview">Overview</a></li>
+ <li><a href="#Error_handling">Error handling</a></li>
+ <li><a href="#Data_callback_mechanism">Data callback mechanism</a></li>
+
<ul>
- <li><a href="#Start_and_end_callbacks">Start and end callbacks</a></li>
- <li><a href="#Default_callbacks">Default callbacks</a></li>
-
+ <li><a href="#Start_and_end_callbacks">Start and end callbacks</a></li>
+ <li><a href="#Default_callbacks">Default callbacks</a></li>
+
</ul>
- <li><a href="#Other_API_functions">Other API functions</a></li>
-
+ <li><a href="#Other_API_functions">Other API functions</a></li>
+
<ul>
- <li><a href="#Debugging">Debugging</a></li>
- <li><a href="#Error_treatment">Error treatment</a></li>
- <li><a href="#Compatibility_mode">Compatibility mode</a></li>
-
+ <li><a href="#Debugging">Debugging</a></li>
+ <li><a href="#Error_treatment">Error treatment</a></li>
+ <li><a href="#Compatibility_mode">Compatibility mode</a></li>
+
</ul>
- <li><a href="interface.html">Interface details</a><br>
- </li>
-
+ <li><a href="interface.html">Interface details</a><br>
+ </li>
+
</ul>
-
-<hr width="100%" size="2">
-<h2><a name="Overview"></a>Overview<br>
- </h2>
- The GEDCOM parser library is built as a callback-based parser (comparable
- to the SAX interface of XML). It comes with:<br>
+<hr width="100%" size="2">
+<h2><a name="Overview"></a>Overview<br>
+ </h2>
+ The GEDCOM parser library is built as a callback-based parser (comparable
+ to the SAX interface of XML). It comes with:<br>
+
<ul>
- <li>a library (<code>libgedcom.so</code>), to be linked in the application
- program</li>
- <li>a header file (<code>gedcom.h</code>), to be used in the sources
-of the application program</li>
-
+ <li>a library (<code>libgedcom.so</code>), to be linked in the application
+ program</li>
+ <li>a header file (<code>gedcom.h</code>), to be used in the sources
+ of the application program</li>
+ <li>a header file (<code>gedcom-tags.h</code>) that is also installed,
+but that is automatically included via <code>gedcom.h</code><br>
+ </li>
+
</ul>
- Next to these, there is also a data directory in <code>$PREFIX/share/gedcom-parse</code>
- that contains some additional stuff, but which is not immediately important
- at first. I'll leave the description of the data directory for later.<br>
- <br>
- The very simplest call of the gedcom parser is simply the following piece
- of code (include of the gedcom header is assumed, as everywhere in this manual):<br>
-
-<blockquote><code>int result;<br>
- ...<br>
- result = <b>gedcom_parse_file</b>("myfamily.ged");<br>
- </code> </blockquote>
- Although this will not provide much information, one thing it does is
-parse the entire file and return the result. The function returns
-0 on success and 1 on failure. No other information is available using
-this function only.<br>
+ Next to these, there is also a data directory in <code>$PREFIX/share/gedcom-parse</code>
+ that contains some additional files, which are not needed to get started.
+ I'll leave the description of the data directory for later.<br>
<br>
- The next sections will refine this to be able to have meaningful errors
-and the actual data that is in the file.<br>
+ The simplest call of the gedcom parser is the following piece
+ of code (inclusion of the gedcom header is assumed, as everywhere in this
+manual):<br>
- <hr width="100%" size="2">
+<blockquote><code>int result;<br>
+ ...<br>
+ result = <b>gedcom_parse_file</b>("myfamily.ged");<br>
+ </code> </blockquote>
+ Although this will not provide much information, one thing it does is
+parse the entire file and return the result. The function returns 0
+on success and 1 on failure. No other information is available using
+this function only.<br>
+ <br>
+ The next sections will refine this to be able to have meaningful errors
+ and the actual data that is in the file.<br>
+
+ <hr width="100%" size="2">
<h2><a name="Error_handling"></a>Error handling</h2>
- Since this is a relatively simple topic, it is discussed before the actual
+ Since this is a relatively simple topic, it is discussed before the actual
callback mechanism, although it also uses a callback...<br>
- <br>
- The library can be used in several different circumstances, both terminal-based
- as GUI-based. Therefore, it leaves the actual display of the error
-message up to the application. For this, the application needs to register
-a callback before parsing the GEDCOM file, which will be called by the library
+ <br>
+ The library can be used in several different circumstances, both terminal-based
+ and GUI-based. Therefore, it leaves the actual display of the error
+message up to the application. For this, the application needs to register
+a callback before parsing the GEDCOM file, which will be called by the library
on errors, warnings and messages.<br>
- <br>
- A typical piece of code would be:<br>
-
- <blockquote><code>void <b>my_message_handler</b> (Gedcom_msg_type type,
+ <br>
+ A typical piece of code would be:<br>
+
+ <blockquote><code>void <b>my_message_handler</b> (Gedcom_msg_type type,
char *msg)<br>
- {<br>
- ...<br>
- }<br>
- ...<br>
- <b>gedcom_set_message_handler</b>(my_message_handler);<br>
- ...<br>
- result = <b>gedcom_parse_file</b>("myfamily.ged");</code><br>
- </blockquote>
- In the above piece of code, <code>my_message_handler</code> is the callback
+ {<br>
+ ...<br>
+ }<br>
+ ...<br>
+ <b>gedcom_set_message_handler</b>(my_message_handler);<br>
+ ...<br>
+ result = <b>gedcom_parse_file</b>("myfamily.ged");</code><br>
+ </blockquote>
+ In the above piece of code, <code>my_message_handler</code> is the callback
that will be called for errors (<code>type=ERROR</code>), warnings (<code>
- type=WARNING</code>) and messages (<code>type=MESSAGE</code>). The
-callback must have the signature as in the example. For errors, the
+ type=WARNING</code>) and messages (<code>type=MESSAGE</code>). The
+callback must have the signature as in the example. For errors, the
<code> msg</code> passed to the callback will have the format:<br>
-
+
<blockquote><code>Error on line</code> <i>&lt;lineno&gt;</i>: <i>&lt;actual_message&gt;</i><br>
- </blockquote>
- Note that the entire string will be properly internationalized, and encoded
- in UTF-8 (see "Why UTF-8?" <i>LINK TBD</i>). Also, no newline
- is appended, so that the application program can use it in any way it wants.
- Warnings are similar, but use "Warning" instead of "Error". Messages
+ </blockquote>
+ Note that the entire string will be properly internationalized, and encoded
+ in UTF-8 (see "Why UTF-8?" <i>LINK TBD</i>). Also, no newline
+ is appended, so that the application program can use it in any way it wants.
+ Warnings are similar, but use "Warning" instead of "Error". Messages
are plain text, without any prefix.<br>
- <br>
- With this in place, the resulting code will already show errors and warnings
+ <br>
+ With this in place, the resulting code will already show errors and warnings
produced by the parser, e.g. on the terminal if a simple <code>printf</code>
- is used in the message handler.<br>
-
- <hr width="100%" size="2">
+ is used in the message handler.<br>
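As an illustration of the handler described above, here is a minimal sketch. So that it compiles without the library installed, it declares its own stand-in for <code>Gedcom_msg_type</code> (the order of the enum values is an assumption; a real program takes the type from <code>gedcom.h</code>). The counters <code>error_count</code> and <code>warning_count</code> are hypothetical application variables, not part of the library.

```c
#include <stdio.h>

/* Stand-in for the type from gedcom.h, so that this sketch compiles on
   its own.  The order of the values is an assumption; a real program
   includes gedcom.h and drops this typedef. */
typedef enum { ERROR, WARNING, MESSAGE } Gedcom_msg_type;

/* Hypothetical application counters, not part of the library. */
static int error_count = 0;
static int warning_count = 0;

/* Message handler with the signature shown in the text above. */
void my_message_handler(Gedcom_msg_type type, char *msg)
{
  if (type == ERROR)
    error_count++;
  else if (type == WARNING)
    warning_count++;

  /* The library appends no newline, so one is added here.  Errors and
     warnings go to stderr, plain messages to stdout. */
  fprintf(type == MESSAGE ? stdout : stderr, "%s\n", msg);
}
```

A GUI application could keep the same signature and replace the <code>fprintf</code> call by whatever displays the text in a dialog or status bar.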
+
+ <hr width="100%" size="2">
<h2><a name="Data_callback_mechanism"></a>Data callback mechanism</h2>
- The most important use of the parser is of course to get the data out of
- the GEDCOM file. As already mentioned, the parser uses a callback
+ The most important use of the parser is of course to get the data out
+of the GEDCOM file. As already mentioned, the parser uses a callback
mechanism for that. In fact, the mechanism involves two levels.<br>
- <br>
- The primary level is that each of the sections in a GEDCOM file is notified
- to the application code via a "start element" callback and an "end element"
- callback (much like in a SAX interface for XML), i.e. when a line containing
- a certain tag is parsed, the "start element" callback is called for that
-tag, and when all its subordinate lines with their tags have been processed,
-the "end element" callback is called for the original tag. Since GEDCOM
- is hierarchical, this results in properly nested calls to appropriate "start
+ <br>
+ The primary level is that each of the sections in a GEDCOM file is notified
+ to the application code via a "start element" callback and an "end element"
+ callback (much like in a SAX interface for XML), i.e. when a line containing
+ a certain tag is parsed, the "start element" callback is called for that
+tag, and when all its subordinate lines with their tags have been processed,
+the "end element" callback is called for the original tag. Since GEDCOM
+ is hierarchical, this results in properly nested calls to appropriate "start
element" and "end element" callbacks.<br>
- <br>
- However, it would be typical for a genealogy program to support only a
-subset of the GEDCOM standard, certainly a program that is still under development.
- Moreover, under GEDCOM it is allowed for an application to define
-its own tags, which will typically not be supported by another application.
- Still, in that case, data preservation is important; it would hardly
- be accepted that information that is not understood by a certain program
+ <br>
+ However, it would be typical for a genealogy program to support only a
+subset of the GEDCOM standard, certainly a program that is still under development.
+ Moreover, under GEDCOM it is allowed for an application to define its
+ own tags, which will typically not be supported by another application.
+ Still, in that case, data preservation is important; it would hardly
+ be accepted that information that is not understood by a certain program
is just removed.<br>
- <br>
- Therefore, the second level of callbacks involves a "default callback".
- An application needs to subscribe to callbacks for tags it does support,
-and need to provide a "default callback" which will be called for tags it
-doesn't support. The application can then choose to just store the information
-that comes via the default callback in plain textual format.<br>
- <br>
- After this introduction, let's see what the API looks like...<br>
- <br>
-
+ <br>
+ Therefore, the second level of callbacks involves a "default callback".
+ An application needs to subscribe to callbacks for tags it does support,
+ and needs to provide a "default callback" which will be called for tags it
+ doesn't support. The application can then choose to just store the
+information that comes via the default callback in plain textual format.<br>
+ <br>
+ After this introduction, let's see what the API looks like...<br>
+ <br>
+
<h3><a name="Start_and_end_callbacks"></a>Start and end callbacks</h3>
-
+
<h4><i>Callbacks for records</i> <br>
- </h4>
- As a simple example, we will get some information from the header of a
+ </h4>
+ As a simple example, we will get some information from the header of a
GEDCOM file. First, have a look at the following piece of code:<br>
-
- <blockquote><code>Gedcom_ctxt <b>my_header_start_cb</b> (int level,
- Gedcom_val xref, char *tag)<br>
- {<br>
- printf("The header starts\n");<br>
- return (Gedcom_ctxt)1;<br>
- }<br>
- <br>
- void <b>my_header_end_cb</b> (Gedcom_ctxt self)<br>
- {<br>
- printf("The header ends, context is %d\n", self); /* context
+
+ <blockquote><code>Gedcom_ctxt <b>my_header_start_cb</b> (int level,
+ Gedcom_val xref, char *tag, int parsed_tag)<br>
+ {<br>
+ printf("The header starts\n");<br>
+ return (Gedcom_ctxt)1;<br>
+ }<br>
+ <br>
+ void <b>my_header_end_cb</b> (Gedcom_ctxt self)<br>
+ {<br>
+ printf("The header ends, context is %ld\n", (long)self); /* context
will print as "1" */<br>
- }<br>
- <br>
- ...<br>
- <b>gedcom_subscribe_to_record</b>(REC_HEAD, my_header_start_cb,
-my_header_end_cb);<br>
- ...<br>
- result = <b>gedcom_parse_file</b>("myfamily.ged");</code><br>
- </blockquote>
- Using the <code>gedcom_subscribe_to_record</code> function, the application
- requests to use the specified callbacks as start and end callback. The end
- callback is optional: you can pass <code>NULL</code> if you are not interested
- in the end callback. The identifiers to use as first argument to the
+ }<br>
+ <br>
+ ...<br>
+ <b>gedcom_subscribe_to_record</b>(REC_HEAD, my_header_start_cb,
+ my_header_end_cb);<br>
+ ...<br>
+ result = <b>gedcom_parse_file</b>("myfamily.ged");</code><br>
+ </blockquote>
+ Using the <code>gedcom_subscribe_to_record</code> function, the application
+ requests to use the specified callbacks as start and end callback. The end
+ callback is optional: you can pass <code>NULL</code> if you are not interested
+ in the end callback. The identifiers to use as first argument to the
function (here <code>REC_HEAD</code>) are described in the <a href="interface.html#Record_identifiers">
- interface details</a>.<br>
- <br>
- From the name of the function it becomes clear that this function is specific
- to complete records. For the separate elements in records there is
-another function, which we'll see shortly. Again, the callbacks need
+ interface details</a>.<br>
+ <br>
+ From the name of the function it becomes clear that this function is specific
+ to complete records. For the separate elements in records there is
+another function, which we'll see shortly. Again, the callbacks need
to have the signatures as shown in the example.<br>
- <br>
- The <code>Gedcom_ctxt</code> type that is used as a result of the start
-callback and as an argument to the end callback is vital for passing context
-necessary for the application. This type is meant to be opaque; in fact,
-it's a void pointer, so you can pass anything via it. The important
-thing to know is that the context that the application returns in the start
-callback will be passed in the end callback as an argument, and as we will
-see shortly, also to all the directly subordinate elements of the record.<br>
- <br>
- The example passes a simple integer as context, but an application could
- e.g. pass a <code>struct</code> that will contain the information for the
- header. In the end callback, the application could then e.g. do some
+ <br>
+ The <code>Gedcom_ctxt</code> type that is used as a result of the start
+ callback and as an argument to the end callback is vital for passing context
+ necessary for the application. This type is meant to be opaque; in
+fact, it's a void pointer, so you can pass anything via it. The important
+ thing to know is that the context that the application returns in the start
+ callback will be passed in the end callback as an argument, and as we will
+ see shortly, also to all the directly subordinate elements of the record.<br>
+ <br>
+The <code>tag</code> is the GEDCOM tag in string format; the <code>parsed_tag</code>
+ is an integer, for which symbolic values are defined as <code>TAG_HEAD</code>,
+ <code>TAG_SOUR</code>, <code>TAG_DATA</code>, ... and <code>USERTAG</code>
+for the application-specific tags. These values are defined in the
+header <code>gedcom-tags.h</code> that is installed, and included via <code>
+gedcom.h</code> (so no need to include <code>gedcom-tags.h</code> yourself).<br>
+ <br>
+ The example passes a simple integer as context, but an application could
+ e.g. pass a <code>struct</code> that will contain the information for the
+ header. In the end callback, the application could then e.g. do some
finalizing operations on the <code>struct</code> to put it in its database.<br>
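The idea of passing a <code>struct</code> as context can be sketched as follows. This is only an illustration under stated assumptions: the two typedefs are stand-ins so that the sketch compiles without the library (the text above says <code>Gedcom_ctxt</code> is a void pointer; treating <code>Gedcom_val</code> as one too is an assumption made only here), and <code>struct header</code> and <code>stored_header</code> are hypothetical application names.

```c
#include <stdlib.h>

/* Stand-ins so that the sketch compiles without the library; gedcom.h
   provides the real types.  Gedcom_ctxt is a void pointer according to
   the text; the definition of Gedcom_val is an assumption. */
typedef void *Gedcom_ctxt;
typedef void *Gedcom_val;

/* Hypothetical application structure for the header record. */
struct header {
  const char *source;
  int finished;
};

/* Hypothetical slot standing in for the application's database. */
static struct header *stored_header = NULL;

/* Start callback: allocate the object and hand it to the parser as
   context. */
Gedcom_ctxt my_header_start_cb(int level, Gedcom_val xref, char *tag,
                               int parsed_tag)
{
  struct header *head = calloc(1, sizeof *head);
  (void)level; (void)xref; (void)tag; (void)parsed_tag;
  return (Gedcom_ctxt)head;   /* NULL if the allocation failed */
}

/* End callback: receives the same pointer that the start callback
   returned, finalizes the object and stores it. */
void my_header_end_cb(Gedcom_ctxt self)
{
  struct header *head = (struct header *)self;
  if (head == NULL)
    return;
  head->finished = 1;
  stored_header = head;
}
```

The cast in the end callback is the only place where the application needs to know the real type behind the opaque context.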
- <br>
- (Note that the <code>Gedcom_val</code> type for the <code>xref</code> argument
- was not discussed, see further for this)<br>
- <br>
-
+ <br>
+ (Note that the <code>Gedcom_val</code> type for the <code>xref</code>
+argument has not been discussed yet; it is covered further below.)<br>
+ <br>
+
<h4><i>Callbacks for elements</i></h4>
- We will now retrieve the SOUR field (the name of the program that wrote
-the file) from the header:<br>
-
- <blockquote><code>Gedcom_ctxt <b>my_header_source_start_cb</b>(Gedcom_ctxt
+ We will now retrieve the SOUR field (the name of the program that wrote
+ the file) from the header:<br>
+
+ <blockquote><code>Gedcom_ctxt <b>my_header_source_start_cb</b>(Gedcom_ctxt
parent,<br>
-
- int
- level,<br>
-
- char*
- tag,<br>
-
- char*
- raw_value,<br>
-
- Gedcom_val parsed_value)<br>
- {<br>
- char *source = GEDCOM_STRING(parsed_value);<br>
- printf("This file was written by %s\n", source);<br>
- return parent;<br>
- }<br>
- <br>
- void <b>my_header_source_end_cb</b>(Gedcom_ctxt parent,<br>
-
- Gedcom_ctxt self,<br>
-
- Gedcom_val parsed_value)<br>
- {<br>
- printf("End of the source description\n");<br>
- }<br>
- <br>
- ...<br>
- <b>gedcom_subscribe_to_element</b>(ELT_HEAD_SOUR,<br>
-
- my_header_source_start_cb,<br>
-
- my_header_source_end_cb);<br>
- ...<br>
- result = <b>gedcom_parse_file</b>("myfamily.ged");</code><br>
- </blockquote>
- The subscription mechanism for elements is similar, only the signatures
-of the callbacks differ. The signature for the start callback shows
-that the context of the parent line (e.g. the <code>struct</code> that describes
- the header) is passed to this start callback. The callback itself
-returns here the same context, but this can be its own context object of
-course. The end callback is called with both the context of the parent
-and the context of itself, which will be the same in the example. Again,
-the list of identifiers to use as a first argument for the subscription function
-are detailed in the <a href="interface.html#Element_identifiers">interface
+
+ int
+ level,<br>
+
+ char*
+ tag,<br>
+
+ char*
+ raw_value,<br>
+
+ int
+ parsed_tag,<br>
+
+ Gedcom_val
+ parsed_value)<br>
+ {<br>
+ char *source = GEDCOM_STRING(parsed_value);<br>
+ printf("This file was written by %s\n", source);<br>
+ return parent;<br>
+ }<br>
+ <br>
+ void <b>my_header_source_end_cb</b>(Gedcom_ctxt parent,<br>
+
+ Gedcom_ctxt self,<br>
+
+ Gedcom_val parsed_value)<br>
+ {<br>
+ printf("End of the source description\n");<br>
+ }<br>
+ <br>
+ ...<br>
+ <b>gedcom_subscribe_to_element</b>(ELT_HEAD_SOUR,<br>
+
+ my_header_source_start_cb,<br>
+
+ my_header_source_end_cb);<br>
+ ...<br>
+ result = <b>gedcom_parse_file</b>("myfamily.ged");</code><br>
+ </blockquote>
+ The subscription mechanism for elements is similar, only the signatures
+ of the callbacks differ. The signature for the start callback shows
+ that the context of the parent line (e.g. the <code>struct</code> that describes
+ the header) is passed to this start callback. The callback itself returns
+ here the same context, but this can be its own context object of course.
+ The end callback is called with both the context of the parent and
+the context of itself, which will be the same in the example. Again,
+the list of identifiers to use as a first argument for the subscription function
+is detailed in the <a href="interface.html#Element_identifiers">interface
details</a>.<br>
- <br>
- If we look at the other arguments of the start callback, we see the level
- number (the initial number of the line in the GEDCOM file), the tag (e.g.
- "SOUR"), and then a raw value and a parsed value. The raw value is
-just the raw string that occurs as value on the line next to the tag (in
-UTF-8 encoding). The parsed value is the meaningful value that is parsed
-from that raw string.<br>
- <br>
- The <code>Gedcom_val</code> type is meant to be an opaque type. The
- only thing that needs to be known about it is that it can contain specific
- data types, which have to be retrieved from it using pre-defined macros.
+ <br>
+ If we look at the other arguments of the start callback, we see the level
+ number (the initial number of the line in the GEDCOM file), the tag (e.g.
+ "SOUR"), and then a raw value, a parsed tag and a parsed value. The
+raw value is just the raw string that occurs as value on the line next to
+the tag (in UTF-8 encoding). The parsed value is the meaningful value
+that is parsed from that raw string. The parsed tag is described in
+the section for record callbacks.<br>
+ <br>
+ The <code>Gedcom_val</code> type is meant to be an opaque type. The
+ only thing that needs to be known about it is that it can contain specific
+ data types, which have to be retrieved from it using pre-defined macros.
These data types are described in the <a href="interface.html#Gedcom_val_types">
- interface details</a>. <br>
- <br>
- Some extra notes:<br>
-
+ interface details</a>. <br>
+ <br>
+ Some extra notes:<br>
+
<ul>
- <li>The <code>Gedcom_val</code> argument of the end callback
+ <li>The <code>Gedcom_val</code> argument of the end callback
is currently not used. It is there for future enhancements.</li>
- <li>There is also a <code>Gedcom_val</code> argument in the
-start callback for records. This argument is currently a string value
-giving the pointer in string form.</li>
-
+ <li>There is also a <code>Gedcom_val</code> argument in the
+ start callback for records. This argument is currently a string value
+ giving the pointer in string form.</li>
+
</ul>
-
+
<h3><a name="Default_callbacks"></a>Default callbacks<br>
- </h3>
- As described above, an application doesn't always implement the entire
-GEDCOM spec, and application-specific tags may have been added by other applications.
- To preserve this extra data anyway, a default callback can be registered
-by the application, as in the following example:<br>
-
- <blockquote><code>void <b>my_default_cb</b> (Gedcom_ctxt parent,
-int level, char* tag, char* raw_value)<br>
- {<br>
- ...<br>
- }<br>
- <br>
- ...<br>
- <b>gedcom_set_default_callback</b>(my_default_cb);<br>
- ...<br>
- result = <b>gedcom_parse_file</b>("myfamily.ged");</code><br>
- </blockquote>
- This callback has a similar signature as the previous ones, but
-it doesn't contain a parsed value. However, it does contain the parent
-context, that was returned by the application for the most specific containing
-tag that the application supported.<br>
- <br>
- Suppose e.g. that this callback is called for some tags in the header that
-are specific to some other application, then our application could make sure
-that the parent context contains the struct or object that represents the
-header, and use the default callback here to add the level, tag and raw_value
-as plain text in a member of that struct or object, thus preserving the information.
- The application can then write this out when the data is saved again
-in a GEDCOM file. To make it more specific, consider the following example:<br>
-
+ </h3>
+ As described above, an application doesn't always implement the entire
+GEDCOM spec, and application-specific tags may have been added by other applications.
+ To preserve this extra data anyway, a default callback can be registered
+ by the application, as in the following example:<br>
+
+ <blockquote><code>void <b>my_default_cb</b> (Gedcom_ctxt parent,
+ int level, char* tag, char* raw_value, int parsed_tag)<br>
+ {<br>
+ ...<br>
+ }<br>
+ <br>
+ ...<br>
+ <b>gedcom_set_default_callback</b>(my_default_cb);<br>
+ ...<br>
+ result = <b>gedcom_parse_file</b>("myfamily.ged");</code><br>
+ </blockquote>
+ This callback has a signature similar to the previous ones,
+but it doesn't contain a parsed value. However, it does contain the
+parent context, which was returned by the application for the most specific
+containing tag that the application supported.<br>
+ <br>
+ Suppose e.g. that this callback is called for some tags in the header that
+ are specific to some other application, then our application could make
+sure that the parent context contains the struct or object that represents
+the header, and use the default callback here to add the level, tag and
+raw_value as plain text in a member of that struct or object, thus preserving
+the information. The application can then write this out when the
+data is saved again in a GEDCOM file. To make it more specific, consider
+the following example:<br>
+
<blockquote><code>struct header {<br>
- char* source;<br>
- ...<br>
- char* extra_text;<br>
- };<br>
- <br>
- Gedcom_ctxt my_header_start_cb(int level, Gedcom_val xref, char* tag)<br>
- {<br>
- struct header head = my_make_header_struct();<br>
- return (Gedcom_ctxt)head;<br>
- }<br>
- <br>
- void my_default_cb(Gedcom_ctxt parent, int level, char* tag, char* raw_value)<br>
- {<br>
- struct header head = (struct header)parent;<br>
- my_header_add_to_extra_text(head, level, tag, raw_value);<br>
- }<br>
- <br>
- gedcom_set_default_callback(my_default_cb);<br>
- gedcom_subscribe_to_record(REC_HEAD, my_header_start, NULL);<br>
- ...<br>
- result = gedcom_parse_file(filename);</code><br>
- </blockquote>
- Note that the default callback will be called for any tag that isn't specifically
-subscribed upon by the application, and can thus be called in various contexts.
- For simplicity, the example above doesn't take this into account (the
- <code>parent</code> could be of different types, depending on
-the context).<br>
-
- <hr width="100%" size="2">
+ char* source;<br>
+ ...<br>
+ char* extra_text;<br>
+ };<br>
+ <br>
+ Gedcom_ctxt my_header_start_cb(int level, Gedcom_val xref, char* tag, int
+parsed_tag)<br>
+ {<br>
+ struct header* head = my_make_header_struct();<br>
+ return (Gedcom_ctxt)head;<br>
+ }<br>
+ <br>
+ void my_default_cb(Gedcom_ctxt parent, int level, char* tag, char* raw_value,
+int parsed_tag)<br>
+ {<br>
+ struct header* head = (struct header*)parent;<br>
+ my_header_add_to_extra_text(head, level, tag, raw_value);<br>
+ }<br>
+ <br>
+ gedcom_set_default_callback(my_default_cb);<br>
+ gedcom_subscribe_to_record(REC_HEAD, my_header_start_cb, NULL);<br>
+ ...<br>
+ result = gedcom_parse_file(filename);</code><br>
+ </blockquote>
+ Note that the default callback will be called for any tag that isn't specifically
+ subscribed to by the application, and can thus be called in various contexts.
+ For simplicity, the example above doesn't take this into account (the
+ <code>parent</code> could be of different types, depending
+on the context).<br>
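One possible way to deal with a <code>parent</code> of varying type is sketched below: every context object starts with a small tag that tells the default callback what it is looking at. This is an assumption about how an application might organize its own data, not something the library prescribes. <code>enum ctxt_kind</code> and the layout of <code>struct header</code> are hypothetical, the typedef is a stand-in for the one in <code>gedcom.h</code>, and <code>my_header_add_to_extra_text</code> is given one possible implementation.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef void *Gedcom_ctxt;  /* stand-in; the real typedef is in gedcom.h */

/* Hypothetical tag, stored as first member of every context struct. */
enum ctxt_kind { CTXT_HEADER, CTXT_OTHER };

struct header {
  enum ctxt_kind kind;   /* always CTXT_HEADER for this struct */
  char *extra_text;      /* unsupported lines, kept as plain text */
};

/* Append "<level> <tag> <raw_value>\n" to head->extra_text.
   Returns 0 on success, 1 if the text could not be extended. */
int my_header_add_to_extra_text(struct header *head, int level,
                                const char *tag, const char *raw_value)
{
  size_t old_len = head->extra_text ? strlen(head->extra_text) : 0;
  /* Guard against a missing value (defensive; an assumption). */
  const char *value = raw_value ? raw_value : "";
  int add_len = snprintf(NULL, 0, "%d %s %s\n", level, tag, value);
  char *grown;

  if (add_len < 0)
    return 1;
  grown = realloc(head->extra_text, old_len + (size_t)add_len + 1);
  if (grown == NULL)
    return 1;
  snprintf(grown + old_len, (size_t)add_len + 1, "%d %s %s\n",
           level, tag, value);
  head->extra_text = grown;
  return 0;
}

void my_default_cb(Gedcom_ctxt parent, int level, char *tag,
                   char *raw_value, int parsed_tag)
{
  /* Valid because every context struct in this sketch starts with an
     enum ctxt_kind member. */
  enum ctxt_kind *kind = (enum ctxt_kind *)parent;
  (void)parsed_tag;

  if (kind == NULL || *kind != CTXT_HEADER)
    return;   /* other contexts are not handled in this sketch */
  my_header_add_to_extra_text((struct header *)parent, level, tag,
                              raw_value);
}
```

A complete application would add one branch per context type instead of silently returning, so that no unsupported line is lost.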
+
+ <hr width="100%" size="2">
<h2><a name="Other_API_functions"></a>Other API functions<br>
- </h2>
- Although the above describes the basic interface of libgedcom, there are
-some other functions that allow to customize the behaviour of the library.
- These will be explained in the current section.<br>
-
+ </h2>
+ Although the above describes the basic interface of libgedcom, there are
+ some other functions that allow customizing the behaviour of the library.
+ These will be explained in the current section.<br>
+
<h3><a name="Debugging"></a>Debugging</h3>
- The library can generate various debugging output, not only from itself,
-but also the debugging output generated by the yacc parser. By default,
-no debugging output is generated, but this can be customized using the following
-function:<br>
-
- <blockquote><code>void <b>gedcom_set_debug_level</b> (int level,
-FILE* trace_output)</code><br>
- </blockquote>
- The <code>level</code> can be one of the following values:<br>
-
+ The library can generate various debugging output, not only from itself,
+ but also the debugging output generated by the yacc parser. By default,
+ no debugging output is generated, but this can be customized using the following
+ function:<br>
+
+ <blockquote><code>void <b>gedcom_set_debug_level</b> (int level,
+ FILE* trace_output)</code><br>
+ </blockquote>
+ The <code>level</code> can be one of the following values:<br>
+
<ul>
- <li>0: no debugging information (this is the default)</li>
- <li>1: only debugging information from libgedcom
-itself</li>
- <li>2: debugging information from libgedcom and
+ <li>0: no debugging information (this is the default)</li>
+ <li>1: only debugging information from libgedcom
+ itself</li>
+ <li>2: debugging information from libgedcom and
yacc</li>
-
+
</ul>
- If the <code>trace_output</code> is <code>NULL</code>, debugging information
-will be written to <code>stderr</code>, otherwise the given file handle is
-used (which must be open).<br>
- <br>
-
+ If the <code>trace_output</code> is <code>NULL</code>, debugging information
+ will be written to <code>stderr</code>, otherwise the given file handle
+is used (which must be open).<br>
+ <br>
+
<h3><a name="Error_treatment"></a>Error treatment</h3>
- One of the previous sections already described the callback to be registered
-to get error messages. The library also allows to customize what happens
-on an error, using the following function:<br>
-
- <blockquote><code>void <b>gedcom_set_error_handling</b> (Gedcom_err_mech
-mechanism)</code><br>
- </blockquote>
- The <code>mechanism</code> can be one of:<br>
-
+ One of the previous sections already described the callback to be registered
+ to get error messages. The library also allows customizing what happens
+ on an error, using the following function:<br>
+
+ <blockquote><code>void <b>gedcom_set_error_handling</b> (Gedcom_err_mech
+ mechanism)</code><br>
+ </blockquote>
+ The <code>mechanism</code> can be one of:<br>
+
<ul>
- <li><code>IMMED_FAIL</code>: immediately fail the parsing
-on an error (this is the default)</li>
- <li><code>DEFER_FAIL</code>: continue parsing after
+ <li><code>IMMED_FAIL</code>: immediately fail the parsing
+ on an error (this is the default)</li>
+ <li><code>DEFER_FAIL</code>: continue parsing after
an error, but return a failure code eventually</li>
- <li><code>IGNORE_ERRORS</code>: continue parsing after
-an error, return success always</li>
-
+ <li><code>IGNORE_ERRORS</code>: continue parsing after
+ an error, return success always</li>
+
</ul>
- This doesn't influence the generation of error or warning messages, only
-the behaviour of the parser and its return code.<br>
- <br>
-
+ This doesn't influence the generation of error or warning messages, only
+ the behaviour of the parser and its return code.<br>
+ <br>
+
<h3><a name="Compatibility_mode"></a>Compatibility mode<br>
- </h3>
- Applications are not necessarily true to the GEDCOM spec (or use a different
-version than 5.5). The intention is that the library is resilient to
-this, and goes in compatibility mode for files written by specific programs
-(detected via the HEAD.SOUR tag). This compatibility mode can be enabled
-and disabled via the following function:<br>
-
+ </h3>
+ Applications are not necessarily true to the GEDCOM spec (or use a different
+ version than 5.5). The intention is that the library is resilient
+to this, and goes into compatibility mode for files written by specific programs
+ (detected via the HEAD.SOUR tag). This compatibility mode can be enabled
+ and disabled via the following function:<br>
+
<blockquote><code>void <b>gedcom_set_compat_handling</b>
- (int enable_compat)</code><br>
- </blockquote>
- The argument can be:<br>
-
+ (int enable_compat)</code><br>
+ </blockquote>
+ The argument can be:<br>
+
<ul>
- <li>0: disable compatibility mode</li>
- <li>1: allow compatibility mode (this is the default)<br>
- </li>
-
+ <li>0: disable compatibility mode</li>
+ <li>1: allow compatibility mode (this is the default)<br>
+ </li>
+
</ul>
- Note that, currently, no actual compatibility code is present, but this
+ Note that, currently, no actual compatibility code is present, but this
is on the to-do list.<br>
-
- <hr width="100%" size="2">
+
+ <hr width="100%" size="2">
<pre>$Id$<br>$Name$<br></pre>
- <pre>
- </pre>
-
+
+ <pre> </pre>
+
</body>
</html>
/* Parser for Gedcom.
- Copyright (C) 2001 The Genes Development Team
+ Copyright (C) 2001, 2002 The Genes Development Team
This file is part of the Gedcom parser library.
Contributed by Peter Verthez <Peter.Verthez@advalvas.be>, 2001.
%union {
int number;
char *string;
+ struct tag_struct tag;
Gedcom_ctxt ctxt;
}
%token <string> DELIM
%token <string> ANYCHAR
%token <string> POINTER
-%token <string> USERTAG
-%token <string> TAG_ABBR
-%token <string> TAG_ADDR
-%token <string> TAG_ADR1
-%token <string> TAG_ADR2
-%token <string> TAG_ADOP
-%token <string> TAG_AFN
-%token <string> TAG_AGE
-%token <string> TAG_AGNC
-%token <string> TAG_ALIA
-%token <string> TAG_ANCE
-%token <string> TAG_ANCI
-%token <string> TAG_ANUL
-%token <string> TAG_ASSO
-%token <string> TAG_AUTH
-%token <string> TAG_BAPL
-%token <string> TAG_BAPM
-%token <string> TAG_BARM
-%token <string> TAG_BASM
-%token <string> TAG_BIRT
-%token <string> TAG_BLES
-%token <string> TAG_BLOB
-%token <string> TAG_BURI
-%token <string> TAG_CALN
-%token <string> TAG_CAST
-%token <string> TAG_CAUS
-%token <string> TAG_CENS
-%token <string> TAG_CHAN
-%token <string> TAG_CHAR
-%token <string> TAG_CHIL
-%token <string> TAG_CHR
-%token <string> TAG_CHRA
-%token <string> TAG_CITY
-%token <string> TAG_CONC
-%token <string> TAG_CONF
-%token <string> TAG_CONL
-%token <string> TAG_CONT
-%token <string> TAG_COPR
-%token <string> TAG_CORP
-%token <string> TAG_CREM
-%token <string> TAG_CTRY
-%token <string> TAG_DATA
-%token <string> TAG_DATE
-%token <string> TAG_DEAT
-%token <string> TAG_DESC
-%token <string> TAG_DESI
-%token <string> TAG_DEST
-%token <string> TAG_DIV
-%token <string> TAG_DIVF
-%token <string> TAG_DSCR
-%token <string> TAG_EDUC
-%token <string> TAG_EMIG
-%token <string> TAG_ENDL
-%token <string> TAG_ENGA
-%token <string> TAG_EVEN
-%token <string> TAG_FAM
-%token <string> TAG_FAMC
-%token <string> TAG_FAMF
-%token <string> TAG_FAMS
-%token <string> TAG_FCOM
-%token <string> TAG_FILE
-%token <string> TAG_FORM
-%token <string> TAG_GEDC
-%token <string> TAG_GIVN
-%token <string> TAG_GRAD
-%token <string> TAG_HEAD
-%token <string> TAG_HUSB
-%token <string> TAG_IDNO
-%token <string> TAG_IMMI
-%token <string> TAG_INDI
-%token <string> TAG_LANG
-%token <string> TAG_LEGA
-%token <string> TAG_MARB
-%token <string> TAG_MARC
-%token <string> TAG_MARL
-%token <string> TAG_MARR
-%token <string> TAG_MARS
-%token <string> TAG_MEDI
-%token <string> TAG_NAME
-%token <string> TAG_NATI
-%token <string> TAG_NATU
-%token <string> TAG_NCHI
-%token <string> TAG_NICK
-%token <string> TAG_NMR
-%token <string> TAG_NOTE
-%token <string> TAG_NPFX
-%token <string> TAG_NSFX
-%token <string> TAG_OBJE
-%token <string> TAG_OCCU
-%token <string> TAG_ORDI
-%token <string> TAG_ORDN
-%token <string> TAG_PAGE
-%token <string> TAG_PEDI
-%token <string> TAG_PHON
-%token <string> TAG_PLAC
-%token <string> TAG_POST
-%token <string> TAG_PROB
-%token <string> TAG_PROP
-%token <string> TAG_PUBL
-%token <string> TAG_QUAY
-%token <string> TAG_REFN
-%token <string> TAG_RELA
-%token <string> TAG_RELI
-%token <string> TAG_REPO
-%token <string> TAG_RESI
-%token <string> TAG_RESN
-%token <string> TAG_RETI
-%token <string> TAG_RFN
-%token <string> TAG_RIN
-%token <string> TAG_ROLE
-%token <string> TAG_SEX
-%token <string> TAG_SLGC
-%token <string> TAG_SLGS
-%token <string> TAG_SOUR
-%token <string> TAG_SPFX
-%token <string> TAG_SSN
-%token <string> TAG_STAE
-%token <string> TAG_STAT
-%token <string> TAG_SUBM
-%token <string> TAG_SUBN
-%token <string> TAG_SURN
-%token <string> TAG_TEMP
-%token <string> TAG_TEXT
-%token <string> TAG_TIME
-%token <string> TAG_TITL
-%token <string> TAG_TRLR
-%token <string> TAG_TYPE
-%token <string> TAG_VERS
-%token <string> TAG_WIFE
-%token <string> TAG_WILL
-
-%type <string> anystdtag
-%type <string> anytoptag
+%token <tag> USERTAG
+%token <tag> TAG_ABBR
+%token <tag> TAG_ADDR
+%token <tag> TAG_ADR1
+%token <tag> TAG_ADR2
+%token <tag> TAG_ADOP
+%token <tag> TAG_AFN
+%token <tag> TAG_AGE
+%token <tag> TAG_AGNC
+%token <tag> TAG_ALIA
+%token <tag> TAG_ANCE
+%token <tag> TAG_ANCI
+%token <tag> TAG_ANUL
+%token <tag> TAG_ASSO
+%token <tag> TAG_AUTH
+%token <tag> TAG_BAPL
+%token <tag> TAG_BAPM
+%token <tag> TAG_BARM
+%token <tag> TAG_BASM
+%token <tag> TAG_BIRT
+%token <tag> TAG_BLES
+%token <tag> TAG_BLOB
+%token <tag> TAG_BURI
+%token <tag> TAG_CALN
+%token <tag> TAG_CAST
+%token <tag> TAG_CAUS
+%token <tag> TAG_CENS
+%token <tag> TAG_CHAN
+%token <tag> TAG_CHAR
+%token <tag> TAG_CHIL
+%token <tag> TAG_CHR
+%token <tag> TAG_CHRA
+%token <tag> TAG_CITY
+%token <tag> TAG_CONC
+%token <tag> TAG_CONF
+%token <tag> TAG_CONL
+%token <tag> TAG_CONT
+%token <tag> TAG_COPR
+%token <tag> TAG_CORP
+%token <tag> TAG_CREM
+%token <tag> TAG_CTRY
+%token <tag> TAG_DATA
+%token <tag> TAG_DATE
+%token <tag> TAG_DEAT
+%token <tag> TAG_DESC
+%token <tag> TAG_DESI
+%token <tag> TAG_DEST
+%token <tag> TAG_DIV
+%token <tag> TAG_DIVF
+%token <tag> TAG_DSCR
+%token <tag> TAG_EDUC
+%token <tag> TAG_EMIG
+%token <tag> TAG_ENDL
+%token <tag> TAG_ENGA
+%token <tag> TAG_EVEN
+%token <tag> TAG_FAM
+%token <tag> TAG_FAMC
+%token <tag> TAG_FAMF
+%token <tag> TAG_FAMS
+%token <tag> TAG_FCOM
+%token <tag> TAG_FILE
+%token <tag> TAG_FORM
+%token <tag> TAG_GEDC
+%token <tag> TAG_GIVN
+%token <tag> TAG_GRAD
+%token <tag> TAG_HEAD
+%token <tag> TAG_HUSB
+%token <tag> TAG_IDNO
+%token <tag> TAG_IMMI
+%token <tag> TAG_INDI
+%token <tag> TAG_LANG
+%token <tag> TAG_LEGA
+%token <tag> TAG_MARB
+%token <tag> TAG_MARC
+%token <tag> TAG_MARL
+%token <tag> TAG_MARR
+%token <tag> TAG_MARS
+%token <tag> TAG_MEDI
+%token <tag> TAG_NAME
+%token <tag> TAG_NATI
+%token <tag> TAG_NATU
+%token <tag> TAG_NCHI
+%token <tag> TAG_NICK
+%token <tag> TAG_NMR
+%token <tag> TAG_NOTE
+%token <tag> TAG_NPFX
+%token <tag> TAG_NSFX
+%token <tag> TAG_OBJE
+%token <tag> TAG_OCCU
+%token <tag> TAG_ORDI
+%token <tag> TAG_ORDN
+%token <tag> TAG_PAGE
+%token <tag> TAG_PEDI
+%token <tag> TAG_PHON
+%token <tag> TAG_PLAC
+%token <tag> TAG_POST
+%token <tag> TAG_PROB
+%token <tag> TAG_PROP
+%token <tag> TAG_PUBL
+%token <tag> TAG_QUAY
+%token <tag> TAG_REFN
+%token <tag> TAG_RELA
+%token <tag> TAG_RELI
+%token <tag> TAG_REPO
+%token <tag> TAG_RESI
+%token <tag> TAG_RESN
+%token <tag> TAG_RETI
+%token <tag> TAG_RFN
+%token <tag> TAG_RIN
+%token <tag> TAG_ROLE
+%token <tag> TAG_SEX
+%token <tag> TAG_SLGC
+%token <tag> TAG_SLGS
+%token <tag> TAG_SOUR
+%token <tag> TAG_SPFX
+%token <tag> TAG_SSN
+%token <tag> TAG_STAE
+%token <tag> TAG_STAT
+%token <tag> TAG_SUBM
+%token <tag> TAG_SUBN
+%token <tag> TAG_SURN
+%token <tag> TAG_TEMP
+%token <tag> TAG_TEXT
+%token <tag> TAG_TIME
+%token <tag> TAG_TITL
+%token <tag> TAG_TRLR
+%token <tag> TAG_TYPE
+%token <tag> TAG_VERS
+%token <tag> TAG_WIFE
+%token <tag> TAG_WILL
+
+%type <tag> anystdtag
+%type <tag> anytoptag
+%type <tag> fam_event_tag
+%type <tag> indiv_attr_tag
+%type <tag> indiv_birt_tag
+%type <tag> indiv_gen_tag
+%type <tag> lio_bapl_tag
%type <string> line_item
%type <string> line_value
%type <string> mand_line_item
%type <string> opt_xref
%type <string> opt_value
%type <string> opt_line_item
-%type <string> fam_event_tag
-%type <string> indiv_attr_tag
-%type <string> indiv_birt_tag
-%type <string> indiv_gen_tag
-%type <string> lio_bapl_tag
%type <ctxt> head_sect
%%
;
head_sour_corp_sect : OPEN DELIM TAG_CORP mand_line_item
{ $<ctxt>$ = start_element(ELT_HEAD_SOUR_CORP, PARENT,
- $1, $3, $4,
+ $1, $3, $4,
GEDCOM_MAKE_STRING($4));
START(CORP, $<ctxt>$)
}
head_sour_data_sect : OPEN DELIM TAG_DATA mand_line_item
{ $<ctxt>$ = start_element(ELT_HEAD_SOUR_DATA, PARENT,
- $1, $3, $4,
+ $1, $3, $4,
GEDCOM_MAKE_STRING($4));
START(DATA, $<ctxt>$)
}
head_sour_data_date_sect : OPEN DELIM TAG_DATE mand_line_item
{ struct date_value dv = gedcom_parse_date($4);
$<ctxt>$ = start_element(ELT_HEAD_SOUR_DATA_DATE,
- PARENT, $1, $3, $4,
+ PARENT, $1, $3, $4,
GEDCOM_MAKE_DATE(dv));
START(DATE, $<ctxt>$)
}
;
head_sour_data_copr_sect : OPEN DELIM TAG_COPR mand_line_item
{ $<ctxt>$ = start_element(ELT_HEAD_SOUR_DATA_COPR,
- PARENT, $1, $3, $4,
+ PARENT, $1, $3, $4,
GEDCOM_MAKE_STRING($4));
START(COPR, $<ctxt>$)
}
/* HEAD.DEST */
head_dest_sect : OPEN DELIM TAG_DEST mand_line_item
{ $<ctxt>$ = start_element(ELT_HEAD_DEST,
- PARENT, $1, $3, $4,
+ PARENT, $1, $3, $4,
GEDCOM_MAKE_STRING($4));
START(DEST, $<ctxt>$)
}
head_date_sect : OPEN DELIM TAG_DATE mand_line_item
{ struct date_value dv = gedcom_parse_date($4);
$<ctxt>$ = start_element(ELT_HEAD_DATE,
- PARENT, $1, $3, $4,
+ PARENT, $1, $3, $4,
GEDCOM_MAKE_DATE(dv));
START(DATE, $<ctxt>$)
}
head_date_time_sect : OPEN DELIM TAG_TIME mand_line_item
{ $<ctxt>$ = start_element(ELT_HEAD_DATE_TIME,
- PARENT, $1, $3, $4,
+ PARENT, $1, $3, $4,
GEDCOM_MAKE_STRING($4));
START(TIME, $<ctxt>$)
}
/* HEAD.SUBM */
head_subm_sect : OPEN DELIM TAG_SUBM mand_pointer
{ $<ctxt>$ = start_element(ELT_HEAD_SUBM,
- PARENT, $1, $3, $4,
+ PARENT, $1, $3, $4,
GEDCOM_MAKE_STRING($4));
START(SUBM, $<ctxt>$)
}
/* HEAD.SUBN */
head_subn_sect : OPEN DELIM TAG_SUBN mand_pointer
{ $<ctxt>$ = start_element(ELT_HEAD_SUBN,
- PARENT, $1, $3, $4,
+ PARENT, $1, $3, $4,
GEDCOM_MAKE_STRING($4));
START(SUBN, $<ctxt>$)
}
/* HEAD.FILE */
head_file_sect : OPEN DELIM TAG_FILE mand_line_item
{ $<ctxt>$ = start_element(ELT_HEAD_FILE,
- PARENT, $1, $3, $4,
+ PARENT, $1, $3, $4,
GEDCOM_MAKE_STRING($4));
START(FILE, $<ctxt>$)
}
/* HEAD.COPR */
head_copr_sect : OPEN DELIM TAG_COPR mand_line_item
{ $<ctxt>$ = start_element(ELT_HEAD_COPR,
- PARENT, $1, $3, $4,
+ PARENT, $1, $3, $4,
GEDCOM_MAKE_STRING($4));
START(COPR, $<ctxt>$)
}
;
head_gedc_vers_sect : OPEN DELIM TAG_VERS mand_line_item
{ $<ctxt>$ = start_element(ELT_HEAD_GEDC_VERS,
- PARENT, $1, $3, $4,
+ PARENT, $1, $3, $4,
GEDCOM_MAKE_STRING($4));
START(VERS, $<ctxt>$)
}
;
head_gedc_form_sect : OPEN DELIM TAG_FORM mand_line_item
{ $<ctxt>$ = start_element(ELT_HEAD_GEDC_FORM,
- PARENT, $1, $3, $4,
+ PARENT, $1, $3, $4,
GEDCOM_MAKE_STRING($4));
START(FORM, $<ctxt>$)
}
head_char_sect : OPEN DELIM TAG_CHAR mand_line_item
{ if (open_conv_to_internal($4) == 0) YYERROR;
$<ctxt>$ = start_element(ELT_HEAD_CHAR,
- PARENT, $1, $3, $4,
+ PARENT, $1, $3, $4,
GEDCOM_MAKE_STRING($4));
START(CHAR, $<ctxt>$)
}
;
head_char_vers_sect : OPEN DELIM TAG_VERS mand_line_item
{ $<ctxt>$ = start_element(ELT_HEAD_CHAR_VERS,
- PARENT, $1, $3, $4,
+ PARENT, $1, $3, $4,
GEDCOM_MAKE_STRING($4));
START(VERS, $<ctxt>$)
}
/* HEAD.LANG */
head_lang_sect : OPEN DELIM TAG_LANG mand_line_item
{ $<ctxt>$ = start_element(ELT_HEAD_LANG,
- PARENT, $1, $3, $4,
+ PARENT, $1, $3, $4,
GEDCOM_MAKE_STRING($4));
START(LANG, $<ctxt>$)
}
;
user_rec : OPEN DELIM opt_xref USERTAG
- { if ($4[0] != '_') {
+ { if ($4.string[0] != '_') {
gedcom_error(_("Undefined tag (and not a valid user tag): %s"),
-                               $4);
+                               $4.string);
YYERROR;
{ end_record(REC_USER, $<ctxt>7); }
;
user_sect : OPEN DELIM opt_xref USERTAG
- { if ($4[0] != '_') {
+ { if ($4.string[0] != '_') {
gedcom_error(_("Undefined tag (and not a valid user tag): %s"),
-                               $4);
+                               $4.string);
YYERROR;