<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"><html><head><title>Gedcom object model in C</title>
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1"></head><body text="#000000" bgcolor="#ffffff" link="#000099" vlink="#990099" alink="#000099">
<h1 align="center">Gedcom object model in C</h1>
<ul>
<li><a href="#Main_functions">Main functions</a></li>
<li><a href="#Object_model_structure">Object model structure</a>
<ul>
<li><a href="#Object_lists">Object lists</a></li>
<li><a href="#User_data">User data</a></li>
</ul>
</li>
<li><a href="#Other_functions">Other functions</a>
<ul>
<li><a href="#Manipulating_strings">Manipulating strings</a></li>
</ul>
</li>
<li><a href="gomxref.html">C object model details</a></li>
</ul>
<hr width="100%" size="2">
<h2><a name="Main_functions"></a>Main functions</h2>
There are two ways to start with a GEDCOM object model (after having called <code>gedcom_init</code>): either start from scratch, or start from an existing GEDCOM file. This is done via the following two functions:<br>
<blockquote><code>int <b>gom_parse_file</b> (const char* file_name);</code><br>
<blockquote>This initializes the object model by parsing the GEDCOM file given by <code>file_name</code>. It returns 0 on success and 1 on failure.<br>
</blockquote>
</blockquote>
<blockquote><code>int <b>gom_new_model</b> ();</code><br>
<blockquote>This starts an empty model. Internally, this is done by processing the file "<code>new.ged</code>" in the gedcom-parse data directory.<br>
</blockquote>
</blockquote>
In the GEDCOM object model, all the data is immediately available after calling <code>gom_parse_file()</code> or <code>gom_new_model()</code>. For this, an entire model based on C structs is used. These structs are documented <a href="gomxref.html">here</a>,
and follow the GEDCOM syntax quite closely. Each type of record in
a GEDCOM file is modelled by a separate struct, and some common sub-structures
have their own struct definition.<br>
<br>
The following functions are available to get at these structs:<br>
<ul>
<li>First, there are two functions to get the header record and the submission
record (there can be only one of each in a GEDCOM file):<br>
<blockquote><code>struct header* <b>gom_get_header</b>();<br>
struct submission* <b>gom_get_submission</b>();</code><br>
</blockquote>
</li>
<li>Further, for each of the other record types, there are two functions: one
to get the first record of that type, and one to get a record via its cross-reference
tag in the GEDCOM file:<br>
<blockquote><code>struct XXX* <b>gom_get_first_XXX</b>();<br>
struct XXX* <b>gom_get_XXX_by_xref</b>(char* xref);</code><br>
</blockquote>
<blockquote>The <code><b>XXX</b></code> stands for one of the following: <code><b>family</b>, <b>individual</b>, <b>multimedia</b>, <b>note</b>, <b>repository</b>, <b>source</b>, <b>submitter</b>, <b>user_rec</b></code>.<br>
</blockquote>
</li>
</ul>
<hr width="100%" size="2">
<h2><a name="Object_model_structure"></a>Object model structure</h2>
<h3><a name="Object_lists"></a>Object lists</h3>
All records of a certain type are linked together in a linked list. The
above functions only give access to the first record of each linked list.
The others can be accessed by traversing the linked list via the <code>next</code> member of the structs. For example, the following piece of code traverses the linked list of family records:<br>
<blockquote><code>struct family* fam;<br>
for (fam = gom_get_first_family() ; fam ; fam = fam-&gt;next) {<br>
&nbsp; /* use fam here */<br>
}</code><br>
</blockquote>
The <code>next</code> member of the last element in the list is guaranteed to have the <code>NULL</code> value.<br>
<br>
Actually, the linked list is a doubly-linked list: each record also has a <code>previous</code> member. For implementation reasons, however, the behaviour of this <code>previous</code> member at the edges of the linked list is not guaranteed: it can be circular or terminated with <code>NULL</code>, so application code must not make any assumption about it.<br>
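<br>
Since only the <code>next</code> chain is guaranteed, a helper that needs the last record or a record count can simply walk forward from the first record. The following is a minimal sketch of that idea; <code>struct demo_rec</code>, <code>demo_count</code> and <code>demo_last</code> are hypothetical names used for illustration only and are not part of the library:<br>

```c
#include <stddef.h>

/* Hypothetical stand-in for any object-model struct that has
   next/previous members; this is not a type from the library. */
struct demo_rec {
  const char* label;
  struct demo_rec* next;
  struct demo_rec* previous;
};

/* Count the records by following only the next member. */
static size_t demo_count(const struct demo_rec* first)
{
  size_t n = 0;
  const struct demo_rec* r;
  for (r = first; r; r = r->next)
    n++;
  return n;
}

/* Find the last record without touching first->previous, whose value
   at the edges of the list is not guaranteed. */
static struct demo_rec* demo_last(struct demo_rec* first)
{
  struct demo_rec* r = first;
  if (!r)
    return NULL;
  while (r->next)
    r = r->next;
  return r;
}
```

The same pattern should carry over to any struct of the object model that has <code>next</code> and <code>previous</code> members.<br>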
<br>
This linked-list model also applies to all sub-structures of the main record structs, i.e. each struct that has a <code>next</code> and <code>previous</code>
member follows the above conventions. For example, the following
piece of code traverses all children of a family (see the details of the
different structs <a href="gomxref.html">here</a>):<br>
<blockquote><code>struct family* fam = ...;<br>
struct xref_list* xrl;<br>
for (xrl = fam-&gt;children ; xrl ; xrl = xrl-&gt;next) {<br>
&nbsp; /* use xrl here */<br>
}</code><br>
</blockquote>
Note that all character strings in the object model are encoded in UTF-8 (<a href="encoding.html">Why UTF-8?</a>), but see <a href="#Manipulating_strings">below</a> for how to convert these automatically.<br>
<h3><a name="User_data"></a>User data</h3>
Each of the structs has an extra member called <code>extra</code> (of type <code>struct user_data*</code>).
This gathers all non-standard GEDCOM tags within the scope of the struct
in a flat linked list, no matter what the internal structure of the non-standard
tags is. Each element of the linked list has:<br>
<ul>
<li>a level: the level number in the GEDCOM file</li>
<li>a tag: the tag given in the GEDCOM file</li>
<li>a value: the value, which can be a string value or a cross-reference value (one of the two will be non-NULL)</li>
</ul>
This way, none of the information in the GEDCOM file is lost, not even the non-standard information.<br>
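<br>
As an illustration, the flat list of user data could be printed back as GEDCOM-like lines. This sketch is self-contained and uses a hypothetical stand-in struct, <code>struct demo_user_data</code>, whose member names are assumptions (in particular, the cross-reference is simplified to a plain string); see the <a href="gomxref.html">struct reference</a> for the real definition of <code>struct user_data</code>:<br>

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for the user data element described above.
   The member names are assumptions made for this sketch. */
struct demo_user_data {
  int level;
  const char* tag;
  const char* str_value;   /* string value, or NULL */
  const char* xref_value;  /* cross-reference (simplified to a string), or NULL */
  struct demo_user_data* next;
};

/* Format one element as "level tag value" into buf.
   Returns what snprintf returns (the length of the full line). */
static int demo_format_user_data(const struct demo_user_data* ud,
                                 char* buf, size_t size)
{
  const char* value = ud->str_value ? ud->str_value
                    : ud->xref_value ? ud->xref_value
                    : "";
  return snprintf(buf, size, "%d %s %s", ud->level, ud->tag, value);
}

/* Print every element of the flat list in order; returns the element count. */
static int demo_print_user_data(const struct demo_user_data* first)
{
  char line[256];
  int count = 0;
  const struct demo_user_data* ud;
  for (ud = first; ud; ud = ud->next) {
    demo_format_user_data(ud, line, sizeof line);
    puts(line);
    count++;
  }
  return count;
}
```

Because the list is flat, the level number is the only hint about how the non-standard tags were nested in the file.<br>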
<hr width="100%" size="2">
<h2><a name="Other_functions"></a>Other functions</h2>
<h3><a name="Manipulating_strings"></a>Manipulating strings</h3>
There are some functions available to retrieve and change strings in the
Gedcom object model, depending on whether you use UTF-8 strings in your application
or locale-defined strings.<br>
<br>
The following functions retrieve and set the string in UTF-8 encoding:<br>
<blockquote><code>char* <b>gom_get_string</b> (char* data);<br>
char* <b>gom_set_string</b> (char** data, const char* utf8_str);</code><br>
</blockquote>
The first function is in fact superfluous, because it just returns the <code>data</code>, but it is there for symmetry with the functions given below for the locale-defined input and output.<br>
<br>
The second function returns the new value if successful, or <code>NULL</code>
if an error occurred (e.g. failure to allocate memory). It makes a
copy of the input string to store it in the object model. It also takes
care of deallocating the old value of the data if needed. Note that
the set function needs the address of the data variable, to be able to modify
it.<br>
<br>
An example of using these functions is retrieving and setting the system ID in the header:<br>
<blockquote><code>struct header* head = gom_get_header();<br>
char* oldvalue = gom_get_string(head-&gt;source.id);<br>
char* newvalue = "My_Gedcom_Tool";<br>
<br>
/* A successful set deallocates the old value, so print it first */<br>
printf("Old system id: %s\n", oldvalue);<br>
if (gom_set_string(&amp;head-&gt;source.id, newvalue)) {<br>
&nbsp; printf("Modified system id to %s\n", newvalue);<br>
}</code><br>
</blockquote>
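<br>
To illustrate why the set function takes a <code>char**</code>, and what the deallocation of the old value implies for pointers obtained earlier, here is a minimal sketch. <code>demo_set_string</code> and <code>demo_save_copy</code> are hypothetical stand-ins that only mimic the behaviour described above; they are not the library's implementation:<br>

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in mimicking the documented behaviour of the set
   function: copy the input, release the old value, and return the new
   value (or NULL on failure). */
static char* demo_set_string(char** data, const char* str)
{
  size_t len = strlen(str) + 1;
  char* copy = malloc(len);
  if (!copy)
    return NULL;          /* in this sketch, *data is left untouched on failure */
  memcpy(copy, str, len);
  free(*data);            /* free(NULL) is a no-op */
  *data = copy;           /* this is why the address of the variable is needed */
  return copy;
}

/* Take a private copy of the current value so that it survives a later set.
   The caller owns the result and must free() it. */
static char* demo_save_copy(const char* current)
{
  size_t len;
  char* copy;
  if (!current)
    return NULL;
  len = strlen(current) + 1;
  copy = malloc(len);
  if (copy)
    memcpy(copy, current, len);
  return copy;
}
```

Because the old value is released by a successful set, a pointer obtained earlier through the get function should not be used afterwards; take a copy first (as <code>demo_save_copy</code> does) if the old text is still needed.<br>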
<br>
A second pair of functions retrieves and sets the string in the encoding defined by the current locale:<br>
<blockquote><code>char* <b>gom_get_string_for_locale</b> (char* data, int* conversion_failures);<br>
char* <b>gom_set_string_for_locale</b> (char** data, const char* locale_str);</code><br>
</blockquote>
These functions are used in the same way as the previous ones, but e.g. in
the "en_US" locale the first function returns the string in the
ISO-8859-1 encoding and the second function expects the <code>locale_str</code> to be in this encoding. Conversion to and from UTF-8 for the object model is done on the fly.<br>
<br>
Since the conversion from UTF-8 to the locale encoding is not always possible,
the get function has a second parameter that can return the number of conversion
failures for the result string. Pass a pointer to an integer if you
want to know this. You can pass <code>NULL</code> if you're not interested.<br>
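<br>
The idea of counting conversion failures can be sketched as follows. This is a hypothetical stand-in, not the library's converter: <code>demo_utf8_to_latin1</code> converts UTF-8 to ISO-8859-1 by hand, writes <code>?</code> for each character that has no ISO-8859-1 equivalent, and reports the count through an optional pointer in the same style as the <code>conversion_failures</code> parameter. The choice of <code>?</code> as replacement character is an assumption of this sketch:<br>

```c
#include <stddef.h>

/* Hypothetical stand-in: convert UTF-8 to ISO-8859-1, replacing every
   unrepresentable character by '?' and counting those failures.
   Stray or truncated byte sequences are counted as failures too;
   overlong encodings are not checked in this sketch.
   Returns the number of bytes written, excluding the terminating NUL. */
static size_t demo_utf8_to_latin1(const char* utf8, char* out, size_t out_size,
                                  int* conversion_failures)
{
  const unsigned char* p = (const unsigned char*)utf8;
  size_t n = 0;
  int failures = 0;

  if (out_size == 0)
    return 0;
  while (*p && n + 1 < out_size) {
    unsigned long cp;
    int extra;
    if (*p < 0x80)                { cp = *p;        extra = 0; }
    else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
    else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
    else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
    else                          { cp = 0xFFFD;    extra = 0; } /* stray byte */
    p++;
    /* Fold in the continuation bytes (10xxxxxx) of a multi-byte sequence. */
    while (extra > 0 && (*p & 0xC0) == 0x80) {
      cp = (cp << 6) | (*p & 0x3F);
      p++;
      extra--;
    }
    if (extra > 0)
      cp = 0xFFFD;                /* truncated sequence */
    if (cp <= 0xFF) {
      out[n++] = (char)cp;        /* ISO-8859-1 equals the first 256 code points */
    } else {
      out[n++] = '?';
      failures++;
    }
  }
  out[n] = '\0';
  if (conversion_failures)        /* NULL means the caller is not interested */
    *conversion_failures = failures;
  return n;
}
```

A non-zero count tells the application that the returned text is lossy, so it may be wiser to show it than to write it back into the model.<br>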
<hr width="100%" size="2">
<pre><font size="-1">$Id$<br>$Name$</font><br></pre>
</body></html>