From 5d16fb60418c30cd6693556ac6d56d6c0f8cd7a6 Mon Sep 17 00:00:00 2001 From: Peter Verthez Date: Fri, 1 Nov 2002 20:27:36 +0000 Subject: [PATCH] Documented the string manipulation functions. --- doc/gom.html | 83 +++++++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 72 insertions(+), 11 deletions(-) diff --git a/doc/gom.html b/doc/gom.html index 84985a9..f36dbd0 100644 --- a/doc/gom.html +++ b/doc/gom.html @@ -12,10 +12,23 @@
  • Main functions
  • Object model structure
  • + -
  • User data
    -
    + + + +
  • Other functions
  • + + @@ -31,12 +44,12 @@ There are two ways to start with a GEDCOM object model (after having called gedcom_init): either by starting from scratch, or by starting from a given GEDCOM file.  This is done via the following two functions:
    -
    int gom_parse_file (const char* file_name)
    +
    int gom_parse_file (const char* file_name);
    This initializes the object model by parsing the GEDCOM file given by file_name.  It returns 0 on success and 1 on failure.
    -
    int gom_new_model ()
    +
    int gom_new_model ();
    This starts an empty model.  Actually, this is done by processing the file "new.ged" in the gedcom-parse data directory.
    @@ -63,12 +76,14 @@ struct XXX*   gom_get_XXX_by_xref(char* xref);

    -
    The XXX stands for one of the following: family, individual, multimedia, note, repository, source, submitter, user_rec.
    +
    The XXX stands for one of the following: family, individual, multimedia, note, repository, source, submitter, user_rec.
    +

    Object model structure

    - +

    Object lists
    +

    All records of a certain type are linked together in a linked list.  The above functions only give access to the first record of each linked list.  The others can be accessed by traversing the linked list via the next member of the structs.  This means that e.g. the following piece of code will traverse the linked list of family records:
    @@ -83,7 +98,7 @@ The next member of the last element in the list is guaranteed to ha Actually, the linked list is a doubly-linked list: each record also has a previous member.  But for implementation reasons the behaviour of this previous member on the edges of the linked list will not be guaranteed, i.e. it can be circular or terminated with NULL, no assumptions can be made in the application code.
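    Since only the next member is reliable at the edges of the list, the last element is best found by walking forward. The following is a minimal sketch under that assumption; struct sketch_node and sketch_last are hypothetical stand-ins and not part of the object model API:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for any object-model struct that has
   next and previous members. */
struct sketch_node {
  int id;
  struct sketch_node* next;
  struct sketch_node* previous;
};

/* Returns the last element of the list by following next only.
   It never looks at first->previous, because that member may be
   NULL or may wrap around to the end of the list. */
struct sketch_node* sketch_last(struct sketch_node* first)
{
  struct sketch_node* n = first;

  if (n == NULL) return NULL;      /* empty list */
  while (n->next != NULL) n = n->next;
  assert(n->next == NULL);         /* the documented guarantee for the last element */
  return n;
}
```

    The same function gives the same answer whether previous is circular or NULL-terminated, which is the point of not depending on it.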

    This linked-list model applies also to all sub-structures of the main record structs, i.e. each struct that has a next and previous -member following the above conventions.  This means that the following +member follows the above conventions.  This means that the following piece of code traverses all children of a family (see the details of the different structs here):
    struct family* fam = ...;
    @@ -93,8 +108,9 @@ for (xrl = fam->children ; xrl ; xrl = xrl->next) {
      ...
    }

    -Note that all character strings in the object model are encoded in UTF-8 (Why UTF-8?).
    -

    User data

    +Note that all character strings in the object model are encoded in UTF-8 (Why UTF-8?), but see below for how to convert these automatically.
    +

    User data

    + Each of the structs has an extra member called extra (of type struct user_data*).  This gathers all non-standard GEDCOM tags within the scope of the struct @@ -107,9 +123,53 @@ tags is.  Each element of the linked list has:
    This way, none of the information in the GEDCOM file is lost, even the non-standard information.
    -
    - +
    +
    +

    Other functions

    +

    Manipulating strings
    +

    +There are some functions available to retrieve and change strings in the +GEDCOM object model, depending on whether you use UTF-8 strings in your application +or locale-defined strings.
    +
    +The following functions retrieve and set the string in UTF-8 encoding:
    +
    char* gom_get_string (char* data);
    +char* gom_set_string (char** data, const char* utf8_str);

    +
    +The first function is in fact superfluous, because it just returns the data, but it is there for symmetry with the functions given below for the locale-defined input and output.  
    +
    +The second function returns the new value if successful, or NULL +if an error occurred (e.g. failure to allocate memory).  It makes a +copy of the input string to store it in the object model.  It also takes +care of deallocating the old value of the data if needed.  Note that +the set function needs the address of the data variable, to be able to modify +it.
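    To illustrate why the set function takes the address of the data variable, here is a minimal sketch of a setter with the contract described above. sketch_set_string is a hypothetical stand-in, not the library's implementation. Leaving the old value untouched on failure is an assumption of this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for gom_set_string(), NOT the library's code.
   Copies utf8_str, frees the old value and stores the copy in *data.
   Returns the new value, or NULL on failure (in which case *data is
   left untouched; that choice is an assumption of this sketch). */
char* sketch_set_string(char** data, const char* utf8_str)
{
  char* copy;
  size_t len;

  assert(data != NULL);            /* caller passes the address of the field */
  if (utf8_str == NULL) return NULL;

  len = strlen(utf8_str) + 1;
  copy = malloc(len);
  if (copy == NULL) return NULL;   /* allocation failed: old value is kept */
  memcpy(copy, utf8_str, len);

  free(*data);                     /* free(NULL) is a no-op */
  *data = copy;                    /* only possible because we got char** */
  return copy;
}
```

    Because the old value is freed, any pointer obtained earlier from the same field must not be used after a successful call.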
    +
    +An example of using these functions, e.g. for retrieving and setting the system ID in the header:
    +
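    struct header* head = gom_get_header();
    +/* gom_set_string() frees the old value, so copy it first (needs <stdlib.h>, <string.h>) */
    +char* current = gom_get_string(head->source.id);
    +char* oldvalue = current ? strdup(current) : NULL;
    +const char* newvalue = "My_Gedcom_Tool";
    +
    +if (gom_set_string(&head->source.id, newvalue)) {
    +  printf("Modified system id from %s to %s\n", oldvalue ? oldvalue : "(none)", newvalue);
    +}
    +free(oldvalue);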

    +
    +
    +A second pair of functions retrieves and sets the string in the encoding defined by the current locale:
    +
    char* gom_get_string_for_locale (char* data, int* conversion_failures);
    +char* gom_set_string_for_locale (char** data, const char* locale_str);
    +
    +These functions are used in the same way as the previous ones, but e.g. in +the "en_US" locale the first function returns the string in the +ISO-8859-1 encoding, and the second function expects locale_str to be in this encoding.  Conversion to and from UTF-8 for the object model is done on the fly.
    +
    +Since the conversion from UTF-8 to the locale encoding is not always possible, +the get function has a second parameter that can return the number of conversion +failures for the result string.  Pass a pointer to an integer if you +want to know this.  You can pass NULL if you're not interested.
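    As an illustration of what a conversion failure means, the following sketch converts UTF-8 to ISO-8859-1 by hand and counts the characters that do not fit. sketch_utf8_to_latin1 is a hypothetical helper and not how the library does the conversion. Using '?' as the replacement character is also an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the library. Converts the UTF-8 text
   in utf8 to ISO-8859-1 in out (capacity out_size, including the NUL).
   Characters above U+00FF and malformed bytes become '?' and are counted
   in *conversion_failures when that pointer is not NULL. Returns out. */
char* sketch_utf8_to_latin1(const char* utf8, char* out, size_t out_size,
                            int* conversion_failures)
{
  const unsigned char* p = (const unsigned char*)utf8;
  size_t n = 0;
  int failures = 0;

  assert(utf8 != NULL && out != NULL && out_size > 0);

  while (*p != '\0' && n + 1 < out_size) {
    if (*p < 0x80) {
      /* ASCII: copy as is */
      out[n++] = (char)*p++;
    } else if ((*p == 0xC2 || *p == 0xC3) && (p[1] & 0xC0) == 0x80) {
      /* two-byte sequence for U+0080..U+00FF: fits in ISO-8859-1 */
      out[n++] = (char)(((*p & 0x03) << 6) | (p[1] & 0x3F));
      p += 2;
    } else {
      /* not representable: emit one '?' for the whole character */
      out[n++] = '?';
      failures++;
      p++;
      while ((*p & 0xC0) == 0x80) p++;   /* skip continuation bytes */
    }
  }
  out[n] = '\0';
  if (conversion_failures != NULL) *conversion_failures = failures;
  return out;
}
```

    A caller that passes a pointer can warn the user when the count is non-zero. A caller that passes NULL simply accepts the best-effort result.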
    +
    $Id$
    $Name$

    +
                        
    @@ -118,4 +178,5 @@ This way, none of the information in the GEDCOM file is lost, even the non-stand


    +
    \ No newline at end of file -- 2.30.2