From a6f453d612fe285d2585b689e3e4d675de455510 Mon Sep 17 00:00:00 2001
From: Peter Verthez
Date: Mon, 31 Dec 2001 15:55:40 +0000
Subject: [PATCH] More documentation.

---
 doc/index.html     |   41 +-
 doc/interface.html | 1880 ++++++++++++++++++++++++++++++++++++++++++++
 doc/links.html     |   10 +
 doc/usage.html     |  606 ++++++++------
 4 files changed, 2260 insertions(+), 277 deletions(-)
 create mode 100644 doc/interface.html
 create mode 100644 doc/links.html

diff --git a/doc/index.html b/doc/index.html
index 09858f3..6fa2fde 100644
--- a/doc/index.html
+++ b/doc/index.html
@@ -2,35 +2,36 @@
 The GEDCOM parser library
-
+
-
+

The GEDCOM parser library

- This is the documentation for the GEDCOM parser library, release @VERSION@.
+ This is the documentation for the GEDCOM parser library, release @VERSION@.
+
+ The GEDCOM parser library is a C library that provides an API to applications
+ to parse and process arbitrary genealogy files in the standard GEDCOM format.
+ It supports release 5.5
+ of the GEDCOM standard.

- The GEDCOM parser library is a C library that provides an API to applications - to parse and process arbitrary genealogy files in the standard gedcom format. - It supports release 5.5 - of the GEDCOM standard.
-
- The rest of the documentation is divided into three parts:
- + The rest of the documentation is divided into three parts:
+ - -
$Id$
- $Name$
-
- + +
$Id: index.html,v 1.1 2001/12/30 22:45:43 verthezp +Exp $
+ $Name$
+
+ diff --git a/doc/interface.html b/doc/interface.html new file mode 100644 index 0000000..46ccd08 --- /dev/null +++ b/doc/interface.html @@ -0,0 +1,1880 @@ + + + + Libgedcom interface details + + + + + +

Libgedcom interface details

+
+ +

Index

+ + +
+ +
+

Record identifiers

+ The following table describes the identifiers to be used in the record callbacks.
+ The last column gives the Gedcom_val type of the xref argument in the record
+ start callback.
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Record   | Meaning                                 | Possible xref types
+ REC_HEAD | The header of the GEDCOM file           | NULL
+ REC_FAM  | A record describing a family            | STRING
+ REC_INDI | A record describing an individual       | STRING
+ REC_OBJE | A record describing a multimedia object | STRING
+ REC_NOTE | A record describing a note              | STRING
+ REC_REPO | A record describing a source repository | STRING
+ REC_SOUR | A record describing a source            | STRING
+ REC_SUBN | A record describing the submission      | STRING
+ REC_SUBM | A record describing the submitter       | STRING
+ REC_USER | An application-specific record (the tag in the start callback contains the actually used tag). | NULL, STRING
+
+ +
+

Element identifiers

+The following table describes the identifiers to be used in the element callbacks. + The last column gives the +Gedcom_val type of the val argument in the element +start callback.  (TO BE COMPLETED)
+

+ Element                      | Possible tags | Used within        | Possible val types
+ ELT_HEAD_SOUR                | SOUR          | REC_HEAD           | STRING
+ ELT_HEAD_SOUR_VERS           | VERS          | ELT_HEAD_SOUR      | STRING
+ ELT_HEAD_SOUR_NAME           | NAME          | ELT_HEAD_SOUR      | STRING
+ ELT_HEAD_SOUR_CORP           | CORP          | ELT_HEAD_SOUR      | STRING
+ ELT_HEAD_SOUR_DATA           | DATA          | ELT_HEAD_SOUR      | STRING
+ ELT_HEAD_SOUR_DATA_DATE      | DATE          | ELT_HEAD_SOUR_DATA | DATE
+ ELT_HEAD_SOUR_DATA_COPR      | COPR          | ELT_HEAD_SOUR_DATA | STRING
+ ELT_HEAD_DEST                | DEST          | REC_HEAD           | STRING
+ ELT_HEAD_DATE                | DATE          | REC_HEAD           | DATE
+ ELT_HEAD_DATE_TIME           | TIME          | ELT_HEAD_DATE      | STRING
+ ELT_HEAD_SUBM                | SUBM          | REC_HEAD           | STRING
+ ELT_HEAD_SUBN                | SUBN          | REC_HEAD           | STRING
+ ELT_HEAD_FILE                | FILE          | REC_HEAD           | STRING
+ ELT_HEAD_COPR                | COPR          | REC_HEAD           | STRING
+ ELT_HEAD_GEDC                | GEDC          | REC_HEAD           | NULL
+ ELT_HEAD_GEDC_VERS           | VERS          | ELT_HEAD_GEDC      | STRING
+ ELT_HEAD_GEDC_FORM           | FORM          | ELT_HEAD_GEDC      | STRING
+ ELT_HEAD_CHAR                | CHAR          | REC_HEAD           | STRING
+ ELT_HEAD_CHAR_VERS           | VERS          | ELT_HEAD_CHAR      | STRING
+ ELT_HEAD_LANG                | LANG          | REC_HEAD           | STRING
+ ELT_HEAD_PLAC                | PLAC          | REC_HEAD           | NULL
+ ELT_HEAD_PLAC_FORM           | FORM          | ELT_HEAD_PLAC      | STRING
+ ELT_HEAD_NOTE                | NOTE          | REC_HEAD           | STRING
+ ELT_FAM_HUSB                 | HUSB          | REC_FAM            | STRING
+ ELT_FAM_WIFE                 | WIFE          | REC_FAM            | STRING
+ ELT_FAM_CHIL                 | CHIL          | REC_FAM            | STRING
+ ELT_FAM_NCHI                 | NCHI          | REC_FAM            | STRING
+ ELT_FAM_SUBM                 | SUBM          | REC_FAM            | STRING
+ ELT_INDI_RESN                | RESN          | REC_INDI           | STRING
+ ELT_INDI_SEX                 | SEX           | REC_INDI           | STRING
+ ELT_INDI_SUBM                | SUBM          | REC_INDI           | STRING
+ ELT_INDI_ALIA                | ALIA          | REC_INDI           | STRING
+ ELT_INDI_ANCI                | ANCI          |                    | STRING
+ ELT_INDI_DESI                | DESI          |                    | STRING
+ ELT_INDI_RFN                 | RFN           |                    | STRING
+ ELT_INDI_AFN                 | AFN           |                    | STRING
+ ELT_OBJE_FORM                | FORM          |                    | STRING
+ ELT_OBJE_TITL                | TITL          |                    | STRING
+ ELT_OBJE_BLOB                | BLOB          |                    | NULL
+ ELT_OBJE_BLOB_CONT           | CONT          |                    | STRING
+ ELT_OBJE_OBJE                | OBJE          |                    | STRING
+ ELT_REPO_NAME                | NAME          |                    | STRING
+ ELT_SOUR_DATA                | DATA          |                    | NULL
+ ELT_SOUR_DATA_EVEN           | EVEN          |                    | STRING
+ ELT_SOUR_DATA_EVEN_DATE      | DATE          |                    | DATE
+ ELT_SOUR_DATA_EVEN_PLAC      | PLAC          |                    | STRING
+ ELT_SOUR_DATA_AGNC           | AGNC          |                    | STRING
+ ELT_SOUR_AUTH                | AUTH          |                    | STRING
+ ELT_SOUR_TITL                | TITL          |                    | STRING
+ ELT_SOUR_ABBR                | ABBR          |                    | STRING
+ ELT_SOUR_PUBL                | PUBL          |                    | STRING
+ ELT_SOUR_TEXT                | TEXT          |                    | STRING
+ ELT_SUBN_SUBM                | SUBM          |                    | STRING
+ ELT_SUBN_FAMF                | FAMF          |                    | STRING
+ ELT_SUBN_TEMP                | TEMP          |                    | STRING
+ ELT_SUBN_ANCE                | ANCE          |                    | STRING
+ ELT_SUBN_DESC                | DESC          |                    | STRING
+ ELT_SUBN_ORDI                | ORDI          |                    | STRING
+ ELT_SUBN_RIN                 | RIN           |                    | STRING
+ ELT_SUBM_NAME                | NAME          |                    | STRING
+ ELT_SUBM_LANG                | LANG          |                    | STRING
+ ELT_SUBM_RFN                 | RFN           |                    | STRING
+ ELT_SUBM_RIN                 | RIN           |                    | STRING
+ ELT_SUB_ADDR                 | ADDR          |                    | STRING
+ ELT_SUB_ADDR_CONT            | CONT          |                    | STRING
+ ELT_SUB_ADDR_ADR1            | ADR1          |                    | STRING
+ ELT_SUB_ADDR_ADR2            | ADR2          |                    | STRING
+ ELT_SUB_ADDR_CITY            | CITY          |                    | STRING
+ ELT_SUB_ADDR_STAE            | STAE          |                    | STRING
+ ELT_SUB_ADDR_POST            | POST          |                    | STRING
+ ELT_SUB_ADDR_CTRY            | CTRY          |                    | STRING
+ ELT_SUB_PHON                 | PHON          |                    | STRING
+ ELT_SUB_ASSO                 | ASSO          |                    | STRING
+ ELT_SUB_ASSO_TYPE            | TYPE          |                    | STRING
+ ELT_SUB_ASSO_RELA            | RELA          |                    | STRING
+ ELT_SUB_CHAN                 | CHAN          |                    | NULL
+ ELT_SUB_CHAN_DATE            | DATE          |                    | DATE
+ ELT_SUB_CHAN_TIME            | TIME          |                    | STRING
+ ELT_SUB_FAMC                 | FAMC          |                    | STRING
+ ELT_SUB_FAMC_PEDI            | PEDI          |                    | STRING
+ ELT_SUB_CONT                 | CONT          |                    | STRING
+ ELT_SUB_CONC                 | CONC          |                    | STRING
+ ELT_SUB_EVT_TYPE             | TYPE          |                    | STRING
+ ELT_SUB_EVT_DATE             | DATE          |                    | DATE
+ ELT_SUB_EVT_AGE              | AGE           |                    | STRING
+ ELT_SUB_EVT_AGNC             | AGNC          |                    | STRING
+ ELT_SUB_EVT_CAUS             | CAUS          |                    | STRING
+ ELT_SUB_FAM_EVT              | ANUL, CENS, DIV, DIVF, ENGA, MARR, MARB, MARC, MARL, MARS |  | NULL, STRING
+ ELT_SUB_FAM_EVT_HUSB         | HUSB          |                    | NULL
+ ELT_SUB_FAM_EVT_WIFE         | WIFE          |                    | NULL
+ ELT_SUB_FAM_EVT_AGE          | AGE           |                    | STRING
+ ELT_SUB_FAM_EVT_EVEN         | EVEN          |                    | NULL
+ ELT_SUB_IDENT_REFN           | REFN          |                    | STRING
+ ELT_SUB_IDENT_REFN_TYPE      | TYPE          |                    | STRING
+ ELT_SUB_IDENT_RIN            | RIN           |                    | STRING
+ ELT_SUB_INDIV_ATTR           | CAST, DSCR, EDUC, IDNO, NATI, NCHR, NMR, OCCU, PROP, RELI, SSN, TITL |  | STRING
+ ELT_SUB_INDIV_RESI           | RESI          |                    | NULL
+ ELT_SUB_INDIV_BIRT           | BIRT, CHR     |                    | NULL, STRING
+ ELT_SUB_INDIV_BIRT_FAMC      | FAMC          |                    | STRING
+ ELT_SUB_INDIV_GEN            | DEAT, BURI, CREM, BAPM, BARM, BASM, BLES, CHRA, CONF, FCOM, ORDN, NATU, EMIG, IMMI, CENS, PROB, WILL, GRAD, RETI |  | NULL, STRING
+ ELT_SUB_INDIV_ADOP           | ADOP          |                    | NULL, STRING
+ ELT_SUB_INDIV_ADOP_FAMC      | FAMC          |                    | STRING
+ ELT_SUB_INDIV_ADOP_FAMC_ADOP | ADOP          |                    | STRING
+ ELT_SUB_INDIV_EVEN           | EVEN          |                    | NULL
+ ELT_SUB_LIO_BAPL             | BAPL, CONL, ENDL |                 | NULL
+ ELT_SUB_LIO_BAPL_STAT        | STAT          |                    | STRING
+ ELT_SUB_LIO_BAPL_DATE        | DATE          |                    | DATE
+ ELT_SUB_LIO_BAPL_TEMP        | TEMP          |                    | STRING
+ ELT_SUB_LIO_BAPL_PLAC        | PLAC          |                    | STRING
+ ELT_SUB_LIO_SLGC             | SLGC          |                    | NULL
+ ELT_SUB_LIO_SLGC_FAMC        | FAMC          |                    | STRING
+ ELT_SUB_LSS_SLGS             | SLGS          |                    | NULL
+ ELT_SUB_LSS_SLGS_STAT        | STAT          |                    | STRING
+ ELT_SUB_LSS_SLGS_DATE        | DATE          |                    | DATE
+ ELT_SUB_LSS_SLGS_TEMP        | TEMP          |                    | STRING
+ ELT_SUB_LSS_SLGS_PLAC        | PLAC          |                    | STRING
+ ELT_SUB_MULTIM_OBJE          | OBJE          |                    | NULL
+ ELT_SUB_MULTIM_OBJE_FORM     | FORM          |                    | STRING
+ ELT_SUB_MULTIM_OBJE_TITL     | TITL          |                    | STRING
+ ELT_SUB_MULTIM_OBJE_FILE     | FILE          |                    | STRING
+ ELT_SUB_NOTE                 | NOTE          |                    | NULL, STRING
+ ELT_SUB_PERS_NAME            | NAME          |                    | STRING
+ ELT_SUB_PERS_NAME_NPFX       | NPFX          |                    | STRING
+ ELT_SUB_PERS_NAME_GIVN       | GIVN          |                    | STRING
+ ELT_SUB_PERS_NAME_NICK       | NICK          |                    | STRING
+ ELT_SUB_PERS_NAME_SPFX       | SPFX          |                    | STRING
+ ELT_SUB_PERS_NAME_SURN       | SURN          |                    | STRING
+ ELT_SUB_PERS_NAME_NSFX       | NSFX          |                    | STRING
+ ELT_SUB_PLAC                 | PLAC          |                    | STRING
+ ELT_SUB_PLAC_FORM            | FORM          |                    | STRING
+ ELT_SUB_SOUR                 | SOUR          |                    | STRING
+ ELT_SUB_SOUR_PAGE            | PAGE          |                    | STRING
+ ELT_SUB_SOUR_EVEN            | EVEN          |                    | STRING
+ ELT_SUB_SOUR_EVEN_ROLE       | ROLE          |                    | STRING
+ ELT_SUB_SOUR_DATA            | DATA          |                    | NULL
+ ELT_SUB_SOUR_DATA_DATE       | DATE          |                    | DATE
+ ELT_SUB_SOUR_TEXT            | TEXT          |                    | STRING
+ ELT_SUB_SOUR_QUAY            | QUAY          |                    | STRING
+ ELT_SUB_REPO                 | REPO          |                    | STRING
+ ELT_SUB_REPO_CALN            | CALN          |                    | STRING
+ ELT_SUB_REPO_CALN_MEDI       | MEDI          |                    | STRING
+ ELT_SUB_FAMS                 | FAMS          |                    | STRING
+ ELT_USER                     | any tag starting with an underscore |  | NULL, STRING
+
+ +
+

Gedcom_val types
+

+ Currently, the specific Gedcom_val types are (with val + of type Gedcom_val):
+
+ + + + + + + + + + + + + + + + + + + + + + + +

+
+ Value type | type checker          | cast operator
+ null value | GEDCOM_IS_NULL(val)   | N/A
+ string     | GEDCOM_IS_STRING(val) | char* str = GEDCOM_STRING(val);
+ date       | GEDCOM_IS_DATE(val)   | struct date_value dv = GEDCOM_DATE(val);
+
+
+ The type checker returns a true or a false value according to the type +of the value, but this is in principle only necessary in the rare circumstances +that two types are possible, or where an optional value can be provided.  In +most cases, the type is fixed for a specific tag.
+
+ The null value is used when the GEDCOM spec doesn't allow a value, or
+when an optional value is allowed but none is given.
+
+ The string value is currently the most generally used value, for all those
+values that don't have a more specific meaning.  In essence, the value
+that is returned by GEDCOM_STRING is always the same as the raw_value passed
+to the start callback, and is thus in fact redundant.
+
+ The date value is used for all elements that return a date.  The
+struct date_value is described in the next section.
+
+

struct date_value

+This struct describes a date as given in the GEDCOM file, and has the following +definition:
+
struct date_value {
+  Date_value_type  type;
+  struct date      date1;
+  struct date      date2;
+  char             phrase[MAX_PHRASE_LEN + 1];
+};

+
+ Which members are actually relevant depends on the first member, the type:
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Date_value_type | Meaning                                              | Relevant members
+ DV_NO_MODIFIER  | just a simple date                                   | date1
+ DV_BEFORE       | a range (BEFORE date1)                               | date1
+ DV_AFTER        | a range (AFTER date1)                                | date1
+ DV_BETWEEN      | a range (BETWEEN date1 AND date2)                    | date1, date2
+ DV_FROM         | a period (FROM date1)                                | date1
+ DV_TO           | a period (TO date1)                                  | date1
+ DV_FROM_TO      | a period (FROM date1 TO date2)                       | date1, date2
+ DV_ABOUT        | an approximation (ABOUT date1)                       | date1
+ DV_CALCULATED   | an approximation (CALCULATED date1)                  | date1
+ DV_ESTIMATED    | an approximation (ESTIMATED date1)                   | date1
+ DV_INTERPRETED  | INTERPRETED date1 FROM a given free form date phrase | date1, phrase
+ DV_PHRASE       | a free form date phrase                              | phrase
+
+
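+ The table above can be turned into a small lookup, sketched below.  This is
+ only an illustration: the enum is a stand-in that lists the identifiers in
+ table order (the real values come from the library header), and the
+ USES_* flags and relevant_members() are hypothetical names, not part of
+ the library.

```c
#include <assert.h>

/* Stand-in for the library's enum; the order and values are assumptions. */
typedef enum {
  DV_NO_MODIFIER, DV_BEFORE, DV_AFTER, DV_BETWEEN, DV_FROM, DV_TO,
  DV_FROM_TO, DV_ABOUT, DV_CALCULATED, DV_ESTIMATED, DV_INTERPRETED,
  DV_PHRASE
} Date_value_type;

/* Hypothetical flags naming the members of struct date_value. */
enum { USES_DATE1 = 1, USES_DATE2 = 2, USES_PHRASE = 4 };

/* Return the set of relevant members for a given type, per the table. */
int relevant_members(Date_value_type type)
{
  switch (type) {
    case DV_NO_MODIFIER: case DV_BEFORE: case DV_AFTER:
    case DV_FROM: case DV_TO:
    case DV_ABOUT: case DV_CALCULATED: case DV_ESTIMATED:
      return USES_DATE1;
    case DV_BETWEEN: case DV_FROM_TO:
      return USES_DATE1 | USES_DATE2;
    case DV_INTERPRETED:
      return USES_DATE1 | USES_PHRASE;
    case DV_PHRASE:
      return USES_PHRASE;
  }
  assert(!"unknown Date_value_type");
  return 0;
}
```

+ An application would typically test the returned flags before reading
+ date1, date2 or phrase from a struct date_value.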
+

struct date
+

+The date1 and date2 members are of type struct date, which has the following
+definition:
+
struct date {
+  Calendar_type  cal;
+  char           day_str[MAX_DAY_LEN + 1];
+  char           month_str[MAX_MONTH_LEN + 1];
+  char           year_str[MAX_YEAR_LEN + 1];
+
+  int            day;
+  int            month;
+  int            year;
+  Year_type      year_type;
+
+  Date_type      type;
+  long int       sdn1;
+  long int       sdn2;
+};

+
+The first four fields are the primary fields parsed from the value in the +GEDCOM file.  The day_str, month_str and +year_str are the literal parts of the date that denote the day, month +and year.  The calendar type cal is one of (see calendar +overview LINK TBD):
+ +The next four fields are deduced from the first four:
+ +It is possible that the year_str is given as e.g. "1677/78". + This is coming from a date in a so called "annunciation style", where +the year began on 25 March, so that "20 March 1677/78" is 20 March 1677 in +"annunciation style" and 20 March 1678 in "circumcision style" (the current +style).  See calendar overview (LINK TBD).
+
+In this case, the year will contain the "circumcision style" +year (1678 in the example), and year_type will be YEAR_DOUBLE. +  Normal dates will have a year_type equal to YEAR_SINGLE +.
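+As an illustration of the double-year rule, the following sketch derives a
+year and year type from a year string.  It is not the library's parser:
+parse_year_str() is a hypothetical helper, the Year_type enum is a stand-in
+for the one in the library header, and the check that the digits after the
+slash match the following year is an extra assumption of this sketch.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for the library's enum. */
typedef enum { YEAR_SINGLE, YEAR_DOUBLE } Year_type;

/* Parse "1677" or "1677/78".  For a double year, *year receives the
   "circumcision style" year, i.e. the first number plus one.
   Returns 0 on success, -1 on malformed input. */
int parse_year_str(const char *year_str, int *year, Year_type *year_type)
{
  char *end;
  long first;

  assert(year_str != NULL && year != NULL && year_type != NULL);
  first = strtol(year_str, &end, 10);
  if (end == year_str)
    return -1;                       /* no digits at all */
  if (*end == '\0') {                /* plain year */
    *year = (int)first;
    *year_type = YEAR_SINGLE;
    return 0;
  }
  if (*end == '/') {                 /* double year such as 1677/78 */
    char *end2;
    long second = strtol(end + 1, &end2, 10);
    if (end2 == end + 1 || *end2 != '\0')
      return -1;
    if (second != (first + 1) % 100) /* "78" must match 1678 */
      return -1;
    *year = (int)(first + 1);
    *year_type = YEAR_DOUBLE;
    return 0;
  }
  return -1;
}
```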
+
+Finally, the last three fields are probably the most interesting values for +applications that want to process dates.  Basically, the date is converted +to a serial day number (aka Julian day), which is the unique day number since +November 25, 4714 BC in the Gregorian calendar.  The advantage of these +day numbers is that they are unique and independent of the calendar system. + Furthermore, date differences can just be computed by subtracting the +serial day numbers.
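+To make the idea of a serial day number concrete, here is a minimal sketch
+that converts a Gregorian date with the widely used integer formula for the
+Julian day number.  It assumes that the library's numbering coincides with
+that convention (day 1 falls on November 25, 4714 BC); gregorian_to_sdn()
+is a hypothetical helper, not a library function.

```c
#include <assert.h>

/* Convert a Gregorian calendar date to a serial day number.
   Intended for years from 1 AD onwards; month is 1..12, day is 1..31. */
long gregorian_to_sdn(int year, int month, int day)
{
  long a, y, m;

  assert(month >= 1 && month <= 12 && day >= 1 && day <= 31);
  a = (14 - month) / 12;            /* 1 for January and February, else 0 */
  y = (long)year + 4800 - a;        /* shifted year, counted from March   */
  m = month + 12 * a - 3;           /* March = 0, ..., February = 11      */
  return day + (153 * m + 2) / 5 + 365 * y + y / 4 - y / 100 + y / 400
         - 32045;
}
```

+With such a number, the difference between two dates is a plain subtraction,
+e.g. gregorian_to_sdn(1990, 3, 31) - gregorian_to_sdn(1990, 3, 1).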
+
+However, since dates in GEDCOM are not necessarily exact (e.g. "MAR 1990"), +it is not possible to represent all GEDCOM dates with 1 serial day number. + Two cases can be distinguished:
+ +
+
These are represented by a serial day number in sdn1 + and a Date_type equal to DATE_EXACT.
+
+
+ +
+
These are represented by 2 serial day numbers ( +sdn1 and sdn2) and a Date_type equal to + DATE_BOUNDED.
+
+For example, the Gregorian date "MAR 1990" is represented by the serial day +numbers for "1 MAR 1990" and "31 MAR 1990", and the Gregorian date "1990" +is represented by the serial day numbers for "1 JAN 1990" and "31 DEC 1990". + Similarly for the other calendar types.
+
+
+
+
$Id$
+ $Name$
+
+ + + diff --git a/doc/links.html b/doc/links.html new file mode 100644 index 0000000..c0018d7 --- /dev/null +++ b/doc/links.html @@ -0,0 +1,10 @@ + + + + GEDCOM links + + + +TO BE COMPLETED + + diff --git a/doc/usage.html b/doc/usage.html index f47684f..3d13ab0 100644 --- a/doc/usage.html +++ b/doc/usage.html @@ -2,306 +2,398 @@ Using the GEDCOM parser library - + - - + +

Using the GEDCOM parser library

-
- +
+

Index

+ -
-

Overview
-

- The GEDCOM parser library is built as a callback-based parser (comparable -to the SAX interface of XML).  It comes with:
+
+

Overview
+

+ The GEDCOM parser library is built as a callback-based parser (comparable + to the SAX interface of XML).  It comes with:
+ - Next to these, there is also a data directory in $PREFIX/share/gedcom-parse - that contains some additional stuff, but which is not immediately important -at first.  I'll leave the description of the data directory for later.
-
- The very simplest call of the gedcom parser is simply the following piece -of code (include of the gedcom header is assumed, as everywhere in this manual):
- + Next to these, there is also a data directory in $PREFIX/share/gedcom-parse + that contains some additional stuff, but which is not immediately important + at first.  I'll leave the description of the data directory for later.
+
+ The simplest call of the GEDCOM parser is the following piece
+ of code (inclusion of the gedcom header is assumed, as everywhere in this
+manual):
+
int result;
- ...
- result = gedcom_parse_file("myfamily.ged");
-
- Although this will not provide much information, one thing it does is parse -the entire file and return the result.  The function returns 0 on success -and 1 on failure.  No other information is available using this function + ...
+ result = gedcom_parse_file("myfamily.ged");
+ + Although this will not provide much information, one thing it does is parse +the entire file and return the result.  The function returns 0 on success +and 1 on failure.  No other information is available using this function only.
-
-The next sections will refine this to be able to have meaningful errors and -the actual data that is in the file.
-
+
+ The next sections will refine this to be able to have meaningful errors +and the actual data that is in the file.
+ +

Error handling

-Since this is a relatively simple topic, it is discussed before the actual + Since this is a relatively simple topic, it is discussed before the actual callback mechanism, although it also uses a callback...
-
-The library can be used in several different circumstances, both terminal-based -as GUI-based.  Therefore, it leaves the actual display of the error -message up to the application.  For this, the application needs to register -a callback before parsing the GEDCOM file, which will be called by the library +
+ The library can be used in several different circumstances, both terminal-based
+and GUI-based.  Therefore, it leaves the actual display of the error message
+up to the application.  For this, the application needs to register a
+callback before parsing the GEDCOM file, which will be called by the library
 on errors, warnings and messages.
-
-A typical piece of code would be:
-
void my_message_handler (Gedcom_msg_type type, +
+ A typical piece of code would be:
+ +
void my_message_handler (Gedcom_msg_type type, char *msg)
-{
-  ...
-}
-...
- gedcom_set_message_handler(my_message_handler);
-...
-result = gedcom_parse_file("myfamily.ged");

-
-In the above piece of code, my_message_handler is the callback + {
+   ...
+ }
+ ...
+ gedcom_set_message_handler(my_message_handler);
+ ...
+ result = gedcom_parse_file("myfamily.ged");

+
+ In the above piece of code, my_message_handler is the callback that will be called for errors (type=ERROR), warnings ( -type=WARNING) and messages (type=MESSAGE).  The -callback must have the signature as in the example.  For errors, the - msg passed to the callback will have the format:
+type=WARNING) and messages (type=MESSAGE).  The callback +must have the signature as in the example.  For errors, the +msg passed to the callback will have the format:
+
Error on line <lineno>: <actual_message>
-
-Note that the entire string will be properly internationalized, and encoded -in UTF-8 (see "Why UTF-8?"  LINK TBD).  Also, no newline -is appended, so that the application program can use it in any way it wants. - Warnings are similar, but use "Warning" instead of "Error".  Messages + + Note that the entire string will be properly internationalized, and encoded +in UTF-8 (see "Why UTF-8?"  LINK TBD).  Also, no newline +is appended, so that the application program can use it in any way it wants. + Warnings are similar, but use "Warning" instead of "Error".  Messages are plain text, without any prefix.
-
-With this in place, the resulting code will already show errors and warnings +
+ With this in place, the resulting code will already show errors and warnings produced by the parser, e.g. on the terminal if a simple printf - is used in the message handler.
-
+ is used in the message handler.
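+ A possible message handler is sketched below.  It adds the newline that the
+ library leaves out, sends errors and warnings to stderr, and keeps simple
+ counters.  The Gedcom_msg_type enum is a stand-in so that the sketch
+ compiles on its own (a real program takes it from the gedcom header), and
+ the two counters are hypothetical application variables.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for the library's enum; the values are assumptions. */
typedef enum { ERROR, WARNING, MESSAGE } Gedcom_msg_type;

int my_error_count = 0;     /* number of errors reported so far   */
int my_warning_count = 0;   /* number of warnings reported so far */

void my_message_handler(Gedcom_msg_type type, char *msg)
{
  assert(msg != NULL);
  switch (type) {
    case ERROR:
      my_error_count++;
      fprintf(stderr, "%s\n", msg);   /* msg carries no trailing newline */
      break;
    case WARNING:
      my_warning_count++;
      fprintf(stderr, "%s\n", msg);
      break;
    default:                          /* MESSAGE: plain text, no prefix */
      printf("%s\n", msg);
      break;
  }
}
```

+ The handler would be registered with gedcom_set_message_handler before
+ calling gedcom_parse_file, as in the example above.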
+ +

Data callback mechanism

-The most important use of the parser is of course to get the data out of -the GEDCOM file.  As already mentioned, the parser uses a callback mechanism + The most important use of the parser is of course to get the data out of +the GEDCOM file.  As already mentioned, the parser uses a callback mechanism for that.  In fact, the mechanism involves two levels.
-
-The primary level is that each of the sections in a GEDCOM file is notified -to the application code via a "start element" callback and an "end element" -callback (much like in a SAX interface for XML), i.e. when a line containing -a certain tag is parsed, the "start element" callback is called for that -tag, and when all its subordinate lines with their tags have been processed, -the "end element" callback is called for the original tag.  Since GEDCOM -is hierarchical, this results in properly nested calls to appropriate "start +
+ The primary level is that each of the sections in a GEDCOM file is notified +to the application code via a "start element" callback and an "end element" +callback (much like in a SAX interface for XML), i.e. when a line containing +a certain tag is parsed, the "start element" callback is called for that tag, +and when all its subordinate lines with their tags have been processed, the +"end element" callback is called for the original tag.  Since GEDCOM +is hierarchical, this results in properly nested calls to appropriate "start element" and "end element" callbacks.
-
-However, it would be typical for a genealogy program to support only a subset -of the GEDCOM standard, certainly a program that is still under development. - Moreover, under GEDCOM it is allowed for an application to define its -own tags, which will typically not  be supported by another application. - Still, in that case, data preservation is important; it would hardly -be accepted that information that is not understood by a certain program -is just removed.
-
-Therefore, the second level of callbacks involves a "default callback".  An -application needs to subscribe to callbacks for tags it does support, and -need to provide a "default callback" which will be called for tags it doesn't -support.  The application can then choose to just store the information -that comes via the default callback in plain textual format.
-
-After this introduction, let's see what the API looks like...
-
+
+ However, it would be typical for a genealogy program to support only a subset +of the GEDCOM standard, certainly a program that is still under development. + Moreover, under GEDCOM it is allowed for an application to define its +own tags, which will typically not  be supported by another application. + Still, in that case, data preservation is important; it would hardly +be accepted that information that is not understood by a certain program is +just removed.
+
+ Therefore, the second level of callbacks involves a "default callback".
+ An application needs to subscribe to callbacks for tags it does support,
+and needs to provide a "default callback" which will be called for tags it
+doesn't support.  The application can then choose to just store the
+information that comes via the default callback in plain textual format.
+
+ After this introduction, let's see what the API looks like...
+
+

Start and end callbacks

+

Callbacks for records
-

-As a simple example, we will get some information from the header of a GEDCOM + + As a simple example, we will get some information from the header of a GEDCOM file.  First, have a look at the following piece of code:
-
Gedcom_ctxt my_header_start_cb (int level, + +
Gedcom_ctxt my_header_start_cb (int level, Gedcom_val xref, char *tag)
-{
-  printf("The header starts\n");
-  return (Gedcom_ctxt)1;
-}
-
-void my_header_end_cb (Gedcom_ctxt self)
-{
-  printf("The header ends, context is %d\n", self);   /* context + {
+   printf("The header starts\n");
+   return (Gedcom_ctxt)1;
+ }
+
+ void my_header_end_cb (Gedcom_ctxt self)
+   printf("The header ends, context is %ld\n", (long)self);   /* context
+   printf("The header ends, context is %d\n", self);   /* context will print as "1" */
-}
-
-...
- gedcom_subscribe_to_record(REC_HEAD, my_header_start_cb, my_header_end_cb);
-...
-result = gedcom_parse_file("myfamily.ged");

-
- Using the gedcom_subscribe_to_record function, the application -requests to use the specified callbacks as start and end callback. The end -callback is optional: you can pass NULL if you are not interested -in the end callback.  The identifiers to use as first argument to the -function (here REC_HEAD) are described in TBD (use the header -file for now...).
-
-From the name of the function it becomes clear that this function is specific -to complete records.  For the separate elements in records there is -another function, which we'll see shortly.  Again, the callbacks need -to have the signatures as shown in the example.
-
-The Gedcom_ctxt type that is used as a result of the start callback -and as an argument to the end callback is vital for passing context necessary -for the application.  This type is meant to be opaque; in fact, it's -a void pointer, so you can pass anything via it.  The important thing -to know is that the context that the application returns in the start callback -will be passed in the end callback as an argument, and as we will see shortly, -also to all the directly subordinate elements of the record.
-
-The example passes a simple integer as context, but an application could -e.g. pass a struct that will contain the information for the -header.  In the end callback, the application could then e.g. do some + }
+
+ ...
+ gedcom_subscribe_to_record(REC_HEAD, my_header_start_cb, +my_header_end_cb);
+ ...
+ result = gedcom_parse_file("myfamily.ged");

+
+ Using the gedcom_subscribe_to_record function, the application +requests to use the specified callbacks as start and end callback. The end +callback is optional: you can pass NULL if you are not interested +in the end callback.  The identifiers to use as first argument to the +function (here REC_HEAD) are described in the +interface details.
+
+ From the name of the function it becomes clear that this function is specific +to complete records.  For the separate elements in records there is another +function, which we'll see shortly.  Again, the callbacks need to have +the signatures as shown in the example.
+
+ The Gedcom_ctxt type that is used as a result of the start +callback and as an argument to the end callback is vital for passing context +necessary for the application.  This type is meant to be opaque; in +fact, it's a void pointer, so you can pass anything via it.  The important +thing to know is that the context that the application returns in the start +callback will be passed in the end callback as an argument, and as we will +see shortly, also to all the directly subordinate elements of the record.
+
+ The example passes a simple integer as context, but an application could +e.g. pass a struct that will contain the information for the +header.  In the end callback, the application could then e.g. do some finalizing operations on the struct to put it in its database.
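+ Passing a struct as context could look like the sketch below.  The typedefs
+ are stand-ins so that the code compiles on its own: the text above states
+ that Gedcom_ctxt is a void pointer, while treating Gedcom_val as one is
+ purely an assumption.  The struct header and its finalized member are
+ hypothetical application data.

```c
#include <assert.h>
#include <stdlib.h>

typedef void *Gedcom_ctxt;   /* a void pointer, as described in the text */
typedef void *Gedcom_val;    /* stand-in; the value is not used here     */

/* Hypothetical application object collecting the header data. */
struct header {
  int finalized;             /* set once the end callback has run */
};

Gedcom_ctxt my_header_start_cb(int level, Gedcom_val xref, char *tag)
{
  struct header *head = calloc(1, sizeof *head);

  assert(level == 0);        /* records sit at level 0 of a GEDCOM file */
  (void)xref;
  (void)tag;
  return (Gedcom_ctxt)head;  /* NULL if the allocation failed */
}

void my_header_end_cb(Gedcom_ctxt self)
{
  struct header *head = (struct header *)self;

  if (head == NULL)
    return;
  head->finalized = 1;       /* e.g. store the object in the database */
}
```

+ The pointer returned by the start callback is the same pointer that arrives
+ in the end callback, so the application decides where the object is
+ eventually stored or freed.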
-
-(Note that the Gedcom_val type for the xref argument +
+ (Note that the Gedcom_val type for the xref argument was not discussed, see further for this)
-
+
+

Callbacks for elements

-We will now retrieve the SOUR field (the name of the program that wrote the -file) from the header:
-
Gedcom_ctxt my_header_source_start_cb(Gedcom_ctxt + We will now retrieve the SOUR field (the name of the program that wrote +the file) from the header:
+ +
Gedcom_ctxt my_header_source_start_cb(Gedcom_ctxt parent,
-                      -                int     +                       +                int         level,
-                      -                char*     +                       +                char*       tag,
-                      -                char*     +                       +                char*       raw_value,
-                      +                                       Gedcom_val  parsed_value)
-{
-  char *source = GEDCOM_STRING(parsed_value);
-  printf("This file was written by %s\n", source);
-  return parent;
-}
-
-void my_header_source_end_cb(Gedcom_ctxt parent,
-                      + {
+   char *source = GEDCOM_STRING(parsed_value);
+   printf("This file was written by %s\n", source);
+   return parent;
+ }
+
+ void my_header_source_end_cb(Gedcom_ctxt parent,
+                              Gedcom_ctxt self,
-                      +                              Gedcom_val  parsed_value)
-{
-  printf("End of the source description\n");
-}
-
-...
- gedcom_subscribe_to_element(ELT_HEAD_SOUR,
-                      + {
+   printf("End of the source description\n");
+ }
+
+ ...
+ gedcom_subscribe_to_element(ELT_HEAD_SOUR,
+                             my_header_source_start_cb,
-                      +                             my_header_source_end_cb);
-...
-result = gedcom_parse_file("myfamily.ged");

-
-The subscription mechanism for elements is similar, only the signatures of -the callbacks differ.  The signature for the start callback shows that -the context of the parent line (e.g. the struct that describes -the header) is passed to this start callback.  The callback itself returns -here the same context, but this can be its own context object of course. - The end callback is called with both the context of the parent and -the context of itself, which will be the same in the example.
-
-If we look at the other arguments of the start callback, we see the level -number (the initial number of the line in the GEDCOM file), the tag (e.g. -"SOUR"), and then a raw value and a parsed value.  The raw value is -just the raw string that occurs as value on the line next to the tag (in -UTF-8 encoding).  The parsed value is the meaningful value that is parsed -from that raw string.
-
-The Gedcom_val type is meant to be an opaque type.  The -only thing that needs to be known about it is that it can contain specific -data types, which have to be retrieved from it using pre-defined macros. - Currently, the specific types are (with val of type -Gedcom_val):
-
- - - - - - - - - - - - - - - - - - - - - - - -

-
type checker
-
cast operator
-
null value
-
GEDCOM_IS_NULL(val)
-
N/A
-
string
-
GEDCOM_IS_STRING(val)
-
char* str = GEDCOM_STRING(val);
-
date
-
GEDCOM_IS_DATE(val)
-
struct date_value dv = GEDCOM_DATE(val) -;
-
-
-The null value is used for when the GEDCOM spec doesn't allow a value, or -when an optional value is allowed but none is given.

-The string value is the most general used value currently, for all those -values that don't have a more specific meaning.  In essence, the value -that is returned by GEDCOM_STRING is always the same as the raw_value passed -to the start callback, and is thus in fact redundant.
-
-The date value is used for all elements that return a date.  (Description -of struct date_value TBD: look in the header file for the moment).
-
-The type checker returns a true or a false value according to the type of -the value, but this is in principle only necessary in the rare circumstances -that two types are possible, or where an optional value can be provided. - In most cases, the type is fixed for a specific tag (types per tag -to be described).
+ ...
+ result = gedcom_parse_file("myfamily.ged");

+
+ The subscription mechanism for elements is similar, only the signatures
+of the callbacks differ.  The signature for the start callback shows
+that the context of the parent line (e.g. the struct that describes
+the header) is passed to this start callback.  Here the callback itself
+returns the same context, but it can of course return its own context object.
+The end callback is called with both the context of the parent and the context
+of itself, which will be the same in the example.  Again, the list of
+identifiers to use as a first argument for the subscription function is
+detailed in the interface details.
+ If we look at the other arguments of the start callback, we see the level +number (the initial number of the line in the GEDCOM file), the tag (e.g. +"SOUR"), and then a raw value and a parsed value.  The raw value is just +the raw string that occurs as value on the line next to the tag (in UTF-8 +encoding).  The parsed value is the meaningful value that is parsed from +that raw string.
+
+ The Gedcom_val type is meant to be an opaque type.  The +only thing that needs to be known about it is that it can contain specific +data types, which have to be retrieved from it using pre-defined macros.  These +data types are described in the +interface details.

-Some extra notes:
+ Some extra notes:
+ +

Default callbacks
-

-TO BE COMPLETED
-
$Id$
- $Name$
-
- - - + + As described above, an application doesn't always implement the entire GEDCOM +spec, and application-specific tags may have been added by other applications. + To preserve this extra data anyway, a default callback can be registered +by the application, as in the following example:
+
void my_default_cb (Gedcom_ctxt parent, +int level, char* tag, char* raw_value)
+{
+  ...
+}
+
+...
+ gedcom_set_default_callback(my_default_cb);
+...
+result = gedcom_parse_file("myfamily.ged");

+
+ This callback has a signature similar to the previous ones, but
+it doesn't contain a parsed value.  However, it does contain the parent
+context, i.e. the context that was returned by the application for the most
+specific containing tag that the application supports.
+
+Suppose e.g. that this callback is called for some tags in the header that +are specific to some other application, then our application could make sure +that the parent context contains the struct or object that represents the +header, and use the default callback here to add the level, tag and raw_value +as plain text in a member of that struct or object, thus preserving the information. + The application can then write this out when the data is saved again +in a GEDCOM file.  To make it more specific, consider the following +example:
+
struct header {
+  char* source;
+  ...
+  char* extra_text;
+};
+
+Gedcom_ctxt my_header_start_cb(int level, Gedcom_val xref, char* tag)
+{
+  /* my_make_header_struct is an application helper returning a pointer */
+  struct header* head = my_make_header_struct();
+  return (Gedcom_ctxt)head;
+}
+
+void my_default_cb(Gedcom_ctxt parent, int level, char* tag, char* raw_value)
+{
+  struct header* head = (struct header*)parent;
+  my_header_add_to_extra_text(head, level, tag, raw_value);
+}
+
+gedcom_set_default_callback(my_default_cb);
+gedcom_subscribe_to_record(REC_HEAD, my_header_start_cb, NULL);
+...
+result = gedcom_parse_file(filename);

+
+Note that the default callback will be called for any tag that isn't specifically +subscribed upon by the application, and can thus be called in various contexts. + For simplicity, the example above doesn't take this into account (the + parent could be of different types, depending +on the context).
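+The helper my_header_add_to_extra_text used above is left to the application.
+One possible implementation is sketched here, under the assumption that the
+unsupported lines are kept as "level tag value" text lines in a growing
+string.  The reduced struct header, the int return value and the handling of
+a missing raw_value are choices of this sketch, not prescribed by the library.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Reduced version of the struct from the example above. */
struct header {
  char *source;
  char *extra_text;   /* unsupported lines as plain text, NULL when empty */
};

/* Append "<level> <tag> <raw_value>\n" to head->extra_text.
   Returns 0 on success, -1 if memory could not be allocated. */
int my_header_add_to_extra_text(struct header *head, int level,
                                const char *tag, const char *raw_value)
{
  const char *sep;
  size_t old_len, add_len;
  char *grown;

  assert(head != NULL && tag != NULL);
  if (raw_value == NULL)
    raw_value = "";                       /* line without a value */
  sep = (*raw_value != '\0') ? " " : "";

  old_len = head->extra_text ? strlen(head->extra_text) : 0;
  /* First pass only measures the length of the new line. */
  add_len = (size_t)snprintf(NULL, 0, "%d %s%s%s\n",
                             level, tag, sep, raw_value);
  grown = realloc(head->extra_text, old_len + add_len + 1);
  if (grown == NULL)
    return -1;                            /* old text is left untouched */
  snprintf(grown + old_len, add_len + 1, "%d %s%s%s\n",
           level, tag, sep, raw_value);
  head->extra_text = grown;
  return 0;
}
```

+When the data is saved again, the application can write extra_text back
+verbatim after the header lines it does understand.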
+
+

Other API functions
+

+Although the above describes the basic interface of libgedcom, there are
+some other functions that allow you to customize the behaviour of the library.
+These are explained in this section.
+

Debugging

+The library can generate various debugging output, not only from itself, +but also the debugging output generated by the yacc parser.  By default, +no debugging output is generated, but this can be customized using the following +function:
+
void gedcom_set_debug_level (int level, +FILE* trace_output)
+
+The level can be one of the following values:
+ +If the trace_output is NULL, debugging information +will be written to stderr, otherwise the given file handle is +used (which must be open).
+
+

Error treatment

+One of the previous sections already described the callback to be registered
+to get error messages.  The library also allows you to customize what happens
+on an error, using the following function:
+
void gedcom_set_error_handling (Gedcom_err_mech +mechanism)
+
+The mechanism can be one of:
+ +This doesn't influence the generation of error or warning messages, only +the behaviour of the parser and its return code.
+
+

Compatibility mode
+

+Applications are not necessarily true to the GEDCOM spec (or use a different
+version than 5.5).  The intention is that the library is resilient to
+this, and goes into compatibility mode for files written by specific programs
+(detected via the HEAD.SOUR tag).  This compatibility mode can be enabled
+and disabled via the following function:
+
void gedcom_set_compat_handling + (int enable_compat)
+
+The argument can be:
+ +Note that, currently, no actual compatibility code is present, but this is +on the to-do list.
+
$Id: usage.html,v 1.1 2001/12/30 +22:45:43 verthezp Exp $
+ $Name$
+
+ + + -- 2.30.2