X-Git-Url: https://git.dlugolecki.net.pl/?a=blobdiff_plain;f=doc%2Fparser.html;h=4ff35cde9321cb3ee8bea78d852a1efa4c9742f2;hb=851f227679c7528a2fc079f602a1553f217fa7b2;hp=15d253b856fe9b32ef9adea9b2495b409d423687;hpb=fdcc3ac19960afc4ee198a82cb397afa34ca6a67;p=gedcom-parse.git

diff --git a/doc/parser.html b/doc/parser.html
index 15d253b..4ff35cd 100644
--- a/doc/parser.html
+++ b/doc/parser.html
@@ -1,129 +1,124 @@
- + -- If everything goes OK, you'll see that some gedcom files are parsed, -and that each parse is successful. Note that the used gedcom files -are made by Heiner -Eichmann - and are an excellent way to test gedcom parsers thoroughly.make clean
- make
- make test
-
+ If everything goes OK, you'll see that some gedcom files are parsed, + and that each parse is successful. Note that the used gedcom files + are made by Heiner + Eichmann and are an excellent way to test gedcom parsers thoroughly../configure
+ make
+ make check
+
gedcom-parse
program that is generated
- by make test
. gedcom-parse
generates is
- in UTF-8 format (more on this later), some preparation is necessary to
+ The basic testing described above doesn't show anything other than
+"Parse succeeded", which is nice, but not very interesting. Some
+more detailed tests are possible, via the testgedcom
program
+that is generated by make test
. testgedcom
generates is
+ in UTF-8 format (more on this later), some preparation is necessary to
have a full view on it. Basically, you need a terminal that understands
-and can display UTF-8 encoded characters, and you need to proper fonts installed
- to display them. I'll give some advice on this here, based on the
-Red Hat 7.1 distribution that I use, with glibc 2.2 and XFree86 4.0.x. Any
- other distribution that has the same or newer versions for these components
- should give the same results.xterm
in its unicode mode (which is supported by the
+ and can display UTF-8 encoded characters, and you need the proper fonts installed
+ to display them. I'll give some advice on this here, based on the
+ Red Hat 7.1 distribution that I use, with glibc 2.2 and XFree86 4.0.x. Any
+ other distribution that has the same or newer versions for these components
+ should give the same results.xterm
in its unicode mode (which is supported by the
xterm
coming with XFree86 4.0.x). UTF-8 capabilities
- have only recently been added to gnome-terminal
, so probably
- that is not in your distribution yet (it certainly isn't in Red Hat 7.1).xterm
in unicode mode is then e.g. (put
- everything on 1 line !):- This first sets theLANG=en_GB.UTF-8 xterm -bg 'black' -fg 'DarkGrey' -cm - -fn '-Misc-Fixed-Medium-R-SemiCondensed--13-120-75-75-C-60-ISO10646-1'
-
LANG
variable to a locale that
-uses UTF-8, and then starts xterm
with a proper Unicode font.
- Some sample UTF-8 plain text files can be found
- here
- . Just cat
them on the command line and see the result.gnome-terminal
, so probably
+ that is not in your distribution yet (it certainly isn't in Red Hat 7.1).xterm
in unicode mode is then e.g. (put
+ everything on 1 line !):gedcom-parse
- program print the values that it parses. An example of a command
- line is (in the gedcom
directory):+ This will show all tokens in the./gedcom_parse -dg t/ulhc.ged
+- TheLANG=en_GB.UTF-8 xterm -bg 'black' -fg 'DarkGrey' -cm + -fn '-Misc-Fixed-Medium-R-SemiCondensed--13-120-75-75-C-60-ISO10646-1'
-dg
option instructs the parser to show its own debug - messages (see./gedcom_parse -h
for the full set of -options). If everything is OK, you'll see the values from the gedcom -file, containing a lot of special characters.
+ This first sets theLANG
variable to a locale that + uses UTF-8, and then startsxterm
with a proper Unicode font. + Some sample UTF-8 plain text files can be found + here . Justcat
them on the command line +and see the result.
- For the ANSEL test file (t/ansel.ged
), you have to set the - environment variableGCONV_PATH
to theansel
subdirectory - of the gedcom directory:
- -- This is because for the ANSEL character set an extra module is needed -for the iconv library (more on this later). But again, this should -show a lot of special characters.export GCONV_PATH=./ansel
- ./gedcom_parse -dg t/ansel.ged
-
+ +Testing the parser with debugging
+ Given the UTF-8 capable terminal, you can now let thetestgedcom
+ program print the values that it parses. An example of a command + line is (in thegedcom
directory):
+ ++ The./testgedcom -dg t/ulhc.ged
+-dg
option instructs the parser to show its own debug + messages (see./testgedcom -h
for the full set of options). + If everything is OK, you'll see the values from the gedcom file, containing + a lot of special characters.
- + For the ANSEL test file (t/ansel.ged
), you have to set +the environment variableGCONV_PATH
to theansel
+ subdirectory of the gedcom directory:
+ ++ This is because for the ANSEL character set an extra module is needed + for the iconv library (more on this later). But again, this should + show a lot of special characters.export GCONV_PATH=./ansel
+ ./testgedcom -dg t/ansel.ged
+
+
+Testing the lexers separately
- The lexers themselves can be tested separately. For the 1-byte
-lexer (i.e. supporting the encodings with 1 byte per characters, such as
-ASCII, ANSI and ANSEL), the sequence of commands would be:
+ The lexers themselves can be tested separately. For the 1-byte
+ lexer (i.e. supporting the encodings with 1 byte per character, such
+as ASCII, ANSI and ANSEL), the sequence of commands would be:
+-This will show all tokens in themake clean
- make test_1byte
-t/allged.ged
test file. Similar -tests can be done usingmake test_hilo
andmake test_lohi
- (for the unicode lexers).
-
- This concludes the testing setup. Now for some explanations...
-
- + make test_1byte
+
t/allged.ged
test file. Similar
+ tests can be done using make test_hilo
and make test_lohi
+ (for the unicode lexers).$Id$+
$Name$