From: Peter Verthez
Date: Tue, 1 Oct 2002 16:20:25 +0000 (+0000)
Subject: File to test ansel conversion.
X-Git-Url: https://git.dlugolecki.net.pl/?a=commitdiff_plain;h=3abd409f8916cc7f487eb094596b54b6b80b2806;p=gedcom-parse.git

File to test ansel conversion.
---

diff --git a/t/input/anselutf8.ged b/t/input/anselutf8.ged
new file mode 100644
index 0000000..64dea8f
--- /dev/null
+++ b/t/input/anselutf8.ged
@@ -0,0 +1,315 @@
+0 HEAD
+1 CHAR ANSEL
+1 SOUR REGISTERED_SOURCE_NAME
+1 GEDC
+2 VERS 5.5
+2 FORM Lineage-Linked
+1 NOTE This GEDCOM transmission contains a charcter set test. It consists
+2 CONT of a single family (two parents, many children). The parents are empty
+2 CONT in the ANSEL version of the transmission. The children contain the
+2 CONT combined letters and the special charcters (value > 128).
+2 CONT The NAME tag of each 'person' is the name of the characters tested
+2 CONT within the person. The BIRT.PLAC and DEAT.PLAC tags contain the
+2 CONT test-strings.
+2 CONT The first children contain special characters. Here the test string
+2 CONT is 'character name (test character), ...' where 'character name'
+2 CONT is the name of the character (like 'british pound') and
+2 CONT 'test character' is a single byte representing this character
+2 CONT in ANSEL.
+2 CONT The last children contain combined characters. The name tag gives
+2 CONT the name of the non-spacing character tested within the 'person'.
+2 CONT Within the name the hex-values of the non-spacing character is given
+2 CONT in ANSEL and UNICODE. The test strings contain the whole latin
+2 CONT alphabet combined with this non-spacing character: captial letters
+2 CONT in the BIRT.PLAC tag and small letters in the DEAT.PLAC tag.
+2 CONT Example: One 'person' is named 'circle above'. The BIRT.PLAC
+2 CONT tag contains all 26 capital letters with a small ring on top.
+2 CONT Note: Not all charcters can be displayed on all computers.
+2 CONT This strongly depends on the installed fonts and codepages.
+2 CONT Many of the combined characters generated here do not even have
+2 CONT a UNICDOE code point!
+2 CONT This file based mainly on the GEDCOM 5.5 specification
+2 CONT (see: ftp.gedcom.org/pub/genealogy/gedcom/gedcom55.zip)
+2 CONT and on an updated ANSEL description in:
+2 CONT http://www.gendex.com/gedcom55/55gcappd.htm
+1 SUBM @SUBMITTER@
+1 DATE 20 JAN 1998
+0 @SUBMITTER@ SUBM
+1 NAME /H. Eichmann/
+1 ADDR email: h.eichmann@@gmx.de
+0 @FATHER@ INDI
+1 NAME /cyrillic (not possible in ANSEL)/
+1 SEX M
+1 FAMS @FAMILY@
+0 @MOTHER@ INDI
+1 NAME /greek (not possible in ANSEL)/
+1 SEX F
+1 FAMS @FAMILY@
+0 @CHILD0@ INDI
+1 FAMC @FAMILY@
+1 NAME /Special Characters 0/
+1 BIRT
+2 PLAC slash l - uppercase (Ł), slash o - uppercase (Ø), slash d - uppercase (Đ), thorn - uppercase (Þ)
+1 DEAT
+2 PLAC ligature ae - uppercase (Æ), ligature oe - uppercase (Œ), miagkii znak (ʹ), middle dot (·), musical flat (♭)
+0 @CHILD1@ INDI
+1 FAMC @FAMILY@
+1 NAME /Special Characters 1/
+1 BIRT
+2 PLAC patent mark (®), plus-or-minus (±), hook o - uppercase (Ơ), hook u - uppercase (Ư)
+1 DEAT
+2 PLAC alif (ʾ), ayn (ʻ), slash l - lowercase (ł), slash o - lowercase (ø), slash d - lowercase (đ)
+0 @CHILD2@ INDI
+1 FAMC @FAMILY@
+1 NAME /Special Characters 2/
+1 BIRT
+2 PLAC thorn - lowercase (þ), ligature ae - lowercase (æ), ligature oe - lowercase (œ), tverdyi znak (ʺ)
+1 DEAT
+2 PLAC dotless i - lowercase (ı), british pound (£), eth (ð), hook o - lowercase (ơ), hook u - lowercase (ư)
+0 @CHILD3@ INDI
+1 FAMC @FAMILY@
+1 NAME /Special Characters 3/
+1 BIRT
+2 PLAC degree sign (°), script l (ℓ), phonograph copyright mark (℗), copyright symbol (©)
+1 DEAT
+2 PLAC musical sharp (♯), inverted question mark (¿), inverted exclamation mark (¡), es zet (ß)
+0 @CHILD4@ INDI
+1 FAMC @FAMILY@
+1 NAME code: E0 (Unicode: hook above, 0309)/low rising tone mark/
+1 BIRT
+2 PLAC ẢB̉C̉D̉ẺF̉G̉H̉ỈJ̉K̉L̉M̉N̉ỎP̉Q̉R̉S̉T̉ỦV̉W̉X̉ỶZ̉
+1 DEAT
+2 PLAC ảb̉c̉d̉ẻf̉g̉h̉ỉj̉k̉l̉m̉n̉ỏp̉q̉r̉s̉t̉ủv̉w̉x̉ỷz̉
+0 @CHILD5@ INDI
+1 FAMC @FAMILY@
+1 NAME code: E1 (Unicode: grave, 0300)/grave accent/
+1 BIRT
+2 PLAC ÀB̀C̀D̀ÈF̀G̀H̀ÌJ̀K̀L̀M̀ǸÒP̀Q̀R̀S̀T̀ÙV̀ẀX̀ỲZ̀
+1 DEAT
+2 PLAC àb̀c̀d̀èf̀g̀h̀ìj̀k̀l̀m̀ǹòp̀q̀r̀s̀t̀ùv̀ẁx̀ỳz̀
+0 @CHILD6@ INDI
+1 FAMC @FAMILY@
+1 NAME code: E2 (Unicode: acute, 0301)/acute accent/
+1 BIRT
+2 PLAC ÁB́ĆD́ÉF́ǴH́ÍJ́ḰĹḾŃÓṔQ́ŔŚT́ÚV́ẂX́ÝŹ
+1 DEAT
+2 PLAC áb́ćd́éf́ǵh́íj́ḱĺḿńóṕq́ŕśt́úv́ẃx́ýź
+0 @CHILD7@ INDI
+1 FAMC @FAMILY@
+1 NAME code: E3 (Unicode: circumflex, 0302)/circumflex accent/
+1 BIRT
+2 PLAC ÂB̂ĈD̂ÊF̂ĜĤÎĴK̂L̂M̂N̂ÔP̂Q̂R̂ŜT̂ÛV̂ŴX̂ŶẐ
+1 DEAT
+2 PLAC âb̂ĉd̂êf̂ĝĥîĵk̂l̂m̂n̂ôp̂q̂r̂ŝt̂ûv̂ŵx̂ŷẑ
+0 @CHILD8@ INDI
+1 FAMC @FAMILY@
+1 NAME code: E4 (Unicode: tilde, 0303)/tilde/
+1 BIRT
+2 PLAC ÃB̃C̃D̃ẼF̃G̃H̃ĨJ̃K̃L̃M̃ÑÕP̃Q̃R̃S̃T̃ŨṼW̃X̃ỸZ̃
+1 DEAT
+2 PLAC ãb̃c̃d̃ẽf̃g̃h̃ĩj̃k̃l̃m̃ñõp̃q̃r̃s̃t̃ũṽw̃x̃ỹz̃
+0 @CHILD9@ INDI
+1 FAMC @FAMILY@
+1 NAME code: E5 (Unicode: macron, 0304)/macron/
+1 BIRT
+2 PLAC ĀB̄C̄D̄ĒF̄ḠH̄ĪJ̄K̄L̄M̄N̄ŌP̄Q̄R̄S̄T̄ŪV̄W̄X̄ȲZ̄
+1 DEAT
+2 PLAC āb̄c̄d̄ēf̄ḡh̄īj̄k̄l̄m̄n̄ōp̄q̄r̄s̄t̄ūv̄w̄x̄ȳz̄
+0 @CHILD10@ INDI
+1 FAMC @FAMILY@
+1 NAME code: E6 (Unicode: breve, 0306)/breve/
+1 BIRT
+2 PLAC ĂB̆C̆D̆ĔF̆ĞH̆ĬJ̆K̆L̆M̆N̆ŎP̆Q̆R̆S̆T̆ŬV̆W̆X̆Y̆Z̆
+1 DEAT
+2 PLAC ăb̆c̆d̆ĕf̆ğh̆ĭj̆k̆l̆m̆n̆ŏp̆q̆r̆s̆t̆ŭv̆w̆x̆y̆z̆
+0 @CHILD11@ INDI
+1 FAMC @FAMILY@
+1 NAME code: E7 (Unicode: dot above, 0307)/dot above/
+1 BIRT
+2 PLAC ȦḂĊḊĖḞĠḢİJ̇K̇L̇ṀṄȮṖQ̇ṘṠṪU̇V̇ẆẊẎŻ
+1 DEAT
+2 PLAC ȧḃċḋėḟġḣi̇j̇k̇l̇ṁṅȯṗq̇ṙṡṫu̇v̇ẇẋẏż
+0 @CHILD12@ INDI
+1 FAMC @FAMILY@
+1 NAME code: E8 (Unicode: diaeresis, 0308)/umlaut (dieresis)/
+1 BIRT
+2 PLAC ÄB̈C̈D̈ËF̈G̈ḦÏJ̈K̈L̈M̈N̈ÖP̈Q̈R̈S̈T̈ÜV̈ẄẌŸZ̈
+1 DEAT
+2 PLAC äb̈c̈d̈ëf̈g̈ḧïj̈k̈l̈m̈n̈öp̈q̈r̈s̈ẗüv̈ẅẍÿz̈
+0 @CHILD13@ INDI
+1 FAMC @FAMILY@
+1 NAME code: E9 (Unicode: caron, 030C)/hacek/
+1 BIRT
+2 PLAC ǍB̌ČĎĚF̌ǦȞǏJ̌ǨĽM̌ŇǑP̌Q̌ŘŠŤǓV̌W̌X̌Y̌Ž
+1 DEAT
+2 PLAC ǎb̌čďěf̌ǧȟǐǰǩľm̌ňǒp̌q̌řšťǔv̌w̌x̌y̌ž
+0 @CHILD14@ INDI
+1 FAMC @FAMILY@
+1 NAME code: EA (Unicode: ring above, 030A)/circle above (angstrom)/
+1 BIRT
+2 PLAC ÅB̊C̊D̊E̊F̊G̊H̊I̊J̊K̊L̊M̊N̊O̊P̊Q̊R̊S̊T̊ŮV̊W̊X̊Y̊Z̊
+1 DEAT
+2 PLAC åb̊c̊d̊e̊f̊g̊h̊i̊j̊k̊l̊m̊n̊o̊p̊q̊r̊s̊t̊ův̊ẘx̊ẙz̊
+0 @CHILD15@ INDI
+1 FAMC @FAMILY@
+1 NAME code: EB (Unicode: ligature left half, FE20)/ligature, left half/
+1 BIRT
+2 PLAC A︠B︠C︠D︠E︠F︠G︠H︠I︠J︠K︠L︠M︠N︠O︠P︠Q︠R︠S︠T︠U︠V︠W︠X︠Y︠Z︠
+1 DEAT
+2 PLAC a︠b︠c︠d︠e︠f︠g︠h︠i︠j︠k︠l︠m︠n︠o︠p︠q︠r︠s︠t︠u︠v︠w︠x︠y︠z︠
+0 @CHILD16@ INDI
+1 FAMC @FAMILY@
+1 NAME code: EC (Unicode: ligature right half, FE21)/ligature, right half/
+1 BIRT
+2 PLAC A︡B︡C︡D︡E︡F︡G︡H︡I︡J︡K︡L︡M︡N︡O︡P︡Q︡R︡S︡T︡U︡V︡W︡X︡Y︡Z︡
+1 DEAT
+2 PLAC a︡b︡c︡d︡e︡f︡g︡h︡i︡j︡k︡l︡m︡n︡o︡p︡q︡r︡s︡t︡u︡v︡w︡x︡y︡z︡
+0 @CHILD17@ INDI
+1 FAMC @FAMILY@
+1 NAME code: ED (Unicode: comma above right, 0315)/high comma, off center/
+1 BIRT
+2 PLAC A̕B̕C̕D̕E̕F̕G̕H̕I̕J̕K̕L̕M̕N̕O̕P̕Q̕R̕S̕T̕U̕V̕W̕X̕Y̕Z̕
+1 DEAT
+2 PLAC a̕b̕c̕d̕e̕f̕g̕h̕i̕j̕k̕l̕m̕n̕o̕p̕q̕r̕s̕t̕u̕v̕w̕x̕y̕z̕
+0 @CHILD18@ INDI
+1 FAMC @FAMILY@
+1 NAME code: EE (Unicode: double acute, 030B)/double acute accent/
+1 BIRT
+2 PLAC A̋B̋C̋D̋E̋F̋G̋H̋I̋J̋K̋L̋M̋N̋ŐP̋Q̋R̋S̋T̋ŰV̋W̋X̋Y̋Z̋
+1 DEAT
+2 PLAC a̋b̋c̋d̋e̋f̋g̋h̋i̋j̋k̋l̋m̋n̋őp̋q̋r̋s̋t̋űv̋w̋x̋y̋z̋
+0 @CHILD19@ INDI
+1 FAMC @FAMILY@
+1 NAME code: EF (Unicode: candrabindu, 0310)/candrabindu/
+1 BIRT
+2 PLAC A̐B̐C̐D̐E̐F̐G̐H̐I̐J̐K̐L̐M̐N̐O̐P̐Q̐R̐S̐T̐U̐V̐W̐X̐Y̐Z̐
+1 DEAT
+2 PLAC a̐b̐c̐d̐e̐f̐g̐h̐i̐j̐k̐l̐m̐n̐o̐p̐q̐r̐s̐t̐u̐v̐w̐x̐y̐z̐
+0 @CHILD20@ INDI
+1 FAMC @FAMILY@
+1 NAME code: F0 (Unicode: cedilla, 0327)/cedilla/
+1 BIRT
+2 PLAC A̧B̧ÇḐȨF̧ĢḨI̧J̧ĶĻM̧ŅO̧P̧Q̧ŖŞŢU̧V̧W̧X̧Y̧Z̧
+1 DEAT
+2 PLAC a̧b̧çḑȩf̧ģḩi̧j̧ķļm̧ņo̧p̧q̧ŗşţu̧v̧w̧x̧y̧z̧
+0 @CHILD21@ INDI
+1 FAMC @FAMILY@
+1 NAME code: F1 (Unicode: ogonek, 0328)/right hook/
+1 BIRT
+2 PLAC ĄB̨C̨D̨ĘF̨G̨H̨ĮJ̨K̨L̨M̨N̨ǪP̨Q̨R̨S̨T̨ŲV̨W̨X̨Y̨Z̨
+1 DEAT
+2 PLAC ąb̨c̨d̨ęf̨g̨h̨įj̨k̨l̨m̨n̨ǫp̨q̨r̨s̨t̨ųv̨w̨x̨y̨z̨
+0 @CHILD22@ INDI
+1 FAMC @FAMILY@
+1 NAME code: F2 (Unicode: dot below, 0323)/dot below/
+1 BIRT
+2 PLAC ẠḄC̣ḌẸF̣G̣ḤỊJ̣ḲḶṂṆỌP̣Q̣ṚṢṬỤṾẈX̣ỴẒ
+1 DEAT
+2 PLAC ạḅc̣ḍẹf̣g̣ḥịj̣ḳḷṃṇọp̣q̣ṛṣṭụṿẉx̣ỵẓ
+0 @CHILD23@ INDI
+1 FAMC @FAMILY@
+1 NAME code: F3 (Unicode: diaeresis below, 0324)/double dot below/
+1 BIRT
+2 PLAC A̤B̤C̤D̤E̤F̤G̤H̤I̤J̤K̤L̤M̤N̤O̤P̤Q̤R̤S̤T̤ṲV̤W̤X̤Y̤Z̤
+1 DEAT
+2 PLAC a̤b̤c̤d̤e̤f̤g̤h̤i̤j̤k̤l̤m̤n̤o̤p̤q̤r̤s̤t̤ṳv̤w̤x̤y̤z̤
+0 @CHILD24@ INDI
+1 FAMC @FAMILY@
+1 NAME code: F4 (Unicode: ring below, 0325)/circle below/
+1 BIRT
+2 PLAC ḀB̥C̥D̥E̥F̥G̥H̥I̥J̥K̥L̥M̥N̥O̥P̥Q̥R̥S̥T̥U̥V̥W̥X̥Y̥Z̥
+1 DEAT
+2 PLAC ḁb̥c̥d̥e̥f̥g̥h̥i̥j̥k̥l̥m̥n̥o̥p̥q̥r̥s̥t̥u̥v̥w̥x̥y̥z̥
+0 @CHILD25@ INDI
+1 FAMC @FAMILY@
+1 NAME code: F5 (Unicode: double low line, 0333)/double underscore/
+1 BIRT
+2 PLAC A̳B̳C̳D̳E̳F̳G̳H̳I̳J̳K̳L̳M̳N̳O̳P̳Q̳R̳S̳T̳U̳V̳W̳X̳Y̳Z̳
+1 DEAT
+2 PLAC a̳b̳c̳d̳e̳f̳g̳h̳i̳j̳k̳l̳m̳n̳o̳p̳q̳r̳s̳t̳u̳v̳w̳x̳y̳z̳
+0 @CHILD26@ INDI
+1 FAMC @FAMILY@
+1 NAME code: F6 (Unicode: line below, 0332)/underscore/
+1 BIRT
+2 PLAC A̲B̲C̲D̲E̲F̲G̲H̲I̲J̲K̲L̲M̲N̲O̲P̲Q̲R̲S̲T̲U̲V̲W̲X̲Y̲Z̲
+1 DEAT
+2 PLAC a̲b̲c̲d̲e̲f̲g̲h̲i̲j̲k̲l̲m̲n̲o̲p̲q̲r̲s̲t̲u̲v̲w̲x̲y̲z̲
+0 @CHILD27@ INDI
+1 FAMC @FAMILY@
+1 NAME code: F7 (Unicode: comma below, 0326)/left hook/
+1 BIRT
+2 PLAC A̦B̦C̦D̦E̦F̦G̦H̦I̦J̦K̦L̦M̦N̦O̦P̦Q̦R̦ȘȚU̦V̦W̦X̦Y̦Z̦
+1 DEAT
+2 PLAC a̦b̦c̦d̦e̦f̦g̦h̦i̦j̦k̦l̦m̦n̦o̦p̦q̦r̦șțu̦v̦w̦x̦y̦z̦
+0 @CHILD28@ INDI
+1 FAMC @FAMILY@
+1 NAME code: F8 (Unicode: left half ring below, 031C)/right cedilla/
+1 BIRT
+2 PLAC A̜B̜C̜D̜E̜F̜G̜H̜I̜J̜K̜L̜M̜N̜O̜P̜Q̜R̜S̜T̜U̜V̜W̜X̜Y̜Z̜
+1 DEAT
+2 PLAC a̜b̜c̜d̜e̜f̜g̜h̜i̜j̜k̜l̜m̜n̜o̜p̜q̜r̜s̜t̜u̜v̜w̜x̜y̜z̜
+0 @CHILD29@ INDI
+1 FAMC @FAMILY@
+1 NAME code: F9 (Unicode: breve below, 032E)/half circle below/
+1 BIRT
+2 PLAC A̮B̮C̮D̮E̮F̮G̮ḪI̮J̮K̮L̮M̮N̮O̮P̮Q̮R̮S̮T̮U̮V̮W̮X̮Y̮Z̮
+1 DEAT
+2 PLAC a̮b̮c̮d̮e̮f̮g̮ḫi̮j̮k̮l̮m̮n̮o̮p̮q̮r̮s̮t̮u̮v̮w̮x̮y̮z̮
+0 @CHILD30@ INDI
+1 FAMC @FAMILY@
+1 NAME code: FA (Unicode: double tilde left half, FE22)/double tilde, left half/
+1 BIRT
+2 PLAC A︢B︢C︢D︢E︢F︢G︢H︢I︢J︢K︢L︢M︢N︢O︢P︢Q︢R︢S︢T︢U︢V︢W︢X︢Y︢Z︢
+1 DEAT
+2 PLAC a︢b︢c︢d︢e︢f︢g︢h︢i︢j︢k︢l︢m︢n︢o︢p︢q︢r︢s︢t︢u︢v︢w︢x︢y︢z︢
+0 @CHILD31@ INDI
+1 FAMC @FAMILY@
+1 NAME code: FB (Unicode: double tilde right half, FE23)/double tilde, right half/
+1 BIRT
+2 PLAC A︣B︣C︣D︣E︣F︣G︣H︣I︣J︣K︣L︣M︣N︣O︣P︣Q︣R︣S︣T︣U︣V︣W︣X︣Y︣Z︣
+1 DEAT
+2 PLAC a︣b︣c︣d︣e︣f︣g︣h︣i︣j︣k︣l︣m︣n︣o︣p︣q︣r︣s︣t︣u︣v︣w︣x︣y︣z︣
+0 @CHILD32@ INDI
+1 FAMC @FAMILY@
+1 NAME code: FE (Unicode: comma above, 0313)/high comma, centered/
+1 BIRT
+2 PLAC A̓B̓C̓D̓E̓F̓G̓H̓I̓J̓K̓L̓M̓N̓O̓P̓Q̓R̓S̓T̓U̓V̓W̓X̓Y̓Z̓
+1 DEAT
+2 PLAC a̓b̓c̓d̓e̓f̓g̓h̓i̓j̓k̓l̓m̓n̓o̓p̓q̓r̓s̓t̓u̓v̓w̓x̓y̓z̓
+0 @FAMILY@ FAM
+1 HUSB @FATHER@
+1 WIFE @MOTHER@
+1 CHIL @CHILD0@
+1 CHIL @CHILD1@
+1 CHIL @CHILD2@
+1 CHIL @CHILD3@
+1 CHIL @CHILD4@
+1 CHIL @CHILD5@
+1 CHIL @CHILD6@
+1 CHIL @CHILD7@
+1 CHIL @CHILD8@
+1 CHIL @CHILD9@
+1 CHIL @CHILD10@
+1 CHIL @CHILD11@
+1 CHIL @CHILD12@
+1 CHIL @CHILD13@
+1 CHIL @CHILD14@
+1 CHIL @CHILD15@
+1 CHIL @CHILD16@
+1 CHIL @CHILD17@
+1 CHIL @CHILD18@
+1 CHIL @CHILD19@
+1 CHIL @CHILD20@
+1 CHIL @CHILD21@
+1 CHIL @CHILD22@
+1 CHIL @CHILD23@
+1 CHIL @CHILD24@
+1 CHIL @CHILD25@
+1 CHIL @CHILD26@
+1 CHIL @CHILD27@
+1 CHIL @CHILD28@
+1 CHIL @CHILD29@
+1 CHIL @CHILD30@
+1 CHIL @CHILD31@
+1 CHIL @CHILD32@
+0 TRLR
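
Reviewer note: the NAME lines of CHILD4 through CHILD32 pair each ANSEL non-spacing byte with its Unicode code point, so the table can be turned into a small conversion check. The following is only a sketch under stated assumptions, not the conversion code of gedcom-parse: the function name `ansel_to_unicode` is hypothetical, only ASCII base letters plus the combining bytes listed in this file are handled (the single-byte special characters of CHILD0 to CHILD3 are left out), and the final NFC normalization is an assumption about how the precomposed letters in this file came about. It relies on the ANSEL convention that a non-spacing character precedes its base letter, whereas Unicode places it after.

```python
import unicodedata

# ANSEL non-spacing byte -> Unicode combining code point,
# copied from the NAME lines of CHILD4..CHILD32 in the test file.
ANSEL_COMBINING = {
    0xE0: 0x0309, 0xE1: 0x0300, 0xE2: 0x0301, 0xE3: 0x0302,
    0xE4: 0x0303, 0xE5: 0x0304, 0xE6: 0x0306, 0xE7: 0x0307,
    0xE8: 0x0308, 0xE9: 0x030C, 0xEA: 0x030A, 0xEB: 0xFE20,
    0xEC: 0xFE21, 0xED: 0x0315, 0xEE: 0x030B, 0xEF: 0x0310,
    0xF0: 0x0327, 0xF1: 0x0328, 0xF2: 0x0323, 0xF3: 0x0324,
    0xF4: 0x0325, 0xF5: 0x0333, 0xF6: 0x0332, 0xF7: 0x0326,
    0xF8: 0x031C, 0xF9: 0x032E, 0xFA: 0xFE22, 0xFB: 0xFE23,
    0xFE: 0x0313,
}


def ansel_to_unicode(data: bytes) -> str:
    """Hypothetical helper: convert ASCII + ANSEL combining bytes to text.

    ANSEL stores non-spacing characters before the base letter, so they
    are buffered and emitted after the next base character.
    """
    out = []
    pending = []  # combining marks waiting for their base letter
    for byte in data:
        if byte in ANSEL_COMBINING:
            pending.append(chr(ANSEL_COMBINING[byte]))
        elif byte < 0x80:
            out.append(chr(byte))
            out.extend(pending)
            pending.clear()
        else:
            # Special characters (e.g. slash l) are outside this sketch.
            raise ValueError(f"unsupported ANSEL byte 0x{byte:02X}")
    if pending:
        raise ValueError("combining byte without a following base letter")
    # Assumption: compose to NFC so that, e.g., A + ring above becomes U+00C5.
    return unicodedata.normalize("NFC", "".join(out))


if __name__ == "__main__":
    # ANSEL bytes EA 41 ("circle above" + A) become a single precomposed letter.
    print(ansel_to_unicode(b"\xeaA"))
```

Pairs with no precomposed Unicode form, such as B with hook above, stay as a base letter followed by the combining mark, which is consistent with the note in the file header that many of the combinations have no code point of their own.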