Updates for string and UTF-8 functions.

author Peter Verthez <Peter.Verthez@advalvas.be>

Fri, 22 Nov 2002 21:32:34 +0000 (21:32 +0000)

committer Peter Verthez <Peter.Verthez@advalvas.be>

Fri, 22 Nov 2002 21:32:34 +0000 (21:32 +0000)
diff --git a/doc/gom.html b/doc/gom.html

index f36dbd0c33ab5c7be3ac8c22fed8f4a2c06167d0..a0829bcf474a82f829f455a8bd8faae82412df98 100644 (file)
--- a/doc/gom.html
+++ b/doc/gom.html
@@ -134,16 +134,16 @@ or locale-defined strings.<br>
  <br>
  The following functions retrieve and set the string in UTF-8 encoding:<br>
  <blockquote><code>char* <b>gom_get_string</b> (char* data);<br>
-char* <b>gom_set_string</b> (char** data, const char* utf8_str);</code><br>
+char* <b>gom_set_string</b> (char** data, const char* str_in_utf8);</code><br>
  </blockquote>
  The first function is in fact superfluous, because it just returns the <code>data</code>, but it is there for symmetry with the functions given below for the locale-defined input and output. &nbsp;<br>
  <br>
  The second function returns the new value if successful, or <code>NULL</code>
-if an error occurred (e.g. failure to allocate memory). &nbsp;It makes a
+if an error occurred (e.g. failure to allocate memory or the given string is not a valid UTF-8 string). &nbsp;It makes a
  copy of the input string to store it in the object model. &nbsp;It also takes
  care of deallocating the old value of the data if needed. &nbsp;Note that
  the set function needs the address of the data variable, to be able to modify
-it.<br>
+it. &nbsp;In the case of an error, the target data variable is not modified.<br>
  <br>
  Examples of use of these strings would be, e.g. for retrieving and setting the system ID in the header:<br>
  <blockquote><code>struct header* head = gom_get_header();</code><code></code><br>
@@ -157,16 +157,18 @@ char* newvalue = "My_Gedcom_Tool";<br>
  <br>
  A second couple of functions retrieve and set the string in the format defined by the current locale:<br>
  <blockquote><code>char* <b>gom_get_string_for_locale</b> (char* data, int* conversion_failures);<br>
-char* <b>gom_set_string_for_locale</b> (char** data, const char* locale_str)</code>;<br>
+char* <b>gom_set_string_for_locale</b> (char** data, const char* str_in_locale)</code>;<br>
  </blockquote>
  The use of these functions is the same as the previous ones, but e.g. in
  the "en_US" locale the string will be returned by the first function in the
-ISO-8859-1 encoding and the second function expects the <code>locale_str</code> to be in this encoding. &nbsp;Conversion to and from UTF-8 for the object model is done on the fly.<br>
+ISO-8859-1 encoding and the second function expects the <code>str_in_locale</code> to be in this encoding. &nbsp;Conversion to and from UTF-8 for the object model is done on the fly.<br>
  <br>
  Since the conversion from UTF-8 to the locale encoding is not always possible,
  the get function has a second parameter that can return the number of conversion
  failures for the result string. &nbsp;Pass a pointer to an integer if you
-want to know this. &nbsp;You can pass <code>NULL</code> if you're not interested.<br>
+want to know this. &nbsp;You can pass <code>NULL</code> if you're not interested. &nbsp;The function returns <code>NULL</code>
+if an error occurred (e.g. if the given string is not a valid string for
+the current locale); in that case the target data variable is not modified.<br>
  <hr width="100%" size="2">
  <pre><font size="-1">$Id$<br>$Name$</font><br></pre>
  
@@ -179,4 +181,6 @@ want to know this. &nbsp;You can pass <code>NULL</code> if you're not interested
  <br>
  <br>
  <br>
+<br>
+<br>
  </body></html>
 \ No newline at end of file
diff --git a/doc/usage.html b/doc/usage.html

index 00bddefbfd66c45b1cc5413ff52315aab964ce9c..0a8c7e85c5ece5d1352d099348de28d8366d81f0 100644 (file)
--- a/doc/usage.html
+++ b/doc/usage.html
@@ -516,17 +516,29 @@ controls the <code>gettext</code>  mechanism in the application. &nbsp;<br>
                         <br>
                                                                          
                                          The source distribution of <code>
-gedcom-parse</code>   contains an example implementation (<code>utf8-locale.c</code>
- and <code>  utf8-locale.h</code>  in the "t" subdirectory of the top directory).&nbsp;
-&nbsp;Feel free to use  it in your source code (it is not part of the library,
-and it isn't installed  anywhere, so you need to take over the source and
-header file in your application).  &nbsp;<br>
+gedcom-parse</code>   contains a library implementing help functions for UTF-8 encoding (see
+the "utf8" subdirectory of the top directory).&nbsp; &nbsp;Feel free to use
+ it in your source code. &nbsp;It isn't installed  anywhere, so you need
+to take over the source and header files in your application. Note that on
+some systems it uses libcharset, which is also included in this subdirectory.
+ &nbsp;<br>
                         <br>
-    Its interface is:<br>
+    Its interface contains first of all the following two help functions:<br>
                           
  <blockquote>      
-  <pre><code>char *<b>convert_utf8_to_locale</b> (char *input, int *conv_failures);<br>char *<b>convert_locale_to_utf8</b> (char *input);<br></code></pre>
+  <pre><code>int   <b>is_utf8_string</b> (char *input);<br>int   <b>utf8_strlen</b> (char *input);<br></code></pre></blockquote>The
+first one returns 1 if the given input is a valid UTF-8 string and 0
+otherwise; the second gives the number of UTF-8 characters in the given
+input. &nbsp;Note that the second function assumes that the input is valid
+UTF-8, and gives unpredictable results if it isn't.<br>
+<br>
+For conversion, the following functions are available:<br>
+<blockquote>
+  <pre><code></code><code>char *<b>convert_utf8_to_locale</b> (char *input, int *conv_failures);<br>char *<b>convert_locale_to_utf8</b> (char *input);<br></code></pre>
+</blockquote>
+<blockquote>
    </blockquote>
+
      Both functions return a pointer to a static buffer that is overwritten
   on each call. &nbsp;To function properly, the application must first set
  the locale using the <code>setlocale</code> function (the second step detailed
@@ -674,9 +686,9 @@ handle needs to be closed (when the program exits):<br>
    <blockquote>                                                     
      <pre><code>iconv_close(iconv_handle);<br></code></pre>
                                               </blockquote>
-                                             </blockquote>
-                                                  The example implementation 
- mentioned above grows the output buffer dynamically and outputs "?" for characters
+                                             </blockquote> 
+                                                  The example implementation
+mentioned above grows the output buffer dynamically and outputs "?" for characters 
   that can't be converted.<br>
                                                                          
                           
@@ -730,4 +742,5 @@ There are three preprocessor symbols defined for version checks in the
  <br>
  <br>
  <br>
+<br>
  </body></html>
 \ No newline at end of file
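
The behaviour documented in this commit (validity check, character count, and a setter that leaves its target untouched on error) can be illustrated with a minimal C sketch. It is not the code from the "utf8" subdirectory or from the object model: the `sketch_` names are hypothetical stand-ins for `is_utf8_string`, `utf8_strlen` and `gom_set_string`, and the setter assumes the old value is either `NULL` or heap-allocated.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for is_utf8_string: 1 if input is valid UTF-8, else 0. */
int sketch_is_utf8(const char *input)
{
    const unsigned char *p = (const unsigned char *)input;

    if (p == NULL)
        return 0;
    while (*p) {
        unsigned long cp;   /* decoded code point */
        int extra;          /* number of continuation bytes expected */

        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
        else return 0;      /* stray continuation byte or invalid lead byte */
        p++;
        for (int i = 0; i < extra; i++, p++) {
            if ((*p & 0xC0) != 0x80)
                return 0;   /* also catches a string that ends too early */
            cp = (cp << 6) | (*p & 0x3F);
        }
        /* Reject overlong encodings, surrogates and values above U+10FFFF. */
        if ((extra == 1 && cp < 0x80) || (extra == 2 && cp < 0x800) ||
            (extra == 3 && cp < 0x10000))
            return 0;
        if ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
            return 0;
    }
    return 1;
}

/* Hypothetical stand-in for utf8_strlen: counts every byte that is not a
   continuation byte. Only meaningful if the input is valid UTF-8. */
int sketch_utf8_strlen(const char *input)
{
    int count = 0;

    assert(input != NULL);
    for (const unsigned char *p = (const unsigned char *)input; *p; p++)
        if ((*p & 0xC0) != 0x80)
            count++;
    return count;
}

/* Hypothetical stand-in for gom_set_string: stores a copy of the input.
   On any error it returns NULL and leaves *data as it was. */
char *sketch_set_string(char **data, const char *str_in_utf8)
{
    char *copy;

    if (data == NULL || !sketch_is_utf8(str_in_utf8))
        return NULL;
    copy = malloc(strlen(str_in_utf8) + 1);
    if (copy == NULL)
        return NULL;
    strcpy(copy, str_in_utf8);
    free(*data);            /* assumes *data is NULL or came from malloc */
    *data = copy;
    return copy;
}
```

Validating before allocating means both failure paths (invalid UTF-8, out of memory) return before the old value is freed, which is what keeps the target variable unmodified on error.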