Text Processing Support Routines

Generic text processing engine.

Support for text processing:

  • converting to UTF-8

  • converting from UTF-8

  • stripping

  • splitting and concatenating
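
The two UTF-8 conversions listed above can be illustrated with a minimal sketch. It is not taken from the YAP sources: the names utf8_encode and utf8_decode are hypothetical, and the code only shows the standard UTF-8 bit layout for a single code point, rejecting surrogates, overlong forms, and values above U+10FFFF.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a YAP function): encode one code point as UTF-8.
 * out must have room for 4 bytes. Returns the number of bytes written,
 * or 0 if cp is a surrogate or lies above U+10FFFF. */
size_t utf8_encode(uint32_t cp, unsigned char *out) {
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF) return 0; /* surrogate range */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/* Hypothetical helper (not a YAP function): decode one code point from the
 * first bytes of s (n bytes available). Stores the result in *cp and returns
 * the number of bytes consumed, or 0 on truncated or malformed input. */
size_t utf8_decode(const unsigned char *s, size_t n, uint32_t *cp) {
    size_t len, i;
    uint32_t v, min;

    if (n == 0) return 0;
    if (s[0] < 0x80) { *cp = s[0]; return 1; }
    if ((s[0] & 0xE0) == 0xC0)      { len = 2; v = s[0] & 0x1F; min = 0x80; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; v = s[0] & 0x0F; min = 0x800; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; v = s[0] & 0x07; min = 0x10000; }
    else return 0;                  /* stray continuation or invalid lead byte */

    if (n < len) return 0;          /* sequence is cut short */
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0; /* not a continuation byte */
        v = (v << 6) | (uint32_t)(s[i] & 0x3F);
    }
    /* Reject overlong encodings, out-of-range values, and surrogates. */
    if (v < min || v > 0x10FFFF || (v >= 0xD800 && v <= 0xDFFF)) return 0;
    *cp = v;
    return len;
}
```

Returning 0 for any invalid input keeps error handling in the caller, which can then decide whether to raise an error or substitute a replacement character.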

Functions:

1. char_kind_t Yap_wide_chtype(int ch):

Unicode general categories:

  • Other, not assigned

  • Letter, uppercase

  • Letter, lowercase

  • Letter, titlecase

  • Letter, modifier

  • Letter, other

  • Mark, nonspacing

  • Mark, spacing combining

  • Mark, enclosing

  • Number, decimal digit

  • Number, letter

  • Number, other

  • Punctuation, connector

  • Punctuation, dash

  • Punctuation, open

  • Punctuation, close

  • Punctuation, initial quote

  • Punctuation, final quote

  • Punctuation, other

  • Symbol, math

  • Symbol, currency

  • Symbol, modifier

    (unsure in YAP, let's assume a,c is treated as aƧ)

  • Symbol, other

  • Separator, space

  • Separator, line

  • Separator, paragraph

  • Other, control

  • Other, format

  • Other, surrogate

  • Other, private use
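
As a rough illustration of how the thirty categories above group into seven major classes, the sketch below collapses a category to its class letter. The enum gen_cat, its constant names, and the function major_class are all hypothetical; the real mapping from a code point to char_kind_t in YAP is not shown in this listing and may differ.

```c
/* Hypothetical enum (not YAP's): one constant per Unicode general category,
 * in the same order as the list above. */
enum gen_cat {
    CAT_CN,                                                  /* Other, not assigned */
    CAT_LU, CAT_LL, CAT_LT, CAT_LM, CAT_LO,                  /* Letters */
    CAT_MN, CAT_MC, CAT_ME,                                  /* Marks */
    CAT_ND, CAT_NL, CAT_NO,                                  /* Numbers */
    CAT_PC, CAT_PD, CAT_PS, CAT_PE, CAT_PI, CAT_PF, CAT_PO,  /* Punctuation */
    CAT_SM, CAT_SC, CAT_SK, CAT_SO,                          /* Symbols */
    CAT_ZS, CAT_ZL, CAT_ZP,                                  /* Separators */
    CAT_CC, CAT_CF, CAT_CS, CAT_CO                           /* Other */
};

/* Return the major class letter of a category: 'L' letter, 'M' mark,
 * 'N' number, 'P' punctuation, 'S' symbol, 'Z' separator, 'C' other.
 * Returns '?' for a value outside the enum. */
char major_class(enum gen_cat c) {
    switch (c) {
    case CAT_LU: case CAT_LL: case CAT_LT: case CAT_LM: case CAT_LO:
        return 'L';
    case CAT_MN: case CAT_MC: case CAT_ME:
        return 'M';
    case CAT_ND: case CAT_NL: case CAT_NO:
        return 'N';
    case CAT_PC: case CAT_PD: case CAT_PS: case CAT_PE:
    case CAT_PI: case CAT_PF: case CAT_PO:
        return 'P';
    case CAT_SM: case CAT_SC: case CAT_SK: case CAT_SO:
        return 'S';
    case CAT_ZS: case CAT_ZL: case CAT_ZP:
        return 'Z';
    case CAT_CN: case CAT_CC: case CAT_CF: case CAT_CS: case CAT_CO:
        return 'C';
    }
    return '?';
}
```

A tokenizer would typically refine this further, for example by treating uppercase letters differently from lowercase ones, but that policy is specific to the language being read.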

1. static Term Globalize(Term v USES_REGS):

1. static void * codes2buf(Term t0, void *b0, bool get_codes, bool fixed USES_REGS):

1. static unsigned char * latin2utf8(seq_tv_t *inp):

1. static unsigned char * wchar2utf8(seq_tv_t *inp):

1. static void * slice(ssize_t min, ssize_t max, const unsigned char *buf USES_REGS):

1. unsigned char * Yap_ListOfCodesToBuffer(unsigned char *buf, Term t, seq_tv_t *inp USES_REGS):

1. unsigned char * Yap_ListOfCharsToBuffer(unsigned char *buf, Term t, seq_tv_t *inp USES_REGS):

1. static unsigned char * Yap_ListToBuffer(unsigned char *buf, Term t, seq_tv_t *inp USES_REGS):

1. static yap_error_number gen_type_error(int flags):

1. unsigned char * Yap_readText(seq_tv_t *inp USES_REGS):

1. static Term write_strings(unsigned char *s0, seq_tv_t *out USES_REGS):

1. static Term write_atoms(void *s0, seq_tv_t *out USES_REGS):

1. static Term write_codes(void *s0, seq_tv_t *out USES_REGS):

1. static Atom write_atom(void *s0, seq_tv_t *out USES_REGS):

1. void * write_buffer(unsigned char *s0, seq_tv_t *out USES_REGS):

1. static size_t write_length(const unsigned char *s0, seq_tv_t *out USES_REGS):

1. static Term write_number(unsigned char *s, seq_tv_t *out USES_REGS):

1. static Term string_to_term(void *s, seq_tv_t *out USES_REGS):

1. Term Yap_StringToNumberTerm(const char *s, encoding_t *encp, bool error_on):

1. bool write_Text(unsigned char *inp, seq_tv_t *out USES_REGS):

1. static size_t upcase(void *s0, seq_tv_t *out USES_REGS):

1. static size_t downcase(void *s0, seq_tv_t *out USES_REGS):

1. bool Yap_CVT_Text(seq_tv_t *inp, seq_tv_t *out USES_REGS):

1. bool Yap_Concat_Text(int tot, seq_tv_t inp[], seq_tv_t *out USES_REGS):

1. bool Yap_Splice_Text(int n, ssize_t cuts[], seq_tv_t *inp, seq_tv_t outv[] USES_REGS):

1. const char * Yap_PredIndicatorToUTF8String(PredEntry *ap, char *s0, size_t sz): Convert a predicate structure to a UTF-8 string of the form:

module:name/arity

The result is in very volatile memory.

s0: the buffer
return: the temporary string
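
The module:name/arity layout can be sketched with snprintf writing into a caller-supplied buffer. This is an illustration only: format_indicator is a hypothetical helper that takes plain C strings and an arity instead of a PredEntry, and it does not reflect how YAP extracts those fields or manages its temporary memory.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not a YAP function): write "module:name/arity" into
 * buf, which has capacity sz bytes. Returns buf on success, or NULL if the
 * buffer is missing, the text does not fit, or formatting fails. */
const char *format_indicator(const char *module, const char *name,
                             unsigned arity, char *buf, size_t sz) {
    int n;

    if (buf == NULL || sz == 0) return NULL;
    n = snprintf(buf, sz, "%s:%s/%u", module, name, arity);
    if (n < 0 || (size_t)n >= sz) return NULL; /* error or truncated output */
    return buf;
}
```

Because the returned pointer is simply the caller's buffer, the text stays valid only until that buffer is reused; a caller that needs the string later should copy it first.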