
PCREPARTIAL(3)





NAME

       PCRE - Perl-compatible regular expressions


PARTIAL MATCHING IN PCRE


       In  normal  use  of  PCRE,  if  the  subject  string  that is passed to
       pcre_exec() matches as far as it goes, but is too short  to  match  the
       entire pattern, PCRE_ERROR_NOMATCH is returned. There are circumstances
       where it might be helpful to distinguish this case from other cases  in
       which there is no match.

       Consider, for example, an application where a human is required to type
       in data for a field with specific formatting requirements.  An  example
       might be a date in the form ddmmmyy, defined by this pattern:

         ^\d?\d(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d\d$

       If the application sees the user's keystrokes one by one, and can check
       that what has been typed so far is potentially valid,  it  is  able  to
       raise  an  error as soon as a mistake is made, possibly beeping and not
       reflecting the character that has been typed. This  immediate  feedback
       is  likely  to  be a better user interface than a check that is delayed
       until the entire string has been entered.

       PCRE supports partial matching by means of the PCRE_PARTIAL option,
       which can be set when calling pcre_exec(). When this option is set,
       the return code PCRE_ERROR_NOMATCH is converted into
       PCRE_ERROR_PARTIAL if at any time during the matching process the
       entire subject string matched part of the pattern. No captured data
       is set when this occurs.
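
       The three outcomes (complete match, partial match, no match) can be
       illustrated for the date pattern above without the library. The
       following is a minimal sketch that hand-codes that one pattern. It
       mimics the results that PCRE_PARTIAL distinguishes and says nothing
       about how PCRE works internally. The names check_date and DATE_* are
       invented for this example, and reporting an empty subject as a
       partial match is a choice made by the sketch, not documented PCRE
       behaviour.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical result codes for this sketch; these are not PCRE constants. */
enum date_result { DATE_NOMATCH = 0, DATE_PARTIAL = 1, DATE_FULL = 2 };

/* Consume exactly n digits starting at *pos. */
static enum date_result eat_digits(const char *s, size_t *pos, int n)
{
    for (int i = 0; i < n; i++) {
        if (s[*pos] == '\0')
            return DATE_PARTIAL;        /* subject ended inside the pattern */
        if (!isdigit((unsigned char)s[*pos]))
            return DATE_NOMATCH;
        (*pos)++;
    }
    return DATE_FULL;
}

/* Consume one lower-case three-letter month name starting at *pos. */
static enum date_result eat_month(const char *s, size_t *pos)
{
    static const char *const months[] = {
        "jan", "feb", "mar", "apr", "may", "jun",
        "jul", "aug", "sep", "oct", "nov", "dec"
    };
    for (size_t m = 0; m < 12; m++) {
        size_t k = 0;
        while (k < 3 && s[*pos + k] == months[m][k])
            k++;
        if (k == 3) {                   /* whole month name matched */
            *pos += 3;
            return DATE_FULL;
        }
        if (s[*pos + k] == '\0')        /* subject is a prefix of a month */
            return DATE_PARTIAL;
    }
    return DATE_NOMATCH;
}

/* Try the pattern with a day field of the given width (1 or 2 digits). */
static enum date_result try_day_width(const char *s, int day_digits)
{
    size_t pos = 0;
    enum date_result r;

    if ((r = eat_digits(s, &pos, day_digits)) != DATE_FULL) return r;
    if ((r = eat_month(s, &pos)) != DATE_FULL) return r;
    if ((r = eat_digits(s, &pos, 2)) != DATE_FULL) return r;
    return s[pos] == '\0' ? DATE_FULL : DATE_NOMATCH;
}

/* Best outcome over both day widths, since \d?\d allows either. */
enum date_result check_date(const char *s)
{
    assert(s != NULL);
    enum date_result one = try_day_width(s, 1);
    enum date_result two = try_day_width(s, 2);
    return one > two ? one : two;
}
```

       With this sketch, check_date("25dec3") yields DATE_PARTIAL because
       the subject runs out before the final digit, while check_date("3juj")
       yields DATE_NOMATCH because no month name begins with "juj".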

       Using PCRE_PARTIAL disables one of PCRE's optimizations. PCRE remembers
       the last literal byte in a pattern, and abandons  matching  immediately
       if  such a byte is not present in the subject string. This optimization
       cannot be used for a subject string that might match only partially.


RESTRICTED PATTERNS FOR PCRE_PARTIAL


       Because of the way certain internal optimizations  are  implemented  in
       PCRE,  the  PCRE_PARTIAL  option  cannot  be  used  with  all patterns.
       Repeated single characters such as

         a{2,4}

       and repeated single metasequences such as

         \d+

       are not permitted if the maximum number of occurrences is greater  than
       one.  Optional items such as \d? (where the maximum is one) are permit-
       ted.  Quantifiers with any values are permitted after  parentheses,  so
       the invalid examples above can be coded thus:

         (a){2,4}
         (\d)+

       These  constructions  run more slowly, but for the kinds of application
       envisaged for this facility, such as checking interactive  input,  the
       extra cost is unlikely to be a major restriction.

       If  PCRE_PARTIAL  is  set  for  a  pattern that does not conform to the
       restrictions, pcre_exec() returns the error code  PCRE_ERROR_BADPARTIAL
       (-13).
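
       An application that checks input keystroke by keystroke has to treat
       each of these return codes differently. The following is a minimal
       sketch of that decision, assuming rc is the value returned by
       pcre_exec() when called with PCRE_PARTIAL on the text typed so far.
       The function name keystroke_ok and its return convention are invented
       for this example. The numeric values of PCRE_ERROR_NOMATCH (-1) and
       PCRE_ERROR_PARTIAL (-12) are given as they appear in pcre.h and are
       repeated here only so that the sketch compiles without the library; a
       real program should include <pcre.h> instead.

```c
#include <assert.h>
#include <stddef.h>

/* Fallback definitions, used only when <pcre.h> has not been included. */
#ifndef PCRE_ERROR_NOMATCH
#define PCRE_ERROR_NOMATCH     (-1)
#endif
#ifndef PCRE_ERROR_PARTIAL
#define PCRE_ERROR_PARTIAL     (-12)
#endif
#ifndef PCRE_ERROR_BADPARTIAL
#define PCRE_ERROR_BADPARTIAL  (-13)
#endif

/*
 * Decide what to do with the latest keystroke (hypothetical helper).
 * Returns  1 to keep the keystroke,
 *          0 to reject it (for example, beep and do not echo it),
 *         -1 if the match call itself failed.
 * *complete is set to 1 only when the whole pattern has matched.
 */
int keystroke_ok(int rc, int *complete)
{
    assert(complete != NULL);
    *complete = 0;

    if (rc >= 0) {              /* complete match (0: ovector too small) */
        *complete = 1;
        return 1;
    }
    switch (rc) {
    case PCRE_ERROR_PARTIAL:    /* valid so far, more input needed */
        return 1;
    case PCRE_ERROR_NOMATCH:    /* cannot become valid: reject the key */
        return 0;
    case PCRE_ERROR_BADPARTIAL: /* pattern breaks the restrictions above */
    default:                    /* any other negative code is an error */
        return -1;
    }
}
```

       A return of -1 for PCRE_ERROR_BADPARTIAL signals a fault in the
       pattern rather than in the user's input, so it is better reported to
       the developer than to the person typing.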


EXAMPLE OF PARTIAL MATCHING USING PCRETEST


       If  the  escape  sequence  \P  is  present in a pcretest data line, the
       PCRE_PARTIAL flag is used for the match. Here is a run of pcretest that
       uses the date example quoted above:

           re> /^\d?\d(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d\d$/
         data> 25jun04\P
          0: 25jun04
          1: jun
         data> 25dec3\P
         Partial match
         data> 3ju\P
         Partial match
         data> 3juj\P
         No match
         data> j\P
         No match

       The  first  data  string  is  matched completely, so pcretest shows the
       matched substrings. The remaining four strings do not  match  the  com-
       plete pattern, but the first two are partial matches.

Last updated: 08 September 2004
Copyright (c) 1997-2004 University of Cambridge.

                                                                       PCRE(3)
