NAME
PCRE - Perl-compatible regular expressions
PARTIAL MATCHING IN PCRE
In normal use of PCRE, if the subject string that is passed to
pcre_exec() matches as far as it goes, but is too short to match the
entire pattern, PCRE_ERROR_NOMATCH is returned. There are circumstances
where it might be helpful to distinguish this case from other cases in
which there is no match.
Consider, for example, an application where a human is required to type
in data for a field with specific formatting requirements. An example
might be a date in the form ddmmmyy, defined by this pattern:
^\d?\d(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d\d$
If the application sees the user's keystrokes one by one, and can check
that what has been typed so far is potentially valid, it is able to
raise an error as soon as a mistake is made, possibly beeping and not
reflecting the character that has been typed. This immediate feedback
is likely to be a better user interface than a check that is delayed
until the entire string has been entered.
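As a rough illustration of the check such an application needs, the sketch below classifies what has been typed so far against the ddmmmyy format. It is hand-written C for this one pattern only and does not use PCRE. The names classify_date, classify_layout and the match_result enum are hypothetical, and treating an empty field as a valid prefix is a choice made by the sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Three-way outcome, ordered so that a larger value is the better result. */
enum match_result { MATCH_NONE = 0, MATCH_PARTIAL = 1, MATCH_FULL = 2 };

static const char *const MONTHS[12] = {
    "jan", "feb", "mar", "apr", "may", "jun",
    "jul", "aug", "sep", "oct", "nov", "dec"
};

/* Does text[0..n) agree with the first n characters of some month name? */
static int is_month_prefix(const char *text, size_t n)
{
    size_t i;
    for (i = 0; i < 12; i++) {
        if (strncmp(text, MONTHS[i], n) == 0)
            return 1;
    }
    return 0;
}

/* Classify s against one fixed layout: day_digits digits, a month name,
   then two year digits. */
static enum match_result classify_layout(const char *s, size_t day_digits)
{
    size_t len = strlen(s);
    size_t total = day_digits + 3 + 2;    /* length of a complete date */
    size_t i, month_seen;

    if (len > total)
        return MATCH_NONE;                /* trailing characters violate $ */

    /* Every character outside the month field must be a digit. */
    for (i = 0; i < len; i++) {
        int in_month = (i >= day_digits && i < day_digits + 3);
        if (!in_month && !isdigit((unsigned char)s[i]))
            return MATCH_NONE;
    }

    /* Whatever has been typed of the month must start some month name. */
    if (len > day_digits) {
        month_seen = len - day_digits;
        if (month_seen > 3)
            month_seen = 3;
        if (!is_month_prefix(s + day_digits, month_seen))
            return MATCH_NONE;
    }
    return len == total ? MATCH_FULL : MATCH_PARTIAL;
}

/* The day may have one or two digits, so try both layouts and keep the
   better outcome. */
enum match_result classify_date(const char *s)
{
    enum match_result one, two;
    assert(s != NULL);
    one = classify_layout(s, 1);
    two = classify_layout(s, 2);
    return one > two ? one : two;
}
```

Under these assumptions classify_date("25jun04") yields MATCH_FULL, classify_date("3ju") yields MATCH_PARTIAL, and classify_date("3juj") yields MATCH_NONE, so an input loop could reject the final j as soon as it is typed.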
PCRE supports the concept of partial matching by means of the PCRE_PARTIAL
option, which can be set when calling pcre_exec(). When this is
done, the return code PCRE_ERROR_NOMATCH is converted into
PCRE_ERROR_PARTIAL if at any time during the matching process the
entire subject string matched part of the pattern. No captured data is
set when this occurs.
Using PCRE_PARTIAL disables one of PCRE's optimizations. PCRE remembers
the last literal byte in a pattern, and abandons matching immediately
if such a byte is not present in the subject string. This optimization
cannot be used for a subject string that might match only partially.
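The reasoning can be sketched in a few lines of C. This is not PCRE's internal code: can_reject_early and its parameters are hypothetical, and the required byte is assumed to have been worked out in advance (for a hypothetical pattern such as ^abc\d\dx it would be x).

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if the subject can be rejected without running the matcher,
   0 if the matcher has to be run. */
int can_reject_early(const char *subject, size_t length,
                     unsigned char required_byte, int partial)
{
    assert(subject != NULL);
    if (partial)
        return 0;   /* the missing byte may simply not have been typed yet */
    return memchr(subject, required_byte, length) == NULL;
}
```

With the subject abc1 and the required byte x, the function returns 1 for a normal match, because no complete match is possible, and 0 when partial matching is requested, because abc1 may still be the start of a valid string.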
RESTRICTED PATTERNS FOR PCRE_PARTIAL
Because of the way certain internal optimizations are implemented in
PCRE, the PCRE_PARTIAL option cannot be used with all patterns.
Repeated single characters such as
a{2,4}
and repeated single metasequences such as
\d+
are not permitted if the maximum number of occurrences is greater than
one. Optional items such as \d? (where the maximum is one) are permitted.
Quantifiers with any values are permitted after parentheses, so
the invalid examples above can be coded thus:
(a){2,4}
(\d)+
These constructions run more slowly, but for the kinds of application
that are envisaged for this facility, this is not felt to be a major
restriction.
If PCRE_PARTIAL is set for a pattern that does not conform to the
restrictions, pcre_exec() returns the error code PCRE_ERROR_BADPARTIAL
(-13).
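One way an application might act on these return codes is sketched below. The enum and the function action_for_result are hypothetical; rc is assumed to be the value returned by a pcre_exec() call made with PCRE_PARTIAL set. The numeric values are repeated only so that the fragment compiles without the PCRE header; a real program would take them from pcre.h.

```c
/* Fallback definitions so the sketch compiles on its own. They are
   skipped when pcre.h has already been included. */
#ifndef PCRE_ERROR_NOMATCH
#define PCRE_ERROR_NOMATCH     (-1)
#endif
#ifndef PCRE_ERROR_PARTIAL
#define PCRE_ERROR_PARTIAL     (-12)
#endif
#ifndef PCRE_ERROR_BADPARTIAL
#define PCRE_ERROR_BADPARTIAL  (-13)
#endif

/* Hypothetical actions for an input field that is validated as the
   user types. */
enum field_action {
    FIELD_COMPLETE,       /* the whole pattern matched */
    FIELD_KEEP_TYPING,    /* valid so far, more input is needed */
    FIELD_REJECT_KEY,     /* the last keystroke made the input invalid */
    FIELD_PROGRAM_ERROR   /* unusable pattern or another pcre_exec() error */
};

enum field_action action_for_result(int rc)
{
    if (rc >= 0)
        return FIELD_COMPLETE;        /* pcre_exec() returns >= 0 on a match */
    switch (rc) {
    case PCRE_ERROR_PARTIAL:
        return FIELD_KEEP_TYPING;
    case PCRE_ERROR_NOMATCH:
        return FIELD_REJECT_KEY;
    case PCRE_ERROR_BADPARTIAL:       /* pattern breaks the restrictions */
    default:
        return FIELD_PROGRAM_ERROR;
    }
}
```

PCRE_ERROR_BADPARTIAL is grouped with the other errors because it indicates a problem with the pattern rather than with what the user typed.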
EXAMPLE OF PARTIAL MATCHING USING PCRETEST
If the escape sequence \P is present in a pcretest data line, the
PCRE_PARTIAL flag is used for the match. Here is a run of pcretest that
uses the date example quoted above:
re> /^\d?\d(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d\d$/
data> 25jun04\P
0: 25jun04
1: jun
data> 25dec3\P
Partial match
data> 3ju\P
Partial match
data> 3juj\P
No match
data> j\P
No match
The first data string is matched completely, so pcretest shows the
matched substrings. The remaining four strings do not match the complete
pattern, but the first two are partial matches.
Last updated: 08 September 2004
Copyright (c) 1997-2004 University of Cambridge.
PCRE(3)