C12Adapter Opensource C++ Interface
POSIX-like regular expression handler.
Public Types | |
enum | { NUMBER_OF_SUBEXPRESSIONS = 10 } |
Public Member Functions | |
MRegexp () | |
Default constructor. | |
MRegexp (const MStdString &exp, bool caseInsensitive=false) | |
Constructor of the regular expression that takes an expression as standard string. | |
MRegexp (MConstChars exp, bool caseInsensitive=false) | |
Constructor of the regular expression that takes an expression as a pointer to a zero terminated string. | |
MRegexp (const MRegexp &r) | |
Copy constructor. | |
virtual | ~MRegexp () |
Object destructor. | |
bool | IsCompiled () const |
Check whether a valid regular expression was supplied. | |
int | GetCount () const |
Return the number of items found after a successful Match. | |
const MStdString & | GetPattern () const |
Get the pattern as it was set by the Compile method. | |
MRegexp & | operator= (const MRegexp &r) |
Assignment operator. | |
void | Compile (const MStdString &exp, bool caseInsensitive=false) |
Compile the regular expression given as standard string. | |
void | Clear () |
Clear the regular expression, possibly reclaim memory. | |
bool | Match (const MStdString &) |
Examine the character string with this regular expression, returning true if there is a match. | |
MStdString | Item (int i) const |
Return the I-th matched item after a successful Match. | |
MStdString | operator[] (int i) const |
Return the I-th matched item after a successful Match. | |
int | GetItemStart (int i) const |
Return the starting offset of the I-th matched item from the beginning of the character array used in Match. | |
int | GetItemLength (int i) const |
Return the length of the I-th matched item as used in Match. | |
MStdString | GetReplaceString (const MStdString &source) const |
Get the string for replacement, use source as standard string. | |
void | CheckIsCompiled () const |
Check if the regular expression is compiled, throw error if not. | |
Public Member Functions inherited from MObject | |
virtual | ~MObject () |
Object destructor. | |
virtual const MClass * | GetClass () const =0 |
Get the final class of the object. | |
virtual unsigned | GetEmbeddedSizeof () const |
For embedded object types, return the size of the class. | |
bool | IsEmbeddedObject () const |
Tell if the object is of embedded kind. | |
MVariant | Call (const MStdString &name, const MVariant &params) |
Call the object service with parameters, given as variant. | |
MVariant | Call0 (const MStdString &name) |
Call the object service with no parameters. | |
MVariant | Call1 (const MStdString &name, const MVariant &p1) |
Call the object service with one parameter. | |
MVariant | Call2 (const MStdString &name, const MVariant &p1, const MVariant &p2) |
Call the object service with two parameters. | |
MVariant | Call3 (const MStdString &name, const MVariant &p1, const MVariant &p2, const MVariant &p3) |
Call the object service with three parameters. | |
MVariant | Call4 (const MStdString &name, const MVariant &p1, const MVariant &p2, const MVariant &p3, const MVariant &p4) |
Call the object service with four parameters. | |
MVariant | Call5 (const MStdString &name, const MVariant &p1, const MVariant &p2, const MVariant &p3, const MVariant &p4, const MVariant &p5) |
Call the object service with five parameters. | |
MVariant | Call6 (const MStdString &name, const MVariant &p1, const MVariant &p2, const MVariant &p3, const MVariant &p4, const MVariant &p5, const MVariant &p6) |
Call the object service with six parameters. | |
virtual MVariant | CallV (const MStdString &name, const MVariant::VariantVector &params) |
Call the object service with parameters, given as variant vector. | |
virtual bool | IsPropertyPresent (const MStdString &name) const |
Tell if the property with the given name exists. | |
virtual bool | IsServicePresent (const MStdString &name) const |
Tell if the service with the given name exists. | |
virtual MVariant | GetProperty (const MStdString &name) const |
Get the property value using name of the property. | |
virtual void | SetProperty (const MStdString &name, const MVariant &value) |
Set the property using name of the property, and value. | |
virtual MStdStringVector | GetAllPropertyNames () const |
Return the list of publicly available properties, persistent or not. | |
virtual MStdStringVector | GetAllPersistentPropertyNames () const |
Return the list of persistent properties. | |
virtual void | SetPersistentPropertiesToDefault () |
Set the persistent properties of the object to their default values. | |
virtual MVariant | GetPersistentPropertyDefaultValue (const MStdString &name) const |
Get the default value of persistent property with the name given. | |
virtual void | SetPersistentPropertyToDefault (const MStdString &name) |
Set the persistent property with the name given to default value. | |
virtual const char * | GetType () const |
Get the name of the type for the object (could be the same as class name). | |
virtual void | SetType (const MStdString &) |
Intended to set the name of the type for the object, but the service will not allow setting the name to anything other than the current name. | |
virtual void | Validate () |
Validate internal structures of the object. | |
Static Public Member Functions | |
static bool | StaticMatch (MConstChars regexp, const MStdString &str, bool caseInsensitive=false) |
Do a match using the given regular expression and string without creating MRegexp object. | |
Static Public Member Functions inherited from MObject | |
static const MClass * | GetStaticClass () |
Get the declared class of this particular object. | |
static bool | IsClassPresent (const MStdString &name) |
Tells if the given class name is available. | |
Additional Inherited Members | |
Static Public Attributes inherited from MObject | |
static const MClass | s_class |
Class of MObject. | |
Protected Member Functions inherited from MObject | |
MObject () | |
Object constructor, protected as the class is abstract. | |
void | DoSetPersistentPropertiesToDefault (const MClass *staticClass) |
Set the persistent properties to their default values for one object, given the class of that object. | |
POSIX-like regular expression handler.
An object of this class is given a regular expression and, after a match, returns specific substrings (items) of its input. Regular expressions may not be the fastest way to parse input (though with careful anchoring they can be made to fail quickly when they are going to fail), but once you have a working library they allow for fairly rapid coding. On the whole this is good enough: worry about making it faster once it works and you know that the optimization effort will actually be noticed. For example:
Will give:
If you decompose the regular expression you get:
Note: The phrase tagged regular expression refers to any part of the regular expression that, because it is surrounded by parentheses, is accessible after a match has been made as a separate item.
In English, we are looking for two fields. The first is all characters from the start of the line through to the second field (without any surrounding white space), and the second is all characters within the parentheses that follow the first field.
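The code listing for this example is not reproduced in this text, so the following is only a minimal sketch of the two-field extraction described above. It uses `std::regex` as a stand-in for MRegexp, and the function name `SplitUserAndName` and the sample input are assumptions made for illustration. The lazy quantifier `*?` is an ECMAScript feature that is not part of the grammar described on this page; it is used here so that the blanks before the parenthesis stay out of the first item.

```cpp
#include <regex>
#include <string>
#include <utility>

// Hypothetical helper, std::regex stand-in for the MRegexp example.
// Returns {first field, text inside parentheses}; both empty when no match.
std::pair<std::string, std::string> SplitUserAndName(const std::string& line)
{
    // ^[\t ]*   skip leading tabs or spaces
    // (.*?)     item 1: the first field, taken lazily
    // [\t ]*    blanks between the two fields
    // \((.*)\)  item 2: everything inside the parentheses
    static const std::regex re(R"(^[\t ]*(.*?)[\t ]*\((.*)\))");
    std::smatch m;
    if (!std::regex_search(line, m, re))
        return {};
    return {m[1].str(), m[2].str()};
}
```

Called with `"example.com!david (David)"`, the sketch returns `example.com!david` as the first field and `David` as the second.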
A regular expression is zero or more branches, separated by '|'. It matches anything that matches one of the branches.
A branch is zero or more pieces, concatenated. It matches a match for the first, followed by a match for the second, etc.
A piece is an atom possibly followed by '*', '+', or '?'. An atom followed by '*' matches a sequence of 0 or more matches of the atom. An atom followed by '+' matches a sequence of 1 or more matches of the atom. An atom followed by '?' matches a match of the atom, or the empty string.
An atom is a regular expression in parentheses (matching a match for the regular expression), a range (see below), '.' (matching any single character), '^' (matching the empty string at the beginning of the input string), '$' (matching the empty string at the end of the input string), a '\' followed by a single character (matching that character), or a single character with no other significance (matching that character).
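The piece and atom rules can be tried out with a small sketch. It uses `std::regex` as a stand-in for MRegexp, on the assumption that both treat these basic constructs alike; the helper name `MatchesWhole` is hypothetical. The helper requires the pattern to cover the entire input, which makes the effect of each quantifier easy to see.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true when the pattern matches the whole text.
// std::regex is a stand-in here, not the MRegexp implementation.
bool MatchesWhole(const std::string& pattern, const std::string& text)
{
    return std::regex_match(text, std::regex(pattern));
}
```

With this helper, `ab*c` accepts both `ac` and `abbbc`, `ab+c` rejects `ac`, `ab?c` accepts `abc` but rejects `abbc`, and `a\.c` (written `"a\\.c"` in C++ source) accepts `a.c` but rejects `abc`.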
A range is a sequence of characters enclosed in '[]'. It normally matches any single character from the sequence. If the sequence begins with '^', it matches any single character not from the rest of the sequence. If two characters in the sequence are separated by '-', this is shorthand for the full list of ASCII characters between them (e.g. '[0-9]' matches any decimal digit). To include a literal ']' in the sequence, make it the first character (following a possible '^'). To include a literal '-', make it the first or last character.
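As a brief illustration of ranges, the sketch below again uses `std::regex` as a stand-in for MRegexp; both helper names are hypothetical. It shows a digit range, a negated range, and a literal '-' placed first in the sequence. A literal ']' as the first character is left out on purpose, because the default `std::regex` grammar handles that case differently from the rule described above.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: optional sign followed by one or more decimal digits.
// The '-' is first in the range, so it is taken literally.
bool IsSignedInteger(const std::string& text)
{
    static const std::regex re("^[-+]?[0-9]+$");
    return std::regex_search(text, re);
}

// Hypothetical helper: true when any character is outside '0'..'9'.
bool HasNonDigit(const std::string& text)
{
    static const std::regex re("[^0-9]");
    return std::regex_search(text, re);
}
```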
If a regular expression could match two different parts of the input string, it will match the one which begins earliest. If both begin in the same place but match different lengths, or match the same length in different ways, life gets messier, as follows. In general, the possibilities in a list of branches are considered in left-to-right order, the possibilities for '*', '+', and '?' are considered longest-first, nested constructs are considered from the outermost in, and concatenated constructs are considered leftmost-first. The match that will be chosen is the one that uses the earliest possibility in the first choice that has to be made. If there is more than one choice, the next will be made in the same manner (earliest possibility) subject to the decision on the first choice. And so forth.
For example, '(ab|a)b*c' could match 'abc' in one of two ways. The first choice is between 'ab' and 'a'; since 'ab' is earlier, and does lead to a successful overall match, it is chosen. Since the 'b' is already spoken for, the 'b*' must match its last possibility (the empty string), since it must respect the earlier choice.
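The branch ordering in this example can be observed with a short sketch. It relies on `std::regex` with its default ECMAScript grammar, which also tries alternatives left to right, as a stand-in for MRegexp; the helper name `FirstItem` is hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: text captured by the first parenthesized
// subexpression, or an empty string when there is no match.
std::string FirstItem(const std::string& pattern, const std::string& text)
{
    std::smatch m;
    const std::regex re(pattern);
    if (!std::regex_search(text, m, re) || m.size() < 2)
        return std::string();
    return m[1].str();
}
```

For `(ab|a)b*c` against `abc` the first item is `ab`, leaving nothing for `b*`. Swapping the branches to `(a|ab)b*c` makes the first item `a`, and `b*` then takes the `b`.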
In the particular case where the regular expression does not use '|' and does not apply '*', '+', or '?' to parenthesized subexpressions, the net effect is that the longest possible match will be chosen. So 'ab*', presented with 'xabbbby', will match 'abbbb'. Note that if 'ab*' is tried against 'xabyabbbz', it will match 'ab' just after 'x', due to the begins-earliest rule. (In effect, the decision on where to start the match is the first choice to be made, hence subsequent choices must respect it even if this leads them to less-preferred alternatives.)
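The begins-earliest rule can be checked with the sketch below, which reports where the match starts and what it covers. `std::regex` is used as a stand-in for MRegexp, and the `Found` structure and `FirstMatch` helper are hypothetical names.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical result type: offset of the match and the matched text.
// start is -1 when nothing matched.
struct Found
{
    std::ptrdiff_t start;
    std::string text;
};

// Hypothetical helper: search for the earliest match of pattern in text.
Found FirstMatch(const std::string& pattern, const std::string& text)
{
    std::smatch m;
    const std::regex re(pattern);
    if (!std::regex_search(text, m, re))
        return Found{-1, std::string()};
    return Found{static_cast<std::ptrdiff_t>(m.position(0)), m.str(0)};
}
```

For `ab*` the sketch finds `abbbb` at offset 1 in `xabbbby`, and `ab` at offset 1 in `xabyabbbz`, even though a longer match starts later in that string.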
anonymous enum |
MRegexp::MRegexp (const MStdString &exp, bool caseInsensitive = false)
Constructor of the regular expression that takes an expression as standard string.
exp | Regular expression |
caseInsensitive | When true, the match shall be case insensitive, false by default. |
MRegexp::MRegexp (MConstChars exp, bool caseInsensitive = false)
Constructor of the regular expression that takes an expression as a pointer to a zero terminated string.
exp | Regular expression |
caseInsensitive | When true, the match shall be case insensitive, false by default. |
MRegexp::MRegexp (const MRegexp &r)
Copy constructor.
void MRegexp::CheckIsCompiled () const
Check if the regular expression is compiled, throw error if not.
void MRegexp::Compile (const MStdString &exp, bool caseInsensitive = false)
Compile the regular expression given as standard string.
The format of the regular expression is defined in the class header.
exp | Regular expression |
caseInsensitive | When true, the match shall be case insensitive, false by default. |
int MRegexp::GetItemLength (int i) const
Return the length of the I-th matched item as used in Match.
Along with the GetItemStart, this service can be used as follows:
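The listing that belongs here is not reproduced in this text. As a hedged sketch of the same idea, the code below slices an item out of the original string by offset and length. It uses `std::regex`, where `position(i)` and `length(i)` play the roles that GetItemStart and GetItemLength presumably play for MRegexp; the helper name `ItemBySlice` is hypothetical.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: rebuild the i-th item from its offset and length
// in the searched string, instead of asking for the item directly.
std::string ItemBySlice(const std::string& text, const std::regex& re, std::size_t i)
{
    std::smatch m;
    if (!std::regex_search(text, m, re) || i >= m.size())
        return std::string();
    const auto start = static_cast<std::size_t>(m.position(i));
    const auto length = static_cast<std::size_t>(m.length(i));
    return text.substr(start, length);
}
```

Working with offsets is useful when the caller needs to know where an item sits in the input, for example to highlight it or to cut the input around it.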
int MRegexp::GetItemStart (int i) const
Return the starting offset of the I-th matched item from the beginning of the character array used in Match.
MStdString MRegexp::GetReplaceString (const MStdString &source) const
Get the string for replacement, use source as standard string.
After a successful Match one can retrieve a replacement string as an alternative to building up the various items by hand.
Each character in the source string will be copied to the return value except for the following special characters:
So:
Will give: "David == example.com!david"
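The list of special characters and the listing for this example are not reproduced in this text, so the MRegexp placeholder syntax cannot be shown here. The sketch below illustrates the same idea with `std::regex`, whose `format` member uses its own `$1`, `$2` placeholders; the helper name `ReplaceSketch`, the pattern, and the input string are assumptions chosen to produce the result quoted above.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: match "field (name)" and expand a format string
// with std::regex placeholders ($1, $2). Not the MRegexp syntax.
std::string ReplaceSketch(const std::string& line, const std::string& format)
{
    static const std::regex re(R"(^[\t ]*(.*?)[\t ]*\((.*)\))");
    std::smatch m;
    if (!std::regex_search(line, m, re))
        return std::string();
    return m.format(format);
}
```

With the input `"example.com!david (David)"` and the format `"$2 == $1"`, the sketch produces `David == example.com!david`.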
MStdString MRegexp::Item (int i) const
Return the I-th matched item after a successful Match.
As in the classic regexp, the zeroth element is the whole string, and the last allowed index is equal to GetCount. Look at operator[] for convenience.
bool MRegexp::Match (const MStdString &)
Examine the character string with this regular expression, returning true if there is a match.
This match updates the state of this MRegexp object so that the items of the match can be obtained. The 0th item is the part of the string that matched the whole regular expression. The others are the parts that matched parenthesized expressions within the regular expression, with parenthesized expressions numbered in left-to-right order of their opening parentheses. If a parenthesized expression does not participate in the match at all, its length is 0.
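The numbering of items can be illustrated with a sketch that collects every item of a match into a vector. `std::regex` is used as a stand-in for MRegexp, on the assumption that both number subexpressions by their opening parentheses; the helper name `AllItems` is hypothetical.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: items 0..N of the first match, or an empty vector.
// A subexpression that did not take part in the match yields "".
std::vector<std::string> AllItems(const std::string& pattern, const std::string& text)
{
    std::vector<std::string> items;
    std::smatch m;
    const std::regex re(pattern);
    if (std::regex_search(text, m, re))
        for (std::size_t i = 0; i < m.size(); ++i)
            items.push_back(m[i].str());
    return items;
}
```

For the pattern `((a+)|(b+))c` and the input `xbbc`, item 0 is `bbc`, item 1 (the outer parentheses) is `bb`, item 2 (`a+`) is empty because that branch did not participate, and item 3 (`b+`) is `bb`.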
MRegexp & MRegexp::operator= (const MRegexp &r)
Assignment operator.
MStdString MRegexp::operator[] (int i) const [inline]
Return the I-th matched item after a successful Match.
As in the classic regexp, the zeroth element is the whole string, and the last allowed index is equal to GetCount. This is a more convenient C++ way of accessing an item.
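static bool MRegexp::StaticMatch (MConstChars regexp, const MStdString &str, bool caseInsensitive = false) [static]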
Do a match using the given regular expression and string without creating MRegexp object.
regexp | Regular expression to match |
str | String in which the regular expression shall be matched. |
caseInsensitive | When true, the match shall be case insensitive, false by default. |
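A one-shot match of this kind can be sketched as a free function with the same parameter shape. The code uses `std::regex` as a stand-in, so it shows the calling pattern rather than the MRegexp implementation, and the name `StaticMatchSketch` is hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical stand-in for a one-shot match: compile, search, discard.
bool StaticMatchSketch(const char* regexp, const std::string& str,
                       bool caseInsensitive = false)
{
    auto flags = std::regex_constants::ECMAScript;
    if (caseInsensitive)
        flags |= std::regex_constants::icase;
    const std::regex re(regexp, flags);
    return std::regex_search(str, re);
}
```

A one-shot call recompiles the expression every time, so it suits occasional checks. When the same expression is applied to many strings, keeping a compiled object around avoids the repeated compilation.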