@@ -60,14 +60,24 @@ Composition:
 x|y     Match x or y (prefer x)
 xy|z    Match xy or z (prefer xy)
 
-Repetition (always greedy, preferring more):
-
-x*      Match x zero or more times
-x+      Match x one or more times
-x?      Match x zero or one time
-x{m,n}  Match x between m and n times (inclusive)
-x{m,}   Match x at least m times
-x{,n}   Match x between 0 and n times (inclusive)
+Repetition:
+
+Greedy:
+x*      Match x zero or more times, prefer more
+x+      Match x one or more times, prefer more
+x?      Match x zero or one time, prefer one
+x{m,n}  Match x between m and n times (inclusive), prefer more
+x{m,}   Match x at least m times, prefer more
+x{,n}   Match x between 0 and n times (inclusive), prefer more
 x{m}    Match x exactly m times
+
+Lazy:
+x*?     Match x zero or more times, prefer fewer
+x+?     Match x one or more times, prefer fewer
+x??     Match x zero or one time, prefer zero
+x{m,n}? Match x between m and n times (inclusive), prefer fewer
+x{m,}?  Match x at least m times, prefer fewer
+x{,n}?  Match x between 0 and n times (inclusive), prefer fewer
+x{m}    Match x exactly m times
 
 Grouping:
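The preference rules in this hunk (alternation prefers the left branch, greedy operators prefer more repetitions, lazy operators prefer fewer) can be illustrated with a short sketch. The engine's own API is not visible in this hunk, so the sketch below uses Go's standard library regexp package as a stand-in, on the assumption that the engine follows the same conventions for these operators. The helper name firstMatch is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstMatch is a hypothetical helper. It compiles pattern with the standard
// library's regexp package (NOT the engine documented in this diff) and
// returns the leftmost match in s.
func firstMatch(pattern, s string) string {
	return regexp.MustCompile(pattern).FindString(s)
}

func main() {
	// Greedy operators take as many repetitions as they can.
	fmt.Printf("%q\n", firstMatch(`a+`, "aaa"))     // "aaa"
	fmt.Printf("%q\n", firstMatch(`a{1,2}`, "aaa")) // "aa"

	// Lazy operators take as few repetitions as they can.
	fmt.Printf("%q\n", firstMatch(`a+?`, "aaa"))     // "a"
	fmt.Printf("%q\n", firstMatch(`a{1,2}?`, "aaa")) // "a"
	fmt.Printf("%q\n", firstMatch(`a??`, "aaa"))     // ""

	// Alternation prefers the left branch when both could match.
	fmt.Printf("%q\n", firstMatch(`a|ab`, "ab")) // "a"
}
```

Whether the engine in this diff produces identical results for every pattern is an assumption; the sketch only shows what "prefer more" and "prefer fewer" mean in practice.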
@@ -107,17 +117,13 @@ Numeric ranges:
 The engine and the API differ from [regexp] in a few ways, some of them very subtle.
 The key differences are mentioned below.
 
-1. Greediness:
-
-This engine currently does not support non-greedy operators.
-
-2. Byte-slices and runes:
+1. Byte-slices and runes:
 
 My engine does not support byte-slices. When a matching function receives a string, it converts it into a
 rune-slice to iterate through it. While this has some space overhead, the convenience of built-in Unicode
 support made the tradeoff worth it.
 
-3. Return values
+2. Return values
 
 Rather than using primitives for return values, my engine defines two types that are used as return
 values: a [Group] represents a capturing group, and a [Match] represents a list of groups.
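The byte-slice versus rune-slice tradeoff described in this hunk can be seen with plain Go conversions. This is a minimal sketch of the general technique, not the engine's actual code; the helper name toRunes is hypothetical.

```go
package main

import "fmt"

// toRunes is a hypothetical helper showing the string-to-rune-slice
// conversion the text describes. Each element is one Unicode code point.
func toRunes(s string) []rune {
	return []rune(s)
}

func main() {
	s := "h\u00e9llo" // the second character (U+00E9) takes two bytes in UTF-8

	runes := toRunes(s)

	// Indexing the string yields bytes; indexing the rune slice yields
	// whole code points, which is what a matcher wants to compare.
	fmt.Println(len(s), len(runes))    // 6 5
	fmt.Println(runes[1] == '\u00e9')  // true
	fmt.Println(rune(s[1]) == '\u00e9') // false: s[1] is only the first byte

	// The space overhead: a rune is 4 bytes, so the slice holds
	// 5 * 4 = 20 bytes of data for a 6-byte string.
	fmt.Println(len(runes) * 4) // 20
}
```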
@@ -152,10 +158,9 @@ returns the 0-group.
 
 The following features from [regexp] are (currently) NOT supported:
 1. Named capturing groups
-2. Non-greedy operators
-3. Negated POSIX classes
-4. Embedded flags (flags are instead passed as arguments to [Compile])
-5. Literal text with \Q ... \E
+2. Negated POSIX classes
+3. Embedded flags (flags are instead passed as arguments to [Compile])
+4. Literal text with \Q ... \E
 
 The following features are not available in [regexp], but are supported in my engine:
 1. Lookarounds
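The feature lists in this hunk can be checked from the standard library's side. The sketch below only exercises Go's regexp package, which rejects lookaround syntax at compile time and accepts named capturing groups. It does not call the engine documented here, whose API is not shown in this hunk. The helper name stdlibAccepts is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// stdlibAccepts is a hypothetical helper that reports whether the standard
// library's regexp package can compile pattern.
func stdlibAccepts(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// Lookarounds are not part of the syntax regexp accepts.
	fmt.Println(stdlibAccepts(`foo(?=bar)`))  // false (lookahead)
	fmt.Println(stdlibAccepts(`foo(?!bar)`))  // false (negative lookahead)
	fmt.Println(stdlibAccepts(`(?<=foo)bar`)) // false (lookbehind)

	// Named capturing groups are accepted by regexp.
	fmt.Println(stdlibAccepts(`(?P<word>\w+)`)) // true
}
```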