2 Commits

Author SHA1 Message Date
119475b41b Updated README 2025-02-14 12:13:01 -05:00
6151cc8cf6 Updated documentation 2025-02-14 12:07:43 -05:00
2 changed files with 24 additions and 19 deletions

View File

@@ -2,8 +2,8 @@
Kleingrep is a regular expression engine, providing a library and command-line tool written in Go.
It aims to provide a more featureful engine, compared to the one in
[Go's standard library](https://pkg.go.dev/regexp), while retaining some semblance of efficiency.
It aims to provide a more featureful engine, compared to the one in Go's
[regexp](https://pkg.go.dev/regexp), while retaining some semblance of efficiency.
The engine does __not__ use backtracking, relying on the NFA-based method described in
[Russ Cox's articles](https://swtch.com/~rsc/regexp). As such, it is immune to catastrophic backtracking.

View File

@@ -60,14 +60,24 @@ Composition:
x|y Match x or y (prefer x)
xy|z Match xy or z (prefer xy)
Repitition (always greedy, preferring more):
Repitition:
x* Match x zero or more times
x+ Match x one or more times
x? Match x zero or one time
x{m,n} Match x between m and n times (inclusive)
x{m,} Match x atleast m times
x{,n} Match x between 0 and n times (inclusive)
Greedy:
x* Match x zero or more times, prefer more
x+ Match x one or more times, prefer more
x? Match x zero or one time, prefer one
x{m,n} Match x between m and n times (inclusive), prefer more
x{m,} Match x atleast m times, prefer more
x{,n} Match x between 0 and n times (inclusive), prefer more
x{m} Match x exactly m times
Lazy:
x*? Match x zero or more times, prefer fewer
x+? Match x one or more times, prefer fewer
x?? Match x zero or one time, prefer zero
x{m,n}? Match x between m and n times (inclusive), prefer fewer
x{m,}? Match x atleast m times, prefer fewer
x{,n}? Match x between 0 and n times (inclusive), prefer fewer
x{m} Match x exactly m times
Grouping:
@@ -107,17 +117,13 @@ Numeric ranges:
The engine and the API differ from [regexp] in a few ways, some of them very subtle.
The key differences are mentioned below.
1. Greediness:
This engine currently does not support non-greedy operators.
2. Byte-slices and runes:
1. Byte-slices and runes:
My engine does not support byte-slices. When a matching function receives a string, it converts it into a
rune-slice to iterate through it. While this has some space overhead, the convenience of built-in unicode
support made the tradeoff worth it.
3. Return values
2. Return values
Rather than using primitives for return values, my engine defines two types that are used as return
values: a [Group] represents a capturing group, and a [Match] represents a list of groups.
@@ -152,10 +158,9 @@ returns the 0-group.
The following features from [regexp] are (currently) NOT supported:
1. Named capturing groups
2. Non-greedy operators
3. Negated POSIX classes
4. Embedded flags (flags are instead passed as arguments to [Compile])
5. Literal text with \Q ... \E
2. Negated POSIX classes
3. Embedded flags (flags are instead passed as arguments to [Compile])
4. Literal text with \Q ... \E
The following features are not available in [regexp], but are supported in my engine:
1. Lookarounds