From fcdb23524a1b952776c6d65430b31c1d33b4c0e4 Mon Sep 17 00:00:00 2001
From: Aadhavan Srinivasan
Date: Sat, 1 Feb 2025 11:04:24 -0500
Subject: [PATCH] Added more documentation

---
 regex/doc.go | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/regex/doc.go b/regex/doc.go
index 820ba0e..398ec07 100644
--- a/regex/doc.go
+++ b/regex/doc.go
@@ -121,5 +121,36 @@ this engine will _always_ go for the longest possible match, regardless of the o
 My engine does not support byte-slices. When a matching function receives a string, it converts it into a
 rune-slice to iterate through it. While this has some space overhead, the convenience of built-in unicode
 support made the tradeoff worth it.
+
+3. Return values
+
+Rather than using primitives for return values, my engine defines two types that are used as return
+values: a [Group] represents a capturing group, and a [Match] represents a list of groups.
+
+[regexp] specifies a regular expression that gives a list of all the matching functions that it supports. The
+equivalent expression for this engine is:
+
+	Find(All)?(String)?(Submatch)?
+
+[Reg.Find] returns the index of the leftmost match in the string.
+
+If a function contains 'All' it returns all matches instead of just the leftmost one.
+
+If a function contains 'String' it returns the matched text, rather than the indices.
+
+If a function contains 'Submatch' it returns the match, including all submatches found by
+capturing groups.
+
+The term '0-group' is used to refer to the 0th capturing group of a match (which is the entire match).
+Given the following regex:
+
+	x(y)
+
+and the input string:
+
+	xyz
+
+The 0th group would contain 'xy' and the 1st group would contain 'y'. Any matching function without 'Submatch' in its name
+returns the 0-group.
 */
 package regex
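
Note on the added "Return values" text: the naming expression and the 0-group example can be checked with a small sketch. This is only an illustration under stated assumptions. It uses Go's standard library regexp package, not this engine, because the engine's exact signatures are not shown in the patch. The helper names methodNames and zeroGroupDemo are hypothetical. The first helper expands Find(All)?(String)?(Submatch)? into the eight names the expression admits; the second runs x(y) against xyz to show the 0-group and the 1st group.

```go
package main

import (
	"fmt"
	"regexp"
)

// methodNames (hypothetical helper) expands the expression
// Find(All)?(String)?(Submatch)? into every name it admits.
func methodNames() []string {
	var names []string
	for _, all := range []string{"", "All"} {
		for _, str := range []string{"", "String"} {
			for _, sub := range []string{"", "Submatch"} {
				names = append(names, "Find"+all+str+sub)
			}
		}
	}
	return names
}

// zeroGroupDemo (hypothetical helper) uses the standard library's regexp
// package, NOT this engine, to show the 0-group for x(y) on the input xyz.
func zeroGroupDemo() (string, []string) {
	re := regexp.MustCompile(`x(y)`)
	// FindString has no 'Submatch' in its name, so it yields only the whole match.
	// FindStringSubmatch yields the whole match followed by each capturing group.
	return re.FindString("xyz"), re.FindStringSubmatch("xyz")
}

func main() {
	for _, name := range methodNames() {
		fmt.Println(name)
	}
	whole, groups := zeroGroupDemo()
	fmt.Println(whole)     // the 0-group
	fmt.Println(groups[1]) // the 1st group
}
```

In the standard library the 'Submatch' variants return a slice whose element 0 is the entire match, which corresponds to the 0-group described in the patch. According to the patch, this engine returns [Match] and [Group] values instead of primitives, so the return types will differ from the sketch.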