Instead, I tweaked the rest of the matching function, so that a special
check isn't necessary. If we are trying to match at the end of a string,
we skip any of the actual matching and proceed straight to finding
0-length matches.
This change was made because, with the special case, capturing groups
weren't getting updated if we had an end-of-string match.
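
A minimal sketch of the idea, not the code from this commit: when the general matching path is written so that it consumes nothing at the end of the string, the zero-length case falls out naturally and the capturing group is still recorded. The toy pattern `(a*)`, the function `matchAt`, and the fields of `Group` are assumptions made for illustration only.

```go
package main

import "fmt"

// Group is a hypothetical capture-group span, half-open: [start, end).
type Group struct{ start, end int }

// matchAt is a hypothetical stand-in for the matching function. It matches
// the toy pattern "(a*)" at offset idx and returns the span of group 1.
func matchAt(str []rune, idx int) Group {
	end := idx
	// At the end of the string this loop body never runs, so control falls
	// straight through to the zero-length result. No special check is needed,
	// and the group is updated on the same path as any other match.
	for end < len(str) && str[end] == 'a' {
		end++
	}
	return Group{start: idx, end: end}
}

func main() {
	str := []rune("baa")
	for idx := 0; idx <= len(str); idx++ {
		g := matchAt(str, idx)
		fmt.Printf("idx=%d group1=[%d,%d)\n", idx, g.start, g.end)
	}
}
```

With a separate early return for `idx == len(str)`, the group assignment would have to be duplicated there, which is the kind of omission the message above describes.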
if i < len(re_runes) && re_runes[i] == '(' && (i == 0 || re_runes[i-1] != '\\') && (i < len(re_runes)-1 && re_runes[i+1] == '?') { // Unescaped open parenthesis followed by a question mark = lookaround. Don't mess with it.
if i < len(re_runes) && re_runes[i] == '(' && (i == 0 || re_runes[i-1] != '\\') && (i < len(re_runes)-2 && re_runes[i+1] == '?' && slices.Contains([]rune{'=', '!', '<'}, re_runes[i+2])) { // Unescaped open parenthesis followed by a question mark, then '<', '!' or '=' => lookaround. Don't mess with it.
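
The tightened condition can be read as a standalone predicate. The sketch below is an illustration under assumptions: the helper name `isLookaroundOpen` is hypothetical, and it mirrors the single-character escape check from the line above (so it does not handle an escaped backslash before the parenthesis).

```go
package main

import (
	"fmt"
	"slices"
)

// isLookaroundOpen (hypothetical helper) reports whether re[i] is an
// unescaped '(' that opens a lookaround: "(?=", "(?!" or "(?<".
func isLookaroundOpen(re []rune, i int) bool {
	if i >= len(re) || re[i] != '(' {
		return false
	}
	if i > 0 && re[i-1] == '\\' {
		return false // escaped parenthesis, treated as a literal
	}
	// Two more runes are needed: '?' and one of '=', '!', '<'.
	return i < len(re)-2 && re[i+1] == '?' &&
		slices.Contains([]rune{'=', '!', '<'}, re[i+2])
}

func main() {
	for _, s := range []string{"(?=a)", "(?<!a)", "(?:a)", "(a)"} {
		fmt.Println(s, isLookaroundOpen([]rune(s), 0))
	}
}
```

Under the old check, `(?:a)` would also have been treated as a lookaround; requiring the third rune lets other `(?` constructs be handled separately.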
if i < len(re_runes) && (re_runes[i] != '(' && re_runes[i] != '|' && re_runes[i] != '\\') || (i > 0 && re_runes[i-1] == '\\') { // Every character should be concatenated if it is escaped
if i < len(re_runes) && (re_runes[i] != '(' && re_runes[i] != NONCAPLPAREN_CHAR && re_runes[i] != '|' && re_runes[i] != '\\') || (i > 0 && re_runes[i-1] == '\\') { // Every character should be concatenated if it is escaped
	transitions map[int][]*State // Transitions to different states (maps a character (int representation) to a _list_ of states. This is useful if one character can lead to multiple states, e.g. ab|aa)
	isKleene bool // Identifies whether current node is a 0-state representing Kleene star
	assert assertType // Type of assertion of current node - NONE means that the node doesn't assert anything
	zeroMatchFound bool // Whether or not the state has been used for a zero-length match - only relevant for zero states
	allChars bool // Whether or not the state represents all characters (e.g. a 'dot' metacharacter). A 'dot' node doesn't store any contents directly, as it would take up too much space
	except []rune // Only valid if allChars is true - match all characters _except_ the ones in this block. Useful for inverting character classes.
	lookaroundRegex string // Only for lookaround states - Contents of the regex that the lookaround state holds
@@ -37,7 +36,9 @@ type State struct {
	groupBegin bool // Whether or not the node starts a capturing group
	groupEnd bool // Whether or not the node ends a capturing group
	groupNum int // Which capturing group the node starts / ends
	threadGroups []Group // Assuming that a state is part of a 'thread' in the matching process, this array stores the indices of capturing groups in the current thread. As matches are found for this state, its groups will be copied over.
	// The following properties depend on the current match - I should think about resetting them for every match.
	zeroMatchFound bool // Whether or not the state has been used for a zero-length match - only relevant for zero states
	threadGroups []Group // Assuming that a state is part of a 'thread' in the matching process, this array stores the indices of capturing groups in the current thread. As matches are found for this state, its groups will be copied over.
if s.assert == PLA || s.assert == NLA { // Lookahead - return true (or false) if at least one match starts at the current index
if matchIdx[0].startIdx == idx {
if s.assert == PLA || s.assert == NLA { // Lookahead - return true (or false) if at least one match starts at 0. Zero is used because the test-string _starts_ from idx.
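
The reasoning in the last comment can be sketched as follows: if the lookahead's inner regex is run against the slice of the input beginning at `idx`, match offsets are relative to that slice, so a match "at the current index" shows up as a match starting at 0. This is an illustrative sketch only. `Match`, `findAll` (a toy literal matcher standing in for the real engine) and `lookaheadHolds` are hypothetical names, not taken from the source.

```go
package main

import "fmt"

// Match is a hypothetical match span with offsets relative to the searched slice.
type Match struct{ startIdx, endIdx int }

// findAll is a toy stand-in for the regex engine: it returns every
// occurrence of the literal pat in s.
func findAll(pat, s []rune) []Match {
	var out []Match
	for i := 0; i+len(pat) <= len(s); i++ {
		if string(s[i:i+len(pat)]) == string(pat) {
			out = append(out, Match{i, i + len(pat)})
		}
	}
	return out
}

// lookaheadHolds evaluates a positive (negative=false) or negative
// (negative=true) lookahead for pat at position idx of str.
func lookaheadHolds(pat, str []rune, idx int, negative bool) bool {
	sub := str[idx:] // the test string starts from idx
	found := false
	for _, m := range findAll(pat, sub) {
		if m.startIdx == 0 { // offset 0 in sub corresponds to idx in str
			found = true
			break
		}
	}
	return found != negative
}

func main() {
	str := []rune("abc")
	fmt.Println(lookaheadHolds([]rune("bc"), str, 1, false))
	fmt.Println(lookaheadHolds([]rune("bc"), str, 0, false))
}
```

Comparing against `idx` instead of 0 would only succeed when `idx` happens to be 0, because the offsets are relative to the slice.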