25cb79f01b
Changed the value of EPSILON, so that we can use the NUL character
(which it used to be) in a regex; Also added code to detect escaped
backslashes
Specifically, I replace an escaped backslash with a metacharacter, then
replace it back later on. This prevents problems, like detecting whether
the opening bracket is escaped in '\\[a]'.
2025-01-21 22:12:29 -05:00
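A minimal sketch of the placeholder idea described in the commit above. This is not the repository's code: the placeholder rune `escBackslash` and all three function names are hypothetical, chosen only for illustration. Each escaped backslash is swapped for a rune that cannot be mistaken for an escape, the escape check runs, and the backslashes are restored afterwards.

```go
package main

import (
	"fmt"
	"strings"
)

// escBackslash is a hypothetical placeholder (a Unicode private-use rune);
// the value actually used in the repository is not known from the log.
const escBackslash = '\uE000'

// hideEscapedBackslashes replaces every escaped backslash (two backslashes
// in a row) with the placeholder rune.
func hideEscapedBackslashes(re string) string {
	return strings.ReplaceAll(re, `\\`, string(escBackslash))
}

// restoreEscapedBackslashes undoes hideEscapedBackslashes.
func restoreEscapedBackslashes(re string) string {
	return strings.ReplaceAll(re, string(escBackslash), `\\`)
}

// isEscaped reports whether the rune at idx is directly preceded by a
// backslash. It is only reliable after hideEscapedBackslashes has run.
func isEscaped(runes []rune, idx int) bool {
	return idx > 0 && idx < len(runes) && runes[idx-1] == '\\'
}

func main() {
	// In `\\[a]` the backslash is escaped, so the bracket is not.
	hidden := []rune(hideEscapedBackslashes(`\\[a]`))
	fmt.Println(isEscaped(hidden, 1)) // false

	// In `\[a]` the bracket itself is escaped.
	plain := []rune(hideEscapedBackslashes(`\[a]`))
	fmt.Println(isEscaped(plain, 1)) // true
}
```

Without the substitution, a naive "is the previous rune a backslash" check would wrongly report the bracket in `\\[a]` as escaped.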
47ec95f7bb
Created function that returns a 'default' state
2025-01-19 21:45:07 -06:00
3f0360b9be
Fixed bug where I used the 'lookaroundNumCaptureGroups' member of the wrong State struct
2025-01-09 10:39:04 -06:00
644ed15af0
Use new API for findAllMatches
2025-01-06 20:10:25 -06:00
61bced606e
Added comments - certain members of State depend on the current match, should be reset
2024-12-16 22:32:22 -05:00
332c2fe5a2
Made lookarounds a little more efficient by only matching from (or to, in the case of lookbehind) the current index
2024-12-11 00:31:08 -05:00
437ca2ee57
Improved submatch tracking by storing all group indices as a part of the state, which is viewed as a 'thread'
2024-12-11 00:16:24 -05:00
11f7f1d746
Added fields to state, to determine capturing group information. 0th group refers to entire match
2024-12-09 01:05:01 -05:00
745fab9639
Clone lookaroundNFA when cloning a state; use compiled regex for
lookarounds instead of compiling a new one
2024-11-27 12:15:30 -05:00
393769f152
Accounted for last character being a newline when checking for EOS (we can be at the second-last character if the last one is a newline)
2024-11-27 11:44:39 -05:00
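A rough sketch of the end-of-string check described in the commit above, not the repository's implementation. The function name `atEndOfString` is hypothetical, and it assumes `idx` is the position of the next unread rune; under that assumption, EOS holds either past the last rune or directly before a single trailing newline.

```go
package main

import "fmt"

// atEndOfString is a hypothetical helper. It assumes idx is the index of
// the next unread rune in str.
func atEndOfString(str []rune, idx int) bool {
	n := len(str)
	if idx < 0 {
		return false
	}
	if idx == n {
		// Every rune has been consumed.
		return true
	}
	// Only a trailing newline remains.
	return idx == n-1 && str[idx] == '\n'
}

func main() {
	s := []rune("abc\n")
	fmt.Println(atEndOfString(s, 4)) // true
	fmt.Println(atEndOfString(s, 3)) // true
	fmt.Println(atEndOfString(s, 2)) // false
}
```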
25c333bea4
Added function to determine if a state is a lookaround
2024-11-24 15:01:06 -05:00
77d19cd84e
Added lookaround-related fields to State struct, added lookaround support to checkAssertion()
2024-11-22 00:11:51 -05:00
ea64ddc88a
Removed unnecessary duplication of assertion checking
2024-11-20 10:38:41 -05:00
708a9e1303
Added field to denote all characters which an 'allChars' node _shouldn't_ match (useful for inverting character classes)
2024-11-20 09:39:24 -05:00
c56d81a335
Added unicode support to dot metacharacter - it now matches _any_ unicode character (almost)
2024-11-18 16:44:43 -05:00
8a1f1dc621
Added unicode support
Replaced strings with rune-slices, which capture unicode codepoints more
accurately.
2024-11-18 10:41:50 -05:00
21142e6e13
Wrote function to clone the NFA starting at a given state, and a function to find question mark operator (a? == (a|))
2024-11-03 14:37:38 -05:00
dca81c1796
Replaced rune-slice parameters with string parameters in functions; avoids unnecessary conversion from strings to rune-slices
2024-11-01 01:53:50 -04:00
fccd3a76f5
Wrote function to check if the assertion of a state is true
2024-10-31 17:56:04 -04:00
0736e813c1
Fixed boneheaded mistake with checking assertion types
2024-10-31 17:14:03 -04:00
1aff6e2fa4
Added a field to State, that tells me what kind of assertion (if any) it is making. Also added function to check if a state's contents contain a given value (checks assertions), and to find all matches that a state has for a character
2024-10-31 17:13:34 -04:00
3778869567
Use stateContents type to allow a state to store multiple characters
2024-10-28 17:38:43 -04:00
aee24644e9
Use new unique_append function signature
2024-10-28 09:39:37 -04:00
ae219f763a
Added alternate function, removed relevant code from main; also started working on escape characters
2024-10-27 15:30:33 -04:00
bf3060b672
Used 'unique append' to ensure that a transition can only contain a given state once
2024-10-27 12:52:59 -04:00
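One possible shape for the 'unique append' helper mentioned in the commit above. The name `uniqueAppend` and its generic signature are assumptions, not the repository's API: items are appended only if the slice does not already contain them, so a transition list never holds the same state twice.

```go
package main

import "fmt"

// uniqueAppend is a hypothetical helper: it appends each item to slc only
// if an equal element is not already present. Pointer types such as *State
// satisfy comparable, so it also works for slices of state pointers.
func uniqueAppend[T comparable](slc []T, items ...T) []T {
	for _, item := range items {
		found := false
		for _, existing := range slc {
			if existing == item {
				found = true
				break
			}
		}
		if !found {
			slc = append(slc, item)
		}
	}
	return slc
}

func main() {
	fmt.Println(uniqueAppend([]int{1, 2}, 2, 3, 3)) // [1 2 3]
}
```

The linear scan is fine for the short transition lists of a small NFA; a map-backed set would be the usual alternative if the lists grew large.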
b327143fa2
Added function for concatenation and kleene star
2024-10-27 11:19:06 -04:00
9d3bc2b804
Fixed kleene star behavior, which used to behave like a '+'
2024-10-23 08:51:49 -04:00
bc11777ad5
Fixed Kleene Star matching
2024-10-22 17:07:01 -04:00
213da40c3b
Allow one state to map to multiple states with the same transition, e.g. ab|aa
2024-10-22 14:35:03 -04:00
8394e7867e
Fixed bug with last state detection
2024-10-21 23:17:10 -04:00
82b33f3c9a
First commit
2024-10-21 23:08:52 -04:00