170 Commits

Author SHA1 Message Date
4373d35216 Wrote function to find the 'n'th match of a regex 2025-01-05 21:40:53 -06:00
3fa4d0f75e Updated TODO 2025-01-03 19:18:00 -05:00
6f9173f771 Finished support for -m flag; refactoring pending 2025-01-03 19:17:24 -05:00
8a0586d107 Added support for printing specific match indices ('-m' and '-p' flags combined) 2025-01-03 15:49:14 -06:00
13ca954072 Started working on '-m num' flag: print the <num>th match 2024-12-19 04:29:05 -05:00
85eb13287e Updated TODO 2024-12-19 04:28:36 -05:00
e83d746ded Added more test cases 2024-12-18 15:22:50 -05:00
98f4c9e418 Added support for non-capturing groups 2024-12-18 15:22:43 -05:00
8d6e1a41a5 Fixed bug where a repeated capturing group, e.g. (a){3}, wouldn't capture only the last iteration, as it should 2024-12-16 22:58:39 -05:00
93a5e24c8d Added more tests 2024-12-16 22:32:36 -05:00
61bced606e Added comments - certain members of State depend on the current match, should be reset 2024-12-16 22:32:22 -05:00
71cab59a89 Got rid of unnecessary special case to match at end-of-string
Instead, I tweaked the rest of the matching function, so that a special
check isn't necessary. If we are trying to match at the end of a string,
we skip any of the actual matching and proceed straight to finding
0-length matches.

This change was made because, with the special case, capturing groups
weren't getting updated if we had an end-of-string match.
2024-12-12 14:49:45 -05:00
8c8e209587 Removed return values that weren't being used 2024-12-12 14:35:06 -05:00
332c2fe5a2 Made lookarounds a little more efficient by only matching from (or to, in the case of lookbehind) the current index 2024-12-11 00:31:08 -05:00
3fda07280e Added more tests 2024-12-11 00:30:37 -05:00
e2b08f8d5f Updated TODO 2024-12-11 00:17:29 -05:00
84cccc73ec Added grouping tests 2024-12-11 00:16:35 -05:00
437ca2ee57 Improved submatch tracking by storing all group indices as a part of the state, which is viewed as a 'thread' 2024-12-11 00:16:24 -05:00
00902944f6 Added code to match capturing groups and store into a Group (used to be MatchIndex) 2024-12-09 01:28:18 -05:00
80ea262064 Updated test-case structs to reflect the name of the new type 2024-12-09 01:06:18 -05:00
f5eb9c8218 Defined postfixNodes for LPAREN and RPAREN 2024-12-09 01:05:47 -05:00
20fbd20994 Added helper function to expand a slice to a given length 2024-12-09 01:05:26 -05:00
11f7f1d746 Added fields to state, to determine capturing group information. 0th group refers to entire match 2024-12-09 01:05:01 -05:00
822d1f319f Added initial support for capturing groups 2024-12-09 01:04:31 -05:00
745fab9639 Clone lookaroundNFA when cloning a state; use compiled regex for
lookarounds instead of compiling a new one
2024-11-27 12:15:30 -05:00
34e9aedbd6 Compile lookaround regex to avoid compiling each time we want to use it 2024-11-27 12:15:01 -05:00
6208f32710 Added support for numeric ranges: <5-38> will match all numbers between 5 and 38, inclusive on both ends. Also print line number on which matches occur, if we are in printing (and single line) mode 2024-11-27 11:48:04 -05:00
cbd6ea136b If the NFA starts with an assertion, make sure it's true before doing anything else. Also, check for last-state _lookaround_ rather than just last state, before breaking (instead of aborting) when the assertion fails 2024-11-27 11:46:38 -05:00
eb6a044ecf Added angle brackets to list of special characters (which need to be escaped to be used literally) 2024-11-27 11:45:27 -05:00
393769f152 Accounted for last character being a newline when checking for EOS (we can be at the second-last character if the last one is a newline) 2024-11-27 11:44:39 -05:00
e36310b32d Added function (and helper functions) to generate a regex that matches all numbers in a range 2024-11-27 11:43:57 -05:00
298285e44c Added more test cases 2024-11-27 11:43:34 -05:00
0de3a94ce3 Fixed bug with lookaheads: f(?=f) would not match anything in 'ffa', because of the 'a' at the end of the string. Fixed by checking if there are other last states when an assertion fails, rather than immediately aborting 2024-11-24 15:04:51 -05:00
fe1136c54c Fixed bug with parentheses in lookaround regex; fixed bug with reading last line of test string (if it doesn't end in a newline) 2024-11-24 15:02:58 -05:00
25c333bea4 Added function to determine if a state is a lookaround 2024-11-24 15:01:06 -05:00
74c177324b Added more test cases 2024-11-24 15:00:47 -05:00
7916629c4d Added substitute flag - substitute matched text with given text 2024-11-23 10:12:22 -05:00
ee02e7575e Added function to generate all case variations of a rune 2024-11-23 09:26:27 -05:00
c87a4b7136 Added case-insensitive flag 2024-11-23 09:26:11 -05:00
924e2a8dbc Added some AI-generated test cases (llama3.1:405b) 2024-11-22 00:12:41 -05:00
21b2d5a2a9 Added lookaround-related fields to postfixNode struct 2024-11-22 00:12:22 -05:00
77d19cd84e Added lookaround-related fields to State struct, added lookaround support to checkAssertion() 2024-11-22 00:11:51 -05:00
051a8551f3 Match zero-length match at end of string, even if the start node is an assertion (end of string, lookarounds, etc.) 2024-11-22 00:10:58 -05:00
11c0a0552f Added support for lookarounds; parsing and adding nodes for different lookarounds 2024-11-22 00:10:15 -05:00
c807d6664e Changed generation of characters for non-whitespace, non-digit and non-word characters - it's basically an inverted character class now 2024-11-20 10:39:24 -05:00
d986999001 Added more tests 2024-11-20 10:38:57 -05:00
ea64ddc88a Removed unnecessary duplication of assertion checking 2024-11-20 10:38:41 -05:00
1ba871d618 Removed dotChars() function, moved notDotChars() setting to main() 2024-11-20 10:38:22 -05:00
1a1a8f4f9c Moved flag-checking after flag.Parse() 2024-11-20 10:37:33 -05:00
e882f41400 Added fields to denote all the characters that an 'allChars' postfixNode _shouldn't_ represent (useful for inverting character classes) 2024-11-20 09:41:37 -05:00