24fa365be1
Moved some auxiliary functions into compile.go; use new API for compiling and finding matches
2025-01-06 20:14:57 -06:00
1da3f7f0e0
Changed API for match-finding functions - take in a Reg instead of start state and numGroups separately
2025-01-06 20:14:19 -06:00
8e8067482a
Rewrote to use new API for compiling and finding matches
2025-01-06 20:12:18 -06:00
644ed15af0
Use new API for findAllMatches
2025-01-06 20:10:25 -06:00
c8613c1ba2
Major restructuring - added new type, changed return types for shuntingYard and thompson
...
I added a new function 'Compile' that calls shuntingYard and thompson. I also added
a new type 'Reg' that this function returns - it represents the starting state and contains
the number of capturing groups in the regex. I also rewrote shuntingYard and thompson
to return errors instead of panicking.
2025-01-06 20:08:24 -06:00
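Note on the commit above: a minimal sketch of what a 'Compile' entry point returning a 'Reg' and an error could look like. The field names 'start' and 'numGroups', the empty 'State' stand-in, and the parenthesis-counting body are assumptions for illustration only; the real function calls shuntingYard and thompson, which are not reproduced here. Character classes are ignored for brevity.

```go
package main

import (
	"errors"
	"fmt"
)

// State is a placeholder for an NFA state (hypothetical; the real type has more fields).
type State struct{}

// Reg bundles the start state with the number of capturing groups,
// so match-finding functions can take a single argument.
type Reg struct {
	start     *State
	numGroups int
}

// Compile is a stand-in: it only checks parenthesis balance and counts
// capturing groups, returning an error instead of panicking on bad input.
func Compile(re string) (Reg, error) {
	depth, groups := 0, 0
	for i := 0; i < len(re); i++ {
		switch re[i] {
		case '\\':
			i++ // skip the escaped character
		case '(':
			depth++
			// "(?" starts a non-capturing group or lookaround, so it is not counted.
			if !(i+1 < len(re) && re[i+1] == '?') {
				groups++
			}
		case ')':
			depth--
			if depth < 0 {
				return Reg{}, errors.New("unmatched ')'")
			}
		}
	}
	if depth != 0 {
		return Reg{}, errors.New("unmatched '('")
	}
	return Reg{start: &State{}, numGroups: groups}, nil
}

func main() {
	reg, err := Compile("(a)(?:b)(c)")
	fmt.Println(reg.numGroups, err)
	_, err = Compile("(a")
	fmt.Println(err)
}
```

Returning an error lets the caller decide how to report a malformed regex, rather than crashing the whole program.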
ddbcb309b0
Made shuntingYard return an error instead of panicking, moved it and thompson to compile.go
2025-01-06 12:29:04 -06:00
72263509d3
Rewrote behavior of '-m' flag to use the 'nth match' function from matching.go
2025-01-05 21:41:14 -06:00
4373d35216
Wrote function to find the 'n'th match of a regex
2025-01-05 21:40:53 -06:00
3fa4d0f75e
Updated TODO
2025-01-03 19:18:00 -05:00
6f9173f771
Finished support for -m flag; refactoring pending
2025-01-03 19:17:24 -05:00
8a0586d107
Added support for printing specific match indices ('-m' and '-p' flags combined)
2025-01-03 15:49:14 -06:00
13ca954072
Started working on '-m num' flag: print the <num>th match
2024-12-19 04:29:05 -05:00
85eb13287e
Updated TODO
2024-12-19 04:28:36 -05:00
e83d746ded
Added more test cases
2024-12-18 15:22:50 -05:00
98f4c9e418
Added support for non-capturing groups
2024-12-18 15:22:43 -05:00
8d6e1a41a5
Fixed bug where a repeated capturing group, e.g. (a){3}, failed to capture only the last iteration, as it should
2024-12-16 22:58:39 -05:00
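Note on the commit above: the expected behaviour (a repeated group reports only its last iteration) can be illustrated with Go's standard 'regexp' package. This uses the standard library, not this project's engine, and the helper name 'lastIteration' is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// lastIteration returns the [start, end) indices of capturing group 1
// for the first match of pattern in s, or nil if there is none.
func lastIteration(pattern, s string) []int {
	loc := regexp.MustCompile(pattern).FindStringSubmatchIndex(s)
	if len(loc) < 4 {
		return nil // no match, or the pattern has no capturing group
	}
	return loc[2:4]
}

func main() {
	// The whole match spans 0..3, but group 1 holds only the third 'a'.
	fmt.Println(lastIteration(`(a){3}`, "aaa")) // prints [2 3]
}
```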
93a5e24c8d
Added more tests
2024-12-16 22:32:36 -05:00
61bced606e
Added comments - certain members of State depend on the current match, should be reset
2024-12-16 22:32:22 -05:00
71cab59a89
Got rid of unnecessary special case to match at end-of-string
...
Instead, I tweaked the rest of the matching function, so that a special
check isn't necessary. If we are trying to match at the end of a string,
we skip any of the actual matching and proceed straight to finding
0-length matches.
This change was made because, with the special case, capturing groups
weren't getting updated if we had an end-of-string match.
2024-12-12 14:49:45 -05:00
8c8e209587
Removed return values that weren't being used
2024-12-12 14:35:06 -05:00
332c2fe5a2
Made lookarounds a little more efficient by only matching from (or to, in the case of lookbehind) the current index
2024-12-11 00:31:08 -05:00
3fda07280e
Added more tests
2024-12-11 00:30:37 -05:00
e2b08f8d5f
Updated TODO
2024-12-11 00:17:29 -05:00
84cccc73ec
Added grouping tests
2024-12-11 00:16:35 -05:00
437ca2ee57
Improved submatch tracking by storing all group indices as a part of the state, which is viewed as a 'thread'
2024-12-11 00:16:24 -05:00
00902944f6
Added code to match capturing groups and store into a Group (used to be MatchIndex)
2024-12-09 01:28:18 -05:00
80ea262064
Updated test-case structs to reflect the name of the new type
2024-12-09 01:06:18 -05:00
f5eb9c8218
Defined postfixNodes for LPAREN and RPAREN
2024-12-09 01:05:47 -05:00
20fbd20994
Added helper function to expand a slice to a given length
2024-12-09 01:05:26 -05:00
11f7f1d746
Added fields to state, to determine capturing group information. 0th group refers to entire match
2024-12-09 01:05:01 -05:00
822d1f319f
Added initial support for capturing groups
2024-12-09 01:04:31 -05:00
745fab9639
Clone lookaroundNFA when cloning a state; use compiled regex for lookarounds instead of compiling a new one
2024-11-27 12:15:30 -05:00
34e9aedbd6
Compile lookaround regex to avoid compiling each time we want to use it
2024-11-27 12:15:01 -05:00
6208f32710
Added support for numeric ranges: <5-38> will match all numbers between 5 and 38, inclusive on both ends. Also print the line number on which matches occur, if we are in printing (and single-line) mode
2024-11-27 11:48:04 -05:00
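Note on the commit above: a naive sketch of turning a numeric range into a regex by listing every number as an alternative. This is not necessarily how the project generates its range regex (a digit-by-digit construction would be more compact); the names 'rangeRegex' and 'inRange' are hypothetical, and the check uses Go's standard 'regexp' package. It assumes 0 <= lo <= hi.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// rangeRegex builds a non-capturing alternation of every integer in [lo, hi].
func rangeRegex(lo, hi int) string {
	var parts []string
	for n := lo; n <= hi; n++ {
		parts = append(parts, strconv.Itoa(n))
	}
	return "(?:" + strings.Join(parts, "|") + ")"
}

// inRange reports whether s, taken as a whole, is a number in [lo, hi].
func inRange(lo, hi int, s string) bool {
	re := regexp.MustCompile("^" + rangeRegex(lo, hi) + "$")
	return re.MatchString(s)
}

func main() {
	fmt.Println(inRange(5, 38, "5"), inRange(5, 38, "38")) // both ends are inclusive
	fmt.Println(inRange(5, 38, "4"), inRange(5, 38, "39")) // just outside the range
}
```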
cbd6ea136b
If the NFA starts with an assertion, make sure it's true before doing anything else. Also, check for last-state _lookaround_ rather than just last state, before breaking (instead of aborting) when the assertion fails
2024-11-27 11:46:38 -05:00
eb6a044ecf
Added angle brackets to list of special characters (which need to be escaped to be used literally)
2024-11-27 11:45:27 -05:00
393769f152
Accounted for last character being a newline when checking for EOS (we can be at the second-last character if the last one is a newline)
2024-11-27 11:44:39 -05:00
e36310b32d
Added function (and helper functions) to generate a regex that matches all numbers in a range
2024-11-27 11:43:57 -05:00
298285e44c
Added more test cases
2024-11-27 11:43:34 -05:00
0de3a94ce3
Fixed bug with lookaheads: f(?=f) would not match anything in 'ffa', because of the 'a' at the end of the string. Fixed by checking if there are other last states when an assertion fails, rather than immediately aborting
2024-11-24 15:04:51 -05:00
fe1136c54c
Fixed bug with parentheses in lookaround regex; fixed bug with reading last line of test string (if it doesn't end in a newline)
2024-11-24 15:02:58 -05:00
25c333bea4
Added function to determine if a state is a lookaround
2024-11-24 15:01:06 -05:00
74c177324b
Added more test cases
2024-11-24 15:00:47 -05:00
7916629c4d
Added substitute flag - substitute matched text with given text
2024-11-23 10:12:22 -05:00
ee02e7575e
Added function to generate all case variations of a rune
2024-11-23 09:26:27 -05:00
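Note on the commit above: one possible way to list all case variations of a rune is to walk its case-folding orbit with 'unicode.SimpleFold' from the standard library. This is a sketch and not necessarily what the project does; the name 'allCases' is hypothetical.

```go
package main

import (
	"fmt"
	"unicode"
)

// allCases returns r followed by every other rune that is equivalent to it
// under Unicode simple case folding.
func allCases(r rune) []rune {
	variants := []rune{r}
	// SimpleFold cycles through the orbit and eventually returns to r.
	for f := unicode.SimpleFold(r); f != r; f = unicode.SimpleFold(f) {
		variants = append(variants, f)
	}
	return variants
}

func main() {
	fmt.Println(string(allCases('a'))) // prints aA
	fmt.Println(string(allCases('1'))) // prints 1
}
```

A case-insensitive matcher can then accept any rune in the returned slice wherever the pattern contains r.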
c87a4b7136
Added case-insensitive flag
2024-11-23 09:26:11 -05:00
924e2a8dbc
Added some AI-generated test cases (llama3.1:405b)
2024-11-22 00:12:41 -05:00
21b2d5a2a9
Added lookaround-related fields to postfixNode struct
2024-11-22 00:12:22 -05:00
77d19cd84e
Added lookaround-related fields to State struct, added lookaround support to checkAssertion()
2024-11-22 00:11:51 -05:00
051a8551f3
Find zero-length match at end of string, even if the start node is an assertion (end of string, lookarounds, etc.)
2024-11-22 00:10:58 -05:00