Commit Graph

381 Commits

Author SHA1 Message Date
3f5f8fad2c Removed unnecessary functions (using staticcheck) 2025-01-27 19:58:59 -05:00
e671137493 Changed error messages - removed capitalization and punctuation to follow Go's error message guidelines 2025-01-27 19:57:15 -05:00
abc40bf770 Return an error if a POSIX charclass is specified outside of brackets 2025-01-27 16:07:23 -05:00
3fb9bc1446 Added support for POSIX character classes 2025-01-27 16:00:35 -05:00
ae76e2e55e Added a function to generate a slice of all values (inclusive) in a range 2025-01-27 16:00:26 -05:00
dec6aaca93 Added POSIX charclass tests 2025-01-27 16:00:05 -05:00
43d0cbf0a0 Use 'CONCAT' instead of literally specifying the rune 2025-01-27 13:54:02 -05:00
68a3581d93 Added note on PCRE backreferences 2025-01-26 22:19:16 -05:00
ff250338b4 Added more tests; added backreference comment 2025-01-26 22:18:34 -05:00
0367c0d614 Added more tests 2025-01-26 10:24:29 -05:00
304ef68d45 Added more tests 2025-01-25 22:41:43 -05:00
1db61108e4 Allow pipes that have a missing operand - if an operand is missing, it is replaced with a zeroLengthMatchState(), which always has a zero-length match 2025-01-25 22:36:58 -05:00
8feaefeeb8 Added more tests 2025-01-25 22:36:04 -05:00
a259f0ceab Created a function to return a state that will always have a zero-length match 2025-01-25 22:35:52 -05:00
08e01a1c81 Loosened restrictions for concatenation - it's okay if one of the elements is missing 2025-01-25 13:09:47 -05:00
5c2869ff81 Updated test case 2025-01-25 13:09:29 -05:00
4dfc77900f Added new assertion that always evaluates to true 2025-01-25 13:04:51 -05:00
93903fc557 Allowed creation of empty non-capturing groups 2025-01-25 13:04:36 -05:00
036e625a15 Added more test cases 2025-01-25 13:04:08 -05:00
4966a222f9 Added detection of empty parentheses, as zero-length matches 2025-01-25 12:44:40 -05:00
263619c50c Added more test cases 2025-01-25 12:23:15 -05:00
d7c9c181e1 Fixed bug in character class implementation 2025-01-24 19:48:53 -05:00
5a085907cf WIP - fixing character classes 2025-01-24 17:06:19 -05:00
65e5b4e2af Added more test cases 2025-01-24 17:06:00 -05:00
1520edad55 Enforce the rule that character classes must have at least one character; interpret literal closing brackets as regular characters 2025-01-24 15:50:36 -05:00
6fb266e0d2 Refactored isNormalChar(), wrote function to get special characters that have metachar replacements 2025-01-24 15:49:33 -05:00
423fcc9b54 Added more test cases (1 failing) 2025-01-24 14:58:18 -05:00
cf4d305b31 Allow hyphen to be escaped inside character class 2025-01-24 14:58:07 -05:00
9d3c228ace Fixed edge cases with character ranges and character classes 2025-01-24 14:57:47 -05:00
5e12fe1c42 Added 'flags' field to test struct for all-group tests 2025-01-24 11:11:48 -05:00
f87458ee99 Added 'flags' field to test struct for 0-group tests 2025-01-24 11:10:01 -05:00
2937f2d917 Removed old comment 2025-01-22 20:27:35 -05:00
efab70f9dc Implemented character range detection later in the code, using a metacharacter 2025-01-22 20:26:58 -05:00
cf964e41db Modified genRange() so that it can work on ints and runes 2025-01-22 20:25:49 -05:00
649485f01d Removed character range creation from the first part of shuntingYard() (the part that adds concatenation operators), because octal and hex values haven't yet been deciphered at this point in the code 2025-01-22 16:51:00 -05:00
ae09462bd4 Added important note 2025-01-21 22:19:37 -05:00
d210a85253 Updated handling of '\b' when inside character class, made invalid
escapes an error.

The '\b' value refers to a word boundary normally, but refers to the
backspace ASCII value inside a character class. I updated
newEscapedNode() to deal with this. I also changed the behavior, so that
trying to escape any other value results in an error, instead of just
returning the character as-is.
2025-01-21 22:14:38 -05:00
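
The context-dependent meaning of '\b' described in d210a85253 can be sketched roughly as below. This is an illustration under assumptions, not the project's newEscapedNode(): the names resolveEscape, escapeKind, literalChar and wordBoundary are hypothetical, and the set of accepted escapes is a small illustrative subset.

```go
package main

import "fmt"

// escapeKind is a hypothetical tag for what an escape sequence resolves to.
type escapeKind int

const (
	literalChar  escapeKind = iota // the escape stands for a single rune
	wordBoundary                   // the escape is the \b assertion
)

// resolveEscape is a hypothetical helper that interprets the character
// following a backslash. Only a few escapes are handled here; anything
// else is rejected instead of being returned as-is.
func resolveEscape(c rune, inCharClass bool) (escapeKind, rune, error) {
	switch c {
	case 'b':
		if inCharClass {
			// Inside [...], \b is the ASCII backspace (0x08).
			return literalChar, '\b', nil
		}
		// Outside a character class, \b is a word boundary.
		return wordBoundary, 0, nil
	case 'n':
		return literalChar, '\n', nil
	case '\\', '[', ']':
		return literalChar, c, nil
	default:
		return literalChar, 0, fmt.Errorf("invalid escape character %q", c)
	}
}

func main() {
	kind, r, _ := resolveEscape('b', true)
	fmt.Println(kind == literalChar, r == 0x08)

	kind, _, _ = resolveEscape('b', false)
	fmt.Println(kind == wordBoundary)

	_, _, err := resolveEscape('q', false)
	fmt.Println(err)
}
```

Passing the "inside a character class" flag explicitly keeps the two meanings of '\b' in one place, which is one way to make the invalid-escape error uniform.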
48cff259b2 Updated tests 2025-01-21 22:13:57 -05:00
25cb79f01b Changed the value of EPSILON, so that we can use the NUL character
(which it used to be) in a regex; also added code to detect escaped
backslashes

Specifically, I replace an escaped backslash with a metacharacter, then
replace it back later on. This avoids problems such as wrongly treating
the opening bracket in '\\[a]' as escaped.
2025-01-21 22:12:29 -05:00
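
The placeholder approach described in 25cb79f01b might look something like the sketch below. It is not taken from the project: the function names and the placeholder value 0xF0001 (a private-use code point) are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// escapedBackslash is a hypothetical placeholder from a Unicode private-use
// area; the real project may use a different value.
const escapedBackslash rune = 0xF0001

// hideEscapedBackslashes replaces every `\\` pair with the placeholder, so
// that any backslash still present afterwards is a genuine escape prefix.
func hideEscapedBackslashes(re string) string {
	return strings.ReplaceAll(re, `\\`, string(escapedBackslash))
}

// restoreEscapedBackslashes turns each placeholder back into a `\\` pair.
func restoreEscapedBackslashes(re string) string {
	return strings.ReplaceAll(re, string(escapedBackslash), `\\`)
}

// isEscaped reports whether the rune at index i is preceded by a backslash.
// It is only reliable on input that went through hideEscapedBackslashes.
func isEscaped(runes []rune, i int) bool {
	return i > 0 && runes[i-1] == '\\'
}

func main() {
	hidden := []rune(hideEscapedBackslashes(`\\[a]`))
	// hidden is now: placeholder, '[', 'a', ']'.
	fmt.Println(isEscaped(hidden, 1)) // false: the bracket opens a class
	fmt.Println(restoreEscapedBackslashes(string(hidden)) == `\\[a]`) // true
}
```

Without the first pass, a naive "is the previous character a backslash" check would treat the bracket in '\\[a]' as escaped.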
0fb78abf7f Added function to replace an element in a slice given its value 2025-01-21 22:09:41 -05:00
9dc4fd4595 Started adding tests from Python's RE test suite 2025-01-20 18:04:19 -05:00
099612ae7f Bug fixes, changed the way I parse octal values 2025-01-20 18:04:05 -05:00
9115858261 Shifted the assigned Unicode values by 1, so that EPSILON can now be 0xF0000 2025-01-20 17:08:07 -05:00
fb46ed62d9 Added tests for FindString 2025-01-19 22:56:47 -05:00
47ec95f7bb Created function that returns a 'default' state 2025-01-19 21:45:07 -06:00
a14ab81697 Updated function names, added new function 'FindString' that returns the _text_ of the match 2025-01-19 21:44:15 -06:00
7056026e10 Added a new class 'CHARCLASS', which represents a character class with some other postfixNodes in it. The 'except' field now contains a list of postfixNodes rather than runes 2025-01-19 21:43:21 -06:00
b81a2f8452 Added functions to find if a character is a valid hex value and a valid octal value 2025-01-19 21:31:18 -06:00
fcdb4a8868 Added another test, changed function calls to match new names 2025-01-19 21:30:56 -06:00
3a3333b38a New features, changed character class behavior

I added support for hex values (e.g. \x0F), octal values (e.g. \012) and
extended hex values (e.g. \x{000F2A}). I also expanded the abilities of
character classes, to include things like escaped characters (e.g. [aefp\)])
and character ranges _inside_ inverted character classes (e.g. [^\w], which is
functionally equivalent to [\W]).
2025-01-19 21:26:56 -06:00
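
As a rough illustration of the three numeric escape forms listed in 3a3333b38a, the sketch below decodes each into a rune using the standard library. decodeNumericEscape is a hypothetical name, and the project's own parser may work quite differently (for example by scanning runes in place).

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// decodeNumericEscape is a hypothetical helper that converts `\xHH`,
// `\x{H...}` or an octal escape such as `\012` into the rune it denotes.
func decodeNumericEscape(esc string) (rune, error) {
	if len(esc) < 2 || esc[0] != '\\' {
		return 0, fmt.Errorf("not an escape sequence: %q", esc)
	}
	body := esc[1:]
	base := 8 // anything without an 'x' prefix is treated as octal
	switch {
	case strings.HasPrefix(body, "x{") && strings.HasSuffix(body, "}"):
		body, base = body[2:len(body)-1], 16 // extended hex: \x{...}
	case strings.HasPrefix(body, "x"):
		body, base = body[1:], 16 // plain hex: \xHH
	}
	// ParseUint rejects signs and non-digit characters for the given base.
	v, err := strconv.ParseUint(body, base, 32)
	if err != nil {
		return 0, fmt.Errorf("invalid numeric escape %q: %w", esc, err)
	}
	if v > 0x10FFFF {
		return 0, fmt.Errorf("numeric escape %q is out of range", esc)
	}
	return rune(v), nil
}

func main() {
	for _, esc := range []string{`\x0F`, `\012`, `\x{000F2A}`} {
		r, err := decodeNumericEscape(esc)
		fmt.Println(esc, r, err)
	}
}
```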