924e2a8dbc
Added some AI-generated test cases (llama3.1:405b)
2024-11-22 00:12:41 -05:00
21b2d5a2a9
Added lookaround-related fields to postfixNode struct
2024-11-22 00:12:22 -05:00
77d19cd84e
Added lookaround-related fields to State struct, added lookaround support to checkAssertion()
2024-11-22 00:11:51 -05:00
051a8551f3
Match zero-length match at end of string, even if the start node is an assertion (end of string, lookarounds, etc.)
2024-11-22 00:10:58 -05:00
11c0a0552f
Added support for lookarounds; parsing and adding nodes for different lookarounds
2024-11-22 00:10:15 -05:00
c807d6664e
Changed generation of characters for non-whitespace, non-digit and non-word characters - it's basically an inverted character class now
2024-11-20 10:39:24 -05:00
d986999001
Added more tests
2024-11-20 10:38:57 -05:00
ea64ddc88a
Removed unnecessary duplication of assertion checking
2024-11-20 10:38:41 -05:00
1ba871d618
Removed dotChars() function, moved notDotChars() setting to main()
2024-11-20 10:38:22 -05:00
1a1a8f4f9c
Moved flag-checking after flag.Parse()
2024-11-20 10:37:33 -05:00
e882f41400
Added fields to denote all the characters that an 'allChars' postfixNode _shouldn't_ represent (useful for inverting character classes)
2024-11-20 09:41:37 -05:00
b3ee1fe5e8
Convert an inverting character class into an 'allChars' node, with the characters marked as exceptions
2024-11-20 09:40:40 -05:00
708a9e1303
Added field to denote all characters which an 'allChars' node _shouldn't_ match (useful for inverting character classes)
2024-11-20 09:39:24 -05:00
c694c47be7
Added flag to print match indices, and to enable multi-line mode
2024-11-20 01:06:23 -05:00
2569f52552
Wrote toString function for MatchIndex
2024-11-20 01:04:31 -05:00
160b2f9215
Added newline character as an escaped node
2024-11-20 01:04:01 -05:00
992c5a9300
Replaced isAlphaNum() with isNormalChar(), which returns true if the character isn't special (also returns true for unicode characters, which the previous function didn't)
2024-11-20 00:24:43 -05:00
1e0502c6aa
Added unicode tests
2024-11-20 00:23:57 -05:00
c56d81a335
Added unicode support to dot metacharacter - it now matches _any_ unicode character (almost)
2024-11-18 16:44:43 -05:00
8a1f1dc621
Added unicode support
Replaced strings with rune-slices, which capture unicode codepoints more
accurately.
2024-11-18 10:41:50 -05:00
805766a5ba
Added support for -l: only print lines with at least one match (or with exactly 0 matches, if -v is enabled)
2024-11-18 10:02:34 -05:00
dcd712dceb
Added support for -o flag: only print matching content
2024-11-18 09:36:16 -05:00
f2b8812b05
Added support for -v flag, to invert which values are printed in color. Also got rid of unnecessary 'else' clause
2024-11-17 22:19:55 -05:00
11641596fa
Read multiple lines from stdin and apply regex to each one; Convert the array of matchIndex structs into a flat array of indices; speeds up process of checking if we have to print a character in color
2024-11-17 21:49:11 -05:00
b55b80ec6c
Updated TODO
I didn't like the existing capturing group implementation, so I moved
that to a separate branch. This branch does not (at the moment) contain any code
relating to capturing groups.
2024-11-17 21:29:18 -05:00
137ea3c746
Made findAllMatchesHelper non-recursive, added pruneIndices (improved performance) and more changes
I made findAllMatchesHelper a non-recursive function. It now only
returns the first match it finds in the string (so I should probably
rename it).
These indices are collected by findAllMatches and pruned (to
remove overlaps). The overlap function has also been rewritten, to make
it (I believe) less than O(n^2). I also used the uniq_arr type to make
checking for uniqueness O(1) instead of O(n) (as it was with
unique_append()). This has resulted in massive performance gains.
There have been a lot of changes here, and I probably haven't documented
all of them.
2024-11-07 16:16:50 -05:00
9201ed49bd
Changed type from matchIndex to MatchIndex
2024-11-07 16:12:21 -05:00
9a073aa514
Added node types for left and right parentheses
2024-11-07 15:55:37 -05:00
7d265495f5
Got rid of list for uniq_arr (O(n) deletion) and instead have separate method to create list (O(n) list creation)
2024-11-07 15:55:13 -05:00
e2e99ff6a9
Added function to generate numbers in a range; added capacity to some slices to prevent unnecessary reallocations
2024-11-06 15:16:51 -05:00
8a69ea8cb7
Added unique array data structure - O(1) addition and retrieval (I think)
2024-11-06 15:15:44 -05:00
ea17251bf8
Might have made a change to improve performance
2024-11-04 08:42:26 -05:00
e8aca8606a
Added test cases
2024-11-03 15:09:21 -05:00
9698c4f1d8
Fixed error in calculating word boundary (off-by-one)
2024-11-03 15:04:57 -05:00
c032dcb2ea
Added more test cases
2024-11-03 15:04:19 -05:00
269e2d0e1c
Updated go.mod
2024-11-03 14:38:46 -05:00
21142e6e13
Wrote function to clone the NFA starting at a given state, and a function to find question mark operator (a? == (a|))
2024-11-03 14:37:38 -05:00
b602295bee
Added support for specifying how often a postfixNode is repeated
2024-11-03 14:36:56 -05:00
1d9d1a5b81
Fixed calculation of overlapping (used to check for subset instead)
2024-11-03 14:36:23 -05:00
d8f52b8ccc
Added support for numeric specifiers, moved question mark operator to its own function
2024-11-03 14:36:04 -05:00
dca81c1796
Replaced rune-slice parameters with string parameters in functions; avoids unnecessary conversion from strings to rune-slices
2024-11-01 01:53:50 -04:00
723be527fb
Updated TODO
2024-10-31 17:56:45 -04:00
fccd3a76f5
Wrote function to check if the assertion of a state is true
2024-10-31 17:56:04 -04:00
315f68df12
Fixed typo
2024-10-31 17:55:41 -04:00
fd957d9518
Added more test cases
2024-10-31 17:55:07 -04:00
19dc5064c8
Made conditions for word boundary a little more relaxed
2024-10-31 17:54:45 -04:00
a19d409796
Set node type to ASSERTION if the character represents an assertion
2024-10-31 17:14:56 -04:00
0736e813c1
Fixed boneheaded mistake with checking assertion types
2024-10-31 17:14:03 -04:00
1aff6e2fa4
Added a field to State, that tells me what kind of assertion (if any) it is making. Also added function to check if a state's contents contain a given value (checks assertions), and to find all matches that a state has for a character
2024-10-31 17:13:34 -04:00
f3bf5e9740
Added functions to check for word boundaries and to delete an element from a slice
2024-10-31 17:09:25 -04:00