126 Commits (c807d6664ecffaa403148744a998d840e58eb2c3)
 

Author SHA1 Message Date
Aadhavan Srinivasan c807d6664e Changed generation of characters for non-whitespace, non-digit and non-word characters - it's basically an inverted character class now 1 month ago
Aadhavan Srinivasan d986999001 Added more tests 1 month ago
Aadhavan Srinivasan ea64ddc88a Removed unnecessary duplication of assertion checking 1 month ago
Aadhavan Srinivasan 1ba871d618 Removed dotChars() function, moved notDotChars() setting to main() 1 month ago
Aadhavan Srinivasan 1a1a8f4f9c Moved flag-checking after flag.Parse() 1 month ago
Aadhavan Srinivasan e882f41400 Added fields to denote all the characters that an 'allChars' postfixNode _shouldn't_ represent (useful for inverting character classes) 1 month ago
Aadhavan Srinivasan b3ee1fe5e8 Convert an inverting character class into an 'allChars' node, with the characters marked as exceptions 1 month ago
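
The inverted-class idea described in the commits above can be sketched roughly as follows. This is only an illustration under stated assumptions: `charNode`, `newClass`, `newInvertedClass` and `matches` are hypothetical names, not the repository's `postfixNode` API, whose actual fields are not shown in this log.

```go
package main

import "fmt"

// charNode is a hypothetical node type (an assumption for illustration,
// not the repository's postfixNode).
type charNode struct {
	allChars bool              // true: match any character not listed in except
	chars    map[rune]struct{} // members, used when allChars is false
	except   map[rune]struct{} // exceptions, used when allChars is true
}

// toSet turns a list of runes into a set for constant-time lookups.
func toSet(rs []rune) map[rune]struct{} {
	set := make(map[rune]struct{}, len(rs))
	for _, r := range rs {
		set[r] = struct{}{}
	}
	return set
}

// newClass builds an ordinary class such as [abc].
func newClass(members ...rune) charNode {
	return charNode{chars: toSet(members)}
}

// newInvertedClass builds [^abc] as "all characters, except a, b and c".
func newInvertedClass(excluded ...rune) charNode {
	return charNode{allChars: true, except: toSet(excluded)}
}

// matches reports whether the node accepts the rune r.
func (n charNode) matches(r rune) bool {
	if n.allChars {
		_, excluded := n.except[r]
		return !excluded
	}
	_, ok := n.chars[r]
	return ok
}

func main() {
	notVowel := newInvertedClass('a', 'e', 'i', 'o', 'u')
	fmt.Println(notVowel.matches('x'), notVowel.matches('e')) // true false
}
```

Storing only the exceptions keeps an inverted class small, since the set of characters it does match (nearly all of Unicode) never has to be enumerated.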
Aadhavan Srinivasan 708a9e1303 Added field to denote all characters which an 'allChars' node _shouldn't_ match (useful for inverting character classes) 1 month ago
Aadhavan Srinivasan c694c47be7 Added flag to print match indices, and to enable multi-line mode 1 month ago
Aadhavan Srinivasan 2569f52552 Wrote toString function for MatchIndex 1 month ago
Aadhavan Srinivasan 160b2f9215 Added newline character as an escaped node 1 month ago
Aadhavan Srinivasan 992c5a9300 Replaced isAlphaNum() with isNormalChar(), which returns true if the character isn't special (also returns true for unicode characters, which the previous function didn't) 1 month ago
Aadhavan Srinivasan 1e0502c6aa Added unicode tests 1 month ago
Aadhavan Srinivasan c56d81a335 Added unicode support to dot metacharacter - it now matches _any_ unicode character (almost) 1 month ago
Aadhavan Srinivasan 8a1f1dc621 Added unicode support
Replaced strings with rune-slices, which capture unicode codepoints more
accurately.
1 month ago
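
As a small illustration of why rune-slices capture code points more accurately than strings: indexing a Go string yields bytes, while indexing a `[]rune` yields whole code points. The helper `runeAt` below is a hypothetical name used only for this sketch.

```go
package main

import "fmt"

// runeAt returns the i-th Unicode code point of s rather than the i-th byte.
// The []rune conversion is O(n), so a matcher would normally convert once
// up front instead of on every lookup.
func runeAt(s string, i int) rune {
	return []rune(s)[i]
}

func main() {
	s := "h\u00e9llo" // "héllo"; U+00E9 occupies two bytes in UTF-8

	fmt.Println(len(s))              // 6 (bytes)
	fmt.Println(len([]rune(s)))      // 5 (code points)
	fmt.Printf("%c\n", runeAt(s, 1)) // é
}
```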
Aadhavan Srinivasan 805766a5ba Added support for -l : only print lines with at least one match (or with exactly 0 matches, if -v is enabled) 1 month ago
Aadhavan Srinivasan dcd712dceb Added support for -o flag: only print matching content 1 month ago
Aadhavan Srinivasan f2b8812b05 Added support for -v flag, to invert which values are printed in color. Also got rid of unnecessary 'else' clause 1 month ago
Aadhavan Srinivasan 11641596fa Read multiple lines from stdin and apply regex to each one; Convert the array of matchIndex structs into a flat array of indices; speeds up process of checking if we have to print a character in color 1 month ago
Aadhavan Srinivasan b55b80ec6c Updated TODO
I didn't like the existing capturing group implementation, so I moved
that to a separate branch. This branch does not (at the moment) contain
any code relating to capturing groups.
1 month ago
Aadhavan Srinivasan 137ea3c746 Made findAllMatchesHelper non-recursive, added pruneIndices (improved performance) and more changes
I made findAllMatchesHelper a non-recursive function. It now only
returns the first match it finds in the string (so I should probably
rename it).

These indices are collected by findAllMatches and pruned (to
remove overlaps). The overlap function has also been rewritten, to make
it (I believe) less than O(n^2). I also used the uniq_arr type to make
checking for uniqueness O(1) instead of O(n) (as it was with
unique_append()). This has resulted in massive performance gains.

There have been a lot of changes here, and I probably haven't documented
all of them.
2 months ago
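
The move from an O(n) uniqueness check to an O(1) one can be sketched with a map-backed set. This is a minimal illustration, assuming integer elements; `uniqueInts` and its methods are hypothetical names and not the repository's `uniq_arr` implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// uniqueInts is a hypothetical stand-in for the uniq_arr type mentioned in
// the commit message; the real type's layout is not shown in this log.
type uniqueInts struct {
	seen map[int]struct{}
}

func newUniqueInts() *uniqueInts {
	return &uniqueInts{seen: make(map[int]struct{})}
}

// add inserts v and reports whether it was new. The map lookup is O(1) on
// average, compared with scanning a slice for duplicates before appending.
func (u *uniqueInts) add(v int) bool {
	if _, ok := u.seen[v]; ok {
		return false
	}
	u.seen[v] = struct{}{}
	return true
}

// contains reports whether v has been added.
func (u *uniqueInts) contains(v int) bool {
	_, ok := u.seen[v]
	return ok
}

// values builds a sorted slice on demand instead of maintaining a list
// alongside the map, so the set itself never needs O(n) deletion.
func (u *uniqueInts) values() []int {
	out := make([]int, 0, len(u.seen))
	for v := range u.seen {
		out = append(out, v)
	}
	sort.Ints(out)
	return out
}

func main() {
	u := newUniqueInts()
	for _, v := range []int{3, 1, 3, 2} {
		u.add(v)
	}
	fmt.Println(u.values()) // [1 2 3]
}
```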
Aadhavan Srinivasan 9201ed49bd Changed type from matchIndex to MatchIndex 2 months ago
Aadhavan Srinivasan 9a073aa514 Added node types for left and right parentheses 2 months ago
Aadhavan Srinivasan 7d265495f5 Got rid of list for uniq_arr (O(n) deletion) and instead have separate method to create list (O(n) list creation) 2 months ago
Aadhavan Srinivasan e2e99ff6a9 Added function to generate numbers in a range; added capacity to some slices to prevent unnecessary reallocations 2 months ago
Aadhavan Srinivasan 8a69ea8cb7 Added unique array data structure - O(1) addition and retrieval (I think) 2 months ago
Aadhavan Srinivasan ea17251bf8 Might have made a change to improve performance 2 months ago
Aadhavan Srinivasan e8aca8606a Added test cases 2 months ago
Aadhavan Srinivasan 9698c4f1d8 Fixed error in calculating word boundary (off-by-one) 2 months ago
Aadhavan Srinivasan c032dcb2ea Added more test cases 2 months ago
Aadhavan Srinivasan 269e2d0e1c Updated go.mod 2 months ago
Aadhavan Srinivasan 21142e6e13 Wrote function to clone the NFA starting at a given state, and a function to find question mark operator (a? == (a|)) 2 months ago
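
The identity `a? == (a|)` used in the commit above can be checked against Go's standard `regexp` package. This sketch only demonstrates the equivalence on whole-string matches; it says nothing about how this repository builds its NFA, and `fullMatch` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// fullMatch reports whether pattern matches the entire input, by wrapping
// the pattern in a non-capturing group anchored at both ends.
func fullMatch(pattern, input string) bool {
	re := regexp.MustCompile(`^(?:` + pattern + `)$`)
	return re.MatchString(input)
}

func main() {
	// "a?" and "(a|)" should accept exactly the same strings: "" and "a".
	for _, input := range []string{"", "a", "aa", "b"} {
		fmt.Printf("%q: a?=%v (a|)=%v\n",
			input, fullMatch(`a?`, input), fullMatch(`(a|)`, input))
	}
}
```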
Aadhavan Srinivasan b602295bee Added support for specifying how often a postfixNode is repeated 2 months ago
Aadhavan Srinivasan 1d9d1a5b81 Fixed calculation of overlapping (used to check for subset instead) 2 months ago
Aadhavan Srinivasan d8f52b8ccc Added support for numeric specifiers, moved question mark operator to its own function 2 months ago
Aadhavan Srinivasan dca81c1796 Replaced rune-slice parameters with string parameters in functions; avoids unnecessary conversion from strings to rune-slices 2 months ago
Aadhavan Srinivasan 723be527fb Updated TODO 2 months ago
Aadhavan Srinivasan fccd3a76f5 Wrote function to check if the assertion of a state is true 2 months ago
Aadhavan Srinivasan 315f68df12 Fixed typo 2 months ago
Aadhavan Srinivasan fd957d9518 Added more test cases 2 months ago
Aadhavan Srinivasan 19dc5064c8 Made conditions for word boundary a little more relaxed 2 months ago
Aadhavan Srinivasan a19d409796 Set node type to ASSERTION if the character represents an assertion 2 months ago
Aadhavan Srinivasan 0736e813c1 Fixed boneheaded mistake with checking assertion types 2 months ago
Aadhavan Srinivasan 1aff6e2fa4 Added a field to State, that tells me what kind of assertion (if any) it is making. Also added function to check if a state's contents contain a given value (checks assertions), and to find all matches that a state has for a character 2 months ago
Aadhavan Srinivasan f3bf5e9740 Added function to check for word boundaries and delete an element from a slice 2 months ago
Aadhavan Srinivasan 20db62c596 Got rid of function that I don't need anymore 2 months ago
Aadhavan Srinivasan 360bdc8e11 Big rewrite - assertion handling, zero-match fixes, change in recursive calls
I added support for transitions. I wrote a function to determine if
a given state has transitions for a character at a given point in the
string. This helps me check if the current state has an assertion, and
take actions based on that.

I also fixed zero-length matching (almost, see todo.txt). It works for
nearly all cases I could think of, although I still need to write more
tests. I wrote a function to check if zero-length matches are possible
with a given state.

I also changed the way recursive calls work. Rather than passing a
modified string, the function stores the location in the input string.
This location is updated with each call to the function.

Finally, the function now increments the offset by 1 instead of
incrementing by the length of the longest match. This leads to a bit of
overhead, e.g. if a regex matches index 1-5, then 1-5, 2-5, 3-5, 4-5 are
all stored. To fix this, I wrote (and used) a function to check if
a match overlaps with any matches in a slice.
2 months ago
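
The overlap filtering described in the last paragraph of this commit message can be sketched as follows. The `span` type and the functions `overlaps`, `overlapsAny` and `pruneOverlaps` are hypothetical names for illustration; the repository's `MatchIndex` fields and its own overlap function are not shown in this log.

```go
package main

import "fmt"

// span is a hypothetical half-open match range [start, end).
type span struct{ start, end int }

// overlaps reports whether two spans share at least one position. Note that
// this tests for intersection, not for one span being a subset of the other.
func overlaps(a, b span) bool {
	return a.start < b.end && b.start < a.end
}

// overlapsAny reports whether m overlaps any span already kept.
func overlapsAny(m span, kept []span) bool {
	for _, k := range kept {
		if overlaps(m, k) {
			return true
		}
	}
	return false
}

// pruneOverlaps keeps each candidate only if it does not overlap an earlier
// kept one, so with candidates ordered by start the leftmost match wins.
func pruneOverlaps(candidates []span) []span {
	var kept []span
	for _, c := range candidates {
		if !overlapsAny(c, kept) {
			kept = append(kept, c)
		}
	}
	return kept
}

func main() {
	// Advancing the offset by 1 yields 1-5, 2-5, 3-5 and 4-5 for one match.
	found := []span{{1, 5}, {2, 5}, {3, 5}, {4, 5}, {7, 9}}
	fmt.Println(pruneOverlaps(found)) // [{1 5} {7 9}]
}
```

This version is quadratic in the worst case and does not treat zero-length matches specially; both would need more care in a real matcher.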
Aadhavan Srinivasan 8dbecde3ae Added support for detecting assertion characters; changed input so that newline isn't required 2 months ago
Aadhavan Srinivasan a752491563 Added more test cases 2 months ago
Aadhavan Srinivasan 656c506aa8 Wrote function to provide correct node for escaped character 2 months ago