Grep: Is there a special way to include "no character" as a permutation for a "character position" match?

This one is for the true gurus out there. :slight_smile:

Say I have a start string, call it "beg", and an end string, call it "rem". Is there a way to run a grep in which "no character" is one of the allowed options (instead of the "*"), alongside the other characters listed inside the square brackets of the following command:

grep "${beg}"'[-_*]'"${rem}"

Also, is there a way to specify the pattern " - " (space, hyphen, space) instead of that "[*]" ? Meaning, any (and only) one of

  • [-]
  • [_]
  • {no_character}
  • " - " ?

Is that even possible ?




Some context:

The search string

The Blitzkrieg Myth

is converted by my script into

patterns[1]="[Tt][Hh][Ee]"
patterns[2]="[Bb][Ll][Ii][Tt][Zz][Kk][Rr][Ii][Ee][Gg]"
patterns[3]="[Mm][Yy][Tt][Hh]"

These are then used in a case statement covering 1 to 8 words and applied in the code using the form (3-word pattern shown here):

sp='(?:-| - |_| |){1}'

pSingle="${patterns[1]}${sp}${patterns[2]}${sp}${patterns[3]}"

grep -a -E "${pSingle}" INDEX.allDrives.f.txt
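
For illustration, here is a minimal sketch of how the per-word pieces could be joined with an optional separator. It assumes bash-compatible shell and a grep that supports -E and -i. The helper name `join_words` and the sample lines are hypothetical, `-i` stands in for the per-letter bracket expressions, and the separator is written with plain POSIX parentheses rather than the `(?:...)` form.

```shell
#!/usr/bin/env bash
# Separator (assumption for this sketch): either " - ",
# or at most one of "-", "_", " " (which also covers "no character").
sep='( - |[-_ ]?)'

# Hypothetical helper: join all arguments with the separator group.
join_words() {
    local pattern="$1"
    shift
    local word
    for word in "$@"; do
        pattern="${pattern}${sep}${word}"
    done
    printf '%s\n' "$pattern"
}

pattern="$(join_words The Blitzkrieg Myth)"

# Sample lines stand in for the index file; -i makes the match case-insensitive.
printf '%s\n' \
    'the-blitzkrieg_myth.pdf' \
    'TheBlitzkriegMyth.epub' \
    'The - Blitzkrieg Myth.txt' \
    'The Blitz Myth.txt' |
    grep -i -E "$pattern"
```

The first three sample lines are printed; the last one is not, because "Blitz" does not match the second word.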

Someone on StackOverflow answered the above question with a detailed breakdown of the command structure.

grep -E "${beg}"'(?:-| - |_| |){1}'"${rem}"

grep -E "^beg[-_ ]?rem$"

or

grep -E "^beg( - |[-_ ]?)rem$"

Thank you, Eugene. What you are saying is that there is no need for the "{1}" numeric assignment string for the "(...)" expression to work?

Also, I take it that grep is sensitive to the placement of the "?" ?

Hi, Eric! I constructed that expression by literally following your request:

  • string starts with 'beg' so ^beg
  • string ends with rem so rem$
  • there is '-' or '_' or ' ' in between them so [-_ ]
  • there is only one or none of the mentioned characters so [-_ ]?

Yes, surely. Not grep as such, but regex syntax: X? matches zero or one occurrence of X.
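
To see the effect of the question mark, a small sketch along these lines can be run. The `candidates` function and its sample strings are made up for the test, with "beg" and "rem" used as literal placeholders.

```shell
# Hypothetical sample strings, one per line.
candidates() {
    printf '%s\n' 'begrem' 'beg-rem' 'beg_rem' 'beg rem' 'beg - rem' 'beg--rem' 'beg*rem'
}

# At most one of "-", "_", " " between beg and rem.
candidates | grep -E '^beg[-_ ]?rem$'

# Same, but also accepting the three-character separator " - ".
candidates | grep -E '^beg( - |[-_ ]?)rem$'
```

The first command prints the first four strings; the second also prints "beg - rem". Both reject "beg--rem" and "beg*rem".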

Thank you, Eugene. So, are you saying that the form

(?:{regex})

is equivalent to the form

({regex})?

Correct?

My pleasure, Eric!

I am not a regex guru... :blush: And I mainly use basic regex syntax.

IMHO, the two forms are not equivalent. AFAIK, (?:value) is Perl-style syntax for a non-capturing group: it groups 'value' so that an operator can apply to it as a whole, but the matched text is not saved for later reuse. It is not part of POSIX extended regex syntax, so I would not rely on it with grep -E. I have not used such groups much myself.

Parentheses in (value)? create a capturing group and allow 'value' to be treated as an 'atom', so the question mark works as a zero-or-one match operator on the whole group.
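
A rough sketch of the difference, assuming a sed that supports -E. The helper `show_separator` is hypothetical, and the `grep -P` part only runs if the local grep was built with PCRE support.

```shell
# Capturing group: the text matched inside ( ) is remembered
# and can be reused in the replacement as \1.
show_separator() {
    printf '%s\n' "$1" | sed -E 's/^beg( - |[-_ ]?)rem$/separator=[\1]/'
}

show_separator 'beg - rem'   # prints: separator=[ - ]
show_separator 'begrem'      # prints: separator=[]

# Non-capturing group (?:...) is Perl syntax. GNU grep accepts it with -P
# when built with PCRE support, so the call is guarded by a feature check.
if printf 'x\n' | grep -P 'x' >/dev/null 2>&1; then
    printf '%s\n' 'beg_rem' | grep -P '^beg(?: - |[-_ ]?)rem$'
fi
```

For plain matching with grep, whether the group captures makes no visible difference; it only matters when the matched text is reused, as in the sed replacement.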

Thank you, Eugene. I wanted to give you credit for the simplified form, but in the end, I realized that the original solution was the one that "detailed" the formulation of the expression to fit my original problem statement. So I hope you don't mind my re-assigning it back to that solution.