Input a string in grep command in python [duplicate]
I have a list of strings in Python and want to run a recursive grep on each string in the list. I am using the following code:
import subprocess as sp
for python_file in python_files:
    out = sp.getoutput("grep -r python_file . | wc -l")
    print(out)
The output I am getting is the result of grepping for the literal string “python_file”. What mistake am I making, and what should I do to correct it?
Your code has several issues. The immediate answer to what you seem to be asking was given in a comment, but there are more things to fix here.
If you want to pass in a variable instead of a static string, you have to use some sort of string interpolation.
grep already knows how to report how many lines matched; use grep -c (with -r it prints one count per file). Or just ask Python to count the number of output lines. Trimming off the pipe to wc -l also lets you avoid invoking a shell, which is a good thing; see also "Actual meaning of shell=True in subprocess".
grep already knows how to search for multiple expressions. Try passing in the whole list as an input file with grep -f -.
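import subprocess as sp
# Feed every pattern to grep on stdin, one per line ("-f -" reads patterns from stdin).
# Note: check_output raises CalledProcessError if grep finds no match (exit status 1).
out = sp.check_output(
    ["grep", "-r", "-f", "-", "."],
    input="\n".join(python_files), text=True)
print(len(out.splitlines()))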
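If you do want one count per string, as in the loop from the question, the variable can be passed as its own element of the argument list, so no shell quoting or string interpolation is needed. The following is only a minimal sketch, assuming grep is available on PATH; the helper name count_matching_lines is made up for illustration.

```python
import subprocess as sp


def count_matching_lines(pattern, root="."):
    """Return how many lines under root match pattern (hypothetical helper)."""
    # -e marks the next argument as the pattern, even if it starts with "-".
    # No shell is involved, so the pattern needs no quoting.
    result = sp.run(
        ["grep", "-r", "-e", pattern, root],
        capture_output=True, text=True)
    # grep exits with 1 when nothing matches; 2 or more signals a real error.
    if result.returncode > 1:
        raise RuntimeError(result.stderr)
    return len(result.stdout.splitlines())
```

Calling count_matching_lines(python_file) inside the original for loop would then print one count per string.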
If you want to speed up your processing and the patterns are all static strings, try also adding the -F option to grep.
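Besides speed, -F also changes what matches: without it, the dot in a name like foo.py is a regex wildcard. Here is a small sketch comparing the two modes on a throwaway file. It assumes grep is on PATH, and the helper name grep_count and the sample text are invented for the demonstration.

```python
import subprocess as sp
import tempfile
from pathlib import Path


def grep_count(pattern, root, fixed=False):
    """Count matching lines under root; fixed=True adds -F (hypothetical helper)."""
    cmd = ["grep", "-r"]
    if fixed:
        cmd.append("-F")  # treat the pattern as a literal string, not a regex
    cmd += ["-e", pattern, str(root)]
    result = sp.run(cmd, capture_output=True, text=True)
    return len(result.stdout.splitlines())


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "sample.txt").write_text("import foo.py\nimport fooXpy\n")
    print(grep_count("foo.py", tmp))              # regex: "." matches any character
    print(grep_count("foo.py", tmp, fixed=True))  # literal: only the exact text
```

The first call counts both lines of the sample file, while the second counts only the line containing the literal text foo.py.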
Of course, all of this is relatively easy to do natively in Python, too. You should easily be able to find examples with os.walk().
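As a rough illustration of the native approach, the sketch below walks a directory tree with os.walk() and counts the lines that contain any of the given strings. It treats the strings as literals (like grep -F) rather than regular expressions, and the function name count_lines_containing is hypothetical.

```python
import os


def count_lines_containing(strings, root="."):
    """Count lines under root that contain at least one of the strings."""
    count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # errors="replace" keeps binary or oddly encoded files from
                # raising UnicodeDecodeError.
                with open(path, errors="replace") as fh:
                    for line in fh:
                        if any(s in line for s in strings):
                            count += 1
            except OSError:
                continue  # unreadable file (permissions, broken symlink, ...)
    return count
```

With the list from the question, count_lines_containing(python_files) would give a single total for the current directory tree.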
Your intent isn’t totally clear from the way you’ve written your question, but the first argument to grep is the pattern (python_file in your example), and the second is the file(s) (. in your example).
You could write this in native Python or just use grep directly, which is probably easier than using both!
grep args:
--count: report just the number of matching lines
--file: read one or more newline-separated patterns from a file (manpage)
grep --count --file patterns.txt -r .
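import re
from pathlib import Path
# patterns: the list of regex strings to search for (python_files in the question)
for pattern in patterns:
    count = 0
    # rglob("*") recurses like grep -r; skip directories, which cannot be opened
    for path_file in Path(".").rglob("*"):
        if not path_file.is_file():
            continue
        with open(path_file, errors="ignore") as fh:
            for line in fh:
                # re.search matches anywhere in the line, as grep does
                if re.search(pattern, line):
                    count += 1
    print(count)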
NOTE that the behavior in your question would get a separate line count for each pattern, while you may really want a single count.
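The two kinds of count can differ, because a line that matches several patterns contributes to each per-pattern count but only once to a single total. A small sketch of the difference follows, using an in-memory list of lines instead of files; the function name and the sample data are invented for illustration.

```python
import re


def per_pattern_and_total(patterns, lines):
    """Return ({pattern: count}, total) for an iterable of lines."""
    compiled = {p: re.compile(p) for p in patterns}
    per_pattern = {p: 0 for p in patterns}
    total = 0
    for line in lines:
        hits = [p for p, rx in compiled.items() if rx.search(line)]
        for p in hits:
            per_pattern[p] += 1
        if hits:
            total += 1  # a line counts once, however many patterns hit it
    return per_pattern, total


lines = ["import foo", "import bar", "foo and bar", "nothing here"]
counts, total = per_pattern_and_total(["foo", "bar"], lines)
print(counts)  # {'foo': 2, 'bar': 2}
print(total)   # 3
```

Here the per-pattern counts add up to 4, but only 3 distinct lines match, which is what a single grep with several patterns would report.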