Regex: Remove the letters with length 1-3 which are before the dot
If I have an input like this
input="AB. Hello word."
the output should be
output="Hello word."
Another example is
input="AB′. Hello word"
output="Hello word"
I want to write code that generalizes to any group of letters in any language. This is my code:
import re
text="A. Hello word."
text = re.sub(r'A\. \w{1,2}\.*', '', text)
text
output = llo word.
So I can replace 'A' with any other letter, but for some reason it isn't working well.
I also tried this one:
text="Ab. Hello word."
text = re.sub(r'A+\. \w{1,2}\.*', '', text)
text
output = Ab. Hello word.
but it isn't working either.
Don't use a regex for this; just .split() the string. Split once on the first dot and take the last part with [-1]:
>>> "Ab. Hello world.".split(".", 1)[-1].strip()
'Hello world.'
>>> "Hello world".split(".", 1)[-1]
'Hello world'
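One caveat with the plain split: it drops everything before the first dot regardless of length, so "Hello world." would come back empty. A minimal sketch that adds the 1-3 character check from the question, using str.partition (the function name drop_short_prefix and the max_len parameter are made up for illustration):

```python
def drop_short_prefix(text, max_len=3):
    # Split at the first dot only; dot is "" when the text contains no dot.
    head, dot, tail = text.partition(".")
    # Remove the prefix only when it is 1 to max_len characters long.
    if dot and 1 <= len(head) <= max_len:
        return tail.strip()
    return text

print(drop_short_prefix("Ab. Hello world."))
print(drop_short_prefix("Hello world."))
```

This counts characters rather than letters, so a prefix such as "AB′" is treated as three characters.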
You may use this regex for matching:
\b[A-Za-z]{1,3}′?\.
Replace it with "".
RegEx Details:
\b: Word boundary
[A-Za-z]{1,3}: Match 1 to 3 letters
′?: Match an optional ′
\.: Match a dot
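In Python, this pattern could be applied roughly as follows. This is a sketch that assumes the escaped form \b[A-Za-z]{1,3}′?\. is what was intended; the names PREFIX and strip_prefix are hypothetical, and the count=1 and .strip() steps are additions so that only the first match is removed and the leftover space is trimmed:

```python
import re

# 1 to 3 ASCII letters, an optional prime mark, then a literal dot.
PREFIX = re.compile(r"\b[A-Za-z]{1,3}′?\.")

def strip_prefix(text):
    # count=1 removes only the first match; strip() drops the space left behind.
    return PREFIX.sub("", text, count=1).strip()

print(strip_prefix("AB. Hello word."))
print(strip_prefix("AB′. Hello word"))
```

Because the pattern is not anchored to the start of the string, it would also match a short word followed by a dot later in the text, which is why the sketch limits the substitution to one match.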
This solution may be useful:
a = "AB. Hello word."
print(a[a.find(".")+1:])
Try this:
import re
regex = r"^[^.]{1,3}\.\s*"
test_str = ("AB. Hello word.\n"
            "AB′. Hello word.\n"
            "A. Hello word.\n"
            "Ab. Hello word.\n")
subst = ""
# You can limit the number of replacements with the count argument (0 means replace all)
result = re.sub(regex, subst, test_str, count=0, flags=re.MULTILINE)
if result:
    print(result)
Output:
Hello word.
Hello word.
Hello word.
Hello word.
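The question asks for letters in any language, while [^.] accepts any character that is not a dot, including digits and spaces. One possible tightening is the character class [^\W\d_], which for str patterns in Python 3 matches word characters other than digits and underscore, so roughly letters in any script. The sketch below is an assumption about what "any language" should cover, and SHORT_PREFIX and remove_short_prefix are illustrative names:

```python
import re

# [^\W\d_]: a word character that is neither a digit nor an underscore.
# The optional prime mark covers inputs like "AB′." from the question.
SHORT_PREFIX = re.compile(r"^[^\W\d_]{1,3}′?\.\s*")

def remove_short_prefix(text):
    # Anchored with ^, so only a prefix at the very start is removed.
    return SHORT_PREFIX.sub("", text)

for sample in ["AB. Hello word.", "AB′. Hello word", "Аб. Привет мир.", "Hello word."]:
    print(remove_short_prefix(sample))
```

The last sample is left unchanged because no run of 1 to 3 letters at the start is directly followed by a dot.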
Use this generic regex pattern:
"^.{0,}\."
Explanation:
^ matches at the beginning of the string.
.{0,} matches any sequence of zero or more characters (equivalent to .*).
\. matches a literal . (dot) that ends the prefix.
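One thing to watch with this pattern, assuming the intended form is ^.{0,}\. with an escaped dot: the quantifier {0,} is greedy, so it runs to the last dot in the line rather than the first. The short comparison below uses a lazy variant, ^.{0,}?\.\s*, which is a suggested adjustment and not part of the original answer:

```python
import re

text = "AB. Hello word."

# Greedy: .{0,} extends to the LAST dot, so the whole sentence is removed.
greedy = re.sub(r"^.{0,}\.", "", text)

# Lazy: .{0,}? stops at the FIRST dot; \s* also drops the following space.
lazy = re.sub(r"^.{0,}?\.\s*", "", text)

print(repr(greedy))  # ''
print(repr(lazy))    # 'Hello word.'
```

Neither form enforces the 1-3 letter limit from the question, so a long first sentence would also be removed.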