Thursday, 1 August 2019

Python - String Pattern Matching


This article shows how to match patterns against the strings in a pandas Series.

The Series.str accessor exposes vectorized string functions that are applied to every element of the Series.

#Import all necessary libraries
import pandas as pd
import numpy as np
import re

#Create a Series
s_pat = pd.Series(['Parrot','pigeon','Eagle','sparrow',np.nan])
print(s_pat)


Output: a Series of five values: Parrot, pigeon, Eagle, sparrow and NaN.

startswith(): Checks whether the start of each string matches the pattern
Returns: A Series of Boolean values

#Match the pattern that starts with
print("Use startswith")
print(s_pat.str.startswith("P"))

Output: True for 'Parrot', False for 'pigeon', 'Eagle' and 'sparrow', and NaN for the missing value.

To have the missing value (NaN) reported as False, pass na=False.

print(s_pat.str.startswith("P",na=False))

Output: True for 'Parrot' and False for every other element, including the missing value.

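Because na=False yields a purely Boolean result, it can be used directly as a mask to select the matching elements. Here is a minimal sketch that reuses the s_pat Series from this article; the names mask and starts_with_p are only illustrative.

```python
import numpy as np
import pandas as pd

# Same Series as in the article
s_pat = pd.Series(['Parrot', 'pigeon', 'Eagle', 'sparrow', np.nan])

# na=False turns the missing value into False, so the mask holds only True/False
mask = s_pat.str.startswith("P", na=False)

# Boolean indexing keeps only the elements where the mask is True
starts_with_p = s_pat[mask]
print(starts_with_p)
```

Without na=False the mask may contain NaN, and pandas can refuse to index with such a mask.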
Similarly,
endswith(): Checks whether the end of each string matches the pattern
Returns: A Series of Boolean values

print(s_pat.str.endswith("on"))


Output: True for 'pigeon', False for 'Parrot', 'Eagle' and 'sparrow', and NaN for the missing value.

contains(): Checks whether the pattern or regular expression is contained in each string
Returns: A Boolean Series indicating whether the pattern or regular expression was found in each string of the Series

#Check if the pattern or Regular expression is contained in the string of Series
print("Use contains")
print(s_pat.str.contains("ro"))


Output: True for 'Parrot' and 'sparrow', False for 'pigeon' and 'Eagle', and NaN for the missing value.

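Since contains() treats the pattern as a regular expression by default, regex syntax can be used in the pattern, and regex=False switches to a plain substring search. The sketch below assumes the s_pat Series from this article. The s_dots Series is hypothetical data, added only to show the difference.

```python
import numpy as np
import pandas as pd

# Same Series as in the article
s_pat = pd.Series(['Parrot', 'pigeon', 'Eagle', 'sparrow', np.nan])

# "^[Pp]" is a regular expression: strings that start with "P" or "p"
starts_p = s_pat.str.contains("^[Pp]", na=False)
print(starts_p)

# Hypothetical data to contrast regex and literal matching
s_dots = pd.Series(['a.b', 'ab', np.nan])

# As a regex, "." matches any character, so both strings match
as_regex = s_dots.str.contains(".", na=False)

# With regex=False, "." is a literal dot, so only 'a.b' matches
as_literal = s_dots.str.contains(".", regex=False, na=False)
print(as_regex)
print(as_literal)
```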
To make the match case-insensitive, use the parameter case=False.

print(s_pat.str.contains("p",case=False))

Output: True for 'Parrot', 'pigeon' and 'sparrow', False for 'Eagle', and NaN for the missing value.

findall(): Finds all occurrences of the pattern or regular expression in each string of the Series
Returns: A Series in which each element is a list of the matched strings

#Find all occurrences of the pattern in each string
print("Use findall")
print(s_pat.str.findall("Parrot"))

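Output: ['Parrot'] for 'Parrot', an empty list for 'pigeon', 'Eagle' and 'sparrow', and NaN for the missing value.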

When the pattern matches more than once in a string, a list containing every match is returned.


print(s_pat.str.findall("r"))


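Output: ['r', 'r'] for 'Parrot' and 'sparrow', an empty list for 'pigeon' and 'Eagle', and NaN for the missing value.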

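When only the number of matches is needed rather than the matches themselves, Series.str.count() can be used in place of findall(). A small sketch, again assuming the s_pat Series from this article; the names r_counts and r_counts_filled are illustrative.

```python
import numpy as np
import pandas as pd

# Same Series as in the article
s_pat = pd.Series(['Parrot', 'pigeon', 'Eagle', 'sparrow', np.nan])

# Count how many times "r" occurs in each string
r_counts = s_pat.str.count("r")
print(r_counts)

# The missing value has no count; fill it with 0 to get plain integers
r_counts_filled = r_counts.fillna(0).astype(int)
print(r_counts_filled)
```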
To ignore case, add the flags parameter. This requires importing re, Python's regular expression module.

print(s_pat.str.findall("PARROT",flags=re.IGNORECASE))

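Output: ['Parrot'] for 'Parrot', an empty list for 'pigeon', 'Eagle' and 'sparrow', and NaN for the missing value.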


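The flags parameter is also accepted by contains(), where flags=re.IGNORECASE has the same effect as case=False. A short sketch, assuming the s_pat Series from this article; the variable names are illustrative.

```python
import re

import numpy as np
import pandas as pd

# Same Series as in the article
s_pat = pd.Series(['Parrot', 'pigeon', 'Eagle', 'sparrow', np.nan])

# Case-insensitive search through the regex flag
with_flags = s_pat.str.contains("PARROT", flags=re.IGNORECASE, na=False)

# The same search using the case parameter
with_case = s_pat.str.contains("PARROT", case=False, na=False)

# Both approaches give the same Boolean Series
print(with_flags.equals(with_case))

# findall with the flag returns the text as it appears in the data
p_matches = s_pat.str.findall("p", flags=re.IGNORECASE)
print(p_matches)
```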