15 Must-Know Python String Methods

It is not always about numbers.

By Soner Yildirim, Data Science Enthusiast on September 21, 2021 in Data Processing, NLP, Python, Text Analytics

Python is a great language. It is relatively easy to learn and has an intuitive syntax. The rich selection of libraries also contributes to the popularity and success of Python.

However, it is not just about third-party libraries. Base Python also provides numerous methods and functions that speed up and simplify typical data science tasks.

In this article, we will go over 15 built-in string methods in Python. You might already be familiar with some of them, but we will also see some of the rarer ones.

The methods are quite self-explanatory, so I will focus on examples that demonstrate how to use them rather than on explaining what they do.

1. Capitalize

It makes the first letter uppercase and the rest of the string lowercase.

txt = "python is awesome!"

txt.capitalize()
'Python is awesome!'
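
The lowercasing part is easy to miss, so here is a small sketch of it. The sample strings are made up for illustration.

```python
# capitalize() uppercases the first character and lowercases everything else
mixed = "python IS Awesome!"
capitalized = mixed.capitalize()
print(capitalized)  # Python is awesome!

# When the first character is not a letter, only the lowercasing is visible
numbered = "3 Python Tips"
numbered_capitalized = numbered.capitalize()
print(numbered_capitalized)  # 3 python tips
```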

2. Upper

It makes all the letters uppercase.

txt = "Python is awesome!"

txt.upper()
'PYTHON IS AWESOME!'

3. Lower

It makes all the letters lowercase.

txt = "PYTHON IS AWESOME!"

txt.lower()
'python is awesome!'
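
A common use of lower (or upper) is to normalize case before comparing strings. The sketch below assumes a made-up list of survey answers.

```python
# Hypothetical answers typed with inconsistent casing
answers = ["Yes", "YES", "yes", "No"]

# Normalize every answer to lowercase before comparing
normalized = [answer.lower() for answer in answers]
yes_count = normalized.count("yes")

print(normalized)  # ['yes', 'yes', 'yes', 'no']
print(yes_count)   # 3
```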

4. Isupper

It checks if all the letters are uppercase.

txt = "PYTHON IS AWESOME!"

txt.isupper()
True

5. Islower

It checks if all the letters are lowercase.

txt = "PYTHON IS AWESOME!"

txt.islower()
False
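
Both checks only look at letters, which the following sketch illustrates with a few made-up strings. Digits, spaces, and punctuation are ignored, and a string without any letters fails both checks.

```python
# Digits, spaces, and punctuation do not affect the result
shouting = "PYTHON 3.9 IS AWESOME!"
print(shouting.isupper())  # True

# One lowercase letter is enough for isupper() to return False
mixed = "Python IS AWESOME!"
print(mixed.isupper())  # False
print(mixed.islower())  # False

# A string with no letters at all is neither upper nor lower
digits = "2021"
print(digits.isupper(), digits.islower())  # False False
```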

The following 3 methods are similar, so the examples below cover all of them together.

6. Isnumeric

It checks if all the characters are numeric.

7. Isalpha

It checks if all the characters are letters.

8. Isalnum

It checks if all the characters are alphanumeric (i.e., a letter or a number).

# Example 1
txt = "Python"

print(txt.isnumeric())
False

print(txt.isalpha())
True

print(txt.isalnum())
True

# Example 2
txt = "2021"

print(txt.isnumeric())
True

print(txt.isalpha())
False

print(txt.isalnum())
True

# Example 3
txt = "Python2021"

print(txt.isnumeric())
False

print(txt.isalpha())
False

print(txt.isalnum())
True

# Example 4
txt = "Python-2021"

print(txt.isnumeric())
False

print(txt.isalpha())
False

print(txt.isalnum())
False
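
These methods test every character, so signs, decimal points, and spaces make them return False. The sketch below shows this, along with a hypothetical helper, looks_like_number (my own name, not a built-in), for the case where we actually want to know whether a string parses as a number. Note that float also accepts values such as "inf" and "nan".

```python
# Signs, decimal points, and spaces are not numeric characters
results = {s: s.isnumeric() for s in ["2021", "-5", "3.14", "20 21"]}
print(results)  # {'2021': True, '-5': False, '3.14': False, '20 21': False}

# A space is not a letter either
print("Data science".isalpha())  # False


def looks_like_number(text):
    """Hypothetical helper: True if float() can parse the text."""
    try:
        float(text)
        return True
    except ValueError:
        return False


print(looks_like_number("-5"))      # True
print(looks_like_number("3.14"))    # True
print(looks_like_number("Python"))  # False
```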

9. Count

It counts the number of occurrences of the given character or substring in a string.

txt = "Data science"

txt.count("e")
2
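
The count method also works with longer substrings and accepts optional start and end positions. Here is a short sketch with a made-up sentence. Keep in mind that the matching is case sensitive and that overlapping matches are not counted.

```python
sentence = "Data science and data analysis"

# Matching is case sensitive, so "Data" is not counted here
exact = sentence.count("data")
print(exact)  # 1

# Lowercase the string first for a case-insensitive count
any_case = sentence.lower().count("data")
print(any_case)  # 2

# Optional start and end positions restrict the search to sentence[0:4]
in_first_word = sentence.count("a", 0, 4)
print(in_first_word)  # 2

# Overlapping matches are not counted
print("aaaa".count("aa"))  # 2
```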

10. Find

It returns the index of the first occurrence of the given character in a string, or -1 if the character is not found.

txt = "Data science"

txt.find("a")
1

We can also find later occurrences of a character by passing the index at which the search should start.

txt.find("a", 2)
3

If we pass a sequence of characters, the find method returns the index where the sequence starts.

txt.find("sci")
5
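
Two more details are worth a quick sketch: find returns -1 when there is no match, and the start argument can be used in a loop to collect every occurrence. The loop is just one possible way to do it.

```python
txt = "Data science"

# find() returns -1 instead of raising an error when there is no match
missing = txt.find("z")
print(missing)  # -1

# Collect the index of every "a" by restarting the search after each match
positions = []
start = txt.find("a")
while start != -1:
    positions.append(start)
    start = txt.find("a", start + 1)
print(positions)  # [1, 3]

# rfind() searches from the right and returns the last occurrence
last_e = txt.rfind("e")
print(last_e)  # 11
```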

11. Startswith

It checks if a string starts with the given character or sequence of characters. We can use this method as a filter in a list comprehension.

mylist = ["John", "Jane", "Emily", "Jack", "Ashley"]

j_list = [name for name in mylist if name.startswith("J")]

j_list
['John', 'Jane', 'Jack']
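
The startswith method also accepts a tuple of prefixes, in which case it returns True if the string starts with any of them. A small sketch using the same list of names:

```python
mylist = ["John", "Jane", "Emily", "Jack", "Ashley"]

# Passing a tuple checks for any of the given prefixes
je_list = [name for name in mylist if name.startswith(("J", "E"))]
print(je_list)  # ['John', 'Jane', 'Emily', 'Jack']
```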

12. Endswith

It checks if a string ends with the given character or sequence of characters.

txt = "Python"

txt.endswith("n")
True

Both the endswith and startswith methods are case sensitive.

txt = "Python"

txt.startswith("p")
False

txt.startswith("P")
True
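
If we need a case-insensitive check, one option is to lowercase the string first. The sketch below filters a made-up list of file names by extension.

```python
# Hypothetical file names with inconsistent casing
filenames = ["report.CSV", "notes.txt", "data.csv", "image.png"]

# Lowercase each name before checking the extension
csv_files = [name for name in filenames if name.lower().endswith(".csv")]
print(csv_files)  # ['report.CSV', 'data.csv']
```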

13. Replace

It replaces every occurrence of a substring with the given string and returns the result as a new string.

txt = "Python is awesome!"

txt = txt.replace("Python", "Data science")

txt
'Data science is awesome!'
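
By default, replace changes every occurrence. An optional third argument limits the number of replacements, as this sketch with a made-up sentence shows. The original string is never modified because strings are immutable.

```python
sentence = "Python is awesome and Python is easy!"

# Without a count, every occurrence is replaced
replaced_all = sentence.replace("Python", "Data science")
print(replaced_all)  # Data science is awesome and Data science is easy!

# The third argument caps the number of replacements
replaced_first = sentence.replace("Python", "Data science", 1)
print(replaced_first)  # Data science is awesome and Python is easy!

# The original string is unchanged
print(sentence)  # Python is awesome and Python is easy!
```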

14. Split

It splits a string at the occurrences of the specified character and returns a list that contains each part after splitting.

txt = 'Data science is awesome!'

txt.split()
['Data', 'science', 'is', 'awesome!']

By default, it splits at whitespace, but we can split at any character or sequence of characters by passing it as an argument.
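
Here is a short sketch of splitting at a specific separator, using a made-up comma-separated row. It also shows that split() and split(" ") behave differently when there are consecutive spaces.

```python
# Split a comma-separated row at each comma
row = "Jane,25,London"
fields = row.split(",")
print(fields)  # ['Jane', '25', 'London']

# With no argument, runs of whitespace count as a single separator
spaced = "Data  science"  # two spaces between the words
print(spaced.split())     # ['Data', 'science']

# With an explicit " ", every single space is a separator
print(spaced.split(" "))  # ['Data', '', 'science']
```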

15. Partition

It partitions a string into 3 parts around the given separator and returns a tuple that contains the part before the separator, the separator itself, and the part after it.

txt = "Python is awesome!"
txt.partition("is")
('Python ', 'is', ' awesome!')

txt = "Python is awesome and it is easy to learn."
txt.partition("and")
('Python is awesome ', 'and', ' it is easy to learn.')

The partition method returns exactly 3 parts. If there are multiple occurrences of the separator, the string is partitioned at the first one.

txt = "Python and data science and machine learning"
txt.partition("and")
('Python ', 'and', ' data science and machine learning')
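
Two related behaviors are sketched below: what partition returns when the separator is missing, and the rpartition method, which partitions at the last occurrence instead of the first.

```python
txt = "Python and data science and machine learning"

# If the separator is missing, the whole string comes first,
# followed by two empty strings
no_match = txt.partition("but")
print(no_match)  # ('Python and data science and machine learning', '', '')

# rpartition() partitions at the last occurrence of the separator
from_right = txt.rpartition("and")
print(from_right)  # ('Python and data science ', 'and', ' machine learning')
```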

We can also do a similar operation with the split method by limiting the number of splits. However, there are some differences.

The split method returns a list, whereas the partition method returns a tuple.
The returned list does not include the characters used for splitting.

txt = "Python and data science and machine learning"
txt.split("and", 1)
['Python ', ' data science and machine learning']

Bonus

Thanks to Matheus Ferreira for reminding me of one of the greatest string methods: join. I also use the join method but forgot to add it here. It deserves a place in the list as a bonus.

The join method combines the strings in a collection into a single string.

mylist = ["Jane", "John", "Matt", "James"]

"-".join(mylist)

'Jane-John-Matt-James'

Let’s do an example with a tuple as well.

mytuple = ("Data science", "Machine learning")

" and ".join(mytuple)
'Data science and Machine learning'
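
The join method only accepts strings and raises a TypeError for other types, so numbers have to be converted first. The sketch below uses a made-up list of years and also shows that join can reverse a split.

```python
# Numbers must be converted to strings before joining
years = [2019, 2020, 2021]
joined_years = ", ".join(str(year) for year in years)
print(joined_years)  # 2019, 2020, 2021

# split() and join() can undo each other
txt = "Data science is awesome!"
words = txt.split()
rebuilt = " ".join(words)
print(rebuilt == txt)  # True
```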

Conclusion

When doing data science, we deal with textual data a lot. Moreover, textual data usually requires much more preprocessing than plain numbers. Thankfully, Python’s built-in string methods handle such tasks efficiently and smoothly.

Thank you for reading. Please let me know if you have any feedback.

Bio: Soner Yıldırım is a Junior Data Scientist at Invent Analytics and blogger.

Original. Reposted with permission.
