In this article, you will learn in depth about Python strings and the different kinds of functions and operations available for them.
A string is a sequence of characters that is treated as a single data item. In Python, a string is any group of characters enclosed in double quotes or single quotes.
For example, "This is a string" or 'This is a string'.
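Conceptually, the characters of a string are stored as an ordered sequence, which is why each character can be reached by its position. In Python 3, strings hold Unicode characters, so a character does not necessarily occupy 1 byte; the memory used per character depends on the implementation and on the characters in the string.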
Python strings are also immutable, meaning they cannot be modified once created.
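The following is a minimal sketch of what immutability means in practice. The variable names modified and new_str are chosen here only for illustration: assigning to a position in the string fails, while building a new string from the old one works.

```python
py_str = "Python"

# Strings are immutable, so assigning to a position raises a TypeError.
try:
    py_str[0] = "J"
    modified = True
except TypeError:
    modified = False

# To "change" a string, create a new string and keep the old one as it is.
new_str = py_str.replace("P", "J")

print(modified)  # False
print(py_str)    # Python
print(new_str)   # Jython
```

The original string is left untouched; replace() returns a new string object.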
Creating a string in Python is as simple as assigning a value to a variable. The value being assigned must be enclosed in either double quotes " " or single quotes ' '.
A string literal can also span multiple lines. In such cases, triple quotes are used.
Here are some examples that demonstrate Python strings.
>>> #using single quotes
>>> py_str = 'YOLO'
>>> print (py_str)
YOLO
>>> #using double quotes
>>> py_str = "Hey there"
>>> print (py_str)
Hey there
>>> #using triple quotes for multiple line strings
>>> py_str = """This is first line
... and this is second line."""
>>> print (py_str)
This is first line
and this is second line.
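Concatenation means joining two or more strings together.
In Python, this is achieved with the + operator. The + operator does not insert a space between the strings, so any space needed must be part of one of the strings.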
>>> #concatenating two strings
>>> py_str = "Hi " + "there"
>>> print (py_str)
Hi there
>>> #concatenating multiple strings
>>> py_str = "Hi " + "there " + "programmers"
>>> print (py_str)
Hi there programmers
>>> #using the concatenation operator + together with the assignment operator =
>>> x = "Python "
>>> y = "Strings"
>>> x += y #this is equivalent to x = x + y
>>> print(x)
Python Strings
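As a small illustrative sketch (the variable names here are made up for the example), the snippet below shows two details of the + operator that are easy to miss: it joins strings exactly as they are, without adding a space, and it only works between strings, so a number has to be converted with str() first.

```python
first = "Hi"
second = "there"

# + joins the strings exactly as they are; no space is inserted.
no_space = first + second
with_space = first + " " + second

# A number must be converted to a string before concatenation.
count = 3
message = "Count: " + str(count)

print(no_space)    # Hithere
print(with_space)  # Hi there
print(message)     # Count: 3
```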
Python has a built-in function, len(string_name), which calculates and returns the length of the string supplied as an argument.
>>> #using len() to find length of a string
>>> py_str = "YOLO YOLO"
>>> print (len(py_str)) #to print the length of string
9
In this example, the length of the string is 9 because the whitespace in between is also counted.
Once we create a Python string, we can’t modify it, but we can access its characters and use them in further operations.
In Python, individual characters or elements are accessed using indexing and a range of characters or string slices are accessed using slicing. Before jumping into examples, let us first know about indexing and slicing.
What is indexing?
Indexing means locating an element of a Python sequence (list, string, tuple, or any other sequence) by its position in that sequence. Indexing starts from zero, meaning the first element of the sequence has the index 0.
Python uses the indexing operator [ ] for this.
There are two ways we can index a string or any sequence in Python:
Index with positive integers: index from the left starting with 0, 0 being the index of the first item in the string or sequence.
Index with negative integers: index from the right starting with -1, -1 being the index of the last item in the string or sequence.
For example, consider the following string with 4 characters.
py_str = "abcd"
Now, if we index with positive integers using the indexing operator [ ]:
py_str[0] = a, py_str[1] = b, py_str[2] = c, py_str[3] = d
And if we index with negative integers, then
py_str[-1] = d, py_str[-2] = c, py_str[-3] = b, py_str[-4] = a
Note: Only integers are allowed in indexing. Using an index outside the valid range raises IndexError: string index out of range. Using a floating point number instead of an integer as an index also raises an error, in this case a TypeError.
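The note above can be tried out with a short sketch. The helper safe_index is hypothetical and written only for this illustration; it returns the character at a position, or the name of the error that indexing raised.

```python
py_str = "abcd"

def safe_index(text, position):
    """Hypothetical helper: return text[position] or the name of the error raised."""
    try:
        return text[position]
    except IndexError:
        # The position is outside the valid range of indexes.
        return "IndexError"
    except TypeError:
        # The position is not an integer (for example, a float).
        return "TypeError"

print(safe_index(py_str, 3))    # d
print(safe_index(py_str, -4))   # a
print(safe_index(py_str, 4))    # IndexError
print(safe_index(py_str, 1.0))  # TypeError
```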
What is Slicing in Python?
Slicing, as the name suggests, means chopping out or retrieving a set of values from a particular sequence.
In slicing, we define a starting point from which to retrieve values, an endpoint at which to stop (the value at the endpoint is not included), and the step size. The step size is 1 by default if it is not given explicitly.
The retrieved set of characters from the string is known as a slice. The general syntax is:
py_sequence[start:end:step_size]
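Here is a brief sketch of the three parts of the slice syntax, with the step size given explicitly. The variable names are illustrative only.

```python
py_str = "Programming"

# Every second character from index 0 up to (not including) index 11.
every_second = py_str[0:11:2]

# Start at index 1, stop before index 8, take every third character.
every_third = py_str[1:8:3]

# A negative step walks through the string from right to left.
reversed_str = py_str[::-1]

print(every_second)  # Pormig
print(every_third)   # rrm
print(reversed_str)  # gnimmargorP
```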
Now that we know about indexing and slicing, let’s learn about using slicing and indexing to access characters from Python strings with examples.
py_str[index_char]
Where index_char is the index of the character to be accessed from the string py_str.
For example, if we have a string named py_str, we can access its 5th character by using py_str[4], as indexing starts from 0.
py_str[m:n]
This returns the characters from index m up to, but not including, index n.
Example:
>>> py_str = "Programming"
>>> #accessing 4th character using positive integer
>>> print (py_str[3]) #remember indexing starts from 0
g
>>> #accessing 3rd character from right using negative integer
>>> print (py_str[-3]) #indexing from right starts with -1
i
>>> #slicing out 1st character to 7th character
>>> print (py_str[0:7]) #if not mentioned step size is 1
Program
>>> #slicing 1st character to 5th last
>>> print (py_str[0:-4])
Program
>>> #to access whole string
>>> print (py_str[:])
Programming
>>> #to slice out whole string starting from 2nd character
>>> print (py_str[1:])
rogramming
We can iterate through a Python string using a for loop.
py_str = "Hey"
count = 1
for alphabets in py_str:
    print (alphabets)
    count = count+1
Output:
This will generate the following output.
H
e
y
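When the position of each character is needed along with the character itself, one common option is the built-in enumerate() function. This is offered as an alternative sketch rather than as part of the example above; the list positions exists only to collect the results.

```python
py_str = "Hey"
positions = []

# enumerate() yields (index, character) pairs, starting from index 0.
for index, character in enumerate(py_str):
    positions.append((index, character))
    print(index, character)
```

This prints 0 H, 1 e and 2 y on separate lines.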
Now let’s write another example using a while loop to check whether a character is present in the string or not.
py_str = "trytoprogram"
l = len(py_str)
count = 0
while count<l:
    if py_str[count] == 'o':
        print ('character found')
        break
    count = count+1
Output:
This will generate the following output.
character found
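The while loop above shows the mechanics of searching character by character. As a shorter alternative (a different technique, not the loop itself), Python's in operator and the find() method perform the same kind of check. The variable names below are illustrative.

```python
py_str = "trytoprogram"

# The in operator returns True or False.
found = "o" in py_str

# find() returns the index of the first match, or -1 if there is none.
position = py_str.find("o")
missing = py_str.find("z")

print(found)     # True
print(position)  # 4
print(missing)   # -1
```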
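An escape sequence is a combination of characters that begins with a backslash '\' and is interpreted differently from the literal characters it is written with. Escape sequences are often used to represent non-printable characters.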
When do we need escape sequences?
Well, we need escape sequences in many situations while coding. Here is one example.
In Python, a double quoted string literal can easily have a single quote in between, and a single quoted string can have a double quote in between.
But what if we have both double quotes and single quotes in a string?
Let us consider the following example.
>>> print ("Gary said-"I don't like cats".")
SyntaxError : invalid syntax
That generated an error: SyntaxError : invalid syntax.
That is because when the interpreter encountered the double quote just before I, it interpreted it as the end of the string, hence generating the syntax error.
We can address such errors with triple quotes, but when triple quotes themselves occur inside the string, the error appears again.
So, escape sequences are used to generate quotes in such cases. Here is the example demonstrating the solution using escape sequences.
>>> #using triple quotes
>>> print ("""Gary said-"I don't like cats".""" )
Gary said-"I don't like cats".
>>> #using escape sequences \" for printing double quotes and \' for single quotes
>>> print ("Gary said-\"I don\'t like cats\".")
Gary said-"I don't like cats".
Here is the list of escape sequences in Python.
Escape sequence | Meaning |
---|---|
\newline | Backslash and newline ignored (line continuation) |
\\ | Backslash ( \ ) |
\' | Single quote ( ' ) |
\" | Double quote ( " ) |
\a | ASCII bell or alert |
\b | ASCII backspace |
\f | ASCII form feed |
\n | ASCII line feed |
\r | ASCII carriage return |
\t | ASCII horizontal tab |
\v | ASCII vertical tab |
\ooo | Character with octal value ooo |
\xhh | Character with hexadecimal value hh |
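A few of the escape sequences from the table can be checked with the short sketch below. The strings and variable names are made up for illustration.

```python
two_lines = "first\nsecond"   # \n is a single line feed character
tabbed = "name\tvalue"        # \t is a single horizontal tab character
backslash = "C:\\temp"        # \\ produces one backslash
hex_char = "\x41"             # hexadecimal 41 is the letter A
octal_char = "\101"           # octal 101 is also the letter A

print(two_lines)
print(tabbed)
print(backslash)              # C:\temp
print(hex_char, octal_char)   # A A
print(len(two_lines))         # 12, the escape sequence counts as one character
```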
Python has many built-in functions and string methods for the manipulation of strings. Here is a tabulated list. all(), any(), ascii(), len(), max() and min() are built-in functions that take the string as an argument; the rest are methods called on a string, for example py_str.upper(). Because strings are immutable, methods that change text return a new string.
Function or method | Description |
---|---|
all(str) | Returns True if all the elements in the iterable are true. |
any(str) | Returns True if any element in the iterable is true. |
ascii(str) | Returns a printable representation of the string str, with non-ASCII characters escaped. |
capitalize() | Capitalizes the first letter of the string and lowercases the rest. |
center(width, fillchar) | Returns the string centered in a field of the given width, padded with fillchar (a space by default). |
count(m) | Counts how many times m occurs in the string. |
decode(encoding='UTF-8', errors='strict') | In Python 3, a method of bytes objects rather than of str. Decodes the bytes using the codec registered for encoding and returns a string. |
encode(encoding='UTF-8', errors='strict') | Returns an encoded version of the string as a bytes object. |
endswith(suffix) | Checks whether the string ends with the specified suffix. |
expandtabs(tabsize) | Replaces tab characters in the string with spaces. |
find(sub) | Returns the lowest index at which the substring sub is found, or -1 if it is not present. |
index(sub) | Same as find(), but raises a ValueError if sub is not found. |
isalnum() | Returns True if the string has at least 1 character and all characters are alphanumeric, and False otherwise. |
isalpha() | Returns True if the string has at least 1 character and all characters are alphabetic. |
isdigit() | Checks whether the string contains only digits. |
islower() | Returns True if all cased characters in the string are lowercase and there is at least one cased character. |
isnumeric() | Returns True if the string contains only numeric characters. |
isspace() | Returns True if the string contains only whitespace characters and is not empty. |
istitle() | Checks whether the string is titlecased. Returns True if it is. |
isupper() | Returns True if all cased characters in the string are uppercase and there is at least one cased character. |
join(seq) | Returns the strings in the sequence seq concatenated, with the string it is called on used as the separator. |
len(str) | Returns the length of the string str. |
ljust(width) | Returns a left-justified string of the given width. |
lower() | Converts all uppercase letters in the string to lowercase. |
lstrip() | Removes leading whitespace (or the given leading characters) from the string. |
maketrans() | Returns a translation table that is used by the translate() method. |
max(str) | Returns the character with the highest code point in the string str. |
min(str) | Returns the character with the lowest code point in the string str. |
replace(old, new) | Replaces occurrences of the substring old with the string new. |
rfind(sub, start, end) | Returns the highest index at which the substring sub is found, or -1 if it is not present. |
rindex(sub) | Same as index(), but searches from the right and returns the highest index. |
rjust(width) | Returns a right-justified string of the given width. |
rstrip() | Removes trailing whitespace (or the given trailing characters) from the string. |
split(sep) | Splits the string at the separator sep and returns a list of the parts. |
splitlines() | Splits the string at line boundaries and returns a list of the lines. |
startswith(prefix) | Checks whether the string starts with the specified prefix. |
strip([chars]) | Performs both lstrip() and rstrip() on the given string. |
swapcase() | Converts uppercase characters to lowercase and vice versa. |
title() | Returns the titlecased string. |
translate(table) | Translates the string according to a translation table obtained from maketrans(). |
upper() | Converts lowercase letters in the string to uppercase. |
zfill(width) | Returns the string padded on the left with 0's until its total length is width. |
isdecimal() | Returns True if the string contains only decimal characters. |
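To make a few entries from the table concrete, here is a small sketch that chains some of the most commonly used string methods. The sample text and variable names are invented for the example; every call returns a new value and leaves the original string unchanged.

```python
raw = "  hello python world  "

cleaned = raw.strip()                            # remove surrounding whitespace
title = cleaned.title()                          # capitalize each word
words = cleaned.split(" ")                       # split into a list of words
joined = "-".join(words)                         # join the words with a hyphen
replaced = cleaned.replace("python", "string")   # swap one substring for another
count_o = cleaned.count("o")                     # count occurrences of "o"
padded = "42".zfill(5)                           # pad with zeros on the left
centered = "hi".center(6, "*")                   # center within 6 characters

print(cleaned)    # hello python world
print(title)      # Hello Python World
print(words)      # ['hello', 'python', 'world']
print(joined)     # hello-python-world
print(replaced)   # hello string world
print(count_o)    # 3
print(padded)     # 00042
print(centered)   # **hi**
```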