Python Strings — The Essentials

The Core Concept: Strings Are Immutable

This single fact explains most string behavior: you cannot change a string in place. Every "modification" creates a new string object.

python

s = "hello"
s[0] = "H"      # TypeError — can't modify in place
s = s.upper()    # Works — creates a NEW string

Why Immutability Matters

Because strings are immutable, they can be used as dictionary keys, set members, and shared safely between variables. The trade-off is that every "change" allocates a new string in memory.
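The points above can be sketched in a few lines. This is only an illustration; the variable names are made up for the example.

```python
# Rebinding a name never alters the original string object.
original = "hello"
alias = original               # both names refer to the same object
original = original.upper()    # builds a new string and rebinds `original`

print(alias)     # hello
print(original)  # HELLO

# Strings are hashable, so they work as dictionary keys and set members.
word_counts = {"hello": 2, "world": 1}
seen = {"hello", "world"}
print(word_counts["hello"], "world" in seen)  # 2 True
```

Because `alias` still points at the untouched original, sharing a string between variables cannot produce surprising side effects.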

Unicode by default: Python 3 strings (str) are sequences of Unicode code points. Binary data lives in the separate bytes type, so len() on a string counts characters, not bytes:

python

len("café")   # 4 characters, not 5 bytes
len("😀")    # 1 character

Creating Strings

Quote Styles

Pick whichever avoids escaping:

python

'single quotes'
"double quotes"
"it's easy"          # No escaping needed
'she said "hi"'      # No escaping needed

Triple Quotes — Multi-line Strings

python

msg = """This spans
multiple lines
automatically"""

# Also used for docstrings
def greet(name):
    """Return a greeting for the given name."""
    return f"Hello, {name}!"

Raw Strings — No Escape Interpretation

python

path = r"C:\new\folder"    # Backslashes are literal
pattern = r"\d+\.\d+"      # Cleaner regex patterns

When to Use Raw Strings

Use r"..." for Windows file paths and regular expressions — the two most common cases where backslashes cause trouble.
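As a rough illustration of the regex case, the sketch below compares a raw pattern with its doubled-backslash equivalent. The sample text is hypothetical.

```python
import re

# Hypothetical sample text, chosen only for illustration.
text = "Version 3.12 replaced 3.11"

# Raw string: the regex engine sees \d and \. exactly as typed.
raw_pattern = r"\d+\.\d+"
# The equivalent non-raw form needs every backslash doubled.
escaped_pattern = "\\d+\\.\\d+"

versions = re.findall(raw_pattern, text)
print(versions)                        # ['3.12', '3.11']
print(raw_pattern == escaped_pattern)  # True
```

One limitation worth knowing: a raw string literal cannot end with an odd number of backslashes, so r"C:\dir\" is a syntax error.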

Escape Sequences

python

"\n"    # Newline
"\t"    # Tab
"\\"    # Literal backslash
"\'"    # Literal single quote
"\""    # Literal double quote
"\0"    # Null character

Surprise — Escape Interpreted

path = "C:\new\folder"
print(path)
# C:
# ew
# older
# (\n became a newline; \f became a form feed, which many terminals show as a line break)

Fix — Raw String or Double Backslash

path = r"C:\new\folder"
# or
path = "C:\\new\\folder"
print(path)
# C:\new\folder

f-strings and Formatting

f-strings (Python 3.6+) are the preferred way to embed expressions in strings:

python

name = "Alice"
age = 30
price = 49.99

# Basic interpolation
f"{name} is {age} years old"

# Format specifiers
f"{price:.2f}"               # '49.99' — 2 decimal places
f"{name:>10}"              # '     Alice' — right-aligned
f"{age:05d}"               # '00030' — zero-padded

# Expressions inside
f"{name.upper():>10}"      # Methods + format spec combined

# Debug shorthand (Python 3.8+)
f"{name=}"                 # "name='Alice'"
f"{2 + 2=}"               # "2 + 2=4"

Other Formatting Styles

You'll see these in existing codebases:

python

# .format() method
"Hello, {}".format(name)
"Hello, {0}. You are {1}.".format(name, age)

# %-style (still common in logging)
"Hello, %s. You are %d." % (name, age)

Which to Use?

Prefer f-strings for new code. Use .format() when the template is defined separately from the values. The % style still appears in logging calls, where the logging module fills in the message lazily from separate arguments.
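A minimal sketch of the "template defined separately" case follows. The names GREETING_TEMPLATE and render_greeting are hypothetical and exist only for this example.

```python
# Template defined once, away from the values (names here are illustrative).
GREETING_TEMPLATE = "Hello, {name}. You are {age}."

def render_greeting(name, age):
    """Fill the shared template with the given values."""
    return GREETING_TEMPLATE.format(name=name, age=age)

print(render_greeting("Alice", 30))  # Hello, Alice. You are 30.

# With an f-string the values must be in scope where the literal is written.
name, age = "Bob", 25
print(f"Hello, {name}. You are {age}.")  # Hello, Bob. You are 25.
```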

Slicing

Strings support the same slicing syntax as lists: s[start:stop:step]

python

s = "Hello, World!"

s[0:5]      # 'Hello'       — start to stop (exclusive)
s[7:]       # 'World!'      — index 7 to end
s[:5]       # 'Hello'       — start to index 5
s[-6:]      # 'World!'      — last 6 characters
s[::2]      # 'Hlo ol!'     — every other character
s[::-1]     # '!dlroW ,olleH' — reversed

Slicing Never Raises IndexError

Out-of-range slices silently return what's available: "hi"[0:100] returns "hi". But direct indexing like "hi"[100] raises IndexError.
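The difference between slicing and indexing can be seen in a short sketch. The helper first_n is hypothetical and only serves the illustration.

```python
def first_n(text, n):
    """Return up to the first n characters (hypothetical helper)."""
    return text[:n]  # slicing clamps to the string's length

print(first_n("hi", 100))  # hi
print(repr("hi"[5:]))      # '' (an empty string, no error)

try:
    "hi"[100]
except IndexError as exc:
    print("IndexError:", exc)  # IndexError: string index out of range
```

This clamping is what makes slices convenient for truncation: no length check is needed before taking the first n characters.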

Searching and Testing

python

# Membership
"py" in "python"                  # True
"PY" in "python"                  # False — case-sensitive

# Starts/ends with
"python".startswith("py")        # True
"python".endswith("on")          # True
"image.png".endswith((".png", ".jpg"))  # True — accepts a tuple

# Finding position
"python".find("th")              # 2 — returns index, or -1 if not found
"python".index("th")             # 2 — same but raises ValueError if not found

# Counting
"banana".count("a")              # 3

Fragile — index() Crashes

pos = "hello".index("xyz")
# ValueError: substring not found

Safe — find() Returns -1

pos = "hello".find("xyz")
if pos != -1:
    print(f"Found at {pos}")

Replacing and Transforming

python

# Replacing
"hello world".replace("world", "there")   # 'hello there'
"aaa".replace("a", "b", 2)               # 'bba' — limit replacements

# Case transformations
"hello world".title()       # 'Hello World'
"hello world".capitalize()  # 'Hello world'
"HELLO".lower()             # 'hello'
"hello".upper()             # 'HELLO'
"Hello".swapcase()          # 'hELLO'
"hello".casefold()          # 'hello' — aggressive lowercase for comparison

Remember: Strings Are Immutable

All these methods return a new string. The original is unchanged. You must reassign: s = s.upper()

Stripping and Padding

python

# Stripping whitespace
"  messy  ".strip()          # 'messy'
"  messy  ".lstrip()         # 'messy  '
"  messy  ".rstrip()         # '  messy'

# Stripping specific characters
"***hi***".strip("*")       # 'hi'
"xxhelloxx".strip("x")     # 'hello'

python

# Padding and alignment
"hi".center(10)       # '    hi    '
"hi".ljust(10)        # 'hi        '
"hi".rjust(10)        # '        hi'
"42".zfill(5)         # '00042'

# center/ljust/rjust accept a fill character
"hi".center(10, "-")  # '----hi----'
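One detail the strip examples above imply but do not spell out: the argument to strip(), lstrip(), and rstrip() is treated as a set of characters, not as a prefix or suffix. The sketch below assumes Python 3.9 or later for removeprefix() and removesuffix(); the filename is made up.

```python
filename = "report.txt"

# rstrip() strips any trailing run of the characters '.', 't', 'x'.
print(filename.rstrip(".txt"))            # repor

# removesuffix() (Python 3.9+) removes the exact substring, at most once.
print(filename.removesuffix(".txt"))      # report
print("test_case".removeprefix("test_"))  # case
```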

Splitting and Joining

python

# Basic split
"a,b,c".split(",")                  # ['a', 'b', 'c']

# Join — called on the separator
",".join(['a', 'b', 'c'])           # 'a,b,c'

# Split lines
"line1\nline2".splitlines()         # ['line1', 'line2']

# Partition — splits on first/last occurrence
"a.b.c".partition(".")              # ('a', '.', 'b.c')
"a.b.c".rpartition(".")             # ('a.b', '.', 'c')

# Limit splits
"a,b,c,d".split(",", 2)             # ['a', 'b', 'c,d']

Gotcha — split(" ")

"a  b".split(" ")
# ['a', '', 'b']
# Literal single-space split
# Empty string from double space

Better — split()

"a  b".split()
# ['a', 'b']
# No argument = split on any
# whitespace, ignore empties
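The splitting tools above combine naturally when parsing simple text. The following is a sketch under the assumption of a "key = value" input format; the data is invented for illustration.

```python
# Hypothetical config-style input, used only for illustration.
raw = "  host = example.org \nport=8080\n\nname =  demo app  "

settings = {}
for line in raw.splitlines():
    if not line.strip():
        continue                      # skip blank lines
    key, sep, value = line.partition("=")
    if sep:                           # '=' was found on this line
        # split() with no argument collapses runs of whitespace
        settings[key.strip()] = " ".join(value.split())

print(settings)
# {'host': 'example.org', 'port': '8080', 'name': 'demo app'}
```

partition() is used instead of split("=") so that a value containing a second "=" would stay intact.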

Type Checking Methods

python

"abc".isalpha()       # True  — letters only
"123".isdigit()       # True  — digits only
"abc123".isalnum()    # True  — letters or digits
"   ".isspace()       # True  — whitespace only
"Hello".istitle()     # True  — title case
"HELLO".isupper()     # True  — all uppercase
"hello".islower()     # True  — all lowercase

Empty String Edge Case

All the is* methods shown above return False for empty strings: "".isdigit() is False. The exception is isascii(), which returns True for an empty string.
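When these checks guard input validation, the edge cases matter. The sketch below uses a hypothetical helper, is_plain_integer, to show one possible way to accept only ASCII digits; it is not the only reasonable approach.

```python
def is_plain_integer(text):
    """Hypothetical validator: True only for non-empty runs of ASCII digits."""
    return text.isascii() and text.isdigit()

print("".isdigit())            # False: empty string
print("²".isdigit())           # True: superscript two counts as a digit
print(is_plain_integer("²"))   # False
print(is_plain_integer("42"))  # True
print(is_plain_integer(""))    # False
```

The extra isascii() check matters because int("²") raises ValueError even though "²".isdigit() is True.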

Type Conversion

python

# To string
str(42)          # '42'
str(3.14)        # '3.14'
str(True)        # 'True'
str([1,2,3])    # '[1, 2, 3]'

# From string
int("42")        # 42
float("3.14")    # 3.14

# String to list of characters
list("abc")      # ['a', 'b', 'c']

# Ordinal conversions
ord("A")         # 65 — character to Unicode code point
chr(65)          # 'A' — code point to character

python

# String → bytes
"café".encode("utf-8")     # b'caf\xc3\xa9'

# Bytes → string
b'caf\xc3\xa9'.decode("utf-8")  # 'café'
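To make the character/byte distinction concrete, here is a short round-trip sketch. It assumes UTF-8, as in the example above.

```python
text = "café"
data = text.encode("utf-8")

print(len(text))  # 4 characters
print(len(data))  # 5 bytes: 'é' takes two bytes in UTF-8
print(data.decode("utf-8") == text)  # True

# Decoding with a codec that cannot represent the bytes raises an error.
try:
    data.decode("ascii")
except UnicodeDecodeError as exc:
    print("cannot decode as ASCII:", exc.reason)
```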

Common Pitfalls

Concatenation in Loops is O(n²)

Slow — O(n²)

result = ""
for x in items:
    result += x
# Creates a new string each iteration

Fast — O(n)

result = "".join(items)
# Single allocation
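Two practical notes on join(), sketched with made-up data: it accepts only strings, and when a loop is unavoidable the usual pattern is to collect pieces in a list and join once at the end.

```python
items = [1, 2, 3]

# join() only accepts strings, so convert each element first.
csv_line = ",".join(str(x) for x in items)
print(csv_line)  # 1,2,3

# Collect pieces in a list inside the loop, then join once.
parts = []
for x in items:
    parts.append(f"item-{x}")
report = "\n".join(parts)
print(report)
```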

`is` vs `==` for Strings

python

a = "hello"
b = "hello"
a is b   # Often True in CPython (interned), but DON'T rely on this
a == b   # True — always use == for string comparison

String Interning is an Implementation Detail

CPython interns some strings for performance, but this behavior is not guaranteed. is checks identity (same object), == checks equality (same value). Always use == for comparing string contents.
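A small sketch of the distinction follows. The result of `is` on strings is deliberately not printed, since it can vary between interpreters and versions.

```python
a = "hello world"
b = "".join(["hello", " ", "world"])  # same text, built at runtime

print(a == b)  # True: same characters
# `a is b` may be True or False depending on the interpreter,
# so its result is not relied on here.

# `is` remains the right tool for singletons such as None.
value = None
print(value is None)  # True
```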

Forgetting to Reassign

Bug — Result Discarded

s = "hello"
s.upper()
print(s)  # 'hello' — unchanged!

Correct — Reassign

s = "hello"
s = s.upper()
print(s)  # 'HELLO'

Case-Sensitive Operations

Bug — Case Mismatch

user_input = "Yes"
if user_input == "yes":
    print("confirmed")
# Never prints!

Fix — Normalize Case

user_input = "Yes"
if user_input.lower() == "yes":
    print("confirmed")
# Works!
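For text that may contain non-English characters, casefold() is the more thorough normalizer. The helper same_text below is hypothetical and only sketches the idea.

```python
def same_text(left, right):
    """Hypothetical helper: caseless comparison using casefold()."""
    return left.casefold() == right.casefold()

print(same_text("Yes", "yes"))         # True
print("Straße".lower() == "strasse")   # False: lower() keeps 'ß'
print(same_text("Straße", "STRASSE"))  # True: casefold() maps 'ß' to 'ss'
```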

Summary: Key Takeaways

Concept	Key Takeaway
Immutability	Every "modification" creates a new string — you must reassign
Quote styles	Use `'`, `"`, or `"""` — pick whichever avoids escaping
Raw strings	`r"..."` for Windows paths and regex patterns
f-strings	Preferred formatting: `f"{expr:spec}"` with full expression support
Slicing	`s[start:stop:step]` — never raises IndexError
`in` operator	Substring check: `"py" in "python"` — case-sensitive
`find()` vs `index()`	`find` returns -1 on failure; `index` raises ValueError
`split()` vs `split(" ")`	No-arg `split()` handles any whitespace and strips empties
`join()`	Called on the separator: `",".join(list)`
Concatenation	Use `"".join()` in loops, not `+=`
Comparison	Always use `==`, never `is` for string values
Case sensitivity	Normalize with `.lower()` or `.casefold()` before comparing
