Strings in DSA: Operations and Patterns
An introduction to strings for data structures and algorithms — immutability, indexing, slicing, common operations, ASCII versus Unicode, and the two-pointer and frequency-counter patterns you will use everywhere.
What you'll learn
- ✓How strings are represented in Python and JavaScript
- ✓Why string immutability matters for time complexity
- ✓Indexing, slicing, and the cost of each operation
- ✓Essential operations: split, join, reverse, replace
- ✓ASCII versus Unicode and how to translate characters to integers
- ✓The two-pointer and frequency-counter patterns on strings
Prerequisites
- •Basic Python syntax — see Python Strings
- •Familiarity with arrays and loops
Strings show up in nearly every coding interview and almost every real program. They look simple, but small details — immutability, encoding, hidden allocations — decide whether your solution is fast or quadratic. This post is the foundation: the model you need to reason about strings, the operations that show up over and over, and the two patterns that solve a surprising number of problems.
What a string is
A string is a sequence of characters. In memory, it is usually stored as a contiguous block of code units (bytes for ASCII, multi-byte for Unicode). Conceptually you can think of it as an array of characters, but with one critical twist in most languages: it is immutable.
s = "hello"
print(s[0]) # h
print(len(s)) # 5
const s = "hello";
console.log(s[0]); // h
console.log(s.length); // 5
Immutability across languages
In Python, JavaScript, Java, and Go, strings cannot be modified in place. Every operation that “changes” a string actually allocates a new one.
s = "hello"
# s[0] = "H" # TypeError: 'str' object does not support item assignment
s = "H" + s[1:] # new string
print(s) # Hello
let s = "hello";
// s[0] = "H"; // silently fails in non-strict mode
s = "H" + s.slice(1); // new string
console.log(s); // Hello
The practical consequence: building a string with += in a loop is O(n²) because each step copies the whole string so far. Use a list/array and join at the end.
# Slow: O(n^2)
result = ""
for ch in chars:
result += ch
# Fast: O(n)
parts = []
for ch in chars:
parts.append(ch)
result = "".join(parts)
In C and C++ you can mutate char buffers directly, but you must manage allocation yourself. In interviews assume strings are immutable unless you are explicitly given a mutable buffer (e.g. char[] in a Java problem).
Indexing and slicing
Indexing a single character is O(1) in both Python and JavaScript because each is stored compactly.
s = "algorithm"
print(s[0]) # a
print(s[-1]) # m (Python supports negative indices)
Slicing in Python builds a brand-new string. The cost is proportional to the slice length:
s = "algorithm"
print(s[0:4]) # algo
print(s[::-1]) # mhtirogla (a common trick: reverse)
JavaScript uses slice, substring, or substr (the last is deprecated):
const s = "algorithm";
console.log(s.slice(0, 4)); // algo
console.log(s.split("").reverse().join("")); // mhtirogla
Reversing in Python is s[::-1]; in JavaScript the idiomatic version splits to an array, reverses, and joins. Both are O(n).
Common operations
A handful of operations cover most string work:
s = "the quick brown fox"
s.split() # ['the', 'quick', 'brown', 'fox']
s.split(" ", 1) # ['the', 'quick brown fox']
"-".join(["a","b"]) # 'a-b'
s.replace("o", "0") # 'the quick br0wn f0x'
s.find("quick") # 4 (-1 if not found)
s.startswith("the") # True
s.lower() # 'the quick brown fox'
"abc".count("a") # 1
const s = "the quick brown fox";
s.split(" "); // ['the','quick','brown','fox']
["a","b"].join("-"); // 'a-b'
s.replaceAll("o", "0"); // 'the quick br0wn f0x'
s.indexOf("quick"); // 4
s.startsWith("the"); // true
s.toLowerCase();
split and join together are how you bounce between strings and arrays — the most useful conversion in string DSA.
Try it yourself. Given s = " Hello, World! ":
- Strip whitespace.
- Lowercase it.
- Replace the comma with a space and split on whitespace.
The result should be ['hello', 'world!'].
ASCII and Unicode
Every character has a numeric code point. ASCII covers the first 128: 'A' is 65, 'a' is 97, '0' is 48. Unicode extends this to over a million code points covering every script.
print(ord('A')) # 65
print(ord('a')) # 97
print(chr(97)) # a
print(ord('a') - ord('a')) # 0 — useful for "letter index"
console.log("A".charCodeAt(0)); // 65
console.log(String.fromCharCode(97)); // a
The pattern ord(ch) - ord('a') maps 'a'..'z' to 0..25, perfect for fixed-size frequency arrays:
def letter_counts(s):
counts = [0] * 26
for ch in s.lower():
if 'a' <= ch <= 'z':
counts[ord(ch) - ord('a')] += 1
return counts
For Unicode-heavy problems, prefer a hash map — code points are sparse, and a fixed-size array would waste memory.
Pattern 1: Two-pointer scans
Many string problems read once from both ends inward. This is the two-pointer pattern. It runs in O(n) time and O(1) extra space.
Palindrome check
def is_palindrome(s):
i, j = 0, len(s) - 1
while i < j:
if s[i] != s[j]:
return False
i += 1
j -= 1
return True
print(is_palindrome("level")) # True
print(is_palindrome("hello")) # False
Reverse a string in place (mutable buffer)
If the language lets you mutate (e.g. a list of characters in Python), two pointers reverse in O(n) with O(1) extra space:
def reverse_chars(chars):
i, j = 0, len(chars) - 1
while i < j:
chars[i], chars[j] = chars[j], chars[i]
i += 1
j -= 1
return chars
print(reverse_chars(list("hello"))) # ['o','l','l','e','h']
Two pointers also power “valid palindrome ignoring punctuation,” “is one string a rotation of another,” and the classic “merge two sorted strings” problems.
Pattern 2: Frequency counters
A great many “anagram,” “permutation,” and “unique character” problems reduce to counting how often each character appears. Use a fixed array for ASCII and a hash map for general Unicode.
from collections import Counter
def is_anagram(a, b):
return Counter(a) == Counter(b)
print(is_anagram("listen", "silent")) # True
print(is_anagram("rat", "car")) # False
Manual version, useful when you need finer control:
def is_anagram_manual(a, b):
if len(a) != len(b):
return False
counts = [0] * 26
for ch in a:
counts[ord(ch) - ord('a')] += 1
for ch in b:
counts[ord(ch) - ord('a')] -= 1
if counts[ord(ch) - ord('a')] < 0:
return False
return True
This runs in O(n) time and O(1) extra space (the array has fixed size 26).
Iterating with index
When you need both position and character, use enumerate in Python:
for i, ch in enumerate("hello"):
print(i, ch)
# 0 h
# 1 e
# 2 l
# 3 l
# 4 o
In JavaScript:
for (const [i, ch] of [..."hello"].entries()) {
console.log(i, ch);
}
The spread [..."hello"] handles surrogate pairs correctly for most emoji; iterating with a plain for loop on indices does not.
Building strings efficiently
A reminder that bears repeating, because it is the single most common performance bug:
# BAD: O(n^2)
out = ""
for word in words:
out += word
# GOOD: O(n)
out = "".join(words)
In Java use StringBuilder. In JavaScript build an array and join. In Go use strings.Builder. The reason is the same in every language: immutable strings cannot grow in place.
Try it yourself. Write a function compress(s) that turns "aaabbc" into "a3b2c1". Use a list and "".join at the end. What is the time complexity? What is the space complexity?
Complexity cheat sheet
| Operation | Python str | JavaScript string |
|---|---|---|
Index s[i] | O(1) | O(1) |
| Length | O(1) | O(1) |
Concatenate a + b | O( | a |
| Slice of length k | O(k) | O(k) |
in substring check | O(n·m) worst | O(n·m) worst |
split / join | O(n) | O(n) |
| Reverse | O(n) | O(n) |
The in (substring) check uses an optimized algorithm in CPython but is still worst-case O(n·m). For repeated searches in the same text, the next post on KMP and Rabin-Karp gives you better tools.
Recap
You now know:
- Strings are immutable in Python and JavaScript, so concatenation in loops is quadratic
- Indexing is O(1); slicing and most methods are O(n)
ordandchr(orcharCodeAt/fromCharCode) translate between characters and integers- The two-pointer pattern solves palindrome and reverse-in-place problems in O(n) / O(1)
- The frequency counter pattern solves anagram and unique-character problems in O(n)
- Build strings with a list +
join, not with+=
Next steps
The next post moves from operations to algorithms: string pattern matching. We’ll cover the naive scan, the KMP algorithm (and its famous LPS array), and the Rabin-Karp rolling hash.
→ Next: String Pattern Matching: Naive, KMP, and Rabin-Karp
Questions or feedback? Email codeloomdevv@gmail.com.