Strings in DSA: Operations and Patterns

Beginner 11 min read

What you'll learn

✓How strings are represented in Python and JavaScript
✓Why string immutability matters for time complexity
✓Indexing, slicing, and the cost of each operation
✓Essential operations: split, join, reverse, replace
✓ASCII versus Unicode and how to translate characters to integers
✓The two-pointer and frequency-counter patterns on strings

Prerequisites

•Basic Python syntax — see Python Strings
•Familiarity with arrays and loops

Strings show up in nearly every coding interview and almost every real program. They look simple, but small details — immutability, encoding, hidden allocations — decide whether your solution is fast or quadratic. This post is the foundation: the model you need to reason about strings, the operations that show up over and over, and the two patterns that solve a surprising number of problems.

What a string is

A string is a sequence of characters. In memory, it is usually stored as a contiguous block of code units (bytes for ASCII, multi-byte for Unicode). Conceptually you can think of it as an array of characters, but with one critical twist in most languages: it is immutable.

s = "hello"
print(s[0])      # h
print(len(s))    # 5

const s = "hello";
console.log(s[0]);      // h
console.log(s.length);  // 5

Immutability across languages

In Python, JavaScript, Java, and Go, strings cannot be modified in place. Every operation that “changes” a string actually allocates a new one.

s = "hello"
# s[0] = "H"          # TypeError: 'str' object does not support item assignment
s = "H" + s[1:]       # new string
print(s)              # Hello

let s = "hello";
// s[0] = "H";         // silently fails in non-strict mode
s = "H" + s.slice(1);  // new string
console.log(s);        // Hello

The practical consequence: building a string with += in a loop is O(n²) because each step copies the whole string so far. Use a list/array and join at the end.

# Slow: O(n^2)
result = ""
for ch in chars:
    result += ch

# Fast: O(n)
parts = []
for ch in chars:
    parts.append(ch)
result = "".join(parts)

In C and C++ you can mutate char buffers directly, but you must manage allocation yourself. In interviews assume strings are immutable unless you are explicitly given a mutable buffer (e.g. char[] in a Java problem).

Indexing and slicing

Indexing a single character is O(1) in both Python and JavaScript because each is stored compactly.

s = "algorithm"
print(s[0])    # a
print(s[-1])   # m   (Python supports negative indices)

Slicing in Python builds a brand-new string. The cost is proportional to the slice length:

s = "algorithm"
print(s[0:4])   # algo
print(s[::-1])  # mhtirogla   (a common trick: reverse)

JavaScript uses slice, substring, or substr (the last is deprecated):

const s = "algorithm";
console.log(s.slice(0, 4));   // algo
console.log(s.split("").reverse().join(""));  // mhtirogla

Reversing in Python is s[::-1]; in JavaScript the idiomatic version splits to an array, reverses, and joins. Both are O(n).

Common operations

A handful of operations cover most string work:

s = "the quick brown fox"

s.split()           # ['the', 'quick', 'brown', 'fox']
s.split(" ", 1)     # ['the', 'quick brown fox']
"-".join(["a","b"]) # 'a-b'
s.replace("o", "0") # 'the quick br0wn f0x'
s.find("quick")     # 4   (-1 if not found)
s.startswith("the") # True
s.lower()           # 'the quick brown fox'
"abc".count("a")    # 1

const s = "the quick brown fox";
s.split(" ");              // ['the','quick','brown','fox']
["a","b"].join("-");       // 'a-b'
s.replaceAll("o", "0");    // 'the quick br0wn f0x'
s.indexOf("quick");        // 4
s.startsWith("the");       // true
s.toLowerCase();

split and join together are how you bounce between strings and arrays — the most useful conversion in string DSA.

Try it yourself. Given s = " Hello, World! ":

Strip whitespace.
Lowercase it.
Replace the comma with a space and split on whitespace.

The result should be ['hello', 'world!'].

ASCII and Unicode

Every character has a numeric code point. ASCII covers the first 128: 'A' is 65, 'a' is 97, '0' is 48. Unicode extends this to over a million code points covering every script.

print(ord('A'))       # 65
print(ord('a'))       # 97
print(chr(97))        # a
print(ord('a') - ord('a'))   # 0  — useful for "letter index"

console.log("A".charCodeAt(0));   // 65
console.log(String.fromCharCode(97));  // a

The pattern ord(ch) - ord('a') maps 'a'..'z' to 0..25, perfect for fixed-size frequency arrays:

def letter_counts(s):
    counts = [0] * 26
    for ch in s.lower():
        if 'a' <= ch <= 'z':
            counts[ord(ch) - ord('a')] += 1
    return counts

For Unicode-heavy problems, prefer a hash map — code points are sparse, and a fixed-size array would waste memory.

Pattern 1: Two-pointer scans

Many string problems read once from both ends inward. This is the two-pointer pattern. It runs in O(n) time and O(1) extra space.

Palindrome check

def is_palindrome(s):
    i, j = 0, len(s) - 1
    while i < j:
        if s[i] != s[j]:
            return False
        i += 1
        j -= 1
    return True

print(is_palindrome("level"))   # True
print(is_palindrome("hello"))   # False

Reverse a string in place (mutable buffer)

If the language lets you mutate (e.g. a list of characters in Python), two pointers reverse in O(n) with O(1) extra space:

def reverse_chars(chars):
    i, j = 0, len(chars) - 1
    while i < j:
        chars[i], chars[j] = chars[j], chars[i]
        i += 1
        j -= 1
    return chars

print(reverse_chars(list("hello")))   # ['o','l','l','e','h']

Two pointers also power “valid palindrome ignoring punctuation,” “is one string a rotation of another,” and the classic “merge two sorted strings” problems.

Pattern 2: Frequency counters

A great many “anagram,” “permutation,” and “unique character” problems reduce to counting how often each character appears. Use a fixed array for ASCII and a hash map for general Unicode.

from collections import Counter

def is_anagram(a, b):
    return Counter(a) == Counter(b)

print(is_anagram("listen", "silent"))  # True
print(is_anagram("rat", "car"))        # False

Manual version, useful when you need finer control:

def is_anagram_manual(a, b):
    if len(a) != len(b):
        return False
    counts = [0] * 26
    for ch in a:
        counts[ord(ch) - ord('a')] += 1
    for ch in b:
        counts[ord(ch) - ord('a')] -= 1
        if counts[ord(ch) - ord('a')] < 0:
            return False
    return True

This runs in O(n) time and O(1) extra space (the array has fixed size 26).

Iterating with index

When you need both position and character, use enumerate in Python:

for i, ch in enumerate("hello"):
    print(i, ch)
# 0 h
# 1 e
# 2 l
# 3 l
# 4 o

In JavaScript:

for (const [i, ch] of [..."hello"].entries()) {
    console.log(i, ch);
}

The spread [..."hello"] handles surrogate pairs correctly for most emoji; iterating with a plain for loop on indices does not.

Building strings efficiently

A reminder that bears repeating, because it is the single most common performance bug:

# BAD: O(n^2)
out = ""
for word in words:
    out += word

# GOOD: O(n)
out = "".join(words)

In Java use StringBuilder. In JavaScript build an array and join. In Go use strings.Builder. The reason is the same in every language: immutable strings cannot grow in place.

Try it yourself. Write a function compress(s) that turns "aaabbc" into "a3b2c1". Use a list and "".join at the end. What is the time complexity? What is the space complexity?

Complexity cheat sheet

Operation	Python `str`	JavaScript `string`
Index `s[i]`	O(1)	O(1)
Length	O(1)	O(1)
Concatenate `a + b`	O(	a
Slice of length k	O(k)	O(k)
`in` substring check	O(n·m) worst	O(n·m) worst
`split` / `join`	O(n)	O(n)
Reverse	O(n)	O(n)

The in (substring) check uses an optimized algorithm in CPython but is still worst-case O(n·m). For repeated searches in the same text, the next post on KMP and Rabin-Karp gives you better tools.

Recap

You now know:

Strings are immutable in Python and JavaScript, so concatenation in loops is quadratic
Indexing is O(1); slicing and most methods are O(n)
ord and chr (or charCodeAt / fromCharCode) translate between characters and integers
The two-pointer pattern solves palindrome and reverse-in-place problems in O(n) / O(1)
The frequency counter pattern solves anagram and unique-character problems in O(n)
Build strings with a list + join, not with +=

Next steps

The next post moves from operations to algorithms: string pattern matching. We’ll cover the naive scan, the KMP algorithm (and its famous LPS array), and the Rabin-Karp rolling hash.

→ Next: String Pattern Matching: Naive, KMP, and Rabin-Karp

Questions or feedback? Email codeloomdevv@gmail.com.