SQL LIKE: Pattern Matching with Wildcards Explained
The LIKE operator in SQL filters rows based on pattern matching within text columns. It uses two wildcard characters: % (matches any sequence of characters) and _ (matches exactly one character). It is the go-to tool for searching, filtering, and validating text data.
Basic Syntax
SELECT * FROM customers
WHERE email LIKE '%@gmail.com';
This returns all customers with Gmail addresses.
The Two Wildcards
% (Percent) - Matches Any Sequence
| Pattern | Matches | Does Not Match |
|---|---|---|
| 'A%' | Apple, Amazon, A | banana, cat |
| '%ing' | running, coding, ring | rings, singer |
| '%data%' | database, big data, my-data-file | Data (case-sensitive DBs) |
| '%' | Everything (including empty string in some DBs) | NULL |
_ (Underscore) - Matches Exactly One Character
| Pattern | Matches | Does Not Match |
|---|---|---|
| '_at' | cat, bat, hat | chat, at |
| 'h_t' | hot, hit, hat | heat, ht |
| '___' | Any 3-character string | ab, abcd |
| 'a__e' | able, axle, anne | ae, apple |
Common Patterns
Starts With
-- Products starting with "Pro"
SELECT product_name FROM products
WHERE product_name LIKE 'Pro%';
Ends With
-- Files ending in .pdf
SELECT filename FROM documents
WHERE filename LIKE '%.pdf';
Contains
-- Customers with "smith" anywhere in their name
SELECT * FROM customers
WHERE last_name LIKE '%smith%';
Specific Position
-- 5-digit zip codes starting with 9
SELECT * FROM addresses
WHERE zip_code LIKE '9____';
-- Product codes: 2 letters followed by 4 digits pattern
SELECT * FROM products
WHERE sku LIKE '__----'; -- Not quite right, LIKE has no digit wildcard
Case Sensitivity
LIKE behavior varies by database:
| Database | LIKE Case Sensitivity | Case-Insensitive Alternative |
|---|---|---|
| PostgreSQL | Case-sensitive | ILIKE |
| MySQL | Case-insensitive (default collation) | LIKE (already insensitive) |
| SQL Server | Depends on collation | LIKE with collation override |
| BigQuery | Case-sensitive | LOWER() + LIKE |
| SQLite | Case-insensitive for ASCII | LIKE (already insensitive) |
-- PostgreSQL: case-insensitive search
SELECT * FROM customers WHERE name ILIKE '%john%';
-- Universal approach: normalize case
SELECT * FROM customers WHERE LOWER(name) LIKE '%john%';
NOT LIKE
Exclude rows matching a pattern:
-- Customers without Gmail addresses
SELECT * FROM customers
WHERE email NOT LIKE '%@gmail.com';
-- Products not in the "test" category pattern
SELECT * FROM products
WHERE product_name NOT LIKE 'TEST%';
ESCAPE Clause
When you need to match literal % or _ characters:
-- Find values containing a literal percent sign
SELECT * FROM discounts
WHERE description LIKE '%50\%%' ESCAPE '\';
-- Find column names with underscores
SELECT * FROM metadata
WHERE column_name LIKE '%\_id' ESCAPE '\';
Multiple LIKE Conditions
-- Match multiple patterns
SELECT * FROM customers
WHERE email LIKE '%@gmail.com'
OR email LIKE '%@yahoo.com'
OR email LIKE '%@hotmail.com';
-- More concise with SIMILAR TO (PostgreSQL) or REGEXP
-- PostgreSQL:
SELECT * FROM customers
WHERE email SIMILAR TO '%(gmail|yahoo|hotmail)%';
LIKE vs. Other Text Search Methods
| Method | Use Case | Performance |
|---|---|---|
| LIKE '%term%' | Simple substring match | Slow (full scan, no index) |
| LIKE 'prefix%' | Starts-with match | Fast (can use B-tree index) |
| Full-text search | Natural language search, relevance ranking | Fast (uses inverted index) |
| REGEXP / RLIKE | Complex patterns (digits, alternation) | Slow (no index) |
| = (exact match) | Exact value comparison | Fastest (uses index) |
Performance Considerations
Leading Wildcards Kill Performance
-- SLOW: Cannot use index (scans every row)
WHERE name LIKE '%smith%'
-- FAST: Can use B-tree index (prefix match)
WHERE name LIKE 'Smith%'
A leading % forces a full table scan because the database cannot use an index to narrow results. For substring searches on large tables:
- Use full-text search indexes (PostgreSQL tsvector, MySQL FULLTEXT, Elasticsearch)
- Use trigram indexes (PostgreSQL pg_trgm extension)
- Pre-compute searchable fields (extract domain from email into its own indexed column)
Optimization Strategies
-- Instead of: WHERE email LIKE '%@gmail.com'
-- Pre-compute and index:
ALTER TABLE customers ADD COLUMN email_domain VARCHAR(100);
UPDATE customers SET email_domain = SPLIT_PART(email, '@', 2);
CREATE INDEX idx_email_domain ON customers(email_domain);
-- Then: WHERE email_domain = 'gmail.com' (uses index)
Real-World Examples
Data cleaning: Find invalid emails
SELECT email FROM customers
WHERE email NOT LIKE '%_@_%.__%';
-- Matches basic email pattern: something@something.xx
Filter test data
DELETE FROM orders
WHERE customer_email LIKE '%@test.com'
OR customer_email LIKE '%+test%'
OR customer_name LIKE 'Test %';
Search functionality
-- Simple product search
SELECT * FROM products
WHERE LOWER(product_name) LIKE LOWER('%' || $user_input || '%')
LIMIT 20;
Categorization
SELECT
CASE
WHEN url LIKE '%/blog/%' THEN 'Blog'
WHEN url LIKE '%/products/%' THEN 'Product'
WHEN url LIKE '%/pricing%' THEN 'Pricing'
ELSE 'Other'
END AS page_type,
COUNT(*) AS pageviews
FROM web_events
GROUP BY 1;
Platforms like Skopx handle text filtering automatically. Ask "Show me customers with Gmail addresses" or "Find all test orders" and the AI generates the appropriate LIKE patterns without you writing SQL.
Summary
| Pattern | Meaning |
|---|---|
| LIKE 'abc%' | Starts with "abc" |
| LIKE '%abc' | Ends with "abc" |
| LIKE '%abc%' | Contains "abc" |
| LIKE '_bc' | Second and third chars are "bc" |
| LIKE 'a_c' | Three chars, starts with a, ends with c |
| NOT LIKE '%test%' | Does not contain "test" |
| ILIKE '%abc%' | Case-insensitive contains (PostgreSQL) |
LIKE is simple and universal but limited. For complex patterns use REGEXP. For full-text search use dedicated indexes. For prefix matches, LIKE with a trailing % can use indexes and performs well at scale.
Saad Selim
The Skopx engineering and product team