Regex Cheat Sheet: Complete Guide to Regular Expressions

What is Regex?

Regular expressions (regex or regexp) are sequences of characters that define search patterns. They are one of the most powerful tools for text processing, pattern matching, and data extraction.

Why Use Regular Expressions?

Regex is essential for:

Form validation (email, phone numbers, passwords)
Data extraction (parsing logs, scraping web pages)
Text processing (find and replace, formatting)
Code refactoring (renaming variables, updating syntax)
Input sanitization (security, preventing injection attacks)

How to Read This Cheat Sheet

This guide is organized into progressive sections:

Fundamentals - Basic syntax and patterns
Advanced Features - Lookarounds, named groups, Unicode
Language-Specific - Examples in Python, JavaScript, PHP, C#, Java, Go, Ruby
VS Code Regex - Find/replace transformations with 20+ examples
Common Patterns - Copy-paste patterns for emails, URLs, dates, etc.
Practical Use Cases - Real-world applications
Troubleshooting - Common mistakes and performance tips

Each section includes:

Clear explanations
Visual examples
Code snippets you can copy
Pro tips and gotchas

Literal Characters

The simplest regex is a literal character sequence:

abc

Matches: "abc" in "The abc sequence"

Case Sensitivity

By default, regex is case-sensitive:

Hello matches "Hello" but NOT "hello"
Use the i flag for case-insensitive matching: /hello/i

Special Characters (Metacharacters)

These 14 characters have special meaning in regex and must be escaped with \ to match literally:

. ^ $ * + ? { } [ ] \ | ( )

Examples:

\. matches a literal dot (period)
\$ matches a dollar sign
\( matches a literal parenthesis

Character	Escape	Example	Matches
`.` (dot)	`\.`	`3\.14`	"3.14"
`$` (dollar)	`\$`	`\$100`	"$100"
`*` (asterisk)	`\*`	`a\*b`	"a*b"
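When the literal text comes from user input, escaping each metacharacter by hand is easy to get wrong. Here is a minimal Python sketch using the standard library's re.escape; the sample strings are made up for illustration.

```python
import re

# Literal text that happens to contain several metacharacters.
literal = "3.14 (approx) costs $100"

# re.escape() backslash-escapes every character that could be special.
escaped = re.escape(literal)

# The escaped pattern now matches the literal text and nothing else.
haystack = "Note: 3.14 (approx) costs $100 today"
found = re.search(escaped, haystack)
matched_text = found.group() if found else None

# Unescaped, the dot is a wildcard, so r"3.14" also matches "3x14".
wildcard_hit = re.search(r"3.14", "3x14") is not None
literal_hit = re.search(r"3\.14", "3x14") is not None

print(matched_text)               # 3.14 (approx) costs $100
print(wildcard_hit, literal_hit)  # True False
```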

Character Classes

Predefined Character Classes

Pattern	Description	Equivalent	Example	Matches
`\d`	Any digit	`[0-9]`	`\d\d`	"42"
`\D`	Any non-digit	`[^0-9]`	`\D+`	"abc"
`\w`	Word character	`[a-zA-Z0-9_]`	`\w+`	"hello_123"
`\W`	Non-word character	`[^a-zA-Z0-9_]`	`\W`	"@", "#"
`\s`	Whitespace	`[ \t\n\r\f\v]`	`\s+`	" " (spaces)
`\S`	Non-whitespace	`[^ \t\n\r\f\v]`	`\S+`	"hello"
`.`	Any character except newline	-	`a.c`	"abc", "a1c"

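Pro Tip: In JavaScript, \w matches only ASCII letters, digits, and underscore, so use \p{L} with the u flag for Unicode letters. In Python 3, \w is already Unicode-aware for str patterns (pass re.ASCII to restrict it), and \p{L} requires the third-party regex module.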

Custom Character Classes

Pattern	Description	Example	Matches
`[abc]`	Match any of a, b, or c	`[aeiou]`	Vowels: "a", "e", "i", "o", "u"
`[^abc]`	Match any except a, b, or c	`[^0-9]`	Non-digits
`[a-z]`	Range: lowercase letters	`[a-z]+`	"hello"
`[A-Z]`	Range: uppercase letters	`[A-Z]+`	"HELLO"
`[0-9]`	Range: digits	`[0-9]{4}`	"2025"
`[a-zA-Z]`	Combined: all letters	`[a-zA-Z0-9]`	Alphanumeric

Examples:

[aeiou]       → Matches any vowel
[^aeiou]      → Matches any character that is not a lowercase vowel (consonants, but also digits, spaces, punctuation)
[a-z0-9]      → Matches lowercase letters and digits
[a-zA-Z0-9_]  → Same as \w (word characters)

Quantifiers

Quantifiers specify how many times a pattern should match.

Basic Quantifiers

Pattern	Description	Example	Matches
`*`	0 or more	`ab*c`	"ac", "abc", "abbc", "abbbc"
`+`	1 or more	`ab+c`	"abc", "abbc" (NOT "ac")
`?`	0 or 1 (optional)	`colou?r`	"color", "colour"
`{n}`	Exactly n times	`\d{4}`	"2025" (exactly 4 digits)
`{n,}`	n or more times	`\d{2,}`	"42", "123", "9999"
`{n,m}`	Between n and m times	`\d{2,4}`	"42", "123", "2025"

Greedy vs. Lazy Quantifiers

Greedy (default): Matches as much as possible

<.*>      → Matches: "<div>Hello</div>" (entire string)

Lazy (non-greedy): Matches as little as possible (add ? after quantifier)

<.*?>     → Matches: "<div>" and "</div>" separately

Greedy	Lazy	Description
`*`	`*?`	0 or more (lazy)
`+`	`+?`	1 or more (lazy)
`?`	`??`	0 or 1 (lazy)
`{n,m}`	`{n,m}?`	Between n and m (lazy)
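The difference is easiest to see by running both forms on the same input. Here is a small Python sketch using re.findall from the standard library; the sample strings are illustrative.

```python
import re

html = "<div>Hello</div>"

# Greedy: .* runs to the last ">" it can reach, giving one long match.
greedy_tags = re.findall(r"<.*>", html)
# Lazy: .*? stops at the first ">" it can reach, giving one match per tag.
lazy_tags = re.findall(r"<.*?>", html)

quoted = '"Hello" and "World"'
greedy_quotes = re.findall(r'".*"', quoted)
lazy_quotes = re.findall(r'".*?"', quoted)

print(greedy_tags)    # ['<div>Hello</div>']
print(lazy_tags)      # ['<div>', '</div>']
print(greedy_quotes)  # ['"Hello" and "World"']
print(lazy_quotes)    # ['"Hello"', '"World"']
```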

Example:

Text: "Hello" and "World"

".*" matches the whole span "Hello" and "World" as one match (greedy)
".*?" matches "Hello" and "World" as two separate matches (lazy)

Anchors & Boundaries

Anchors match positions, not characters.

Pattern	Description	Example	Matches
`^`	Start of string/line	`^Hello`	"Hello World" (at start)
`$`	End of string/line	`World$`	"Hello World" (at end)
`\b`	Word boundary	`\bcat\b`	"cat" in "The cat sat" (NOT "category")
`\B`	Non-word boundary	`\Bcat`	"category" (cat NOT at boundary)
`\A`	Start of string (not line)	`\AHello`	Only matches if "Hello" is at very start
`\z`	End of string (not line)	`World\z`	Only matches if "World" is at very end
`\Z`	End of string (before final newline)	`World\Z`	Matches "World" or "World\n" (in Python, `\Z` matches only at the very end, like `\z` elsewhere)

Examples:

^cat$         → Matches: "cat" (entire line is "cat")
\bcat\b       → Matches: "cat" in "the cat sat" (whole word)
\Bcat         → Matches: "cat" in "category" (NOT at boundary)

Multiline Mode (m flag):

Without m: ^ and $ match start/end of entire string
With m: ^ and $ match start/end of each line
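A quick way to check the effect of the m flag is to run the same pattern with and without it. The sketch below uses Python's re.MULTILINE; the three-line sample text is an assumption for illustration.

```python
import re

text = "first line\nsecond line\nthird line"

# Without re.MULTILINE, ^ only matches at the very start of the string.
default_starts = re.findall(r"^\w+", text)

# With re.MULTILINE, ^ also matches right after each newline.
multiline_starts = re.findall(r"^\w+", text, re.MULTILINE)

# $ behaves the same way for line ends.
multiline_ends = re.findall(r"\w+$", text, re.MULTILINE)

print(default_starts)    # ['first']
print(multiline_starts)  # ['first', 'second', 'third']
print(multiline_ends)    # ['line', 'line', 'line']
```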

Groups & Alternation

Capturing Groups

Capturing groups (...) remember the matched text:

(\d+)-(\d+)   → Matches: "123-456"
                Group 1: "123"
                Group 2: "456"

Backreferences (reuse captured groups):

(\w)\1        → Matches: "aa", "bb", "cc" (repeated character)
(\w+) \1      → Matches: "hello hello" (repeated word)
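A common use for backreferences is catching accidentally doubled words. Here is a short Python sketch; the sample sentence is made up, and the added \b anchors are there so only whole words are compared.

```python
import re

text = "This is is a test of the the backreference"

# \b(\w+)\s+\1\b : a whole word, whitespace, then the same word again.
doubled = re.findall(r"\b(\w+)\s+\1\b", text)

# Replace each doubled word with a single copy (\1 in the replacement
# refers to the captured word).
fixed = re.sub(r"\b(\w+)\s+\1\b", r"\1", text)

print(doubled)  # ['is', 'the']
print(fixed)    # This is a test of the backreference
```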

Non-Capturing Groups

Use (?:...) when you need grouping but don't need to capture:

(?:https?://)  → Groups "http://" or "https://" without capturing

Why use non-capturing?

Slightly less overhead (the engine does not have to store the matched text)
Cleaner backreferences (numbered groups only count capturing groups)

Alternation (OR)

Use | for "match this OR that":

cat|dog       → Matches: "cat" or "dog"
gr(a|e)y      → Matches: "gray" or "grey"

Examples:

(Mr|Ms|Mrs)\.?  → Matches: "Mr.", "Ms.", "Mrs."
https?://       → Matches: "http://" or "https://"

Lookaround Assertions

Lookarounds are zero-width assertions that match a position (like anchors) but with conditions.

Positive Lookahead `(?=...)`

Matches if the pattern ahead matches (but doesn't consume it):

\d+(?=px)     → Matches: "10" in "10px" (NOT the "px" part)

Use case: Password validation

^(?=.*[A-Z])(?=.*[a-z])(?=.*\d)(?=.*[@$!%*?&]).{8,}$

Breakdown:

(?=.*[A-Z]) - Must contain uppercase
(?=.*[a-z]) - Must contain lowercase
(?=.*\d) - Must contain digit
(?=.*[@$!%*?&]) - Must contain special char
.{8,} - At least 8 characters
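To see the stacked lookaheads in action, here is a minimal Python sketch that wraps the pattern in a helper. The function name is_strong and the sample passwords are hypothetical, chosen so that each rule is violated once.

```python
import re

# Same pattern as above: four lookaheads, then at least 8 characters.
PASSWORD_RE = re.compile(
    r"^(?=.*[A-Z])(?=.*[a-z])(?=.*\d)(?=.*[@$!%*?&]).{8,}$"
)

def is_strong(password: str) -> bool:
    """Return True when the password satisfies every lookahead rule."""
    return PASSWORD_RE.match(password) is not None

checks = {
    "Passw0rd!": is_strong("Passw0rd!"),        # meets every rule
    "password1!": is_strong("password1!"),      # no uppercase letter
    "NoDigits!!": is_strong("NoDigits!!"),      # no digit
    "NoSpecial123": is_strong("NoSpecial123"),  # no special character
    "Sh0rt!": is_strong("Sh0rt!"),              # fewer than 8 characters
}
print(checks)
```

Each lookahead starts from the same position (the start of the string), which is why the order of the four rules does not matter.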

Negative Lookahead `(?!...)`

Matches if the pattern ahead does NOT match:

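\d+(?!\d|px)  → Matches: "10" in "10em" (no match in "10px")

The \d inside the lookahead stops the engine from backtracking to a shorter match such as "1" in "10px".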

Use case: Exclude certain words

\b(?!test)\w+  → Matches words that DON'T start with "test"
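Negative lookaheads are easy to get subtly wrong, so it helps to run them on a few inputs. The Python sketch below uses made-up sample strings; the second pattern is a variant that also rejects a following digit, so the engine cannot fall back to a shorter run of digits.

```python
import re

names = "test_login testing helper contest main"

# Words that do not START with "test" ("contest" is fine: it only ends with it).
non_test = re.findall(r"\b(?!test)\w+", names)

sizes = "10px 20em 30px 40%"

# Numbers not followed by "px". The \d alternative inside the lookahead
# prevents a partial match such as "1" in "10px".
non_px = re.findall(r"\d+(?!\d|px)", sizes)

print(non_test)  # ['helper', 'contest', 'main']
print(non_px)    # ['20', '40']
```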

Positive Lookbehind `(?<=...)`

Matches if the pattern behind matches:

(?<=\$)\d+    → Matches: "100" in "$100" (NOT the "$" part)

Use case: Extract prices

(?<=Price: \$)\d+\.\d{2}  → Matches: "29.99" in "Price: $29.99"
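Here is a small Python sketch of the price example. Note that Python's re module requires lookbehind patterns to have a fixed width, which both patterns below satisfy; the sample text is an assumption.

```python
import re

text = "Price: $29.99, Discount: $5.00, Qty: 3"

# Every amount that is directly preceded by a dollar sign.
amounts = re.findall(r"(?<=\$)\d+\.\d{2}", text)

# Only the amount that follows the "Price: $" label.
labelled = re.search(r"(?<=Price: \$)\d+\.\d{2}", text)
price = labelled.group() if labelled else None

print(amounts)  # ['29.99', '5.00']
print(price)    # 29.99
```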

Negative Lookbehind `(?<!...)`

Matches if the pattern behind does NOT match:

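(?<![$\d])\d+ → Matches: "100" in "EUR 100" but nothing in "$100"

The \d inside the lookbehind matters: with (?<!\$)\d+ alone, the engine would still match "00" in "$100".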

Summary Table:

Type	Syntax	Description	Example
Positive Lookahead	`(?=...)`	Matches if followed by...	`q(?=u)` matches "q" in "queen"
Negative Lookahead	`(?!...)`	Matches if NOT followed by...	`q(?!u)` matches "q" in "iraq"
Positive Lookbehind	`(?<=...)`	Matches if preceded by...	`(?<=\$)\d+` matches "10" in "$10"
Negative Lookbehind	`(?<!...)`	Matches if NOT preceded by...	`(?<!\$)\d+` matches "10" in "10"

Named Capturing Groups

Named groups (?<name>...) make regex more readable:

JavaScript:

const dateRegex = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const match = '2025-01-17'.match(dateRegex);

console.log(match.groups.year);  // "2025"
console.log(match.groups.month); // "01"
console.log(match.groups.day);   // "17"

Python:

import re

pattern = r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})'
match = re.search(pattern, '2025-01-17')

print(match.group('year'))   # "2025"
print(match.group('month'))  # "01"
print(match.group('day'))    # "17"

C#:

var pattern = @"(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})";
var match = Regex.Match("2025-01-17", pattern);

Console.WriteLine(match.Groups["year"].Value);  // "2025"

Atomic Groups & Possessive Quantifiers

Atomic Groups `(?>...)`

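Once matched, the group does not backtrack, which helps prevent catastrophic backtracking. Atomic groups are supported in PCRE, Java, .NET, Ruby, and Python 3.11+, but not in JavaScript or Go: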

(?>\d+)bar    → Matches: "123bar" (fast)

Without atomic group:

\d+bar        → Tries: "123bar", "12bar", "1bar" (slow on mismatch)

Possessive Quantifiers

Greedy	Possessive	Description
`*`	`*+`	0 or more (no backtracking)
`+`	`++`	1 or more (no backtracking)
`?`	`?+`	0 or 1 (no backtracking)

Use case: Prevent catastrophic backtracking on complex patterns.
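On engines without atomic groups, a widely used workaround is a lookahead that captures, followed by a backreference to that capture. The Python sketch below relies on that trick (lookaheads are not re-entered on backtracking); the group name run is an arbitrary choice, and on Python 3.11+ the native (?>\d+) syntax could be used instead.

```python
import re

subject = "12345"

# Normal greedy quantifier: \d+ first takes "12345", then gives back one
# digit so that the trailing literal 5 can match.
backtracking = re.search(r"\d+5", subject)

# Emulated atomic group: the lookahead captures the longest digit run and
# the backreference consumes exactly that text. The run cannot give a
# digit back, so the trailing 5 never finds anything to match.
atomic_like = re.search(r"(?=(?P<run>\d+))(?P=run)5", subject)

print(backtracking.group())  # 12345
print(atomic_like)           # None
```

The second search fails on purpose: it shows that the digits were matched "all or nothing", which is exactly the behavior that stops runaway backtracking.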

Unicode Support

Modern regex engines support Unicode categories and scripts.

Unicode Categories `\p{...}`

JavaScript (ES2018+):

const letters = /\p{L}+/u;     // Any letter (any language)
const numbers = /\p{N}+/u;     // Any number
const currency = /\p{Sc}/u;    // Currency symbols

Python:

import regex  # Note: requires 'regex' module, not 're'

letters = regex.compile(r'\p{L}+')

Common Unicode Categories:

Category	Description	Example
`\p{L}`	Letter	"a", "字", "א"
`\p{N}`	Number	"1", "①", "½"
`\p{S}`	Symbol	"$", "©", "♥"
`\p{Sc}`	Currency symbol	"$", "€", "¥"
`\p{P}`	Punctuation	".", "!", "?"
`\p{Z}`	Separator	Space, no-break space (tab is a control character, not a separator)

Unicode Scripts:

/\p{Script=Greek}/u     → Matches Greek letters: "α", "β", "γ"
/\p{Script=Cyrillic}/u  → Matches Cyrillic: "а", "б", "в"
/\p{Script=Han}/u       → Matches Chinese characters

Negation:

/\P{L}+/u   → Matches anything that is NOT a letter
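If installing the third-party regex module is not an option, Python's built-in re can approximate \p{L}, because \w is Unicode-aware for str patterns. The sketch below uses the [^\W\d_] idiom (word characters minus digits and underscore); treat it as an approximation rather than an exact equivalent of \p{L}. The sample text is an assumption.

```python
import re

# "caf\u00e9" is "café" and "na\u00efve" is "naïve"; the escapes keep each
# accented letter as a single code point. "\u4f60\u597d" is "你好".
text = "caf\u00e9 \u4f60\u597d na\u00efve_123"

# Word characters that are neither digits nor underscores: roughly "letters".
letter_runs = re.findall(r"[^\W\d_]+", text)

# With re.ASCII, \w falls back to [a-zA-Z0-9_] and stops at the accent.
ascii_words = re.findall(r"\w+", "caf\u00e9", re.ASCII)

print(letter_runs)  # ['café', '你好', 'naïve']
print(ascii_words)  # ['caf']
```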

Modifiers & Flags

Flags change how regex patterns are interpreted.

Flag	Name	Description	Example
`i`	Case-insensitive	Ignore case	`/hello/i` matches "Hello"
`g`	Global	Find all matches	`/cat/g` finds all "cat"
`m`	Multiline	`^` and `$` match line starts/ends	`/^hello/m`
`s`	Dotall	`.` matches newlines too	`/a.b/s` matches "a\nb"
`u`	Unicode	Enable Unicode features	`/\p{L}+/u`
`x`	Extended	Ignore whitespace (free-spacing)	Allows comments
`y`	Sticky	Match at exact position	JavaScript only

Examples:

Case-insensitive (i):

/hello/i.test('HELLO')   // true

Global (g):

'cat dog cat'.match(/cat/g)   // ["cat", "cat"]

Multiline (m):

const text = 'Line 1\nLine 2';
/^Line 2/m.test(text)   // true (without 'm': false)

Dotall (s):

/a.b/s.test('a\nb')   // true (without 's': false)

Inline Modifiers

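Apply flags to part of the pattern. Support varies by engine: Python, for example, accepts the global form (?i) only at the start of the pattern and otherwise requires the scoped form (?i:...):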

(?i)hello      → Case-insensitive "hello"
(?-i)WORLD     → Case-sensitive "WORLD"
(?i:hello)     → Only "hello" is case-insensitive

Conditional Patterns

Syntax: (?(condition)true|false)

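Example: Match quoted or unquoted strings (conditionals are available in PCRE, Python, and .NET, but not in JavaScript, Java, or Go)

^("|')?[^"'\r\n]*(?(1)\1)$

Breakdown:

^ and $ - Anchor to the whole string so partial matches are rejected
("|')? - Optionally capture opening quote
[^"'\r\n]* - Match content
(?(1)\1) - If group 1 matched (opening quote), match same closing quote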
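Python's re module supports this conditional syntax, so the pattern can be tried directly. In the sketch below, fullmatch plays the role of anchoring the pattern to the whole string; the helper name is_well_quoted is hypothetical.

```python
import re

# Optional opening quote, unquoted content, and a conditional that demands
# the same closing quote only when an opening quote was captured.
QUOTED_OR_BARE = re.compile(r"""("|')?[^"'\r\n]*(?(1)\1)""")

def is_well_quoted(value: str) -> bool:
    """True for a bare word or a string with matching quotes on both ends."""
    return QUOTED_OR_BARE.fullmatch(value) is not None

double_ok = is_well_quoted('"hello"')
single_ok = is_well_quoted("'world'")
bare_ok = is_well_quoted("test")
mixed_ok = is_well_quoted("\"mixed'")

print(double_ok, single_ok, bare_ok, mixed_ok)  # True True True False
```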

Matches:

"hello" ✅
'world' ✅
test ✅ (no quotes)
"mixed' ❌ (mismatched quotes)

Comments in Regex

Inline Comments `(?# comment)`

\d{3}(?# area code)-\d{3}(?# prefix)-\d{4}(?# line number)

Free-Spacing Mode (`x` flag)

Ignore whitespace and allow comments:

(?x)
  \d{3}     # area code
  -         # separator
  \d{3}     # prefix
  -         # separator
  \d{4}     # line number

Much more readable for complex patterns!

This section demonstrates how to use regex in 7 popular programming languages. Each language has its own regex API, but the core pattern syntax remains mostly consistent.

JavaScript / Node.js

Creating Regex Patterns

// Literal notation (most common)
const pattern1 = /\d{3}-\d{4}/;

// Constructor (when pattern is dynamic)
const pattern2 = new RegExp('\\d{3}-\\d{4}');
// Note: Backslashes must be escaped in strings

// With flags
const pattern3 = /hello/gi;  // Global, case-insensitive

String Methods

// .match() - Find matches
const text = 'Contact: 123-4567 or 987-6543';
const matches = text.match(/\d{3}-\d{4}/g);
console.log(matches);  // ["123-4567", "987-6543"]

// .matchAll() - Get all matches with groups (ES2020)
const emailPattern = /([\w.-]+)@([\w.-]+\.[a-z]{2,})/gi;
const emails = 'admin@example.com, user@test.org';
for (const match of emails.matchAll(emailPattern)) {
  console.log(`User: ${match[1]}, Domain: ${match[2]}`);
}
// User: admin, Domain: example.com
// User: user, Domain: test.org

// .search() - Find position of first match
const pos = 'Hello World'.search(/World/);
console.log(pos);  // 6

// .replace() - Replace matches
const phone = '(123) 456-7890';
const cleaned = phone.replace(/[^\d]/g, '');
console.log(cleaned);  // "1234567890"

// .replaceAll() - Replace all matches (ES2021)
const text2 = 'cat dog cat';
const result = text2.replaceAll(/cat/g, 'bird');
console.log(result);  // "bird dog bird"

// .split() - Split by pattern
const csv = 'apple,banana, orange , grape';
const fruits = csv.split(/\s*,\s*/);
console.log(fruits);  // ["apple", "banana", "orange", "grape"]

RegExp Methods

// .test() - Returns boolean
const isEmail = /^[\w.-]+@[\w.-]+\.[a-z]{2,}$/i;
console.log(isEmail.test('user@example.com'));  // true

// .exec() - Returns match details (or null)
const pattern = /(\d{4})-(\d{2})-(\d{2})/;
const match = pattern.exec('Date: 2025-01-17');
if (match) {
  console.log(match[0]);  // "2025-01-17" (full match)
  console.log(match[1]);  // "2025" (group 1)
  console.log(match[2]);  // "01" (group 2)
  console.log(match[3]);  // "17" (group 3)
}

Named Groups (ES2018+)

const pattern = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const match = '2025-01-17'.match(pattern);

console.log(match.groups.year);   // "2025"
console.log(match.groups.month);  // "01"
console.log(match.groups.day);    // "17"

// Named backreferences
const dupeWord = /\b(?<word>\w+)\s+\k<word>\b/i;
console.log(dupeWord.test('hello hello'));  // true

Unicode Support (ES2018+)

// Match any letter (including accented, Chinese, Arabic, etc.)
const letters = /\p{L}+/u;
console.log(letters.test('café'));    // true
console.log(letters.test('你好'));    // true

// Match emoji (note: \p{Emoji} also matches digits, # and *; \p{Extended_Pictographic} is stricter)
const emoji = /\p{Emoji}/u;
console.log(emoji.test('Hello 👋'));  // true

Python

The `re` Module

import re

# Compile pattern (recommended for reuse)
pattern = re.compile(r'\d{3}-\d{4}')

# Or use directly
re.search(r'\d{3}-\d{4}', 'Call 123-4567')

Core Functions

import re

# re.search() - Find first match
match = re.search(r'\d{3}-\d{4}', 'Contact: 123-4567 or 987-6543')
if match:
    print(match.group())  # "123-4567"
    print(match.start())  # 9 (position)
    print(match.end())    # 17

# re.match() - Match at START of string
match = re.match(r'\d+', '123 Main St')
print(match.group() if match else None)  # "123"

match = re.match(r'\d+', 'Main St 123')
print(match)  # None (doesn't start with digit)

# re.fullmatch() - Match ENTIRE string
result = re.fullmatch(r'\d{3}-\d{4}', '123-4567')
print(bool(result))  # True

result = re.fullmatch(r'\d{3}-\d{4}', 'Call 123-4567')
print(bool(result))  # False (extra text)

# re.findall() - Find all matches (returns list)
text = 'Prices: $10, $25, $100'
prices = re.findall(r'\$(\d+)', text)
print(prices)  # ['10', '25', '100']

# re.finditer() - Find all matches (returns iterator)
for match in re.finditer(r'\$(\d+)', text):
    print(f'Found ${match.group(1)} at position {match.start()}')
# Found $10 at position 8
# Found $25 at position 13
# Found $100 at position 18

# re.sub() - Replace matches
phone = '(123) 456-7890'
cleaned = re.sub(r'[^\d]', '', phone)
print(cleaned)  # "1234567890"

# re.split() - Split by pattern
csv = 'apple,banana, orange , grape'
fruits = re.split(r'\s*,\s*', csv)
print(fruits)  # ['apple', 'banana', 'orange', 'grape']

Groups and Named Groups

import re

# Numbered groups
pattern = r'(\d{4})-(\d{2})-(\d{2})'
match = re.search(pattern, 'Date: 2025-01-17')
if match:
    print(match.group(0))  # "2025-01-17" (full match)
    print(match.group(1))  # "2025"
    print(match.group(2))  # "01"
    print(match.group(3))  # "17"
    print(match.groups())  # ('2025', '01', '17')

# Named groups (?P<name>...)
pattern = r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})'
match = re.search(pattern, '2025-01-17')
if match:
    print(match.group('year'))   # "2025"
    print(match.group('month'))  # "01"
    print(match.group('day'))    # "17"
    print(match.groupdict())     # {'year': '2025', 'month': '01', 'day': '17'}

Flags

import re

# Case-insensitive
re.search(r'hello', 'HELLO', re.IGNORECASE)  # or re.I

# Multiline (^ and $ match line starts/ends)
re.search(r'^Line 2', 'Line 1\nLine 2', re.MULTILINE)  # or re.M

# Dotall (. matches newlines)
re.search(r'a.b', 'a\nb', re.DOTALL)  # or re.S

# Verbose (free-spacing mode with comments)
pattern = re.compile(r'''
    \d{3}     # area code
    -         # separator
    \d{3}     # prefix
    -         # separator
    \d{4}     # line number
''', re.VERBOSE)  # or re.X

# Combine flags with |
pattern = re.compile(r'hello', re.IGNORECASE | re.MULTILINE)

Replacement with Functions

import re

# Use function for dynamic replacements
def double_number(match):
    num = int(match.group())
    return str(num * 2)

text = 'I have 5 apples and 10 oranges'
result = re.sub(r'\d+', double_number, text)
print(result)  # "I have 10 apples and 20 oranges"

# With named groups
def format_name(match):
    return f"{match.group('last').upper()}, {match.group('first')}"

pattern = r'(?P<first>\w+)\s+(?P<last>\w+)'
text = 'John Doe'
result = re.sub(pattern, format_name, text)
print(result)  # "DOE, John"

PHP

PCRE Functions

<?php

// preg_match() - Find first match
$pattern = '/\d{3}-\d{4}/';
$text = 'Contact: 123-4567 or 987-6543';

if (preg_match($pattern, $text, $matches)) {
    echo $matches[0];  // "123-4567"
}

// preg_match_all() - Find all matches
preg_match_all('/\d{3}-\d{4}/', $text, $matches);
print_r($matches[0]);  // ["123-4567", "987-6543"]

// preg_replace() - Replace matches
$phone = '(123) 456-7890';
$cleaned = preg_replace('/[^\d]/', '', $phone);
echo $cleaned;  // "1234567890"

// preg_split() - Split by pattern
$csv = 'apple,banana, orange , grape';
$fruits = preg_split('/\s*,\s*/', $csv);
print_r($fruits);  // ["apple", "banana", "orange", "grape"]

// preg_grep() - Filter array by pattern
$words = ['apple', 'banana', 'apricot', 'orange'];
$aWords = preg_grep('/^a/', $words);
print_r($aWords);  // ["apple", "apricot"]
?>

Named Groups

<?php
$pattern = '/(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})/';
$text = '2025-01-17';

if (preg_match($pattern, $text, $matches)) {
    echo $matches['year'];   // "2025"
    echo $matches['month'];  // "01"
    echo $matches['day'];    // "17"
}
?>

Modifiers (Flags)

<?php
// i - Case-insensitive
preg_match('/hello/i', 'HELLO');  // Match

// m - Multiline
preg_match('/^Line 2/m', "Line 1\nLine 2");  // Match

// s - Dotall (. matches newlines)
preg_match('/a.b/s', "a\nb");  // Match

// x - Free-spacing (ignore whitespace)
$pattern = '/
    \d{3}     # area code
    -         # separator
    \d{3}     # prefix
    -         # separator
    \d{4}     # line number
/x';

// u - UTF-8 support
preg_match('/\w+/u', 'café');  // Match (includes é)

// Combine modifiers
preg_match('/hello/imu', $text);
?>

Replacement with Callbacks

<?php
$text = 'I have 5 apples and 10 oranges';

$result = preg_replace_callback('/\d+/', function($matches) {
    return (int)$matches[0] * 2;
}, $text);

echo $result;  // "I have 10 apples and 20 oranges"
?>

C# (.NET)

Regex Class

using System;
using System.Text.RegularExpressions;

// Static methods (simple use)
string text = "Contact: 123-4567 or 987-6543";
Match match = Regex.Match(text, @"\d{3}-\d{4}");
if (match.Success)
{
    Console.WriteLine(match.Value);  // "123-4567"
}

// Find all matches
MatchCollection matches = Regex.Matches(text, @"\d{3}-\d{4}");
foreach (Match m in matches)
{
    Console.WriteLine(m.Value);
}
// Output:
// 123-4567
// 987-6543

// Replace
string phone = "(123) 456-7890";
string cleaned = Regex.Replace(phone, @"[^\d]", "");
Console.WriteLine(cleaned);  // "1234567890"

// Split
string csv = "apple,banana, orange , grape";
string[] fruits = Regex.Split(csv, @"\s*,\s*");
// ["apple", "banana", "orange", "grape"]

Compiled Regex (Better Performance)

using System.Text.RegularExpressions;

// Compile for reuse (much faster for repeated use)
Regex pattern = new Regex(@"\d{3}-\d{4}", RegexOptions.Compiled);

string text = "Contact: 123-4567";
Match match = pattern.Match(text);
if (match.Success)
{
    Console.WriteLine(match.Value);
}

RegexOptions (Flags)

using System.Text.RegularExpressions;

// Case-insensitive
Regex.IsMatch("HELLO", "hello", RegexOptions.IgnoreCase);

// Multiline
Regex.Match("Line 1\nLine 2", "^Line 2", RegexOptions.Multiline);

// Singleline (. matches newlines)
Regex.Match("a\nb", "a.b", RegexOptions.Singleline);

// Compiled (better performance)
var compiled = new Regex(@"\d+", RegexOptions.Compiled);

// Combine options (the static overload takes the pattern as a string)
var opts = RegexOptions.IgnoreCase | RegexOptions.Multiline;
Regex.Match("Line 1\nline 2", "^LINE 2", opts);

Named Groups

using System;
using System.Text.RegularExpressions;

string pattern = @"(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})";
Match match = Regex.Match("2025-01-17", pattern);

if (match.Success)
{
    Console.WriteLine(match.Groups["year"].Value);   // "2025"
    Console.WriteLine(match.Groups["month"].Value);  // "01"
    Console.WriteLine(match.Groups["day"].Value);    // "17"
}

Replacement with MatchEvaluator

using System;
using System.Text.RegularExpressions;

string text = "I have 5 apples and 10 oranges";

string result = Regex.Replace(text, @"\d+", match =>
{
    int num = int.Parse(match.Value);
    return (num * 2).ToString();
});

Console.WriteLine(result);  // "I have 10 apples and 20 oranges"

Java

Pattern and Matcher Classes

import java.util.regex.Pattern;
import java.util.regex.Matcher;

// Compile pattern
Pattern pattern = Pattern.compile("\\d{3}-\\d{4}");
String text = "Contact: 123-4567 or 987-6543";

// Create matcher
Matcher matcher = pattern.matcher(text);

// Find first match
if (matcher.find()) {
    System.out.println(matcher.group());  // "123-4567"
}

// Find all matches
matcher.reset();  // Reset to start
while (matcher.find()) {
    System.out.println(matcher.group());
}
// Output:
// 123-4567
// 987-6543

String Methods

// matches() - Check if ENTIRE string matches
boolean isPhone = "123-4567".matches("\\d{3}-\\d{4}");
System.out.println(isPhone);  // true

// replaceAll() - Replace all matches
String phone = "(123) 456-7890";
String cleaned = phone.replaceAll("[^\\d]", "");
System.out.println(cleaned);  // "1234567890"

// replaceFirst() - Replace first match
String text = "cat dog cat";
String result = text.replaceFirst("cat", "bird");
System.out.println(result);  // "bird dog cat"

// split() - Split by pattern
String csv = "apple,banana, orange , grape";
String[] fruits = csv.split("\\s*,\\s*");
// ["apple", "banana", "orange", "grape"]

Pattern Flags

import java.util.regex.Pattern;

// Case-insensitive
Pattern pattern = Pattern.compile("hello", Pattern.CASE_INSENSITIVE);

// Multiline
Pattern.compile("^Line 2", Pattern.MULTILINE);

// Dotall (. matches newlines)
Pattern.compile("a.b", Pattern.DOTALL);

// Comments (free-spacing); the text block below requires Java 15+
Pattern.compile("""
    \\d{3}     # area code
    -          # separator
    \\d{3}     # prefix
    -          # separator
    \\d{4}     # line number
    """, Pattern.COMMENTS);

// Combine flags
int flags = Pattern.CASE_INSENSITIVE | Pattern.MULTILINE;
Pattern.compile("pattern", flags);

Named Groups

import java.util.regex.Pattern;
import java.util.regex.Matcher;

Pattern pattern = Pattern.compile("(?<year>\\d{4})-(?<month>\\d{2})-(?<day>\\d{2})");
Matcher matcher = pattern.matcher("2025-01-17");

if (matcher.find()) {
    System.out.println(matcher.group("year"));   // "2025"
    System.out.println(matcher.group("month"));  // "01"
    System.out.println(matcher.group("day"));    // "17"
}

Advanced Replacement

import java.util.regex.Pattern;
import java.util.regex.Matcher;

String text = "I have 5 apples and 10 oranges";
Pattern pattern = Pattern.compile("\\d+");
Matcher matcher = pattern.matcher(text);

StringBuffer result = new StringBuffer();
while (matcher.find()) {
    int num = Integer.parseInt(matcher.group());
    matcher.appendReplacement(result, String.valueOf(num * 2));
}
matcher.appendTail(result);

System.out.println(result);  // "I have 10 apples and 20 oranges"

Go (Golang)

The `regexp` Package

package main

import (
    "fmt"
    "regexp"
)

func main() {
    // Compile pattern
    pattern := regexp.MustCompile(`\d{3}-\d{4}`)
    text := "Contact: 123-4567 or 987-6543"

    // Find first match
    match := pattern.FindString(text)
    fmt.Println(match)  // "123-4567"

    // Find all matches
    matches := pattern.FindAllString(text, -1)
    fmt.Println(matches)  // [123-4567 987-6543]

    // Check if matches
    isMatch := pattern.MatchString("123-4567")
    fmt.Println(isMatch)  // true

    // Replace all
    phone := "(123) 456-7890"
    cleaned := regexp.MustCompile(`[^\d]`).ReplaceAllString(phone, "")
    fmt.Println(cleaned)  // "1234567890"

    // Split
    csv := "apple,banana, orange , grape"
    fruits := regexp.MustCompile(`\s*,\s*`).Split(csv, -1)
    fmt.Println(fruits)  // [apple banana orange grape]
}

Submatches (Groups)

package main

import (
    "fmt"
    "regexp"
)

func main() {
    pattern := regexp.MustCompile(`(\d{4})-(\d{2})-(\d{2})`)
    text := "Date: 2025-01-17"

    // FindStringSubmatch returns [full, group1, group2, ...]
    matches := pattern.FindStringSubmatch(text)
    if matches != nil {
        fmt.Println(matches[0])  // "2025-01-17" (full match)
        fmt.Println(matches[1])  // "2025" (group 1)
        fmt.Println(matches[2])  // "01" (group 2)
        fmt.Println(matches[3])  // "17" (group 3)
    }

    // FindAllStringSubmatch for all matches
    text2 := "Dates: 2025-01-17 and 2024-12-31"
    allMatches := pattern.FindAllStringSubmatch(text2, -1)
    for _, match := range allMatches {
        fmt.Printf("Year: %s, Month: %s, Day: %s\n", match[1], match[2], match[3])
    }
    // Year: 2025, Month: 01, Day: 17
    // Year: 2024, Month: 12, Day: 31
}

Named Groups

package main

import (
    "fmt"
    "regexp"
)

func main() {
    pattern := regexp.MustCompile(`(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})`)
    text := "2025-01-17"

    match := pattern.FindStringSubmatch(text)
    if match != nil {
        // Get named group indices
        names := pattern.SubexpNames()
        result := make(map[string]string)
        for i, name := range names {
            if i != 0 && name != "" {
                result[name] = match[i]
            }
        }
        fmt.Println(result["year"])   // "2025"
        fmt.Println(result["month"])  // "01"
        fmt.Println(result["day"])    // "17"
    }
}

Replacement with Functions

package main

import (
    "fmt"
    "regexp"
    "strconv"
)

func main() {
    text := "I have 5 apples and 10 oranges"
    pattern := regexp.MustCompile(`\d+`)

    result := pattern.ReplaceAllStringFunc(text, func(s string) string {
        num, _ := strconv.Atoi(s)
        return strconv.Itoa(num * 2)
    })

    fmt.Println(result)  // "I have 10 apples and 20 oranges"
}

Ruby

Regex Literals

# Literal notation
pattern = /\d{3}-\d{4}/

# With flags
pattern_ci = /hello/i  # Case-insensitive
pattern_multi = /a.b/m  # m makes . match newlines (^ and $ are always line-based in Ruby)

# Constructor (for dynamic patterns)
pattern = Regexp.new('\d{3}-\d{4}')

String Methods

text = 'Contact: 123-4567 or 987-6543'

# match() - Returns MatchData or nil
match = text.match(/\d{3}-\d{4}/)
if match
  puts match[0]  # "123-4567"
end

# scan() - Find all matches
matches = text.scan(/\d{3}-\d{4}/)
p matches  # ["123-4567", "987-6543"]

# =~ operator - Returns index of first match
index = text =~ /\d{3}-\d{4}/
puts index  # 9

# sub() - Replace first match
result = 'cat dog cat'.sub(/cat/, 'bird')
puts result  # "bird dog cat"

# gsub() - Replace all matches
phone = '(123) 456-7890'
cleaned = phone.gsub(/[^\d]/, '')
puts cleaned  # "1234567890"

# split() - Split by pattern
csv = 'apple,banana, orange , grape'
fruits = csv.split(/\s*,\s*/)
p fruits  # ["apple", "banana", "orange", "grape"]

Capture Groups

pattern = /(\d{4})-(\d{2})-(\d{2})/
match = '2025-01-17'.match(pattern)

if match
  puts match[0]  # "2025-01-17" (full match)
  puts match[1]  # "2025" (group 1)
  puts match[2]  # "01" (group 2)
  puts match[3]  # "17" (group 3)
end

Named Groups

pattern = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/
match = '2025-01-17'.match(pattern)

if match
  puts match[:year]   # "2025"
  puts match[:month]  # "01"
  puts match[:day]    # "17"
end

Replacement with Blocks

text = 'I have 5 apples and 10 oranges'

result = text.gsub(/\d+/) { |num| (num.to_i * 2).to_s }
puts result  # "I have 10 apples and 20 oranges"

# With named groups
pattern = /(?<first>\w+)\s+(?<last>\w+)/
text = 'John Doe'

result = text.gsub(pattern) do |match|
  m = Regexp.last_match
  "#{m[:last].upcase}, #{m[:first]}"
end
puts result  # "DOE, John"

Flags

# i - Case-insensitive
/hello/i.match('HELLO')  # Match

# m - Multiline (. matches newlines)
/a.b/m.match("a\nb")  # Match

# x - Free-spacing (ignore whitespace)
pattern = /
  \d{3}     # area code
  -         # separator
  \d{3}     # prefix
  -         # separator
  \d{4}     # line number
/x

# o - Perform #{...} interpolation only once
pattern = /\d+/o

Visual Studio Code's Find and Replace (Ctrl/Cmd+H) supports regex with powerful transformation capabilities. This section shows 28 practical examples developers use every day.

Accessing Find & Replace

Keyboard Shortcuts:

Find: Ctrl+F (Windows/Linux) / Cmd+F (Mac)
Replace: Ctrl+H (Windows/Linux) / Cmd+H (Mac)
Enable Regex: Click the .* button or press Alt+R (Option+Cmd+R on Mac)

Tips:

Use Ctrl+Alt+Enter (Cmd+Option+Enter on Mac) to replace all
Preview matches before replacing (they highlight in yellow)
Use F3 / Shift+F3 to navigate between matches

Case Transformations

VS Code supports special replacement sequences for case conversion:

Sequence	Effect	Example
`\l`	Lowercase next character	`\l$1`
`\u`	Uppercase next character	`\u$1`
`\L`	Lowercase all following characters	`\L$1`
`\U`	Uppercase all following characters	`\U$1`
`\E`	End case transformation	`\U$1\E`

Example 1: Capitalize First Letter

Find:

\b(\w)(\w*)

Replace:

\u$1$2

Before:

hello world

After:

Hello World

This section provides 25+ ready-to-use regex patterns.

Email Validation

^[\w.-]+@[\w.-]+\.[a-z]{2,}$

Phone Numbers (US)

^(\+?1)?[-.\s]?\(?([0-9]{3})\)?[-.\s]?([0-9]{3})[-.\s]?([0-9]{4})$
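Because the pattern captures the area code, prefix, and line number as groups 2 to 4, it can also normalize input to a single format. Here is a minimal Python sketch; the helper name normalize_us_phone and the output format are illustrative choices.

```python
import re
from typing import Optional

US_PHONE_RE = re.compile(
    r"^(\+?1)?[-.\s]?\(?([0-9]{3})\)?[-.\s]?([0-9]{3})[-.\s]?([0-9]{4})$"
)

def normalize_us_phone(raw: str) -> Optional[str]:
    """Return the number as "(AAA) PPP-LLLL", or None if it does not match."""
    m = US_PHONE_RE.match(raw.strip())
    if m is None:
        return None
    area, prefix, line = m.group(2, 3, 4)
    return f"({area}) {prefix}-{line}"

print(normalize_us_phone("+1 123.456.7890"))  # (123) 456-7890
print(normalize_us_phone("1234567890"))       # (123) 456-7890
print(normalize_us_phone("123-45-6789"))      # None
```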

Dates (ISO 8601)

^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$
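This pattern checks the shape of a date, not the calendar: it accepts 2025-02-31. A reasonable approach is to use the regex as a first filter and let a date library do the real check. The Python sketch below does that with datetime.date.fromisoformat; the helper name is_real_date is hypothetical.

```python
import re
from datetime import date

ISO_DATE_RE = re.compile(r"^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$")

def is_real_date(value: str) -> bool:
    """Shape check with the regex, then a calendar check with datetime."""
    if not ISO_DATE_RE.fullmatch(value):
        return False
    try:
        date.fromisoformat(value)
    except ValueError:
        return False
    return True

# The regex alone accepts an impossible date.
shape_only = ISO_DATE_RE.fullmatch("2025-02-31") is not None

print(shape_only)                  # True
print(is_real_date("2025-02-31"))  # False
print(is_real_date("2024-02-29"))  # True
print(is_real_date("2025-13-01"))  # False
```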

URLs

^https?://(?:www\.)?[-a-zA-Z0-9@:%._+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b

IPv4 Addresses

^(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)$

Hex Colors

^#?([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$

Strong Password

^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$

Username (3-16 chars)

^[a-zA-Z0-9_-]{3,16}$

UUID v4

^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$

And 15+ more patterns with full documentation!

Real-world regex applications in development.

Form Validation

const validators = {
  email: /^[\w.-]+@[\w.-]+\.[a-z]{2,}$/i,
  phone: /^\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}$/
};

Data Extraction

Extract emails from text:

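import re
text = 'Contact admin@example.com or user@test.org'  # sample input
emails = re.findall(r'[\w.-]+@[\w.-]+\.[a-z]{2,}', text)
print(emails)  # ['admin@example.com', 'user@test.org']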

Log Parsing

# Assumes the status code follows the quoted request, as in common access-log formats
pattern = r'^(?P<ip>[\d.]+) .*" (?P<status>\d{3})\b'
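For a fuller picture, here is a Python sketch that parses one access-log line into named fields. It assumes the Common Log Format; the sample line and the field names time, request, and size are illustrative. It also shows why a greedy .+ placed before the status group is risky: it lets the status group land on the last three digits of the line.

```python
import re

LOG_RE = re.compile(
    r'^(?P<ip>\d{1,3}(?:\.\d{1,3}){3}) '  # client IPv4 address
    r'\S+ \S+ '                           # identity and user (often "-")
    r'\[(?P<time>[^\]]+)\] '              # timestamp in brackets
    r'"(?P<request>[^"]*)" '              # request line in quotes
    r'(?P<status>\d{3}) '                 # three-digit status code
    r'(?P<size>\d+|-)$'                   # response size, or "-"
)

line = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
        '"GET /index.html HTTP/1.0" 200 2326')

m = LOG_RE.match(line)
fields = m.groupdict() if m else {}

# A greedy .+ swallows the real status code and backtracks only far enough
# to leave three digits, which here come from the response size.
loose = re.search(r'(?P<ip>[\d.]+).+(?P<status>\d{3})', line)
loose_status = loose.group('status')

print(fields['ip'], fields['status'], fields['size'])  # 127.0.0.1 200 2326
print(loose_status)                                    # 326
```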

Code Refactoring

Convert old API calls in VS Code:

Find: apiClient\.get\('([^']+)'\)
Replace: fetch('$1').then(r => r.json())

Common Mistakes

1. Forgetting to Escape Special Characters

❌ Wrong: file.txt
✅ Correct: file\.txt

2. Greedy vs Lazy

❌ Greedy: <.*> matches entire <div>text</div>
✅ Lazy: <.*?> matches <div> and </div> separately

3. Not Using Anchors

❌ /\d{3}/ matches "123" in "abc123def"
✅ /^\d{3}$/ only matches exactly "123"
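In Python, the same distinction shows up as re.search versus re.fullmatch. The sketch below also covers one gotcha: $ matches just before a trailing newline, so fullmatch is the stricter choice for validation. The inputs are illustrative.

```python
import re

pattern = r"\d{3}"

# search() finds the pattern anywhere in the string.
partial = re.search(pattern, "abc123def")
# fullmatch() requires the whole string to match.
strict = re.fullmatch(pattern, "abc123def")
exact = re.fullmatch(pattern, "123")

# Gotcha: $ also matches right before a final newline.
dollar_with_newline = re.search(r"^\d{3}$", "123\n")
fullmatch_with_newline = re.fullmatch(pattern, "123\n")

print(partial.group())                  # 123
print(strict)                           # None
print(exact.group())                    # 123
print(dollar_with_newline is not None)  # True
print(fullmatch_with_newline)           # None
```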

Performance Tips

Use specific character classes instead of .
Anchor patterns when possible
Avoid nested quantifiers (catastrophic backtracking)
Use atomic groups for performance
Compile patterns for reuse

Debugging

Test on regex101.com
Use verbose mode with comments
Break complex patterns into parts
Test edge cases
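Verbose mode and breaking a pattern into parts go hand in hand. As an illustration, here is the ISO date pattern from this guide rewritten in Python with re.VERBOSE; the group names year, month, and day are added for this sketch.

```python
import re

ISO_DATE = re.compile(
    r"""
    ^
    (?P<year>\d{4}) -                # four-digit year
    (?P<month>0[1-9]|1[0-2]) -       # month 01-12
    (?P<day>0[1-9]|[12]\d|3[01])     # day 01-31
    $
    """,
    re.VERBOSE,  # whitespace is ignored and # starts a comment
)

m = ISO_DATE.match("2024-03-15")
print(m.groupdict())  # {'year': '2024', 'month': '03', 'day': '15'}
```

In verbose mode a literal space or # has to be escaped or placed inside a character class.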

Online Tools

Regex Testers

regex101.com - Best tester with explanations
regexr.com - Visual regex builder
regexpal.com - Simple, fast testing

Visualizers

debuggex.com - Railroad diagrams
regexper.com - Visual tool

Learning Resources

regexone.com - Interactive lessons
regexlearn.com - Step-by-step guide
regular-expressions.info - Documentation

IDE Extensions

Regex Previewer (VS Code)
Regex Tester (VS Code)

Character Classes

Pattern	Matches
`\d`	Digit [0-9]
`\w`	Word [a-zA-Z0-9_]
`\s`	Whitespace
`.`	Any character except a newline (unless the `s` flag is set)

Quantifiers

Pattern	Meaning
`*`	0 or more
`+`	1 or more
`?`	0 or 1
`{n}`	Exactly n

Anchors

Pattern	Meaning
`^`	Start of string (or of each line with the `m` flag)
`$`	End of string (or of each line with the `m` flag)
`\b`	Word boundary

Flags

Flag	Meaning
`i`	Case-insensitive
`g`	Global
`m`	Multiline
`s`	Dotall

General Questions

Q: What is the difference between greedy and lazy quantifiers?

A: Greedy matches as much as possible. Lazy (*?, +?) matches as little as possible.

Q: How do I match a literal dot?

A: Escape it with backslash: \.

Q: What is catastrophic backtracking?

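A: It happens when the engine tries an exponentially growing number of ways to match a pattern before giving up, which can make a single match extremely slow. Avoid nested quantifiers like (a+)+.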
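The usual fix is to rewrite the pattern so that each character can be matched in only one way. The Python sketch below checks, on short inputs only, that a flat pattern gives the same verdict as the nested one. It deliberately avoids long non-matching inputs, where the nested form can take a very long time in a backtracking engine.

```python
import re

NESTED = re.compile(r"^(a+)+$")  # nested quantifier: many ways to split a run of a's
FLAT = re.compile(r"^a+$")       # equivalent verdict, one way to match

def same_verdict(sample: str) -> bool:
    """Return True if both patterns agree on whether sample matches."""
    return bool(NESTED.match(sample)) == bool(FLAT.match(sample))

# Short inputs only, on purpose.
for sample in ["a", "aaaa", "", "aab", "b"]:
    print(repr(sample), same_verdict(sample))
```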

Q: Can regex validate email perfectly?

A: No. Use regex for a basic format check, then verify the address by sending a confirmation email.

Q: How do I match across multiple lines?

A: Use the s flag, or use [\s\S]* instead of .*.
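In Python the s flag is spelled re.DOTALL. A short sketch of both options, using a made-up two-line string:

```python
import re

text = "<p>first line\nsecond line</p>"

# Without the flag, the dot stops at the newline and nothing matches.
print(re.search(r"<p>(.*)</p>", text))  # None

# Option 1: the s flag (re.DOTALL in Python) lets the dot match newlines.
with_flag = re.search(r"<p>(.*)</p>", text, re.DOTALL)
print(repr(with_flag.group(1)))  # 'first line\nsecond line'

# Option 2: [\s\S] matches any character regardless of flags.
with_class = re.search(r"<p>([\s\S]*)</p>", text)
print(repr(with_class.group(1)))  # 'first line\nsecond line'
```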

VS Code Specific

Q: How do I replace with uppercase in VS Code?

A: Use \u (uppercase next), \U (uppercase all).

Q: Can I use regex in VS Code file search?

A: Yes! Press Ctrl+Shift+F and enable regex (Alt+R).

Regex Data Extractor Chrome Extension

Our Regex Data Extractor helps you extract data from web pages using patterns from this guide.

Key Features

Pattern Library: Pre-built patterns
Live Testing: Test regex on any webpage
Multi-Format Export: CSV, JSON, Excel, PDF
Batch Extraction: Extract from multiple pages

Example: Extract Emails

Install Regex Data Extractor
Navigate to any webpage
Click the extension icon
Enter pattern: [\w.-]+@[\w.-]+\.[a-z]{2,}
Click "Extract"
Export to CSV/JSON

Example: Extract Prices

Pattern: \$([0-9]{1,3}(?:,?[0-9]{3})*\.[0-9]{2})
Matches: $1,234.56, $99.99 (group 1 captures the amount without the $ sign)
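To check the pattern before using it in the extension, it can be run against sample text. This Python sketch uses the standard re module and a made-up sentence. It shows the difference between the full match and capture group 1.

```python
import re

PRICE = re.compile(r"\$([0-9]{1,3}(?:,?[0-9]{3})*\.[0-9]{2})")

text = "Sale: $1,234.56 today, was $99.99; shipping $5"  # made-up sample

# findall returns only group 1, so the dollar sign is dropped.
amounts = PRICE.findall(text)
print(amounts)  # ['1,234.56', '99.99']

# finditer gives access to the full match, dollar sign included.
full = [m.group(0) for m in PRICE.finditer(text)]
print(full)  # ['$1,234.56', '$99.99']
```

Prices without cents, such as $5, are skipped because the pattern requires a decimal point and two digits.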

Pro Tips

Save frequently used patterns
Use named groups for structured data
Test patterns first
Export for data analysis

