Cheat Sheet
Updated: 09/16
* Matches at least 0 times
+ Matches at least 1 time
? Matches at most 1 time; optional string
{n} Matches exactly n times
{n,} Matches at least n times
{,n} Matches at most n times
{n,m} Matches between n and m times
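A minimal sketch of how the quantifiers above behave in base R. The test vector x and the variable names are made up for illustration; each pattern is anchored with ^ and $ so the whole string has to match.

```r
# Hypothetical test vector: "a" followed by zero to four "b"s.
x <- c("a", "ab", "abb", "abbb", "abbbb")

star    <- grepl("^ab*$", x)      # b zero or more times: all five match
plus    <- grepl("^ab+$", x)      # b at least once: everything except "a"
opt     <- grepl("^ab?$", x)      # b at most once: "a" and "ab"
exact   <- grepl("^ab{2}$", x)    # exactly two b: "abb"
atleast <- grepl("^ab{2,}$", x)   # two or more b: "abb", "abbb", "abbbb"
between <- grepl("^ab{2,3}$", x)  # two or three b: "abb", "abbb"

x[between]
```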
> string <- c("Hiphopopotamus", "Rhymenoceros", "time for bottomless lyrics")
> pattern <- "t.m"
grep(pattern, string)
[1] 1 3
grep(pattern, string, value = TRUE)
[1] "Hiphopopotamus"
[2] "time for bottomless lyrics"
grepl(pattern, string)
[1] TRUE FALSE TRUE
stringr::str_detect(string, pattern)
[1] TRUE FALSE TRUE
regexpr(pattern, string)
find starting position and length of first match
gregexpr(pattern, string)
find starting position and length of all matches
stringr::str_locate(string, pattern)
find starting and end position of first match
stringr::str_locate_all(string, pattern)
find starting and end position of all matches
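As a rough illustration of what the base locating functions return, the sketch below reuses the string and pattern from the example above. regexpr() reports -1 where nothing matches; the variable names first and all_third are made up here.

```r
string  <- c("Hiphopopotamus", "Rhymenoceros", "time for bottomless lyrics")
pattern <- "t.m"

# First match per element: start position, with the length as an attribute.
first <- regexpr(pattern, string)
first_start  <- as.integer(first)                        # -1 means no match
first_length <- as.integer(attr(first, "match.length"))

# All matches: a list with one entry per element of string.
all_third <- gregexpr(pattern, string)[[3]]
third_starts <- as.integer(all_third)                    # "tim" and "tom"
```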
regmatches(string, regexpr(pattern, string))
extract first match [1] "tam" "tim"
regmatches(string, gregexpr(pattern, string))
extracts all matches, outputs a list
[[1]] "tam" [[2]] character(0) [[3]] "tim" "tom"
stringr::str_extract(string, pattern)
extract first match [1] "tam" NA "tim"
stringr::str_extract_all(string, pattern)
extract all matches, outputs a list
stringr::str_extract_all(string, pattern, simplify = TRUE)
extract all matches, outputs a matrix
stringr::str_match(string, pattern)
extract first match + individual character groups
stringr::str_match_all(string, pattern)
extract all matches + individual character groups
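For readers without stringr, a comparable result to str_match() can be sketched in base R with regexec() and regmatches(). This is an alternative technique, not the stringr call itself; the input vector and the year-month pattern are hypothetical.

```r
# Hypothetical input: one string containing a year-month, one without.
x <- c("released 2016-09", "no date here")

# regexec() records the full match and every parenthesised group.
m <- regmatches(x, regexec("([0-9]{4})-([0-9]{2})", x))

with_date    <- m[[1]]  # full match first, then group 1 and group 2
without_date <- m[[2]]  # character(0) when there is no match
```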
sub(pattern, replacement, string)
replace first match
gsub(pattern, replacement, string)
replace all matches
stringr::str_replace(string, pattern, replacement)
replace first match
stringr::str_replace_all(string, pattern, replacement)
replace all matches
strsplit(string, pattern) or stringr::str_split(string, pattern)
split string at each match of pattern
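A short sketch of replacing and splitting with the base functions listed above. The replacement "0" and the space pattern are arbitrary choices for illustration.

```r
word <- "Hiphopopotamus"

first_only <- sub("o", "0", word)   # only the first "o" is replaced
every_one  <- gsub("o", "0", word)  # every "o" is replaced

# strsplit() returns a list; [[1]] picks the pieces for the first string.
pieces <- strsplit("time for bottomless lyrics", " ")[[1]]
```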
^ Start of the string
$ End of the string
\\b Empty string at either edge of a word
\\B NOT the edge of a word
\\< Beginning of a word
\\> End of a word
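The anchors can be tried out as follows. This is a sketch with a made-up vector, using R's default regex engine (no perl = TRUE); backslashes are doubled because the patterns are written as R string literals.

```r
# Hypothetical test vector.
x <- c("cat", "concat", "cat food", "bobcat")

at_start   <- grepl("^cat", x)        # string begins with "cat"
at_end     <- grepl("cat$", x)        # string ends with "cat"
whole_word <- grepl("\\bcat\\b", x)   # "cat" with a word edge on both sides
word_start <- grepl("\\<cat", x)      # "cat" at the beginning of a word
word_end   <- grepl("cat\\>", x)      # "cat" at the end of a word
```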
[[:digit:]] or \\d Digits; [0-9]
\\D Non-digits; [^0-9]
[[:lower:]] Lower-case letters; [a-z]
[[:upper:]] Upper-case letters; [A-Z]
[[:alpha:]] Alphabetic characters; [A-Za-z]
[[:alnum:]] Alphanumeric characters; [A-Za-z0-9]
\\w Word characters; [A-Za-z0-9_]
\\W Non-word characters
[[:xdigit:]] Hexadec. digits; [0-9A-Fa-f]
[[:blank:]] Space and tab
[[:space:]] or \\s Space, tab, vertical tab, newline,
form feed, carriage return
\\S Not space; [^[:space:]]
[[:punct:]] Punctuation characters;
!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
[[:graph:]]
Graphical char.;
[[:alnum:][:punct:]]
[[:print:]]
Printable characters;
[[:graph:]] plus the space character
[[:cntrl:]] Control characters; \n, \r etc.
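A few of the character classes in action, as a sketch on a made-up string. POSIX classes such as [:digit:] only work inside a bracket expression, hence the double brackets.

```r
# Hypothetical input string.
x <- "Room 42, Floor 7!"

digits_only <- gsub("[^[:digit:]]", "", x)                       # drop non-digits
numbers     <- regmatches(x, gregexpr("[[:digit:]]+", x))[[1]]   # runs of digits
no_punct    <- gsub("[[:punct:]]", "", x)                        # strip punctuation
underscored <- gsub("\\s", "_", x)                               # whitespace to "_"
capitals    <- regmatches(x, gregexpr("[[:upper:]]", x))[[1]]    # upper-case letters
```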
. Any character except \n
| Or, e.g. (a|b)
[…] List permitted characters, e.g. [abc]
[a-z] Specify character ranges
[^…] List excluded characters
(…) Grouping, enables back referencing using
\\N where N is an integer
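Alternation, grouping and back references can be sketched like this. The names and inputs are illustrative only; \\1 and \\2 refer to the first and second parenthesised group.

```r
# Alternation inside a group: the whole string is "cat" or "dog".
pets <- grepl("^(cat|dog)$", c("cat", "dog", "cow"))

# Back references in the replacement: swap two captured words.
swapped <- sub("^(\\w+) (\\w+)$", "\\2, \\1", "Ada Lovelace")

# Back reference in the pattern itself: any character repeated immediately.
doubled <- grepl("(.)\\1", c("hello", "world"))
```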
\n New line
\r Carriage return
\t Tab
\v Vertical tab
\f Form feed
(?=) Lookahead (requires perl = TRUE),
e.g. (?=yx): position followed by 'yx'
(?!) Negative lookahead (perl = TRUE);
position NOT followed by pattern
(?<=) Lookbehind (perl = TRUE), e.g.
(?<=yx): position following 'yx'
(?<!)
Negative lookbehind (perl = TRUE);
position NOT following pattern
(?(if)then) If-then-condition (perl = TRUE); use
lookaheads, optional char. etc in if-clause
(?(if)then|else) If-then-else-condition (perl = TRUE)
*see, e.g. http://www.regular-expressions.info/lookaround.html
http://www.regular-expressions.info/conditional.html
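The four lookarounds can be sketched in base R as below; all of them need perl = TRUE. The price strings are invented for the example, and conditionals are left out to keep it short.

```r
# Hypothetical input.
x <- c("price: 100 USD", "price: 200 EUR")

# Lookahead: digits that are followed by " USD" (non-matching elements drop out).
usd_amount <- regmatches(x, regexpr("[0-9]+(?= USD)", x, perl = TRUE))

# Lookbehind: digits that follow "price: ".
amounts <- regmatches(x, regexpr("(?<=price: )[0-9]+", x, perl = TRUE))

# Negative lookahead: digits and a space NOT followed by "USD".
not_usd <- grepl("[0-9]+ (?!USD)", x, perl = TRUE)

# Negative lookbehind: replace the first "b" that does NOT follow an "a".
replaced <- sub("(?<!a)b", "X", "abcb", perl = TRUE)
```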
By default R uses POSIX extended regular
expressions. You can switch to PCRE regular
expressions using perl = TRUE for base.
stringr uses ICU regular expressions, which
already support lookarounds; its perl()
wrapper is deprecated.
All functions can be used with literal searches
using fixed = TRUE for base or by wrapping
patterns with fixed() for stringr.
The base matching and replacement functions
(grep, grepl, sub, gsub, regexpr, gregexpr) can be
made case insensitive by specifying ignore.case = TRUE.
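A small sketch of literal and case-insensitive matching in base R. Note that the argument is spelled ignore.case; the inputs are arbitrary.

```r
# As a regex, "." matches any character; with fixed = TRUE it is a literal dot.
dot_regex   <- grepl(".", "abc")
dot_literal <- grepl(".", "abc", fixed = TRUE)
dot_present <- grepl(".", "a.c", fixed = TRUE)

# Matching is case sensitive unless ignore.case = TRUE is given.
case_sensitive   <- grepl("hip", "Hiphopopotamus")
case_insensitive <- grepl("hip", "Hiphopopotamus", ignore.case = TRUE)
```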
Metacharacters (. * + etc.) can be used as
literal characters by escaping them. Characters
can be escaped using \\ or by enclosing them
in \\Q...\\E.
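Escaping can be sketched as follows. In an R string literal the backslash itself must be doubled, so the regex \+ is typed "\\+". The \\Q...\\E form is shown with perl = TRUE, which is an assumption made here to stay on the PCRE engine.

```r
x <- "1+1=2"

unescaped <- grepl("1+1", x)     # "+" is a quantifier: looks for "11", "111", ...
escaped   <- grepl("1\\+1", x)   # "\\+" is a literal plus sign
quoted    <- grepl("\\Q1+1\\E", x, perl = TRUE)  # everything between \Q and \E is literal

no_dots <- gsub("\\.", "", "a.b.c")  # remove literal dots
```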
By default the asterisk * is greedy, i.e. it always
matches the longest possible string. It can be
used in lazy mode by adding ?, i.e. *?.
Greedy mode can be turned off using (?U). This
switches the syntax, so that (?U)a* is lazy and
(?U)a*? is greedy.
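The difference between greedy and lazy matching is easiest to see on markup-like text. This is a sketch on an invented string; the (?U) line uses perl = TRUE.

```r
# Hypothetical input.
x <- "<b>bold</b> and <i>italic</i>"

greedy <- sub("<.*>", "", x)    # .* runs from the first "<" to the last ">"
lazy   <- sub("<.*?>", "", x)   # .*? stops at the first ">"

# (?U) inverts the default, so a plain * behaves lazily.
ungreedy <- sub("(?U)<.*>", "", x, perl = TRUE)

# Lazy matching picks out each tag separately.
tags <- regmatches(x, gregexpr("<.*?>", x))[[1]]
```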
Regular expressions can be made case insensitive
using (?i). In backreferences, the strings can be
converted to lower or upper case using \\L or \\U
(e.g. \\L\\1). This requires perl = TRUE.
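A brief sketch of (?i) and of case conversion in replacements, using gsub() with perl = TRUE. The input strings are made up.

```r
# (?i) makes the rest of the pattern case insensitive.
greeting <- gsub("(?i)hello", "bye", "Hello hello", perl = TRUE)

# \\U and \\L convert the back-referenced text to upper or lower case.
shout   <- gsub("(\\w+)", "\\U\\1", "hello world", perl = TRUE)
whisper <- gsub("(\\w+)", "\\L\\1", "Hello World", perl = TRUE)

# Capitalise only the first letter of each word.
title <- gsub("\\b(\\w)", "\\U\\1", "hello world", perl = TRUE)
```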
CC BY Ian Kopacka • ian.kopacka@ages.at
Regular expressions can conveniently be
created using rex::rex().

Reg ex cheatsheet
