Regular Expressions
Ben Simpson - <3 HUB
Introductions
● Working with web technologies for 10 years
● Former HUB supervisor
● Tour de jobs: http://tinyurl.com/kmsns38
● Graduated from CSU with a BAS in Technology Management 2013
● Husband and proud father
● Presenter on regular expressions!
What Is a Regular
Expression?
Pattern matching
What Could I Do With a RegExp?
● Searching
● Syntax highlighting
● Data validation
● Sanitation
● Data queries / extraction
● Many tasks that require matching a pattern
RegExps Won’t Let You Time Travel
Brain Teaser
Which of the following is a valid telephone
number?
1. 678 466 4000
2. (678) 466-4000
3. 1234
4. domain\user
5. 1 (800) 1234 567
How did you know?
Depends on who you ask...
We Pattern Match Every Day
● Telephone numbers follow a pattern that we
recognize
● This pattern has rules (3 digit area code, 7 digit
number, numeric only)
● There are often many variations to a pattern
(optional intl code)
Literal Characters
String: The cat in the hat
RegExp: /at/
The c[at] in the h[at] (occurrences of "at" shown in brackets)
Regular Expressions in JavaScript
var haystack = "The cat in the hat";
var needle = /cat/; // regex literal; new RegExp(/cat/) also works but is redundant
haystack.match(needle); // truthy (an array describing the match)
needle = /dog/;
haystack.match(needle); // falsy (null)
Well that wasn’t so bad
The best is yet to come!
Special Characters (Metacharacters)
● \ - escape character
● ^ - beginning of line (at the start of a bracketed set it negates the set instead)
● $ - ending of line
● . - wildcard (any single character)
● | - or junction (alternation)
● ? - zero or one
● * - zero or more
● + - one or more
● () - grouping
● [] - character set
● {} - repetition
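A minimal JavaScript sketch of the metacharacters listed above. The sample patterns and input strings in the `samples` table are made up for illustration and do not come from the original slides.

```javascript
// Each entry: [description, pattern, input, expected result of pattern.test(input)].
// All patterns and inputs here are hypothetical examples.
const samples = [
  ["\\ escapes a metacharacter", /3\.14/, "3.14", true],
  ["\\ escapes a metacharacter", /3\.14/, "3x14", false],
  ["^ anchors to the beginning", /^cat/, "cat in the hat", true],
  ["^ anchors to the beginning", /^hat/, "cat in the hat", false],
  ["$ anchors to the end", /hat$/, "cat in the hat", true],
  [". matches any one character", /c.t/, "cut", true],
  ["| matches either side", /cat|dog/, "hot dog", true],
  ["? means zero or one", /^colou?r$/, "color", true],
  ["* means zero or more", /^ab*c$/, "ac", true],
  ["+ means one or more", /^ab+c$/, "ac", false],
  ["() groups a sub-pattern", /^(ha)+$/, "hahaha", true],
  ["[] matches one character from a set", /^[ch]at$/, "hat", true],
  ["{} repeats an exact number of times", /^a{3}$/, "aaa", true],
];

for (const [description, pattern, input, expected] of samples) {
  const actual = pattern.test(input);
  const verdict = actual === expected ? "as expected" : "UNEXPECTED";
  console.log(`${description}: ${pattern} on "${input}" -> ${actual} (${verdict})`);
}
```

Comparing the `*` and `+` rows is a quick way to see the difference between "zero or more" and "one or more": the same input `ac` passes one and fails the other.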
Regular expression presentation for the HUB
Demonstration of Special Characters
String: ...To login to your email use the
username: "ben.simpson@mail.com" with a
password "password123"...
RegExp: /username: "(.*)" .* password "(.*)"/
Results: 1. ben.simpson@mail.com
2. password123
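The demonstration can be tried in JavaScript along these lines. This is a sketch that assumes the sample text uses straight double quotes and that the pattern includes the colon after "username"; the variable names are illustrative.

```javascript
// The sample text from the slide, written with straight double quotes (an assumption).
const text =
  'To login to your email use the username: "ben.simpson@mail.com" with a password "password123"';

// Each (.*) is a capture group; match() returns the groups at indexes 1 and 2.
const pattern = /username: "(.*)" .* password "(.*)"/;
const result = text.match(pattern);

if (result !== null) {
  const [, username, password] = result;
  console.log(username); // ben.simpson@mail.com
  console.log(password); // password123
}
```

Because `.*` is greedy, text containing additional quoted phrases could capture more than intended; `"([^"]*)"` is a common tighter alternative.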
Shorthand Character Classes
● \d - digit [0-9]
● \w - word character
● \s - whitespace

● \D - non-digit [^\d]
● \W - non-word character [^\w]
● \S - non-whitespace [^\s]
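A short JavaScript sketch of the shorthand classes in use. The `sample` string is invented for illustration; the `g` flag (not covered on the slides) asks for every match rather than only the first.

```javascript
// Hypothetical sample text.
const sample = "Call 678 466 4357 now";

// \d matches a single digit; with the g flag, match() returns every digit.
const digits = sample.match(/\d/g).join(""); // "6784664357"

// \w+ matches runs of word characters (letters, digits, underscore).
const words = sample.match(/\w+/g); // ["Call", "678", "466", "4357", "now"]

// \s+ matches runs of whitespace, so it works as a separator for split().
const parts = sample.split(/\s+/); // ["Call", "678", "466", "4357", "now"]

// The uppercase shorthands negate: \D is any non-digit, \W any non-word character.
const digitsOnly = sample.replace(/\D/g, ""); // "6784664357"
const noSeparators = sample.replace(/\W/g, ""); // "Call6784664357now"

console.log(digits, words, parts, digitsOnly, noSeparators);
```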
Wait a Second!
You said this was easy
Thinking about a Telephone Pattern
● Optional international code
● 3 digit area code
● 7 digit number
● Optional extension
● What about alpha phrases? (e.g. 678 466-HELP)
● What is the length of intl codes? (e.g. 358 for Finland)
● Are parentheses optional?
● Is spacing optional?
● Country specific formats (e.g. France 06 87 71 23 45)
Regular Expression - Telephone #
String: 678 466 4357
RegExp: \d{3} \d{3} \d{4}
String: (678) 466-4357
RegExp: \(\d{3}\) \d{3}-\d{4}
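A sketch of the two single-format patterns in JavaScript, written with `\d` for digits and escaped parentheses. The `^` and `$` anchors are an addition not shown on the slide, so that the whole string must fit the pattern; the variable names are illustrative.

```javascript
// "678 466 4357": three digits, space, three digits, space, four digits.
const spaced = /^\d{3} \d{3} \d{4}$/;

// "(678) 466-4357": the parentheses are escaped so they match literally
// instead of forming a group.
const parenthesized = /^\(\d{3}\) \d{3}-\d{4}$/;

console.log(spaced.test("678 466 4357")); // true
console.log(parenthesized.test("(678) 466-4357")); // true

// Each pattern accepts only its own format.
console.log(spaced.test("(678) 466-4357")); // false
console.log(parenthesized.test("678 466 4357")); // false
```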
Telephone # - Two Variations
String: 678 466 4357
(678) 466-4357
RegExp: \(?\d{3}\)? \d{3}[\s-]\d{4}
Telephone # - Three Variations
String: 678 466 4357
(678) 466-4357
1 (678) 466-4357
RegExp: \d*\s?\(?\d{3}\)? \d{3}[\s-]?\d{4}
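A sketch that runs the three-variation pattern against the numbers from the earlier brain teaser. It assumes the pattern is written with backslashes (`\d`, `\s`, `\(`, `\)`) and adds `^` and `$` anchors, which are not on the slide, so that the entire string must match.

```javascript
// The three-variation pattern, anchored to the whole string (anchors are an addition).
const phone = /^\d*\s?\(?\d{3}\)? \d{3}[\s-]?\d{4}$/;

// Candidates from the brain teaser, plus the third variation and one malformed input.
const candidates = [
  "678 466 4000", // true
  "(678) 466-4000", // true
  "1 (678) 466-4357", // true
  "1234", // false: too short
  "domain\\user", // false: not numeric
  "1 (800) 1234 567", // false: digits grouped 4 then 3
  "(678 466 4357", // true, although the parenthesis is never closed
];

for (const candidate of candidates) {
  console.log(`${candidate} -> ${phone.test(candidate)}`);
}
```

The last candidate shows a weakness: `\(?` and `\)?` are optional independently of each other, so an unbalanced parenthesis still passes.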
That Escalated Quickly
Surprisingly Difficult
● Seemingly simple patterns can become very
complex.
● It's best to work against data that is
consistent, or regular in its implementation of
patterns
● If the data is too dirty, a regular expression
won’t be much help
When RegExps Go Bad
● Websites that don’t accept special
characters in email addresses, URLs,
telephone numbers, etc
● The cause may be RegExps that are too restrictive
● They don't take into account all variations of a
pattern
● Longer expressions are difficult to grok
In a Nutshell
“Some people, when confronted with a
problem, think ‘I know, I'll use regular
expressions.’ Now they have two problems.”
-Jamie Zawinski
Brain Teaser
Which of the following is a valid email address?
1. thehoagie@gmail.com
2. ben.simpson+work@analoganalytics.com
3. ben+email
4. http://www.clayton.edu
5. abc."defghi".xyz@example.com
Thinking about Email Address
● Has a local part (e.g. thehub in thehub@clayton.edu)
● Has a domain part (e.g. clayton.edu in
thehub@clayton.edu)
● Has an @ symbol in the middle
● Do we need to support special characters?
● Can we verify based on minimum /
maximum length?
Best to Keep It Simple!
String: thehoagie@gmail.com
RegExp: .*@.*
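A sketch of the simple pattern applied to the addresses from the brain teaser. The `stricter` variant, which requires at least one character on each side of the `@`, is a hypothetical refinement and not part of the original slide.

```javascript
// The pattern from the slide: anything, an @, anything.
const simpleEmail = /.*@.*/;

// Hypothetical refinement: at least one character on each side of the @.
const stricter = /^.+@.+$/;

const addresses = [
  "thehoagie@gmail.com", // true
  "ben.simpson+work@analoganalytics.com", // true
  "ben+email", // false: no @
  "http://www.clayton.edu", // false: no @
  'abc."defghi".xyz@example.com', // true
];

for (const address of addresses) {
  console.log(`${address} -> ${simpleEmail.test(address)}`);
}

// Because * allows zero characters, a lone "@" passes the slide's pattern.
console.log(simpleEmail.test("@")); // true
console.log(stricter.test("@")); // false
```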
Yeah, but isn’t there an official email Regex that
takes all the patterns into account? Yes...
RFC 5322 - The Email RegExp
(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*
| "(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]
| \\[\x01-\x09\x0b\x0c\x0e-\x7f])*")
@ (?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?
| \[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:
(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]
| \\[\x01-\x09\x0b\x0c\x0e-\x7f])+)
\])
Maybe this instead?

(╯°□°)╯︵ ┻━┻
┬─┬ ノ( ゜-゜ノ)

(Let me put that back for you)
Brain Teaser
Which is a valid zipcode?
1. 30022
2. 30022-7155
3. 300131
4. -7155
5. AB123XY
Thinking About a Zipcode
● Digits only
● 5 digits mandatory plus optional 4 digit code
● 4 digit code prefixed with a hyphen
● Do other countries use zip codes?
● Pattern is easier because there is less
variation (Thank USPS!)
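The rules above translate fairly directly into a pattern. This is one possible sketch, not a pattern from the slides: five mandatory digits, then an optional group made of a hyphen and four digits, anchored to the whole string.

```javascript
// Five digits, optionally followed by a hyphen and four more digits.
const zipcode = /^\d{5}(-\d{4})?$/;

// Candidates from the zipcode brain teaser.
const zips = [
  "30022", // true
  "30022-7155", // true
  "300131", // false: six digits
  "-7155", // false: the four digit code cannot stand alone
  "AB123XY", // false: not digits only
];

for (const zip of zips) {
  console.log(`${zip} -> ${zipcode.test(zip)}`);
}
```

Grouping the hyphen together with the four digits in `(-\d{4})?` makes them optional as a unit, so `30022-` and `300227155` are both rejected.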
Brain Teaser
Which is a valid URL?
1. http://www.clayton.edu
2. www.clayton.edu
3. clayton.edu
4. thehub.clayton.edu
5. ben:pass@clayton.edu:80/foo?bar=baz#qux
Thinking about a URL
Ben Simpson
thehoagie@gmail.com
@mrfrosti
Extra Credit
● IP address
● HTML Tag contents
● Validating a password against requirements
● Dates
● Times
