BUILDING PROFANITY FILTERS
clbuttic sh!t
Framework Days. IT Saturday. 5.09.2015
INTERNET CENSORSHIP
children
religion
sexual
https://en.wikipedia.org/wiki/Censorship
WHY FILTERING IS GOOD
PROTECTS
CHILDREN
ETHNICITY
RELIGION
SEXUAL ORIENTATION
WHY FILTERING IS BAD
lack of trust in your users →
their willingness to break rules
COCK
LET’S BUILD A FILTER!
filter =
list of dirty words +
list of replacements +
filter rule
“Seven Words You Can Never Say on Television”
Shit, piss, fuck, cunt, cocksucker,
motherfucker, and tits.
– George Carlin, 1972
FILTER RULES
1. search by entry
psss…
RANGE OF WORD
NSRange range = [text rangeOfString:badWord options:NSCaseInsensitiveSearch];
BOOL hasDirtyWord = [text localizedCaseInsensitiveContainsString:badWord];
- (NSArray *)rangesOfBadWordsWithSpaceInString:(NSString *)text {
    __block NSMutableArray * result = [NSMutableArray array];
    [self.listOfBadWordsWithSpace enumerateObjectsUsingBlock:^(NSString * badWord, NSUInteger idx, BOOL * stop) {
        NSRange range = [text rangeOfString:badWord options:NSCaseInsensitiveSearch];
        while (range.location != NSNotFound) {
            [result addObject:[NSValue valueWithRange:range]];
            // continue the search one character past the start of the last match
            NSRange nextRange = NSMakeRange(range.location + 1, [text length] - range.location - 1);
            range = [text rangeOfString:badWord options:NSCaseInsensitiveSearch range:nextRange];
        }
    }];
    return result;
}
SEARCH BY ENTRY
Get your ass down here! The grass
around the creek was new, giving it a
velvety look. Dusty, his heartless
assassin, had found his mate.
FALSE POSITIVES
a substring search also matches "ass" inside "grass" and "assassin"
‘ASS’ WORDS
assart
assault
association
assurance
harassment
hassle
hourglass
impassable
pass
passion
piassaba
preassign
1250 words found
http://www.morewords.com/contains/ass/
REPLACEMENT RULES…
ass → butt
…FAIL
classic → clbuttic
AND FAIL AGAIN…
Constitution → Consbreastution
medieval → medireview
Tyson Gay → Tyson Homosexual
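The "clbuttic" failure comes from replacing a substring wherever it occurs. A minimal sketch of the difference, assuming two hypothetical helpers (`ReplaceEverywhere` and `ReplaceWholeWords` are illustration names, not part of the talk's code) built on standard Foundation calls:

```objc
#import <Foundation/Foundation.h>

// Naive rule: replace every occurrence, even inside other words.
static NSString *ReplaceEverywhere(NSString *text, NSString *badWord, NSString *replacement) {
    return [text stringByReplacingOccurrencesOfString:badWord
                                           withString:replacement
                                              options:NSCaseInsensitiveSearch
                                                range:NSMakeRange(0, text.length)];
}

// Hypothetical safer variant: replace only when the match is a whole word.
static NSString *ReplaceWholeWords(NSString *text, NSString *badWord, NSString *replacement) {
    // \b anchors the escaped bad word at word boundaries on both sides.
    NSString *pattern = [NSString stringWithFormat:@"\\b%@\\b",
                         [NSRegularExpression escapedPatternForString:badWord]];
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:NULL];
    // Escape the replacement so "$" or "\" in it are not read as template syntax.
    NSString *safeReplacement = [NSRegularExpression escapedTemplateForString:replacement];
    return [regex stringByReplacingMatchesInString:text
                                           options:0
                                             range:NSMakeRange(0, text.length)
                                      withTemplate:safeReplacement];
}
```

With these helpers, `ReplaceEverywhere(@"classic", @"ass", @"butt")` reproduces "clbuttic", while `ReplaceWholeWords` leaves "classic" alone. Whole-word replacement does not help with "Tyson Gay", where the matched word really is a whole word; that case needs an exception list.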
FILTER RULES
1. search by entry
2. search whole word
don’t u know me?
SEARCH WHOLE WORD
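NSSet * badWordsSet = [NSSet setWithArray:self.listOfBadWords];
NSScanner * scanner = [NSScanner scannerWithString:text];
NSCharacterSet * wordCharacters = [NSCharacterSet alphanumericCharacterSet];
NSMutableArray * result = [NSMutableArray array];
// repeat the scan step until the whole text is consumed
while (![scanner isAtEnd]) {
    NSString * scanned;
    if ([scanner scanCharactersFromSet:wordCharacters intoString:&scanned]) {
        // compare in lowercase; assumes the dictionary stores lowercase words
        if ([badWordsSet containsObject:[scanned lowercaseString]]) {
            NSRange range = NSMakeRange(scanner.scanLocation - scanned.length, scanned.length);
            [result addObject:[NSValue valueWithRange:range]];
        }
    } else {
        // not at a word: skip punctuation up to the next word character
        [scanner scanUpToCharactersFromSet:wordCharacters intoString:NULL];
    }
}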
Get your ass down here! The grass
around the creek was new, giving it a
velvety look. Dusty, his heartless
assassin, had found his mate.
SPACE! AND OTHER PUNCTUATION
Get your a s s down here! You'd
probably fire my a.s.s the first day on
the job. You've covered my a_s_s
every time I screwed up.
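One possible way to catch spelled-out words such as "a s s", "a.s.s" and "a_s_s" is to find runs of single characters split by one separator each, remove the separators, and look the result up in the dictionary. The sketch below is an assumption, not the talk's method (the talk lists such variants in a phrases dictionary instead). `RangesOfSpelledOutWords` is a hypothetical name, and it handles ASCII letters and digits only.

```objc
#import <Foundation/Foundation.h>

// Returns ranges of spelled-out bad words such as "a s s", "a.s.s" or "a_s_s".
// badWords is assumed to contain lowercase entries.
static NSArray<NSValue *> *RangesOfSpelledOutWords(NSString *text, NSSet<NSString *> *badWords) {
    // Single letters or digits, each separated by exactly one other character,
    // and not attached to a longer word on either side.
    NSString *pattern =
        @"(?<![A-Za-z0-9])[A-Za-z0-9](?:[^A-Za-z0-9][A-Za-z0-9])+(?![A-Za-z0-9])";
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:NULL];
    NSMutableArray<NSValue *> *result = [NSMutableArray array];
    [regex enumerateMatchesInString:text
                            options:0
                              range:NSMakeRange(0, text.length)
                         usingBlock:^(NSTextCheckingResult *match, NSMatchingFlags flags, BOOL *stop) {
        NSString *candidate = [text substringWithRange:match.range];
        // Drop the separators: "a.s.s" becomes "ass".
        NSString *collapsed =
            [candidate stringByReplacingOccurrencesOfString:@"[^A-Za-z0-9]"
                                                 withString:@""
                                                    options:NSRegularExpressionSearch
                                                      range:NSMakeRange(0, candidate.length)];
        if ([badWords containsObject:collapsed.lowercaseString]) {
            [result addObject:[NSValue valueWithRange:match.range]];
        }
    }];
    return result;
}
```

The match is greedy, so a spelled-out word directly followed by another single letter ("a s s a") collapses to a longer string and is missed; a production filter would need to test sub-runs as well.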
FILTER RULES
1. search by entry
2. search whole word
3. handle punctuation
don’t tell anyone…
1337 59341<
L33T SPEAK
HOW MANY DIFFERENT SPELLINGS DOES ONE WORD HAVE?
BITCH
I → 1: B1TCH
I → !: B!TCH
T → +: BI+CH
B → I3: I3ITCH
BITCH
B!TCH
B1TCH
8ITCH
ßITCH
13ITCH
L3ITCH
BI7CH
BI+CH
BI†CH
BIT[H
BIT¢H
BIT<H
BITC#
BITC:
B1T¢H
8!†C#
8ITC/-/
817[#
(3][+(:
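Instead of listing every spelling in the dictionary, one option is to map look-alike characters back to letters before the lookup. The following is a minimal sketch under that assumption. `NormalizeLeet` is a hypothetical helper, and its table covers only single-character substitutions seen on the slides, so multi-character forms such as "I3" or "/-/" are not handled.

```objc
#import <Foundation/Foundation.h>

// Maps single look-alike characters back to letters.
// The table is illustrative and deliberately incomplete.
static NSString *NormalizeLeet(NSString *token) {
    NSDictionary<NSString *, NSString *> *map = @{
        @"1": @"i", @"!": @"i",
        @"+": @"t", @"7": @"t", @"†": @"t",
        @"8": @"b", @"ß": @"b",
        @"[": @"c", @"¢": @"c", @"<": @"c",
        @"#": @"h",
        @"0": @"o", @"5": @"s",
    };
    NSString *lower = token.lowercaseString;
    NSMutableString *result = [NSMutableString stringWithCapacity:lower.length];
    for (NSUInteger i = 0; i < lower.length; i++) {
        NSString *ch = [lower substringWithRange:NSMakeRange(i, 1)];
        // Use the mapped letter if there is one, otherwise keep the character.
        [result appendString:(map[ch] ?: ch)];
    }
    return result;
}
```

A filter could split the text on whitespace, pass each token through `NormalizeLeet`, and check the result against the words dictionary. The mapping is lossy: a legitimate token such as "5" or "100" is also rewritten, so the normalized form should only be used for the lookup, never shown to the user.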
FILTER RULES
1. search by entry
2. search whole word
3. handle punctuation
4. handle l33t speak
my name is…
SCUNTHORPE PROBLEM
https://en.wikipedia.org/wiki/Scunthorpe_problem
NICE TITS
In 2007, the Royal Society for the Protection of Birds
blocked ornithological terms such as cock (male bird),
tit, shag and booby from its discussion forums
FILTER RULES
1. search by entry
2. search whole word
3. handle punctuation
4. handle l33t speak
5. remember the exceptions
blue-footed booby!
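Exceptions can be applied as a post-processing step: collect the matches first, then drop those that fall inside an allowed phrase. This is a sketch under that assumption. `RemoveAllowedMatches` and the `allowedPhrases` list are hypothetical names, and the talk does not show how its exceptions were implemented.

```objc
#import <Foundation/Foundation.h>

// Drops every match that lies inside an allowed phrase such as "blue-footed booby".
static NSArray<NSValue *> *RemoveAllowedMatches(NSString *text,
                                                NSArray<NSValue *> *matches,
                                                NSArray<NSString *> *allowedPhrases) {
    // Collect the ranges covered by allowed phrases.
    NSMutableArray<NSValue *> *allowedRanges = [NSMutableArray array];
    for (NSString *phrase in allowedPhrases) {
        NSRange searchRange = NSMakeRange(0, text.length);
        while (searchRange.length > 0) {
            NSRange found = [text rangeOfString:phrase
                                        options:NSCaseInsensitiveSearch
                                          range:searchRange];
            if (found.location == NSNotFound) break;
            [allowedRanges addObject:[NSValue valueWithRange:found]];
            // Continue after the end of this occurrence.
            NSUInteger next = NSMaxRange(found);
            searchRange = NSMakeRange(next, text.length - next);
        }
    }
    // Keep only the matches that are not fully contained in an allowed range.
    NSMutableArray<NSValue *> *kept = [NSMutableArray array];
    for (NSValue *value in matches) {
        NSRange match = value.rangeValue;
        BOOL allowed = NO;
        for (NSValue *allowedValue in allowedRanges) {
            NSRange range = allowedValue.rangeValue;
            if (match.location >= range.location && NSMaxRange(match) <= NSMaxRange(range)) {
                allowed = YES;
                break;
            }
        }
        if (!allowed) [kept addObject:value];
    }
    return kept;
}
```

For the text "The blue-footed booby is a bird, you booby." with the allowed phrase "blue-footed booby", only the second "booby" stays flagged.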
TEXT FILTERING ON IOS
words dictionary (boobs, b00bs, b00b5)
whole word scan: NSScanner, NSSet
+
phrases dictionary (b o o b s, b.o.o.b.s, b!o!o!bs, bo!obs)
substring scan: rangeOfString
HOW FAST IS IT?
[Chart: filtering time in seconds (0 to 0.4) against user text length (1000, 5000, 10000, 20000 characters) for the range, scanner, and combined ("both") scans]
dirty words dictionary contains 455 words
LIVE FILTERING
use RAC (ReactiveCocoa) and run the filter
every time the user types a character
IMPROVE FILTERING
• Keep dictionary up to date
• Whitelist
• Levenshtein distance
• Soundex functions (where a word sounds like another)
• Naive Bayesian inference filtering of phrases/terms
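Of the ideas listed above, Levenshtein distance is the easiest to sketch: it counts the single-character insertions, deletions and substitutions needed to turn one string into another, so a misspelled or obfuscated word lies a short distance from a dictionary entry. The function below is the standard two-row dynamic-programming version. `LevenshteinDistance` is a hypothetical helper name, and it compares UTF-16 code units rather than user-visible characters.

```objc
#import <Foundation/Foundation.h>
#include <stdlib.h>

// Two-row dynamic-programming Levenshtein distance over UTF-16 code units.
static NSUInteger LevenshteinDistance(NSString *a, NSString *b) {
    NSUInteger n = a.length, m = b.length;
    if (n == 0) return m;
    if (m == 0) return n;
    NSUInteger *prev = calloc(m + 1, sizeof(NSUInteger));
    NSUInteger *curr = calloc(m + 1, sizeof(NSUInteger));
    // Distance from the empty prefix of a to each prefix of b.
    for (NSUInteger j = 0; j <= m; j++) prev[j] = j;
    for (NSUInteger i = 1; i <= n; i++) {
        curr[0] = i;
        unichar ca = [a characterAtIndex:i - 1];
        for (NSUInteger j = 1; j <= m; j++) {
            NSUInteger cost = (ca == [b characterAtIndex:j - 1]) ? 0 : 1;
            NSUInteger deletion = prev[j] + 1;
            NSUInteger insertion = curr[j - 1] + 1;
            NSUInteger substitution = prev[j - 1] + cost;
            curr[j] = MIN(MIN(deletion, insertion), substitution);
        }
        // The current row becomes the previous row for the next character of a.
        NSUInteger *tmp = prev; prev = curr; curr = tmp;
    }
    NSUInteger distance = prev[m];
    free(prev);
    free(curr);
    return distance;
}
```

A filter might flag a token whose distance to a dictionary word is 1, but the threshold is a trade-off: short innocent words sit one edit away from bad ones, so this check increases false positives unless it is combined with a whitelist.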
POST-MODERATION
live moderators
solid community
flag abuse
DIRTY WORDS
• list of dirty words in different languages
https://github.com/shutterstock/List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words
• list of dirty words I’ve used
https://gist.github.com/vixentael/5ce4168e3e94d9686405
LAST SLIDE
@vixentael
iOS developer at Stanfy
THANK YOU FOR WATCHING!

Анастасия Войтова: "Building profanity filters on mobile: clbuttic sh!t"
