Hacking Art History for Fun and Profit
John Resig
Lot 55: 20 Japanese Woodblock Prints
Each depicting a female/Geisha figure with
calligraphy throughout each print. Prints
measure 13.75" H x 9.375" W. Toning to
each print, some losses around edges.
Estimated Price: $400 - $600
Step 1: Acquire and read tons of expensive books.
Step 2: Learn to read Japanese. *
* Japanese from the 17th to 19th century.
You’re not going to learn this from Rosetta Stone.
Step 3: Learn to read Japanese calligraphy.
Solution: A fast-loading, responsive, i18ned web site: Ukiyo-e.org
Server architecture (diagram):
• Hosting: Digital Ocean, Amazon S3, Amazon Cloudfront
• Served content: Images, JS, CSS, and Data (HTML, XML, JSON)
• Web: nginx (w/ cache), node.js + express (two instances), naught
• Storage and search: mongodb, Elastic Search
• Scraper
https://github.com/jeresig/jquery-imgscrubber
Collecting Tons of Woodblock Print Data
Diagram: Search → Page, Page, Page → HTML + Image for each page
Queue-based Crawling using PhantomJS
Pipeline (diagram):
• Processing Queue
• Some Website: WebKit, PhantomJS, CasperJS, SpookyJS
• Save Data: XML Files, Mongo Log
• Extract Data: libxml (+ xpath), MongoDB
• Process Data: Artists, Images
• Correct Artist and Date
• Add to Site!
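A minimal sketch of the processing-queue idea, under stated assumptions. `loadPage` is a hypothetical stand-in for the PhantomJS step and is synchronous here only to keep the example short; a real loader would be asynchronous. The URLs and the in-memory `fakeSite` are made up for illustration.

```javascript
// Queue-based crawl: take a URL off the queue, load it, save the raw page,
// and queue any links that have not been seen yet.
function crawl(startUrl, loadPage, saveData) {
  const queue = [startUrl];      // processing queue of URLs still to visit
  const seen = new Set(queue);   // prevents queueing the same URL twice
  while (queue.length > 0) {
    const url = queue.shift();
    const page = loadPage(url);  // hypothetical: returns { html, links }
    saveData(url, page.html);    // "Save Data" step: keep the raw page
    for (const link of page.links) {
      if (!seen.has(link)) {
        seen.add(link);
        queue.push(link);
      }
    }
  }
  return seen.size;              // number of distinct URLs visited
}

// Usage with an in-memory fake site (all URLs are invented).
const fakeSite = {
  "/search?page=1": { html: "<p>results 1</p>", links: ["/print/a", "/print/b", "/search?page=2"] },
  "/search?page=2": { html: "<p>results 2</p>", links: ["/print/c", "/search?page=1"] },
  "/print/a": { html: "<p>A</p>", links: [] },
  "/print/b": { html: "<p>B</p>", links: [] },
  "/print/c": { html: "<p>C</p>", links: [] }
};
const savedUrls = [];
const visitedCount = crawl(
  "/search?page=1",
  (url) => fakeSite[url],
  (url) => savedUrls.push(url)
);
```

Saving the raw page before extracting anything means extraction can be re-run later without crawling the site again.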
// Scraper definition: a list of steps.
module.exports = function() {
    return {
        scrape: [
            {
                // Start at the search page, visit each result link,
                // and follow "next" links to further result pages.
                start: "http://ukiyo-e.org/search",
                visit: "//a[@class='img']",
                next: "//a[contains(@rel,'next')]"
            },
            {
                // On each visited page, pull fields out with XPath.
                extract: {
                    "title": "//p[contains(@class, 'title')]//span",
                    "dateCreated": "//p[contains(@class, 'date')]//span",
                    "artists[]": "//p[contains(@class, 'artist')]//a",
                    "images[]": "//div[contains(@class,'imageholder')]//a/@href"
                }
            }
        ]
    };
};

            "locale" : "ja",
            "given" : "Okiie",
            "given_kana" : "おきいえ",
            "surname" : "Hashimoto",
            "surname_kana" : "はしもと",
            "name" : "Hashimoto Okiie",
            "ascii" : "Hashimoto Okiie",
            "plain" : "Hashimoto Okiie",
            "kana" : "はしもとおきいえ",
            "_id" : ObjectId("530c0825d9a80976b2000437")
        }
    ],
    "names" : [
        {
            "original" : "Hashimoto Okiie (橋本興家)",
            "locale" : "ja",
            "kanji" : "橋本興家",
            "given" : "Okiie",
            "given_kana" : "おきいえ",
            "surname" : "Hashimoto",
            "surname_kana" : "はしもと",
            "given_kanji" : "興家",
            "surname_kanji" : "橋本",
            "name" : "Hashimoto Okiie",
            "ascii" : "Hashimoto Okiie",
            "plain" : "Hashimoto Okiie",
            "kana" : "はしもとおきいえ",
            "_id" : ObjectId("530c0825d9a80976b2000439")
        }
    ],
    "extract" : [
        "53dfc997cbf9fa7501d78e4820b24a9c"
    ],
    "created" : ISODate("2014-02-25T03:04:05Z"),
    "__v" : 0
}
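One way to read the extract map in the scraper definition, sketched under assumptions. `select` is a hypothetical function that returns every string matched by an XPath expression (a real one could be backed by libxml). Treating a key that ends in "[]" as "collect all matches" is an interpretation of the key names, not something the slides state. The canned values are invented.

```javascript
// Build one record from an extract map.
// Assumption: "name[]" keeps every match as an array; "name" keeps the first match.
function applyExtract(extract, select) {
  const record = {};
  for (const [key, xpath] of Object.entries(extract)) {
    const matches = select(xpath);
    if (key.endsWith("[]")) {
      record[key.slice(0, -2)] = matches;
    } else if (matches.length > 0) {
      record[key] = matches[0];
    }
  }
  return record;
}

const extractSpec = {
  "title": "//p[contains(@class, 'title')]//span",
  "artists[]": "//p[contains(@class, 'artist')]//a"
};

// Canned selector results standing in for a parsed page (values are made up).
const cannedMatches = {
  "//p[contains(@class, 'title')]//span": ["Example Title"],
  "//p[contains(@class, 'artist')]//a": ["Artist One", "Artist Two"]
};

const extracted = applyExtract(extractSpec, (xpath) => cannedMatches[xpath] || []);
```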
“Stack Scraper”
https://github.com/jeresig/stack-scraper
https://github.com/jeresig/ukiyoe-scrapers
Image Similarity
https://github.com/jeresig/node-matchengine
Image Similarity Search
Idyll: Offline Image Cropping
• https://github.com/jeresig/idyll
• Crop images offline and on a mobile device.
• Saves the selections back to a server.
• Data is synced and saved using HTML 5 appcache.
• https://github.com/jeresig/node-appcache-glob

by David Chester at Shutterstock
https://github.com/dchester/perl-image-crop-calibration-target
http://www.ersatzlabs.com/
Aiding Woodblock Print Studies with Image Analysis
Miyagawa Shuntei
Printed in 1897
Sold for: $550
Prints sell for $100-$400 individually
True Estimate: $2100 - $8400 *
* You just have to find someone willing to buy them!
Does this work for other art forms?
• Collaborating with the Frick Art Reference Library

• Analyzing their Anonymous Italian Art Photo Archive

• Also working with the Zeri Foundation at the University of Bologna in Italy

• Analyzing their Italian Art Archives
Similar Images
• Different photo, same work of art.
• Different photo, slightly different cropping.
• Different photo, dramatically different lighting.
Alternate Images
• Partial Image vs. Full Image
• Color vs. Black-and-White
• Partial Image vs. Much Larger Image
Conservation
• Repairs and possibly removal of later additions.
• Analysis even spots dramatic conservation work.
Copies
Graph Analysis with neo4j
Diagram 1: Frick 420, Frick 417, Zeri 1583642090, and an unknown (?) link
Diagram 2: Frick 347, Frick 348, Zeri 12227, 33526, 33525, and an unknown (?) link
Diagram 3: 8132a, 8132, 8131a, 8131, 57129, 57134, 57130, 57138, and an unknown (?) link
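A rough sketch of the kind of question a graph of image matches can answer: which records end up connected through shared matches? The talk uses neo4j for this; the version below swaps in plain JavaScript and a breadth-first search, so it is an illustration and not the author's queries. The match pairs and the "Zeri A" / "Zeri B" labels are invented.

```javascript
// Group records into clusters, where each pair [a, b] is an image match.
function matchClusters(edges) {
  const adjacency = new Map();
  const link = (from, to) => {
    if (!adjacency.has(from)) adjacency.set(from, new Set());
    adjacency.get(from).add(to);
  };
  for (const [a, b] of edges) {
    link(a, b);
    link(b, a); // matches are treated as undirected
  }

  const seen = new Set();
  const clusters = [];
  for (const start of adjacency.keys()) {
    if (seen.has(start)) continue;
    const group = [];
    const queue = [start];
    seen.add(start);
    while (queue.length > 0) {
      const node = queue.shift();
      group.push(node);
      for (const next of adjacency.get(node)) {
        if (!seen.has(next)) {
          seen.add(next);
          queue.push(next);
        }
      }
    }
    clusters.push(group.sort());
  }
  return clusters;
}

// Hypothetical matches: two Frick records both match the same Zeri image.
const matchPairs = [
  ["Frick 420", "Zeri A"],
  ["Frick 417", "Zeri A"],
  ["Frick 347", "Zeri B"]
];
const recordClusters = matchClusters(matchPairs);
```

Two records that share a cluster without a direct match between them are candidates for a closer look.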
• http://ejohn.org/research/
• http://ukiyo-e.org/
• https://github.com/jeresig
Correcting Print Data
Japanese Names
• Utagawa Hiroshige
• Ando Hiroshige
• Andō Hiroshige
• Hiroshige
• 歌川広重
• 広重
Many spellings read "Andō": 安土, 安堂, 安島, 安東, 安籐, 安藤, 安道, 安達, 阿藤
One spelling, 安藤, has many readings: andō, antō, anzō, yasuzuka
A many-to-many mapping!
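A small sketch of a many-to-many mapping as two indexes, one per direction. The pairs are a subset of the spellings and readings listed on the slide; the function and variable names are made up for illustration.

```javascript
// Index (kanji, reading) pairs in both directions.
function buildNameIndex(pairs) {
  const readingsByKanji = new Map();
  const kanjiByReading = new Map();
  const add = (map, key, value) => {
    if (!map.has(key)) map.set(key, new Set());
    map.get(key).add(value);
  };
  for (const [kanji, reading] of pairs) {
    add(readingsByKanji, kanji, reading);
    add(kanjiByReading, reading, kanji);
  }
  return { readingsByKanji, kanjiByReading };
}

const namePairs = [
  ["安藤", "andō"], ["安藤", "antō"], ["安藤", "anzō"], ["安藤", "yasuzuka"],
  ["安堂", "andō"], ["安東", "andō"], ["安達", "andō"]
];
const nameIndex = buildNameIndex(namePairs);

// One spelling, several readings; one reading, several spellings.
const readingsOfAndo = [...nameIndex.readingsByKanji.get("安藤")];
const spellingsOfAndo = [...nameIndex.kanjiByReading.get("andō")];
```

Neither direction gives a unique answer, which is why a romanized name alone cannot be turned back into kanji with certainty.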
Sharaku Toshusai
東洲斎写楽
• Is this the family name?
• Where are the stress marks?
• How do you “split” this name?
• Which name parts correlate?
Tools (all are Node modules!)
• https://github.com/lovell/hepburn
• https://github.com/jeresig/node-enamdict
• https://github.com/jeresig/node-ndlna
• https://github.com/jeresig/node-romaji-name
Diagram: ndlna, hepburn, and enamdict feed into romaji-name
Hepburn
• https://github.com/lovell/hepburn
• Takes in the English form of a Japanese word.
• Returns it written in Hiragana or Katakana (phonetic Japanese alphabets).
• Example: Utagawa Hiroshige → うたがわひろしげ
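To show what a romaji-to-kana conversion involves, here is a toy converter. It is not the hepburn module and does not use its API; it is a greedy table lookup that covers only basic syllables. It does not handle macrons, doubled consonants, or combined sounds such as "kyo", and as a side effect of how the table is built it also accepts a few non-Hepburn spellings such as "si" and "tu".

```javascript
// Toy romaji -> hiragana table: one kana row per leading consonant.
const VOWELS = "aiueo";
const KANA_ROWS = {
  "": "あいうえお", k: "かきくけこ", g: "がぎぐげご", s: "さしすせそ",
  z: "ざじずぜぞ", t: "たちつてと", d: "だぢづでど", n: "なにぬねの",
  h: "はひふへほ", b: "ばびぶべぼ", p: "ぱぴぷぺぽ", m: "まみむめも",
  r: "らりるれろ"
};
// Hepburn spellings that do not follow the consonant + vowel pattern.
const SYLLABLES = {
  shi: "し", chi: "ち", tsu: "つ", fu: "ふ", ji: "じ",
  ya: "や", yu: "ゆ", yo: "よ", wa: "わ", wo: "を", n: "ん"
};
for (const [consonant, kana] of Object.entries(KANA_ROWS)) {
  [...kana].forEach((ch, i) => {
    SYLLABLES[consonant + VOWELS[i]] = ch;
  });
}

// Greedy conversion: try the longest syllable (3 letters) first, then shorter.
// Spaces are dropped; anything not in the table raises an error.
function toHiraganaSketch(romaji) {
  const text = romaji.toLowerCase();
  let out = "";
  let i = 0;
  while (i < text.length) {
    if (text[i] === " ") {
      i += 1;
      continue;
    }
    let matched = false;
    for (let len = 3; len >= 1; len -= 1) {
      const chunk = text.slice(i, i + len);
      const kana = SYLLABLES[chunk];
      if (kana !== undefined) {
        out += kana;
        i += chunk.length;
        matched = true;
        break;
      }
    }
    if (!matched) {
      throw new Error("Unsupported romaji at: " + text.slice(i));
    }
  }
  return out;
}

const hiroshigeKana = toHiraganaSketch("Utagawa Hiroshige");
```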
Enamdict
• https://github.com/jeresig/node-enamdict
• Downloads and queries the ENAMDICT database
• (A mapping of Japanese proper names to Hiragana and English.)
• Used to correct typos and figure out surname/given name.
NDLNA
• https://github.com/jeresig/node-ndlna
• Queries the NDLNA database
• Finds the correct Kanji for an English name.
• Or the correct English for a Kanji name.
{
    "original" : "Sharaku Toshusai (東洲斎写楽 )",
    "locale" : "ja",
    "kanji" : "東洲斎写楽",
    "given" : "Sharaku",
    "given_kana" : "しゃらく",
    "surname" : "Tōshūsai",
    "surname_kana" : "とおしゅうさい",
    "surname_kanji" : "東洲斎",
    "given_kanji" : "写楽",
    "name" : "Tōshūsai Sharaku",
    "ascii" : "Tooshuusai Sharaku",
    "plain" : "Toshusai Sharaku",
    "kana" : "とおしゅうさいしゃらく"
}
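The "ascii" and "plain" fields in this record look like two ways of removing macrons from "name": doubling the vowel, or dropping the mark. The sketch below reproduces that pattern. The rule is inferred from this one record, so treat it as an assumption and not as a description of how romaji-name works; the function names are made up.

```javascript
// Macron vowels mapped to their base letters.
const MACRON_VOWELS = {
  "ā": "a", "ī": "i", "ū": "u", "ē": "e", "ō": "o",
  "Ā": "A", "Ī": "I", "Ū": "U", "Ē": "E", "Ō": "O"
};
const MACRON_PATTERN = /[āīūēōĀĪŪĒŌ]/g;

// "plain": drop the macron (ō -> o).
function toPlainName(name) {
  return name.normalize("NFC").replace(MACRON_PATTERN, (ch) => MACRON_VOWELS[ch]);
}

// "ascii": double the vowel (ō -> oo), keeping the case of the first letter.
function toAsciiName(name) {
  return name.normalize("NFC").replace(MACRON_PATTERN, (ch) => {
    const base = MACRON_VOWELS[ch];
    return base + base.toLowerCase();
  });
}

const plainName = toPlainName("Tōshūsai Sharaku");
const asciiName = toAsciiName("Tōshūsai Sharaku");
```

The doubled form lines up with the kana in the record (とおしゅうさい), while the plain form matches how the name is usually typed without special characters.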
Dates
• https://github.com/jeresig/node-yearrange
var yr = require("yearrange");
yr.parse("1877")
// {"start": 1877, "end": 1877}
yr.parse("1847-48")
// {"start": 1847, "end": 1848}
yr.parse("ca. 1810-20s")
// {"start": 1810, "end": 1829, "circa": true}
yr.parse("18th–19th century")
// {"start": 1700, "end": 1899}
yr.parse("Meiji era")
// {"start": 1868, "end": 1912}
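To give a feel for what parsing the simplest of these forms involves, here is a toy parser. It is not the yearrange implementation and handles only a single year or a range whose end may be abbreviated, as in "1847-48". The function name is made up.

```javascript
// Parse "1877" or "1847-48" / "1847-1848" into { start, end }.
// Returns null for anything else (circa, centuries, era names, ...).
function parseSimpleRange(text) {
  const match = /^(\d{4})(?:\s*[-–]\s*(\d{2,4}))?$/.exec(text.trim());
  if (!match) return null;
  const start = Number(match[1]);
  let end = start;
  if (match[2]) {
    // An abbreviated end year borrows its leading digits from the start year.
    const prefix = match[1].slice(0, 4 - match[2].length);
    end = Number(prefix + match[2]);
  }
  return { start, end };
}

const singleYear = parseSimpleRange("1877");
const shortRange = parseSimpleRange("1847-48");
const unsupported = parseSimpleRange("ca. 1810-20s");
```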
Artist Rectification
