Personal Information
Organization / Workplace
Stockholm, Sweden
Occupation
Computational Linguist, PhD
Industry
Technology / Software / Internet
Website
www.forum.santini.se/
About
I am a computational linguist with a strong interest in textual and linguistic features, machine learning and intensive textual data processing. My personal challenge is to extract "contextualized" information from big unstructured textual data by leveraging the concept of "genre". The word "genre" means "type of text". Nowadays all kinds of businesses, enterprises and customer care services produce huge amounts of data in the form of many different "genres", e.g. emails, memos, notes from call-centers, news, user groups, chats, reports, tweets, Facebook pages, blogs, forums and marketing material. All these textual genres contain valuable but unstructured data. The exploitation of ...
Tags
machine learning
language technology
supervised classification
weka
computational semantics
nlp
decision trees
sentiment analysis
svm
noise
uppsala university
supervised machine learning
logistic regression
semantic analysis in language technology
text analytics
gain ratio
information gain
divide and conquer
entropy
marina santini
genre
perceptron
mesh
wordle
mira
tag clouds
word clouds
inductive bias
crossvalidation
description logics
best split
similarity
semantic analysis
semi-supervised learning
lexical semantics
formal semantics
evaluation
sampling
smoothing
text mining
unification
independence
semantics
unstructured data
opinion mining
structured data
predicate-argument structure
dependency parsing
semantic roles
thematic roles
semantic web
owl
rdf
rules
axioms of probability
pointwise mutual information
query log analysis
conditional probability
induction
naive bayes baseline algorithm
emotion
wordnet
web corpora
corpus evaluation
margin
automatic genre identification
selectional restrictions
formal languages
nearest neighbors
flipped classroom
semantics in language technology
pruning
supervised learning
events
clustering
domain-specific
statistical inference
automata
training set
hypothesis testing
maximum likelihood estimation (mle)
spam filtering
expectations
z-test
distance metric
probabilities
variance
statistics
algorithms for hmms
smoothing for pos tagging
markov assumptions
pos tagging with hmms
hidden markov models (hmms)
em for naive bayes
hidden and latent variables
maximum likelihood estimation
expectation-maximization
problems for hmms
stochastic variables
naive bayes classifiers
bayesian classification
naive bayes in nlp
frequency functions
joint probabilities
instance attributes
estimation
conditional probabilities
evaluation criteria
layout
semantically-related words
meaningful adjacencies
finite state automata
fsa
non-deterministic
deterministic
regular languages
pumping lemma
regular expressions
finite state machines
non-terminals
context-free grammars
phrase structure grammars
cfgs
backus-naur form
terminals
addition rule
probability theory
probability theorems
bayes law
marginal probability
multiplication rule
examination
cooperation
k-nearest neighbors
main theorem
feature representation
margin and separability
the norm
maximizing margin
margin infused relaxed algorithm
support vector machines
max margin
max log-likelihood
minimum error
compositionality
corpus-based approaches
event representations
distributional semantics
description logics & the web ontology language
the semantics of first-order logic
formal and computational representations
latent semantic analysis
topic models
lambda calculus
roles semantic role labelling
ontologies
semantic word clouds
quantitative evaluation
dissimilarity
big data
unsupervised classification
overlap measure
distance
modified value difference metric
lazy learning
eager learning
logistic regression/maximum entropy
svms
statistical software
machine learning workbench
k-nn
classifiers
support vector machine
structured svms
conditional random fields
structured perceptron
sequence tagging
structured mira
voting
boosting
bagging
adaboost
base learner
stacking
ensemble learner
geographical information
venues
products
news
agi
contextualized information
actionable information
query log
search
information architecture
findwise
italian
swedish
sentistrength
cyberemotions
query logs
big textual data
stefan th. gries
crisis analysis
customer analytics
actionable intelligence
r
information discovery
hadoop
business intelligence
strata
job title
professional profile
semantic-oriented applications
affective states
natural language processing
affect
regression
hypothesis class
type of machine learning
reinforcement learning supervised learning
definition of machine learning
classification
empirical error
classification in nlp
cross-validation
types of classification
unsupervised learning
generalization model assessment
statistical methods and natural language processing
theorems of probability
sample spaces
independence and incompatibility
notion of probability
video lectures
flip teaching
lab sessions
bootstrap resampling
cascading
ensemble
recorded future
gavagai
cross-lingual learning
part-of-speech tagging
multilingual learning
linguistic structure prediction
incomplete supervision
latent-variable model
indirect supervision
ambiguous supervision
meetups
named-entity recognition
partial supervision
multilinguality
structured prediction
computational lexical semantics
representation of meaning
topic sentence
academic writing
critical thinking
argumentation
peer reviewing
job
learning outcomes
zellig harris
ppmi
cosine metric
ner
named entity recognition
standard evaluation per token
sequence classifier
information extraction
sequence labeling
e-discovery
calendaring
standard evaluation per entity
word shapes
ir-based approaches
knowledge-based approaches
ibm's watson
complex questions
answer type taxonomy
apple's siri
mrr
factoid questions
ir-based question answering
mean reciprocal rank
wolframalpha
hybrid approaches
passage retrieval
narrative questions
distant supervision
knowledge graph
relation extractors
dbpedia
hyponymy
corpus lesk
word similarity
word relatedness
graph-based methods
wsd
thesaurus-based methods
resnik method
lin method
semcor
dictionary-based methods
surprisal
supervised methods
lesk algorithm
path-based similarity
michael lesk
elesk
word sense disambiguation
extended lesk
simplified lesk
information content
term-context matrix
dot product
marginals
john rupert firth
pmi
cosine similarity measure
joint probability
vectors
positive pointwise mutual information
distributional models
quantitative metrics
compactness
running time
context-preserving word cloud visualisation
cpewcv
inflate and push
realized adjacencies
area utilization
aspect ratio
folksonomy
social tagging
automatic folksonomy construction
cycle cover
star forest
distortion
readability
swedish-umeå corpus (suc)
unsupervised machine learning
agglomerative hierarchical clustering
ward’s linkage
domain
ecare
web corpus
lay-specialized sublanguage
corpus quality
terminology extraction
domainhood
burstiness
log-likelihood
kullback–leibler divergence
mann-whitney-wilcoxon test
unsupervised learning from the web
freebase
databases of relations
hand-written patterns
ace
bootstrapping
abstracting
topic signature-based content selection
rouge
recall oriented understudy for gisting evaluation
extractive summarization
snippets
summarization in question answering
abstractive summarization
query-focused summarization
unsupervised content selection
single vs. multiple documents
shared semantic annotation
dls
tags
web 3.0
shared understanding
ontology
tree of porphyry
webprotege
relations
classes
ontology learning
sparql
iri
seam carving
induction pipeline
f-measure
leave one out
parameters
recall
stratification
confusion matrix
hyperparameters
accuracy
precision
test set
development set
expected loss
empirical error induction
greediness
inductive bias of the decision tree
loss function
surprisal
constructing decision trees
machine learning
attribute selection
confidence interval
standard error
inferential statistics
multiplier
interval estimation
confidence level
z critical value
confidence interval for proportion
confidence interval for the mean
roc curves
scalable platform
cheating
hybrid teaching/learning model
plagiarism
deduction
machine learning models
generalization
underfitting
training data
learning algorithms
overfitting
inference algorithms
elements of machine learning
test data
concepts
missing data
attributes
sample
normal distribution
features
population
outliers
mean
median
measures of dispersion
data
instances
arff format
mode
sparse data
measures of central tendency
semantic role labeling
sentiment lexica
scherer typology
mutual information
turney algorithm
affective meaning
emotion classification
connotational aspects
sentiwordnet
likelihood
sentiment mining
sentiment lexicons
semi-supervised methods
learning sentiment lexicons
general inquirer
manually-built sentiment lexicons
word senses
homonymy
hypernymy
senseval
membership meronymy
lemma
polysemy
antonymy
babelnet
part-whole meronymy
synonymy
wordform
metonymy
meronymy
zeugma test
occam's razor
k-statistic
lift charts
cost-sensitive measures
loss function
recall-precision curves
t-test
counting the cost
multiclass classification
real-world implementations
holdout estimation
representation
unbalanced data
theoretical modelling
bootstrap
leave-one-out
logic and language
denotation
formal theories
logic
meaning representation
first-order logic
predicate logic
computational semantics
connotation
propositional logic
semantic role labelling
framenet
propbank
shallow semantics
shallow semantic representation
kendall correlation coefficient
Presentations (62)
Recommendations (3)
Il Booktrailer
ludam • 16 years ago
Analytics Education in the era of Big Data
Gregory Piatetsky-Shapiro • 12 years ago
Evaluating Search Engines
Ramzi Alqrainy • 13 years ago