MULTI-LEXEMIC UNITS: AN OVERVIEW
LET ME INTRODUCE MYSELF! Xavier Blanco, Autonomous University of Barcelona, Laboratory of Phonetics, Lexicology and Semantics (fLexSem). My research areas: lexicography, machine translation. My theoretical background: Lexicon-Grammar, Meaning-Text Theory.
MY SPEECH AT A GLANCE: Why multi-lexemic units? 4 types of multi-lexemic units. Consequences for translation.
Why are multi-lexemic units so important for MT? All machine translation software needs dictionaries (i.e. complete linguistic descriptions of its working languages, formalized with identical procedures and criteria and linked by means of translation equivalence relations). A natural question concerns the nature of the macrostructure elements (lemmas) in these dictionaries, i.e. the lexical units. Lexicon items are not always just words but very often sequences of words. Many technical terms have been coined to refer to these complex lexical items: compounds, collocations, idioms, frozen expressions...
Why are multi-lexemic units so important for MT? In fact, multi-lexemic units are important because they are linguistic signs, and linguistic signs are the natural unit of treatment for dictionaries.
Which is the main problem? The main problem is that multi-lexemic units are (or seem) not as easy to identify and classify as mono-lexemic units. N.B.: this is true only of the segmentation (tokenization) question, not of the polysemy question.
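The segmentation question can be made concrete with a minimal sketch of greedy longest-match lookup against a list of multi-lexemic units. The toy lexicon, the name `segment` and the greedy strategy are assumptions chosen for illustration; the talk does not prescribe a particular algorithm.

```python
# Hypothetical toy lexicon of multi-lexemic units (illustrative entries only).
MULTI_LEXEMIC = {
    ("black", "hole"),
    ("high", "school"),
    ("in", "spite", "of"),
}
MAX_LEN = max(len(entry) for entry in MULTI_LEXEMIC)


def segment(tokens):
    """Greedy longest-match segmentation of word-forms into lexical units."""
    units = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate first, then shorter ones (minimum 2 words).
        for n in range(min(MAX_LEN, len(tokens) - i), 1, -1):
            candidate = tuple(tokens[i:i + n])
            if candidate in MULTI_LEXEMIC:
                units.append(" ".join(candidate))
                i += n
                break
        else:
            # No multi-lexemic unit starts here: keep the single word-form.
            units.append(tokens[i])
            i += 1
    return units


print(segment("she studied a black hole in spite of the risk".split()))
```

A lookup of this kind only addresses segmentation of contiguous, uninflected sequences; insertions and inflected forms would need more than an exact match.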
WHAT IS A MULTI-LEXEMIC UNIT? A multi-lexemic unit is a sequence of word-forms P whose meaning cannot be built (by the general rules of the language L) from the meanings of the constituent lexemes of P, their semantically loaded morphological means (if any) and their combinatorial properties.
FIRST TYPE OF MULTI-LEXEMIC UNIT: “COMPOUND” UNITS In most electronic dictionaries lexical units are typically classified as nouns, verbs, adjectives and adverbs. In addition to simple forms, the languages we work with (Romance languages) have, for each of these categories, multi-lexemic items (i.e. sequences of word-forms separated by a blank or a hyphen). Let’s begin with compound nouns...
“COMPOUND NOUNS” Examples of compound nouns are: hard drug, high school, black hole.
“COMPOUND NOUNS” A large number of compound nouns: have simple variants or synonyms (demócrata cristiano, demócrata-cristiano, demócratacristiano); can (sometimes must) be translated into simple nouns (high school = lycée, instituto // escalera mecánica, escalier roulant = escalator). Acronyms are a way of constructing variants of compound nouns: acquired immune deficiency syndrome = AIDS.
“COMPOUND NOUNS” Systematic descriptions of compound nouns have been proposed, ranging from just a few types to over 700. For most applications it is probably enough to take into account only a dozen cases, such as: N-Adj, Adj-N (Christian name // nombre de pila); N-Prep-N (quality of life // calidad de vida); N-N (family doctor // médico de cabecera); Prep-N (under age // menor de edad); V-N (washing machine // lavadora).
“COMPOUND NOUNS” It is often necessary to treat compound nouns as having a recursive structure: auditory canal = Adj-N => N; external auditory canal = Adj-(Adj-N) => Adj-N => N.
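The recursive reduction above can be sketched as repeated rewriting of a category sequence. The rule table `RULES`, the function name `reduce_tags` and the right-to-left order are assumptions that fit the head-final English examples on this slide; other patterns or languages would need different rules.

```python
# Hypothetical rewrite rules: a pair of categories reduces to a single category.
RULES = {("Adj", "N"): "N", ("N", "N"): "N"}


def reduce_tags(tags):
    """Reduce a tag sequence from the right, recording each intermediate step."""
    tags = list(tags)
    steps = [tuple(tags)]
    while len(tags) > 1:
        pair = (tags[-2], tags[-1])
        if pair not in RULES:
            break  # no rule applies: not a known compound pattern
        tags[-2:] = [RULES[pair]]
        steps.append(tuple(tags))
    return steps


# external auditory canal: Adj-(Adj-N) => Adj-N => N
print(reduce_tags(["Adj", "Adj", "N"]))
```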
“COMPOUND NOUNS” VERY IMPORTANT: The number of compound nouns has often been underestimated. Typically, even in large traditional dictionaries, only a very small percentage seems to have caught the lexicographers’ attention. We think that, if we consider technical languages, a few million compounds occur in texts. Any serious dictionary project in this area must therefore be very large.
“COMPOUND NOUNS” When we say that a compound noun is a linguistic unit, we mean that we are obliged to describe its form, its meaning and its combinatorial properties INDEPENDENTLY of the properties that the forms it contains may have. A regular construction like a black sweater is reducible to a predication like black(sweater). But compounds like a black hole or a black box are not.
“COMPOUND NOUNS” Compound nouns can be immune to a number of syntactic modifications that similar but regular constructions can undergo:
MODIFICATION: * a very black hole
PREDICATIVITY: * this hole is black
SYNONYMY: a black hole vs a black orifice, a black opening
COORDINATION: * a black and deep hole
DELETION: ...a black hole. This hole...
NOMINALIZATION: * the blackness of the hole
“COMPOUND NOUNS” Clearly these tests have varying degrees of precision depending on the semantic opaqueness of the compound. It is nevertheless obvious that we need to list compounds in a dictionary. Not only high school, but also driving school or private school need to be associated with a specific meaning description. How else would we know that a driving school is not a school in which one takes courses while in a car? Or that a private school is not a school for soldiers of that rank?
“COMPOUND VERBS” He flogs a dead horse. I gave him a taste of his own medicine. I put my shoulder to the wheel.
“COMPOUND VERBS” Compound verbs are typically verbs with frozen arguments (1, 2 or 3). More often than not their degree of semantic opacity is much higher than in the case of compound nouns. The bad news: they are more difficult to extract from corpora (e.g. insertions, inflectional patterns...). The good news: the number of elements in this class is considerably smaller than in the nominal counterpart. It is likely that they can easily be kept far below 100,000.
“COMPOUND ADVERBS” on foot; at your own risk; in cold blood; in spite of N. Probably below 10,000, excepting a few compound adverb schemas that are productive: from (9 a.m.) till (5 p.m.). Compound adjectives: out of order, de moda (fashionable)... Compound determiners: a lot of, a flurry of criticism...
SECOND TYPE OF MULTI-LEXEMIC UNIT: “COLLOCATIONS” Compounds need to be regarded as units in connection with almost every linguistic operation. They are macrostructural elements of the dictionaries and are typically translated as a whole, without any attempt to maintain either the internal structure or the meaning of their particular parts. By contrast, collocations involve 2 linguistic signs: the base of the collocation and the value of the collocation. We are going to discuss only two classes of collocations.
COLLOCATIONS: FROZEN MODIFIERS to condemn strongly, to endorse heartily, to laugh heartily, to laugh one’s head off... easy as pie, as 1-2-3... (smb) thin as a rake... it rains cats and dogs... heavy smoker, ______ liar. admirer profondément; aimer passionnément; remercier chaleureusement; surveiller étroitement...
COLLOCATIONS: FROZEN MODIFIERS aid = valuable; behaviour = excellent; cut = neatly, cleanly; advice = sound; proposal = tempting; struggle = heroic; analysis = fruitful.
COLLOCATIONS: FROZEN MODIFIERS Translation is a good indicator of the frozen status of these constructions. The translation of these expressions must proceed by first identifying the type of modification and then reconstructing that modification in the target language on the basis of the translation of the base term:
miedo cerval -> INT(miedo) = INT(fear) -> mortal fear (* deer fear)
peur bleue -> INT(peur) = INT(fear) -> mortal fear (* blue fear)
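The identify-then-reconstruct procedure can be sketched with three lookup tables. The table layout and the names `SOURCE_VALUES`, `BASE_TRANSLATION`, `TARGET_VALUES` and `translate_collocation` are assumptions for illustration; only the example entries come from the slide, and the modifier-before-base word order is hard-coded for English.

```python
# Hypothetical tables; the entries are the examples from the slide.
# (language, base, modifier) -> semantic value of the modifier
SOURCE_VALUES = {
    ("es", "miedo", "cerval"): "INT",
    ("fr", "peur", "bleue"): "INT",
}
# (language, base) -> English translation of the base
BASE_TRANSLATION = {("es", "miedo"): "fear", ("fr", "peur"): "fear"}
# (semantic value, English base) -> English modifiers expressing that value
TARGET_VALUES = {("INT", "fear"): ["mortal"]}


def translate_collocation(lang, base, modifier):
    """Translate a frozen-modifier collocation into English."""
    value = SOURCE_VALUES[(lang, base, modifier)]   # 1. identify the modification
    target_base = BASE_TRANSLATION[(lang, base)]    # 2. translate the base term
    target_modifier = TARGET_VALUES[(value, target_base)][0]  # 3. reconstruct it
    return f"{target_modifier} {target_base}"


print(translate_collocation("es", "miedo", "cerval"))  # mortal fear
print(translate_collocation("fr", "peur", "bleue"))    # mortal fear
```

The modifier itself is never translated word for word, which is why the sketch cannot produce "deer fear" or "blue fear".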
COLLOCATIONS: FROZEN MODIFIERS There seems to be a restricted number of meanings that are likely to function as values of collocations. Examples of such semantic values are intensity, anti-intensity, praise and anti-praise. Collocations are not so difficult to understand (for a human being), but they are difficult to produce (for a non-native speaker).
COLLOCATIONS: FROZEN MODIFIERS These modifiers need to be coded for each lexical unit separately. Not every lexical unit will have instantiations for every semantic value of a possible frozen modifier, and some lexical units will have more than one modifier for a given semantic value. These frozen modifiers range from highly idiosyncratic ones to almost regular ones, but they always need to be explicitly coded.
COLLOCATIONS: SUPPORT VERBS The main predicate of a sentence can be realized not just by verbs, but also by nouns, adjectives and prepositions. In the latter cases, an additional lexical element, called a support verb, is usually associated with the real semantic predicate to form the predicational basis of the simple sentence. Particularly for nouns, these support verbs cannot always be predicted just from the nature of the main predicate.
COLLOCATIONS: SUPPORT VERBS to play a role; to give advice; to take a look at; to do someone a favor; to put a question. “The man who makes no mistakes does not usually make anything.”
COLLOCATIONS: SUPPORT VERBS the war broke out; I keep my calm; to reduce to despair; to raise hope in; to draw smb’s attention to.
COLLOCATIONS: SUPPORT VERBS to fulfil a promise; to answer a question; to follow advice; his dream came true.
COLLOCATIONS: SUPPORT VERBS The artillery _________ a heavy bombardment over the town. The artillery __________ the town to a heavy bombardment. The town _________ a heavy bombardment (of the artillery). A heavy bombardment (of the artillery) ________ over the town.
THIRD TYPE OF MULTI-LEXEMIC UNIT: “FROZEN SENTENCES” Proverbs: A bird in the hand is worth two in the bush; A rolling stone gathers no moss. Pragmatemes: Staff only; Can I help you? N.B.: Frozen sentences often undergo variations, which can involve creative mechanisms for the defreezing of the ordinarily accepted patterns.
FOURTH TYPE OF MULTI-LEXEMIC UNIT: “GRAMMATICAL UNITS” Empirical Grammatical Expressions: has been, could have been, may have been / either... or / if... then... Theoretical Grammatical Expressions: <Adj_colour> <clothes>... but blue jeans! <Noun_animate> drink <beverages>
Here I should present the calculus of Grammatical Meanings, but... zzz PISS: Powerpoint Induced Sleep Syndrome
FINAL OVERVIEW Let a linguistic sign be an ordered triple A = <‘A’; /A/; ∑A> where: ‘A’ is the signified of A; /A/ is the signifier of A; ∑A is the set of combinatorial properties of A. Basic types of linguistic signs are: morphs, modifications, conversions, supramorphs, word-forms, phrasemes and syntagms.
FINAL OVERVIEW
Free sequences: AB = <‘AB’; /A ⊕ B/; ∑A ∪ ∑B>
Full idioms: AB = <‘C’; /A ⊕ B/> | ‘A’ ⊄ ‘C’ & ‘B’ ⊄ ‘C’
Quasi-idioms: AB = <‘A ⊕ B ⊕ C’; /A ⊕ B/> | ‘C’ ≠ ‘A’ & ‘C’ ≠ ‘B’
Semi-idioms: AB = <‘A ⊕ C’; /A ⊕ B/>
The signified of the semi-idiom includes, intact, the signified of one of its two constituents. A is chosen by the speaker strictly because of its signified, but B is used to express ‘C’ contingent on A; otherwise B would not be the signifier of ‘C’.
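The four-way typology can be sketched as a decision on which constituent meanings survive in the meaning of the whole. Treating a signified as a set of component labels, and the function name `classify`, are simplifying assumptions made for illustration; they are not the formal apparatus of the talk.

```python
def classify(signified, a, b):
    """Classify a two-constituent sequence AB.

    signified: set of meaning components of the whole sequence
    a, b: the meanings of the two constituents taken in isolation
    """
    has_a, has_b = a in signified, b in signified
    extra = signified - {a, b}  # any additional component, i.e. 'C'
    if has_a and has_b:
        return "quasi-idiom" if extra else "free sequence"
    if has_a or has_b:
        return "semi-idiom"
    return "full idiom"


print(classify({"A", "B"}, "A", "B"))       # free sequence
print(classify({"C"}, "A", "B"))            # full idiom
print(classify({"A", "B", "C"}, "A", "B"))  # quasi-idiom
print(classify({"A", "C"}, "A", "B"))       # semi-idiom
```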
CONCLUSIONS Tests with large collections of texts at the LADL and the CIS have shown that at least one-third of any natural language corpus must be analyzed in terms of multi-lexemic units. The characteristics of multi-lexemic units are such that there is no alternative to the lexicographic solution. The availability of large-scale multi-lexemic dictionaries will significantly improve the quality of machine translation systems.





