Trojina, Institute for Applied Slovene Studies

Ljubljana, Slovenia


Innovative Approaches in Lexicography: Benefits for Under-Resourced Languages and Language Learning

Recent technological progress, especially the shift to the digital medium, has had a profound impact on lexicography. The digital medium has brought various new advantages for dictionary makers and lexicographers, in terms of both the availability of very large corpora and new possibilities in the presentation of dictionary content. We have also witnessed the rise of so-called born-digital dictionaries, i.e. dictionaries conceptualised for the digital medium. On the other hand, this new medium has brought new challenges for lexicographers, who suddenly had large quantities of data to analyse and, partly because users now expect quick access to up-to-date dictionary information, not enough time to analyse it. As a result, new tools have emerged that help lexicographers with analysis by summarizing the grammatical and collocational behaviour of words and, more recently, even attempt to automate knowledge acquisition from corpora (Kilgarriff 2013) and other language resources.

My presentation will focus on the use of innovative lexicographic methods in practice, drawing on the experience of compiling the Slovene Lexical Database and the Dictionary of Contemporary Slovene Language (DCSL). I will describe the lexicographical process of DCSL, which consists of five phases, relies heavily on automatic extraction of lexical data from the corpus and the use of crowdsourcing for data clean-up (Kosem et al. 2013, Gantar et al., in print), and envisages releasing entries after each phase. I intend to argue that such a lexicographical process and methodology are very suitable for under-resourced languages, where dictionaries often need to be compiled from scratch and where the users cannot wait for years, often decades, for the dictionary to be “completed”.

In the final part of my presentation, I will show some uses of state-of-the-art lexicographic tools and methods for the purposes of language learning, more specifically in the development of language portals, applications, tools, games and other resources where language learning can take place directly or indirectly.


Iztok Kosem has a PhD in Corpus Linguistics, obtained at Aston University in Birmingham, UK. His main interest is the production of user-friendly and modern language resources (both monolingual and bilingual) for speakers of Slovene. In his opinion, the future of dictionaries is in the online format, whose lexicographic potential is yet to be fully exploited. He is also interested in the building and use of (pedagogical) corpora in language teaching and learning.

In October 2013 he was elected Vice-Chair of ENeL (European Network of Electronic Lexicography), funded by COST. He is a member of the EURALEX 2014 scientific committee. He was the head of the organizing committees and a member of the scientific committees of the eLex 2011 and eLex 2013 conferences. In August 2013 he was among 50 invited experts at the OED symposium. He is a co-author of the Proposal for a new dictionary of contemporary Slovene.



Department of Linguistics

University of the Philippines, Diliman


Lexicography in the Philippines: Development, Milestones & Challenges

My lecture seeks to accomplish the following goals: (1) to recount the development of lexicography in the Philippines from the onset of the Spanish colonization period until the latter part of the American colonization period; (2) to enquire into the ways in which the national language discourse has caused lexicography in the Philippines to forge ahead, from when such a notion was first introduced through a policy in the 1935 Philippine Constitution up to the time when Pilipino was changed to Filipino in the 1987 Constitution; and (3) to assess the current state of the art of Philippine lexicography and to set forth its potential course of development given the current linguistic landscape in the country.

I will build upon the previous studies of Hidalgo (Philippine Lexicography from 1521 to the Present 1977) & Newell (Philippine Lexicography: the State of the Art 1995) in recounting the development of Philippine lexicography and in identifying milestones that have significantly contributed to its current shape. I will also refer to a previous work of mine on the development of Filipino monolingual lexicography (The Filipino Monolingual Dictionaries and the Development of Filipino Lexicography 2011) as I touch upon the metamorphosis of the Tagalog-based Filipino and the dictionaries that were published in an attempt to bolster its status as the statutory national language. To accomplish the third objective above, I will examine the current state of lexicography in ten other Philippine languages that are used in widespread communication (with a score ranging from 1 to 3 on the EGIDS scale): Bikol Central, Cebuano, Hiligaynon, Ilocano, Maguindanaon, Masbatenyo, Kapampangan, Pangasinan, Tausug and Waray. A core component of this assessment will entail looking into the actual lexicographic products, with particular attention to monolingual dictionaries, existing metalexicographic accounts of these products, and the presence of institutions in charge of developing strategies and formulating a general framework that could guarantee a sustainable path for the practice of lexicography in these languages. Finally, my lecture will also cite some of the most crucial challenges that lexicography in the Philippines is currently facing. Challenges brought about by both the social dimensions of language on the one hand and the linguistic dimensions of society on the other will be tackled. In addition, applied issues that concern education and language planning in particular will also be discussed.


The speaker obtained his PhD in Korean Linguistics from the Academy of Korean Studies, South Korea, where he also completed an interim MA in International Studies. He also obtained an MA in Linguistics from the University of the Philippines (UP) Diliman in 2007. He received his BA in Linguistics, also from UP Diliman, with Magna Cum Laude honors in 2003. He is currently an Associate Professor and the Chairperson of the Department of Linguistics, UP Diliman. His research interests include Formal Syntax, Korean Linguistics and Cultural Studies, Lexicography, Ethnolinguistics and the structure of West Visayan languages. In 2011, he published a paper on “the Filipino monolingual dictionaries and the development of Filipino lexicography”. His most recent publication is about “how elicited gestures reflect word-order bias in world languages”, a collaborative research project he did with linguists from UCLA San Diego, Brown University, University College Dublin and MIT. His first language (mother tongue) is Cuyonon (spoken mainly in Palawan). His second and third languages are Hiligaynon and Kinaray-a, respectively. He started learning English and Filipino when he entered elementary school. While taking up Linguistics in UP, he studied Spanish, Bahasa Indonesia-Malaysia, Japanese and Mandarin to varying levels. He eventually pursued Korean Language and Linguistics when he entered graduate school. He now holds an advanced level of proficiency in Korean.



National University of Singapore



Lexical priming, dictionaries and Asian users of English

Both lexicography and corpus linguistics can assist a typical Asian user of English to accelerate (or at least, reconcile) their priming of native-speaker patterns by getting them to pay close attention to their own lexical primings (Hoey 2005) vis-à-vis those found in online native dictionaries and attendant electronic corpora. This ‘close attention’ includes an awareness of such parameters as the following: local vs international usage, formal vs informal contexts, spoken vs written registers, Internet slang vs standard usage, etc.

In this session, I draw on both the Concentric Circles Model for nativised Englishes that I have constructed and Michael Hoey’s theory of lexical priming in order to examine the treatment of certain lexical and grammatical constructions in free online dictionaries of English. We live in an exciting era of lexicography, as there are multiple free lexical resources competing for the user’s attention; at the same time, the plurality of lexical resources can often lead to confusion as to which treatment is the ‘correct’ one. A case in point is brownie points, whose definition in the Merriam-Webster Learner’s Dictionary is “praise, credit, or approval that a person gets from someone (such as a boss or teacher) for doing something good or helpful”; example sentences list associated verbs as “earn”, “win”, or “get”. After reading this lexical entry, the learner knows neither the preferred verb associated with the term nor its more appropriate context(s) of usage, i.e. spoken, written, academic etc. More puzzlingly, the Cambridge Dictionary lists the term as “humorous” while the COBUILD Dictionary flags it for “disapproval”, i.e. a negative semantic prosody, because the approvers for brownie points can be politicians of dubious public standing. Which of these entries should the learner of English believe? Given access to free resources on the Web, the user is likely not only to use Google’s dictionary (which has become a serious contender to other dictionaries) but also to search Wikipedia, the British National Corpus, the Corpus of Contemporary American English, and the Corpus of Web-based Global English in order to achieve greater understanding.

While other examples in this session will show uneven treatment, incomplete varietal usage and even cultural bias in existing online dictionaries, the overall ‘takeaway’ is that the Asian user of English has to rely on a plurality of lexical resources and be more self-directed in reconciling lexical primings – often conflicting, in both local usage (endonorms) and international ones (exonorms). Lexicographers have to be even more nimble in catering to much more Web-savvy audiences nowadays.


Dr. Vincent B Y Ooi has teaching and research interests in lexicology and lexicography, corpus-based language studies, computer-mediated communication (language and the Internet), and Asian English discourses. He is the author of Computer Corpus Lexicography, the first lead editor of Perspectives in Lexicography: Asia and beyond, and the co-chief editor of The Times-Chambers Essential English Dictionary (2nd edition). Since 2009, he has served as a scientific committee member for the eLex (electronic lexicography in the 21st century) conferences; other scientific committee memberships include Corpus Linguistics 2015 and COLTA2015. Dr Ooi is the General Secretary and Board member of The Asian Association for Lexicography (Asialex) and an editorial board member of the journal Corpora: Corpus-based language.


University of Oxford

United Kingdom


Lexicography as History: Asian Vocabulary in the Oxford English Dictionary

The Oxford English Dictionary illustrates the evolution of the English language over the last thousand years, providing an unsurpassed guide to the meaning, history, and pronunciation of over half a million words through around three million quotations taken from a wide range of international sources. Unlike dictionaries of current English, the OED includes all core words and meanings in the language, even those that are rare, archaic, historical, obsolete, and technical. Definitions are listed in chronological order, from the earliest evidence of usage to the most recent. This makes the OED the ideal resource for studying the origin and development of thousands of English words.

The OED is therefore more than just a list of words and definitions—it is an historical record that can offer as much to the social historian as to the linguist. In this presentation, I will highlight the OED’s significance as a chronicle of Anglophone society by focusing on its coverage of words from emerging varieties of English in Asia.

The English-speaking world has changed enormously since the OED was first conceived by British philologists in 1857. A large proportion of today’s English speakers can be found not in Britain and North America, but in Asia, where millions of people use English as a second language, especially in postcolonial nations such as India, Hong Kong, Singapore and the Philippines. In this lecture, we will consider what we can learn about the constantly evolving role of English in these relatively new Anglophone communities by examining how they have been represented in the OED at various points in the dictionary’s long history.

I will also discuss the changes that have been implemented in the OED’s editorial policy to remove its Britocentric bias in favour of a more inclusive, pluricentric stance, as well as the OED’s continuing efforts to cover a wider range of lexical innovations from Asian varieties of English that more accurately reflect the way that the language is being adapted to suit the communicative needs of its speakers in the region. These will be demonstrated through revised and new entries for Asian lexical items included in the OED’s recent and upcoming quarterly updates.


Dr. Danica Salazar is Consultant Editor on World Englishes for the Oxford English Dictionary. Prior to working at Oxford University Press, she was the Mellon Postdoctoral Fellow in English Language Lexicography at the English Faculty of the University of Oxford. She holds a PhD in Applied Linguistics from the University of Barcelona, an MA in Teaching Spanish as a Foreign Language from the University of Salamanca and a BA in European Languages, magna cum laude, from the University of the Philippines-Diliman. She publishes and lectures regularly on lexicography, phraseology, World Englishes and Spanish- and English-language teaching. Dr Salazar is the author of Lexical Bundles in Native and Non-native Scientific Writing (2014) and co-editor of Biomedical English: A Corpus-based Approach (2013).