Researcher and Head of the Corpus Laboratory (External)

Research interests

The primary focus of Assistant Prof. Primož Jakopin’s research and teaching is corpus linguistics. His research concerns, on the one hand, the preparation and use of the text corpus Nova beseda and, on the other hand, the preparation and online use of dictionaries. He has been in charge of the digitization of the Dictionary of the Slovenian Language and of the collected works of Ivan Cankar, Ciril Kosmač and Drago Jančar.

He has developed a number of software tools for the preparation and use of textual and lexical databases, such as the well-known text editor EVA and the search engine NEVA. He has also worked on the information content of language (entropy): he has estimated the upper bound of entropy for Slovene literary texts at 2.2 bits per letter and has used a numerical model of language to measure the distances between languages.

Selected publications

GLOŽANČEV, Alenka, JAKOPIN, Primož, MICHELIZZA, Mija, URŠIČ, Lučka, ŽELE, Andreja, ŽELE, Andreja (ed.). Novejša slovenska leksika (v povezavi s spletnimi jezikovnimi viri). Ljubljana: Založba ZRC, ZRC SAZU, 2009. 408 pp.

JAKOPIN, Primož. Delež minimalnih parov besed med besednimi oblikami in lemami. Jezikosl. zap., 2009, 15/1–2, pp. 87–94.

JAKOPIN, Primož (ed.). Planinska dolina. Ljudje in kraji ob Unici. 1st edition. Planina pri Rakeku: Župnija, 2009. 296 pp.

JAKOPIN, Primož, ŽELE, Andreja. English anchors in a Slovenian word resource. In DAVIES, Matthew (ed.). Proceedings of the Corpus Linguistics Conference, CL2007, University of Birmingham, UK, 27–30 July 2007. URL (http://ucrel.lancs.ac.uk/publications/CL2007/paper/271_Paper.pdf). Birmingham, 2007, [7 pp.].

JAKOPIN, Primož, MICHELIZZA, Mija. Besedilni korpus Nova beseda. Mostovi (Ljublj.), 2007/08, Issue 41, No. 1/2, pp. 165–176.

JAKOPIN, Primož. Entropija v slovenskih leposlovnih besedilih. Ljubljana: ZRC SAZU, Založba ZRC, 2002. 208 pp.

Curriculum Vitae

Born: 30 June 1949 in Ljubljana.

1972: graduated (BSc, Hons.) in technical mathematics from the Department of Mathematics, Physics and Mechanics, Faculty of Science and Technology, University of Ljubljana.

1981: completed master's studies in the Centre for Postgraduate Studies, University of Zagreb, with the thesis Entropija priimkov in imen v Sloveniji.

1999: completed a PhD in information theory in the Faculty of Electrical Engineering, University of Ljubljana, with the thesis Zgornja meja entropije pri leposlovnih besedilih v slovenskem jeziku.

Since 1971: worked in Ljubljana in the following institutions: Computer Centre of the Institute of Mathematics, Physics and Mechanics; Centre for Data Processing, University Medical Centre; Institute of Biomedical Informatics, Faculty of Medicine; Computer Centre, University of Ljubljana; Section for Technological Development, Mladinska Knjiga. For several years he also worked as an independent innovator.

January and February 1980: specialized training in the TOPS-10 operating system at Digital Equipment Corporation in Bedford, USA.

1993 – 2001: employed as Assistant (Tutor) in the Faculty of Arts, University of Ljubljana.

Since 2001: employed (part-time) as Assistant Professor (Docent) for the area of linguistic technology in the Faculty of Arts, University of Ljubljana. He teaches two courses in the Department of Comparative and General Linguistics: “Linguistics and Web technology” and “Computer assisted text analysis”. He also participates in the department's postgraduate programs.

1989 – 1996: collaborated (on contract) with the Fran Ramovš Institute of the Slovenian Language, ZRC SAZU; 1996 – 2001: employed part-time in the Institute; since 2001: employed full-time in the Institute as the Head of the Corpus Laboratory.

Research areas

Applied linguistics, foreign languages teaching, sociolinguistics H360 • Statistics, operations research, programming, actuarial mathematics P160 • Computer science, numerical analysis, systems, control P170 • Informatics, systems theory P175 

Keywords

linguistic technology • corpus linguistics • information theory