Ongoing projects

Multi-CAST: Multilingual Corpus of Annotated Spoken Texts

document.

Personal pronouns and person clitics in Tabasaran: Toward a theory of Person

Funded by the Deutsche Forschungsgemeinschaft (DFG)

Researcher: Dr. Natalia Bogomolova

Project Managers: Dr. Natalia Bogomolova, Prof. Dr. Geoffrey Haig

Funding period: September 1st, 2022 to August 31st, 2025 (36 months)

The project has two major goals: an in-depth investigation of person in the grammar of Tabasaran (Nakh-Daghestanian), and the use of new empirical data from this understudied language in theoretical syntactic research in order to advance the theory of Person. The category of person in Tabasaran is manifested via two systems: free personal pronouns and an elaborate system of person clitics, which display a number of interesting properties. First, both subject and non-subject person arguments can be marked on the finite verb by a clitic. Clusters of a subject clitic and a non-subject clitic are also possible. Second, person subjects of transitive and intransitive clauses and non-canonical subjects behave differently with respect to clitics. In root declarative clauses, canonical subjects are always clitic-doubled, while in clauses with non-canonical subjects both subject and non-subject can trigger a clitic on the verb. Third, in allowing clitic clusters, Tabasaran exhibits a phenomenon known as the Person–Case Constraint, reminiscent of what is attested in Romance languages, but with some important differences. Fourth, both pronouns and clitics exhibit indexical shift in speech reports, losing their indexical semantics and referring to the arguments of the matrix clause. The proposed project collects and analyzes a substantial body of new empirical data that is challenging for syntactic theory, scrutinizes current approaches with regard to their ability to deal with those facts, and modifies them with the aim of a better understanding of how information about person is conveyed in human language.

Post-predicate Elements in Iranian: Inheritance, Contact, and Information Structure

Funded by the Alexander-von-Humboldt-Stiftung

Funding period: 01.07.2019-30.06.2022
PIs: Geoffrey Haig (Bamberg); Mohammad Rasekh-Mahand (Hamedan)
 

Iranian languages are routinely classified as "verb final". While this is true with regard to the position of (non-pronominal) direct objects, which are generally pre-verbal, certain other constituents occur more or less systematically after the verb in several West Iranian languages. The result is a typologically unusual and hitherto largely ignored OVX word order type within West Iranian. Furthermore, OVX word order has been identified in unrelated languages in contact with Iranian, including Turkic and Neo-Aramaic.

This project brings together leading international experts on Iranian and neighbouring languages in order to explore

  • the extent of OVX word order within Iranian, and its genesis within the family
  • the areal spread of OVX word order in neighbouring languages, and the pathways of transmission
  • information-structural correlates of OVX word order
  • typological implications of OVX word order.

For more information click here

Previous projects

Does morphosyntactic alignment shape discourse? Implementing a corpus-based approach to linguistic typology

online here), and implements an expanded version of the syntactic annotation system GRAID (Grammatical Relations and Animacy in Discourse, Haig & Schnell 2014, manual here). The existing language sample in Multi-CAST is being extended by the inclusion of ergative languages from the Nakh-Daghestanian language family and from Australia, and of data from Philippine-type languages. All corpora are subjected to a standardized annotation procedure, and the resulting data feed into quantitative cross-corpus analysis in order to identify significant statistical patterns in connected discourse, for example:

  • the distribution of referential expressions across syntactic functions,
  • the density of zero-anaphora,
  • patterns of new-referent introduction,
  • the division of labour between pronouns and lexical expressions,
  • the impact of animacy on syntactic configurations.

The resulting dataset, the first of its kind worldwide, aids the detection of possible correlations between morphosyntactic alignment and probabilistic patterning in the way connected spoken language is organized.

The project is being coordinated by Geoffrey Haig, Stefan Schnell, and Nils Schiborr at the University of Bamberg, and runs in collaboration with researchers from the Centre of Excellence for Dynamics of Language, Canberra and Melbourne (Nick Thieberger), and the University of Jena (Diana Forker).

The project is supported by a DFG grant (project number 323627599), for an initial period of 2017–2020.

Bamberg Lexical Database for Contemporary Iranian Languages (BLDCIL)

Background and aims

The sub-classification of Iranian languages has proven to be a particularly recalcitrant problem in historical linguistics (see Korn 2016 for recent proposals, DOI: 10.1515/if-2016-0021). This project aims to complement and extend existing scholarship by applying a phylogenetic approach, based on lexical comparison, to the problem; see e.g. Heggarty et al. (2010) for the background to this kind of approach (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2981917/pdf/rstb20100099.pdf).

titus.fkidg1.uni-frankfurt.de/personal/jg/pdf/jg2008e.pdf)

The aims of the project are thus two-fold: (i) to apply a novel methodology to an old problem in the sub-classification of Iranian languages; (ii) to serve as a proof-of-concept for the efficacy (or otherwise) of phylogenetic models in resolving classic problems of philology. The first phase, beginning in November 2016, involves the compilation of standardized lexical data sets, together with sound files, from a representative set of Iranian languages, focussing initially on the West Iranian languages.


Cooperation

The project is closely linked with two existing initiatives: 

www.shh.mpg.de/207610/cobldatabase).

http://iranatlas.net)


The Jena/Bamberg Iranian List (JBIL) of meanings

The JBIL-list is a list of meanings that includes the 200 items used in the CoBL project and 80 items used in the Atlas of the Languages of Iran, plus a number of other items deemed of interest for Iranian languages. The items themselves, plus explanations and instructions for investigators, are available as downloads below:

  • The JBIL-list, with explanations and example sentences, and instructions for investigators (PDF, 247.0 KB, 20 pages)
  • The JBIL-list, with Persian translations and Persian example sentences (PDF, 470.0 KB, 25 pages)
  • The Data Entry Form, into which the actual forms for each language may be entered (DOC, 98.5 KB, 21 pages)

 

Languages

Data sets have been compiled, or are in the process of compilation, for the following languages:

Kumzari

Behdinî Kurdish

Mazanderani

Persian

Jafi Kurdish

Tati

Bakhtiari

Delvari

 

Sample Data

Sample data sets will be made available shortly here

 

Team

The BLDCIL project is coordinated by Geoffrey Haig (Bamberg) and Erik Anonby (Carleton/Bamberg). Data collection and handling are undertaken with the assistance of (in alphabetical order): Shirin Adibifar (Bamberg), Raheleh Izadifar (Hamedan), Mina Salehi (Bamberg), Mortaza Taheri-Ardali (Shahr-e Kord University).

 

Support

The project gratefully acknowledges the financial and technical support of the Max Planck Institute for the Science of Human History (CoBL-Database), departmental funding from the University of Bamberg, and the Dept. of Linguistics at the University of Hamedan as a cooperation partner in the Islamic Republic of Iran.

Atlas of the Languages of Iran (Chief Editor: Erik Anonby)

For more information click here

Documenting Dargi languages in Daghestan - Shiri and Sanzhi

http://www.mpi.nl/DOBES/).

In the linguistic documentation and analysis of Shiri and Sanzhi we will pay special attention to those features that are unusual for the Nakh-Daghestanian language family and of broader typological interest. Two of these features are person agreement, which is based on the person hierarchy and not determined by grammatical roles, and extraordinarily rich TAM and evidentiality paradigms.

In our project we will collaborate with Russian colleagues (e.g. Nina Sumbatova) and colleagues from the University of Jena (Kevin Tuite, Florian Mühlfried). But our main cooperation partners will be Daghestanian researchers, students and the Shiri and Sanzhi communities.

The project is funded by the DoBeS program of the VW foundation (http://www.volkswagenstiftung.de/service/aktuelles/article/129/chancen-fuer.html?no_cache=1&cHash=fefa1ac99f). It started in summer 2012 and runs for three years.

For further information please visit our project page: http://www.kaukaz.net/cgi-bin/blosxom.cgi/english/dargwa.

Chirag Documentation Project

here.

Compilation and critical edition of pre-19th century Kurmanji Kurdish

Researcher:

Dr. Ergin Öpengin

Project Details:

Deutsche Forschungsgemeinschaft (DFG). Duration: 10/2014 - 03/2016. 134,767 EUR

Project Summary:

Kurmanji Kurdish is one of the most widely-spoken languages of the Middle East, but research on its history and development is severely hampered by the lack of written attestation prior to the 15th century. Furthermore, the few samples of Kurdish prose that can reliably be ascribed to the period 15th-19th centuries are largely inaccessible to a wider scholarly audience, and lack a reliable critical apparatus. This project will compile a selection of 10 Kurdish texts from prior to 1800, transliterated in a standardized format and supplied with English translations and an authoritative critical apparatus. The texts will also be made fully accessible as a digital corpus, accompanied by a concordance, and the resulting two volumes will be published on the open-access portal of the University of Bamberg. Issues of authorship and localization of the texts will also be assessed in the light of the applicant’s ongoing research on regional variation in Kurdish, which allows a much finer-grained evaluation than has previously been possible. The project will thus lay the foundation for serious academic research on the history of Kurdish by creating an open-source research resource for questions relating to the history of the Kurdish language(s), the position of Kurdish within the West Iranian languages, the linguistic ecology of Kurdistan in the Ottoman period, the timing of contact phenomena and of language change, and literary and religious scribal practices in the period.

Agreement in Discourse

http://www.daimler-benz-stiftung.de/cms/index.php?page=postdoc-stipendiaten-2012).

As part of the project, Diana Forker and Geoffrey Haig are organizing a workshop at the University of Bamberg (1-2 February 2013).

KiKoDaz

Kieler Korpus Deutsch als Zweitsprache

Documentation of Gorani, an endangered language of West Iran

Dokumentation Bedrohter Sprachen (DoBeS). The project was originally granted for three years (2007-2010), but has been extended until 2012. It is a collaborative project, conducted together with Professor Ludwig Paul (Hamburg) and Professor Philip Kreyenbroek (Göttingen). Information on the project can be found here.