Author Archives: Ray Carey

On dialects, similects, and the -lishes

"Billard" by No-w-ay in collaboration with H. Caps - Own work. Licensed under GFDL via Wikimedia Commons.

One of the major lines of English as a lingua franca (ELF) research is how to describe the features of English in interaction between second-language users. With the multitude of accents and variable usage of English you find in the world today, the most obvious quality of ELF talk is its diversity (some say it’s even superdiversity, though I’m campaigning for the shamelessly hyperbolic ultra-mega-diversity-squared). At the same time, people notice that speakers of the same first language (or L1, such as Finnish) often share similar features when speaking English. Thus, you encounter folk linguistic descriptions of L1-specific -lishes – “Finglish”, “Swenglish”, “Spanglish”, etc.

I personally avoid these typically muddled labels, as the only thing that unifies them all is their negative overtone. The term “Finglish” is most likely to be trotted out to mock a public figure’s “Bad English” in international media. But there’s still something to it. It’s a well-studied fact that a person’s first language will influence the learning and use of other languages. Some features of Finnish are commonly heard when Finns speak English; for example, the p, t and k sounds at the start of words aren’t aspirated in Finnish, and they’re likely not aspirated when Finns speak English. On the grammatical side, you might hear a Finn say “I’m waiting the bus” or “Let’s discuss about this”, which reflect the equivalent case endings in Finnish (Odotan bussia; Keskustellaan tästä).

This doesn’t mean all Finns will display these features, nor does it mean that speakers of English from other first-language backgrounds won’t use them as well (in the ELFA corpus, discuss about occurs 20 times, but only half of those come from speakers with Finnish L1). So how do we describe these -lishes? Some would see them as dialects of English, but there’s a crucial difference between the features of English dialects and the features of L1-influenced English. Keep reading…

WrELFA 2015: the written corpus of academic ELF


During the two years that I’ve kept up this blog, I’ve been working on the compilation of the first corpus of written ELF (English as a lingua franca) for Anna Mauranen’s ELFA project. I started loitering around her group shortly after the ELFA corpus of spoken academic ELF was completed, and a written corpus was already being discussed. A couple of years later, with the proper mix of time, money, and research assistants, we launched the WrELFA corpus project – the Written Corpus of English as a Lingua Franca in Academic Settings. And now we can announce that the WrELFA corpus compilation is complete.

I’ve been blogging about this work-in-progress over the past couple years, so I don’t need to repeat it all here. There are three text types included in WrELFA, each of which invites its own investigation. These three components are:

  1. Academic research blogs – this subcorpus is drawn from 40 different blogs maintained by second-language users of English and totals 372,000 words (see this and this post).
  2. PhD examiner reports – 330 evaluations by senior academics with 33 different first languages (402,000 words). I’ve discussed this data in depth in this post.
  3. SciELF corpus – a collaborative, stand-alone corpus of 150 unedited research papers by academics from 10 first-language backgrounds. Partners from 12 universities contributed texts to the 759,000 total words (see this post).

Taken together, these three components total just over 1.5 million words of text with a rough binary division between the natural sciences (55% of words) and disciplines in social sciences and humanities (45% of words). For more detailed information on the make-up of the corpus, see the ELFA project homepage, where I’ve recently done a major update of the WrELFA corpus pages, with documentation of the corpus components, compilation principles, and authors’ L1 distributions. Keep reading…
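As a quick sanity check on the "just over 1.5 million words" figure, here is a minimal Python sketch that sums the three component sizes listed above and prints each component's share of the whole. The word counts are the rounded figures given in this post; the dictionary and variable names are only illustrative.

```python
# Rounded word counts for the three WrELFA components, as listed in this post.
components = {
    "Academic research blogs": 372_000,
    "PhD examiner reports": 402_000,
    "SciELF corpus": 759_000,
}

# Total size of the corpus according to these rounded figures.
total = sum(components.values())

# Print each component with its share of the total word count.
for name, words in components.items():
    print(f"{name}: {words:,} words ({words / total:.1%} of WrELFA)")

print(f"Total: {total:,} words")
```

Because the component figures are rounded to the nearest thousand, the total is approximate as well.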

Summer course on English as a lingua franca


Registration is open for an intensive, three-week course entitled “English as a lingua franca – a new language?” (5 ECTS) offered from August 4 to 20, 2015, through Helsinki Summer School. The University of Helsinki’s annual summer school brings in students from around the world, and we look forward to welcoming a diverse group to join in our research into English as a lingua franca (ELF). Professor and university Vice-Rector Anna Mauranen is a trailblazing expert in ELF research, and her ELFA project leads the way in the study of academic ELF.

What makes this course different? First of all, it will be hands-on and data-centered. One of the advantages of studying in Helsinki is our wealth of linguistic databases. In addition to the spoken ELFA corpus (1 million words), we recently completed the first written ELF corpus, WrELFA (1.5 million words). Each teaching day will begin with a lecture, followed by a language lab in which students will get to work with ELF data, applying and exploring the concepts in practice.

Secondly, our course is linguistically oriented. If you’ve read about ELF research before, you might have the idea that this is an ideologically driven field, especially toward reforming the status quo in English language teaching. In Prof. Mauranen’s group, however, we tend to focus on the descriptive challenges of ELF and how it should be understood in relation to English(es) as a whole. Is ELF a new language? Or is it just English?

Three levels for understanding ELF


The summer course will mainly be taught by two ELFA project researchers, Svetlana Vetchinnikova and Ray Carey (that’s me). We base the course on Mauranen’s three-level framework for understanding ELF: the macrosocial, microsocial, and cognitive levels. This framework is also the basis of her book, Exploring ELF: Academic English shaped by non-native speakers (2012, CUP), and Prof. Mauranen will provide a lecture introducing these perspectives at the beginning of the course.

The macrosocial, microsocial, and cognitive approaches not only provide a conceptual framework – they also provide different approaches to analysing and describing data. The course and its hands-on exercises will apply these levels in different ways: Keep reading…

Let’s be cool about English


With English serving as a global lingua franca, it’s easy to see the ill fit when a minority of English speakers (those who speak it by accident of birth) exercise disproportionate control over what should be regarded as acceptable English. In scientific publishing, for example, authors using English as a lingua franca (ELF) encounter linguistic gatekeepers who not infrequently insist on “native-like” English as a criterion for publishing. Yet attitudes are changing just as quickly as anything else in our single-click world. Even native speakers of English can see that we need to be cool about English.

In an article this month in Slate, Boer Deng, herself a second-language user of English, has an interesting take on English as the scientific lingua franca. She argues that the supremacy of English in academia is linked to American spending on and production of PhDs, which has exploded since the 1960s. She further points out the added challenges of presenting oneself as a professional without the advantage of using one’s first language. As a result, native speakers of English should show more understanding and consideration toward their peers – in short, we need to be cool.

But how to implement this linguistic coolness institutionally? Deng cites the example of an American journal, Molecular Biology of the Cell, which published an editorial in 2012 (read it here) that calls for flexibility by reviewers when evaluating scientific texts by authors using English as a second or foreign language. And what do they say that it takes to be cool in today’s scientific world? Keep reading…

Adventures in correcting the (semi-)scientific record

Photo shared by Michiel1972 via Wikimedia Commons

One of the blogs I follow is Retraction Watch, which documents the world of quality control in scientific research – pre-publication peer review (and its abuses); post-publication peer review in fora such as research blogs; retractions and corrections by journals; and plagiarism and fraud. The large majority of cases they report on are drawn from the “hard sciences”. From time to time, a case pops up from the humanities as well, and it’s not outrageous to ask – who cares anyway? Well, I do.

I’m one of those humanistic researchers who likes to imagine that I do something resembling science. One of the most frustrating things about humanistic research that can’t stand up to scrutiny is the feeling that it doesn’t matter, anyway; nobody cares about this stuff but us. Does that make me some starry-eyed idealist? No, I just don’t like sloppy work. And when I see it, it makes me look bad too, a humanistic guilt by association. Several of the posts on this blog can be seen as post-publication peer review, and during the past year I had my own experience with attempting to correct the (semi-)scientific record.

Last year I read an article by Prof. Hilary Nesi in the Journal of English for Academic Purposes (JEAP) entitled “Laughter in university lectures”. It contained an obvious error in the word count of the Corpus of British Academic Spoken English (BASE), which resulted in erroneous claims about the frequency of laughter in this linguistic database. The natural response, again, might be “who cares?” Several people should care, because the author, two peer reviewers, and the journal editors apparently didn’t look very carefully at the figures reported in two of the tables in the paper. I decided to start with the author. Keep reading…

Souvenirs from Athens: recollections from ELF7

ELF7 Doctoral workshop. From left: Roxani Faltzi, Haibo Liu, Yumi Matsumoto, convenors Barbara Seidlhofer and Henry Widdowson, Talip Gulle, Miya Komori-Glatz and Kaisa Pietikäinen.

by Kaisa Pietikäinen

For an inexperienced conference-goer such as myself, the prospect of giving two separate presentations under the watchful (and no doubt evaluative) eye of several distinguished ELF veterans made my stomach turn. In spite of my anxiety, I packed my most formal business jacket and flew to Athens for the 7th International Conference of English as a Lingua Franca, better known as ELF7, hosted by DEREE, The American College of Greece, during 4–6 September.

By the end of the first fully-packed ELF day, my anxiety had levelled, and I was able to present my first talk rather successfully, or at least the small, heat-exhausted audience seemed quite interested in the topic: misunderstandings in private conversations among ELF couples, and how miscommunication was skilfully pre-empted by using various comprehension-enhancing tactics. I will post about this study later in more detail.

But the real reason why I wasn’t so nervous anymore was perhaps not so flattering. During the first day, I was surprised that so many presenters actually didn’t seem to know much about ELF. Often there was no distinction made between ELF (English as a lingua franca) and EFL (English as a foreign language), and every time a presenter began his/her 20-minute presentation with a 10-minute intro on what ELF is, I thought: “Oh, here we go again!”

First day: student questionnaires

Although I obviously didn’t get to see all the presentations I wanted – at times there were as many as seven parallel sessions, and unfortunately many of the ones that sounded interesting coincided – I also noticed the copy-paste problem Ray took up in his posting on ELF6. But here the duplication disease infected entire studies, not just data. The trendiest theme in the conference seemed to be students’ attitudes towards ELF. These studies were identical to many already conducted here, there and everywhere, but when this was pointed out to the presenters, the common response was: “Yes, but it hasn’t been studied in my country!” Keep reading…

Language users or learners? Lexical evidence from spoken ELF

Click the image to jump to the article (behind paywall):
Kao, S. & Wang, W. (2014) Lexical and organizational features in novice and experienced ELF presentations. Journal of English as a Lingua Franca, 3(1), 49-79. DOI: 10.1515/jelf-2014-0003.

One of the key distinctions made in research on English as a lingua franca (ELF) is the difference between language users and learners. ELF data is typically approached from the viewpoint of second language use instead of second language acquisition. Rather than seeing non-native English speakers as perennially deficient pursuers of “native-like” proficiency, ELF researchers start from the position that non-native English is principally English in use – English serves as a vehicular language for doing stuff, and especially for professional life in international domains like academia.

These issues are explored in a study in a recent issue of the Journal of English as a Lingua Franca. Shin-Mei Kao and Wen-Chun Wang take up the user/learner distinction by investigating “lexical and organizational patterns in the presentations made by speakers of different ELF proficiency and experience levels” (Kao & Wang 2014: 54). To do this, they perform a lexical analysis of academic presentations from three different groups – novice students who can be considered as language learners, academic experts using English as a lingua franca, and academic experts who are also English language specialists.

The three datasets are from the following sources:

  • novices/language learners – 43 student presentations in an English for Academic Purposes (EAP) course held at National Cheng Kung University, Taiwan. Students are a mixture of Taiwanese and international students, with most students from the field of engineering. Presentations ranged from 2 to 5 minutes each, with an average of 360 words.
  • ELFA corpus – 30 conference presentations from the Corpus of English as a Lingua Franca in Academic Settings (ELFA). These academic experts consist of 49 presenters (mostly between the ages of 31 and 50) with 20 different first-language backgrounds (and no native English speakers). Each presentation lasts 21 minutes on average, with 2568 words.
  • John Swales Conference Corpus – 23 conference presentations from the JSCC, recorded at a conference in Michigan celebrating Swales’ retirement. The 28 presenters are all academic experts and English-language specialists from 13 different first-language backgrounds, including an unknown number of English native speakers. The presentations average 3007 words, and only monologues are included to match the ELFA data.

Keep reading…


Needles in a haystack: questioning the “fluidity” of ELF

As I’ve earlier argued on this blog, sometimes the claims of “fluidity”, “diversity”, and “innovation” found in English as a lingua franca (ELF) research are overstated. It’s so diverse that even ordinary diversity won’t do – it’s “super-diversity” now. It could very well be ultra-mega-diversity-squared, but the question of the prominence of these presumably innovative features is a quantitative one. More specifically, it’s a question of how frequently any variant forms might occur in naturally occurring ELF interaction, relative to the conventional forms. One of my shameless nerd hobbies is writing little Python programs to query corpora, and several of these mini-studies have appeared on this blog. I especially enjoy working with the VOICE corpus, which is great because 1) it contains a million words of unelicited ELF interaction; 2) it’s ready for processing as well-formed XML; and 3) it has been meticulously part-of-speech (POS) tagged for both the form and function of each word in the corpus.

The value of this double form-function tag is that it reveals every token in the corpus where a word like fluently, which is formally recognisable as an adverb, functions in a different way, for example as an adjective: i think you are very fluently in english. This example of fluently from VOICE has a form tag of RB (adverb), but a function tag of JJ (adjective) to reflect that fluently seems to be serving in an adjectival function. This kind of form-function variation in ELF is presumably prominent enough that it necessitates this double tagging to adequately describe the fluidity. The VOICE team was kind enough to carry out this formidable task involving manual inspection of all million words. Now that this resource is in place (and freely available), the instances of these form-function mismatches can be easily found, counted, and viewed in context.
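To give a rough idea of how such mismatches can be found and counted, here is a minimal Python sketch. It does not use the actual VOICE XML schema: the `<w>` elements and their `form` and `function` attributes are hypothetical stand-ins, and apart from the RB/JJ pair on fluently described above, the tags in the toy sample are illustrative guesses rather than VOICE annotations.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Toy sample, NOT the real VOICE markup: the <w> element and its
# "form" / "function" attributes are assumptions made for illustration.
# Only the RB (form) / JJ (function) tags on "fluently" come from the post.
SAMPLE = """
<u>
  <w form="PRP" function="PRP">i</w>
  <w form="VBP" function="VBP">think</w>
  <w form="PRP" function="PRP">you</w>
  <w form="VBP" function="VBP">are</w>
  <w form="RB" function="RB">very</w>
  <w form="RB" function="JJ">fluently</w>
  <w form="IN" function="IN">in</w>
  <w form="NN" function="NN">english</w>
</u>
"""


def count_mismatches(xml_text):
    """Return (total tokens, Counter of (form, function) pairs that differ)."""
    root = ET.fromstring(xml_text.strip())
    total = 0
    mismatches = Counter()
    for token in root.iter("w"):
        total += 1
        form = token.get("form")
        function = token.get("function")
        if form != function:
            mismatches[(form, function)] += 1
    return total, mismatches


total, mismatches = count_mismatches(SAMPLE)
mismatch_count = sum(mismatches.values())
print(f"{mismatch_count} of {total} tokens have differing form and function tags")
for (form, function), n in mismatches.most_common():
    print(f"  form {form} -> function {function}: {n}")
```

Dividing the number of mismatched tokens by the total gives the relative frequency of variant forms, which is the quantitative question at stake. A real query would need to follow the attribute names and tag format documented for the corpus itself.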

I’ve wondered for some time how often these variant form-function tokens occur overall, in relation to their conventional forms. My interest was renewed by the recent paper by VOICE project researcher Ruth Osimk-Teasdale in the Journal of English as a Lingua Franca. One of the main workers on the VOICE POS-tagging project, she investigates word class shifts in VOICE. She narrows her data to double form-function tags that reflect a shift of category across word classes (like from adverb to adjective). These inter-categorical word class shifts therefore exclude variations within a word class, like singular nouns which are treated as plural. She focuses on items like fluently above, where word class conversion occurs without any change to the form of the word itself.

The assignment of these form-function tags, and the analysis of them, is directly linked to the fluidity of ELF: Keep reading…


Publishing in English as an academic lingua franca

Happy Summer from the ELFA project.
© Nina Valtavirta

Few researchers would disagree that publishing in English is a necessity. The pressure to publish in high-ranking journals means publishing in English-language journals, and academics using English as a second or foreign language often find an uneven linguistic playing field. This has received a good deal of attention in the field of English for Academic Purposes (EAP), even branching out into a designated field of English for Research Publication Purposes, or ERPP. The importance of English can’t be ignored, but an English-centered approach can fail to take note of how English functions alongside other languages used by multilingual academics.

Questions surrounding English in multilingual research settings are explored in a special issue of the Journal of English for Academic Purposes (vol. 13) entitled “Writing for publication in multilingual contexts”. Edited by Maria Kuteeva of Stockholm University and Anna Mauranen of the University of Helsinki, the special issue features six articles investigating the multilingual practices of local communities of academics in locations such as Romania, Germany, Sweden, China and Canada. The studies are primarily qualitative, exploring the researchers’ attitudes toward and experiences with the use of English for disseminating research alongside their first and additional languages (click here to view the issue’s table of contents).

These studies dealing with attitudes and experiences give insights that supplement (and are supplemented by) descriptive linguistic research. While the researchers in the special issue study experiences of using the language, other work investigates the language itself in use. For this, databases of naturally occurring English are needed that represent the English produced by academics from a variety of first-language backgrounds. Here in Helsinki, Anna Mauranen’s group has made progress on compiling the WrELFA corpus of written academic ELF (English as a lingua franca), and a companion corpus of research articles by multilingual academics – SciELF – is also underway. As these resources are naturally of interest to researchers of English as an academic lingua franca, it’s no surprise that some contributors to the JEAP special issue are also contributors to the SciELF corpus. Keep reading…

ELF corpora in the mainstream: notes from the ICAME 35 conference

Nottingham Castle, the site of the conference excursion dinner.
© Sebastian Hoffman

Last month I presented a paper at the ICAME 35 corpus linguistics conference at the University of Nottingham, and I was happy to find that I wasn’t the only one there combining an ELF research perspective with a corpus methodology. Two other major ELF corpus projects were also represented at the conference, and it was nice to hear about the different research questions and ELF data that are being investigated. Though ELF research is still seen as controversial in some quarters, our papers and poster seemed to be well-received. Other researchers can see that lingua franca data is linguistically relevant and can be fruitfully investigated without foregrounding ideological concerns.

My own paper was drawn from my PhD research, a corpus-based study of fluency in spoken academic ELF. This data is drawn from the ELFA corpus and SELF project data, all of which is naturally occurring spoken ELF from university settings. In addition to this data from relatively formal settings, a new corpus of informal academic ELF talk is being compiled at the University of Saarland. In their paper at ICAME 35, Stefan Diemer, Marie-Louise Brunner & Selina Schmidt shared early findings from CASE (Corpus of Academic Spoken English), made up of recorded interactions on Skype. But first, I want to discuss a project from Linnaeus University in Sweden, where an ELF research perspective is integrated into a corpus-based study of ongoing change in English.

ELF & the big picture: ongoing grammatical changes in English

Click to view the poster presented by Mikko Laitinen, Magnus Levin & Alexander Lakaw at ICAME 35, 30 April–4 May, 2014.

Mikko Laitinen, Magnus Levin and Alexander Lakaw presented a poster entitled “Ongoing grammatical change and the new Englishes: Towards a set of corpora of English use in the expanding circle” (link to pdf, or click on the image at left). Their project, led by Prof. Laitinen at Linnaeus University, is compiling two corpora of contemporary English in Sweden and Finland. Laitinen’s linguistic roots are in the VARIENG research unit here in Helsinki, and this background in the diachronic study of English (change over time) is here directed toward the “Expanding Circle” – the growing number of second-language users of English in countries like Sweden and Finland, who are increasingly likely to both reflect and influence ongoing changes in English. Their research tests the applicability of some of the methods and theories used in empirical historical linguistics to the study of present-day ELF use and language contact. They ask to what extent global ELF uses contribute to language variability, and whether ongoing grammatical changes are accelerated or slowed down by ELF speakers/writers. Keep reading…