Category Archives: Research blogging

On dialects, similects, and the -lishes

"Billard" by No-w-ay in collaboration with H. Caps - Own work. Licensed under GFDL via Wikimedia Commons.

One of the major lines of English as a lingua franca (ELF) research is how to describe the features of English in interaction between second-language users. With the multitude of accents and variable usage of English you find in the world today, the most obvious quality of ELF talk is its diversity (some say it’s even superdiversity, though I’m campaigning for the shamelessly hyperbolic ultra-mega-diversity-squared). At the same time, people notice that speakers of the same first language (or L1, such as Finnish) often share similar features when speaking English. Thus, you encounter folk linguistic descriptions of L1-specific -lishes – “Finglish”, “Swenglish”, “Spanglish”, etc.

I personally avoid these typically muddled labels, as the only thing that unifies them all is their negative overtone. The term “Finglish” is most likely trotted out when mocking a public figure’s “Bad English” when speaking in international media. But there’s still something to it. It’s a well-studied fact that a person’s first language will influence the learning and use of other languages. Some features of Finnish are commonly heard when Finns speak English; for example, words starting with p, t and k sounds aren’t aspirated in Finnish, and they’re likely not aspirated when Finns speak English. On the grammatical side, you might hear a Finn say “I’m waiting the bus” or “Let’s discuss about this”, which reflect the equivalent case endings in Finnish (Odotan bussia; Keskustellaan tästä).

This doesn’t mean all Finns will display these features, nor does it mean that speakers of English from other first-language backgrounds won’t use them as well (in the ELFA corpus, discuss about occurs 20 times, but only half of those come from speakers with Finnish L1). So how do we describe these -lishes? Some would see them as dialects of English, but there’s a crucial difference between the features of English dialects and the features of L1-influenced English. Keep reading…


Adventures in correcting the (semi-)scientific record

Photo shared by Michiel1972 via Wikimedia Commons

One of the blogs I follow is Retraction Watch, which documents the world of quality control in scientific research – pre-publication peer review (and its abuses); post-publication peer review in fora such as research blogs; retractions and corrections by journals; and plagiarism and fraud. The large majority of cases they report on are drawn from the “hard sciences”. From time to time, a case pops up from the humanities as well, and it’s not outrageous to ask – who cares anyway? Well, I do.

I’m one of those humanistic researchers who likes to imagine that I do something resembling science. One of the most frustrating things about humanistic research that can’t stand up to scrutiny is the feeling that it doesn’t matter, anyway; nobody cares about this stuff but us. Does that make me some starry-eyed idealist? No, I just don’t like sloppy work. And when I see it, it makes me look bad too, a humanistic guilt by association. Several of the posts on this blog can be seen as post-publication peer review, and during the past year I had my own experience with attempting to correct the (semi-)scientific record.

Last year I read an article by Prof. Hilary Nesi in the Journal of English for Academic Purposes (JEAP) entitled Laughter in university lectures. It contained an obvious error in the word count of the Corpus of British Academic Spoken English (BASE), which resulted in erroneous claims about the frequency of laughter in this linguistic database. The natural response, again, might be “who cares?” Several people should care, because the author, two peer reviewers, and the journal editors apparently didn’t look very carefully at the figures reported in two of the tables in the paper. I decided to start with the author. Keep reading…

Language users or learners? Lexical evidence from spoken ELF

Click the image to jump to the article (behind paywall):
Kao, S. & Wang, W. (2014) Lexical and organizational features in novice and experienced ELF presentations. Journal of English as a Lingua Franca, 3(1), 49-79. DOI: 10.1515/jelf-2014-0003.

One of the key distinctions made in research on English as a lingua franca (ELF) is the difference between language users and learners. ELF data is typically approached from the viewpoint of second language use instead of second language acquisition. Rather than seeing non-native English speakers as perennially deficient pursuers of “native-like” proficiency, ELF researchers start from the position that non-native English is principally English in use – English serves as a vehicular language for doing stuff, and especially for professional life in international domains like academia.

These issues are explored in a study in a recent issue of the Journal of English as a Lingua Franca. Shin-Mei Kao and Wen-Chun Wang take up the user/learner distinction by investigating “lexical and organizational patterns in the presentations made by speakers of different ELF proficiency and experience levels” (Kao & Wang 2014: 54). To do this, they perform a lexical analysis of academic presentations from three different groups – novice students who can be considered as language learners, academic experts using English as a lingua franca, and academic experts who are also English language specialists.

The three datasets are from the following sources:

  • novices/language learners – 43 student presentations in an English for Academic Purposes (EAP) course held at National Cheng Kung University, Taiwan. Students are a mixture of Taiwanese and international students, with most students from the field of engineering. Presentations ranged between 2-5 minutes each, with an average of 360 words.
  • ELFA corpus – 30 conference presentations from the Corpus of English as a Lingua Franca in Academic Settings (ELFA). These academic experts consist of 49 presenters (mostly between the ages of 31-50) with 20 different first-language backgrounds (and no native English speakers). Each presentation on average lasts 21 minutes with 2568 words.
  • John Swales Conference Corpus – 23 conference presentations from the JSCC, recorded at a conference in Michigan celebrating Swales’ retirement. The 28 presenters are all academic experts and English-language specialists from 13 different first-language backgrounds, including an unknown number of English native speakers. The presentations average 3007 words, and only monologues are included to match the ELFA data.

Keep reading…


Needles in a haystack: questioning the “fluidity” of ELF

As I’ve earlier argued on this blog, sometimes the claims of “fluidity”, “diversity”, and “innovation” found in English as a lingua franca (ELF) research are overstated. It’s so diverse that even ordinary diversity won’t do – it’s “super-diversity” now. It could very well be ultra-mega-diversity-squared, but the question of the prominence of these presumably innovative features is a quantitative one. More specifically, it’s a question of how frequently any variant forms might occur in naturally occurring ELF interaction, relative to the conventional forms. One of my shameless nerd hobbies is writing little Python programs to query corpora, and several of these mini-studies have appeared on this blog. I especially enjoy working with the VOICE corpus, which is great because 1) it contains a million words of unelicited ELF interaction; 2) it’s ready for processing as well-formed XML; and 3) it has been meticulously part-of-speech (POS) tagged for both the form and function of each word in the corpus.

The value of this double form-function tag is that it reveals every token in the corpus where a word like fluently, which is formally recognisable as an adverb, functions differently, for example as an adjective: i think you are very fluently in english. This example of fluently from VOICE has a form tag of RB (adverb), but a function tag of JJ (adjective) to reflect that fluently seems to be serving in an adjectival function. This kind of form-function variation in ELF is presumably prominent enough that it necessitates this double tagging to adequately describe the fluidity. The VOICE team was kind enough to carry out this formidable task involving manual inspection of all million words. Now that this resource is in place (and freely available), the instances of these form-function mismatches can be easily found, counted, and viewed in context.
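
As a rough illustration of how such mismatches can be found and counted, here is a minimal Python sketch. It assumes the tagged tokens have already been read out of the corpus into (word, form tag, function tag) triples; that data structure and the function name `find_mismatches` are hypothetical, and only the RB/JJ pair for fluently comes from the example above. The other tags are placeholders, not the VOICE tagset.

```python
from collections import Counter

# Hypothetical sample: (word, form_tag, function_tag) triples.
# Only the RB/JJ pair for "fluently" comes from the post; the other
# tags are illustrative placeholders, not the VOICE tagset.
tokens = [
    ("i", "PRP", "PRP"),
    ("think", "VBP", "VBP"),
    ("you", "PRP", "PRP"),
    ("are", "VBP", "VBP"),
    ("very", "RB", "RB"),
    ("fluently", "RB", "JJ"),
    ("in", "IN", "IN"),
    ("english", "NN", "NN"),
]


def find_mismatches(tokens):
    """Return the tokens whose form tag differs from their function tag."""
    return [t for t in tokens if t[1] != t[2]]


mismatches = find_mismatches(tokens)

# Count each (form, function) shift, e.g. adverb form used as adjective.
shift_counts = Counter((form, function) for _, form, function in mismatches)

# Share of all tokens that show a form-function mismatch.
mismatch_share = len(mismatches) / len(tokens)

print(mismatches)
print(shift_counts)
print(f"{mismatch_share:.1%} of tokens")
```

Comparing `mismatch_share` with the share of conventionally tagged tokens is the kind of relative frequency the quantitative question calls for.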

I’ve wondered for some time how often these variant form-function tokens occur overall, in relation to their conventional forms. My interest was renewed by the recent paper by VOICE project researcher Ruth Osimk-Teasdale in the Journal of English as a Lingua Franca. One of the main workers on the VOICE POS-tagging project, she investigates word class shifts in VOICE. She narrows her data to double form-function tags that reflect a shift of category across word classes (like from adverb to adjective). These inter-categorical word class shifts therefore exclude variations within a word class, like singular nouns which are treated as plural. She focuses on items like fluently above, where word class conversion occurs without any change to the form of the word itself.

Assigning these form-function tags – and analysing them – is directly linked to the fluidity of ELF: Keep reading…


Publishing in English as an academic lingua franca

Happy Summer from the ELFA project.
© Nina Valtavirta

Few researchers would disagree that publishing in English is a necessity. The pressure to publish in high-ranking journals means publishing in English-language journals, and academics using English as a second or foreign language often find an uneven linguistic playing field. This has received a good deal of attention in the field of English for Academic Purposes (EAP), even branching out into a designated field of English for Research Publication Purposes, or ERPP. The importance of English can’t be ignored, but an English-centered approach can fail to take note of how English functions alongside other languages used by multilingual academics.

Questions surrounding English in multilingual research settings are explored in a special issue of the Journal of English for Academic Purposes (vol. 13) entitled “Writing for publication in multilingual contexts”. Edited by Maria Kuteeva of Stockholm University and Anna Mauranen of the University of Helsinki, the special issue features six articles investigating the multilingual practices of local communities of academics in locations such as Romania, Germany, Sweden, China and Canada. The studies are primarily qualitative, exploring the researchers’ attitudes toward and experiences with the use of English for disseminating research alongside their first and additional languages.

These studies dealing with attitudes and experiences give insights that supplement (and are supplemented by) descriptive linguistic research. While the researchers in the special issue study experiences of using the language, other work investigates the language itself in use. For this, databases of naturally occurring English are needed that represent the English produced by academics from a variety of first-language backgrounds. Here in Helsinki, Anna Mauranen’s group has made progress on compiling the WrELFA corpus of written academic ELF (English as a lingua franca), and a companion corpus of research articles by multilingual academics – SciELF – is also underway. As these resources are naturally of interest to researchers of English as an academic lingua franca, it’s no surprise that some contributors to the JEAP special issue are also contributors to the SciELF corpus. Keep reading…

What do we mean by “I mean”?

Click image to jump to Fernández-Polo, F. J. (2014) The role of I mean in conference presentations by ELF speakers. English for Specific Purposes 34, 58-67. (behind paywall)

When analysing spoken English, it doesn’t take long to encounter discourse markers, the single words or phrases that speakers commonly use to mark their stance or organise their message. Common discourse markers include well, now, you know and i mean. In the April 2014 issue of English for Specific Purposes, Francisco Javier Fernández-Polo examines the discourse marker i mean in conference presentations included in the ELFA corpus. This subcorpus includes 34 conference presentations in English by speakers of 21 different first languages. Recorded at universities in Finland, the data consist of naturally occurring English used as a lingua franca (ELF) in academic settings.

Fernández-Polo’s study is qualitative, involving a close analysis of a small number of cases toward determining the functions of i mean in context. There are only 56 occurrences of i mean in this conference presentation subcorpus (94,314 words), and Fernández-Polo takes 48 of them into his analysis. He classifies these into four different categories – correcting mistakes and dysfluencies; enhancing clarity and explicitness; organising text; and marking certainty and salience (see Table 1 below). Examples of each are discussed in turn.

A striking finding from the paper concerns the wide inter-speaker variation in the use of i mean. Fewer than half of the 34 presenters use i mean at least once, with a single speaker producing 20% of the occurrences, and five speakers contributing two thirds of all hits. To see if a different distribution might be found in similar English as a native language (ENL) data, Fernández-Polo consulted the monologic lectures in the American MICASE corpus. He found that i mean occurs in the MICASE lectures with the same standardised frequency (5 per 10,000 words) and with similar inter-speaker variation – one speaker in MICASE produced 27% of occurrences, with 14 speakers producing 60% of hits. It thus appears that the choice of discourse markers varies a lot based on a speaker’s preference or habit. Keep reading…
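
For readers unfamiliar with standardised frequencies, the arithmetic can be sketched in a few lines of Python. The counts are the ones quoted above (56 occurrences, 48 analysed, 94,314 words); the function name `per_10k` is just an illustrative choice, and since it isn't stated here which count lies behind the reported figure of 5 per 10,000 words, both are shown.

```python
def per_10k(hits, corpus_words):
    """Normalise a raw count to a frequency per 10,000 words."""
    return hits / corpus_words * 10_000


SUBCORPUS_WORDS = 94_314  # conference presentation subcorpus size

all_hits = per_10k(56, SUBCORPUS_WORDS)       # all occurrences of "i mean"
analysed_hits = per_10k(48, SUBCORPUS_WORDS)  # occurrences taken into analysis

print(f"all occurrences:      {all_hits:.2f} per 10,000 words")
print(f"analysed occurrences: {analysed_hits:.2f} per 10,000 words")
```

Normalising like this is what allows corpora of different sizes, such as the ELFA subcorpus and the MICASE lectures, to be compared at all.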


Why mixing languages isn’t so bad after all

Kaisa Pietikäinen is a PhD student at the University of Helsinki, where she carries out research in the field of English as a lingua franca (ELF)

by Kaisa Pietikäinen

You know those moments when you’re speaking English (as a lingua franca, or ELF), and all of a sudden your mind goes blank? You know the word you’re looking for, but you just can’t get it into your head. You might remember it in another language, but your brain just isn’t connecting to the English equivalent. Fear not – it’s more common than you think. And if your interlocutor isn’t a complete monolingual, you can try code-switching into a different language to resolve the situation.

I’ve been studying code-switching among ELF couples – couples who come from different cultures and language backgrounds, who have found each other and established a relationship despite the fact that neither partner uses his or her first language as the language of the relationship. (Actually, this might even make their relationship more equal.) These couples are very interesting as subjects of ELF research because they are much more established in their use of ELF than the traditional subjects in ELF studies – students, academics, and business people. ELF couples also make great subjects for the study of long-term ELF: They use ELF every day with the same person, year after year. They open up a view of the future of ELF, of what established ELF could be like. Also, their use of ELF can give us important insight into what strategies work in the long run – and it seems like code-switching is one of them!

In fact, code-switching is a very flexible device. In an earlier study (Pietikäinen 2012, available here), I discovered it can be used not only for covering for linguistic gaps, but also for

  • demonstrating use of a language
  • replacing nontranslatables, terms that do not quite catch their original meaning in English
  • specifying addressees by switching into another language, and
  • message emphasis.

In addition, sometimes code-switching seemed to emerge completely automatically, without any preparing cues or flagging. Interestingly, these instances of automatic code-switching seemed to pass without specific attention from either partner, which would suggest that code-switching is considered a pretty normal activity among ELF couples. Keep reading…

On the other side: variations in organising chunks in ELF

Variations in organising chunks aren’t that common, but they do tend to stand out.
Source: Livio Bourbon via The Telegraph

When working with ELF data – English used as a lingua franca between second/foreign-language speakers – one of the things that stands out is the slight variation in conventional chunks of language. A formulaic chunk like as a matter of fact might be realised as as the matter of fact, or you could hear now that you mention it spoken as now that you say it. There’s no sense in calling them errors, since the variants won’t cause miscommunication, they resemble their conventional counterparts in both function and form, and the less-preferred variant is likely found elsewhere. It’s just not the English native-speaker preference.

These variations are interesting linguistically and they tend to stand out impressionistically for researchers, but I’ve wondered how often these variations actually occur in ELF – both in frequency and also in their distribution relative to conventional forms. It’s not an easy question to answer. Many of these formulaic chunks of language occur infrequently, so finding a couple variants doesn’t really tell you much. The example above of now that you say it occurs twice in the million-word ELFA corpus, with just one instance of the conventional form. Alternatively, as the matter of fact is found in ELFA 21 times compared to just eight occurrences of the expected chunk, but only two speakers account for those 21 instances.

We can see from these examples that a formulaic chunk that rarely shows up won’t reveal much about how often variation occurs among ELF users, across speech events, in different times and places. To find out more, I wanted to start with the highest frequency chunks I could find. These are described by Linear Unit Grammar as organising chunks, the recurring and relatively fixed chunks we use to structure our speech and writing, like on the other hand. Using the corpus freeware AntConc, I looked at the most frequent 3-, 4- and 5-word clusters (aka n-grams) in the ELFA corpus of spoken academic ELF. Keep reading…
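
For anyone without a concordancer to hand, the basic idea of counting 3-, 4- and 5-word clusters can be sketched in Python. This is not how AntConc works internally, just a minimal stand-in: the tokenisation is a bare whitespace split, the sample sentence is invented, and the function names `ngrams` and `cluster_counts` are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Yield every run of n consecutive tokens as a tuple."""
    return zip(*(tokens[i:] for i in range(n)))


def cluster_counts(text, sizes=(3, 4, 5)):
    """Count word clusters of each requested size in a text."""
    tokens = text.lower().split()  # naive tokenisation, for illustration only
    return {
        n: Counter(" ".join(gram) for gram in ngrams(tokens, n))
        for n in sizes
    }


# Invented sample text, not corpus data.
sample = "on the other hand it works but on the other hand it fails"
counts = cluster_counts(sample)

for n, counter in counts.items():
    print(n, counter.most_common(2))
```

On a real corpus, the most frequent clusters from `most_common` would give the list of candidate organising chunks to check for variant forms.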

Who’s in charge of English? Uses and descriptions of ELF

The late October view up Unioninkatu toward the Helsinki Cathedral

One of the recent topics here has been language regulation – what are the norms of English when it’s used as a lingua franca (ELF), when most of the parties in interaction aren’t native speakers of English? The only way to find out is to investigate the practices of ELF users in naturally occurring interaction, and also to inquire into their beliefs and expectations of what is good or acceptable English. Niina Hynninen researched these questions in her 2013 PhD thesis on language regulation in academic ELF, and this is the final post of a three-part review of her work.

I first outlined her data and methodology, which draw both interactive and interview data from the same participants in three university “study events”. These events – a lecture course and two groups of students doing out-of-class projects – were held over several weeks. Thus, interactive data was recorded from the same groups over multiple meetings, with interviews held with the same participants at the end of these periods. In the last post I discussed some of Niina’s findings about language regulation in the groups’ spoken interaction. Today I consider the interview data and how the ELF users’ beliefs about English connect to how they actually use ELF.

Student interviews: interpretations & expectations of ELF

In the analysis of her interviews with 13 students from the study events, Niina discusses three broad interpretive repertoires emerging from the interview data. They involve the students’ descriptions of their own and others’ use of ELF and how English ought to be used. These three interpretive repertoires are described as follows:

  • clarity & simplification – recurring themes across student interviews involved descriptions of “clear”, “simple” or “simplified” English. This came out in descriptions of “clear sentences” or avoiding “long sentences”, as well as adapting one’s speech for another speaker’s perceived proficiency. The two native speakers of English interviewed from the study events were also aware of these adaptive strategies and reported their own efforts to simplify and clarify their speech. Keep reading…

WrELFA corpus progress report: 500k words

This little fellow wants Dionysus’ grapes. From the Capitoline Museums in Rome.
© Nina Valtavirta

Within English as a lingua franca (ELF) research, there’s growing interest in describing written ELF. Up to now, ELF data has almost exclusively been drawn from spoken interaction, which is where a lingua franca gets used in the first place. But the use of English as a second/foreign language extends into the written mode as well, and this may also be directed to an international audience. In globalised networks such as academia, examples of English used as a written lingua franca aren’t hard to find. Like other high-stakes domains of ELF, an academic career involves producing English texts that are used to evaluate the author’s professional competence.

Alongside the growth in ELF research has been growing awareness of a power imbalance in academic publishing – journals concentrated in the US & UK typically treat a perfect imitation of “native-like” English as a basic criterion for being published. This goes beyond just “correct grammar” and extends into idiomatic usage, phraseological choices, and rhetorical style. So while there’s no dispute that non-native users of English as a lingua franca far outnumber the native English speakers of the world, academic journals tip the balance of power in favor of English native speakers. In short, “good English” is equated with “native-like English”.

This is a question of interest to descriptive ELF research. How does “good English” written by educated professionals who speak a first language other than English differ from the mythologised “native-like English”? This question and the issues surrounding it are persuasively developed by David Owen (2011) in an article on academic publishing and language revision. In his work doing language revision in a Spanish university, he observes that papers rejected on linguistic grounds are often “formally impeccable”, and he presents a series of extracts to illustrate this “correct” vs. “native-like” distinction. In the end, he calls for descriptive ELF research that could clarify this timely question. What does good written ELF look like?

WrELFA: a corpus of written ELF in academia

Late in the same year as Owen’s article was published, Anna Mauranen tasked me with starting compilation of the Corpus of Written English as a Lingua Franca in Academic Settings (WrELFA corpus), which she had been talking about for some time. The million-word ELFA corpus of spoken academic ELF interaction was completed in 2008, and a written companion was a natural development. I’ve been working on this project ever since, and with help from research assistant Jani Ahtiainen, this summer we reached 500,000 words of processed WrELFA text. At this halfway mark to our million-word goal, I thought I’d give an update on our progress. Keep reading…