As I’ve earlier argued on this blog, sometimes the claims of “fluidity”, “diversity”, and “innovation” found in English as a lingua franca (ELF) research are overstated. It’s so diverse that even ordinary diversity won’t do – it’s “super-diversity” now. It could very well be ultra-mega-diversity-squared, but the question of the prominence of these presumably innovative features is a quantitative one. More specifically, it’s a question of how frequently any variant forms might occur in naturally occurring ELF interaction, relative to the conventional forms. One of my shameless nerd hobbies is writing little Python programs to query corpora, and several of these mini-studies have appeared on this blog. I especially enjoy working with the VOICE corpus, which is great because 1) it contains a million words of unelicited ELF interaction; 2) it’s ready for processing as well-formed XML; and 3) it has been meticulously part-of-speech (POS) tagged for both the form and function of each word in the corpus.
The value of this double form-function tag is that it reveals every token in the corpus where a word like fluently, which is formally recognisable as an adverb, functions in a different way, like as an adjective: i think you are very fluently in english. This example of fluently from VOICE has a form tag of RB (adverb), but a function tag of JJ (adjective) to reflect that fluently seems to be serving in an adjectival function. This kind of form-function variation in ELF is presumably prominent enough that it necessitates this double tagging to adequately describe the fluidity. The VOICE team was kind enough to carry out this formidable task involving manual inspection of all million words. Now that this resource is in place (and freely available), the instances of these form-function mismatches can be easily found, counted, and viewed in context.
I’ve wondered for some time how often these variant form-function tokens occur overall, in relation to their conventional forms. My interest was renewed by the recent paper by VOICE project researcher Ruth Osimk-Teasdale in the Journal of English as a Lingua Franca. One of the main workers on the VOICE POS-tagging project, she investigates word class shifts in VOICE. She narrows her data to double form-function tags that reflect a shift of category across word classes (like from adverb to adjective). These inter-categorical word class shifts therefore exclude variations within a word class, like singular nouns which are treated as plural. She focuses on items like fluently above, where word class conversion occurs without any change to the form of the word itself.
Assigning these form-function tags – and the analysis of them – are directly linked to the fluidity of ELF: Keep reading…