Data-driven learning with WordSift

image source: John Allan

I recently came across a web resource that reminded me of using data-driven learning (DDL) with students. I have not used DDL for a few years, but I think that WordSift will allow instructors to use basic DDL techniques with their students.

What is DDL?

Data-driven learning is an approach in which learning is driven by research-like access to linguistic data (Johns, 1991). DDL examines a corpus, or body of text. WordSift can generate useful usage data from existing corpora (databases of text-based language created for a general or specific purpose), text snippets from websites, or documents generated by the students themselves.


WordSift is a user-friendly, multifunctional tool. Simply locate a digital text source, copy the text, then paste it into WordSift. To start the analysis, click the Sift button. After analyzing the text, WordSift returns its analysis in six formats.

  • A word cloud of the most frequent terms (excluding function words)
  • A Mark Word feature, in the same screen region as the word cloud, can be toggled on. Words from the lists below can be highlighted to show their frequency and usage within the original corpus.
    • Academic Word List (AWL)
    • General Service List (GSL)
    • New General Service List (NGSL)
    • Marzano & Pickering: Language Arts
    • Marzano & Pickering: Science
    • Marzano & Pickering: Math
    • Marzano & Pickering: Social Studies
  • A Visual Thesaurus® visualization based on the selected keyword in the word cloud
  • Links to images and videos related to the active word
  • Examples of selected words in context (in a sentence)
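To make the first two outputs more concrete, here is a minimal Python sketch of the underlying ideas: counting the most frequent terms while skipping function words, and marking which words in a text also appear on a word list. This is not how WordSift itself is implemented. The stop list is a short illustrative one, and `SAMPLE_WORD_LIST` is a hypothetical stand-in for a real list such as the AWL.

```python
import re
from collections import Counter

# Illustrative stop list only (an assumption, not WordSift's own list).
FUNCTION_WORDS = {
    "a", "an", "and", "the", "of", "to", "in", "on", "for", "is", "are",
    "was", "were", "it", "its", "that", "this", "with", "as", "by", "at",
    "from", "be", "has", "have", "will", "not",
}

# Hypothetical stand-in for a published word list such as the AWL.
SAMPLE_WORD_LIST = {"authority", "regulate", "licence", "approach"}


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def frequent_terms(text, top_n=10):
    """Return the top_n most frequent words, ignoring function words."""
    content_words = [w for w in tokenize(text) if w not in FUNCTION_WORDS]
    return Counter(content_words).most_common(top_n)


def mark_words(text, word_list):
    """Return, in alphabetical order, the words in the text found on word_list."""
    return sorted(set(tokenize(text)) & set(word_list))


if __name__ == "__main__":
    # Made-up sample text for demonstration purposes.
    sample = ("The city will regulate taxis. "
              "The city has the authority to regulate drivers.")
    print(frequent_terms(sample, top_n=2))
    print(mark_words(sample, SAMPLE_WORD_LIST))
```

A word cloud is essentially a visual rendering of the first result, with more frequent words drawn larger.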

With all of these results, an instructor may be wary of using this tool with students.  However, a structured worksheet can guide the learners through a self-paced activity leading to discovery.

Example of a student activity

  • Locate a digital text resource (e.g., News in Levels – London Kicks Out Uber)
  • Read the article with the students
  • On their devices or workstations, students open WordSift
  • Students open the text resource software or website in another browser tab
  • Students copy the relevant text
  • In WordSift, students paste the text into the text box
  • Students click on Sift!
  • The results appear as a word cloud
  • Students are charged with locating 8 to 10 words and identifying, for each word:
    • part of speech
    • a relevant collocate
    • a sample sentence
    • a definition, based on the sample sentence
    • a suitable image
  • Students can prepare their submission in Word or PowerPoint.
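For instructors who want to prepare an answer key for this activity, the "sample sentence" and "relevant collocate" steps can be approximated in a few lines of Python. The sketch below is only an illustration under simple assumptions: sentences are split naively on end punctuation, and a collocate is taken to be any word within a small window around the target word. The function names are my own and are not part of WordSift.

```python
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def sample_sentences(text, word):
    """Return every sentence that contains the word as a whole word."""
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    return [s for s in split_sentences(text) if pattern.search(s)]


def collocates(text, word, window=2):
    """Count the words found within `window` tokens of each occurrence of word.

    Simplification: the window ignores sentence boundaries.
    """
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    target = word.lower()
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == target:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            counts.update(left + right)
    return counts


if __name__ == "__main__":
    # Made-up text for demonstration; paste a real article here instead.
    article = ("Uber lost its licence in London. "
               "The licence was not renewed. Drivers were unhappy.")
    for sentence in sample_sentences(article, "licence"):
        print(sentence)
    print(collocates(article, "licence", window=1).most_common())
```

Students would still do the interesting part themselves: deciding which of the neighbouring words is a meaningful collocate and writing a definition from the sample sentence.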

Where can students locate texts?

Potential sources of texts for English language students are listed below in alphabetical order.

  • Breaking News English offers level-appropriate current events articles with a myriad of related activities.
  • Newsela provides level-appropriate articles on current news events with a writing prompt and a multiple-choice quiz.
  • News in Levels publishes news articles at three levels for language learners, accompanied by an audio listening option.
  • Project Gutenberg is a resource that offers over 54,000 free eBooks.


Data-driven learning is not likely to be included in your syllabus, but it may be something to explore. It could be used to expand upon the learning activities that you and your students use, or as a break from the normal routine. I have created a DDL and WordSift “how to” sheet for instructors. If you have experienced DDL or have thoughts on this, please comment below.


Breaking News English

Data-driven Learning “How To” Introductory Activity


News in Levels

News in Levels: London Kicks Out Uber – level 3

Project Gutenberg



Johns, Tim (1991). “Chapter 2: Should you be persuaded: Two examples of data-driven learning”. Classroom Concordancing. Birmingham: ELR.


Hi, I'm John Allan. I am an educator who works in the technology-enhanced language learning field. I create online learning opportunities and mentor instructors on the Avenue project. I have experience teaching ESL and EFL in Canada and the Middle East. I hold an MSc in Computer Assisted Language Learning, an M.Ed. in Distance Education, a TESL B.Ed., a B.Ed. (OCT), and a variety of TESL-relevant certifications from TESL Canada, TESL Ontario and the Ontario Ministry of Education. For more articles, learning objects, projects and blog links see


8 thoughts on “Data-driven learning with WordSift”

  1. Hi John, thanks for the blog. The example student activity is really helpful in seeing how WordSift can be used in the class.

    Also, I think the blog is a great intro to the webinar you will be presenting on Tutela coming up on November 22!

  2. Beth, a long time ago, I thought that DDL was going to be a staple when teaching language learners. It has not turned out that way. I am hoping that some instructors look deeper into DDL and try some activities with their students. I am looking forward to the webinar to see if there is any interest in this area.


  3. Hello John,
    I really like your lesson ideas using WordSift. In my June TESL Ontario webinar “Activities to Enhance Vocabulary Learning,” I briefly provided an overview of this tool for attendees to explore, including using it as a word map to teach students how specific vocabulary might have different meanings and uses. For example, the word ‘scale’ can be a noun and a verb. Its meaning will depend on context (weight, fish, wall) or discipline (architecture, sports, and so on). The great thing with a tool like this is that teachers can go basic or more advanced; students can explore one word, a sentence, or an entire text as you describe above.

  4. Thanks for writing this post on the TESL Ontario blog. Ahem. Just trying to counter erroneous usage of the noun “blog” into the corpus. Oh, I know. Descriptive / prescriptive. But I can try, can’t I, to keep folks using the terms in the original way? This whole thing is a web log and John has written a post, folks. 🙂
    I’m always happy when I see posts about the use of corpora in classrooms. I am fascinated by them. But I do find it difficult to get students into it. Right now I’m using Skell Easy Corpus due to the simple interface, but students still look at me like I have three heads when I try to show them the benefits of checking corpora to learn about usage and frequency.

    They find it overwhelming, I think. And I find the fractal-like branching graphic representation, as on WordSift, to be overwhelming. Maybe that is why corpus use in classes is not catching on very quickly.

  5. Kelly, thanks for the tip on Skell Easy Corpus. It looks like a good, quick tool for simple DDL activities. I thought, 120 years ago, that DDL would be a standard part of language instruction, but I also thought we would be driving flying cars. 🙂 My aim is to bring awareness to the TESL Ontario community of basic DDL through this blog post and a webinar. If there is a chance to use DDL and my students can benefit, I introduce this activity to them. Some students benefit while others are indifferent. The Skell EC is a faster option with fewer distractions; I may use it next time.


