Wikipedia:De kroeg/Archief/20191124


Wikimedia Research Showcase

Every third Wednesday of the month, a YouTube livestream presents the latest research from within the Wikimedia movement. The sessions are held in English.

This coming Wednesday's session is the last one of this year, at 18:30 Amsterdam time. Two topics will be presented:

  • Wikipedia Text Reuse: Within and Without, by Martin Potthast, Leipzig University
We study text reuse related to Wikipedia at scale by compiling the first corpus of text reuse cases within Wikipedia as well as without (i.e., reuse of Wikipedia text in a sample of the Common Crawl). To discover reuse beyond verbatim copy and paste, we employ state-of-the-art text reuse detection technology, scaling it for the first time to process the entire Wikipedia as part of a distributed retrieval pipeline. We further report on a pilot analysis of the 100 million reuse cases inside, and the 1.6 million reuse cases outside Wikipedia that we discovered. Text reuse inside Wikipedia gives rise to new tasks such as article template induction, fixing quality flaws, or complementing Wikipedia’s ontology. Text reuse outside Wikipedia yields a tangible metric for the emerging field of quantifying Wikipedia’s influence on the web. To foster future research into these tasks, and for reproducibility’s sake, the Wikipedia text reuse corpus and the retrieval pipeline are made freely available. Paper Demo
  • Characterizing Wikipedia Reader Demographics and Interests, by Isaac Johnson, Wikimedia Foundation
Building on two past surveys on the motivation and needs of Wikipedia readers (Why We Read Wikipedia; Why the World Reads Wikipedia), we examine the relationship between Wikipedia reader demographics and their interests and needs. Specifically, we run surveys in thirteen different languages that ask readers three questions about their motivation for reading Wikipedia (motivation, needs, and familiarity) and five questions about their demographics (age, gender, education, locale, and native language). We link these survey results with the respondents' reading sessions -- i.e. sequence of Wikipedia page views -- to gain a more fine-grained understanding of how a reader's context relates to their activity on Wikipedia. We find that readers have a diversity of backgrounds but that the high-level needs of readers do not correlate strongly with individual demographics. We also find, however, that there are relationships between demographics and specific topic interests that are consistent across many cultures and languages. This work provides insights into the reach of various Wikipedia language editions and the relationship between content or contributor gaps and reader gaps. See the meta page for more details.

Personally, I am very curious about the latter. Last year a study was already published on readers of various language editions and their motivation for reading Wikipedia. Why do all those readers come to Wikipedia, and what do they get from us? (Spoiler: Dutch readers mostly come to Wikipedia to check facts.) This Wednesday's research builds on that.

If you can't make it on Wednesday at 18:30, the sessions can always be watched afterwards as well. Ciell 17 Nov 2019 11:35 (CET)