Improve quality content/Opportunities to expand content - topical

Important trends

There are several important trends that are changing the ways that topical content is created and consumed. These include:

  • Social media tools providing new ways for people to share real-time information and news
  • The rise of citizen and participatory journalism and hyper-local reporting


Thus, the topical end of the content landscape is changing in important ways. Users are gaining more control over the creation and consumption of news and other local information, and a new media ecosystem is emerging that combines news, data, and direct opinion. All of this may be happening at the expense of traditional media outlets, which appear to be losing their dominant position in an increasingly fragmented arena. In a recent blog post, Jose Antonio Vargas (the Technology and Innovations editor for The Huffington Post) explained it this way: "Yes, news is still about accuracy and fairness, about shoe-leather, on-the-ground reporting, about facts speaking for themselves. But news in this social networking, here-comes-everybody era is also about connection, conversation and community. News is connection: you, the reader, can connect with the content on the screen, be it text, audio or video, or a mash-up of all three. News is conversation: an article is not the final product but merely the beginning of a conversation -- and you, the reader, can add to it, question it, pass it around within your own online social networks. News is community: when you visit a news site, you must feel like you're a part of that digital news hub." [1]


Author Steven Berlin Johnson offered his take on this changing landscape in a recent speech he gave at the South by Southwest Interactive Festival. As he explained, this new media ecosystem will result in "more content, not less; more information, more analysis, more precision, (and) a wider range of niches covered." His entire speech, as well as a very interesting framework for how he thinks about this ecosystem, can be found here.


To provide a few interesting examples that highlight these changes:

  • To date, there have been 367K CNN iReports worldwide
  • Facebook now has 300M active members worldwide
  • Over 25M tweets are sent per day (with a total of 5 billion sent since the site launched)
  • 77% of active Internet users read blogs, and there are an average of 900K blog posts per day


Where does Wikimedia stand now?

Until recently, most traditional news organizations would have considered Wikimedia an unlikely competitor. Now, however, it has become impossible to ignore Wikipedia's popularity when it comes to content that has historically been the domain of professional journalists and media companies.


As Matt Thompson, a journalist and researcher into next-generation news sites, explained in an interview (link to interview notes), "Wikipedia is probably the first and in some cases the only place that folks can think of to go when they want a background to a news story". As examples of the progression, he cited two New York Times stories in 2007 that noted Wikipedia's effectiveness and prominence when it came to covering breaking news stories (such as the Virginia Tech massacre), as well as a 2008 statistic that "for 4 out of the top 5 stories for the year on Google, blogs outranked the New York Times, but the really surprising thing was that Wikipedia trounced them both." An even more recent example can be seen in the news surrounding Michael Jackson's death. In the four weeks after the event, Google News (7.1%) and Wikipedia (6.8%) received the greatest share of clicks from related searches. CNN, the traditional media outlet with the greatest share of clicks, came in 10th place with 1.5% (add link to source).


In fact, Wikipedia has become such a popular source of news and localized information that it has essentially rendered Wikinews unnecessary. In the words of a recent New York Times article, "So indistinct has the line between past and present become that Wikipedia has all but strangled one of its sister projects - Wikinews." [2]


This fact package will attempt to dig deeper into Wikimedia (and Wikipedia in particular) as a source of topical content, as well as to identify areas where there might be room to expand the amount and types of information provided in this area. In particular, it will dig into Wikimedia's current position relative to:

  • Content breadth
  • Community support for localization
  • Credibility


Analysis of the "Wikipedia model of news"

In order to understand the strength of the "Wikipedia model of news", it helps to call out a couple of important aspects of the traditional news model (at least when it comes to online content). Under the traditional news model, a small number of editors work to publish discrete stories that are meant to be relevant in the moment and then replaced with updated stories as events unfold. As Matt Thompson sums it up, "They live for the moment and then go away." In the online space, however, this results in a series of different stories posted at different URLs. As the news cycle rolls on, the number of URLs continues to grow, and no one article becomes authoritative enough to acquire a high search ranking.


Contrast this with Wikipedia articles, where every topic has a single, comprehensive, and consistent topic page, and at any given time a group of editors is collaborating to pull in or link to all relevant information. This structure ends up giving Wikipedia not only a search rank advantage, but also a content advantage over traditional news sources. Two quotes help to illustrate this point:


"The key to Wikipedia’s success is that its pages are designed to catch traffic, provide key information and then send users on their way to deeper engagement on the subjects they’re interested in" (Neiman Lab) [3]


"Today, in online news, publishers frequently publish several articles on the same topic, each at their own URL. The result is parallel Web pages that compete against each other in terms of authority and placement in links and search results. Consider instead how the authoritativeness of news articles might grow if an evolving story were published under a permanent, single URL as a living, changing, updating entity. We see this practice today in Wikipedia’s entries and in the topic pages at NYTimes.com. The result is a single authoritative page with a consistent reference point that gains clout and a following of users over time” (Marissa Mayer, Google VP, in a recent testimony to Congress).[4]


Traditional news organizations have taken notice of Wikipedia's success, and there are signs that they are adjusting their own strategies (here is an article about how the AP is attempting to better position itself to compete with Wikipedia). The best example of success so far is the New York Times, which has implemented its own version of topic pages. “I absolutely think that news organizations are paying a lot of attention to Wikipedia page rank and wondering how they can get that," explained Matt Thompson. "The notion of topic pages has become very current in the world of news organizations; they are all mulling or implementing topic page strategies.”

Analysis of topical content breadth

At this point, there is no way of knowing how much topical content is available on Wikipedia. What we do know, however, is that topical content is popular with users of many language projects (~60% of page hits for the top 100 pages in enWikipedia are in pop culture or current events, and the statistics are similar for other languages). Some additional context might also be helpful here. It is striking that 80% of page hits for the top 100 most popular pages on the Japanese Wikipedia are related to pop culture. What is even more striking, however, is that this content is very specific to Japanese culture, and to people interested in specific music groups or anime characters. The Hindi Wikipedia is another good example. On the surface, it looks like users of that project might not be as interested in topical information. It is interesting to note, however, that most of the articles being accessed in Culture and the Arts, Geography, History and Politics are specific to India, and often to specific Indian cities or states.


In order to understand the popularity of topical content on Wikipedia, it may help to think about what information people probably look for frequently, but haven't traditionally been able to find through easily accessible information sources. As Steven Berlin Johnson points out, "I adore the City section of the New York Times, but every Sunday when I pick it up, there are only three or four stories in the whole section that I find interesting or relevant to my life – out of probably twenty stories total. And yet every week in my neighborhood there are easily twenty stories that I would be interested in reading: a mugging three blocks from my house; a new deli opening; a house sale; the baseball team at my kid’s school winning a big game. The New York Times can’t cover those things in a print paper not because of some journalistic failing on their part, but rather because the economics are all wrong: there are only a few thousand people potentially interested in those news events, in a city of 8 million people. There are metro area stories that matter to everyone in a city: mayoral races, school cuts, big snowstorms. But most of what we care about in our local experience lives in the long tail" [5]


Thus, by providing easier access to the long tail of local and current information (and compiling all relevant information into one place), Wikipedia's digital platform and mass collaboration model have enabled users to create and consume topical content in new and expanding ways. This is an especially powerful model for localized Wikipedia projects that have access to a wide enough base of relevant online content (both in terms of language and the focus of the localization) for the community to compile, link to, and reference. Conversely, the model may be less viable where it would potentially be valued most: places that do not even have that foundational content to build on (an example here would be Sub-Saharan Africa, where there is a notable lack of online content in local languages such as Swahili).


In order to understand whether a topical content model is viable for Wikimedia, it would therefore be helpful to understand more about both the local content that is already available on Wikipedia and the landscape of such information that is more broadly available online. This could help identify high-potential local Wikipedia projects, as well as barriers that would need to be addressed if Wikimedia wants to be successful in other localized areas.

Analysis of community support for localization

If a base of local content is a prerequisite for the start of a viable topical content model, then so is an active local community that is large enough to effectively curate, improve, and maintain that content in line with Wikipedia processes and standards. There are already concerns that the Wikipedia community (for the English project, at least) lacks diversity, and that this lack of diversity both skews and limits content growth. As Wikipedia scales down to more local projects, these concerns become even more pronounced. As Matt Thompson explains, "I am concerned that at the local level, you find that who the editors are starts to take on more prominence. The community that forms around the project has more visible markings, and current editing structures start to break down."


There are already some signs that the community is interested in heading down the topical content path. The strategic planning process Call for Proposals has resulted in several proposals for Wikipedias geared towards more localized languages or even specific geographies. And local WikiProjects already exist within the broader language-based Wikipedias. One example of this is the WikiProject for Columbia, Missouri, which seeks to facilitate collaboration between Wikipedians interested in contributing to a series of articles about the city. It includes project announcements, a list of project participants, and a way to track the status of current articles and direct participants to high-priority articles and article quality gaps. (Please see the talk page for more examples of local WikiProjects.)


For a similar model to succeed at any kind of scale, however, Wikimedia would need to find ways to generate and sustain community interest and growth at a more localized level. On the one hand, this involves raising awareness about smaller communities and building community features to make it easier for like-minded contributors to find each other and work together. On the other hand, it requires expanding the contributor pool by making it easier for users who discover a relevant local community to join the contributor ranks (for example, through easier editing technology).


Analysis of Wikimedia as a credible source for topical content

Knowing that article quality and quality assurance are key issues when it comes to Wikipedia and core reference content, it may be worthwhile to explore whether or not there are any critical differences when it comes to breaking news and other local information.


There is not much information currently available on this topic, but there are two points that may make for a good place to start:

  • People who come to Wikipedia for topical content may be choosing to do so instead of visiting the site of a traditional news organization
  • People who come to Wikipedia for topical content are likely to do so as that content is being created or updated


The first point is relevant because traditional news organizations still benefit from a certain level of authority. As Matt Thompson pointed out, "Journalists perceive credibility as a Wikipedia weakness, and in general the public sees news organizations as having higher credibility. There is still a notion of really strong, really trusted individual brands that carry authority and engender trust." Wikipedia's popularity with this type of content suggests that credibility is not a significant barrier to use. At the same time, Thompson's point may suggest that there is not much room for error.


This brings up a connection to the second point. Even with low vandalism rates and a quick average revert time, it seems logical to assume that articles about breaking news are more likely to attract vandalism, and that more people are likely to see that vandalism in the short time that it persists. Thus, it may be true that the eventualist approach to quality, and Wikipedia's current approach to quality assurance, are not entirely sufficient for all types of topical information.


While we know that traditional media organizations have an authority advantage, and may receive additional "benefit of the doubt", we also know that they too are struggling to define quality control in a world of online news. As an American Journalism Review article entitled "The Quality-Control Quandary" asks, "How far can you cut editing without crippling credibility? How do you balance immediacy with accuracy? How much does fine-tuning matter to the work-in-progress online ethos?" [6]. And this is where Wikipedia might have an advantage: transparency. Even if there is vandalism, editorial debate, or correction of mistakes, readers should theoretically have access to this information through edit histories and talk pages (which replicate many of the processes that go on in the newsroom, but that are usually only accessible to people involved in those activities). The problem, though, is that if you aren't familiar with Wikipedia's culture and structures, a talk page is very difficult to understand. And if a reader can't understand what is going on, they can't benefit from that transparency or even contemplate joining the conversation.


Thus, there may be some benefit in further exploring these issues. Is there a way for Wikipedia to make the editing process more transparent to all users in the moment? How could users be provided with, and contribute, real-time quality feedback (e.g. user ratings) that enables the community as a whole to quickly and easily judge an article and determine where there are likely to be issues as an event unfolds and its related article continues to evolve?

Potential areas for expansion

Given Wikipedia's current position and unique strengths, there appear to be several different approaches that could be taken to expanding the breadth of topical content, including:

  • Increase opportunities for users to create and access topical content
    • Additional smaller, more localized language projects (e.g. Canadian French)?
    • Projects with a geographic or topic-specific focus?
    • Supporting other initiatives that expand the amount of content available online in local languages?
  • Improve Wikipedia's ability to support smaller, more localized communities
    • Technology
    • Other community features and calls-to-action
  • Expand to include different types of topical content (e.g. opinion)
    • Focus on projects where topical content is already popular?

Notes

  1. [1]
  2. The New York Times, "All the News That's Fit to Print Out" [2]
  3. [3]
  4. [4]
  5. [5]
  6. Carl Sessions Stepp, "The Quality-Control Quandary", American Journalism Review [6]