Proposal:Interwikis and categories outside article code

Status (see valid statuses)

The status of this proposal is:
Request for Discussion / Sign-Ups

Every proposal should be tied to one of the strategic priorities below.


  1. Achieve continued growth in readership
  2. Focus on quality content
  3. Increase Participation
  4. Stabilize and improve the infrastructure
  5. Encourage Innovation


A feature request or bug related to this proposal has been submitted to Bugzilla under ID 167.

See Category:Proposals with Bugzilla submissions for all submitted bugs.

Summary

The interwiki and category code should be moved out of the article code.


Proposal

I propose that interwiki links and categories be placed in distinct boxes, separate from the article content.

The advantages of doing so (usability, server load, etc.) are detailed in the #Motivation section.

Statistics and figures were taken from the French Wikipedia (the third largest) in mid-July 2009. These should

Motivation

Reducing server load

Automated interwiki editing accounts for roughly 9% of all edits in the main namespace. That is 1,600 edits per day on frwiki alone.

Note that there are also a number of manual interwiki edits and category edits, which cannot be counted automatically because they do not share a common edit summary.

Each of these edits is treated as a normal edit, which means the whole page is needlessly processed by MediaWiki's parser (updating the tables for internal links, external links, categories and interwikis; saving the old version of the page; passing through AbuseFilter and FlaggedRevs where enabled; etc.). If interwikis and categories (IW/cat) were in a dedicated box:

  • Only a small part of that processing would be needed when editing IW/cat, plus a bit of history management. Conversely, that part would not have to be processed during a "normal" edit (load gain).
  • Page histories would be lighter, as IW/cat would have their own history (space and usability gain).
  • All automated IW/cat edits could go through the API. It is currently possible to read an article's IW/cat through the API, but a script still has to fetch the whole article, since it must submit a modified version of that article afterwards anyway (bandwidth and load gain).
  • With some tweaks, it should be possible to avoid invalidating the page cache when an IW/cat edit occurs; this needs confirmation from someone technical (load gain).
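As an illustration of the read side mentioned above, here is a minimal Python sketch of how a script could ask the MediaWiki API for only the interwikis and categories of a page, using `action=query` with `prop=langlinks|categories`. The function names and the sample response are assumptions made for this example (the sample is hand-written in the API's classic JSON layout, not real data); no write call is shown, since a dedicated IW/cat write API is exactly what this proposal asks for.

```python
from urllib.parse import urlencode

# frwiki API endpoint (the wiki the statistics above come from).
API_ENDPOINT = "https://fr.wikipedia.org/w/api.php"


def build_metadata_query(title: str) -> str:
    """Build a read-only query URL that requests only IW/cat for one page."""
    params = {
        "action": "query",
        "prop": "langlinks|categories",  # interwikis and categories only
        "titles": title,
        "lllimit": "max",
        "cllimit": "max",
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


def extract_metadata(response: dict) -> dict:
    """Map each page title to its interwikis and categories."""
    result = {}
    for page in response["query"]["pages"].values():
        result[page["title"]] = {
            # In the classic JSON format the link target is stored under "*".
            "interwikis": {ll["lang"]: ll["*"] for ll in page.get("langlinks", [])},
            "categories": [cat["title"] for cat in page.get("categories", [])],
        }
    return result


# Hand-written sample response (an assumption for illustration, not real data).
SAMPLE_RESPONSE = {
    "query": {
        "pages": {
            "1234": {
                "pageid": 1234,
                "ns": 0,
                "title": "Paris",
                "langlinks": [
                    {"lang": "en", "*": "Paris"},
                    {"lang": "de", "*": "Paris"},
                ],
                "categories": [{"ns": 14, "title": "Catégorie:Paris"}],
            }
        }
    }
}

if __name__ == "__main__":
    print(build_metadata_query("Paris"))
    print(extract_metadata(SAMPLE_RESPONSE))
```

The response here is a few hundred bytes regardless of the article's length, which is the bandwidth gain the third point refers to. Today the same script would still have to download and resubmit the full article text to change a single interwiki.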

Improving usability

  • Putting these elements in a dedicated box makes things less confusing for inexperienced users: anyone who has done some article patrolling will have seen newcomers remove IW/cat code.

Key Questions

  • What exact percentage of edits would it affect?

Potential Costs

Financial

I estimate this proposal would need two weeks of work (for one person) for the programming itself, followed by several(?) more weeks of testing. Estimating generously, I suppose this translates to three months of one developer's salary.


Programming side

  • Two new tables are needed (one for the history of category changes, one for the history of interwiki changes).
  • This requires parser, database and interface changes. The parser is especially tricky, and any bug there would have serious consequences.
  • No direct change is needed for storing the current version of the interwikis, but the IW/cat code will have to be progressively removed from the "current article content" table (temporary additional load).
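
To make the two history tables more concrete, here is a rough sketch of what they might look like, written in Python with an in-memory SQLite database so it can be run as-is. Every table name, column name and helper below is hypothetical and chosen for this example only; none of it is part of MediaWiki's actual schema, and a real implementation would have to follow MediaWiki's own database conventions.

```python
import sqlite3

# Hypothetical schema: one row per interwiki or category change.
SCHEMA = """
CREATE TABLE interwiki_history (
    iwh_id        INTEGER PRIMARY KEY,
    iwh_page      INTEGER NOT NULL,   -- id of the article concerned
    iwh_lang      TEXT    NOT NULL,   -- language prefix, e.g. 'en'
    iwh_title     TEXT    NOT NULL,   -- target title on the other wiki
    iwh_action    TEXT    NOT NULL CHECK (iwh_action IN ('add', 'remove')),
    iwh_user      TEXT    NOT NULL,
    iwh_timestamp TEXT    NOT NULL
);
CREATE TABLE category_history (
    ch_id         INTEGER PRIMARY KEY,
    ch_page       INTEGER NOT NULL,
    ch_category   TEXT    NOT NULL,
    ch_action     TEXT    NOT NULL CHECK (ch_action IN ('add', 'remove')),
    ch_user       TEXT    NOT NULL,
    ch_timestamp  TEXT    NOT NULL
);
"""


def open_history_db() -> sqlite3.Connection:
    """Create an in-memory database holding the two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def record_interwiki_change(conn, page_id, lang, title, action, user, timestamp):
    """Log one interwiki change without touching any article text."""
    conn.execute(
        "INSERT INTO interwiki_history "
        "(iwh_page, iwh_lang, iwh_title, iwh_action, iwh_user, iwh_timestamp) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (page_id, lang, title, action, user, timestamp),
    )


def interwiki_log(conn, page_id):
    """Return the interwiki history of one page, oldest change first."""
    return conn.execute(
        "SELECT iwh_timestamp, iwh_action, iwh_lang, iwh_title "
        "FROM interwiki_history WHERE iwh_page = ? "
        "ORDER BY iwh_timestamp, iwh_id",
        (page_id,),
    ).fetchall()


if __name__ == "__main__":
    db = open_history_db()
    record_interwiki_change(db, 42, "en", "Paris", "add", "ExampleBot", "20090715120000")
    print(interwiki_log(db, 42))
```

Storing one row per added or removed link, rather than a full snapshot per change, is just one possible design; it keeps each bot edit down to a single small insert, in line with the load argument in the Motivation section.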

References



Community Discussion

Do you have a thought about this proposal? A suggestion? Discuss this proposal by going to Proposal talk:Interwikis and categories outside article code.

Want to work on this proposal?

  1. .. Sign your name here!