Task force/Offline/UDC categorisation

en.WP uses a range of categories which are poorly organized and do not follow any standard knowledge classification scheme. This makes it harder to derive useful metadata from dumps.

Possibilities for adding categories

Parallel categorization within projects

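It may be possible to apply a standardized knowledge classification, such as UDC (Universal Decimal Classification), within the projects as a parallel categorization initiative alongside the existing categories.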

Dump-only

Working only with the available dumps, it may be possible to cross-reference en.WP categories to UDC classes in a post-processing step.
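
As a rough illustration of what such post-processing could look like, the sketch below pulls category links out of an article's wikitext and looks them up in a mapping table. The table CATEGORY_TO_UDC and the function udc_classes are hypothetical names invented for this example. The UDC class numbers shown (51 Mathematics, 53 Physics, 54 Chemistry) are real, but the choice of which en.WP categories map to them is an assumption, and a real table would have to be built and maintained by hand or by some other process.

```python
import re

# Hypothetical, hand-maintained table: en.WP category name -> UDC class number.
# The UDC classes themselves are real; the category assignments are assumed.
CATEGORY_TO_UDC = {
    "Mathematics": "51",
    "Physics": "53",
    "Chemistry": "54",
}

# Matches [[Category:Name]] and [[Category:Name|sort key]] links in wikitext.
CATEGORY_LINK = re.compile(
    r"\[\[\s*Category\s*:\s*([^\]|]+?)\s*(?:\|[^\]]*)?\]\]",
    re.IGNORECASE,
)


def udc_classes(wikitext):
    """Return the sorted UDC classes for the mapped categories in wikitext."""
    found = set()
    for name in CATEGORY_LINK.findall(wikitext):
        code = CATEGORY_TO_UDC.get(name)
        if code is not None:  # categories without a mapping are skipped
            found.add(code)
    return sorted(found)
```

Categories with no entry in the table are silently ignored here; a fuller tool would probably log them so that the mapping can be extended.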

  • User:Hippietrail is currently developing tools to parse certain en.WP infoboxes and extract their metadata as an extension of the MediaWiki dump DTD.
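
To give a sense of the kind of extraction involved, here is a minimal sketch that reads name/value pairs from a simple infobox in wikitext. It is not the tool mentioned above, whose design is not described here. The function name infobox_fields is hypothetical, and the sketch assumes a flat infobox: nested templates and piped links inside values would break it.

```python
import re


def infobox_fields(wikitext):
    """Extract name/value pairs from the first flat {{Infobox ...}} template."""
    start = re.search(r"\{\{\s*Infobox\b", wikitext, re.IGNORECASE)
    if start is None:
        return {}
    # Assumes no nested templates, so the first "}}" closes the infobox.
    end = wikitext.find("}}", start.end())
    if end == -1:
        return {}
    body = wikitext[start.end():end]
    fields = {}
    # Parameters are separated by "|"; the first chunk is the infobox type.
    for chunk in body.split("|")[1:]:
        name, sep, value = chunk.partition("=")
        if sep:  # ignore chunks that are not "name = value"
            fields[name.strip()] = value.strip()
    return fields
```

For example, calling infobox_fields on "{{Infobox element | name = Helium | symbol = He }}" yields a dictionary with the keys "name" and "symbol". The resulting pairs could then be written out alongside the page data in whatever extended dump format is chosen.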