Data warehouse for stats on content served

Proposal

Build an infrastructure (database + API) to better support community-generated reports on content served.

Example major beneficiary: GLAM

Input for the database: Domas's hourly page view dumps, extended with image view counts and possibly also geo info.
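To make the input concrete, here is a minimal sketch of reading hourly dump lines and summing views per page. It assumes the four-field, space-separated layout of the public hourly page view files (project code, page title, request count, bytes transferred); the names PageView, parse_line and total_views, and the sample lines, are invented for illustration and not part of any existing tool.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Optional


class PageView(NamedTuple):
    project: str     # project code, e.g. "en" (assumed field order)
    title: str       # page title as it appears in the dump
    views: int       # requests counted in that hour
    bytes_sent: int  # bytes transferred in that hour


def parse_line(line: str) -> Optional[PageView]:
    """Parse one dump line; return None if the line is malformed."""
    parts = line.split()
    if len(parts) != 4:
        return None
    project, title, views, bytes_sent = parts
    try:
        return PageView(project, title, int(views), int(bytes_sent))
    except ValueError:
        return None


def total_views(lines: Iterable[str]) -> Counter:
    """Sum view counts per (project, title), skipping malformed lines."""
    totals = Counter()
    for line in lines:
        record = parse_line(line)
        if record is not None:
            totals[(record.project, record.title)] += record.views
    return totals


# Made-up sample lines, standing in for two hourly files.
sample = [
    "en Main_Page 100 5000",
    "en Main_Page 50 2500",
    "de Hauptseite 30 900",
    "garbled line",
]
totals = total_views(sample)
print(totals[("en", "Main_Page")])
```

Image view counts or geo info, if added, would need extra fields or separate files; this sketch does not try to guess their layout.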

  • Focus for WMF: design the database and API, set up the database, get operations into place
  • Focus for community: build API, build reports

Needed

  • Position paper on infrastructure and governance (if possible before the German Hackathon in May)
  • API design
  • Mockup or small scale implementation (e.g. in Hadoop)
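As a rough idea of what a small-scale mockup might look like, the sketch below loads hourly counts into a table and exposes one report-style query. It uses an in-memory SQLite database purely as a stand-in (not Hadoop, and not a proposed design); the table name hourly_views, its columns, the functions build_warehouse and views_per_day, and all sample values are assumptions made for illustration.

```python
import sqlite3


def build_warehouse(rows):
    """Create an in-memory table of hourly view counts (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE hourly_views ("
        " hour TEXT, project TEXT, title TEXT, views INTEGER)"
    )
    conn.executemany("INSERT INTO hourly_views VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def views_per_day(conn, project, title):
    """Example API call: daily view totals for one page, oldest day first."""
    cursor = conn.execute(
        "SELECT substr(hour, 1, 10) AS day, SUM(views)"
        " FROM hourly_views"
        " WHERE project = ? AND title = ?"
        " GROUP BY day ORDER BY day",
        (project, title),
    )
    return cursor.fetchall()


# Made-up rows: (hour, project, title, views).
rows = [
    ("2011-03-01T13", "en", "Rijksmuseum", 10),
    ("2011-03-01T14", "en", "Rijksmuseum", 5),
    ("2011-03-02T09", "en", "Rijksmuseum", 7),
    ("2011-03-01T13", "nl", "Rijksmuseum", 20),
]
conn = build_warehouse(rows)
daily = views_per_day(conn, "en", "Rijksmuseum")
print(daily)
```

A community-built report would call something like views_per_day instead of reading the raw hourly files directly, which is the separation between database, API and reports that the proposal describes.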