Data and analysis

Hey John, thanks for all your hard work assembling data about people's contributions. There are two areas related to the community health task force that I'd like to learn more about: (1) why some new users survive their first few edits and some don't, and (2) why veteran users eventually leave.

On the first topic, I think it might be helpful to compare the first two or three groups of editors. What topics/areas do the different groups (1 edit, 10 edits, 100 edits) focus their time on, and is there a trend that certain areas make for an easier socialization process? Compare the first edit of the three groups -- is the one-off group making lousier contributions than the 10 group or the 100 group, or do they have worse luck with the community? What other data can we assemble about people's early edits, and what trends can we find about how to survive past the 10-edit threshold?

On the second topic, it's a bit more feasible to do more detailed surveys that target veterans who left Wikipedia. I think the most obvious reason for turnover is that they start devoting their time to other things. But we can dig deeper. Maybe we can ask where the bulk of their time was going by the end of their Wikipedia life, and how that changed from their favorite days at Wikipedia. We can also ask general questions about what, in their minds, made them leave. But we want to be careful that it doesn't become a soapbox for where they personally think Wikipedia should go. We'll have to ask good questions that look for underlying patterns, rather than specific viewpoints.

I know this is a lot of work, and I haven't the faintest idea how much you might be able to manage. At this point, I'm just throwing out some ideas.

Randomran 17:35, 29 October 2009

Hi Randomran. I'm a big fan of the ideas you're playing with. That said, I wonder how we could track users over time to assess their contributions--this strikes me as something that may be too resource-intensive (time, analysis) for task force members.

In the meantime, we've pulled together some information here, but there may be more things to explore.

Looking forward to seeing how this all progresses! I wonder if a wide-scale survey is feasible in our timeframe; are there other options to pursue, like drawing on the task force members' contacts and doing more informal surveying?

JohnF 20:25, 29 October 2009

Apologies--it looks like you've already seen the analysis we posted up. Looking forward to hearing your thoughts!

JohnF 20:31, 29 October 2009

Thanks John. I figured some of my ideas weren't very feasible. An informal survey could give us a starting point. I think a lot of us have some casual understanding of the problem, but my hope is to have something a bit more scientific. Instead of a long-term study (which would take a lot of time and analysis), maybe we could do a sample? Look at a random 20 one-shot users and examine their edits. Then compare them to a random 20 users who have contributed 10-100 edits, and see if their first edits were noticeably different in some way? I wouldn't know the first thing about finding those users or picking a random sample. But I know the analysis could be relatively quick and easy once we found our sample.
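Just to make the idea concrete -- this is only a hypothetical Python sketch against the MediaWiki API, and someone who actually knows the API would need to check it. It assumes we already have a pool of candidate usernames from a dump or user listing (the candidate_usernames placeholder below):

```python
# Hypothetical sketch, not a finished tool: bucket candidate usernames by
# edit count via the MediaWiki API, draw a random sample from each cohort,
# and pull each sampled user's very first edit.
import random
import requests

API = "https://en.wikipedia.org/w/api.php"

# Placeholder pool -- in practice this would come from a dump or user listing.
candidate_usernames = ["Example1", "Example2"]

def edit_counts(usernames):
    """Fetch edit counts, batching up to 50 names per API request."""
    counts = {}
    for i in range(0, len(usernames), 50):
        batch = usernames[i:i + 50]
        data = requests.get(API, params={
            "action": "query", "list": "users",
            "ususers": "|".join(batch), "usprop": "editcount",
            "format": "json",
        }).json()
        for user in data["query"]["users"]:
            counts[user["name"]] = user.get("editcount", 0)
    return counts

def first_edit(username):
    """Fetch a user's earliest contribution (oldest first, limit 1)."""
    data = requests.get(API, params={
        "action": "query", "list": "usercontribs",
        "ucuser": username, "ucdir": "newer", "uclimit": 1,
        "format": "json",
    }).json()
    contribs = data["query"]["usercontribs"]
    return contribs[0] if contribs else None

counts = edit_counts(candidate_usernames)
one_shot = [u for u, c in counts.items() if c == 1]
ten_to_hundred = [u for u, c in counts.items() if 10 <= c <= 100]

for user in (random.sample(one_shot, min(20, len(one_shot))) +
             random.sample(ten_to_hundred, min(20, len(ten_to_hundred)))):
    print(user, first_edit(user))
```

The edit-count buckets and the sample size of 20 are just the numbers from above; the real work would still be a human reading each first edit.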

Randomran 21:00, 29 October 2009

Interesting thoughts!

I wonder if this kind of analysis is an absolute necessity. We know that the new editors' edits are being reverted at high rates, which is the issue we're wrestling with here, but importantly (in my eyes), we know that the revert rate for every type of editor (experienced, inexperienced, etc.) is increasing. That, to me, suggests a broader trend of unfriendliness (as opposed to declining quality).
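To be clear about what I mean by "revert rate" (a rough, hypothetical Python sketch -- not how the posted numbers were actually produced): one common way to detect being reverted is to flag an edit whenever a later revision restores earlier page content byte-for-byte, which revision sha1 hashes make easy to spot.

```python
# Hypothetical sketch of identity-revert detection over one page's history.
# `revisions` is assumed to be the history, oldest first, with "sha1" and
# "user" per revision (prop=revisions, rvprop=sha1|user in the API).
def find_reverted(revisions):
    """Return indices of revisions undone by a later identity revert."""
    last_seen = {}    # sha1 -> index of the latest revision with that hash
    reverted = set()
    for i, rev in enumerate(revisions):
        h = rev["sha1"]
        if h in last_seen:
            # This revision restores earlier content, so everything strictly
            # between the restored revision and this one was reverted.
            reverted.update(range(last_seen[h] + 1, i))
        last_seen[h] = i
    return reverted

def revert_rate(revisions, username):
    """Share of a given editor's revisions on this page that were undone."""
    undone = find_reverted(revisions)
    mine = [i for i, r in enumerate(revisions) if r["user"] == username]
    return sum(1 for i in mine if i in undone) / len(mine) if mine else 0.0

# Tiny demo: NewUser's edit is wiped out by Bob restoring Alice's version.
history = [
    {"sha1": "aaa", "user": "Alice"},
    {"sha1": "bbb", "user": "NewUser"},
    {"sha1": "aaa", "user": "Bob"},   # identity revert back to "aaa"
]
print(revert_rate(history, "NewUser"))  # -> 1.0
```

This misses partial reverts, but it's cheap and objective, which is why rates like these can be computed for every class of editor.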

What do you think?

JohnF 21:08, 29 October 2009

It's hard to get to a specific solution without identifying the specific problem. I think that some of it is general unfriendliness. But we want to know if the hostility is concentrated in a particular topic area. It would also be helpful to know if people are being reverted under the guise of enforcing one particular policy, or if they're being reverted on the whim of a few jerks. And where are they most likely to be reverted -- featured articles, stub articles, popular articles, or what? And are these edits even reasonably useful? I think we want to work on how we treat new users and socialize them into Wikipedia's culture, but it's hard to know what we need to work on without more specifics. Otherwise it will be a lot of subjective arguments of "I think they hate what I hate", versus "no, I think they hate what *I* hate". It would genuinely be helpful to see what the 100-edit editors are doing differently than the 1-edit editors, so that we can figure out how to close the gap.

Mostly thinking out loud. We may be stuck with the data we have, and we can still do a lot with just that.

Randomran 21:33, 29 October 2009

Ah, I see your point. Knowing the malady better will help us tailor a better solution. You've got my attention, in particular, with where new editors are being reverted--stubs, long articles, and so on.

Do the other task force members have contacts/ideas along these lines?

JohnF 21:46, 29 October 2009

I'm not sure. You could ask them. Part of the problem of being a veteran is that a lot of your contacts end up being at least intermediate level, and it's hard to get more than a few anecdotes about the experience of new users. But if someone knew how to sort out some one-shot users from 2009, and some other 10-100 edit users from a similar time period, I'm confident I could bang out an analysis pretty quickly. You'd just need a random sample of 20, and a quick comparison of their edits. I've seen pages like this. But it would be much more interesting to look at new users who survived and compare them to new users who didn't. It would also be more interesting to look at editors with more than 100 (or even 1,000) edits who haven't edited recently, and look at their last slew of edits.
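Pulling a departed veteran's final edits, at least, is pretty mechanical. Here's a rough, hypothetical Python sketch against the standard MediaWiki API -- the username is just a placeholder:

```python
# Hypothetical sketch: pull a (presumed departed) veteran's last N edits so
# we can eyeball their final stretch of activity before they stopped.
import requests

API = "https://en.wikipedia.org/w/api.php"

def last_edits(username, n=20):
    """Fetch a user's most recent contributions (newest first by default)."""
    data = requests.get(API, params={
        "action": "query", "list": "usercontribs",
        "ucuser": username, "uclimit": n,
        "ucprop": "title|timestamp|comment",
        "format": "json",
    }).json()
    return data["query"]["usercontribs"]

for edit in last_edits("ExampleVeteran"):  # placeholder username
    print(edit["timestamp"], edit["title"], edit.get("comment", ""))
```

Comparing those last slews against the same editors' earlier activity is where the interesting patterns would show up.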

Randomran 21:55, 29 October 2009