Cricket Tales – scaling up

Work on Cricket Tales over the last few weeks has focused on scaling everything up to cope with the sheer amount of data involved. The numbers are big: we’re starting with the footage from 2013 as a test (a ‘smaller’ year), in which 145 cameras recorded a total of 438 days’ worth of video of cricket burrows. Our video processing robot is currently chopping this up into 211,889 sped-up one-minute clips and encoding each of them as WebM, Ogg and H.264 MP4 for maximum browser compatibility. It looks like this will take a few months in total; over the last week or so we have processed 8,000 videos.
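To make the chopping and encoding step concrete, here is a minimal sketch of how the command lines for one clip might be built with ffmpeg. It is not the actual robot: the use of ffmpeg, the function name `clip_commands`, the codec choices, the file naming and the 3x speed-up are all assumptions made for illustration. The function only builds the argument lists and does not run anything.

```python
from pathlib import Path

# Hypothetical encoder settings: the post names the three formats,
# not the tool or options used to produce them.
FORMATS = {
    "webm": ["-c:v", "libvpx"],
    "ogv": ["-c:v", "libtheora"],
    "mp4": ["-c:v", "libx264", "-pix_fmt", "yuv420p"],
}


def clip_commands(source, out_dir, clip_index, clip_seconds=60, speedup=3.0):
    """Return one ffmpeg argument list per output format for a single clip.

    speedup is an assumed value; each one-minute clip then covers
    clip_seconds * speedup seconds of the original recording.
    """
    source_seconds = clip_seconds * speedup
    start = clip_index * source_seconds
    commands = []
    for ext, codec_args in FORMATS.items():
        out_path = Path(out_dir) / f"{Path(source).stem}_{clip_index:05d}.{ext}"
        commands.append([
            "ffmpeg", "-y",
            "-ss", str(start),           # seek to the start of this clip
            "-t", str(source_seconds),   # read only this much source footage
            "-i", str(source),
            "-vf", f"setpts=PTS/{speedup}",  # speed the video up
            "-an",                       # drop any audio track
            *codec_args,
            str(out_path),
        ])
    return commands
```

Each returned list could be handed to `subprocess.run` by a worker, which is one way to keep encoding going in the background while clips that are already finished get served.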

[Image: vid1]

Now we have a framework in place that will support this quantity of data. For example, we can process video continuously while people are tagging, swapping clips in and out so that they never all need to fit on disk at once. The database contains an entry for every video with a status (currently one of ‘not encoded’, ‘ready’ or ‘complete’). A movie could be marked complete after some number of players have watched it, or once some correlation metric between their tags has been met; its files can then be deleted from the disk. In terms of feasibility, we had 68K people playing the camouflage citizen science games over the last year, so if we managed to attract that many people, each would need to view about three minutes of video (211,889 clips shared between 68,000 players) to get the entire dataset watched once.
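The status lifecycle and the feasibility sum can be sketched in a few lines of Python. This is an illustration under assumptions, not the real database code: the names `Video`, `record_view` and `clips_per_player` are hypothetical, as is the threshold of five views. Only the simpler view-count rule is shown, and the tag correlation metric is left out.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    NOT_ENCODED = "not encoded"
    READY = "ready"
    COMPLETE = "complete"


VIEWS_NEEDED = 5  # hypothetical threshold; the post only says "some number"


@dataclass
class Video:
    name: str
    status: Status = Status.NOT_ENCODED
    views: int = 0


def mark_encoded(video):
    """Move a video to 'ready' once its encoded files exist."""
    if video.status is Status.NOT_ENCODED:
        video.status = Status.READY


def record_view(video, views_needed=VIEWS_NEEDED):
    """Count one view of a ready video.

    Returns True when this view completes the video, meaning its
    files could now be deleted to free disk space.
    """
    if video.status is not Status.READY:
        return False
    video.views += 1
    if video.views >= views_needed:
        video.status = Status.COMPLETE
        return True
    return False


def clips_per_player(total_clips, players, passes=1):
    """Average number of one-minute clips each player must watch."""
    return total_clips * passes / players


# Figures from the post: 211,889 clips and 68K players, watched once.
print(round(clips_per_player(211_889, 68_000), 1))  # 3.1
```

Raising `passes` shows how quickly the load grows if every clip needs several independent viewers before it can be marked complete.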

[Image: map]

This is a fictional map of the crickets’ burrows, each displaying the name of the player who has contributed most to it so far. It is one idea for the ‘play’ element, which is the next big thing to consider. The challenge is balancing making it quick and fun against making people feel that they are contributing to something bigger, with a tangible result to show for their efforts. Do we aim for something that takes five minutes of someone’s time and try to get enough people to make that work, or do we aim for a smaller number of more dedicated players who keep coming back? We have loads of options at this point, and to a large extent the choice depends on precisely what the researchers need, so we need to do some more thinking together at this stage.
