Wednesday, July 8, 2015

New data!

The LHC recently started Run 2, and this week high-intensity beams are delivering a lot of data. This is a very exciting time to be a physicist, and it is what I have been waiting for over the past two years. In fact, my contract has been extended to make the best use of this new data. No amount of working with simulation or theory can compare to working with real data. This job has plenty of advantages and disadvantages, but right now it is extremely motivating and enjoyable.

What the first 2015 data looked like at CMS.

Something I've wanted to do with new data is automatically analyse them as they arrive, and now I can do that for the first time. (I had tried when I worked on ATLAS, but due to time and technology constraints it was not possible.) Now that we have a globally distributed computing system where any computer can access any file at any time, I can stream the latest data each night and run the analysis before I even wake up. So that's what I'm doing right now, and it's one of the most fun projects I've ever developed! I've spent the past couple of years preparing and tweaking the software so that I'd have a push-button system in place by now, and it's worked very well so far. There are going to be many teething problems, of course, but they're minor compared to the labour that will be saved. My first large test job is currently running, analysing 300 events per second at its peak (100 events per second once I/O is taken into account). Following on from a previous post about how my personal computing resources are not sufficient for all the work I need to do, I've contacted IT support and been given a generous allowance of disk space, as well as being able to run my CPU- and I/O-intensive jobs overnight. This is going to make a huge difference, and has allowed me to support my students with the simulations and datasets that I analyse. Seeing all this come together is a wonderful experience.

This is how excited physicists get when we first see beams in the machine! I was a very proud Shift Leader that day.

On the one hand this is, of course, very rewarding, but on the other hand it is a little disappointing that this isn't already being done. There seems to be a culture in particle physics that real work is about manually submitting jobs and analysing data, and there are politics associated with the notion that people should take turns in carrying the burden. We have the technology and expertise to run an analysis automatically, so why not do that? That's what we would see in the private sector. We have access to some of the largest datasets the world has ever seen and some of the most powerful computing resources ever created; it would be a crime not to exploit those as much as possible. I get the impression that I'm the only person doing this kind of work, and if that's the case then it's certainly time to move on to a different workplace. But first, I have data to analyse! I love data.

