Interview: Data processing for particle physics at Cern

It is just over a year since Cern, home to the Large Hadron Collider (LHC), became the base for the three-year pilot phase of the Open Quantum Institute (OQI). The OQI is a multi-stakeholder, global, science diplomacy-driven initiative, whose main objectives include the provision of quantum computing access for all and the acceleration of applications for humanity.

Speaking to Computer Weekly in June at the inaugural Quantum Datacentre Alliance forum, held at Battersea power station in London, Archana Sharma, senior advisor for relations with international organisations and a principal scientist at Cern, described the OQI as “an evaluation of where we are in terms of quantum computing, quantum networks, quantum computers” that “allows us to somehow take stock of what is happening at Cern”.

“Cern’s mission is particle physics,” she says. “We can’t just close particle physics and get started on quantum computers.”

But Sharma believes there are potential synergies between the development of quantum technologies and the research taking place at Cern. Particles in the accelerators are sped up by various forces, she says. “All the processes that are happening while the acceleration is going on are very much quantum mechanics.”

Moreover, quantum mechanics is the magic that enables the particle accelerator’s various detectors to collect the results from the experiments run by the scientists at Cern.

And there are vast amounts of data being produced by these experiments. In fact, technology developed to support particle physics experiments at Cern, called White Rabbit, is set to be applied in the pursuit of error correction in quantum computing. White Rabbit is an open source precision timing system boasting sub-nanosecond accuracy, which is distributed via Ethernet.

UK-based quantum networking technologies firm Nu Quantum recently joined Cern’s White Rabbit Collaboration. The technology from Cern offers Nu Quantum a way to deliver synchronisation at the necessary level to scale quantum computing networks.
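White Rabbit builds on Ethernet-based precision time protocols, and the arithmetic at the heart of two-way time transfer is simple enough to show in a few lines. The sketch below, in Python, uses invented timestamps and is only a generic illustration of the standard offset-and-delay calculation, not White Rabbit’s implementation, which relies on hardware timestamping and phase measurement to reach sub-nanosecond accuracy.

def offset_and_delay(t1, t2, t3, t4):
    # t1: sync message sent by the master clock, t2: sync received by the follower,
    # t3: delay request sent by the follower, t4: request received by the master.
    offset = ((t2 - t1) - (t4 - t3)) / 2  # follower clock error relative to the master
    delay = ((t2 - t1) + (t4 - t3)) / 2   # estimated one-way link delay
    return offset, delay

# Example exchange with invented timestamps, in nanoseconds.
offset, delay = offset_and_delay(t1=1_000, t2=1_650, t3=2_000, t4=2_350)
print(f"clock offset: {offset} ns, link delay: {delay} ns")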

Computing in pursuit of particle physics

The web came out of an idea from Tim Berners-Lee when he was at Cern, and today the home of the LHC maintains several GitHub repositories and has developed a number of open source platforms in its pursuit of advancing particle physics research.

Computing is one of the three pillars of Cern. “The first [pillar] is research,” says Sharma. “The second one is the infrastructure, which means the accelerators, the experiments and the detectors. And then there is computing.”

Sharma says Cern has been evolving its computing centre capabilities to meet the demands of the infrastructure required by the experiments.

“We need to ensure that we are looking at good data and recording good data,” she says, which means Cern has to whittle data from 40 million collisions per second down to about 1,000 initially, and then to 100.

This processing needs to occur extremely quickly, before the next collision in the particle accelerator is detected. She says the processing time is around 2.5 milliseconds.
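As a rough picture of that two-stage filtering, the Python sketch below applies a fast first-level cut followed by a tighter second selection. The event model, thresholds and resulting rates are invented for the example; this is not Cern’s trigger code.

import random

# Toy two-stage event filter, loosely mirroring the rate reduction Sharma
# describes: a large stream of collisions cut to roughly a thousand by a fast
# first-level selection, then to around a hundred by a slower stage.
def level_one(event):
    # Fast, coarse cut on a single quantity.
    return event["energy"] > 99.9

def high_level(event):
    # Tighter selection applied only to level-one survivors.
    return event["energy"] > 99.99

events = [{"energy": random.uniform(0.0, 100.0)} for _ in range(1_000_000)]

first_pass = [e for e in events if level_one(e)]
selected = [e for e in first_pass if high_level(e)]

print(f"simulated {len(events)} collisions, "
      f"first stage kept {len(first_pass)}, second stage kept {len(selected)}")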

The sensors, to use Cern terminology, are “channels”, and there are 100,000 of these channels to process per experiment. Cern relies on pattern recognition and machine learning to help process the vast datasets produced during experiments and to create simulation models, as Sharma explains: “That’s the biggest tool we have. We have run a lot of simulations to produce models that tell us how each collision will be read out.”
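One way to picture how simulation-derived models help interpret the readout is a simple template comparison, sketched below. The channel count is scaled down from the real 100,000 and the templates and observed readout are randomly generated, so this is an illustration of the idea rather than Cern’s pattern-recognition software.

import numpy as np

# Hypothetical illustration of matching an observed readout against
# simulation-derived templates, in the spirit of models that predict how
# each collision will be read out. All values here are random stand-ins.
rng = np.random.default_rng(0)
n_channels = 1_000  # stand-in for the ~100,000 channels per experiment

# Simulated "templates": expected readout patterns for a few event classes.
templates = {
    "class_a": rng.normal(0.0, 1.0, n_channels),
    "class_b": rng.normal(0.5, 1.0, n_channels),
    "class_c": rng.normal(-0.5, 1.0, n_channels),
}

# An observed readout (here just noise around one of the templates).
observed = templates["class_b"] + rng.normal(0.0, 0.2, n_channels)

# Pick the template closest to the observation (least-squares distance).
best = min(templates, key=lambda name: np.sum((observed - templates[name]) ** 2))
print(f"observed readout best matches simulated template: {best}")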

Archana Sharma, Cern, at the Quantum Datacentre Alliance forum (photo credit: Hermione Hodgson)

“We need to ensure that we are looking at good data and recording good data”

Archana Sharma, Cern

In effect, the models and the simulations enable Cern to streamline the collection of trigger data: collisions identified as tiny electrical signals from the sensors across the 100,000 channels that need processing during an experiment.

The trigger data is used for reconstruction, where the measurements of energy from the sensors are summed up. The reconstruction is effectively a simulation of the experiment using the observed data. In enterprise IT, such a setup might be considered an example of a digital twin. However, Sharma says Cern’s simulation comes close but cannot fully be classed as a digital twin.

“We are not exactly a digital twin because the software for the physics is probabilistic. We try to be as close as possible,” she says.
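The summing at the heart of that reconstruction step can be sketched very simply, as below. The hit records and their grouping by event are assumptions made for illustration, not Cern’s reconstruction software.

from collections import defaultdict

# Rough sketch of reconstruction as described above: summing the energy
# measurements recorded by individual sensor channels into event-level totals.
hits = [
    # (event id, channel id, energy deposit in arbitrary units) - invented values
    (1, 17, 0.8),
    (1, 342, 1.3),
    (1, 98_001, 0.4),
    (2, 17, 2.1),
    (2, 512, 0.6),
]

event_energy = defaultdict(float)
for event_id, channel_id, energy in hits:
    event_energy[event_id] += energy  # sum deposits across all channels

for event_id, total in sorted(event_energy.items()):
    print(f"event {event_id}: reconstructed total energy {total:.2f}")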

The good, the bad and the plainly wrong

In the world of data processing, the task at hand is one of predictive analytics, in that it is built on comparing what is measured against what theory predicts. “We are standing on the shoulders of predictions – we measure and we corroborate what we are told against what the theory predicts,” says Sharma.

Either the observations, based on the data collected from the experiment, support the theory, or something is wrong. Sharma says a wrong result may mean the theory is not right and needs tweaking, or it could mean there are calibration errors in the LHC itself.

The LHC is about to enter a “technical stop” phase, a three-year shutdown during which it will be upgraded to support new science. One area of improvement, according to Sharma, is a tenfold increase in luminosity, which she says will enable the LHC to gather 10 times more data.

Along with the work that will be required on the infrastructure and the detectors at Cern, Sharma says the organisation’s computer centre is also preparing for the vast increase in data that will need to be processed.
