First Ever Open Access Data From the Large Hadron Collider Helped Physicists Confirm Subatomic Patterns
The release included roughly 29 terabytes of data from 300 million high-energy collisions.
Despite what their parents might have taught them, physicists aren’t always the best at sharing with others. Yet a public release of data from the Large Hadron Collider (LHC), back in 2014, has already yielded exciting results: for one, it’s allowed physicists to confirm that a fundamental equation correctly describes what happens in the real world.
In a new study, physicists confirmed an equation that describes the jets of particles produced when protons collide and split into their fundamental parts, particles known as quarks and gluons. The finding came from analyzing open data on 750,000 particle jets produced by collisions in the Compact Muon Solenoid (CMS) experiment, one of the LHC’s largest experiments.
“In our field of particle physics, there isn’t the tradition of making data public,” lead author Jesse Thaler, an associate professor of physics at MIT and a long-time advocate for open access in particle physics, said in a press release. “To actually get data publicly with no other restrictions—that’s unprecedented.”
The LHC’s immense data release, published online on the European Organization for Nuclear Research (CERN) Open Data Portal, added up to roughly 29 terabytes of information from 300 million high-energy collisions within CMS. It was the first public release of its kind from a large collider.
Though one might assume the fundamental equations used by physicists would be unassailable, many mathematical functions used in physics technically remain theoretical; the math tells physicists that they’re correct, but the pattern these equations describe hasn’t yet been observed through experiments in the physical world.
One such equation was the evolution equation, also known as the splitting function, which has been used since the 1970s to describe the pattern of particles produced by proton collisions.
“This idea had not existed before,” Thaler said in the press release. “That you could distill the messiness of the jet into a pattern, and that pattern would match beautifully onto that equation—this is what we found when we applied this method to the CMS data.”
Thaler’s team used the CMS data to examine the particle collisions one by one, looking at the most prominent jet from each and categorizing its emissions as particles cleaved from one another.
The resulting study from Thaler’s team was published in Physical Review Letters.
Colliders have been historically tight-fisted with their data, out of concerns that it could be misinterpreted; often, glitches in the detectors themselves can create the ghosts of new physics phenomena, some persistent enough to fool physicists themselves.
“I think it was believed that no one could come from the outside and do those corrections properly, and that some rogue analyst could claim existence of something that wasn’t really there,” Thaler said.
Yet the team hopes that their success might inspire other colliders to release some information of their own. As Thaler concluded: “Colliders are big endeavors. These are unique datasets, and we need to make sure there’s a mechanism to archive that information in order to potentially make discoveries down the line using old data, because our theoretical understanding changes over time. Public access is a stepping stone to making sure this data is available for future use.”