Joint SLAC-LCLS-NERSC-ESNET-LANL- CAMERA Collaboration on Exascale Processing to Analyze
Cori supercomputer at NERSC. In a joint collaboration, CAMERA scientists, together with supercomputer experts at NERSC and network specialists at ESNET will team up with LCLS/SLAC and LANL scientists to build new algorithms, wokflow, and processing power to tackle the vast volumes of data expected to come from next generation light source experiments.
"X-ray lasers, such as SLAC’s Linac Coherent Light Source (LCLS) have been proven to be extremely powerful “microscopes” that are capable of glimpsing some of nature’s fastest and most fundamental processes on the atomic level. Researchers use LCLS, a DOE Office of Science User Facility, to create molecular movies, watch chemical bonds form and break, follow the path of electrons in materials and take 3-D snapshots of biological molecules that support the development of new drugs.
At the same time X-ray lasers also generate giant amounts of data. A typical experiment at LCLS, which fires 120 flashes per second, fills up hundreds of thousands of gigabytes of disk space. Analyzing such a data volume in a short amount of time is already very challenging. And this situation is set to become dramatically harder: The next-generation LCLS-II X-ray laser will deliver 8,000 times more X-ray pulses per second, resulting in a similar increase in data volumes and data rates. Estimates are that the data flow will greatly exceed a trillion data ‘bits’ per second, and require hundreds of petabytes of online disk storage.
As a result of the data flood even at today’s levels, researchers collecting data at X-ray lasers such as LCLS presently receive only very limited feedback regarding the quality of their data.
“This is a real problem because you might only find out days or weeks after your experiment that you should have made certain changes,” says Berkeley Lab’s Peter Zwart, one of the collaborators on the exascale project, who will develop computer algorithms for X-ray imaging of single particles. “If we were able to look at our data on the fly, we could often do much better experiments.”
Amedeo Perazzo, director of the LCLS Controls & Data Systems Division and principal investigator for this “ExaFEL” project, says, “We want to provide our users at LCLS, and in the future LCLS-II, with very fast feedback on their data so that can make important experimental decisions in almost real time. The idea is to send the data from LCLS via DOE’s broadband science network ESnet to NERSC, the National Energy Research Scientific Computing Center, where supercomputers will analyze the data and send the results back to us – all of that within just a few minutes.” NERSC and ESnet are DOE Office of Science User Facilities at Berkeley Lab.
X-ray data processing and analysis is quite an unusual task for supercomputers. “Traditionally these high-performance machines have mostly been used for complex simulations, such as climate modeling, rather than processing real-time data” Perazzo says. “So we’re breaking completely new ground with our project, and foresee a number of important future applications of the data processing techniques being developed.”
This project is enabled by the investments underway at SLAC to prepare for LCLS-II, with the installation of new infrastructure capable of handling these enormous amounts of data.
A number of partners will make additional crucial contributions.
At Berkeley Lab, we’ll be heavily involved in developing algorithms for specific use cases,” says James Sethian, a professor of mathematics at the University of California, Berkeley, and head of Berkeley Lab’s Mathematics Group and the Center for Advanced Mathematics for Energy Research Applications (CAMERA). “This includes work on two different sets of algorithms. The first set, developed by a team led by Nick Sauter, consists of well-established analysis programs that we’ll reconfigure for exascale computer architectures, whose larger computer power will allow us to do better, more complex physics. The other set is brand new software for emerging technologies such as single-particle imaging, which is being designed to allow scientists to study the atomic structure of single bacteria or viruses in their living state.”
The “ExaFEL” project led by Perazzo will take advantage of Aiken’s newly formed Stanford/SLAC team, and will collaborate with researchers at Los Alamos National Laboratory to develop systems software that operates in a manner that optimizes its use of the architecture of the new exascale computers.