GISAXS: Current and Future Work
GISAXS at the ALS has evolved greatly with the advent of high-brightness, high-flux beamlines combined with single-photon-counting detectors. Time-resolved scattering experiments can produce several thousand images per day. The GISAXS user community expresses an ever-increasing demand for the following data analysis capabilities.

Fast and accurate simulation of a wide range of nanostructures. Beamline users would like to simulate scattering patterns in real time while collecting data at the GISAXS beamline. Comparing the computed and observed patterns gives users clear guidance for conducting better experiments, for example by adjusting the GISAXS geometry, the sample orientation, or the parameters of time-resolved sample environments. In addition, the simulation code must be flexible enough to simulate the diffraction pattern for any given superposition of custom shapes or morphologies, so that a wide range of sample geometries can be tackled, such as nanostructures on top of or embedded in a substrate or a multilayered structure. Existing software is limited in speed (the codes are serial) and restricted to a small set of representative structures whose shapes can be described by analytical expressions.

Morphology characterization from observed scattering patterns. The ultimate goal is to extract the key structural and morphological information from the observed data with high confidence. Attaining the full power of GISAXS therefore rests heavily on the availability of fast and effective fitting methods that fully capture the structural information embedded in the observed patterns. That is, from the large volume of collected patterns, users would like a tool that deduces the nanoparticles' shapes, sizes, size distribution, average inter-nanoparticle distance, and so on. This requires solving highly nonlinear inverse problems for parameter estimation.
Recently, we have been developing an extensible computational framework based on the Distorted Wave Born Approximation (DWBA) theory, with capabilities for rapid simulation of a wide range of possible nanostructures. Our HipGISAXS software is a massively parallel code written in C++ and augmented with MPI, NVIDIA CUDA, OpenMP, and parallel HDF5 libraries for large-scale clusters of multicores and GPUs. The current parallel code attains speedups of 200x on a single-node GPU compared to the sequential code. Moreover, the multi-GPU (CPU) code achieved an additional 900x (4000x) speedup on 930 GPU (6000 CPU) nodes. In addition to being fast and enabling high-resolution simulations, HipGISAXS is more flexible than other codes: it can simulate the diffraction pattern for any given superposition of custom shapes or morphologies (e.g. obtained graphically via a discretization scheme) in a user-defined region of k-space for all possible incidence angles and sample rotations. Thus, we can easily tackle a wide range of possible sample geometries such as nanostructures on top of or embedded in a substrate or a multilayered structure.
The initial success of HipGISAXS, and in particular of the parallel algorithm, has attracted the interest of many research groups in materials science, nanoscience, and polymers across the US and Europe. Our current and future research focuses on developing effective algorithms and software capabilities for accurate morphology characterization. In addition, for soft X-ray scattering, each sample is probed with a wide variety of X-ray energies and polarizations. We are investigating the following approaches to solving these very challenging inverse problems with noisy experimental data.

Reverse Monte Carlo method. For the transmission geometry we have developed a fast reverse Monte Carlo (RMC) method. In RMC modeling, we start with an initial configuration M of the particles within a spatial matrix with periodic boundary conditions. This initial configuration may either be generated randomly or be an educated guess of the sample structure. The density fraction of nanoparticles or atoms is either known or derived from known information about the sample. The Fourier transform F of this configuration is then computed. The absolute square of F represents the scattering measurement of model M, and thus embeds information about the particle configuration it contains. The result is compared with an experimentally measured scattering pattern S using the chi-square error test. Subsequently a particle is moved and chi-square is calculated again. If the move lowers chi-square it is accepted; otherwise it is rejected with a probability given by a Boltzmann factor. The temperature T* is lowered in a simulated annealing scheme. In addition, we implemented a scaling algorithm to adjust the size of the matrix. All parameters except the number of iterations are self-tuning.
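The accept/reject loop described above can be sketched in a few lines. This is only an illustration on a 1D periodic lattice, not the HipRMC implementation: the function names, the geometric cooling schedule, and the rule that a worse move is accepted with probability exp(-delta/T) (the usual Metropolis form) are all assumptions, and the matrix-scaling and self-tuning steps are left out.

```python
import cmath
import math
import random


def intensity(occupancy):
    """Return |F(k)|^2 of a 1D periodic occupancy array via a direct DFT."""
    n = len(occupancy)
    pattern = []
    for k in range(n):
        f = sum(occupancy[x] * cmath.exp(-2j * math.pi * k * x / n)
                for x in range(n))
        pattern.append(abs(f) ** 2)
    return pattern


def chi_square(model, measured):
    """Sum of squared differences between model and measured intensities."""
    return sum((m - s) ** 2 for m, s in zip(model, measured))


def reverse_monte_carlo(measured, n_particles, steps=1000,
                        t_start=50.0, cooling=0.995, seed=0):
    """Anneal a particle configuration towards the measured pattern.

    Requires 0 < n_particles < len(measured) so that every trial move
    has both a particle to move and an empty site to move it to.
    """
    rng = random.Random(seed)
    n = len(measured)
    # Random initial configuration M with a fixed number of particles.
    sites = list(range(n))
    rng.shuffle(sites)
    config = [0] * n
    for x in sites[:n_particles]:
        config[x] = 1

    chi2 = chi_square(intensity(config), measured)
    initial_chi2 = chi2
    best_chi2, best_config = chi2, config[:]
    temperature = t_start

    for _ in range(steps):
        # Trial move: one particle hops to a randomly chosen empty site.
        src = rng.choice([x for x in range(n) if config[x] == 1])
        dst = rng.choice([x for x in range(n) if config[x] == 0])
        config[src], config[dst] = 0, 1
        trial = chi_square(intensity(config), measured)

        delta = trial - chi2
        # Assumed Metropolis rule: always accept improvements, accept
        # worse moves with probability exp(-delta / temperature).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            chi2 = trial
            if chi2 < best_chi2:
                best_chi2, best_config = chi2, config[:]
        else:
            config[src], config[dst] = 1, 0  # reject: undo the move
        temperature *= cooling  # simulated annealing (geometric cooling)

    return initial_chi2, best_chi2, best_config


# Synthetic "measurement" generated from a known configuration.
truth = [1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0]
measured = intensity(truth)
initial, best, config = reverse_monte_carlo(measured, n_particles=sum(truth))
```

Because only intensities are compared, any translated or mirrored copy of the true configuration fits equally well; the sketch therefore reports the best chi-square found rather than checking the configuration itself.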

Nonlinear global optimization methods. We investigate various well-established fitting and optimization techniques to find the set of sample parameters (chosen by the user) that minimizes the L2 error between the simulated and the experimental images. Since each function evaluation requires an expensive simulation, we prefer approaches that do not require gradients or Hessians, such as the derivative-free trust-region algorithm and surrogate models that mimic the behavior of the simulation model, alongside the limited-memory variable-metric (LMVM) algorithm, which avoids the Hessian but still needs gradients. The challenge with GISAXS data is that the error function is nonsmooth and contains many local minima, which prevent descent-based algorithms from converging to the global minimum.
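To make the local-minima problem concrete, the sketch below scans the L2 error of a deliberately simple toy model. The two-scatterer interference pattern I(q) = 1 + cos(q d), the q range, and all function names are assumptions chosen for illustration; a real fit would call a simulator such as HipGISAXS instead of `simulate`.

```python
import math


def simulate(d, q_values):
    """Toy two-scatterer interference pattern I(q) = 1 + cos(q d)."""
    return [1.0 + math.cos(q * d) for q in q_values]


def l2_error(simulated, measured):
    """Squared L2 distance between two patterns."""
    return sum((a - b) ** 2 for a, b in zip(simulated, measured))


def local_minima(values):
    """Indices of strict interior local minima of a sampled curve."""
    return [i for i in range(1, len(values) - 1)
            if values[i] < values[i - 1] and values[i] < values[i + 1]]


q_values = [0.015 * k for k in range(1, 201)]   # 200 points, q in (0, 3]
measured = simulate(10.0, q_values)             # synthetic data, true d = 10
d_grid = [i / 20 for i in range(100, 301)]      # candidate d from 5 to 15
errors = [l2_error(simulate(d, q_values), measured) for d in d_grid]

minima = local_minima(errors)                   # several, not just one
best_d = d_grid[errors.index(min(errors))]      # the global minimum
print("local minima at d =", [d_grid[i] for i in minima])
print("global minimum at d =", best_d)
```

Even with a single parameter and noise-free data, the error curve has side minima around the true value, so a descent method started a few units away from d = 10 would stall in one of them. This is the motivation for the global strategies discussed here.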

HipIES: High Performance Interactive Environment for Scattering. HipIES is an interactive graphical environment for organizing and analyzing scattering data through an intuitive interface. Automation procedures and integration with remote HPC resources form a streamlined data processing pipeline. The processing backend of HipIES provides GPU and multicore acceleration for high data throughput. HipIES is cross-platform, portable, extensible, and compatible with the NeXus file format standard. HipIES is written in Python and uses the processing and plotting packages PyFAI, FabIO, and PyQtGraph. HipIES forms the GUI front end for HipGISAXS and HipRMC.

Particle Swarm Optimization. The Particle Swarm Optimization (PSO) method iteratively seeks to improve each agent's solution with regard to the defined objective function. PSO optimizes a problem by maintaining a population of agents (particles) and moving these agents around the parameter space according to simple mathematical formulae over each agent's position and velocity. Each agent's movement is influenced by its own best known position, but is also guided toward the best known positions in the search space, which are updated as better positions are found by other particles. This is expected to move the swarm toward the best solutions.
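A minimal sketch of this update rule follows. It uses the common inertia-weight form of PSO; the coefficient values, the function and parameter names, and the toy quadratic objective (standing in for the misfit between simulated and measured patterns) are assumptions rather than the settings used in our fitting code.

```python
import random


def particle_swarm(objective, bounds, n_particles=20, iterations=100,
                   inertia=0.7, c_local=1.5, c_global=1.5, seed=0):
    """Minimize `objective` over the box `bounds` = [(lo, hi), ...]."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]

    # Each particle remembers its own best position and value.
    local_best = [p[:] for p in pos]
    local_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=local_val.__getitem__)
    global_best, global_val = local_best[g][:], local_val[g]
    history = [global_val]

    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull to own best + pull to swarm best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_local * r1 * (local_best[i][d] - pos[i][d])
                             + c_global * r2 * (global_best[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            value = objective(pos[i])
            if value < local_val[i]:
                local_best[i], local_val[i] = pos[i][:], value
                if value < global_val:
                    global_best, global_val = pos[i][:], value
        history.append(global_val)  # best value after each iteration

    return global_best, global_val, history


def toy_misfit(p):
    """Hypothetical stand-in for the simulation misfit; minimum at (3, -1)."""
    return (p[0] - 3.0) ** 2 + (p[1] + 1.0) ** 2


best_pos, best_val, history = particle_swarm(toy_misfit, [(-5, 5), (-5, 5)])
```

Each objective evaluation is independent within an iteration, which is what makes PSO attractive when every evaluation is an expensive simulation that can be farmed out in parallel.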

Machine learning methods to reduce parameter space. Fitting experimental GISAXS patterns using simulators such as HipGISAXS requires a reasonable initial estimate of the sample. These initial estimates can be obtained from microscopy as well as by extracting features from the measured data itself. We are developing machine learning algorithms that can analyze various large data sets and extract reasonable initial guesses for particle shapes and distributions. For example, by detecting peaks and arcs in the measured GISAXS data it is possible to estimate the orientation as well as the distribution of various grains in the sample. Such initial estimates can then be refined by the fitting algorithms discussed above.
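As a simple illustration of feature extraction from the data itself, the sketch below finds peaks in a 1D line cut and converts a peak position into a real-space distance with the standard relation d = 2π/q. It is a plain threshold-based detector rather than a machine learning method, and the profile values, the q grid, and the function names are made up for the example.

```python
import math


def find_peaks(profile, min_height):
    """Indices of strict local maxima whose height is at least min_height."""
    return [i for i in range(1, len(profile) - 1)
            if profile[i] >= min_height
            and profile[i] > profile[i - 1]
            and profile[i] > profile[i + 1]]


def spacing_from_peak(q_peak):
    """Real-space repeat distance for a correlation peak at q_peak."""
    return 2.0 * math.pi / q_peak


# Hypothetical line cut: q in nm^-1 and intensity in arbitrary units.
q = [0.01 * k for k in range(1, 11)]
profile = [5, 7, 30, 8, 6, 18, 5, 4, 9, 3]

peaks = find_peaks(profile, min_height=10)   # the bump at index 8 is too weak
first_spacing = spacing_from_peak(q[peaks[0]])
print("peak indices:", peaks)
print("estimated spacing (nm):", round(first_spacing, 1))
```

An estimate of this kind only needs to be good enough to place the fitting algorithms in the right region of parameter space.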

Accurate multislice GISAXS simulations. To interpret GISAXS data accurately, it is important to be able to simulate a variety of patterns. While HipGISAXS supports various complicated cases, we propose extending its capability to account for the waveguiding effect. The waveguiding effect is important because it can be used to probe the sample at different depths by varying the incidence angle of the incoming beam. Moreover, these effects are important when simulating dense assemblies of nanoparticles. To simulate nanostructures accurately while accounting for the waveguiding effect, we use the multislice distorted wave Born approximation (DWBA). This method slices a sample along the vertical direction and coherently combines the scattering from each slice to form the GISAXS pattern. However, computing the various factors in this model requires an accurate estimate of the average refractive index as well as the Fourier transform (FT) of the structure in each slice. We are designing high-performance algorithms that exploit various properties of the FT to enable the simulation of arbitrary structures using the multislice method.
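The bookkeeping behind the slicing step can be sketched as follows. This is a strongly simplified, kinematic illustration on a 2D (z, x) grid: it computes a per-slice average and a per-slice lateral Fourier coefficient and adds the slices with a plain phase factor exp(-i qz z). The transmitted and reflected wave amplitudes that the actual multislice DWBA applies in each slice are deliberately omitted, and all names and grid values are assumptions.

```python
import cmath
import math


def slice_sample(grid, rows_per_slice):
    """Split a list of z-rows into consecutive slices along the vertical axis."""
    return [grid[i:i + rows_per_slice]
            for i in range(0, len(grid), rows_per_slice)]


def slice_average(slab):
    """Average value in one slice (e.g. of the refractive index decrement)."""
    cells = [v for row in slab for v in row]
    return sum(cells) / len(cells)


def slice_lateral_ft(slab, k):
    """k-th lateral DFT coefficient of a slice, after averaging over its rows."""
    n = len(slab[0])
    column_mean = [sum(row[x] for row in slab) / len(slab) for x in range(n)]
    return sum(column_mean[x] * cmath.exp(-2j * math.pi * k * x / n)
               for x in range(n))


def coherent_sum(grid, rows_per_slice, k, qz, dz):
    """Add slice amplitudes with phase exp(-i qz z) at each slice centre.

    Assumes the number of rows is a multiple of rows_per_slice; dz is the
    height of one grid row.
    """
    total = 0j
    for j, slab in enumerate(slice_sample(grid, rows_per_slice)):
        z_centre = (j + 0.5) * rows_per_slice * dz
        total += slice_lateral_ft(slab, k) * cmath.exp(-1j * qz * z_centre)
    return total


# Hypothetical 4 x 4 sample: rows are depth (z), columns are lateral (x).
grid = [[0, 1, 0, 1],
        [0, 1, 0, 1],
        [1, 1, 0, 0],
        [1, 1, 0, 0]]

slabs = slice_sample(grid, rows_per_slice=2)
averages = [slice_average(s) for s in slabs]
amplitude = coherent_sum(grid, rows_per_slice=2, k=2, qz=0.0, dz=1.0)
intensity = abs(amplitude) ** 2
```

The per-slice Fourier transforms dominate the cost for realistic 3D structures, which is why that step is the natural target for optimization.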

Information-theoretic method from computational mechanics. Although the optimization approach above is tangible and standard, it is prohibitively expensive if the user has to supply initial conditions for 10,000 images. Therefore, as an alternative formulation, we intend to go beyond having the user fix a set of parameters to describe a sample, and instead use a parameter-free approach to infer the structural information and discover patterns in the sample from data, using entropy-related measures to quantify the sample's level of organization, disorder, redundancy, memory, etc. Information theory provides a number of key quantities for measuring the level of order and randomness in a system. By interpreting the sample as a hidden process and the scattering pattern as the observation that expresses and encodes the organization of that process, we seek to build a model of the sample based on the (impoverished) GISAXS data.
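The simplest entropy-related measure of this kind is the Shannon entropy of a normalized intensity distribution, sketched below. It is shown only to illustrate what a parameter-free measure of order looks like; the quantities used in computational mechanics are richer, and the function name and example patterns are assumptions.

```python
import math


def shannon_entropy(pattern):
    """Shannon entropy (in bits) of a non-negative pattern, normalized to sum 1."""
    total = sum(pattern)
    probs = [v / total for v in pattern if v > 0]
    return -sum(p * math.log2(p) for p in probs)


diffuse = [1.0] * 8                 # intensity spread evenly over 8 pixels
peaked = [0, 0, 0, 8.0, 0, 0, 0, 0]  # all intensity in a single sharp peak

h_diffuse = shannon_entropy(diffuse)  # maximal: log2(8) bits
h_peaked = shannon_entropy(peaked)    # minimal: 0 bits
```

A featureless, diffuse pattern maximizes this entropy, while a pattern dominated by one sharp peak drives it to zero, so the value gives a crude parameter-free ranking of images by how concentrated their scattering is.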
This work is only possible through extensive collaboration between experimentalists, beamline scientists, mathematicians, and computational scientists. By developing the fastest and most flexible GISAXS/GIWAXS software in the world, this initiative may open new opportunities for research and development enhanced by computer modeling and simulation. This work lays the foundation for future joint efforts between contributors in previously isolated fields to take on intricate scientific problems through high-performance computing.