New Advancements on Serial X-ray Crystallography
The traditional method of obtaining high-resolution atomic structure information from macromolecules is conventional x-ray crystallography, in which many copies of the target object are arranged into a large periodic crystal, typically more than 10 microns across, in order to increase the strength of the collected signal, and diffraction images are collected from the sample as it is rotated. In general, the pixel intensities of a diffraction pattern measure the squared magnitude of the 3D Fourier transform of the sample's electron density along a spherical slice in frequency space. Due to the shift property of the Fourier transform, the periodic crystal structure induces the formation of many sharp bright spots of intensity, known as Bragg peaks, whose locations and intensity values are ultimately used to invert the data and reconstruct the electron density. The missing phase information in the data may be recovered experimentally through techniques such as anomalous diffraction, where the wavelength is varied through an absorption edge, or heavy atom replacement, which requires a duplicate crystal to be made with heavy atoms included in the crystal structure.
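As a rough illustration of why periodicity produces Bragg peaks, the following 1D sketch tiles a random "unit cell" and takes a discrete Fourier transform. The cell length, the number of cells, and the random density are arbitrary assumptions made for the example; a real experiment samples a 3D transform on a curved slice rather than a full 1D spectrum.

```python
import numpy as np

rng = np.random.default_rng(0)
cell_len, n_cells = 8, 16                 # assumed toy sizes
unit_cell = rng.random(cell_len)          # hypothetical 1D electron density of one cell
crystal = np.tile(unit_cell, n_cells)     # perfectly periodic crystal

# Diffraction intensity is the squared magnitude of the Fourier transform.
intensity = np.abs(np.fft.fft(crystal)) ** 2
cell_intensity = np.abs(np.fft.fft(unit_cell)) ** 2

# Bragg peaks sit at every n_cells-th frequency sample; all others vanish.
k = np.arange(crystal.size)
bragg = (k % n_cells == 0)
off_peak_max = intensity[~bragg].max()

print("largest off-peak intensity:", off_peak_max)
print("peak / unit-cell intensity ratio:", intensity[bragg][1] / cell_intensity[1])
```

In this toy model the peak intensities equal the unit-cell transform scaled by the square of the number of cells, which is the signal amplification that motivates using large crystals.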
While conventional x-ray crystallography has been successful in determining the structure of several thousand objects, it is limited to samples that can be formed into large crystals, a laborious process that can take several years, and the crystal samples are commonly plagued with imperfections that may hinder the reconstruction process. An appealing alternative is serial x-ray crystallography, which uses a large ensemble of easier-to-grow nanocrystals or microcrystals, typically delivered to the x-ray beam via a liquid jet. The beam power density required to retrieve a sufficient amount of signal is large enough to destroy the crystal during the imaging process. Therefore, ultrafast pulses, e.g. < 70 fs, are required to ensure that the data is collected before damage effects come into play. The use of small crystals introduces several practical difficulties into the reconstruction procedure. For instance, because the crystals are small, the Bragg peaks are smeared out and the signal in between peaks becomes noticeable.
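The smearing of Bragg peaks for small crystals can be seen in a minimal 1D sketch of the lattice interference function. The crystal sizes (5 and 200 unit cells), the window half-width, and the use of reciprocal-lattice units for `q` are assumptions chosen only for illustration.

```python
import numpy as np

def interference(q, n_cells):
    """|sum_n exp(2*pi*i*q*n)|^2 for a 1D crystal of n_cells unit cells.

    q is measured in units of the reciprocal lattice spacing, so Bragg
    peaks sit at integer q.
    """
    n = np.arange(n_cells)
    amplitude = np.exp(2j * np.pi * np.outer(q, n)).sum(axis=1)
    return np.abs(amplitude) ** 2

def inter_peak_fraction(n_cells, half_width=0.05):
    """Fraction of intensity farther than half_width from the peak at q = 0."""
    q = np.linspace(-0.5, 0.5, 4001)      # one reciprocal-lattice period
    intensity = interference(q, n_cells)
    near_peak = np.abs(q) < half_width
    return intensity[~near_peak].sum() / intensity.sum()

frac_nano = inter_peak_fraction(5)        # toy "nanocrystal"
frac_large = inter_peak_fraction(200)     # toy "large crystal"
print(f"inter-peak fraction, 5 cells:   {frac_nano:.2f}")
print(f"inter-peak fraction, 200 cells: {frac_large:.3f}")
```

The peak width scales roughly as the inverse of the number of unit cells, so the small crystal leaves a substantial share of its signal between the nominal peak positions.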
Due to the delivery system and short pulses, one cannot integrate out the peak shape via rotational averaging, as is done in conventional crystallography. Therefore, only partial peak reflections can be measured, resulting in reduced and noisy collected intensities. Further sources of uncertainty and error are large variations in crystal size, background signal introduced by the disordered water molecules in the liquid jet, signal intensity fluctuations induced by beam fluctuations and partial collisions of the crystals with the x-ray beam, and the fact that the orientations of the crystals are unknown during the data collection process.
If the crystal orientations were known, the noise and variation in the peak measurements could be averaged out, allowing one to proceed to invert the data to retrieve the electron density of the object. In theory, the locations of a sufficient number of Bragg peaks in an image can be used to determine the orientation of the crystal up to the symmetry of its periodic lattice, a process known as autoindexing. While autoindexing has been performed extensively to increase the accuracy of orientations of conventional crystals, a few fundamental issues remain in its use for the orientation of small crystals. One issue is the robustness of autoindexing in the presence of partial reflections and secondary reflections. Furthermore, autoindexing only narrows down the orientation of an image to a list of possibilities whose size is that of the crystal symmetry group, which is known as the indexing ambiguity. If the indexing ambiguity is left unresolved, then the data will appear to be perfectly twinned, i.e., averaged over multiple orientations. While there has been some success in using molecular replacement techniques to determine structure from perfectly twinned data, a very accurate initial model is required in this case.
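To make the "perfectly twinned" outcome concrete, here is a small sketch under strong simplifying assumptions: a hypothetical 2D set of intensities I(h, k), exactly two indexing choices related by swapping h and k, noise-free measurements, and each image landing in either choice with equal probability.

```python
import numpy as np

rng = np.random.default_rng(1)
true_I = rng.exponential(size=(6, 6))       # hypothetical intensities I(h, k)
n_images = 10000
# 0: indexed as (h, k); 1: indexed as (k, h). Unknown in a real experiment.
modes = rng.integers(0, 2, size=n_images)

# Every image sees the same intensities, but some with h and k swapped.
observed = np.where(modes[:, None, None] == 0, true_I, true_I.T)

# Naive merging that ignores the ambiguity.
merged = observed.mean(axis=0)
twinned = 0.5 * (true_I + true_I.T)

print("max deviation from twinned average:", np.abs(merged - twinned).max())
print("max deviation from true intensities:", np.abs(merged - true_I).max())
```

The merged result approaches the symmetrized average rather than the true intensities, which is the information loss that the ambiguity causes when it is left unresolved.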
We have been developing several computational techniques to address the challenges of serial crystallography. They include:
While current autoindexing techniques have been successfully applied in some serial crystallography experiments, they can often handle only images with a large number of recorded Bragg peaks. However, a large majority of the images often do not contain a sufficient number of peaks and thus cannot be indexed. We are developing new approaches to autoindexing to handle such images. These include a compressive sensing method and an approach that utilizes the extra inter-peak information available in diffraction images from small crystals.
In serial crystallography, the shape of the peak cannot be integrated out, as is done in conventional crystallography. As a result, each recorded intensity is multiplied by a different random number, depending on where the Ewald sphere slices through the peak profile. This results in extremely noisy intensities, which greatly complicates the data processing and reconstruction. For nanocrystals, partiality is driven by the size of the crystal and, therefore, can be estimated if one has an approximation to the crystal size and accurate orientation information. In contrast, the peak shapes for microcrystals are largely determined by defects in the crystal, e.g., mosaicity. We are studying methods to increase the accuracy of the collected data by estimating these partiality effects, either by using extra information provided by small angle data collected from rear detectors or by directly fitting partiality models through a global maximum likelihood estimator.
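The effect of partiality on individual measurements, and the way plain averaging over many images reduces it, can be sketched as follows. The Gaussian peak profile, the uniform distribution of the excitation error, and the image and reflection counts are all assumptions for illustration; this is not the small-angle or maximum likelihood approach described above.

```python
import numpy as np

rng = np.random.default_rng(3)
n_images, n_refl = 5000, 50                      # assumed toy sizes
true_I = 0.1 + rng.exponential(size=n_refl)      # hypothetical full intensities

# Distance of the Ewald sphere from each peak centre, in units of an
# assumed Gaussian profile width; drawn independently for every observation.
excitation = rng.uniform(-3.0, 3.0, size=(n_images, n_refl))
partiality = np.exp(-0.5 * excitation ** 2)      # value in (0, 1]
observed = partiality * true_I

def rel_spread(x):
    """Standard deviation divided by mean."""
    return x.std() / x.mean()

single_ratio = observed[0] / true_I              # one image
merged_ratio = observed.mean(axis=0) / true_I    # average over all images

print(f"relative spread, single image: {rel_spread(single_ratio):.2f}")
print(f"relative spread, merged:       {rel_spread(merged_ratio):.3f}")
```

After averaging, every reflection is scaled by nearly the same factor, so the merged values are proportional to the true intensities. The price is a large number of images, which is one motivation for modelling partiality explicitly.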
Resolution of the Indexing Ambiguity:
If the diffraction data has less symmetry than the crystal lattice, this leads to the "indexing ambiguity", in which complete orientation information cannot be obtained through autoindexing. In such cases, reconstruction cannot be performed unless one already has a sufficiently accurate initial model of the structure. We have developed techniques, combining multi-modal analysis, scaling, and clique analysis, which are able to resolve the indexing ambiguity. We are currently studying ways to further enhance the fidelity of this approach via a global maximum likelihood formulation and, if the crystals themselves are actually twinned, twin fraction estimation for each image.
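As a much simpler stand-in for the techniques named above, the sketch below resolves a two-fold ambiguity by correlating each image against a single reference image in both candidate orientations. The swap of h and k as the ambiguity operator, the lognormal noise model, and the array sizes are assumptions. The result is only defined up to one global flip, since the reference orientation is itself arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
true_I = rng.exponential(size=(12, 12))      # hypothetical intensities I(h, k)
n_images = 200
modes = rng.integers(0, 2, size=n_images)    # hidden indexing choice per image

def simulate(mode):
    """Noisy image in one of the two indexing conventions (assumed noise model)."""
    base = true_I if mode == 0 else true_I.T
    return base * rng.lognormal(sigma=0.3, size=base.shape)

images = [simulate(m) for m in modes]

def corr(a, b):
    """Pearson correlation between two intensity arrays."""
    return np.corrcoef(a.ravel(), b.ravel())[0, 1]

# Pick the orientation of each image that agrees best with the reference.
reference = images[0]
flipped = np.array([corr(img.T, reference) > corr(img, reference)
                    for img in images])

# Ground truth relative to the reference, available only in simulation.
true_relative = modes != modes[0]
accuracy = np.mean(flipped == true_relative)

aligned = [img.T if f else img for img, f in zip(images, flipped)]
merged = np.mean(aligned, axis=0)
print("fraction of images assigned consistently:", accuracy)
```

A single-reference comparison like this is fragile when images share few reflections or are very noisy, which is where approaches that use all pairwise relationships become useful.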
In order to determine the atomic structure of the sample, one must determine the phase information, which is missing in diffraction experiments. However, common phasing techniques for crystallography often require extra information or assumptions. An appealing alternative is iterative phase retrieval, which requires only that the diffraction data be sampled at twice the Nyquist rate of the sample. While such an approach has been infeasible in conventional crystallography, since the diffraction data is collected at exactly the Nyquist rate, diffraction data from nanocrystals contain a significant amount of information in between Bragg peaks, which may allow one to sample at the required rate. We have been studying the feasibility of applying iterative phasing techniques to determine structure from nanocrystallographic diffraction images. We have shown that this approach may be applicable to diffraction images collected from perfect nanocrystals. However, this inter-peak information is highly sensitive to defects in the crystal and may have less symmetry than the peak data. These are pressing issues that must be resolved in order to make iterative phasing practical for nanocrystallography.