SAXS: Data Analysis for Small Angle X-ray Scattering

Information on the three-dimensional structures of biological macromolecules is essential for advancing our basic understanding of life in health and disease. The analysis of the structures of large macromolecular assemblies remains one of the formidable challenges in structural biology. Consequently, Small Angle X-ray Scattering (SAXS) has become a key tool in the analysis of macromolecular structures and complexes. SAXS is uniquely suited to the study of large macromolecular assemblies for which structural information is available for the individual components but not for the assembly as a whole. Partially disordered proteins and macromolecular assemblies are also well suited to investigation by this low-resolution solution method.

SAXS relies on matching scattering patterns computed from molecular models to the patterns observed in experiment. Rapid advances in data collection technologies have placed this methodology within the reach of many structural biologists. However, model analysis and software development have lagged behind, and the evaluation and interpretation of SAXS data are perceived as a serious bottleneck in the general application of SAXS.
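As a minimal sketch of what matching a computed pattern to an observed one can look like, the Python example below scores a model curve against experimental intensities with a reduced chi-square after fitting a single scale factor by weighted least squares. This is a common convention in the SAXS literature, not necessarily the measure used in this project; the function name `scaled_chi_square` and its arguments are hypothetical, and both curves are assumed to be sampled at the same q values.

```python
import numpy as np

def scaled_chi_square(i_exp, sigma, i_model):
    """Fit one scale factor c and return (c, reduced chi-square).

    i_exp   : experimental intensities
    sigma   : experimental standard errors (all > 0)
    i_model : model intensities at the same q values
    All names are illustrative assumptions, not part of any existing package.
    """
    i_exp = np.asarray(i_exp, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    i_model = np.asarray(i_model, dtype=float)

    # Weighted least-squares solution for c in: i_exp ~ c * i_model
    w = 1.0 / sigma**2
    c = np.sum(w * i_exp * i_model) / np.sum(w * i_model**2)

    # Residuals in units of the experimental error
    resid = (i_exp - c * i_model) / sigma

    # One fitted parameter (c), so N - 1 degrees of freedom
    chi2 = np.sum(resid**2) / (i_exp.size - 1)
    return c, chi2
```

A value near 1 suggests agreement within the stated errors, although that reading depends on how reliable the error estimates are.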

We are developing a systematic modeling and analysis framework for evaluation and interpretation of SAXS experimental data. The framework will be based on a systematic development and analysis of models and a statistical analysis of the effects of uncertainty and error in models and data. We will also develop software design patterns and/or code generation techniques that enable iterative model development while maintaining computational efficiency. All of the software developed to experiment with SAXS modeling and data interpretation will be distributed as open source.

The proposed process of the interplay between SAXS data and molecular models, including model validation, is schematically illustrated in the Figure below.

The figure is read starting at the far left and proceeding clockwise.

Our collaborative group includes principal investigators from Biochemistry, Computer Science, Mathematical Sciences, and Statistics.

Preliminary work for this project has received substantial seed funding from the Vice President of Research and the Office of the Dean of the College of Natural Sciences at Colorado State University, and has included weekly or biweekly meetings of a multidisciplinary team of faculty for more than two years. We have written, and distributed as open source, highly efficient model exploration software that calculates the in vacuo scattering pattern for a protein, nucleic acid, or complex. Rigorous hand-optimization of the software has resulted in a 130-fold reduction in execution time.
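To give a rough idea of what computing an in vacuo scattering pattern involves, the sketch below evaluates the Debye formula, I(q) = sum over atom pairs (i, j) of f_i f_j sin(q r_ij) / (q r_ij), which is a standard textbook approach. It is an illustrative assumption rather than a description of the distributed software: the function name `debye_intensity` is hypothetical, and the form factors are treated as q-independent constants for simplicity, whereas real atomic form factors vary with q.

```python
import numpy as np

def debye_intensity(coords, form_factors, q_values):
    """Naive O(N^2) Debye-formula intensity for N point scatterers.

    coords       : (N, 3) atomic coordinates
    form_factors : (N,) constant form factors (a simplifying assumption)
    q_values     : (M,) momentum-transfer values
    Returns an array of M intensities.
    """
    coords = np.asarray(coords, dtype=float)
    f = np.asarray(form_factors, dtype=float)
    q = np.asarray(q_values, dtype=float)

    # All pairwise interatomic distances r_ij, shape (N, N)
    diff = coords[:, None, :] - coords[None, :, :]
    r = np.sqrt(np.sum(diff**2, axis=-1))

    # Products f_i * f_j, shape (N, N)
    ff = np.outer(f, f)

    # np.sinc(x) = sin(pi x) / (pi x), so np.sinc(q r / pi) = sin(q r) / (q r),
    # and it evaluates to 1 when q r = 0 (self terms and q = 0).
    qr = q[:, None, None] * r[None, :, :]
    return np.sum(ff[None, :, :] * np.sinc(qr / np.pi), axis=(1, 2))
```

The double sum over atom pairs at every q value is what makes a direct evaluation expensive for large assemblies, which is why optimizing this kind of calculation pays off.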

Below is a more detailed diagram of the data analysis process that we are developing.

mstrout@cs.colostate.edu .... August 15, 2010