The eyes of chemistry
(Dedicated to Geoffrey Roughton)
For most humans, vision constitutes the main way to apprehend the world. We understand information best when we can see it in a three-dimensional frame. This explains the impact of X-ray crystallography from its outset, a century ago, as it allows us to see in atomic detail the main players in chemical and biological processes. However, deriving a three-dimensional structure from the diffraction experiment requires overcoming the phase problem. The evolution of crystallographic methods in the quest to retrieve the phases, which are lost in the diffraction experiment, has been linked to milestones in chemistry and biology. The field of crystallography has arguably been marked by the character of its pioneers, which possibly determined the percentage of women scientists, the collaborative spirit and the support of education and science in emerging economies.
Our group has developed structural methods to exploit the stereochemical knowledge present in small, yet very accurate, fragments. Their use to solve the central problem of crystallography, the phase problem, is implemented in our software ARCIMBOLDO. We are extending the use of fragments to map interpretation, to other diffraction methods currently undergoing exciting development, and to structural bioinformatics. As illustrated in the painting by Giuseppe Arcimboldo, the information content derived from a correct combination of fragments goes beyond their simple addition.
This article reflects the “Ellerslie talk” held jointly (with alternating contributions) by Claudia Millán and Isabel Usón at 9 Adams Road, Cambridge, on 28 July 2017.
An image is worth a thousand spectra
In 1991, towards the end of my Ph.D. after the usual four years of dedicated work, I was starting to panic that what I could write in the thesis would be more fitting for the XIX than for the XX century. I had accumulated a lot of information on the synthesis and properties of organometallic platinum clusters but paradoxically, I did not know what they were or how to explain their properties. For example, I knew that from the reaction:
I could isolate this luminescent cluster, which I had characterised by its spectra and by determining its elemental composition. In solution, it was elusive, its instability precluding the use of the solvents needed for NMR. I had also probed its reactivity but… I had no clue of what was happening at the molecular level. I knew all along that a crystal structure was needed. Indeed, solving the problems of obtaining stable crystals, measuring the diffraction intensities and determining the atomic structure displayed in Fig. 1 allowed all previous information to be placed in a visual framework and conclusively settled our many questions. The structure confirmed the intended direct bonds between both types of metal centres and the interactions with the fluorine substituents in the ligands. I was delighted, and I thought about how many others before me must have shared this same joy of literally seeing the answers so long pursued. But the structure also raised new questions: why was the coordination around lead, in oxidation state II, linear? Should there not be a stereochemically active lone pair of electrons? Why then was the structure not bent around the lead centre?
The Braggs shaped the character of crystallography
About one hundred years earlier, in 1895, X-rays had been discovered by Röntgen. Their ability to penetrate matter and yield image information (Fig. 2, left) had been patent from the outset. The diffraction by crystals observed by Max von Laue, Walther Friedrich and Paul Knipping served to simultaneously establish the nature of X-rays and of crystals. From these findings, William Bragg and his son, Sir Lawrence Bragg, realized that in the case of molecules, the natural grating provided by crystals would be required to amplify the scattering signal, by bringing a large number of equivalent atoms to add their contributions. We cannot see X-rays, so if we want to experiment with light scattering on macroscopic objects and see how it reflects periodicity and the underlying blocks making up the periodic object, we should choose the appropriate wavelength: monochromatic visible light. Among much excellent material available online to visualize diffraction on gratings, we would suggest a YouTube video (Video 1) showing green light scattered on periodic, everyday objects, such as spans of thread, strings of beads, screws, spiral springs mimicking the double helix structure adopted by DNA, or sieves illustrating lattices! As can be appreciated in the video, we are not seeing an image of the illuminated items, but the pattern is obviously related to the periodicity and to the underlying structure of the object.
The fundamental law of diffraction is named Bragg’s Law, and establishes the geometry of the diffraction pattern (Fig. 2, right). We can think of diffraction as reflection on sets of planes running through the crystal. Only at certain angles are the waves diffracted from different planes shifted by a whole number of wavelengths, i.e. in phase. For such angles, the intensity of the diffracted beams can be recorded on a detector. At other angles, the waves reflected from different planes are out of phase and cancel out.
Lawrence Bragg was the first to perform structural analysis by X-ray crystallography, determining the structures of various inorganic salts, such as NaCl. The Braggs analysed the diffraction pattern and figured out how it related to the structure in real space, placing the atoms that composed structures. Their work established the grounds for X-ray crystallography as we know it today, and was awarded a Nobel Prize in 1915, following Max von Laue’s prize from 1914.
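The condition that waves reflected from successive planes are shifted by a whole number of wavelengths is usually written nλ = 2d sin θ, where d is the spacing between planes and θ the angle between the beam and the planes. The following is a minimal sketch of that relation only; the function name and the numerical values (Cu Kα radiation of 1.5418 Å and a plane spacing of 2.82 Å, roughly that of the (200) planes of NaCl) are illustrative assumptions, not values taken from this article.

```python
import math

def bragg_angles(d_spacing, wavelength, max_order=10):
    """Return (n, theta_degrees) for every order n with n*lambda = 2*d*sin(theta).

    d_spacing and wavelength must be in the same units (e.g. angstroms).
    """
    angles = []
    for n in range(1, max_order + 1):
        sin_theta = n * wavelength / (2.0 * d_spacing)
        if sin_theta > 1.0:
            # No real angle satisfies the condition: this order does not diffract.
            break
        angles.append((n, math.degrees(math.asin(sin_theta))))
    return angles

# Illustrative (assumed) values: Cu K-alpha wavelength and a 2.82 angstrom spacing.
for n, theta in bragg_angles(d_spacing=2.82, wavelength=1.5418):
    print(f"order n={n}: theta = {theta:.2f} degrees")
```

Only a handful of orders satisfy the condition for a given spacing, which is why the diffraction pattern consists of discrete spots rather than a continuous image.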
Mentoring crystallographers regardless of their gender
The Braggs were not only pioneers in scientific terms. They also shaped the field by mentoring and influencing a new generation of crystallographers. In particular, their support of women scientists can be appreciated in their scientific descent and in the opportunities to which they promoted women, thus treating them as equals for the first time. Prominent examples include Joan Evans, who was invited in 1923 by William Bragg to be the first woman ever to deliver a Discourse at the Royal Institution; Lucy Wilson, who was Lawrence Bragg’s first research student in 1923-1924; Kathleen Lonsdale, one of the first two female Fellows of the Royal Society, elected in 1945, who had been a student in the group of William Bragg; and Helen Megaw, the first female staff member in Cambridge’s legendary Cavendish Laboratory, who was appointed in 1946 by Lawrence Bragg.
In the genealogy reproduced in Fig. 3, which displays part of the scientific descent of the Braggs, the unusually high number of distinguished women scientists is noticeable. In fact, possibly due to the Braggs’ direct intervention, crystallography has been differentiated from other scientific fields by a higher participation of women, who provide role models for young scientists, and also by the collaborative, open, joy-in-discovery attitude of many of its distinguished members. However, the actual percentage of women in crystallography was much lower than commonly perceived. While the popular feeling is that nearly half of the researchers in this field were women, in truth they amounted to less than 15% of entries in the World Directory of Crystallographers by the year 1981 (updated figures are available, but crystallography is currently practised by many scientists who do not define themselves primarily as crystallographers).
This biased perception must have been brought about by having such prominent figures in the field as Dorothy Hodgkin, Kathleen Lonsdale, Rosalind Franklin, Isabella Karle and, still active at present, Ada Yonath and Eleanor Dodson. Even in Spain, where the incorporation of women into academia cannot be compared with that in the UK, Sagrario Martínez-Carrera pioneered the development of crystallographic computing from the 1950s on. Nobel laureate Dorothy Hodgkin, who determined such key structures as penicillin and vitamin B12, was determinedly involved in the cause of scientists from what were, at the time, developing countries, in particular India and China. Eleanor Dodson nucleated the “collaborative computing project number 4” (CCP4), which has provided a unique model of cooperation in crystallography and done much to actively support science and mentoring worldwide.
Regarding relatives, the wife of Lawrence Bragg was Alice Hopkinson. She and her cousin happened to share the same maiden name and both families used to live in the same street in Cambridge. The other Alice Hopkinson married Francis John Worsley Roughton. Alice Roughton was the first woman to obtain a Ph.D. in psychiatry in Cambridge. Not only did she study: she practised, taking her commitment to an extreme. She was literally living with her patients, including psychotic cases or prospective convict teenagers, whom she brought into her home. After World War II, she heard of the German physicists imprisoned in Farm Hall, near Cambridge. She contrived to reach an agreement whereby Hahn, Heisenberg and their colleagues would be fetched and brought to Cambridge every evening to participate in the academic life and in the dinner provided at high tables. Later on, she continued throwing her house open to refugees from every conflict, scholars, artists and stranded people of every nationality. Alice Roughton was a psychiatrist, a medical campaigner, a pioneer in the movement against nuclear weapons and a conservationist, and her inspiring biography has recently been published. But the story of structure and chemistry brings us back to Jack Roughton, professor of Colloidal Science at Trinity College.
The physiology of respiration
The role of atomic structure in providing a visual framework to all our indirect information and prompting new questions extends to many chemical contexts: for instance, something as immediate as the physiology of respiration, which we all exercise some 13 times a minute. Much was known of this process, thanks to the works of the physiologist Jack Roughton in Cambridge, before an atomic model could be envisaged. As a result of his efforts (which required constructing the apparatus to measure the reactions of haemoglobin with gases) the oxygen equilibrium curve, the thermodynamics and some of the kinetics of the reaction with oxygen seemed to be well understood. Also established were the binding of oxygen at the iron centre, the binding of carbon dioxide at a different site in the peptide chain, and the cooperativity among the four different haemoglobin subunits in the active species. The Bohr effect, first discovered by physiologist Christian Bohr in 1904, explained how hydrogen ions and carbon dioxide affect the affinity of haemoglobin for oxygen. If the pH was below the normal physiological pH of 7.4, haemoglobin would not bind oxygen as well. But some vital gaps and anomalies remained by the time the structure was determined by Max Perutz.
Again, relating the previous knowledge to a three-dimensional frame provided conclusive information about the mechanism and the large structural changes involved, but opened new questions. For example, it was totally unexpected that the active sites, which were acting in cooperation, would be separated by large distances, rather than in immediate proximity. With this new structural insight, Jack Roughton continued researching the biochemistry of respiration from a new perspective.
The phase problem
As mentioned above, crystallography does not render a direct image, and no lenses for X-rays are available to reconstruct it. In order to calculate an electron density map, both the intensities (proportional to the squares of the amplitudes F) and the phases φ of the scattered beams, hkl, would be required, but only the former are recorded in the diffraction experiment:
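In the usual textbook form, the electron density map is the Fourier synthesis of the structure factors. The symbols V (unit cell volume) and x, y, z (fractional coordinates), as well as the sign convention of the exponent, are notation assumed here; F and φ are the amplitudes and phases named above.

```latex
\rho(x,y,z) \;=\; \frac{1}{V} \sum_{hkl} \left| F_{hkl} \right| \, e^{\,i\varphi_{hkl}} \, e^{-2\pi i\,(hx + ky + lz)},
\qquad I_{hkl} \;\propto\; \left| F_{hkl} \right|^{2}
```

The experiment delivers only the intensities on the right, so every term of the sum is known up to its phase factor.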
The problem of phasing is highly non-linear and has a very poor radius of convergence; thus search methods, rather than minimisation methods, are needed. Chemical structures can usually be solved ab initio, requiring neither previous structural knowledge of the unknown structure nor the collection of additional experimental data. Directly solving the phase problem for small chemical structures, with up to 200 atoms, is possible due to the excellent diffraction properties of typically well ordered crystals, which allow the measurement of many more independent reflections. This highly overdetermined problem, requiring comparatively few parameters, can be solved by computational brute force direct methods. The assumption of atomicity imposes statistical restraints on the phases. For the development of their equations to derive phases from the intensities, Herbert Hauptman and Jerome Karle received the Nobel Prize in 1985. It has often been rued that Isabella Karle, who actually got the method to work and solved the first structures this way, did not get to share in the award.
Alternatively, the presence of a few heavy atoms can be used to solve the structure of chemical molecules by the Patterson function (the Fourier transform with F² as coefficients), which can be calculated directly from the experimental data and gives information about interatomic distances, relating atoms with significantly more electrons than the rest of the structure.
In the case of macromolecules, the larger number of parameters and lower proportion of independent measurements accessible make ab initio methods unsuccessful. Crystals tend to be less perfect, as half of the volume is filled by disordered water, but this in turn opens the door to new phasing methods. Experimental phasing is based on inducing or differentiating a small molecule within the macromolecule, for instance by introducing heavy atoms like Hg. Because of the many electrons at these atoms, the X-ray intensities from derivatives differ appreciably from their native counterpart and these differences can be exploited to determine the structure of the heavy atoms, which in turn can be used to provide reference phases for the whole macromolecule. The first protein structures, those of myoglobin and haemoglobin, were so determined. Alternatively, a related structure can provide starting phases for the unknown one through Molecular Replacement: this requires placing the related structure in the unit cell of the target one, to best match the data.
Antibiotics constitute a class of molecules typically falling in between small molecules and macromolecules. They tend to give crystals with more than 200 independent atoms but so compactly packed that no space is left for disordered solvent. These crystals were too large for classic direct methods, could not be modified by diffusing in solutions of chemicals, and sufficiently close structures would not be available. The structure of many antibiotics was only achieved in the 1990s, 50 years after the determination of the structure of penicillin. These and other “large small molecules” needed recycling between the real space and reciprocal space formulation of the problem, constraining atomicity in both. For instance, vancomycin, a last resort antibiotic administered in hospitals against strains particularly resistant to common antibiotics, did not see its structure determined until 1996, even though crystals diffracting to atomic resolution had been available for over two decades. Such direct methods-based dual-space recycling methods finally succeeded in obtaining the vancomycin structure.
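Why heavy atoms stand out in a Patterson function can be seen in a one-dimensional toy crystal. The sketch below is an illustration under stated assumptions, not a crystallographic program: the cell of 60 grid points, the atom positions and the electron counts (80 for an Hg-like atom, 6 for a carbon-like one) are all hypothetical. It computes structure factors, keeps only their squared amplitudes as an experiment would, and Fourier-transforms those back.

```python
import cmath

N = 60  # number of grid points in the 1D unit cell (assumed)
# Hypothetical atoms: grid position -> number of electrons (two heavy, one light).
atoms = {6: 80.0, 21: 80.0, 31: 6.0}

def structure_factor(h):
    """Complex structure factor F(h) of the toy point-atom crystal."""
    return sum(z * cmath.exp(2j * cmath.pi * h * x / N) for x, z in atoms.items())

# The experiment records intensities only: the phases of F(h) are discarded here.
intensities = [abs(structure_factor(h)) ** 2 for h in range(N)]

def patterson(u):
    """Fourier transform with |F|^2 as coefficients, evaluated at vector u."""
    total = sum(i_h * cmath.exp(-2j * cmath.pi * h * u / N)
                for h, i_h in enumerate(intensities))
    return total.real / N

pmap = [patterson(u) for u in range(N)]

# Strongest non-origin peak in the first half of the cell (the map is symmetric).
strongest = max(range(1, N // 2), key=lambda u: pmap[u])
print("strongest interatomic vector:", strongest, "height:", round(pmap[strongest]))
```

Each peak sits at the vector between two atoms and has a height equal to the product of their electron counts, so the vector between the two heavy atoms dominates the map even though no phases were used.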
Fragments in phasing, map and structure interpretation
It is not surprising that in the absence of atomic resolution, enforcing atomicity is of limited use. Therefore, dual space recycling methods succeeded in extending the scope of direct methods to larger structures but remained tied to the requirement of exceptionally good data and heavily overdetermined problems, i.e. data to atomic resolution. Instead, they became essential in the solution of substructures of heavy atoms and anomalous scatterers required in experimental phasing. For such substructures, resolution is still “atomic” in the sense that their components are separated by longer distances and thus, resolved.
At the typical resolutions reached in macromolecular crystallography, rather than exploiting atomicity as a constraint, it was necessary to resort to the fact that macromolecular structures contain fragments of known geometry. Therefore, our group developed methods to exploit the stereochemical knowledge present in small, yet very accurate, structural units such as secondary structure fragments and their association into local folds. Their use to solve the phase problem is implemented in our software ARCIMBOLDO, named in analogy to the portraits this artist painted in the XVIth century out of common objects such as fruits and vegetables. We assemble structural hypotheses out of common fragments of secondary structure or small local folds, such as alpha helices or small beta sheets. This is achieved with the molecular replacement methods implemented in PHASER, based on Bayesian statistics, which provide a sensitive guide to decision making. If one such substructure, comprising some 6% of the total structure, is correct and accurate to 0.5 Å rmsd to the true structure, density modification and automatic map interpretation with SHELXE reveal the “portrait” of our protein. As most trials remain a still life, massive parallel computing is needed in difficult cases. A video reproducing the process can be seen on our YouTube channel (Video 2). As illustrated in the painting by Giuseppe Arcimboldo, the information content derived from a correct combination of fragments goes beyond their simple addition. This has required developing our own particular toolbox for the very detailed view required in phasing, which can be extended to the solution of other problems. In particular, we are also extending this view to map interpretation in autotracing and general structure interpretation.
The recent resolution revolution experienced in electron diffraction methods has brought cryo-electron microscopy into the world of high resolution, and thus of quantitative structure determination, and many computing methods are being developed starting from crystallographic ones.
Airlie McCoy is gratefully acknowledged for her lecture at the Bragg symposium at ECM 28 in Warwick, UK. We thank Nicolas Soler for expert feedback.
ICREA Research Professor,
Crystallographic methods group,
Structural Biology Unit – SBU,
Institute of Molecular Biology of Barcelona – IBMB,
- Usón R, Forniés J, Falvello LR, Usón MA, Usón I. “Synthesis and molecular structure of bis(tetrabutylammonium)bis[tetrakis(perfluorophenyl)palladium]plumbate(2-), the first lead(II) compound linearly bonded to two metal atoms.” Inorg Chem, 1992, 31: 3697.
- Röntgen WC. “Ueber eine neue Art von Strahlen. Vorläufige Mittheilung”. In: Sonderabbdruck aus den Sitzungsberichten der Würzburger Physik.-medic. Gesellschaft. Stahel, Würzburg, 1895.
- Friedrich W, Knipping P, von Laue M. “Interferenz-Erscheinungen bei Röntgenstrahlen.” Sitz. ber. Bayer. Akad. Wiss., 1912, 1912,14: 303.
- Bragg WH, Bragg WL. “The Reflection of X-rays by Crystals.” Proc R Soc A, 1913, 88: 428.
- “Diffraction gratings.” YouTube (channel: SuperLaser123), 00:03:19 (2014). ID: jzmqeRp_tmk.
- Bragg WL. “The structure of some crystals as indicated by their diffraction of X-rays.” Proc R Soc A, 1913, 89: 248.
- Julian MM. “Women in crystallography”. In: Women of science: righting the record, Kass-Simon G, Farnes P, Nash D (Ed). Bloomington: Indiana University Press, 1990, p. 335.
- Martínez-Carrera S. “Commission on crystallographic apparatus.” Acta Cryst B, 1987, 43: 408.
- Puiggròs XM, Alice’s. La biografia de la doctora humanista Alice Roughton en el Cambridge del segle XX. L’art de la memoria, 2018.
- Ferguson JKW, Roughton FJW. “The chemical relationships and physiological importance of carbamino compounds of CO2 with haemoglobin.” J Physiol, 1934, 83: 87.
- Perutz MF, Rossmann MG, Cullis AF, Muirhead H, Will G, North ACT. “Structure of haemoglobin: a three-dimensional Fourier synthesis at 5.5-A. resolution, obtained by X-ray analysis.” Nature, 1960, 185: 416.
- Roughton FJW. “Some recent work on the interactions of oxygen, carbon dioxide and haemoglobin.” Biochem J, 1970, 117: 801.
- Karle J, Hauptman H. “A theory of phase determination for the four types of non-centrosymmetric space groups 1P222, 2P22, 3P12, 3P22.” Acta Cryst, 1956, 9: 635.
- Karle IL. “Molecular formula, configuration and conformation by X-ray analysis.” Pure Appl Chem, 1977, 49: 1291.
- Patterson AL. “A Fourier Series Method for the Determination of the Components of Interatomic Distances in Crystals.” Phys Rev, 1934, 46: 372.
- Hendrickson WA. “Evolution of diffraction methods for solving crystal structures.” Acta Cryst A, 2012, 69: 51.
- Sheldrick GM, Hauptman HA, Weeks CM, Miller R, Usón I. “Ab initio phasing”. In: International Tables for Crystallography, Vol. F: Crystallography of biological macromolecules, Rossmann MG, Arnold E (Ed). Springer, Dordrecht, 2006, p. 333.
- Weeks CM, Adams PD, Berendzen J, Brunger AT, Dodson EJ, Grosse-Kunstleve RW, Schneider TR, Sheldrick GM, Terwilliger TC, Turkenburg MG, Usón I. “Automatic Solution of Heavy-Atom Substructures”. In: Methods in Enzymology, Vol. 374, Carter CW, Sweet RM (Ed). Elsevier, 2003, p. 37.
- Millán C, Sammito M, Usón I. “Macromolecular ab initio phasing enforcing secondary and tertiary structure.” IUCrJ, 2015, 2: 95.
- Oeffner RD, Afonine PV, Millán C, Sammito M, Usón I, Read RJ, McCoy AJ. “On the application of the expected log-likelihood gain to decision making in molecular replacement.” Acta Cryst D, 2018, 74: 245.
- Usón I, Sheldrick GM. “An introduction to experimental phasing of macromolecules illustrated by SHELX; new autotracing features.” Acta Cryst D, 2018, 74: 106.
- “4J5M structure solution using ARCIMBOLDO_SHREDDER spheres.” YouTube (channel: ArcimboldoTeam), 00:03:04 (2017). ID: KdmmujCit3o.
- Sammito M, Millán C, Rodríguez DD, de Ilarduya IM, Meindl K, Marino ID, Petrillo G, Buey RM, de Pereda JM, Zeth K, Sheldrick GM, Usón I. “Exploiting tertiary structure through local folds for crystallographic phasing.” Nat Methods, 2013, 10: 1099.
- Kuhlbrandt W. “The Resolution Revolution.” Science, 2014, 343: 1443.