To the untrained eye, raw cryo-EM data looks very much like the static on a television when the input signal is lost. Somewhere buried in the black, white and grey are 2D projections captured from individual proteins that need to be located, aligned, averaged, and then combined to produce a 3D map.
Put simply, the feasibility of high-resolution structure determination depends on how easily we can locate individual particles in the images (particle size) and how similar individual particles are to one another (compositional/conformational homogeneity). Very small proteins are much harder to “see” and, as with X-ray crystallography, highly mobile regions of the protein will be washed away during averaging.
Is my protein big enough?
Ideally, your protein/complex of interest is 100 kDa or greater. A little smaller is ok, but this can increase the challenges of sample prep, data acquisition and processing. Certainly, very small macromolecules have been studied by cryo-EM, including the 52 kDa streptavidin at 2.6 Å (1) and the 40 kDa SAM-IV riboswitch at 3.7 Å (2), but such structures are far from routine, especially in drug discovery pipelines.
There’s an important caveat when it comes to mass in cryo-EM workflows, however. We need observable mass, i.e., well-ordered, conformationally constrained proteins or parts of proteins that are large enough to detect in the noise of micrographs. The concept of ordered mass is thus critical in assessing how “cryo-EM-able” a given construct may be, and not all 100 kDa proteins are equal in this regard.
What should I look out for?
- Beads on a string. Proteins with long inter-domain linkers tend to be structurally dynamic in the absence of other binding partners. Because the domains orient differently with respect to one another in each particle in the sample, the apparent mass of the protein in the context of cryo-EM is only that of the largest structurally well-defined region. The typical approach in crystallography pipelines would be to study individual subregions as separate constructs. This, however, typically creates protein constructs that are too small for cryo-EM and diminishes the overall functional understanding of the full-length protein. Adding cognate binding partners can stabilize the region of interaction, contribute ordered mass, and increase the biological information in the resulting structure. For examples of large complexes, check out the recent structures of SAGA (3) and BAF (4).
- Strings or pairs of pearls. Even short inter-domain linkers can present challenges when the individual domains do not meet minimum mass requirements. Repeating units of cadherin domains are a good example of this. Each is small and can shift in orientation relative to the next in the sequence. Stabilization of domains relative to one another would be necessary to achieve high resolution in these structures.
- Waving flags. Mass that “dangles” from the primary domain of interest does not contribute to the overall mass needed for cryo-EM studies either. Small domains that are highly mobile relative to the rest of the molecule are not typically observed unless they can be captured in one or more stable positions through interactions with the body of the protein.
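The idea of ordered mass in the list above can be made concrete with a small calculation. The following is a minimal sketch, not an established tool: it assumes you can assign each domain to a "rigid group" (domains that hold a fixed orientation relative to one another) and takes the apparent mass to be the total of the largest such group. The function name, the group labels, and the domain masses are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def apparent_ordered_mass(domains):
    """Return (group_label, mass_kda) for the largest rigid group.

    domains: iterable of (name, mass_kda, rigid_group) tuples.
    Domains sharing a rigid_group label are ASSUMED to move as one body;
    a domain on a flexible linker gets its own label.
    """
    totals = defaultdict(float)
    for _name, mass_kda, rigid_group in domains:
        totals[rigid_group] += mass_kda
    # The largest rigid body is what particle picking and alignment can use.
    return max(totals.items(), key=lambda item: item[1])


# Hypothetical 120 kDa construct: only the 65 kDa core moves as one body.
construct = [
    ("catalytic domain", 45.0, "core"),
    ("regulatory domain", 20.0, "core"),
    ("linker-tethered domain", 30.0, "bead"),
    ("C-terminal domain", 25.0, "flag"),
]

group, mass_kda = apparent_ordered_mass(construct)
total_kda = sum(m for _, m, _ in construct)
print(f"nominal mass: {total_kda:.0f} kDa, apparent ordered mass: {mass_kda:.0f} kDa ({group})")
```

In this made-up case a nominally 120 kDa protein behaves like a 65 kDa target, which is why the nominal molecular weight alone is a poor predictor of cryo-EM suitability.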
Are there strategies for increasing the ordered mass of my target?
Cognate binding partners, high-affinity Fabs and emerging scaffolds are all options for increasing the ordered mass of cryo-EM targets, though each of these increases the complexity of the system. Multiple, non-competing Fabs can also be used to bulk up a single target complex. When Fabs are difficult to generate, full-length monoclonal antibodies can be used, with the important caveat that only the ordered mass of the Fab (~52 kDa) will be visible in the final map.
In each of these strategies, a relatively large protein-protein interface is needed to ensure that a tight lock forms between the components. Cryo-EM studies are typically carried out in the 0.5-5 mg/ml range, so care should be taken to confirm that the Kd is in an appropriate range. For weaker-affinity complexes, negative stain should be avoided as an assessment tool for cryo-EM readiness, because very dilute samples are typically used and the complex may dissociate.
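To see why concentration and Kd matter together, it can help to estimate how much of the target is actually in complex. The sketch below assumes simple 1:1 binding at equilibrium and solves the standard quadratic for the complex concentration. The function names, the 100 kDa target, the 1 µM Kd, and the 0.01 mg/ml negative stain concentration are illustrative assumptions, not values from a specific experiment.

```python
import math


def molar_conc_um(mg_per_ml, mass_kda):
    """Convert mg/ml to micromolar for a species of the given mass in kDa."""
    # (mg/ml) / kDa gives mmol/L; multiply by 1000 for micromolar.
    return mg_per_ml / mass_kda * 1000.0


def fraction_target_bound(target_um, partner_um, kd_um):
    """Fraction of target in complex, assuming 1:1 equilibrium binding."""
    b = target_um + partner_um + kd_um
    complex_um = (b - math.sqrt(b * b - 4.0 * target_um * partner_um)) / 2.0
    return complex_um / target_um


kd_um = 1.0  # assumed Kd for a moderate-affinity binder

# Hypothetical 100 kDa target with an equimolar partner.
for label, mg_per_ml in [("cryo-EM grid", 1.0), ("negative stain", 0.01)]:
    conc_um = molar_conc_um(mg_per_ml, 100.0)
    bound = fraction_target_bound(conc_um, conc_um, kd_um)
    print(f"{label}: {conc_um:g} uM target, {bound:.0%} in complex")
```

Under these assumptions roughly three quarters of the target is bound at 1 mg/ml, but under a tenth remains bound after a 100-fold dilution, so a negative stain grid could suggest the complex does not form when it would be largely intact at cryo-EM concentrations.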
Fusion partners such as MBP or GST, antibodies against affinity tags, and domain insertions with long linkers are not suitable strategies for increasing the ordered mass of a protein and should be avoided.