Annu. Rev. Biophys. Biomol. Struct. 1994. 23:377-405
Copyright © 1994 by Annual Reviews Inc. All rights reserved
KEY WORDS: nanotechnology, protein engineering, atomic force microscopy,
supramolecular chemistry
The range of structures termed molecular machines is extraordinarily diverse. Examples include pairs of triptycene moieties with aryl groups that mesh to enforce gearlike corotation (65), polymeric materials that can be driven through cycles of contraction and relaxation by changes in pH (99), zeolite catalysts, enzymes, and the bacterial flagellar motor. A recent review stated that "a molecular machine, commonly an assembly of polymeric molecules, is a structural fabrication that can convert energy from one form or location to another" (99, p. 819). This review instead focuses on a more restricted class of objects: those in which specific molecules combine to form "a structure consisting of a framework and various fixed and moving parts, for doing some kind of work" (35a); the quoted language is a dictionary definition of machine.
Theoretical studies of mechanically guided chemical synthesis (mechanosynthesis) have described some of the useful work that molecular machine systems can perform (18, 24, 61, 67); other studies describe how they can perform computation (20, 22, 24, 63). Thus, besides processing energy, molecular machine systems can also process matter and information. Among these processing functions, mechanosynthesis is of basic importance because it can be used to build molecular machine systems able to perform the other functions. Developing molecular machine systems capable of mechanosynthesis is thus a strategic objective.
In the next section, I compare solution synthesis and enzymatic synthesis
with mechanosynthesis and describe the physical principles of mechanosynthetic
systems and processes. This description helps define objectives for molecular
machine research, indicating the size and nature of the required structures.
In the following two sections, I review the state of the art in techniques
for implementing molecular machine systems and use the requirements of
mechanosynthesis to focus attention on key concerns. The first of these
sections discusses techniques for molecular positioning using atomic force
microscope technologies; a development approach based on these techniques
would use mechanosynthesis to build molecular machines. The second reviews
progress in protein engineering as a basis for the design and construction
of complex functional devices; a development approach based on these techniques
would build molecular machine systems via Brownian self-assembly of protein-like
molecules to form supramolecular structures. Thus, mechanosynthesis provides
a goal for molecular machine development, and AFM positioning and engineering
of protein-like molecules provide alternative means toward that end.
Although a spectrum of intermediate cases can be identified, enzymatic synthesis differs significantly from the standard solution-phase model. Enzymatic reactions (11) begin with reagent binding, which places molecules in well-controlled positions and orientations. The resulting high effective concentrations (often augmented by strain energy, polarization, and catalytic groups) result in high reaction rates. The specificity of the binding interactions determines reaction geometries and results in highly specific reactions. Most alternative reagents will not bind in the active site, and among bound reagents, most alternative reaction geometries are excluded.
Mechanosynthesis as defined here differs from enzymatic catalysis (again, a spectrum of intermediate cases can be identified), yet many of the same principles apply. Anticipated molecular machine systems can position bound reagent molecules with respect to one another and guide the chemical reactions with a degree of specificity not found in solution-phase chemistry (24). The strain energy applied by enzymes is no greater than a fraction of the potential binding energy of the substrate (11), but power-driven mechanosynthetic devices can apply strains of bond-breaking magnitude (24). The provision of enzyme-like reaction environments is also feasible.
One can perform mechanosynthesis by using macroscopic devices, such
as scanning tunneling and atomic force microscope (STM and AFM) mechanisms.
The first clear example of a mechanically controlled synthesis (albeit
of a noncovalent structure) was the arrangement of 35 xenon atoms on a
nickel crystal to spell "IBM" (27). STM manipulation
of atoms and molecules has been extended, for example, to CO bound to platinum
and to silicon atoms on silicon (56). Although
these processes still lack the control necessary to build molecular machine
systems, they demonstrate the direct, mechanically guided rearrangement
of the fundamental building blocks of matter.

Intramolecular reactions can be accelerated by electronic effects (in
which case one might say that the reacting groups have been altered), by
release of steric strain (a piezochemical effect), by differential solvation
effects, or more commonly, by proximity and orientation. When groups A
and B are linked by a flexible polymer, the effective concentrations take
on conventional values (less than 55 M, the concentration of water molecules
in water). When A and B are held in close proximity and in the correct
alignment for a reaction, the effective concentration can be much higher.
Thiols in proteins exhibit effective concentrations for disulfide formation
of
COMPLIANCE AND THERMAL MOTION The effective concentration of a favorably positioned and oriented group is greater than or equal to the spatial probability density of a reference point in that group (e.g. a nuclear position) relative to a coordinate system based in the other group. This probability density can be calculated from the temperature and the potential energy surface of the system.
In terms of the compliances c (reciprocal stiffnesses, measured in m/N), the potential energy of a group positioned by a linear mechanical system is
$$E = \frac{1}{2}\left(\frac{x^2}{c_x} + \frac{y^2}{c_y} + \frac{z^2}{c_z}\right)$$
for a suitable choice of coordinates that measure displacement of one group with respect to another, provided that deformation of the groups themselves can be approximated as linear. (Conformational flexibility would violate this condition.)
PROBABILITY DENSITY AND EFFECTIVE CONCENTRATION In both classical and quantum statistical mechanics, the probability density for a harmonic system is the product of the Gaussian probability densities along each of the coordinates. A comparison of classical and quantum mechanical results shows that the classical approximation is adequate for describing most mechanical systems of nanometer or larger scale at room temperature (24); quantum mechanical corrections become substantial (0.1) only at extremes of low temperature, high stiffness, or low mass (e.g. for displacements of individual hydrogen atoms).
In the classical approximation, the Gaussian distribution for a single coordinate is characterized by a standard deviation
$$\sigma = \sqrt{kTc}$$
At the peak of the Gaussian distribution, the local concentration of a group is
$$C_{\mathrm{local}} = \frac{1}{1000\,N_a}\,(2\pi kT)^{-3/2}\,(c_x c_y c_z)^{-1/2}$$
where $N_a$ is Avogadro’s number and energies are in joules. For a system with a compliance of 1 m/N along each of three axes, $C_{\mathrm{local}}$ approximately equals 400 M at 300 K. If the transverse compliance of the transition state is small compared with the compliance of the positioning mechanism (as is typical), the local concentration calculated above is approximately equal to the effective concentration (in the absence of orientational effects, which can increase the effective concentration by several orders of magnitude). This formulation neglects the steric interaction of the two reagent groups, which typically increases the effective concentration via excluded volume effects.
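The 400 M figure can be checked numerically. The following is a minimal sketch, assuming the classical result that each coordinate is Gaussian with variance kTc and that the local concentration is the peak of the resulting three-dimensional density; the function names are illustrative and are not taken from the cited work.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
N_A = 6.02214076e23   # Avogadro's number, 1/mol


def positional_sigma(compliance, temperature=300.0):
    """Classical standard deviation (m) of thermal displacement along one axis.

    compliance is in m/N; assumes a harmonic (linear) positioning system.
    """
    return math.sqrt(K_B * temperature * compliance)


def local_concentration(cx, cy, cz, temperature=300.0):
    """Peak local concentration (mol/L) of a group held with compliances cx, cy, cz."""
    kt = K_B * temperature
    # Peak of a product of three Gaussians, in groups per cubic metre.
    peak_density = (2.0 * math.pi * kt) ** -1.5 * (cx * cy * cz) ** -0.5
    # Divide by Avogadro's number and by 1000 L/m^3 to obtain mol/L.
    return peak_density / (1000.0 * N_A)


sigma = positional_sigma(1.0)
c_local = local_concentration(1.0, 1.0, 1.0)
print(f"sigma = {sigma * 1e9:.3f} nm, C_local = {c_local:.0f} M")
```

With a compliance of 1 m/N on every axis, the result falls within a few percent of the value quoted in the text.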
A useful model for many mechanosynthetic systems treats the positioning system as applying force along the z axis that presses a group against an unyielding wall. Neglecting stiffness along the z axis, the probability density function is exponential in z, while remaining Gaussian in x and y. The effective concentration is then
$$C_{\mathrm{local}} = \frac{1}{1000\,N_a}\,\frac{F}{kT}\,(2\pi kT)^{-1}\,(c_x c_y)^{-1/2}$$
where F is the applied force in newtons. For F = 0.01 nN, with a compliance of 1 m/N along each of the remaining axes, $C_{\mathrm{local}}$ is approximately 150 M at 300 K.
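The wall model can be checked in the same way. This sketch assumes that the density along z is the normalized exponential (F/kT) exp(-Fz/kT) for z >= 0, evaluated at the wall, multiplied by Gaussian peaks in x and y; the function name is hypothetical.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
N_A = 6.02214076e23   # Avogadro's number, 1/mol


def wall_concentration(force, cx, cy, temperature=300.0):
    """Peak concentration (mol/L) for a group pressed against a wall by a force.

    force is in newtons; cx and cy are transverse compliances in m/N.
    """
    kt = K_B * temperature
    z_peak = force / kt                                        # exponential density at the wall, 1/m
    xy_peak = (2.0 * math.pi * kt) ** -1 * (cx * cy) ** -0.5   # Gaussian peaks in x and y, 1/m^2
    return z_peak * xy_peak / (1000.0 * N_A)


c_wall = wall_concentration(force=0.01e-9, cx=1.0, cy=1.0)
print(f"C_local = {c_wall:.0f} M")
```

For F = 0.01 nN and 1 m/N transverse compliances, the value is close to the roughly 150 M stated above.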
Rate increases from positioning are purely entropic: confining the reacting groups to a smaller volume of configuration space increases the frequency of reactive encounters. Positional control of reagents suppresses reactions at remote sites; along an axis with a compliance of 0.66 m/N, the effective concentration of a group at a distance of ~0.3 nm is $10^{-8}$ of the peak (in the absence of reactive groups from other sources). This effect is enthalpic: displacement of the group to a remote reaction site imposes a large strain energy on the supporting structure. Mechanical forces can reduce reaction energy barriers by applying a strain energy that is relieved by passage through the transition state. These favorable enthalpic effects are discussed elsewhere (24).
More accurate models of mechanosynthetic control can be developed from
transition-state theory and knowledge of the reaction potential energy
surface, but such models are beyond the scope of this review. Models based
on effective concentration suffice to show how reaction rates at selected
sites can be accelerated, while reactions at alternative, chemically equivalent
sites can be suppressed. Provided that the desired reactions occur with
high reliability (for a review of requisite conditions, see 24), mechanical
control will permit syntheses with $10^{6}$ sequential reactions
to proceed with high yield.
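The link between per-step reliability and overall yield in a long reaction sequence can be sketched as follows. The sketch assumes that steps fail independently with a fixed probability; the numerical failure probability and target yield are illustrative assumptions, not values from the text or from reference 24.

```python
import math


def overall_yield(n_steps, p_fail):
    """Probability that all n_steps succeed, given independent per-step failures."""
    # log1p keeps precision when p_fail is very small.
    return math.exp(n_steps * math.log1p(-p_fail))


def max_failure_probability(n_steps, target_yield):
    """Largest per-step failure probability that still reaches target_yield."""
    return -math.expm1(math.log(target_yield) / n_steps)


n = 10**6
print(f"yield at p_fail = 1e-8: {overall_yield(n, 1e-8):.4f}")
print(f"p_fail allowed for 90% yield: {max_failure_probability(n, 0.9):.2e}")
```

Under these assumptions, a million-step sequence needs per-step failure probabilities far below one in a million to finish with high yield, which is why the reliability conditions matter.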
Figure 1 Overall geometry of a mechanosynthetic device based on a Stewart platform. The lengths of the six struts are controlled externally, thus enabling the platform to be moved in six degrees of freedom.
Figure 2 A two-dimensional diagram of a portion of the range of motion available to a Stewart platform like that in Figure 1.
Figure 3 A rotary bearing based on a modified diamond structure, designed and analyzed in collaboration with R. Merkle at Xerox Palo Alto Research Center. Interlocking ridges provide a large axial stiffness, while calculated rotational energy barriers are <0.001 kT at 300 K. Fabrication of such structures will require mechanosynthesis.
Either way, the initial products will be polymeric structures. Mechanosynthetic
systems immersed in solution should enable the assembly of complex structures
from monomers joined to form an extensively cross-linked polymer. Complex
structures made by synthesis and self-assembly will likely be built from
polymeric building blocks with fewer cross-links. The feasible Young’s
modulus for protein-like structures is presumably no less than that of
structural proteins: 4 GPa for wool (containing alpha-helical keratin)
(102), 10 GPa for silk (along the beta-strand
direction) (102), 10 GPa estimated for bacterial
flagella and F-actin (polymers of globular proteins) (73).
The feasible modulus of elasticity for cross-linked polymeric structures
is presumably in the range of biological and engineering polymers, for
example, 1–14 GPa for various resins (97) and
100 GPa for cellulose (102). It may be feasible
to stiffen self-assembled structures by incorporating grooves that bind
stiffer polymeric chains or chemical precursors to chains or nets.
A more detailed examination of mechanosynthetic devices (24)
describes means for binding appropriate reactive molecules and for using
externally imposed pressure fluctuations to drive and control operations.
The present survey of mechanosynthesis and the requirements it imposes
on molecular machine design suffices to identify the fabrication of complex
and stiff aperiodic structures as a key development problem.
The degree of positional control, together with the relatively low compliance of the tip with respect to transverse displacements (41 m/N), suits AFMs for use as positioning mechanisms in a mechanosynthetic system. The chief remaining requirement is the stiff attachment of molecules (e.g. monoclonal antibody fragments) able to bind and orient reagent molecules from a dilute solution (23–25). A system of this sort has not yet been demonstrated, but the physical principles of each component are sufficiently well understood to enable an estimate of the likely properties. Anticipated effective concentrations are on the order of 100 M, which would provide localized reaction-rate accelerations (relative to background reactions) on the order of $10^{8}$ (24). These accelerations should suffice to build structures containing $10^{5}$ monomers.
A possible sequence of objectives in pursuing this line of development
includes: (a) binding desired molecules to tips in controlled orientations,
(b) placing and bonding several reactive groups, and (c) building increasingly
complex systems by extending these operations. This approach to molecular
machine development has substantial advantages and has been reviewed quite
recently (24). The balance of the present
review focuses on self-assembling systems, which are more closely related
to biological systems and hence to biophysical concerns.
For stability, covalent bonds usually must join the smallest units: few other links between small molecules withstand thermal agitation at ordinary temperatures. Polymeric structures have natural advantages for making diverse structures from small units; they are used in biology, and some (peptides and DNA) have well-developed synthetic methodologies. Self-assembled structures must be stiff to serve as good machine components; this requirement favors structures with dense internal packings, ideally of rigid, bulky substructures.
When joining larger units, a combination of many weak interactions typically serves better than the formation of a few strong bonds: weak interactions can combine to give strong binding in a particular geometry but little binding in other geometries. This phenomenon enables reliable assembly steps. Large, asymmetrical structures suitable for use as machine systems could then be assembled using any of several different strategies. One is to give each subunit interface a unique structure with unique binding propensities (as in the ribosome) and mix blocks of all types simultaneously. Another strategy is to emulate solid-phase synthesis by using cyclic exposure to different subunits to control the sequence in which components are added. Structural control then results partly from intrinsic binding specificities and partly from an externally imposed sequence of operations. This approach falls outside recently described, biologically inspired models for controlling self-assembly (55).
The surfaces of assembled components must meet various functional constraints.
These will presumably include provision of binding sites for other molecules,
such as tools, auxiliary structural elements, and perhaps electronically
active components. Binding of other molecules (perhaps followed by their
polymerization) can provide a mechanism for altering surface structure
after the primary self-assembly process. Binding these molecules may help
in meeting the conditions (24) for interfacial
sliding with small energy barriers and thereby facilitate the design and
fabrication of bearing surfaces.
NUCLEIC ACIDS Ease of synthesis and ease of designing matching structures are two reasons for using nucleic acids as building materials. Nonlinear structures can be engineered from DNA; for example, Chen & Seeman have made a framework containing eight DNA junctions, which form a structure with the connectivity of a cube (8). However, the size and geometry of base-paired nucleotides, as well as the presence of a charged backbone, render the design of dense structures awkward. Furthermore, biological systems have (with a few striking exceptions) exploited proteins rather than nucleic acids as a basis for molecular machinery and structural components. Although nucleic acids deserve further consideration, protein-like materials seem more attractive.
STANDARD PROTEINS Naturally occurring molecular machines are built chiefly of standard proteins, which are unbranched polymers of the 20 genetically encoded amino acids. Engineered standard proteins, which can often be made by biological means, can closely resemble natural models, thereby facilitating their design. The chief disadvantages of standard proteins are the limited choices of monomers and chain topology. The de novo design of standard proteins was recently reviewed (89), and successful efforts indicate that engineering of standard proteins could provide a path to complex molecular machine systems. This review, however, focuses on the advantages gained from working with a broader class of structures.
PROTEIN-LIKE MOLECULES A primary motivation for engineering standard proteins is to understand the relationship between structure and stability in natural proteins (95). When engineering nonstandard, protein-like molecules, however, the primary motivation is to take the knowledge gained from engineering standard proteins and use it to make structures that fold more stably and predictably than natural or engineered standard proteins. Potentially useful differences from standard proteins include use of nonstandard amino acids and nonlinear chain topologies, as discussed below.
Researchers can incorporate nonstandard amino acids into proteins by using chemically aminoacylated tRNA in RNA-directed protein synthesis in vitro (4, 9, 72). These techniques have been used to probe protein stability (28, 59), thereby extending and confirming the results of more conventional mutational studies (95).
Scientists continue to review and improve methods for solid-phase peptide synthesis (30, 49); these methods enable wholesale use of nonstandard amino acids. Synthesis of chains that contain over 30 amino acid residues is now routine, although certain sequences are more difficult than others. To the extent that difficult sequences can be predicted (101), experimenters can avoid them by imposing suitable constraints during design (an option not available when making a predetermined natural product). Protein-like molecules (50 residues) can be constructed either directly or by linking several shorter peptides. The latter approach presents serious difficulties if the links must be peptide bonds (e.g. to make a standard protein) but lesser difficulties if links are nonpeptide bonds (e.g. disulfide or metal-ligand bonds) formed between deprotected chains.
OTHER POLYMERS Starburst dendrimers (98)
form a diverse class of highly ordered three-dimensional molecular structures.
Examples thus far, however, have had high symmetry or flexibility, which
precludes their use as building blocks for self-assembled molecular machine
systems. Future developments could remove these limitations.
Yun-yu et al (108) have argued that present methods for computing free energy differences cannot accurately predict the effect of mutations on protein stability because of the difficulty of sampling enough conformations to compute entropic contributions. This prediction limitation, however, need not present difficulties for design. Consider a set of structural changes, each of which is of a type that is statistically favorable but individually uncertain. Because effects on stability are approximately additive, researchers can reliably predict that the joint effect of many such changes will be large and positive.
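The statistical argument can be made concrete with a small sketch. It assumes that each change contributes an independent, normally distributed increment to stability; the mean and standard deviation used here (0.5 and 1.0 kcal/mol) are purely illustrative and do not come from the cited studies.

```python
import math


def joint_effect(n_changes, mean_each, sd_each):
    """Mean, standard deviation, and probability of a net gain for n additive changes.

    Assumes independent, normally distributed effects (all in kcal/mol).
    """
    total_mean = n_changes * mean_each
    total_sd = math.sqrt(n_changes) * sd_each   # uncertainties add in quadrature
    z = total_mean / total_sd
    p_positive = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return total_mean, total_sd, p_positive


for n in (1, 20):
    mean, sd, p = joint_effect(n, mean_each=0.5, sd_each=1.0)
    print(f"n = {n:2d}: mean = {mean:.1f}, sd = {sd:.2f}, P(net gain) = {p:.3f}")
```

Because the mean grows as n while the spread grows only as the square root of n, the probability of a net stabilizing effect approaches certainty as changes accumulate, even though each one is individually uncertain.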
DESIGN AND STABILITY OF STANDARD PROTEINS DeGrado (13), Mutter & Vuilleumier (70), and Richardson et al (88, 89) have reviewed various aspects of the design of peptides and proteins. Several of the latter have been designed de novo. The first was α4 (14, 15, 86), which has been used as a basis for several subsequent modifications and refinements (37, 38, 82, 85). Others include a 79-residue de novo design (Felix) deliberately modeled on natural protein sequences; this structure folded successfully but with low stability (an estimated 0.8 kcal/mol) (41). Felix, a four-α-helix design, was as successful on the first try as a β-sheet design (by the same group) was on the ninth attempt; this difference was attributed to the greater modularity of helices and the lesser solubility of β-sheet structures. Another four-helix bundle structure has been designed to present a desired immunogenic determinant (47). A 242-residue α/β-barrel protein has been designed and appears to fold correctly (35). A major conclusion drawn from de novo protein design is the importance of negative design, which seeks not only to stabilize a desired fold but to destabilize alternative folds and patterns of aggregation (14, 88).
Until recently, proteins designed de novo have shown NMR spectra characteristic of disordered cores (89); by modifying an earlier design, Raleigh & DeGrado (82) produced a protein with more native-like characteristics, including a distinct melting transition. These modifications replaced multiple Leu residues (which made up the entire core of α4) with β-branched and aromatic residues and introduced shape complementarity into the packing interactions between helices. A separate effort, which yielded an even more well-ordered core, modified the α4 design by adding two zinc binding sites (38).
NATURAL PROTEINS ARE NEEDLESSLY DESTABILIZED Free energies of protein unfolding are favorable by about 5–20 kcal/mol (17), which corresponds to probabilities of occupying the unfolded state of about $10^{-4}$ to $10^{-15}$. Because protein folding times are about $10^{-1}$ to $10^{3}$ s (11), the mean waiting time to observe an unfolding event in an intact protein with a stability of 15 kcal/mol is over 100 years, which is longer than the lifespan of most organisms.
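These estimates follow from a two-state picture, sketched below. The sketch assumes a temperature of 300 K (the text does not state one), takes the folding rate as the reciprocal of the folding time, and obtains the unfolding rate as the folding rate multiplied by the unfolding equilibrium constant; the function names are illustrative.

```python
import math

R_KCAL = 1.987204e-3        # gas constant, kcal/(mol K)
SECONDS_PER_YEAR = 3.15576e7


def unfolded_probability(stability, temperature=300.0):
    """Equilibrium probability of the unfolded state for a stability in kcal/mol."""
    k_unfold = math.exp(-stability / (R_KCAL * temperature))
    return k_unfold / (1.0 + k_unfold)


def unfolding_wait_years(stability, folding_time, temperature=300.0):
    """Mean waiting time (years) before an unfolding event in a two-state model."""
    k_unfold = math.exp(-stability / (R_KCAL * temperature))
    # Unfolding rate = folding rate * K, so the wait is folding_time / K.
    return folding_time / k_unfold / SECONDS_PER_YEAR


for dg in (5.0, 20.0):
    print(f"{dg:4.1f} kcal/mol: P(unfolded) = {unfolded_probability(dg):.1e}")
print(f"wait at 15 kcal/mol, 0.1 s folding: {unfolding_wait_years(15.0, 0.1):.0f} years")
```

Using the fastest folding time in the quoted range gives the shortest waiting time, and even that estimate is well over a century for a stability of 15 kcal/mol.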
This waiting time is significant for protein design: because natural selection can only operate on events that occur, evolution has no direct mechanism for favoring proteins much more stable than those observed. The upper end of the stability range observed in vitro could be explained by the presence in vivo of denaturing influences, such as solutes, mechanical stresses, or elevated temperatures, which lead to selection for high stability. Alternatively, stability may sometimes correlate with other evolutionarily favored properties. Finally, free energies of unfolding for highly stable proteins in water are typically extrapolated from data on unfolding in concentrated guanidinium chloride, which may introduce errors.
Because random mutations tend to destabilize any particular folded structure,
evolutionary processes (either neutral or adaptive) tend to carry proteins
to the threshold of significant instability. Accordingly, the consistently
low stability of natural proteins is quite compatible with the goal of
constructing artificial proteins of high stability (18).
The infeasibility of predicting the folding of natural proteins is also
compatible with the goal of designing proteins that fold predictably (18),
as has been demonstrated (14).
BRANCHED CHAINS Mutter has observed that branched chains have a lower conformational entropy in the unfolded state (because of excluded volume effects) and that branched protein-like structures consistent with a desired fold can be expected to fold more stably and predictably. His laboratory has synthesized several structures (termed template-assembled synthetic proteins) that confirm these predictions (68–70).
Applying the same principle with a different synthetic methodology, Hahn et al synthesized a stable 73-residue peptide that had four helical branches (36). They included a substrate-binding site with catalytic residues patterned on chymotrypsin; this peptide cleaved ester substrates at ~0.01 times the rate of chymotrypsin, with a similar substrate affinity.
LOOPED CHAINS Linking a polymer to form a loop decreases its entropy, and fold-compatible cross-links accordingly favor the folded state by reducing the entropy of unfolding (79). Researchers have developed computer-aided design software for identifying potential loops and disulfide bridges between already-defined structures (74) and used engineered disulfide bonds to stabilize natural proteins (75).
Rizo & Gierasch (90) recently reviewed conformationally constrained peptides containing nonstandard structures. Cross-links between residues i and i+4 can be used to stabilize an α-helix. Formation of amide bonds between side chains during solid-phase synthesis has introduced multiple cross-links in good yields (29).
METAL-ION BINDING Coordination of side-chain ligand groups to a metal ion can result in branched or looped structures of increased stability. Because metal ions can bind selectively to deprotected peptides under mild conditions, they can be incorporated late in an assembly sequence. Metal-ion binding can be quite specific because different groups have affinities for different metals and complexation can be reversible. Furthermore, Cr(II) and Co(II) are labile with respect to ligand exchange, but Cr(III) and Co(III) are relatively inert. Experimenters have introduced the labile forms into metal binding sites in proteins and then oxidized them to the exchange-inert species (100). This process could be useful in self-assembly because it combines the advantages of reversible, cooperative binding with those of a stable product.
Regan (84) recently reviewed the design of metal-binding sites in proteins; among the design efforts reported, the success rate was high. Regan & Clarke modified α4 to incorporate a binding site for Zn(II) with two His and two Cys ligands; the resulting protein bound Zn(II) with an estimated dissociation constant of $2.5 \times 10^{-8}$ M and was significantly stabilized (85). Handel & DeGrado modified a de novo protein design (α4) to incorporate a three-His Zn(II) binding site and observed both binding and substantial stabilization (37). They subsequently incorporated two such sites, which yielded a structure in which the core’s stability was comparable to that of a natural protein (38). In a separate line of development, Pessi et al have engineered a three-His binding site in a protein modeled on a portion of an antibody variable domain (77). Hellinga & Richards designed a metal binding site that appears to bind Hg(II) as they predicted but binds Cu(II) in an unexpected mode (42, 43).
Metal complexes can stabilize isolated helices. One way to achieve such stability is to bind metal ions to pairs of nonstandard, metal-ligating residues (92); another approach results in an exchange-inert Ru(III) complex that binds pairs of histidine residues (32).
Metal complexes can also link separate peptide chains. For example, researchers have bound various divalent metal ions to assemble triple-helix bundle proteins in which amphipathic peptides are linked to a 2,2'-bipyridine functionality at their N termini (34, 54). A stable Ru(II) complex has been used to make a four-helix bundle protein by linking four pyridyl functionalities, each at the N-terminus of an amphiphilic peptide (33). These structures have branched topologies, the advantages of which were discussed earlier in this review.
MATCHED HYDROGEN-BONDED PAIRS Unlike nucleic acids, standard proteins lack specific pair-wise interactions between monomers (with the exception of cysteine pairing to form disulfides). A study surveying multiple protein structures found no tendency for specific pairs of hydrophobic side chains to associate in protein cores and no tendency for specific pairs to be surrounded by better packed neighbors (6). [In any specific protein, however, the backbone geometry might impose stringent constraints on the allowable set of side chains and conformations (80).]
Nielsen et al used standard solid-phase synthesis techniques to prepare polyamides that consisted of thymine-linked aminoethylglycyl units; these polymers bind tightly to DNA with sequence recognition (71). The use of a wider range of hydrogen-bonding groups in this manner would presumably enable the design of protein-like structures containing interfaces that had DNA-like complementarity.
REDUCED BACKBONE FLEXIBILITY Substituting non-Gly for Gly residues or Pro for non-Pro residues decreases the conformational entropy of the unfolded state, thereby stabilizing the folded state (provided that these substitutions are compatible with the folded geometry) (58). Expected improvements are about 1 kcal/mol per residue replaced; observed improvements in natural proteins are smaller but comparable. Helices are stabilized by α-methyl-α-amino acids (dialkylglycines); difficulties in chain synthesis can be overcome by incorporating them as the C-terminal components of protected dipeptides (2).
REDUCED SIDE-CHAIN FLEXIBILITY In several proteins, a buried side chain has multiple conformations discernible in X-ray structures (96), but most buried structures have no conformational freedom. Consequently, burial of flexible structures during folding imposes an entropic cost. The estimated TΔS costs for burial of residues in a well-packed core (78) are substantial: Ile, –0.89; Leu, –0.78; Met, –1.61; Phe, –0.58; Val, –0.51 (all in kilocalories per mole at 300 K).
Protein-like molecules can incorporate nonstandard amino acids that have lower conformational entropy per unit volume in the unfolded state. Examples might include amino acids with side chains containing cyclohexyl, norbornyl, or adamantyl moieties. Assuming these amino acids are incorporated into folded structures that have similar core packing densities, strain energies, and so forth, this reduction in conformational entropy will contribute directly to increasing fold stability. Furthermore, by limiting side-chain flexibility, this strategy should decrease the likelihood of alternative core packings in the folded state and yield a structure of improved predictability and mechanical stability.
INCREASED PACKING DENSITY Richards found that the packing density of protein interiors (the fraction of the volume inside the van der Waals surface) is like that of crystals of small organic molecules, 0.70–0.78 (87). The local packing densities within a single protein vary from ~0.60 to 0.85; poor side-chain packing correlates with good backbone hydrogen bonding (e.g. in regular secondary structure), while dense side-chain packing correlates with regions of less regular hydrogen bonding (87).
High packing density tends to correlate with stronger van der Waals interactions and burial of more hydrophobic area; hence, provided bad steric contacts are avoided, greater packing density should correlate with greater fold stability and greater mechanical stiffness, which would improve components for molecular machine systems. The variable packing densities in natural proteins suggest that improvements are usually possible. By providing additional side-chain options that add to the set of available shapes, the use of nonstandard amino acids multiplies the number of possible core packings and hence the number of packings with unusually high densities.
Several algorithms identify densely packed sets of side chains and conformations compatible with a specified backbone structure. Some investigators assume that side chains always occupy approximately rotameric states (local minima of strain energy) (16, 80). In other algorithms, simulated annealing is used to search for low-energy packings of a specified set of side chains, including nonrotameric conformations (45, 52).
A rotameric algorithm successfully identified a repacking of a substantial region of the core of a natural protein (42, 43). Another rotameric algorithm that included a molecular mechanics energy, a solvation energy, and an entropic term (also based only on rotameric states) gave good predictions of the relative catalytic efficiency of over 40 related enzyme-substrate combinations and was used to design an altered enzyme that is both highly active and selective for a nonnatural substrate (106). These successes with standard amino acids suggest that the potential of nonstandard amino acids for increasing packing densities can in fact be successfully exploited.
REDUCED TORSIONAL STRAIN A study of high-quality X-ray structures shows that roughly 20% of side chains are nonrotameric (i.e. have at least one torsion deviating by more than 20° from the optimal, rotameric angle); each has a strain energy of more than 1 kcal/mol (94). Even in a protein of only 70 residues, a mean strain energy of 1.5 kcal/mol for 20% of the side chains implies a destabilizing energy of 21 kcal/mol, which is greater than the net stability of the folded state. For a 300-residue protein, the estimated destabilization is 90 kcal/mol.
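The arithmetic behind these estimates is a simple product of chain length, nonrotameric fraction, and mean strain energy. The short sketch below reproduces it; the function name and defaults are illustrative, with the 20% fraction and 1.5 kcal/mol mean strain taken from the text.

```python
def torsional_destabilization(n_residues, nonrotameric_fraction=0.20,
                              mean_strain_kcal=1.5):
    """Estimated destabilizing energy (kcal/mol) from strained side chains."""
    return n_residues * nonrotameric_fraction * mean_strain_kcal


if __name__ == "__main__":
    for n in (70, 300):
        print(f"{n} residues: {torsional_destabilization(n):.0f} kcal/mol")
```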
From a fold-prediction perspective, the prevalence of nonrotameric side chains presents difficulties for rotamer-based algorithms (16, 80). From a fold-design perspective, however, this situation presents some opportunities: if rotamer-based algorithms succeed in designing a dense, hydrophobic core, the resulting structure will be less strained and hence more stable (perhaps by 20 kcal/mol) than an analogous natural protein. Core designs with a relaxed rotameric constraint will have a trade-off between increasing density and decreasing strain. Adding nonstandard side chains to the set of options shifts this trade-off curve in a favorable direction.
REDUCED BAD CONTACTS The mechanical compliances associated with torsional deformations and nonbonded contacts are comparable (though nonlinearities make this a rough comparison). Consequently, one would expect that packing forces, in storing a given amount of energy in torsional strain, would store a comparable amount of energy in bad van der Waals contacts. This effect would more or less double the total strain energy associated with nonrotameric torsions in side chains.
INCREASED CORE HYDROPHOBICITY Dill concludes that hydrophobicity is the chief force driving protein folding and that the pattern of hydrophobicity along a chain chiefly determines the general pattern of its fold (17). Hydrophobicity patterns have been a primary concern in de novo designs. The α4 design has an extraordinarily hydrophobic core and hydrophilic surface and an extremely high stability, ~15.4 kcal/mol in one version (38). The Felix design does not exaggerate hydrophobicity contrasts and has an unusually low stability, ~0.8 kcal/mol (41).
Based on a reinterpretation of the calorimetric studies of protein unfolding performed by Privalov & Gill (81) and on electrostatic calculations, Yang et al (107) have concluded that desolvation of peptide units and other polar structures has a large enthalpic cost (~1 kcal/mol per residue) even if these structures are hydrogen bonded in the folded state. This largely cancels the favorable effects of burying and desolvating hydrophobic side chains.
The cost of burying polar structure suggests strategies for increasing fold stability:
Perfluoroalkylated amphiphilic molecules are bulkier and more hydrophobic than their hydrogenated analogues and readily form supramolecular assemblies (51). Fluorinated side chains could also prove useful in engineering protein-like molecules.
NONAQUEOUS SOLVENT SYSTEMS Many water-soluble enzymes remain catalytically active after being lyophilized and dispersed in anhydrous organic solvents such as octane and toluene (50). Furthermore, many are stable to temperatures of 100°C and appear to have less conformational flexibility than in water (103). Antibody-hapten binding remains strong and specific in anhydrous organic solvents (93). Because natural proteins (which have evolved under selective pressures favoring marginal stability in aqueous media) can often tolerate anhydrous organic solvents, the design of stable protein-like molecules for anhydrous media is unlikely to prove difficult. The decreased flexibility of proteins in anhydrous media suggests that protein-like molecules may perform better as machine components in the absence of water.
IMPROVED AQUEOUS SOLUBILITY Good water solubility is routinely achieved in engineered helical proteins with a high density of hydrophilic side chains on the surface. Ionic side chains are strongly hydrophilic, and engineered salt bridges can stabilize folded structures. Helix formation, for example, is promoted by the presence of i, i+4 salt bridges between Glu- and Lys+ (57). Extensive use of such side chains also makes folding more predictable by providing a strong contrast in hydrophobicity between surface and interior residues.
SUMMARY Evolution does not maximize the stability of natural
proteins. Designs chosen from a larger set of possible structures, with
attention focused on maximizing stability, should have substantially greater
stability. Table 1 summarizes estimated contributions
from the stabilizing mechanisms discussed in this section. The estimated
total suggests that application of the strategies reviewed here can increase
folding stability by 100 kcal/mol. Even a less thorough and only partially
successful application of these strategies can likely produce structures
of excellent stability.

| | | |
|---|---|---|
| Looped chains (b) | | |
| Metal ion binding (c) | | |
| Reduced backbone flexibility (d) | | |
| Reduced side-chain flexibility (e) | | |
| Increased packing density (f) | | |
| Reduced torsional strains (g) | | |
| Reduced bad contacts (h) | | |
| Increased core hydrophobicity (i) | | |
| Sum of estimated increments | | |

(a) All changes are relative to a typical 70-residue natural protein with no disulfide cross-links and assume a design in which all structural features are compatible.

(b) Based on the change in conformational entropy from forming four 10-residue loops (79).

(c) Taken as equal to (b).

(d) Assumes a 1 kcal/mol increment per five residues through use of dialkylglycines.

(e) Assumes an average improvement of 0.1 kcal/mol per residue through substitution of bulky, inflexible side chains for more flexible side chains in the core.

(f) Exploitation of nonstandard side chains is assumed to increase packing density by 4% from the mean for proteins (0.74) to a high value for organic crystals (0.78), which is less than dense packings observed in proteins (0.85) (87). The stability increment is estimated from the 1.1 kcal/mol stabilization that results from filling a cavity the size of a methylene group (48).

(g) See text; assumes successful core packing with nearly rotameric side chains.

(h) Estimated at about 0.5 times the value for (g).

(i) Reduction of buried polar groups by 7% through avoidance of buried polar side chains, assuming 1 kcal/mol per polar residue removed.

Several of the stabilizing mechanisms discussed here decrease the entropy of the unfolded state: branched chains, looped chains, reduced backbone flexibility, and reduced side-chain flexibility; metal-ion binding is sometimes stable enough to act in this manner. Because these mechanisms operate by reducing the number of possible conformations, they reduce not only the entropy of the unfolded state but the number of potential misfolded states as well. Accordingly, they can be expected to reduce the likelihood that a designed molecule will fold to an approximately correct state but with a disordered core. Stabilization by reduced conformational entropy thus acts to increase folding predictability, thereby acting as an implicit form of negative design (89).
Most of the stabilizing mechanisms discussed here have been demonstrated
experimentally, although not all in one molecule. Applied in combination,
they seem more than adequate to allow the design of stable protein-like
structures of ~70 residues, while leaving extensive freedom in design of
the molecular surface, and hence of intermolecular interactions.
If one models monomers as cubes, then a family of protein-like molecules consisting of 4 × 4 × 4 = 64 monomers would have 16 monomers exposed on each face. If each monomer site permits only five variations in polarity (e.g. nonpolar, hydrogen-bond donor, hydrogen-bond acceptor, positive ion, or negative ion) and only two variations in geometry (bump or hollow), then each monomer site can have one of 5 × 2 = 10 functionally distinct structures, and each face can have one of 10^16 different structures. If we use one face structure as a reference, only ~10^-4 of a set of randomly generated structures will be complementary in polarity and geometry at more than half of the monomer sites. This calculation suggests that accidental complementarity will be rare, except in regular structures (e.g. the arrays of hydrogen-bonding groups found at the edge of a β-sheet).
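The counting in the cube model can be sketched in a few lines. The number of face structures follows directly from the text. The tail probability is computed here under an added assumption that is not spelled out in the text: each site matches the reference independently with a fixed probability `p_match`. The fraction obtained is sensitive to that choice and to how partial complementarity is scored, so `p_match` is left as a parameter rather than tuned to reproduce the quoted figure.

```python
from math import comb

SITES_PER_FACE = 4 * 4        # one face of a 4 x 4 x 4 cube of monomers
VARIANTS_PER_SITE = 5 * 2     # five polarity options times two geometry options


def face_structures(sites=SITES_PER_FACE, variants=VARIANTS_PER_SITE):
    """Number of distinct face structures in the cube model."""
    return variants ** sites


def prob_more_than_half_match(sites, p_match):
    """Binomial tail: chance that more than half of the sites match.

    Assumes independent sites with equal match probability (an
    illustrative assumption, not a statement of the original calculation).
    """
    threshold = sites // 2 + 1
    return sum(
        comb(sites, k) * p_match ** k * (1 - p_match) ** (sites - k)
        for k in range(threshold, sites + 1)
    )


if __name__ == "__main__":
    print("face structures:", face_structures())
    for p in (0.1, 0.2):
        print(f"p_match={p}: {prob_more_than_half_match(SITES_PER_FACE, p):.2e}")
```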
Most of the previously described strategies for improving fold stability are applicable to designing surfaces for tight, specific binding. The exceptions are altered chain topologies and backbone flexibility, which are irrelevant in already folded structures, and strong hydrophobicity, which would be incompatible with aqueous solubility. Of the others, metal-ion binding can drive quaternary assembly of natural proteins (31), and complementarity of hydrogen bonds and ionic species is common in antibody binding interfaces (12). Dense, low-strain packing of relatively inflexible structures is as favorable in the interface between molecules as in the interior of a single molecule.
Interfaces in quaternary structures are commonly hydrophobic (11), but studies of antibody-protein complexes demonstrate that hydrophilic protein surfaces can bind tightly in water (12). Pellegrini & Doniach successfully used computer simulation of intermolecular interactions to identify the binding sites of three antibodies to hen egg white lysozyme (76), thus suggesting that binding can be modeled well enough to support design.
A sequence of objectives in pursuing the development of molecular machine systems via self-assembly might be as follows:
Mechanosynthetic systems will rely on mechanical positioning to guide and control the molecular interactions of chemical synthesis. The effective concentration of a mechanically positioned species depends on the temperature and on the stiffness of the positioning system. These concentrations can be large (100 M) and localized on a molecular scale. Background concentrations can approach zero, thus enabling precise molecular control of the locations and sequences of synthetic operations. Researchers have developed concepts for mechanosynthetic systems and defined general technology requirements.
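The dependence of effective concentration on temperature and stiffness can be illustrated with a simple model. The sketch below is not the derivation used in the studies cited in this review. It assumes that the positioned species sits in an isotropic harmonic well of stiffness k_s, and that its classical thermal distribution is therefore a three-dimensional Gaussian with variance k_B T / k_s per axis. The effective concentration is taken as the peak probability density of that Gaussian. The stiffness values in the usage example are hypothetical.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant, J/K
N_A = 6.02214076e23    # Avogadro constant, 1/mol


def effective_concentration_molar(stiffness_n_per_m, temperature_k=300.0):
    """Peak density of a thermally excited harmonic positioner, in mol/L.

    Assumes an isotropic harmonic well and classical statistics
    (illustrative model only).
    """
    variance = K_B * temperature_k / stiffness_n_per_m     # m^2 per axis
    peak_density = (2.0 * math.pi * variance) ** -1.5      # molecules per m^3
    return peak_density / N_A / 1000.0                     # mol/m^3 -> mol/L


if __name__ == "__main__":
    for k_s in (0.1, 0.4, 1.0, 10.0):   # hypothetical stiffness values, N/m
        c = effective_concentration_molar(k_s)
        print(f"k_s = {k_s:5.1f} N/m -> {c:10.1f} M")
```

In this model the concentration scales as (k_s / T)^(3/2), so stiffer positioning and lower temperature both raise the effective concentration; a stiffness of a few tenths of a newton per meter at 300 K gives values on the order of 100 M.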
One approach to the fabrication of molecular machine systems is the development of AFM-based mechanosynthetic devices. These would position molecules by binding them to (for example) antibody fragments attached to an AFM tip. Development of suitable monomers, binding sites, and reaction sequences would then be a basis for the fabrication of complex mechanical structures.
Biological molecular machine systems rely on the self-assembly of folded polymers. A review of progress in protein engineering suggests that we have the means to design and synthesize protein-like molecules with well-defined structures and excellent stability. Success in this effort provides a basis for the design of self-assembling systems, and experience with the design and supramolecular assembly of smaller molecules is encouraging regarding the success of this next step.
Development of a molecular machine technology promises a wide range
of applications. Biological molecular machines synthesize proteins, read
DNA, and sense a wide range of molecular phenomena. Artificial molecular
machine systems could presumably be developed to perform analogous tasks,
but with more stable structures and different results (e.g. reading DNA
sequences into a conventional computer memory, rather than transcribing
them into RNA). Self-assembling structures are widely regarded as a key
to molecular electronic systems (55, 64),
which therefore share an enabling technology with molecular machine systems.
Finally, studies suggest that the use of molecular machine systems to perform
mechanosynthesis of diverse structures (including additional molecular
machine systems) will enable the development and inexpensive production
of a broad range of new instruments and products (18, 24, 61).
Laboratory research directed toward this goal seems warranted.