ADVANCES IN PROTEIN CHEMISTRY Volume 17
ADVANCES IN PROTEIN CHEMISTRY
EDITED BY
C. B. ANFINSEN, JR., Department of Biological Chemistry, Harvard Medical School, Boston, Massachusetts
M. L. ANSON, London, England
KENNETH BAILEY, University of Cambridge, Cambridge, England
JOHN T. EDSALL, Biological Laboratories, Harvard University, Cambridge, Massachusetts
VOLUME 17
1962
ACADEMIC PRESS
New York and London
COPYRIGHT © 1962, BY ACADEMIC PRESS INC.
ALL RIGHTS RESERVED. NO PART OF THIS BOOK MAY BE REPRODUCED IN ANY FORM, BY PHOTOSTAT, MICROFILM, OR ANY OTHER MEANS, WITHOUT WRITTEN PERMISSION FROM THE PUBLISHERS.
ACADEMIC PRESS INC., 111 FIFTH AVENUE, NEW YORK 3, N.Y.
United Kingdom Edition
Published by ACADEMIC PRESS INC. (LONDON) LTD., Berkeley Square House, London, W.1
Library of Congress Catalog Card Number 44-8863
PRINTED IN THE UNITED STATES OF AMERICA
CONTRIBUTORS TO VOLUME 17
B. KEIL, Institute of Organic Chemistry and Biochemistry, Czechoslovak Academy of Science, Prague, Czechoslovakia
S. M. PARTRIDGE, Low Temperature Research Station, Downing Street, Cambridge, England
JERKER PORATH, Institute of Biochemistry, University of Uppsala, Uppsala, Sweden
S. J. SINGER, Department of Biology, University of California, San Diego, La Jolla, California
F. SORM, Institute of Organic Chemistry and Biochemistry, Czechoslovak Academy of Science, Prague, Czechoslovakia
CHARLES TANFORD, Department of Biochemistry, Duke University, Durham, North Carolina
D. B. WETLAUFER, Department of Biochemistry, Indiana University School of Medicine, Indianapolis, Indiana*
* Present address: Department of Physiological Chemistry, Medical School, University of Minnesota, Minneapolis 14, Minnesota
PREFACE
W. T. Astbury was one of the great men of protein chemistry; his personal and his scientific influence were profound. This volume opens with a brief tribute to him by one of us who knew him intimately; its aim is primarily to portray Astbury as a man, and touches only incidentally upon his work.
Although the study of proteins in mixed solvents, such as water-alcohol mixtures, has a long history, the systematic use of truly nonaqueous solvents for the study of proteins is relatively new. S. J. Singer, who has been one of the leaders in this field, considers what has been achieved, and points the way to further research, in the first review in this volume. As with the article by Witkop in Volume 16 of the Advances, this is in many respects a preview, rather than a review; a most interesting beginning has been made, but far more extensive researches lie before us.
Hydrogen ion titration curves of proteins provide a powerful tool to reveal many aspects of the structures of individual proteins. The characteristic ionization constants of the acidic and basic groups in the amino acids and peptides may be profoundly modified when these groups are incorporated in a protein molecule. An increasing number of proteins have been found in which potentially reactive groups are inaccessible for titration in the native molecule, and become available only after denaturation. Such findings can, in the years ahead, be correlated with detailed knowledge of the three dimensional structure of proteins, as obtained by X-ray diffraction and other methods. The present state of the field is reviewed by Tanford, who has done so much to advance it over the last decade.
In the following review Sorm and Keil grapple with a thorny problem. Are there systematic regularities in the structures of the peptide chains of proteins?
Do certain sequence patterns repeat themselves more often than would be expected if we assumed a merely random distribution of amino acid residues in the peptide chains? Is it justifiable, or meaningful, to equate one amino acid residue to another when looking for identical, or analogous, sequence patterns in different peptide chains or in different parts of the same chain? Can significant relations be obtained by comparing two sequences, but reading them in opposite directions, as if an inversion had occurred in one of them? This review addresses itself to these and other difficult problems. The views of the authors are not likely to win general or complete assent. The problem of formulating what we mean by a truly random sequence is beset by formidable conceptual and practical difficulties; yet the concept of a random sequence is indispensable for the claim that non-random patterns do appear in the peptide chains of proteins. The mathematical analysis, beginning on page 198, is likely to prove difficult reading for most biochemists; indeed even some experts in statistical theory, whom the editors have consulted, have remarked on the difficulty and subtlety of the concepts involved. Many biochemists are likely to challenge the concept of “standard amino acid interchanges,” and the use of inverted sequences for comparison of one portion of a peptide chain with another, as set forth by Sorm and Keil. Their ideas were developed before the new experimental evidence on the coding problem was available, and represent essentially a different domain of thought. This article, like others in the Advances, sets forth the personal point of view of the authors, who have had full scope to express their opinions, after examining the comments of various readers. We expect that it will serve as a valuable stimulus to further inquiry.
The use of “molecular sieves” for the separation of molecules of different sizes has acquired extraordinary importance lately. The most widely used, and the most satisfactory, molecular sieves available today are the crosslinked dextrans. The theory and practice of their use is discussed in a brief review by J. Porath, whose work has been so fundamental in the development of this field.
Elastin is a protein of widespread distribution and great importance. It has, nevertheless, been relatively neglected by biochemists, who have achieved great triumphs in unraveling the structure of its neighbor, collagen, as discussed by Harrington and von Hippel in Volume 16 of the Advances. In the fifth review appearing here, S. M.
Partridge presents what we believe to be the first extensive review of the chemistry and biochemistry of elastin. Partridge’s treatment shows how much knowledge of elastin has already been achieved, but it will be plain to any reader how much need there is for further research on elastin, and how significant a protein it is for the biologist.
The use of ultraviolet absorption spectra in the study of protein structure was reviewed by Beaven and Holiday in Volume 7 of the Advances, ten years ago. There has been a vast outpouring of articles on the subject since. The use of ultraviolet spectra for the exploration of protein structure has ramified in many directions, and it is becoming an increasingly powerful tool for the study of the finer details of protein structure. In the final review of this volume D. B. Wetlaufer surveys the field as it stands today.
We are indebted, as always, to the staff of the Academic Press for their skillful and devoted work in preparing this volume for publication. We also wish to express our appreciation to Mrs. Katherine Torgeson for preparing the index.
C. B. ANFINSEN, JR.
M. L. ANSON
KENNETH BAILEY
JOHN T. EDSALL
November, 1962
WILLIAM THOMAS ASTBURY
1898-1961
A Personal Tribute
By KENNETH BAILEY
“Bill” Astbury was born in that region of North Staffordshire known as the Potteries, an island jungle of pot banks, kilns, brickyards, and steelworks, surrounded by a pleasant and varied countryside. Its people are sturdy and stolid with a facet of brittle humor which defies their harsh environment. He grew up on the fringe of the “Five Towns” and could not claim to be a Potteries man, but there was a great deal in Astbury that was characteristic of the neighborhood. He could well have found a place in the great novels of Arnold Bennett.
Astbury was the fourth child of a family of seven. His father, William Edwin, was throughout his life a Potter, providing comfortably for the family by this and other lines of business. Except for the wisdom of his schoolmaster, Astbury might have gone into “the Pots”; instead, he won a scholarship at Longton High School. Here, his interests were shaped by the Headmaster and second master, both chemists, and towards the end of his schooldays he won the only local scholarship available and an exhibition at Jesus College, Cambridge. He crowned his career at school by becoming Head boy and winning the Duke of Sutherland’s Gold Medal. For all this, he was not what we in a nearby school called a “swot” or crammer. He played cricket, made excursions into amateur dramatics and short-story writing, and was good at sketching. From his father, both he and his younger brother, Norman, inherited a great love of music (an important feature of Potteries life in those days), and the two brothers ranged over vast fields of “one piano, four hands.” His heroes were Bach, Mozart, Beethoven, and Schubert, and though he loved to argue about it, he would never admit that any other composers approached their towering genius. In later life, as he seemed to undergo a phase of philosophical questioning and perplexity, he turned to the violin and enjoyed home-made music with his son, Bill, and anyone who would oblige at the piano.
One of the greatest thrills of his life was to have dined with Yehudi Menuhin.
Fifty years ago, few people in the Five Towns were famous in the academic sense, and few had the chance to be. During vacations, Astbury spent much time at the Technical College with A. T. Green, who later became Director of the British Ceramic Research Association, and he also came to know J. W. Mellor, who was Director of the British Refractories Research Association and compiler of the monumental “Treatise on Inorganic and Theoretical Chemistry.” Another contemporary and acquaintance was Reginald Mitchell, whose “Spitfire” did so much to save Britain from the Nazi invasion.
After only two terms at Cambridge, he was called up in the First World War, and his poor medical rating, following appendectomy, caused him to serve with the R.A.M.C. in 1917. He was drafted to Cork and there met his future wife, Frances Gould. This break in his studies did not affect his application, and on his return to Jesus College, he took Firsts in both parts of the Tripos, specializing in Physics in his last year.
After Cambridge, he joined Sir William Bragg at University College, London, and two years later (1923) accompanied him to the Davy-Faraday Laboratory at the Royal Institution. Here, of course, a new world in mathematical crystallography had opened up, which matched in its exciting prospects the sparkling enthusiasm of Astbury’s own approach to his work and hobbies. He not only had the kindly genius of Bragg to draw upon, but his fellow workers must have set a standard in which mediocrity could find no place. Among these were Kathleen Lonsdale, “Sage” Bernal, Robertson, Shearer, and others. The earliest papers were concerned with classic crystallography (the crystal structure of tartaric acid) and were accepted for publication by the Royal Society. He was led to think of structure analysis in the realm of biology through Bragg himself, who, as he says, “besides being by implication a crystallographer, was by induction a biologist, a molecular biologist.” To illustrate Bragg’s Faraday Evening lectures on “The Imperfect Crystallisation of Common Things,” Astbury provided X-ray photographs of cotton, hair, bone, and sea-hedgehog spines.
It could not have been by accident that Bragg had first laid the foundation of X-ray diffraction at the University of Leeds, and that Astbury, in the light of his cursory exploration of biological structures, should go to Leeds in 1928 as Lecturer in Textile Physics. Bernal has pointed out that only Astbury’s imagination and perseverance could have hoped to disentangle what Astbury himself called “pretty dreadful photographs,” and he attempted it according to the Baconian recipe of torturing his material by boiling, stretching, and giving it a permanent wave. In this, his lack of biological background was a positive asset, in that he did not allow the cellular “junk” to obscure his clear vision. He was nevertheless very much aware of the heterogeneity of his material, and it plays a major part in his detailed interpretations of the complex elastic phenomena of mammalian keratin. The clarity of his approach shines through all his work, and though it led him into error, and made some of his reasoning somewhat over-plausible, it did lead to those generalizations of fiber structure on which the more definitive interpretations have necessarily been built.
The papers published between 1932 and 1934 reflect the bustling activity of the first years at Leeds. In addition to the classic paper with H. J. Woods on “X-ray studies of the structure of hair, wool and related fibres” there appeared papers on the molecular structure of feather keratin, gelatin, and on the cellulose fibers in the cell wall of the giant alga Valonia. His only book, “Fundamentals of Fibre Structure,” which may still be read with the greatest enjoyment, also appeared at this time.
Soon after, he was seeking a biochemist who could help him with experiments on soluble proteins. I was introduced to him by A. C. Chibnall, and though I considered myself a confirmed carbohydrate chemist, I was quickly won over to the idea that proteins were much more exciting and fundamental. Certainly, his impact on a young student could be tremendous, his euphoric evangelizing zeal transforming laboratory routine into a great adventure. So began a collaboration which continued on and off until his death.
Although the number of his collaborators was small, his output up to the outbreak of World War II was remarkable, all the more because of the diversity of the materials he was thinking about: nucleic acids as well as proteins. This was his most creative and happiest period, and it was a great pleasure for me to share it. We began to “torture” the corpuscular proteins and were able to produce the oriented β-structure from denatured edestin. The “cross-β” pattern observed with “poached egg white” deeply interested him, and accorded well with his thoughts on long-range elasticity. It turned up again and again, in thermally contracted muscle, Rudall’s epidermin, and bacterial flagella.
The occurrence of the α-pattern in myosin and muscle, fibrinogen and epidermin pointed to an ubiquity which is adequately recognized today even in the everyday corpuscular proteins. Incidentally, and though it in no way affects the argument, the myosin films from Mytilus and the diffraction patterns both of living and dead Mytilus muscle most probably represented still another α-protein not then defined: the paramyosin of Schmitt and Bear, a type of tropomyosin isolated by the author many years later. It is pertinent to mention in passing the widespread soluble tropomyosin which appears to exist in every type of muscle. If one had to discover a protein to please him, this was it. It not only gave a very pronounced α-pattern, but was capable of aggregating into a linearly fibrous form; further it could be made to form true crystals of which 90% was salt and water. The crystals which T.-C. Tsao sent him, 1 mm long, “knocked him for six,” if one may use his cricketing phraseology.
Astbury was a notable lecturer and a notable writer. He took enormous pains to maintain an easy, flowing style, even where the arguments were particularly subtle or closely-reasoned, and at all times there was room for the quip or quotation. He was always so much his natural self that he could communicate his personality to an audience within the first few minutes. At a Faraday Society Meeting he was prevented from beginning his lecture by the conversation of some distinguished scientists in the front row. At last he went up to them, glared, and in his Staffordshire accent said “Shut oop.” He was never too famous not to be flattered by an invitation to lecture, whether to “Industry” or to deliver very special lectures: the Spiers, the Proctor (given both in 1940 and 1961), the Jubilee of the Society of Chemical Industry, the Atkin, and the Mather. His Croonian lecture to the Royal Society adhered to the provision of Croone that it be devoted to “the advancement of natural knowledge in local motion,” viz., the structure of biological fibers and the problem of muscle.
Astbury had an unusual mixture of enthusiasm, idealism, and, beneath the bluff exterior, quite a measure of sentimental feeling. His honesty and loyalty induced in him a degree of sensitivity particularly discernible if he thought some friend or colleague had betrayed some trust or shown some insincerity. One might argue or squabble with him and yet feel compelled to forgive any annoyance he caused, and on his part he very readily forgave it in others. Whatever difficulties he encountered as head of a department, he was always cheering and cheerful.
He had to discard his own elegant α-model, but with characteristic enthusiasm took up the implication of the planar amide group in the Pauling-Corey structures, considering it successively in relation to the models for wool, the cross-β pattern, collagen, and feather keratin. His contribution to fiber structure rests not so much on the models themselves as on his pioneer attempts to provide them.
Historically, his distinction will seem to lie in the generalizations he made about the types of naturally occurring fibers, the master plans which characterize the α- and β-structures and the connective tissues. He explored their ubiquity and evolutionary modifications, and laid down the principles on which their complex elastic behavior must be explained.
The many honors he received were less important to him personally than as a recognition of the new Science of which he was both Master and Prophet, a world of order and plan, the vast field of Molecular Biology as we know it today. He lived to see it flourish and he lived also to see the crowning triumph of X-ray crystallography in the structures for DNA, myoglobin, and hemoglobin.
To me, a privileged friend for more than 25 years, his work and character are inextricably mixed. Time may blur the edges of his personality, but will not obscure the pioneer qualities so evident in his writings that led him forth into his great adventure in the world of fibrous molecules.
CONTENTS
CONTRIBUTORS TO VOLUME 17
PREFACE
WILLIAM THOMAS ASTBURY

The Properties of Proteins in Nonaqueous Solvents
S. J. SINGER
I. Introduction
II. Characteristics of Nonaqueous Solvents of Interest
III. Solubility of Proteins in Nonaqueous Solvents
IV. The Conformations of Protein Molecules in Nonaqueous Solvents
V. Physical and Chemical Properties of Proteins in Nonaqueous Solvents
References

The Interpretation of Hydrogen Ion Titration Curves of Proteins
CHARLES TANFORD
I. Introduction
II. Dissociation Constants of Appropriate Sm...
III. Experimental Titration Data
IV. Counting of Groups
V. Reversibility; Thermodynamic and Kinetic Analysis
VI. Semiempirical Thermodynamic Analysis
VII. More Exact Thermodynamic Analysis
VIII. Volume Changes Accompanying Titration
IX. Miscellaneous Topics
X. Results for Individual Proteins
References

Regularities in the Primary Structure of Proteins
F. SORM AND B. KEIL
I. Introduction
II. Types of Structural Regularities in Proteins
III. Comparison of Proteins and the Search for Common Types of Structural Regularities
IV. Mathematical Approach to the Evaluation of Structural Similarities between Proteins
V. Relevance of Structural Regularities to Protein Synthesis
References

Cross-Linked Dextrans as Molecular Sieves
JERKER PORATH
I. Introduction
II. Principle and Mechanism
III. Simple Fractionations Employing Highly Cross-Linked Gels
IV. Separation of Peptides and Proteins by Molecular Sieving
V. Concluding Remarks
References

Elastin
S. M. PARTRIDGE
I. Introduction
II. Occurrence and Morphological Structure
III. Physical Properties of Elastin
IV. Isolation and Analytical Characterization of Elastin
V. Elastolytic Enzymes
VI. Chemical Structure of Elastin
VII. Conclusions
References

Ultraviolet Spectra of Proteins and Amino Acids
D. B. WETLAUFER
I. Introduction
II. Terminology
III. Experimental
IV. The Absorbing Components of Pr...
V. Spectrophotometric Titrations
VI. Difference Spectra and Chromophore Environment
VII. Analytical Applications
VIII. Miscellaneous Problems
IX. Summary
References

Author Index
Subject Index
THE PROPERTIES OF PROTEINS IN NONAQUEOUS SOLVENTS
By S. J. SINGER
Department of Biology, University of California, San Diego, La Jolla, California
I. Introduction
II. Characteristics of Nonaqueous Solvents of Interest
    A. General Considerations
    B. Classification of Nonaqueous Solvents
    C. Acid-Base Properties of Solvents
III. Solubility of Proteins in Nonaqueous Solvents
    A. Strongly Protic Solvents
    B. Weakly Protic Solvents
    C. Experimental Aspects of Solubility
IV. The Conformations of Protein Molecules in Nonaqueous Solvents
    A. Introduction
    B. Factors Involved in Determining Protein Conformations in Solution
    C. Experimental Methods
    D. Polypeptides in Nonaqueous Solvents
    E. Proteins in Nonaqueous Solvents
    F. Nucleic Acids in Nonaqueous Solvents
    G. Conclusions
V. Physical and Chemical Properties of Proteins in Nonaqueous Solvents
    A. Quaternary Structure in Proteins
    B. Biochemical Reactions of Proteins in Nonaqueous Solvents and Solvent Mixtures
    C. Protein Solutions at Low Temperatures
    D. Chemical Modification of Proteins in Nonaqueous Solvents
References
I. INTRODUCTION
It is a natural consequence of the paramount importance of water in biological systems, that the solution properties of proteins and other biologically important macromolecules have been almost exclusively studied in aqueous media. It is the properties of these aqueous solutions that must be understood if we seek to understand the behavior of living matter. On the other hand, it can be readily appreciated that much interesting and fundamental information should be obtained by the study of nonaqueous solutions of these substances. In some respects, the situation has its parallel in the field of simple electrolytes. There again, aqueous solutions are of the most immediate concern, but a very considerable insight into the properties of simple electrolytes and their aqueous solutions has come from investigations of their nonaqueous solutions (Evers and Kay, 1960). The accessibility of a wide variety of physical and chemical properties among nonaqueous solvents is the primary reason for their usefulness in such studies. Thus, broad variations in dielectric constant, viscosity, temperature (since many nonaqueous solvents freeze at substantially lower temperatures than water), spectral transparency, solvent molecular volume and structure, relative acidity or basicity, hydrogen bond accepting and donating capacity, chemical inertness (to mention some of the more obvious solvent properties of interest) are at least hypothetically accessible.
Prior to the last decade or so, there had been some investigations of proteins in a few nonaqueous solvents, but little systematic exploration of this area was carried out, and the potentialities of these studies were not fully realized. Within the last decade, as a result of a combination of factors, a revival of interest in this field has occurred, and it is safe to predict that this revived interest will be maintained for some time to come.
The purpose of this article is to focus attention on this rapidly developing area of biophysical chemistry. The properties of some nonaqueous solvents of interest are examined, some representative recent studies of nonaqueous solutions of proteins and related macromolecules are discussed, and suggestions are derived from these of possible new directions such investigations might profitably explore. No attempt has been made to review thoroughly all the published papers in this area, and no historical survey of earlier studies has been included.
There is a substantial older literature dealing with solutions of proteins in mixtures of water and some other solvent, which has not been discussed in this article, because the intent of such studies has generally been to determine the effect of (presumed) relatively small perturbations in the properties of the water solutions produced by the second solvent. Solutions in water or in water-nonaqueous solvent mixtures are discussed primarily as they bear on the corresponding solution behavior in the pure organic solvent.
II. CHARACTERISTICS OF NONAQUEOUS SOLVENTS OF INTEREST
A. General Considerations
Proteins are generally not directly soluble in the common nonpolar solvents, or in the usual polar solvents such as alcohol and acetone. If they were, much more extensive studies of nonaqueous solutions of proteins would probably have been carried out long ago. However, they are directly soluble in strongly protic solvents, as had been discovered early in the history of protein chemistry, and in some polar solvents which have only relatively recently become commercially available. Furthermore, by suitable means which are discussed in Section III, a wide variety of nonaqueous solutions of proteins can be prepared and studied.
It is clear that certain general requirements must be met by a nonaqueous solvent, if it is to be useful. Since proteins contain chemically reactive groups, a first requirement of a satisfactory solvent is chemical inertness. That is, the solvent must not cause the rupture or formation of any covalent bonds in the macromolecule other than those with hydrogen atoms. This places a severe restriction on the number of useful nonaqueous solvents, since oxidizing or reducing, alkylating or acylating, etc., solvents are eliminated. This requirement also places some not-so-obvious limitations on solvents which might otherwise appear to be satisfactory. For example, small amounts of residual water may produce scission of peptide linkages in proteins dissolved in acidic or basic solvents. Such residual water may be very difficult to remove from both the solvent and the macromolecule in question. As another example, acidified anhydrous alcohols have been employed as solvents for proteins. Under relatively mild conditions, however, such media serve as excellent esterifying agents for carboxyl groups (Fraenkel-Conrat and Olcott, 1945). In this connection, the compound 2-chloroethanol, which has received much attention recently as a protein solvent, may contain an appreciable amount of dissolved HCl, and Blout (1960) has reported that the carboxyl groups of some synthetic polypeptides are esterified in this solvent.
Another obvious requirement of a nonaqueous solvent is chemical stability under a variety of conditions. Thus, methanol, especially after standing in the presence of air, may contain small amounts of formaldehyde which can react with groups on proteins and nucleic acids.
Formamide, N,N-dimethylformamide, and related compounds are slowly decomposed by acid or base in the solvent, and the possibility exists that such decomposition may be catalyzed to some extent by a protein dissolved in the solvent. Thus Rees and Singer (1956) found that the apparent osmotic pressure of a solution of insulin in N,N-dimethylformamide continually increased over a period of a week at 25°C but reached equilibrium at 13.8°C, which might have been due to the slow decomposition of the solvent on the solution side of the osmotic membrane at the higher temperature. These brief comments are intended to emphasize the critical need to use carefully purified and monitored nonaqueous solvents in any biophysical studies. Investigators, particularly physical chemists, oriented to studies in aqueous solution may tend to overlook the chemical properties of nonaqueous solvents. Useful information about a variety of nonaqueous solvents is collected in
S . J. SINGER
several works (Weissberger et al., 1955; Scheflan and Jacobs, 1953; Audrieth and Kleinberg, 1953).
B. Classification of Nonaqueous Solvents

For the purpose of this article, those polar liquids which have been found to serve as solvents for proteins may be classified into two broad categories, strongly protic and weakly protic. In the former category are strong anhydrous acids such as formic and dichloroacetic acids, and strong bases, such as liquid ammonia and aliphatic amines. The weakly protic solvents are characterized by being only weak proton donors or acceptors or both (amphiprotism). In this respect they resemble water. As an operational definition, a weakly protic solvent can be regarded as one whose 1 M solution in water is characterized by 6 < pH < 8. There are some nonaqueous solvents such as pyridine and phenol which may constitute intermediate cases between strongly and weakly protic solvents, but for the purpose of this article these do not have to be separately considered. In Table I are given a number of solvents which have been used successfully with proteins, together with a summary of some of their physical properties.
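C. Acid-Base Properties of Solvents

A protein molecule contains many groups capable of accepting protons from, or donating protons to, suitable solvent molecules. In general, we are interested in the following equilibria:

$$\mathrm{HG}_i + \mathrm{HX} \rightleftharpoons \mathrm{G}_i^{-} + \mathrm{H_2X^{+}} \tag{1}$$

$$\mathrm{HG}_i + \mathrm{HX} \rightleftharpoons \mathrm{H_2G}_i^{+} + \mathrm{X^{-}} \tag{2}$$

$$2\,\mathrm{HX} \rightleftharpoons \mathrm{H_2X^{+}} + \mathrm{X^{-}} \tag{3}$$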
where $\mathrm{HG}_i$ is a particular kind of amphiprotic group on a macromolecule and HX is an amphiprotic solvent. Studies with simple model compounds containing the group $\mathrm{G}_i$ in the solvent HX are required to define the equilibrium constants, $K_1$ and $K_2$, of reactions (1) and (2) respectively, while for reaction (3), the autoprotolysis constant, $K_s$, of the solvent is needed. Much of this and related information has been laboriously obtained by a variety of methods and over a long period of time in the solvent water, and has been correlated with the electric charge properties and titration behavior of proteins in aqueous solution. A similar body of data is required for each nonaqueous solvent whose solution properties are to be well understood. This information for any nonaqueous solvent of interest is almost totally lacking at the present time. From a qualitative point of view, however, we can make an important
distinction between strongly protic and weakly protic solvents. In a particular weakly protic solvent HX, some of the groups $\mathrm{G}_i$ are characterized by values of $K_1$ and $K_2$ that are exceedingly small, such that Kl K , and K z > K , and K1 > K , . An extreme case is the solvent anhydrous HF (Katz, 1954a). Fluoride ion is such a weak base that even aromatic hydrocarbons are appreciably protonated in anhydrous HF solutions (Hammett, 1940, p. 293). Thus, $K_2$ is of the order of for benzene, and lo-' for hexamethylbenzene, in anhydrous HF (Kilpatrick and Luborsky, 1953). Therefore, all groups of a protein molecule which are as basic as or more basic than benzene should be essentially completely protonated in dilute solutions in HF. Not only should RNH2 groups be completely in the RNH3+ form, for example, but carboxyl groups should be protonated to RCOOH2+, amide groups to RC+(OH)NHR', alcoholic OH groups to RCH2OH2+, etc. The macromolecule should consequently take on a highly cationic character in this solvent. These considerations apply to other anhydrous acids and correspondingly to anhydrous bases to an extent depending on their relative acid or base strengths (Hammett, 1940, Chapter IX). In general, therefore, a protein must be highly protonated in strong acid solvents and highly depleted of protons in strong bases, and will ordinarily not exhibit ampholyte properties in such solvents.
III. SOLUBILITY OF PROTEINS IN NONAQUEOUS SOLVENTS
A . Strongly Protic Solvents A wide spectrum of proteins can be directly dissolved in any one of a number of strongIy protic solvents. This has been demonstrated particuIarly for dichloroacetic acid (Yang and Doty, 1957), hydroffuoric and tri-
TABLE I
Properties of Some Pure Nonaqueous Solvents(a)

Solvent                    Dielectric constant   B.P. (°C)   M.P. (°C)   References

Strongly protic
  Acids
    Hydrofluoric           83.6 (0)              19.5        -83         1
    Formic                 58.5 (16)             100.7       8.2         2
    m-Cresol               11.8 (25)             202.7       12.0        2
    Phenol                 9.78 (60)             181.8       40.9        2
    Trifluoroacetic        8.2 (30)(b)           72.4        -15.3       3, 2
    Dichloroacetic         8.2 (20)              194         9.7         4
    Acetic                 6.15 (20)             117.7       16.6        2
  Bases
    Hydrazine              51.7 (25)             113.5       2           1
    Ammonia                22 (-33)              -33.4       -77.7       1
    Ethylenediamine        14.2 (20)             116.2       11.0        2
    1,2-Propanediamine     -                     120.5       -           5
    Sulfur dioxide         12.35 (22)            -10.2       -72.7       1
    Pyridine               12.3 (25)             115.6       -41.8       2

Weakly protic
  Alcohols
    Glycerol               42.5 (25)             290.0       18.2        2
    Ethylene glycol        37.7 (25)             197.8       -12.6       2
    Methanol               32.63 (25)            64.51       -97.5       2
    1,2-Propanediol        32.0 (20)             188.2       -           2
    2-Chloroethanol        25.8 (25)             128.6       -67.5       2
    Ethanol                24.30 (25)            78.32       -114.5      2
    1-Propanol             20.1 (25)             97.15       -126.2      2
    2-Propanol             18.3 (25)             82.40       -89.5       2
  Amides
    N-Methylacetamide      178.9 (30)            204         29.7        6, 7, 8
    Formamide              109.5 (25)            210.5       2.55        2
    N,N-Dimethylacetamide  37.8 (25)             165         -           9, 10, 11
    N,N-Dimethylformamide  36.7 (25)             153         -61         2, 9, 10, 12
  Miscellaneous
    Propylene carbonate    65.1 (25)             241.7       -49.2       13, 14
    Dimethylsulfoxide      45                    189         18.5        9, 15
    Acetonitrile           37.5 (20)             81.6        -45.7       2
    Nitrobenzene           34.82 (25)            210.8       5.76        2
    Dioxane                2.21 (25)             101.3       11.80       2

(a) Values in parentheses indicate the temperature in °C at which the data apply.
(b) In ref. 2, the dielectric constant of trifluoroacetic acid is given as 39.5 at 20°C. This value is not acceptable for reasons given in ref. 3.
REFERENCES FOR TABLE I

1. Audrieth, L. F., and Kleinberg, J. (1953). "Nonaqueous Solvents." Wiley, New York.
2. Weissberger, A., Proskauer, E. R., Riddick, J. A., and Toops, E. E., Jr. (1955). "Organic Solvents," 2nd ed. Interscience, New York.
3. Dannhauser, W., and Cole, R. H. (1952). J. Am. Chem. Soc. 74, 6105.
4. "International Critical Tables." (1929). McGraw-Hill, New York.
5. Heilbron, I., and Bunbury, H. M. (1953). "Dictionary of Organic Compounds," revised ed. Oxford Univ. Press, London and New York.
6. Postma, J. C. W., and Arens, J. F. (1956). Rec. trav. chim. 76, 1377.
7. Dawson, L. R., Wilhoit, E. D., and Sears, P. G. (1956). J. Am. Chem. Soc. 78, 1569.
8. Reynolds, W. L., and Weiss, R. H. (1959). J. Am. Chem. Soc. 81, 1790.
9. "Merck Index." (1960). Merck, Rahway, New Jersey.
10. Leader, G. B., and Gormley, J. F. (1951). J. Am. Chem. Soc. 73, 5731.
11. Petersen, R. C. (1960). J. Phys. Chem. 64, 184.
12. Ioffe, B. V. (1955). Zhur. Obshchei Khim. 26, 902.
13. Kronick, P. L., and Fuoss, R. M. (1955). J. Am. Chem. Soc. 77, 6114.
14. Pepper, W. P. (1958). Ind. Eng. Chem. 60, 767.
15. Douglas, T. B. (1948). J. Am. Chem. Soc. 70, 2001.
fluoroacetic acids (Katz, 1954a, b ) , liquid sulfur dioxide (Katz, 1955), ethylenediamine, propylenediamine, and hydrazine ( Rees and Singer, 1956). This spectrum of proteins is often so broad that it includes some that are normally insoluble (such as keratin) as well as those that are soluble, in aqueous media. The broad solubilizing power of strongly protic solvents has generally been assumed to arise primarily from hydrogen bonding to the solute. However, it appears quite probable that a variety of factors is involved, one of which is their capacity to convert proteins into highly deprotonated (in strong bases) or highly protonated (in strong acids) species, as discussed in Section I1,C. The molecular species so formed are then solubilized by their interaction with the polar solvent molecules, It is interesting that the strongly protic liquids that have been found to be effective protein solvents span a broad range of dielectric constants, from 6 for glacial acetic and 12 for ethylenediamine to 52 for hydrazine and 84 for liquid HF. This suggests that the ionizing power of the solvent is not critically involved in solubility, since in media of lower dielectric constant the protein salts must be only weakly dissociated (see Section IV,B,l). In such cases, a strong solvent-solute interaction produced by the formation of hydrogen-bonded ion-pairs (cf. Barrow, 1956) may be most important in solubilizing proteins.
B. Weakly Protic Solvents

By contrast to the situation with strongly protic liquids, weakly protic pure liquids are generally incapable of converting proteins into highly protonated or deprotonated species, and at least partly as a result of this, the range of solubility of isoelectric proteins in weakly protic solvents is much more limited. In anticipation of the discussion to be presented in Section IV, it is important to realize that the native conformations of proteins are almost always altered in nonaqueous solvents, and the conformation of any one protein may be different in different solvents. The solubility of a protein might well depend on conformation, since the latter should determine the nature of solute-solvent interactions. Particularly as a result of this additional complication, the solubility of proteins in weakly protic solvents can only be approached in a wholly empirical manner.

C. Experimental Aspects of Solubility

In preliminary surveys of the solubilities of proteins in nonaqueous solvents, the solvent has generally been added directly to the solid protein. After suitable agitation, those mixtures that appeared homogeneous were classed as solutions. In this manner a number of interesting binary systems have been discovered. Some proteins which are not very soluble in aqueous media are quite soluble in a wide range of suitable nonaqueous
solvents. Certain vegetable proteins, such as zein and gliadin, had been found many years ago to be soluble in 80% ethanol (Robertson, 1918). Swallen and Danehy (1946) discovered that zein was soluble in 27 solvents, some 21 of them being weakly protic. Rees and Singer (1956) reported 15 nonaqueous solvents for zein, 12 of them being additional to those demonstrated by Swallen and Danehy. It seems clear that the list of solvents for zein could be indefinitely extended if desired. Similarly, Rees and Singer (1956) found that insulin could be directly dissolved to the extent of at least 1 mg/ml in some 13 nonaqueous solvents, closely paralleling the solubility behavior of zein. It is quite probable that in many cases those proteins which are insoluble or difficult to dissolve in aqueous solution, and which are not bound by covalent linkages into some generally insoluble protein matrix, will be soluble in some nonaqueous solvents.

For proteins which are relatively water-soluble, the number of weakly protic solvents that have been found by direct solubilization experiments has been significant but nevertheless fairly limited. The indications are, however, that many more systems could be found with some application in the following directions. The lack of any appreciable direct solubility of a protein in a pure nonaqueous solvent may sometimes be a matter of unfavorable rate, rather than free energy, of solution. In any event, in cases where the protein does not dissolve directly, a stable solution can often be prepared in nonaqueous solvent by first dissolving the protein in an aqueous medium and then dialyzing this solution against mixtures successively richer in the nonaqueous component, and finally, against several batches of the pure anhydrous solvent (Geiduschek and Gray, 1956).
Thus, dilute solutions of deoxyribonucleic acid (DNA) were prepared by this dialysis procedure in some 11 weakly protic solvents (but not in several others) (Herskovits et al., 1961), whereas direct solubility in such solvents is limited to formamide and ethylene glycol (Rees and Singer, 1956). In a similar manner, Rees and Singer (1956) found that trypsin was directly soluble only in formamide and dimethylsulfoxide, of a number of weakly protic solvents tried, but in a preliminary study (Fleck and Singer, unpublished experiments) stable solutions containing ca. 1 mg of trypsin/ml could also be prepared by dialysis from acidic aqueous solution into methanol, ethanol, ethylene glycol, propylene glycol, and glycolonitrile, but not into acetonitrile or acetone. Furthermore, it should be possible to prepare solutions of proteins by dialysis from one nonaqueous solvent into another, which may be useful with moderately protic solvents which produce fairly acid or alkaline solutions in mixtures with water. A difficulty with the dialysis procedure, however, is that it is relatively lengthy, and the protein may, even at low temperature, undergo slow irreversible changes in the time required.
Some proteins that are insoluble in a pure nonaqueous solvent may dissolve much more readily on the addition of a neutral salt (the analog of salting-in in water solutions). Thus, Katz (1955) found that whereas proteins are generally not directly soluble in pure liquid sulfur dioxide, they are readily soluble in sulfur dioxide containing 6 M NH4CNS. Zwitterionic salts (glycine, for example) may be useful in this connection in weakly protic solvents as they are in water (Cohn and Edsall, 1943, p. 617).

In a related connection, the solubility of a protein may be considerably greater in a slightly acidified or alkalized weakly protic solvent than in the pure liquid. This would be analogous to the situation which one obtains in aqueous media, in which proteins are generally increasingly soluble the further the pH is from the isoelectric point of the protein, within certain pH limits (Cohn and Edsall, 1943, p. 606). This is probably the reason that 2-chloroethanol is such an excellent solvent for proteins (Doty, 1959). This solvent is not a very stable one, and significant amounts of HCl can be present in it. This may also account for the observation that, although bovine serum albumin is insoluble in pure acetone, methanol, and ethanol, it dissolves in them when trichloroacetic acid (1%) is added (Levine, 1954). On the other hand, the trichloroacetate counterion may itself influence the solubility of the protein salt in the nonaqueous solvents. The presence of counterions bearing hydrocarbon-miscible tails, such as the trichloroacetate anion just mentioned, or tetraalkylammonium cations, instead of the usual inorganic ions, may be an aid in solubilizing proteins in nonaqueous solvents.

Mixed nonaqueous solvent systems are also of great interest and potential versatility as protein solvents, as they are for simple electrolytes (Evers and Kay, 1960). Their systematic use should enormously extend the range of nonaqueous solvent systems and properties.
This is already suggested by several studies (cf. Doty et al., 1956; Yang and Doty, 1957). Many solvents not capable of dissolving a given protein may do so in mixtures with a small amount of a nonaqueous good solvent for the particular solute. In another connection, the addition of less than 1 % of one nonaqueous solvent to a solution of a macromolecule in another nonaqueous solvent can profoundly alter the physical chemical properties of the solution (Eirich et al., 1951; Doty et al., 1956).
IV. THE CONFORMATIONS OF PROTEIN MOLECULES IN NONAQUEOUS SOLVENTS
A. Introduction

Until fairly recently, the native conformation of a protein molecule has generally been considered to be a characteristic property of the macromolecule itself, that is, of the numbers and arrangements of its amino acid residues and the covalent and secondary bonds holding the residues together. In the last decade, however, it has become generally recognized that the solvent water plays an exceedingly important role in determining and stabilizing the characteristic structure that a protein molecule exhibits in an aqueous environment. It is in connection with this problem of the role of solvent in regulating macromolecular structure that nonaqueous solutions of proteins have been most frequently studied in the last few years.

In an aqueous protein solution, it is well known that any of a wide variety of treatments, such as an increase of temperature or exposure to pH extremes, leads to denaturation of the protein. Denaturation is generally accompanied by a disorganization and partial randomization of the secondary and tertiary structure of a protein molecule. Therefore, we have been conditioned to think of the native aqueous configuration of a particular protein molecule as the most highly ordered form which its covalent structure permits it to attain. One of the most interesting developments of studies of proteins in nonaqueous solvents, therefore, has been the finding that in certain systems an apparently more highly ordered conformation (greater helical content) of the protein molecule exists than its native aqueous form. On the other hand, there are other nonaqueous protein solutions in which the macromolecular conformation becomes highly disorganized.

In order to evaluate these studies, it is first necessary to consider the nature of the interactions, other than covalent bonds, that are important in determining the conformations of protein molecules, and to attempt to assess the effect of a change in solvent on these interactions. A most lucid and informative discussion of macromolecular interactions has been given in an earlier volume of this series by Kauzmann (1959).
The following remarks are intended as comments upon and footnotes to the Kauzmann exposition.
B. Factors Involved in Determining Protein Conformations in Solution

1. Electrostatic Interactions
Two principal kinds of intramolecular electrostatic interactions can operate in protein solutions. One is an attractive type, between closely spaced fixed charges of opposite sign on a protein molecule, to form salt linkages or ion-pair bonds. The other is a longer-range repulsive type, due to the net charge on a protein molecule. In aqueous solutions, ion-pair bonds, if they exist, do not appear to be of significance in determining the native conformation of a protein molecule (Jacobsen and Linderstrøm-Lang, 1949). On the other hand, repulsive interactions can destabilize the native conformation in water if the net charge on the protein molecule becomes sufficiently great, as occurs at extremes of pH (Tanford, 1961) or by suitable chemical modification (Habeeb et al., 1959). The compact conformation is then converted to a highly swollen and unfolded conformation. In what follows, we will largely confine our attention to these repulsive interactions, since in order to discuss the attractive type, detailed information, at present not available, would be required concerning the spatial distribution of charges on the protein molecule.

What is the effect on these repulsive electrostatic interactions of changing from an aqueous to a nonaqueous solvent? Does the result contribute to the stabilization, the further ordering, or the disordering of the native aqueous conformation of a protein molecule? It is important to emphasize that we wish to compare at this point the native conformation of the protein molecule in the aqueous and the nonaqueous solvents. Since, as will be shown subsequently, the native conformation is disrupted in most nonaqueous solvents, it is a hypothetical state not experimentally attainable. Nevertheless, the question just asked can be explored theoretically.

This problem is best approached by considering the electrostatic free energy, $F_e$, of a protein molecule due to its net charge and its interaction with an ion atmosphere. For a spherical ion of uniform charge density, according to the Debye-Hückel theory (Cohn and Edsall, 1943, p. 473)

$$F_e = \frac{Z^2 \epsilon^2}{2D}\left(\frac{1}{b} - \frac{\kappa}{1 + \kappa a}\right) \tag{4}$$

where $Z$ is the net charge on the protein ion of radius $b$, $\epsilon$ is the unit charge, $D$ is the dielectric constant of the solvent, $\kappa$ is the reciprocal of the thickness of the Debye-Hückel ion atmosphere, and $a$ is the distance of closest approach of the macroion and its counterion, measured between centers. More refined treatments to obtain expressions for $F_e$ are available (Tanford, 1961), but for our purposes Eq. (4) will suffice. If we consider the value of $F_e$ for the native conformation of a protein molecule in different solvents, $b$ is constant and $\kappa$ can be maintained constant by suitable manipulation of electrolyte concentration; hence the effect of changing the solvent can be resolved into the effect on the values of $Z$ and $D$.

The value of $Z$ is subject to change in two different ways: one due to the influence of the solvent on the acid-base equilibria of the protein; the other due to ion-pairing of the fixed charges on the protein molecule as the dielectric constant is decreased. Ion-pairing may occur between fixed charges of opposite sign on the protein molecule (salt linkages) or it may occur between the fixed charges and small counterions of the electrolyte in the solvent, or the lyate or lyonium ions of the solvent. It will be assumed in what follows that, for the native conformation of the protein in the nonaqueous solvent, as with the native conformation in water, salt linkages between fixed charges are not energetically significant; and that
ion-pairing to counterions, or lyate and lyonium ions, is the more important factor. The interplay of these factors is difficult to evaluate quantitatively in the general protein-solvent system. Some useful qualitative conclusions, however, may be arrived at in certain extreme cases.

One extreme case is that of strongly protic solvents of high dielectric constant. The large value of $Z$ placed on the protein molecule in such a solvent must substantially increase $F_e$ and act to disorder the native aqueous conformation. In view of the discussion in Section II,C, for example, isoionic bovine serum albumin when dissolved in anhydrous HF is potentially capable of binding about 1000 protons per molecule. In aqueous media, the binding of about 10 protons per molecule to the isoionic protein is sufficient to begin the disruption of its native structure (Tanford et al., 1955b). The $D$ value for HF at 0°C, 83.6, is close to that of H2O. Thus $(F_e)_{\mathrm{HF}}/(F_e)_{\mathrm{H_2O}} \cong Z_{\mathrm{HF}}^2/Z_{\mathrm{H_2O}}^2$. If at one extreme there were no counterion binding to the protein in HF solution (i.e., $Z_{\mathrm{HF}} \cong 1000$), then $(F_e)_{\mathrm{HF}}$ would be about $10^4$ times greater than that value of $F_e$ in H2O which corresponds to the onset of electrostatic disordering of the native protein conformation in aqueous solution. Even with a substantial degree of counterion binding in HF, however, $(F_e)_{\mathrm{HF}}/(F_e)_{\mathrm{H_2O}}$ should still be quite appreciable. Since it is highly unlikely that other kinds of interactions in HF solution would be augmented enough to overcome such a gross destabilizing influence, it can be predicted that the molecules of bovine serum albumin and of other proteins in HF solution are in a highly unfolded state at ordinary ionic strengths. Formic acid ($D$ = 56.5 at 16°C) and hydrazine ($D$ = 51.7 at 25°C) are two other strongly protic solvents to which similar considerations apply.
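The HF estimate above can be checked numerically. The following is a minimal sketch, not the author's calculation: it assumes the standard Debye-Hückel form for a uniformly charged sphere, $F_e = (Z^2\epsilon^2/2D)\,[1/b - \kappa/(1+\kappa a)]$, in CGS units. The radius `b`, closest-approach distance `a`, and `kappa` are hypothetical illustrative values, and the dielectric constant taken for water near 0°C (88.0) is an approximate value not given in the text. Only the charges (about 10 and about 1000) and the HF dielectric constant (83.6) come from the passage.

```python
E_CHARGE_ESU = 4.803e-10  # unit charge in esu (CGS)


def electrostatic_free_energy(Z, D, b_cm, kappa, a_cm):
    """Electrostatic free energy (erg per molecule) of a uniformly charged
    sphere, in the assumed Debye-Huckel form:
        F_e = (Z^2 e^2 / 2D) * (1/b - kappa / (1 + kappa * a))
    """
    return (Z * E_CHARGE_ESU) ** 2 / (2.0 * D) * (
        1.0 / b_cm - kappa / (1.0 + kappa * a_cm)
    )


def free_energy_ratio(Z_s, D_s, Z_w, D_w):
    """(F_e)_S / (F_e)_H2O when b, a and kappa are the same in both solvents;
    the bracketed geometric factor then cancels."""
    return (Z_s / Z_w) ** 2 * (D_w / D_s)


# Hypothetical geometry, chosen only so that the absolute values can be shown.
b = 30e-8            # macroion radius, cm (assumed)
a = 32e-8            # distance of closest approach, cm (assumed)
kappa = 1.0 / 30e-8  # reciprocal ion-atmosphere thickness, 1/cm (assumed)

D_WATER_0C = 88.0    # approximate value for water near 0 C (assumption)
D_HF_0C = 83.6       # value quoted in the text

F_water = electrostatic_free_energy(10, D_WATER_0C, b, kappa, a)
F_hf = electrostatic_free_energy(1000, D_HF_0C, b, kappa, a)
ratio = free_energy_ratio(1000, D_HF_0C, 10, D_WATER_0C)

print(f"F_e (water, Z = 10):  {F_water:.3e} erg")
print(f"F_e (HF, Z = 1000):   {F_hf:.3e} erg")
print(f"ratio HF / water:     {ratio:.3e}")
```

Because $b$, $a$, and $\kappa$ are held fixed, the ratio depends only on $Z^2/D$, so the order of magnitude of $10^4$ does not depend on the geometry assumed here.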
In strongly protic solvents of dielectric constant less than about 15, however, it is certain that the $Z$ value for a particular protein will be considerably less than the number of protons abstracted from, or donated to, the solvent, as a result of the increase in counterion binding at low values of $D$. To investigate the nature of counterion binding, let us compare a particular hypothetical protein in its native conformation in an aqueous and a nonaqueous solvent, S, such that the number of protons bound per protein molecule is the same in both solvents. The fixed charges on the protein molecule in the aqueous solution are taken to be completely ionized; let $Z_0$ be the net charge on the molecule in this solution. Let us assume that the different fixed charged groups on the protein molecule behave identically with respect to the binding of univalent counterions in the S solvent; then $Z_S = \alpha Z_0$, where $\alpha$ is the fractional dissociation of the fixed charged groups in S solution. Other things being equal, then, $(F_e)_S/(F_e)_{\mathrm{H_2O}}$ is $\alpha^2 D_{\mathrm{H_2O}}/D_S$ in this situation. The magnitude and variation of $\alpha$ can be estimated from an analysis of
studies of the association behavior of simple electrolytes in solvents of low dielectric constant. It appears both theoretically and experimentally that the equilibrium constant, $K_A$, for the association of a uni-univalent electrolyte to form ion pairs increases exponentially with $1/D$ (Fuoss, 1958; Sadek and Fuoss, 1959; Berns and Fuoss, 1960, 1961). Some typical experimental results for the electrolyte tetrabutylammonium picrate in mixtures of nitrobenzene and CCl4 (Hirsch and Fuoss, 1960) are shown in Fig. 1. From these data, the fractional dissociation, $\alpha$, of the electrolyte at a fixed concentration in these solvents has been calculated, with the assumption of ideal solution behavior of the ions. For this system, $\alpha^2 D_{\mathrm{H_2O}}/D$ varies with $D$ as shown in Fig. 2. The details of this curve, of course, depend on electrolyte concentration and will be different for different electrolytes, depending on their charge type and interaction with the solvent, etc., but the main features of the function should hold generally. With decreasing $D$, $\alpha^2 D_{\mathrm{H_2O}}/D$ first increases slowly to a maximum and then falls precipitously. The initial part of this variation is due to a decrease in $D$ without any significant occurrence of ion-pairing. As ion-pairing sets in, however, it rapidly becomes the dominant factor because of its exponential variation with $1/D$.

FIG. 1. The variation of the ion-pair association constant with the dielectric constant of the solvent for the electrolyte tetrabutylammonium picrate in nitrobenzene-CCl4 mixtures (Hirsch and Fuoss, 1960).
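The rise and precipitous fall of $\alpha^2 D_{\mathrm{H_2O}}/D$ can be reproduced qualitatively with a simple model. The sketch below is illustrative only and does not use the Hirsch and Fuoss data: it assumes the Fuoss (1958) contact ion-pair expression $K_A = (4\pi N a^3/3000)\,e^{b}$ with $b = \epsilon^2/(aDkT)$, ideal-solution mass action $K_A = (1-\alpha)/(\alpha^2 c)$, and hypothetical values for the contact distance (5 Å), the concentration ($10^{-3}$ M), the temperature (25°C), and the dielectric constant of water (78.5).

```python
import math

E_ESU = 4.803e-10      # unit charge, esu
K_B = 1.381e-16        # Boltzmann constant, erg/K
N_AVOGADRO = 6.022e23  # molecules per mole


def association_constant(D, a_cm=5.0e-8, T=298.15):
    """Ion-pair association constant (l/mole) in the assumed Fuoss form,
    K_A = (4 pi N a^3 / 3000) * exp(e^2 / (a D k T)).
    The contact distance a_cm is a hypothetical value."""
    b = E_ESU ** 2 / (a_cm * D * K_B * T)
    return (4.0 * math.pi * N_AVOGADRO * a_cm ** 3 / 3000.0) * math.exp(b)


def fraction_dissociated(K_A, c):
    """Solve K_A = (1 - alpha) / (alpha^2 c) for alpha (ideal ions).
    Written as 2 / (1 + sqrt(1 + 4 K_A c)), which is the positive root of
    K_A c alpha^2 + alpha - 1 = 0 in a numerically stable form."""
    x = K_A * c
    return 2.0 / (1.0 + math.sqrt(1.0 + 4.0 * x))


def charge_function(D, c=1.0e-3, D_water=78.5):
    """alpha^2 * D_H2O / D for a uni-univalent electrolyte at molarity c
    (c and D_water are assumed illustrative values)."""
    alpha = fraction_dissociated(association_constant(D), c)
    return alpha ** 2 * D_water / D


for D in (78.5, 35.0, 20.0, 15.0, 10.0, 5.0):
    K_A = association_constant(D)
    alpha = fraction_dissociated(K_A, 1.0e-3)
    print(f"D = {D:5.1f}  K_A = {K_A:10.3e}  alpha = {alpha:.4f}  "
          f"alpha^2 D_w/D = {charge_function(D):.3e}")
```

With these assumed parameters the function increases as $D$ falls from that of water, passes through a maximum, and then collapses once $K_A c$ becomes large. The position of the maximum shifts with the assumed contact distance and concentration, so it should not be read as a prediction for any particular electrolyte.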
For the hypothetical protein described above, it follows that at equal extents of proton binding to a protein in water and in the solvent S, and at equal values of $\kappa$ in the two solvents, $(F_e)_S/(F_e)_{\mathrm{H_2O}}$ should vary with $D$ in a manner similar to the function $\alpha^2 D_{\mathrm{H_2O}}/D$ shown in Fig. 2. The large net charge on the protein should act to enhance counterion binding over that by a univalent ion at any particular value of $D$, and hence the maximum in $(F_e)_S/(F_e)_{\mathrm{H_2O}}$ may shift to considerably larger $D$ values, but otherwise the dependence on $D$ should be much the same as in Fig. 2. If
FIG. 2. The variation with dielectric constant of the function $\alpha^2 D_{\mathrm{H_2O}}/D$ (see text) for the system of Fig. 1 at a fixed electrolyte concentration.
the valence of the counterions is greater than unity, the maximum should again shift to larger $D$ values.

For any real protein, the presence of considerable numbers of different kinds of fixed charges in different local environments on the protein surface might tend to broaden the variation of counterion binding with $D$. Furthermore, simultaneous changes will generally occur in the degree of proton binding, and hence in $Z$, as $D$ is varied. It is difficult to take these factors properly into account, but in view of the considerations just given, it seems probable, for the hypothetical native conformation of a real protein in weakly protic solvents, that there should be a range of $D$ values, 80 > $D$ > ~30, in which the binding of univalent counterions is not significant; then $(F_e)_S/(F_e)_{\mathrm{H_2O}} = P_S^2 D_{\mathrm{H_2O}}/(P_{\mathrm{H_2O}}^2 D_S)$ at constant $\kappa$, where $P_S$ and $P_{\mathrm{H_2O}}$ are the numbers of protons bound to the initially isoelectric protein in the solvent S and in H2O, respectively. If $P_S > P_{\mathrm{H_2O}}$, then $(F_e)_S > (F_e)_{\mathrm{H_2O}}$, and there should be an increase in the tendency to disorder the native conformation, other things being equal. On the other hand, for $D$ < ~15, $(F_e)_S < (F_e)_{\mathrm{H_2O}}$ in weakly protic solvents, and probably in most strongly protic ones also, and the electrostatic tendency to disorder should be substantially decreased below its magnitude in H2O.
2. Hydrogen Bonds between Peptide Linkages

The author subscribes to the opinion expressed by Kauzmann (1959) that it is "unlikely that hydrogen bonds other than those involving peptide linkages can make a major contribution to the (over-all conformational) stability of most native proteins; there are generally relatively few of the other types of groups present, and none of them appear(s) to be especially strong." In this discussion, therefore, we will confine attention to peptide NH···O=C hydrogen bonds. For such bonds occurring in homogeneous solution, the generalized equilibrium of interest is:

$$\mathrm{NH}\cdots\mathrm{S} + \mathrm{C{=}O}\cdots\mathrm{S} \rightleftharpoons \mathrm{NH}\cdots\mathrm{O{=}C} + \mathrm{S}\cdots\mathrm{S} \tag{5}$$
in which S is a solvent capable of acting as both a hydrogen bond donor and acceptor. No specific stoichiometric relations are implied by this equation. Quantitative studies of hydrogen bonding in simple amide systems in solution have as yet been few in number, and these have been carried out mainly in nonpolar solvents. Some thermodynamic data are listed in Table II for NH···O=C hydrogen bonds. Several features of these data are of interest. In the first place, even in the most nonpolar solvents, which are incapable of forming hydrogen bonds, the value of $-\Delta H$ is quite small, of the order of 3 to 5 kcal/mole of bond. Furthermore, even as weak a hydrogen-bonding solvent as CHCl3 markedly diminishes this value, presumably because of competition effects expressed in Eq. (5). Still stronger hydrogen-bonding solvents must tend to drive the equilibrium in Eq. (5) even further to the left, other things being equal. In this connection, some recent measurements of Klotz and Franzen (1960, 1962) of the equilibrium constant, $K$, for the formation of hydrogen-bonded dimers of N-methylacetamide are of particular interest (Table II). The value of $K$ observed in water solutions of the amide is about $10^{-3}$ that in CCl4 solution, corresponding to a difference in $\Delta F^\circ$ of about 4.1 kcal/mole of hydrogen bond. This point of view has led to the realization in recent years that, taken
by themselves, intrapeptide hydrogen bonds can confer only marginal stabilization to the native conformation of a protein molecule in aqueous solution (Schellman, 1955a, b; Kauzmann, 1959; Klotz, 1960). Quantitative information about amide hydrogen bonding is unfortunately very meager in nonaqueous liquids which are protein solvents. In later sections, however, much stress will be placed on the one result (Table II) that amide hydrogen bonds in dioxane solution are considerably stronger than in water; the difference in $\Delta F^\circ$ for the dimerization of N-methylacetamide in the two solvents corresponds to about 2.8 kcal/mole. It would be of great interest to have available corresponding data in a large number of the solvents listed in Table I.

TABLE II
Enthalpy of Formation of NH···O=C Hydrogen Bonds in Solution(a)

Components                                   Solvent    -ΔH (kcal/mole H bond)   K(b) (l/mole)

p-Benzotoluidide, dioxane                    CCl4       2.3
p-Benzotoluidide, N,N-dimethylformamide      CCl4       3.8
δ-Valerolactam                               CCl4       5.2
Acetamide                                    CHCl3      1.1
N-Methylacetamide                            C6H6       3.5
                                             CCl4       3.8, 4.2(b)              4.7
                                             CHCl3      1.6
                                             Dioxane    0.8(b)                   0.52
                                             H2O        0.0(b)                   0.005

(a) After Table 6.2 in Bamford et al. (1956).
(b) For the reaction 2A ⇌ A2, see Klotz and Franzen (1962).

Certain semiquantitative considerations are, however, of interest here. The interaction of dilute solutions of amides with other hydrogen bond agents in nonpolar solvents has been studied by infrared spectroscopy by following the effects of these agents on the C=O and N-H stretching frequencies of the amide (Cannon, 1955; Mizushima et al., 1955). This work is reviewed by Bamford et al. (1956). These studies have shown that the C=O group in amides acts as a strong proton acceptor, whereas the NH group has a relatively weak proton-donating capacity. Thus, intrapeptide hydrogen bonding is the product of a weak donor (NH) interacting with a strong acceptor (C=O), resulting in a moderately weak hydrogen bond (Table II). Even if a solvent S is a strong proton acceptor, it will not easily displace the strong C=O acceptor from the NH group. Under these
circumstances, the proton-donating capacity of a nonaqueous solvent might be expected to be of greater importance than its proton-accepting capacity in determining the state of equilibrium in Eq. (5). This conclusion has several corollaries. In a series of compounds containing the OH group, for example, proton-donating capacity can vary widely, as is evident in Table III. The tendency to disrupt intrapeptide hydrogen bonding should therefore increase markedly, other things being equal, in the order: alcohols, phenols, carboxylic acids, and finally, substituted carboxylic acids such as trifluoroacetic and dichloroacetic acids. Water and the polyhydric alcohols are difficult to place in this sequence because they can donate and accept more than one proton per molecule to form hydrogen bonds, but these substances probably belong between phenols and carboxylic acids. Experimental data are required, however, to substantiate this.

TABLE III
Enthalpy of Formation of Hydrogen Bonds of OH Compounds with Dioxane in CCl4 Solutions (a)

Compound          -ΔH (kcal/mole H bond)
Benzyl alcohol    2.1
Phenol            4.5
o-Cresol          5.4
Benzoic acid      6.2

(a) Pimentel and McClellan (1960), p. 234.

Furthermore, certain nonaqueous solvents such as dioxane, acetonitrile, dimethylsulfoxide, pyridine (and other tertiary amines), and dimethylformamide function primarily as hydrogen bond acceptors and have little proton-donating capacity. These solvents should therefore show, roughly speaking, a relatively small tendency to disrupt intrapeptide hydrogen bonds, since they cannot compete very successfully with the peptide NH group for the O=C hydrogen bond.

3. Lyophobic Interactions
The term “lyophobic interactions” is intended to generalize the expression “hydrophobic interactions” to other solvents than water. Hydrophobic interactions have been prominently implicated in determining the native configuration of proteins in aqueous solution. These interactions are actually not of a single relatively well-defined character, as are electrostatic or hydrogen bond interactions, but are rather a set of interactions responsible for the immiscibility of nonpolar substances and water. Proteins contain a substantial proportion of amino acids such as phenylalanine, valine, leucine, etc., with nonpolar side-chain residues. These nonpolar groups should tend, therefore, other factors permitting, to cluster on the
inside of the protein molecule away from the aqueous environment, as a result of these interactions. In the context of this article, lyophobic interactions refer to the forces tending to produce a similar clustering of the nonpolar residues of a protein in a nonaqueous solvent. Other types of lyophobic interactions (as for example between a solvent of low dielectric constant and the ionic groups of a protein molecule) are not included. Hydrophobic interactions have been correlated with unitary free energy changes, ΔF_u, of relatively simpler processes such as (Kauzmann, 1959):

Hydrocarbon in nonpolar solvent (mole fraction x) → hydrocarbon in water (mole fraction x)    (6)

Pure liquid hydrocarbon → hydrocarbon in water (mole fraction x)    (7)
It is a most interesting fact that the unfavorably positive values of ΔF_u accompanying these processes are primarily due to negative unitary entropy changes, and not to positive enthalpy changes. This has been interpreted to mean that water molecules become highly ordered around dissolved hydrocarbon molecules (Frank and Evans, 1945), suffering thereby a loss of entropy not compensated for by a favorable enthalpy change. It may be that water is a unique substance in exhibiting this large entropic loss. It has been suggested that similar thermodynamic considerations apply to hydrophobic interactions for protein molecules in aqueous solution (Kauzmann, 1959). If we extend these considerations to nonaqueous solvents, reaction (7) may be written as:

Pure liquid hydrocarbon → hydrocarbon in nonaqueous solvent (mole fraction x)    (7a)
ΔF_u for this process can be determined as follows: if the hydrocarbon is only partially miscible with the solvent, and the saturated solution is sufficiently dilute to be ideal,

$$\Delta F_u = -RT \ln x_s$$

where x_s is the mole fraction of hydrocarbon in the saturated solution. If, on the other hand, the hydrocarbon is completely miscible with the solvent, ΔF_u can be obtained from suitable vapor-liquid equilibrium data. Consider a solution containing mole fraction x of the hydrocarbon in equilibrium with a vapor, at total pressure P, containing mole fraction y of the hydrocarbon. For a sufficiently dilute solution, the fugacity f = yP obeys Henry's law:

$$f = Kx \qquad (8)$$
where K is a constant. ΔF_u for reaction (7a) is then given by:

$$\Delta F_u = RT\left(\ln\frac{f}{f^0} - \ln x\right) = RT \ln\frac{K}{f^0} \qquad (9)$$
where f^0 is the fugacity of the pure liquid hydrocarbon. For the purpose of this article, we have calculated ΔF_u values for the hydrocarbon benzene, using Eq. (8) or (9) depending on the degree of miscibility of benzene with a nonaqueous solvent. These values are given in Table IV for a number of solvents for which data are available. With Eq. (9), K values were obtained by extrapolation of yP/x to infinite dilution of benzene in the solvent. Unfortunately, data for all these systems are not available at any one temperature, and the ΔF_u values listed in Table IV are not directly comparable. They provide an estimate, however, of the relative magnitude of ΔF_u for different solvents.

TABLE IV
Unitary Free Energy Changes for the Transfer of Benzene to Various Solvents (a)

Solvent             Temperature (°C)    ΔF_u (kcal/mole)    Reference
Water               18                  4.07                1
Ethylene glycol     25                  1.83                2
Formic acid         25                  1.45                3
Propylene glycol    25                  0.99                2
Methanol            35                  1.23                4
Ethanol             45                  0.96                5
Isopropanol         45                  0.88                6
Acetonitrile        45                  0.67                7
CCl4                40                  0.08                8

(a) The first four solvents are only partially miscible with benzene, and ΔF_u was calculated using Eq. (8).

REFERENCES FOR TABLE IV
1. Kauzmann, W. (1959). Advances in Protein Chem. 14, 1.
2. Curme, G. O., and Johnston, F. (1952). "Glycols," p. 48. Reinhold, New York.
3. "International Critical Tables." (1929). McGraw-Hill, New York.
4. Scatchard, G., Wood, S. E., and Mochel, J. M. (1946). J. Am. Chem. Soc. 68, 1957.
5. Brown, I., Fock, W., and Smith, F. (1956). Australian J. Chem. 9, 369.
6. Brown, I., Fock, W., and Smith, F. (1956). Australian J. Chem. 9, 364.
7. Brown, I., and Smith, F. (1955). Australian J. Chem. 8, 62.
8. Scatchard, G., Wood, S. E., and Mochel, J. M. (1940). J. Am. Chem. Soc. 62, 712.

The process we are in fact most concerned with is:
Hydrocarbon (in water, mole fraction x) ⇌ hydrocarbon (in nonaqueous solvent, mole fraction x)    (10)
The unitary free energy change of this isothermal transfer process, ΔF_t, is given by (ΔF_u)nonaqueous solvent - (ΔF_u)water. The calculation of exact values of ΔF_t requires ΔF_u values obtained at the same temperature, but at least semiquantitative conclusions about ΔF_t follow from the data in Table IV. For all the solvents listed in the table, ΔF_t is negative. This reflects, of course, the fact that benzene is less soluble in water than in any other liquid that can function as a protein solvent. Furthermore, even for those nonaqueous pure solvents in which benzene is only partially soluble at 25°C (among all the solvents in Table I, this category includes only the weakly protic solvents ethylene glycol, propylene glycol, glycerol, and formamide, and the strongly protic solvents hydrofluoric acid, formic acid, and hydrazine), -ΔF_t is at least 2 kcal/mole. With the majority of the solvents in Table I benzene is completely miscible at 25°C; -ΔF_t must therefore be considerably larger than 2 kcal/mole for any of these solvents. For example, benzene in methanol forms about the most nonideal solution for which the components are still miscible in all proportions at 25°C, yet -ΔF_t is about 3 kcal/mole in this solvent. Most benzene-miscible solvents form benzene solutions that are so much closer to ideal that it is clear that for such solvents -ΔF_t values must cluster between 3.5 and 4.1 kcal/mole. The sum of these ΔF_t, ΣΔF_t, taken over all the nonpolar residues found in typical protein molecules, can attain very large negative values. If the native conformation of a protein molecule in aqueous solution is indeed in considerable part stabilized by lyophobic interactions, it follows that this stabilization should be substantially if not completely lost on transferring the protein molecule to almost any pure nonaqueous solvent.
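The arithmetic behind these estimates can be sketched as follows. This is a minimal illustration, not part of the original treatment: it copies the ΔF_u values for benzene from Table IV, ignores the temperature differences between the entries (so the results are semiquantitative only, as the text cautions), and uses function names that are purely illustrative.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mole K)

# Unitary free energies of solution of benzene, copied from Table IV (kcal/mole).
# The entries refer to different temperatures, so differences between them
# are only semiquantitative.
DELTA_F_U = {
    "water": 4.07,
    "ethylene glycol": 1.83,
    "formic acid": 1.45,
    "propylene glycol": 0.99,
    "methanol": 1.23,
    "ethanol": 0.96,
    "isopropanol": 0.88,
    "acetonitrile": 0.67,
    "CCl4": 0.08,
}


def unitary_free_energy_from_solubility(x_s, temp_c):
    """Return -RT ln x_s (kcal/mole) for a dilute, ideal saturated solution.

    x_s is the mole fraction of hydrocarbon at saturation; temp_c is in deg C.
    """
    return -R_KCAL * (temp_c + 273.15) * math.log(x_s)


def transfer_free_energy(solvent):
    """Return the water-to-solvent transfer value, (dF_u)solvent - (dF_u)water."""
    return DELTA_F_U[solvent] - DELTA_F_U["water"]


def transfer_table():
    """Transfer free energies (kcal/mole) for every nonaqueous entry of Table IV."""
    return {s: round(transfer_free_energy(s), 2) for s in DELTA_F_U if s != "water"}


if __name__ == "__main__":
    for solvent, value in transfer_table().items():
        print(f"{solvent:17s} transfer free energy ~ {value:+.2f} kcal/mole")
```

With these inputs every transfer value is negative, the partially miscible solvents give magnitudes somewhat above 2 kcal/mole, and methanol gives a magnitude close to 3 kcal/mole, in line with the statements above.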
This destabilization might be expected to be less extensive in those few weakly protic nonaqueous solvents with which hydrocarbons are only partially miscible, such as glycerol, ethylene and propylene glycols, and formamide, than in the other solvents with which hydrocarbons are completely miscible. Furthermore the latter solvents should be very little differentiated under these circumstances, since AF: is so similar for most of them. As is demonstrated subsequently, these expectations are closely realized in fact. It is an oversimplification, of course, to correlate lyophobic interactions in a solvent-protein system solely with the solubility of simple hydrocarbons in the solvent. There are many other kinds of groups besides hydrocarbon ones in protein molecules and the influence of altered solvent environment on such groups must also be considered. In this connection, systematic
solubility studies of model compounds in a number of nonaqueous solvents, especially those described by Cohn and Edsall (1943, Chapter IX), are of value in estimating the free energy changes accompanying the transfer of particular groups from water to nonaqueous solvents. These studies will not be examined in detail, since they are discussed extensively elsewhere (Cohn and Edsall, 1943; Cohn, 1936). They reveal, however, that the nonpolar residues make the most substantial negative contributions to ΔF_t in many solvents; for polar, nonionic residues (those of serine, threonine, tyrosine, etc.), the net contribution is small; while for the peptidelike group, -CH2CONH-, it is positive. It is therefore not an entirely unjustified approximation to discuss hydrophobic interactions solely in terms of the nonpolar residues of proteins, which clearly make the largest contribution to them.
4. Conformational Entropy

There is a very substantial loss of conformational entropy accompanying the conversion of a protein molecule to its native conformation from a random-coil structure limited only by primary valence bonds. This entropy loss makes a large contribution to the destabilization of the native conformation. However, if the assumption is made that the random-coil form has a similarly large conformational entropy in different solvents, this entropy change should not be much influenced by a change of solvent, and is therefore not considered further at this point.

5. Cooperative Nature of Interactions

Having thus attempted to isolate some of the more important types of interactions which determine the structure of proteins in solution, we conclude this discussion by emphasizing the fact that in any real system these interactions are difficult to separate. For one thing, lyophobic and hydrogen bond interactions are generally not independent of one another. A solvent may be very lyophobic for nonpolar groups expressly because the solvent molecules are strongly hydrogen bonded to one another. In such a solvent, therefore, strong solvent-solute hydrogen bonds may also be formed. Furthermore, one type of interaction may profoundly influence the degree to which another participates. For example, it has been shown (see Section IV,B,2) that solute-solute hydrogen bonding such as occurs in the α-helix is relatively weak in aqueous solution. If, however, hydrogen bonds are buried in the interior of a hydrophobic region of a protein molecule, they should become much stronger and thus enhance the stabilizing influence of the hydrophobic interactions. It follows that the helical regions and hydrophobic regions of a protein molecule need not be mutually
exclusive; in fact, there may be considerable coincidence of the two. This is demonstrated directly in the structure of myoglobin (Kendrew et al., 1961). In this protein, all the phenylalanine and methionine side chains are directed inwards in what are clearly hydrophobic regions of the molecule, and of these residues about half are located in helical and half in nonhelical regions. It is therefore difficult to dissociate completely the different kinds of interactions, and to assign to them roles of varying importance in determining macromolecular structure.
C. Experimental Methods

At this juncture, it is useful to discuss the experimental methods that are of value in studying and separating the various kinds of interactions in macromolecular systems. A variety of experimental methods have been applied to the determination of protein structure and conformation in solution, and these have been summarized by Kauzmann (1959). In the discussion which follows, emphasis is placed on those methods which have so far been of most use in studies of proteins in nonaqueous solvents, and these remarks should be considered as supplementary to the Kauzmann summary.

1. Electrostatic Interactions

In order to examine the influence of these interactions, the charge properties of protein molecules have to be investigated in nonaqueous solvents. Titration, electrophoretic, and conductometric experiments could yield important information in this connection, but few studies in nonaqueous solvents have been performed. With respect to titration studies, potentiometric acid-base titrations in nonaqueous media have often been carried out, but these have been mainly concerned with analytical end-point determinations. For protein solutions, titration studies involve the measurement of hydrogen ion activities (see the article by Tanford in this volume). In this connection, Sage and Singer (1962) have shown that the glass electrode can function as a hydrogen electrode in essentially pure ethylene glycol solutions. A spectrophotometric potentiometric titration was carried out with L-tyrosine ethyl ester in 0.2 M KCl in ethylene glycol, by means of a glass electrode in conjunction with a Ag-AgCl reference electrode and a salt bridge containing a saturated solution of LiCl in ethylene glycol. The ionization equilibrium of the phenolic group of tyrosine ethyl ester can be written as:

$$K_D = \frac{a_{\mathrm{H}^+}\, a_{\mathrm{RO}^-}}{a_{\mathrm{ROH}}} \qquad (11)$$

where K_D is the acid dissociation constant in ethylene glycol, and a_H+, a_ROH, and a_RO- are the activities of hydrogen ion, and the phenolic and phenoxide ion forms of tyrosine ethyl ester, respectively. It was found that the increment in molar extinction coefficient, Δε, of a partially titrated solution of tyrosine ethyl ester in ethylene glycol at 296 mμ is essentially proportional to the mole fraction of the phenoxide ion species. Thus, if the glass electrode functions as an exact hydrogen electrode in ethylene glycol solution, the following relation should hold:

$$E = \text{constant} - \frac{RT}{\mathcal{F}} \ln K_D - \frac{RT}{\mathcal{F}} \ln \frac{\Delta\epsilon_T - \Delta\epsilon}{\Delta\epsilon} \qquad (12)$$
where E is the measured potential, the constant is the sum of the standard half-cell potentials and the (assumed constant) liquid junction potential, F is the Faraday, and Δε_T the molar extinction coefficient of the phenoxide ion species. Therefore, a plot of E against log [(Δε_T - Δε)/Δε] should give a straight line with a slope of -0.0591 volt at 25°C, as was indeed found (Fig. 3). That the glass electrode can function as a hydrogen electrode may turn out to be true in a variety of other acid- and alkali-stable, weakly protic nonaqueous solvents besides ethylene glycol. This would make it experimentally feasible to investigate the acid-base behavior of simple substances and of proteins in a manner quite analogous to the extensive studies that have been carried out in aqueous solution. It is difficult to define a thermodynamic "pH" scale in a nonaqueous solvent (Gutbezahl and Grunwald, 1953) but an operational "pH" scale can be defined in terms of suitable standard buffers as is done in aqueous media (Bates, 1954). Such studies should be of considerable value in defining the properties of proteins in nonaqueous solvents.

It might also be possible to perform electrophoresis experiments in weakly protic solvents that are not readily oxidized or reduced, although little work has been done in this area. As was pointed out by Tiselius (1959), although the conductance in such media may often be very low, this may be compensated for by the application of high voltages without concomitant large heating effects.

Some interesting conductance studies of protein-nonaqueous solvent systems have been carried out. Greenberg and Larson (1935), for example, found that gelatin, casein, and edestin dissolved in glacial formic acid (D = 56.5 at 16°C) showed marked increases of conductivity over that of the solvent, whereas no conductivity increment was observed with the same proteins dissolved in glacial lactic or acetic (D = 6.15 at 20°C) acids.
This is in accord with the conclusions reached in Section IV,B,1, that in solvents of low dielectric constant, ion-pairing (in these cases, of the protein cations to lactate and acetate counterions) is essentially complete. Conductivity measurements are easy to make and should be of considerable utility in estimating the magnitude of electrostatic interactions relative to those existing in aqueous media (see also Herskovits et al., 1961). From an operational point of view, the influence of electrostatic interactions in a particular protein-solvent system can be greatly diminished, often essentially eliminated, in a number of ways. Studies can be made at the isoelectric point of the protein (as revealed by electrophoresis measurements) if the solvent is a weakly protic one. Since Z = 0 at this point, F_e is at a minimum. Further, the addition of increasing amounts of supporting electrolyte, by increasing κ and by decreasing Z (through counterion binding), also produces a decrease in F_e [Eq. (4)]. If by these devices, no effect on the conformation of the protein molecule in a particular solvent is observed, then electrostatic interactions are not of primary importance in the system in question.

FIG. 3. Titration data of L-tyrosine ethyl ester HCl in 0.2 M KCl-ethylene glycol according to Eq. (11) of text. Abscissa: log [(Δε_T - Δε)/Δε]. Line drawn through the data has slope of 0.0591 v. (Sage and Singer, 1962.)
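The slope quoted for the plot of E against log [(Δε_T - Δε)/Δε] is simply 2.303 RT/F, and it can be checked numerically. The following is a minimal sketch under stated assumptions: the physical constants are modern values, the function names are illustrative, and `e_mid` is a hypothetical parameter that lumps together the constant and the ln K_D term of Eq. (12).

```python
import math

R = 8.314462618        # gas constant, J/(mol K)
FARADAY = 96485.33212  # Faraday constant, C/mol


def nernst_slope_volts(temp_c):
    """Magnitude of the slope of E against log10 of the ratio: 2.303 RT/F, in volts."""
    return math.log(10.0) * R * (temp_c + 273.15) / FARADAY


def predicted_potential(log_ratio, e_mid, temp_c=25.0):
    """E according to the form of Eq. (12).

    log_ratio is log10[(d_eps_T - d_eps)/d_eps]; e_mid (hypothetical) is the
    potential at log_ratio = 0, i.e. the constant minus (RT/F) ln K_D.
    """
    return e_mid - nernst_slope_volts(temp_c) * log_ratio


if __name__ == "__main__":
    print(f"slope at 25 C: {nernst_slope_volts(25.0):.4f} V per log unit")
```

With modern constants the slope at 25°C comes out very slightly above the 0.0591 volt quoted in the text; the difference lies in the fourth decimal place and is immaterial for a plot such as Fig. 3.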
2. Hydrogen Bonds between Peptide Linkages
At least two general classes of intramolecular peptide hydrogen bonds may be recognized, intrachain and interchain bonds. The former type leads to single-chain helical structures of which the α-helix (Pauling et al., 1951) is the only one that has so far been shown to exist in protein molecules (Kendrew et al., 1961). Interchain hydrogen bonds can lead to a variety of different structures, among them multichain helices as in collagen, and sheetlike structures as in silk fibroin (cf. Bamford et al., 1956, Chapter IV). Although the α-helix may prove to be the most important of these various structures for proteins, the possibility that other regular hydrogen-bonded structures may exist in specific cases must be kept in mind (Donohue, 1953; Luzzati et al., 1961). Infrared spectra are of considerable interest in the study of peptide hydrogen bonding. (For an extended discussion of this subject, see Bamford et al., 1956, Chapters V and VI.) The formation of the hydrogen bond N-H···O=C weakens the forces acting between the N and H atoms and between the O and C atoms, and hence the N-H and O=C stretching frequencies decrease. On the other hand, the resistance to bending of the N-H and O=C bonds is increased by the hydrogen bond, and the bending frequencies therefore increase. Furthermore, the nature of the peptide hydrogen bonding influences these frequencies. Examination of a large number of polypeptides in thin solid films prepared in a variety of ways has shown that characteristic frequency changes accompany the conversion of intrachain to interchain hydrogen bonds (α → β transformation). This is shown in Table V. These changes are conceivably due to a change in the conformation of the polypeptide chain.
A difficulty, however, in applying this method to solutions of polypeptides and proteins in nonaqueous solvents is that many solvents of interest are insufficiently transparent in the important regions of the infrared, and much of the work heretofore has therefore been carried out with solid films of these macromolecules. The study of helical structures in solution has been given great impetus in recent years by the discovery that such structures contribute markedly to the optical rotatory properties of polypeptides and proteins (Cohen, 1955; Moffitt, 1956; Moffitt and Yang, 1956). This subject is undergoing very rapid development and continual re-evaluation as more experimental studies are performed. A full discussion of optical rotatory phenomena with these substances is inappropriate in this article, and for further details, the reader is referred to recent reviews (Blout, 1960; Levedahl and James, 1961; Urnes and Doty, 1961). Here a brief summary of the subject must suffice. The equation of Moffitt and Yang (1956), although without adequate theoretical basis (Moffitt et al., 1957), does describe the optical rotatory
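properties of a number of helical polypeptides and proteins:

$$[m']_\lambda = \frac{3}{n^2+2}\,\frac{M_0}{100}\,[\alpha]_\lambda = \frac{a_0\lambda_0^2}{\lambda^2-\lambda_0^2} + \frac{b_0\lambda_0^4}{(\lambda^2-\lambda_0^2)^2} \qquad (13)$$

In this equation, [m']_λ is the effective residue rotation at wavelength λ,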
TABLE V
Wave Numbers of Principal Bands in Polypeptide Infrared Spectra (a, b)

Column headings: Polymer; C=O (str.), α and β (cm-1); NH (def.), α and β (cm-1); NH (str.), α and β (cm-1).

1657 1655 (1652) Poly-L-glutamic acid 1652 Poly-L-lysine hydroiodide 1656 Poly-γ-methyl-L-glutamate 1654 Poly-γ-benzyl-L-glutamate 1653 Poly-L-leucine-L-phenylalanine (1:1) 1656 Poly-DL-alanine 1661 1665 Poly-DL-phenylalanine 1661 Poly-DL-leucine (1659) Poly-γ-methyl-DL-glutamate 1662 Poly-γ-benzyl-DL-glutamate 1661 Poly-DL-leucine-DL-phenylalanine (1:1) 1661 Poly-DL-leucine-DL-α-amino-n-caprylic acid (2:1) (1659) Poly-L-alanine Poly-L-leucine
1635
1545 1545 (15491 1550 1531 1551
1524
3293 3292
3283
1525
3292
3287
B
1628
a
3292
1549 1545 1629
1539 1542 1547 (1551)
1631
1515
3305 3300
1526 1551
3297
1542 1551 (1553)
(a) Figures in parentheses refer to 1% solutions in nonpolar solvent. (b) Data from Bamford et al. (1956).
which is the specific rotation [α]_λ corrected for the refractive index, n, of the solvent, and for the average molecular weight, M_0, of the amino acid residues; and a_0, b_0, and λ_0 are constants. The constant a_0 is a composite of a number of contributions to the rotation, but the constant b_0 appears to be more directly correlated with helical structures, in view of the following considerations.
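Before turning to those considerations, it may help to see how a_0 and b_0 are extracted in practice. Multiplying the Moffitt-Yang expression through by (λ² - λ_0²) gives a straight line in 1/(λ² - λ_0²), with intercept a_0 λ_0² and slope b_0 λ_0⁴. The sketch below illustrates this with synthetic data only: the value λ_0 = 212 mμ and the b_0 of -630° are the figures quoted in this section, the a_0 of 100° is an arbitrary assumption made for the demonstration, and the function names are illustrative.

```python
LAMBDA_0 = 212.0  # nm (one millimicron equals one nanometer)


def moffitt_yang(wavelength, a0, b0, lambda0=LAMBDA_0):
    """Effective residue rotation [m'] at the given wavelength (nm)."""
    d = wavelength**2 - lambda0**2
    return a0 * lambda0**2 / d + b0 * lambda0**4 / d**2


def fit_a0_b0(wavelengths, rotations, lambda0=LAMBDA_0):
    """Recover (a0, b0) by a least-squares line.

    y = [m'] * (lambda^2 - lambda0^2) is plotted against
    x = 1 / (lambda^2 - lambda0^2); intercept = a0 * lambda0^2,
    slope = b0 * lambda0^4.
    """
    xs = [1.0 / (w**2 - lambda0**2) for w in wavelengths]
    ys = [m * (w**2 - lambda0**2) for w, m in zip(wavelengths, rotations)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept / lambda0**2, slope / lambda0**4


if __name__ == "__main__":
    # Synthetic dispersion curve; a0 = 100 is an arbitrary illustrative value.
    wl = list(range(300, 601, 50))
    rot = [moffitt_yang(w, a0=100.0, b0=-630.0) for w in wl]
    a0_fit, b0_fit = fit_a0_b0(wl, rot)
    print(f"a0 = {a0_fit:.1f}, b0 = {b0_fit:.1f}")
```

Note that the fit presupposes a value of λ_0; with real data the quality of the straight line is itself the test of whether the chosen λ_0 is appropriate.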
A number of synthetic polypeptides have been studied by optical rotatory dispersion measurements in the visible and near ultraviolet, with some of the results shown in Table VI. Almost all of these systems can be fitted to the Moffitt-Yang equation with λ_0 = 212 ± 5 mμ. However, the values of b_0 fall into two groups. In one group (group I, Table VI), the values of b_0 are all remarkably similar, -630° ± 20°, and are essentially independent of solvent, provided the solvent does not disrupt the helix (see Section IV,D). In solvents which by other criteria do convert group I polypeptides from the helical to random-coil form, b_0 generally becomes zero. In the other group (group II) b_0 takes on a variety of values ranging from 0° to +611° in solvents in which the polypeptides are helical and shows no characteristic change in solvents in which the macromolecules are randomly coiled. At first sight, these results suggest that the parameter b_0 is subject to such wide variations for helical polypeptide structures as to be of little use in studies of the helical content of proteins. The situation, however, appears to be more favorable than this. First of all, it is worth noting that the optical rotatory dispersions of the free amino acids themselves segregate into the same two groups (Strehm et al., 1961). The structural distinction between these two groups appears to be the nature of the substituent on the β-carbon atom of the amino acid. All the monomeric amino acids of the group I polypeptides have a methylene group (or an H-atom in the case of alanine) on the β-carbon, and all of those in group II have some other functional group so situated. Now, globular proteins generally contain about 90% of amino acids of group I and only about 10% of group II. In copolymers of group I and group II amino acids, the group I component is the dominant one (Blout, 1960, 1961) in determining the sign and magnitude of b_0.
Hence, it is not unreasonable that helical structures in protein molecules might exhibit the characteristics of group I polypeptide helices. Furthermore, those proteins that by other criteria appear to be extensively helical exhibit b_0 values of -630° ± 30° (Table VI) in their native conformations and values close to zero when denatured (Cohen and Szent-Gyorgyi, 1957), which corresponds precisely to the behavior of group I polypeptides. Finally, in the case of myoglobin, for which X-ray crystallographic methods show that 77% of the amino acid residues are in the right-handed α-helical conformation in the crystal, optical rotatory dispersion data obtained in aqueous solution in the ultraviolet from 360 to 240 mμ can be fitted to the Moffitt-Yang equation to give a helical content in reasonable agreement with the observed value (Urnes et al., 1961). There appears to be, therefore, considerable correspondence between the optical rotatory dispersion parameters of the group I polypeptides and of
TABLE VI
Optical Rotatory Dispersion of Polypeptides and Some Proteins (a)
Helix-forming solvents Macromolecule Group I polypeptides Poly-L-alanine Poly-γ-benzyl-L-glutamate
Solvent
bo
----I~-
Poly-L-glutarnic acid Poly-L-leucine Poly -L-lysine Poly -N-carbobenzoxy-L-Iysine h3 a Copoly-L-lysine-L-glutaniic acidb Group I1 Polypeptides Poly -L-tyrosine Poly-0-acetyl-L-serine Poly -1-benzyl -L-histidine Poly-L-serine Poly-L-histidine.j5 HzO Poly-&benzyl-L-aspartate
-m
Random coil-forming solvents ha _ _ _ _ _ _ - ~
Solveiit
Film Ethylene dichloride Dioxane CIICl, N , N-Dimethylformamidc HzO, pH 4.4 Dichloroacetic acid 0.2 M NaBr in HzO, pH 11.5 N , N-Diinethylformainide 2-Chloroethanol
- 390 0
Dichloroacetic acid Hydrazinc
+%
- 177 0
H20, pH 10.5 Trifluoroacetic acid HzO, pH 6.8
0
Trifluoroacetic acid
+450
0 0 +G11
0.15 M NaCl in HzO, pH 10.85 25:75 Dichloroacetic acid:ethylene dichloride Benzyl alcohol 10 M LiBr in HzO 0.2 M NaCl in HzO, pH 6 CHCl3
-600 -620
HzO, 0.6 M KCl E I 2 0 , O . G M KC1
- 190d 0
-f3B
-630 -625 -6430 -610 -650 -650 -625 -G3G
+sa 0
NU
0
+3%
0 0
0.15 iM NaCl in HzO, pH 12.3 Dichloroacetic acid
Dichloroacetic acid 8 M Urea in Hz0
0.2 M NaCl in HzO, pH 3.7
Proteins Paramyosin Tropomyosin
(a) Data from Blout (1960). (b) Doty et al. (1958). (c) NL signifies a nonlinear plot. (d) Not completely denatured (Cohen and Szent-Gyorgyi, 1957).
HzO, 9.5 M urea H2U, 9.5 M urea
certain proteins in their right-handed α-helical, as well as their disordered, conformations.¹ Optical rotatory dispersion measurements of globular proteins in aqueous solutions yield a range of b_0 values from around zero to -300° (Table VII). It has been suggested that the fraction -b_0/630° be used as a measure of the fractional helical content of globular proteins (Cohen and Szent-Gyorgyi, 1957; Doty, 1959).

TABLE VII
Estimated Per Cent Excess of Right-Handed Helical Contents of Proteins (a)

Protein                  Water solution (b)    2-Chloroethanol solution (c)
Tropomyosin              88                    110
Insulin                  38                    45
Bovine serum albumin     46                    75
Ovalbumin                31                    85
Lysozyme                 29                    63
Pepsin                   31                    44
Histone                  20                    72
Ribonuclease             16                    67
Globin H                 15                    74

(a) Data from Doty (1959).
(b) (-b_0/630) × 100.
(c) Average of (-b_0/630) × 100 and (a_0^H/650) × 100 (see Doty, 1959).

Several problems associated with this suggestion must be considered. If such globular proteins indeed have only a fraction of their amino acid residues in the helical form, the helical segments may be fairly short and might have atypical optical rotatory properties. It appears reasonable theoretically, however, that even short helices
exhibit the rotations characteristic of infinite ones, and that their contributions within a protein molecule be additive (Zimm et al., 1960). At least the following possibilities must also be considered: (a) both right- and left-handed α-helices may be present in a protein molecule, and partially compensate each other's rotatory contributions; (b) helical structures other than the α-helix might be present in certain globular proteins (Bamford et al., 1956, Chapter IV), and have different rotatory properties; and (c) structures other than helical ones, such as interpeptide chain structures and asymmetric disulfide groups (Strehm et al., 1961), might affect the rotatory dispersion.

¹ Some ambiguity must be attached to this statement in view of the conclusions reached by Luzzati et al. (1961) on the basis of small angle X-ray scattering studies from dilute solutions of polybenzyl-L-glutamate in dimethylformamide, pyridine, and m-cresol. These authors conclude that the conformation of the polypeptide in these solutions is that of the 3₁₀-helix (Donohue, 1953) and not that of the α-helix. The value of b_0 = -630° in Eq. (13) was obtained from optical rotatory dispersion measurements with similar solutions, and therefore if the conclusions of Luzzati et al. are correct, this value characterizes the 3₁₀-helix. It would then be necessary to determine whether the b_0 value for the α-helix is significantly different.

Our primary objective in the context of this article is to examine the rotatory dispersion of proteins in different solvents. The qualifications raised in the preceding paragraph, while of significance to the interpretation of the absolute value of b_0 in any one system, are of less importance when comparing the value of b_0 for a given protein in different solvents. In this connection, the demonstration that b_0 for the helical conformation of poly-γ-benzyl-L-glutamate is unaffected by different solvents (Moffitt and Yang, 1956) (Table VI) is particularly pertinent. In subsequent sections, therefore, the procedure will be followed of estimating the changes in helical contents of a particular protein in different solvents by comparing the values of -b_0/630° in the different solvents (Cohen and Szent-Gyorgyi, 1957; Doty, 1959). The alternative proposals made by Yang and Doty (1957) that the helical content of proteins be estimated from values of [α]_D or from λ_c in the equation

$$[\alpha] = \frac{k}{\lambda^2 - \lambda_c^2}$$
are subject to considerable ambiguity in nonaqueous solvent systems (Tanford et al., 1960; Tanford and De, 1961). These parameters appear to reflect more than only helical structural changes in protein molecules. It has recently been found (Simmons et d.,1961) that a pronounced Cotton effect exists in solutions of helical polypeptides and proteins in the wavelength region from about 260 mp to 220 mp. A large trough in rotation occurs at 233 mp which is removed when the helix is disrupted. The magnitude of [ ( ~ ] 2 3 ~may therefore be found to serve as an independent measure of the helical contents of proteins. This effect has not yet been extensively applied to proteins in nonaqueous solvents, but it should prove to be of great interest for proteins dissolved in those solvents which are sufficiently transparent in this region of the spectrum. 5. Lyophobically Bonded Regions of Protein Molecules
It is difficult at the present time to prescribe general methods by which effects on lyophobic interactions can be specifically distinguished and measured. In certain cases, pK shifts of ionizable groups may occur when they are situated in the interior of hydrophobic regions of protein molecules, where they are inaccessible to the aqueous phase. On the other hand, pK shifts may be due to a variety of other causes as well, such as hydrogen bonding to other groups, vicinal electrostatic effects, or long-range electrostatic effects (Kauzmann, 1959). Some of the phenolic OH
groups of tyrosine residues of proteins undergo large pK shifts, as can be readily observed by spectrophotometric titration experiments (Crammer and Neuberger, 1943). Tyrosine OH groups are of special interest in this context because they are not ionized at physiological pH, unlike most other titratable groups of protein molecules. They are therefore more likely to be buried (probably hydrogen-bonded) in hydrophobic regions of low dielectric constant within a protein molecule than are, for example, carboxylate, ammonium, or guanidinium groups. Such buried tyrosines should exhibit large pK shifts. The tyrosines of bovine pancreatic ribonuclease (RNase) appear to be a case in point. Three of the total of six tyrosine residues per RNase molecule titrate reversibly with a normal pK of about 10, but the other three titrate only at much higher pH and then irreversibly (Shugar, 1952; Tanford et al., 1955a). These results suggest that the RNase molecule has to undergo a profound structural rearrangement before the three anomalous tyrosines become accessible to titration. Furthermore, the absorption of RNase due to tyrosine residues at about 280 mμ exhibits a hyperchromic effect, presumably as a result of the special environment of the three anomalous tyrosines. With several other proteins, such as bovine serum albumin (Tanford and Roberts, 1952), lysozyme (Tanford and Wagner, 1954), and β-lactoglobulin (Tanford and Swanson, 1957), pK shifts of the phenolic OH groups of tyrosine residues are observed, but these are of a qualitatively different nature. Thus, the tyrosines of any one of these proteins cannot be readily differentiated into a normal and an abnormal variety, since the spectrophotometric titration data for these proteins are reversible and fall on single smooth curves, in contrast to the situation with RNase.
On the other hand, the tyrosine residues of ovalbumin show comparable behavior to the three abnormal tyrosine groups of RNase (Crammer and Neuberger, 1943). About 2 of the total of 9 tyrosine residues appear to titrate normally, but the remainder are not titrated up to pH 12. At pH 13, these anomalous tyrosines become titratable, and this is accompanied by the irreversible denaturation of the ovalbumin molecule. It is therefore suggested (Tanford et al., 1955a) that at least those tyrosine residues which are distinguished by being titratable only at pH > ~12, and whose titrations are accompanied by an irreversible denaturation of the molecule, be presumed to exist in hydrophobic regions of their respective protein molecules in aqueous solutions. The titration characteristics of these anomalous tyrosine residues in nonaqueous solvents may then be examined to determine whether the hydrophobic regions in which they are presumed to exist persist in the nonaqueous solvent (Sage and Singer, 1958, 1962).
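The remark that tyrosine OH groups with a normal pK of about 10 are essentially un-ionized at physiological pH can be checked with the simple mass-action (Henderson-Hasselbalch) relation for a single, independently titrating group. The Python sketch below is illustrative only: the function name `fraction_ionized` and the shifted pK of 12.5 used for a buried residue are assumptions made for this example, not values taken from the studies cited, and the relation neglects the electrostatic interaction terms and the irreversibility that a full treatment of protein titration would have to include.

```python
def fraction_ionized(pH, pK):
    """Fraction of an acidic group (e.g. a phenolic OH) present as the anion,
    for an ideal group that titrates independently of all others."""
    return 1.0 / (1.0 + 10.0 ** (pK - pH))


NORMAL_PK = 10.0    # "normal" tyrosine pK of about 10, as quoted in the text
SHIFTED_PK = 12.5   # hypothetical pK for a buried tyrosine (assumption)

if __name__ == "__main__":
    for pH in (7.0, 10.0, 12.0, 13.0):
        normal = fraction_ionized(pH, NORMAL_PK)
        shifted = fraction_ionized(pH, SHIFTED_PK)
        print(f"pH {pH:4.1f}: normal {normal:.4f}, shifted {shifted:.4f}")
```

On this idealized picture a group with pK 10 is about 0.1% ionized at pH 7 and half ionized at pH 10, whereas a group whose pK has been shifted upward remains largely un-ionized until considerably higher pH.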
In certain cases, inaccessibility to reagents other than H+ and OH- may also be used to distinguish and to follow the fate of hydrophobic regions of protein molecules. For example, the three normal tyrosines of RNase may be iodinated under conditions which differentiate them from the three abnormal buried tyrosines (Cha and Scheraga, 1961). Another possible example involves methionine residues. It appears probable that the relatively nonpolar methionine residues will often be located in hydrophobic regions of protein molecules, as is explicitly demonstrated in the structure of the myoglobin molecule (Kendrew et al., 1961). It is known that the S of free methionine in aqueous solution reacts readily with iodoacetate to form a sulfonium salt and that the rate of the reaction is essentially independent of pH over a wide range (Gundlach et al., 1959a, b). The four methionine residues of native RNase at pH 5.5, on the other hand, do not react with iodoacetate; however, if the RNase is treated with iodoacetate at pH 2.8, or at pH 5.5 in the presence of 8 M urea, the methionine residues react extensively, and the molecule is irreversibly denatured (Stark et al., 1961). That is, its activity is then lost, and interestingly, its anomalous tyrosines titrate normally. These results clearly imply that the four methionine residues are buried in the hydrophobic regions of the native molecule, and react with iodoacetate only when they are exposed. This suggests a possibly general method of following the fate of hydrophobic regions of protein molecules dissolved in suitable nonaqueous solvents. Let us assume that, as in the case of RNase, at least some of the methionine residues of a native protein molecule in aqueous solution show little reactivity towards iodoacetate or iodoacetamide.
Then if the rates of reaction, under standardized conditions in the nonaqueous solvent, of the methionine S of the protein and of free methionine with these reagents are the same, it may be presumed that these groups on the protein are exposed in the solvent and that the hydrophobic regions are disrupted. The rates of reaction at the S of methionine can readily be followed by amino acid analyses (Gundlach et al., 1959a, b).

The ultraviolet spectra of many proteins exhibit small shifts of absorption maxima to shorter wavelengths and small decreases in extinction coefficient at these maxima when the native conformations are disrupted in aqueous solution (see Beaven and Holiday, 1952; Beaven, 1961; and the article by Wetlaufer in this volume). These effects are presumably due to the special environments in which chromophoric side chains, particularly those of tyrosine and tryptophan, find themselves within the native protein molecule as compared to the free chromophores in aqueous solutions. The special properties of these native environments have been attributed variously to hydrogen bonding, particularly of tyrosine residues (Scheraga and Laskowski, 1957), vicinal electrical effects, including ion-dipole and dipole-dipole interactions (Wetlaufer et al., 1958), and hydrophobic bonding (Yanari and Bovey, 1960). The last-named authors, by studies of the simple model compounds benzene, indole, and phenol in aqueous and hydrocarbon solvents, adduce evidence that these spectral effects can be associated with the polarizability of the medium surrounding the aromatic residues. On passing from isooctane to water, i.e., from the medium of larger polarizability (higher refractive index) to lower, the spectra of all three model compounds exhibit similar shifts to shorter wavelengths, which are closely analogous to those exhibited in the transformation of native to denatured proteins in aqueous solution. These spectral changes in proteins, Yanari and Bovey conclude, can therefore be adequately accounted for by hydrophobic bonding in the native proteins.

In nonaqueous solutions of proteins, the situation may be more complicated. For example, Yanari and Bovey found that relative to isooctane, ethanol and water have opposite effects on the spectra of phenol and indole. By the criterion of polarizability, however, ethanol should be intermediate between isooctane and water. It is suggested by these authors that differences in hydrogen bonding of ethanol and water to the model compounds are responsible. Since most of the nonaqueous solvents for proteins are hydrogen bond-forming agents, it is not clear what spectral effects to expect in any particular solvent.
In view of these results, two observations may be made: (a) as a corollary to studies of protein ultraviolet spectra in any particular nonaqueous solvent, the spectral properties of relevant simple compounds in that solvent must be investigated; and (b) any changes in protein spectra produced as a result of modification of the native protein conformation in a particular nonaqueous solvent must be superimposed on changes resulting simply from the replacement of the aqueous environment by the nonaqueous one of generally different polarizability and refractive index. In the extreme case, for example, it may make little or no difference spectrally whether the aromatic chromophores remain internally bound within the protein molecule, or whether they become exposed to the solvent, and hence no useful information about protein conformations can be expected. More studies have to be made to clarify to what extent spectral changes can be useful in the investigation of proteins in nonaqueous solvents.

Many other methods of studying protein structure in solution have been proposed and tested (Kauzmann, 1959). Many of these can be used to establish that structural changes in a protein molecule may have occurred in a particular solvent, but almost all suffer from their inability to discriminate unambiguously between effects produced within helical or hydrophobic regions of protein molecules. The elegant method of the rate of deuterium exchange of proteins in aqueous media (Linderstrøm-Lang,
1955), which might be extended with suitable modifications to nonaqueous solvents such as alcohols and glycols, also falls in this category. Taken together with the results of methods already discussed, however, such measurements might provide useful information in nonaqueous media.

The ultimate in structural studies would, of course, involve X-ray crystallographic studies of protein crystals prepared from nonaqueous solvents, of the kind that are now being so successfully carried out with certain protein crystals prepared from aqueous media (Kendrew et al., 1961). A priori, there is no reason to exclude the possibility that proteins might be crystallized from pure nonaqueous solvents, although no reports of such attempts have appeared.² This is particularly so in view of the fact that in certain pure solvents, proteins appear to exhibit a more highly ordered (helical) conformation than they do in water solution.

With these remarks we now proceed to discuss the experimental studies that have been made to date on the conformational properties of polypeptides and proteins in nonaqueous solvents.
2 In many instances (cf. Cohn et al., 1947; King et al., 1956) proteins have been crystallized from mixtures of water and organic solvents, but to our knowledge, in no case from pure nonaqueous solvents. In this connection, the use of solvent additives to induce the crystallization of proteins involves the not fully recognized hazard that appreciable conformational changes may be induced in the protein molecules in such solvents (see Section IV,E).

D. Polypeptides in Nonaqueous Solvents

Synthetic polypeptides provide very useful model systems for examining some of the problems of protein structure, as has already been discussed in Section IV,C,2. In the course of an important and extensive series of investigations of these compounds by Bamford, Elliott, and their co-workers (summarized in Bamford et al., 1956), it was shown that films of polypeptides cast from different solvents exhibited two different structures, designated α and β, which were distinguished by two different classes of infrared spectra (Table V). The α-form is now known to be associated with the α-helix, and the β-form with extended interchain structures. These results suggested that the α → β transformation in polypeptides could be achieved by appropriate solvents. This conclusion for these particular experiments, however, has been disputed by Blout and Asadourian (1956), who suggest that the polypeptide samples employed in these experiments were mixtures of α- and β-structures in the solid state, and that extraction of one or the other form by a particular solvent was involved in these film-casting studies, rather than a true α → β transformation.

An interesting study of the influence of solvent on the structure of poly-γ-benzyl-L-glutamate has been made in the course of a series of studies by Doty, Blout, and co-workers (Doty et al., 1956; Yang and Doty, 1957). The liquids tested as solvents for a high molecular weight (130,000) polypeptide sample could be placed into four categories: (1) nonsolvents; (2) solvents in which extensive aggregation of the polypeptide occurred; (3) solvents in which the polypeptide was molecularly dispersed and in a helical conformation¹; and (4) solvents in which the polypeptide was molecularly dispersed and in a disorganized, random-coil conformation. These liquids are listed in Table VIII. The criteria employed to distinguish between helical and random-coil structures were viscometry and anomalous rotatory dispersion.

TABLE VIII
Solution Properties of Poly-γ-benzyl-L-glutamate a,b

Nonsolvents        Aggregating solvents    Helix-forming solvents    Random coil-forming solvents
Formic acid        Benzene                 N,N-Dimethylformamide     Dichloroacetic acid
Acetic acid        CHCl₃                   m-Cresol                  Trifluoroacetic acid
Propionic acid     Dioxane                 Dioxane                   Hydrazine
Formamide          Ethylene dichloride     CHCl₃-formamide
Ethylenediamine                            Ethylene dichloride
CCl₄                                       Pyridine

a Polymer of 130,000 weight-average molecular weight.
b Data from Doty et al. (1956), and Yang and Doty (1957).

1 See the earlier footnote 1, on the conclusions of Luzzati et al. (1961).

The solvents which were found to induce the random-coil conformation are all strongly protic solvents. Their capacity to form strong solvent-solute hydrogen bonds is probably primarily responsible for the disruption of the helix. In this particular polypeptide no strongly acidic or basic side chains are present, and it is not likely that the weakly protic ester and amide groups enter significantly into acid-base reactions with these solvents. On the other hand, it would be gratifying to have unequivocal evidence that this is truly a negligible factor. The possibility should be kept in mind that in a strong acid such as trifluoroacetic acid a small but significant fraction of the amide groups might be protonated. While this might not result in serious electrostatic interactions in a medium of as low dielectric constant as that of trifluoroacetic acid, it might, by periodically disrupting intrapeptide hydrogen bonding along the helix, contribute substantially to the destabilization of the helix. (See below on the effect of helix length on stability.) Perhaps nuclear magnetic resonance studies
in some of these solvents would be useful in determining the state of the amide group.

The helix-forming solvents all appear to be relatively weaker hydrogen bond-forming agents than the random coil-forming solvents. The conclusion seems justified that this nonionic homogeneous polypeptide assumes a helical conformation if no unusually strong solvent-solute hydrogen bonds interfere. The same conclusion is forthcoming from studies of other nonionic polypeptides (Downie et al., 1957, 1959).

Two important qualifications of this conclusion must be made. While hydrogen bond interactions may be of primary importance in stabilizing the helical forms of such polypeptides, other factors contribute as well. This was particularly demonstrated by Fasman (1962). A number of different nonionic polypeptides in the helical form in chloroform solution were titrated with dichloroacetic or trifluoroacetic acids to determine the concentration of acid required to convert the polypeptide to the random-coil form, as judged by optical rotatory criteria. It was found that for the polypeptides poly-β-benzyl-L-aspartate, poly-ε-carbobenzyloxy-L-lysine, and poly-γ-benzyl-L-glutamate, about 7%, 38%, and 68% dichloroacetic acid was required, respectively. For poly-L-methionine, the helical form is partially stable even in 100% dichloroacetic acid, while poly-L-leucine is completely helical in 100% dichloroacetic acid, and requires a stronger acid, trifluoroacetic acid, to convert it to the random coil. These results reflect the contribution of regular homogeneous arrays of side chains to the stabilization of the helix, which must be included in the broad spectrum of effects termed lyophobic interactions. On the other hand, since the regularity and homogeneity of side chains is a feature of polypeptides not shared by proteins, it is not clear to what extent these results of Fasman's bear directly on the problem of lyophobic interactions in proteins.
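The titration results quoted above from Fasman (1962) amount to an ordering of the polypeptides by the resistance of their helices to acid. The short Python sketch below simply tabulates the three percentages given in the text and sorts them; the dictionary and function names are hypothetical, chosen for this illustration. Poly-L-methionine and poly-L-leucine are left out because the text gives no conversion percentage for them, only that their helices survive, partially or completely, in 100% dichloroacetic acid.

```python
# Percent dichloroacetic acid required to convert the helical polypeptide
# (initially in chloroform) to the random coil, as quoted in the text.
ACID_REQUIRED = {
    "poly-beta-benzyl-L-aspartate": 7,
    "poly-epsilon-carbobenzyloxy-L-lysine": 38,
    "poly-gamma-benzyl-L-glutamate": 68,
}


def rank_by_helix_stability(acid_required):
    """Return polypeptide names ordered from least to most acid-resistant helix."""
    return sorted(acid_required, key=acid_required.get)


if __name__ == "__main__":
    for name in rank_by_helix_stability(ACID_REQUIRED):
        print(f"{name}: about {ACID_REQUIRED[name]}% dichloroacetic acid")
```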
A second qualification is that the helical conformation in a given solvent is much less stable if the number of residues in the helix is less than a certain critical value, due to end effects (Schellman, 1955a, b; Blout and Asadourian, 1956; Goodman and Schmitt, 1959).

Doty et al. (1958) extended these observations by studying the influence of solvent on the conformation of an ampholytic polypeptide, copoly-L-lysine-L-glutamic acid. In trifluoroacetic acid, the C=O stretching frequency is 1635 cm⁻¹, and b₀ in Eq. (11) is zero. The polypeptide is clearly present in the random-coil conformation. In 2-chloroethanol, on the other hand, the C=O stretching frequency is 1655 cm⁻¹ (see Table V) and b₀ is -636°, and the polypeptide is in the helical conformation. Again, in trifluoroacetic acid, strong solvent-solute hydrogen bonds, particularly C=O···HOOCCF₃ bonds (see Section IV,B), most likely play a prominent role in destabilizing the helix. While the copolymer must be present
in this solvent with all the ε-NH₂ groups of the lysine residues converted to ε-NH₃⁺ groups, these should certainly be extensively ion-paired to trifluoroacetate counterions in a medium of as low dielectric constant as trifluoroacetic acid (see Section IV,B,1). Repulsive electrostatic interactions in this solvent must therefore be of negligible structural significance. The possible protonation of amide groups in this solvent, however, which was discussed above, should be noted again. In 2-chloroethanol solution, the C=O···HOCH₂CH₂Cl hydrogen bond is considerably weaker than the corresponding C=O···HOOCCF₃ bond, and most likely as a result of this, the polypeptide assumes the helical form.

In aqueous solutions of this copolymer, the conformation is markedly dependent on pH. In acid solution b₀ = -310; at pH 7 b₀ = -100; and at pH 12 b₀ = 0. It is evident that the change in electrostatic interactions with pH is critical in this solvent. In acid solution, where the polypeptide is about 50% helical, the electrostatic interaction of the ε-NH₃⁺ groups tends to disrupt the helix (poly-L-glutamic acid under these conditions is entirely helical). If the ionic strength is raised in the acid solution, the helical content is increased. At neutral pH, there is little net charge on the molecule since both COO⁻ and NH₃⁺ groups are present in roughly equal numbers, yet the helical content is only about 15%. In this case, the net increase in short-range electrostatic repulsions between adjacent like charges (the copolymer being of more-or-less random sequence), and perhaps the loss of side-chain carboxyl-carboxyl hydrogen bonds, probably serve to reduce the helical content below that found in acid solutions. At pH 12, the large net negative charge on the molecule acts to destabilize the helix.
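The helical contents quoted above follow from the procedure adopted earlier in this article, in which the helical fraction is estimated as -b₀/630. A minimal Python sketch of that arithmetic is given here; the function and variable names are assumptions made for the illustration, and the estimate carries all the qualifications already raised about the interpretation of b₀.

```python
B0_FULL_HELIX = -630.0  # reference b0 (degrees) taken for a fully helical chain


def helical_fraction(b0, b0_helix=B0_FULL_HELIX):
    """Estimate the fractional helical content as b0 / b0_helix, i.e. -b0/630."""
    return b0 / b0_helix


# b0 values quoted in the text for copoly-L-lysine-L-glutamic acid in water.
copolymer_b0 = {"acid solution": -310.0, "pH 7": -100.0, "pH 12": 0.0}
estimates = {cond: helical_fraction(b0) for cond, b0 in copolymer_b0.items()}
```

The acid-solution and pH 7 estimates come out near 0.49 and 0.16, in line with the "about 50%" and "about 15%" helical contents stated in the text.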
Of course, these electrostatic effects are critical because of the large dielectric constant of water and because the electrostatic factors are superimposed on the low intrinsic stability of intrapeptide hydrogen bonds in aqueous solution (Section IV,B,2).

In these relatively simple polypeptide systems, therefore, the experimental studies so far performed suggest at least a crude correlation between the capacity of a solvent to disrupt polypeptide helices and the capacity of a solvent to form strong hydrogen bonds, although in special cases electrostatic and lyophobic interactions also appear to be involved. There is little apparent correlation with any other independent property of the solvents.

The polypeptide poly-L-proline is an interesting and special case because of the exclusively imino peptide linkages of the polymer. The properties of this macromolecule and its high and low molecular weight analogs are discussed in the review of Harrington and von Hippel (1961), and will only briefly be mentioned here. The polymer can exist in two forms of markedly different optical rotatory properties, and can be converted from one form
to the other, depending on the solvent. In acidic solvents such as acetic or formic acids, form II of the polypeptide exists, and if these solutions are diluted with propanol, form I is produced. The former structure is thought to be a left-handed helix with all trans-imide linkages, and the latter a right-handed helix with all cis-imide linkages. For the purpose of this review, there are several significant features of these experiments to be noted. Poly-L-proline does not have any intrapeptide hydrogen bonds because of the absence of peptide H-atoms, yet it exists in apparently only two principal conformations in solution, both highly ordered. Furthermore, the stability of these two conformations is dependent on the solvent, and hence on solute-solvent interactions. The fact that form II is stable in acidic solvents suggests that strong hydrogen bonding between the peptide C=O groups and the OH groups of the solvent plays a critical role in stabilizing this conformation, and that the weaker hydrogen bonds formed between the C=O groups and alcohol solvents are inadequate to overcome the effects of solute-solute (lyophobic) interactions favoring form I. [That solute-solute interactions are more prominent in form I is consistent with form I having a shorter helical pitch and lower asymmetry, hence greater compactness, than form II (Harrington and von Hippel, 1961).] Another interesting feature is that the interconversion of the two forms of poly-L-proline is catalyzed in both directions by small amounts of strong acid (Steinberg et al., 1960). These authors provide strong evidence that the mechanism of interconversion involves proton binding at the imide linkages of the polymer, thus facilitating cis-trans isomerization about the imide linkage. This is of particular pertinence in considering the effects of strongly protic solvents on proteins containing proline imino peptide linkages.
Much remains to be done with synthetic polypeptide systems in nonaqueous solvents, particularly weakly protic ones. If indeed the helical conformation of a polypeptide (other than poly-L-proline) in solution is mainly stabilized by intrapeptide hydrogen bonds, this can be used to arrange different nonaqueous solvents in the order of their effectiveness as hydrogen bond-forming agents towards the amide group. A solution of poly-β-benzyl-L-aspartate in the helical configuration in chloroform, for example, could be titrated to determine the volume or mole fractions of different weakly protic nonaqueous solvents required to disrupt the helix. This information should correlate with independent studies of the hydrogen bonding of simple amides in the pure nonaqueous solvents. Such differentiation of the effective hydrogen bond-forming capacity of nonaqueous solvents would be very valuable in quantitative studies of helix-coil transformations in polypeptides and in the interpretation of the effects
of these solvents on protein conformations as is discussed in the following section.
E. Proteins in Nonaqueous Solvents

1. Weakly Protic Solvents
Although chronologically, structural studies of proteins in strongly protic solvents preceded those in weakly protic ones, it is advantageous to discuss the latter studies first. Yang and Doty (1957) were the first to obtain clear evidence for conformational changes in protein molecules in solvent mixtures rich in weakly protic liquids. By the criterion of anomalous optical rotatory dispersion, they showed that silk fibroin is extensively helical in a solvent mixture containing 15% (by volume) of dichloroacetic acid and 85% ethylene dichloride, whereas it is considerably less helical in a concentrated aqueous LiBr solution. Sage and Singer (1958) found that the three anomalous tyrosine residues of RNase titrated normally in pure ethylene glycol solution, which suggested that the hydrophobic regions of the molecule present in aqueous solution were disrupted in the nonaqueous solvent (Sage and Singer, 1962).

More recently, Doty and co-workers (Doty, 1959) have found that a large number of globular proteins are directly soluble in the solvent 2-chloroethanol. While the apparently unique solubilizing power of this substance is probably attributable to the HCl present in the unstable solvent, some very significant measurements of the optical rotatory dispersions of proteins in a "pure" weakly protic nonaqueous solvent were made as a result of this discovery. In all cases studied, proteins exhibited larger, often substantially larger, values of -b₀ in 2-chloroethanol than in H₂O (Table VII). These effects are reversible with change of solvent. Increased helical content in proteins dissolved in such solvents is probably connected with (a) decreased hydrogen-bonding capacity of the solvent compared to water, and (b) decreased electrostatic repulsive interactions between the fixed charges on the protein molecule in the low dielectric constant solvent as compared to water, due primarily to counterion binding.
The suggestion that intrapeptide hydrogen bonding is stronger in 2-chloroethanol than in water is consistent with the arguments advanced in Section IV,B,2; with the related experimental results obtained with the copolymer copoly-L-lysine-L-glutamic acid in H₂O and in 2-chloroethanol (Doty et al., 1958) described in Section IV,D; and with results in other solvents discussed later in this section. Nevertheless, despite intrinsically weaker intrapeptide hydrogen bonding (and presumably also weaker side-chain hydrogen bonding) in water
solutions than in 2-chloroethanol, and despite increased electrostatic repulsive interactions, globular proteins, instead of assuming a random-coil conformation in water solution, exhibit highly folded, compact, and relatively stable structures in H₂O. It appears necessary to conclude that some interactions other than, and in addition to, hydrogen bonding must be involved in stabilizing the native conformations. The interactions involved must be hydrophobic interactions. In the absence of the results in 2-chloroethanol, it might have been argued that hydrogen bonding in all its various forms is the major kind of interaction responsible for stabilizing the native conformation of globular proteins, but that the low apparent helical content of some of these proteins in H₂O is the result of either of two possibilities: (a) the presence of comparable amounts of both left- and right-handed helices; or (b) the presence of a preponderance of interpeptide (and therefore not helix-forming) hydrogen bonds as compared to intrapeptide ones. If either of these possibilities were actually realized, however, it would be difficult to explain why a nonaqueous solvent such as 2-chloroethanol should so profoundly favor the formation of right-handed α-helices, since, other things being equal, there is no reason why the solvent should discriminate so markedly between one kind of N-H···O=C hydrogen bond and another. Reference to Table VII indicates that with some globular proteins as much as a fourfold increase in -b₀/630, and hence by inference in net right-handed helical content, is found in 2-chloroethanol as compared to H₂O. It follows then that hydrophobic interactions must play a vital role in stabilizing the native configurations of many proteins in aqueous solution.
Corresponding increases in -b₀ are observed upon the addition of similar amounts of other weakly protic, hydrocarbon-miscible, nonaqueous solvents to protein solutions in H₂O (Tanford et al., 1960; Tanford and De, 1961). Among the solvents so far examined which exhibit this effect are dioxane, ethanol, dimethylformamide, N-methylpropionamide, and 1-propanol. There is therefore nothing unique about 2-chloroethanol in this respect except its solubilizing capacity for proteins (Section III,C), which is not nearly as extensively exhibited by the other pure solvents. An example of the studies of Tanford and co-workers is given in Fig. 4. The addition of a number of nonaqueous solvents to an aqueous solution of β-lactoglobulin at pH 3.0, ionic strength 0.02, results in a gradual increase in -b₀; the parameters -[m'] and -a₀ in Eq. (13), however, first show an increase and then a decrease as the solvent mixture is enriched in the nonaqueous component. The maximum values of -[m'] and -a₀ occur at a solvent composition at which only a small change in b₀ is found. These results suggest that at least two successive conformational changes are produced on the addition of the nonaqueous solvent to the aqueous
solution. Depicted in Fig. 5, from Weber and Tanford (1959), is one possible schematic representation of these changes. The first change might involve an unfolding of the protein molecule with little change in helical content; the second might then involve a refolding of the molecule into right-handed α-helical regions. Related experimental results, except for the absence of optical rotatory dispersion data, had been obtained earlier by Bresler and Frenkel (Bresler, 1958) upon addition of dioxane and chloroethanol to aqueous solutions of bovine serum albumin.

FIG. 4. The effect of solvent additives on the optical rotatory properties of β-lactoglobulin at pH 3, ionic strength 0.02. Abscissa: ml organic solvent/100 ml solution. D = dioxane, E = ethanol, F = formamide, G = ethylene glycol, U = urea. For the last additive, the abscissa is given in grams/100 ml of solution (Tanford et al., 1960, 1962; Tanford and De, 1961).

It is particularly interesting that the solvents used in these studies are not greatly differentiated by their capacities to alter the conformation of native proteins, and that among them is the solvent dioxane. For reasons expounded in Section IV,B,2, it is fairly certain that intrapeptide hydrogen bonds are stronger in dioxane than in water solution. The transformation from the native conformation to the unfolded state (Fig. 5A → B) cannot therefore be caused by the rupture of intrapeptide hydrogen bonds. It is
much more reasonable to assign primary responsibility for the transformation to a reduction in lyophobic interactions in the mixed solvent as compared to pure HzO (hydrocarbons being more soluble in the former than in the latter), accompanied by only a slight reduction in the hydrogenbonding capacity of pure HzO. The subsequent transformation from the unfolded to the extensively helical conformation (Fig. 5B -+C) might then be explained as primarily the result of substantially further reducing the hydrogen-bonding capacity of the solvent so as to permit much more extensive intrapeptide hydrogen bonding to take place. On the other hand, it is possible that changes in electrostatic interactions are significantly involved in the conformational transitions that have just
FIG. 5. Schematic representations of a protein molecule in several possible conformational states: (A) its native conformation in aqueous solution; (B) an unfolded conformation retaining the net helical content of the native form but with hydrophobic regions disrupted; and (C) an extensively helical conformation.
been described. These experiments of Tanford and his co-workers were performed with protein molecules carrying their maximum positive charge, in order to maintain the value of Z constant upon addition of a weakly protic solvent. The addition of a nonaqueous solvent such as dimethylformamide to H2O lowers the dielectric constant of the solvent. For reasons discussed at length in Section IV,B,1, the electrostatic free energy, F_e [Eq. (4)], of the native aqueous conformation of a protein molecule should, at constant Z, first increase and then decrease as the concentration of the nonaqueous component of the solvent mixture is increased (Fig. 2). This should favor the formation of an unfolded structure for a protein molecule in mixtures of a nonaqueous solvent and H2O as compared to the two pure solvents. Such unfolding might therefore be induced in a protein molecule in a mixed solvent if the molecule were already on the verge
S. J. SINGER
of becoming unfolded in H2O, as is the case with many proteins bearing large net charges in aqueous solution. The value of D corresponding to the expected maximum value of F_e is difficult to predict, particularly in a mixed solvent, in which, because of the unequal interactions of the solvent components with the protein molecule, the macroscopic value of D and its effective value might differ considerably. It is interesting in this connection that in a mixture of H2O and 2-chloroethanol containing 80 mole % H2O, in which an unfolded state of the protein RNase was detected by viscosity measurements (Weber and Tanford, 1959), a tenfold change in the ionic strength of the solvent mixture has a profound effect on the intrinsic viscosity of the protein, whereas ionic strength has no such effect in either of the pure solvents. At the highest ionic strength employed (0.2 M), the intrinsic viscosity is no larger than that in pure 2-chloroethanol. This dependence on ionic strength might be explained (Weber and Tanford, 1959) as a polyelectrolyte effect, i.e., the protein molecule in the 80 mole % H2O solvent mixture may remain in an unfolded state at all ionic strengths, but the molecular domain occupied by the unfolded chains may increase with decreasing ionic strength as a result of electrostatic repulsions. On the other hand, the dependence of the intrinsic viscosity on ionic strength might also be explained as reflecting the existence of a folded conformation on the verge of destabilization, resulting in the partial conversion from the folded to an unfolded conformation with decrease in ionic strength. More information is required to distinguish between these possibilities. Furthermore, electrostatic interactions may be involved in the transformation of the unfolded protein to the highly helical conformation (Fig. 5B → C) in solvents rich in the nonaqueous component.
In such solvents, the effective dielectric constant is probably small enough to induce extensive ion-pairing, thereby substantially reducing the electrostatic free energy contribution to the helical conformation below that existing in water, and making a more extensively helical conformation more favorable. Further studies are required to evaluate the relative roles played by changes in lyophobic, hydrogen bonding, and electrostatic interactions in these transformations.
A solvent which has been found to be of great interest in connection with protein conformation studies is ethylene glycol. Sage and Singer (1958, 1962) have investigated in some detail the properties of RNase in pure ethylene glycol, containing added neutral electrolyte. They examined the ultraviolet absorption spectrum, the ionization behavior of the tyrosine residues by spectrophotometric titration experiments, and the optical rotatory dispersion of the system. These authors found that the hyperchromicity which characterizes the
three anomalous tyrosines of the RNase molecule in water (Tanford et al., 1955a) is lost in ethylene glycol solution. The data in Table IX show that at the absorption maxima near 280 mμ, the ratio of molar extinction coefficients of RNase and tyrosine ethyl ester is 6.1 in ethylene glycol, whereas it is 6.9 in water. While part of this reduction might be due solely to solvent effects independent of conformational considerations (see Section IV,C,3), the essentially complete loss of hyperchromicity nevertheless suggests that the three anomalous tyrosines of the RNase molecule, which are considered to be buried in hydrophobic regions of the molecule in aqueous solution (Tanford et al., 1955a), are exposed to the solvent in ethylene glycol.

TABLE IX
Ultraviolet Spectral Properties of Tyrosine Ethyl Ester and Ribonuclease (a)

Compound               Solvent                      Absorption maximum (mμ)   Molar extinction coefficient
Tyrosine ethyl ester   0.2 M KCl-water              275                       1400
                       0.2 M KCl-ethylene glycol    277                       1850
Ribonuclease           0.2 M KCl-water              277                       9700
                       0.2 M KCl-ethylene glycol    275                       11400

(a) Sage and Singer (1962).

Furthermore, the spectrophotometric titration of the tyrosine residues of RNase in ethylene glycol, carried out with a glass electrode, showed that all six tyrosines titrate on a normal sigmoid curve
and reversibly in this solvent whereas they titrate with an inflected curve and irreversibly in aqueous solution (Fig. 6). This also indicates that the tyrosine residues are accessible to solvent and that the hydrophobic regions of the protein molecule present in water are disrupted in ethylene glycol. The value of -b_0 in Eq. (13), however, was found to be 92°, close to the value 105° for neutral aqueous solutions of the protein (Weber and Tanford, 1959). This suggests that the net helical content of the molecule is not significantly changed on transfer from water to ethylene glycol solution, which is in marked contrast to the situation in pure 2-chloroethanol, for example (Doty, 1959). Taken together these results can be interpreted to indicate that the RNase molecule in pure ethylene glycol is in the intermediate conformational state depicted in Fig. 5B. Presumably, the hydrogen-bonding capacity of ethylene glycol and of water, with respect to the peptide group, is sufficiently similar so that intrapeptide hydrogen bonding occurs to about the same extent in the two solvents. This does not necessarily mean,
however, that the helical regions of the RNase molecule which are present in water solution persist in ethylene glycol; helices originally present in hydrophobic regions of the molecule (see Section IV,B,5) in aqueous solution might become disrupted and new helical regions might be formed to an equivalent extent in ethylene glycol. On the other hand, that lyophobic interactions with the nonpolar residues of the protein are weaker in ethylene
FIG. 6. Spectrophotometric titrations at 296 mμ of RNase in 0.2 M KCl in H2O (dashed curve) and in KCl in ethylene glycol (solid curve and data). The upper mv and pH scales refer to the H2O titration and the lower scale (millivolts) to the ethylene glycol titration. Forward and back titration data in ethylene glycol show that the titration curve is reversible, unlike that in H2O. (Sage and Singer, 1958, 1962.)
glycol than in water is reasonable in view of the greater solubility of hydrocarbons in ethylene glycol than in water (Curme and Johnston, 1952) (see Table IV). While such lyophobic interactions must be weaker in ethylene glycol solution than in water, they should be considerably stronger in ethylene glycol, which is still only slightly miscible with hydrocarbons, than in nonaqueous solvents such as ethanol or dioxane, which are completely miscible with hydrocarbons at 25°C (see Section IV,B,3). It is therefore
of great interest that Tanford et al. (1962) have recently found with mixtures of ethylene glycol and water, that a very large volume fraction of ethylene glycol is required to effect any conformational changes in β-lactoglobulin; a much larger volume fraction than is required of any other weakly protic nonaqueous solvent examined (Fig. 4). This is completely in accord with the hypothesis that hydrophobic interactions are importantly involved in maintaining the native conformation of β-lactoglobulin in water. Formamide is another nonaqueous solvent only partially miscible with hydrocarbons. It is therefore apparently contradictory to the hypothesis just stated, that formamide was found to be only slightly less effective than ethanol or dioxane as a denaturant for β-lactoglobulin (Tanford and De, 1961) (Fig. 4). However, formamide has an unusually large dielectric constant, and it is possible that the low ionic strength (0.02 M) employed in these studies was inadequate to minimize electrostatic repulsions within the highly charged β-lactoglobulin molecule. At larger ionic strengths, a more profound difference between formamide and hydrocarbon-miscible nonaqueous solvents might be observed in such experiments (Herskovits, 1962). The effectiveness of urea as a denaturant (Fig. 4) may be mainly related to its strong hydrogen bond-forming capacity; on the other hand, there is evidence available to suggest that urea also acts to weaken hydrophobic interactions (Kauzmann, 1959; Bruning and Holtzer, 1961) and the effectiveness of urea may therefore be due to a combination of these two factors. In view of these results, it should be profitable to investigate a number of related polyhydroxy solvents other than ethylene glycol as protein solvents.
For example, if lyophobic interactions are so important to the maintenance of the native conformations of proteins, the efficacy as protein denaturants of the following three solvents should be in the order propylene glycol > ethylene glycol > glycerol, since this is the order of their decreasing capacity to dissolve hydrocarbons (Table IV). The capacities of these compounds to alter the extent of intrapeptide hydrogen bonding per se, however, should be very similar to one another and to water. Anhydrous mixtures of ethylene glycol or glycerol with still more highly polyhydric alcohols, such as pentaerythritol, might have even less tendency to alter the native conformations of protein molecules.

2. Strongly Protic Solvents
There is ample evidence that in strongly protic solvents, proteins generally exist in an unfolded configuration, resembling roughly the random-coil form of polypeptides. Thus, Ambrose and Elliott (1951) showed that insulin treated with formic acid and cast as a film exhibits a shift in the maximum of the C=O infrared stretching frequency from 1657 cm⁻¹ for
the native protein to about 1637 cm⁻¹ (see Section IV,C,2). Similarly, films of silk fibroin cast from dichloroacetic acid solution show a C=O peak at 1630 cm⁻¹ whereas with films cast from concentrated aqueous LiBr solutions, the peak is at 1660 cm⁻¹ (Ambrose et al., 1951). Katz (1955) found that solutions of globular proteins dissolved in anhydrous SO2 (containing 2.5 M KI) exhibit a shift of the 1550 cm⁻¹ NH deformation band of the native proteins to the vicinity of 1515 cm⁻¹.

TABLE X
Ultracentrifugal Properties of Proteins in Strongly Protic Solvents (a)

                                                        Sedimentation constants (b)
Protein                Nonaqueous solvent               Nonaqueous solvent   Water
Zein                   Ethylenediamine                  1.1                  2.2 (c)
                       0.05 M NaNO3-ethylenediamine     1.2
β-Lactoglobulin        Ethylenediamine                  1.1                  3.4
                       0.2 M NaCl-hydrazine             1.0
Bovine serum albumin   Ethylenediamine                  4.8, 1.? (d)         4.9
                       0.2 M NaCl-hydrazine             1.6
Fibrin                 Ethylenediamine                  1.9                  9.0 (e)
Bovine γ-globulin      0.2 M NaCl-hydrazine             1.6 (f)              8.0

(a) Rees and Singer (1956).
(b) Corrected to the viscosity and density of water at 25°C.
(c) As determined in 60% ethanol.
(d) Two peaks in sedimentation pattern.
(e) Sedimentation constant for fibrinogen.
(f) Protein underwent decomposition with time.

These shifts in infrared spectra are probably all to be interpreted (see Table V) as having
resulted from the disruption of intrapeptide hydrogen bonds and their replacement by hydrogen bonds of other types. Rees and Singer (1956) found that a variety of proteins in solution in ethylenediamine or hydrazine exhibited single well-defined peaks in the ultracentrifuge, but with markedly reduced corrected sedimentation coefficients (Table X), and, in some cases, considerably increased specific viscosities, compared with those in their respective aqueous solutions. These hydrodynamic properties are to be expected with unfolded macromolecules in solution. In terms of the discussion previously presented, these results may be
rationalized as follows. For strongly protic solvents, three factors may be involved in the disruption of the native structures of protein molecules and their conversion to highly unfolded conformations:
(a) The strong hydrogen bonds formed between the solvent and the protein solute, particularly between the solvent as donor and the amide C=O groups of the protein as acceptor. This factor has already been discussed in connection with synthetic polypeptides, Section IV,D.
(b) A marked decrease, compared to water, in lyophobic interactions towards the nonpolar residues of the protein. Of the common strongly protic liquids, only hydrofluoric acid, formic acid, and hydrazine are not completely miscible with simple hydrocarbons, while the others, such as ethylenediamine, dichloroacetic and trifluoroacetic acids, are completely miscible with them. Even in the cases of hydrofluoric acid, formic acid, and hydrazine, the solubility of hydrocarbons is much greater than in water. Thus, a saturated solution of benzene in formic acid at 25°C contains 0.088 mole fraction of benzene, compared to 0.00035 mole fraction of benzene in water (see Section IV,B,3 and Table IV).
(c) An increase, compared to water, in intramolecular electrostatic interactions within protein molecules in the case of strongly protic solvents of high dielectric constant (> ~40), such as hydrofluoric acid, formic acid, and hydrazine. Protein molecules are highly ionized as a result of the abstraction of protons from the acidic solvent or the donation of protons to the basic solvent. On the other hand, in strongly protic solvents of low dielectric constant (< ~15), such as dichloroacetic and trifluoroacetic acids and ethylenediamine, these ionic charges on protein molecules must certainly be extensively paired to solvent counterions (see Section IV,B,1), and electrostatic repulsive interactions must be of negligible importance in destabilizing the native conformation.
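As a rough illustration of factor (b), the two benzene solubilities quoted above can be converted into a solubility ratio and an approximate free energy of transfer. The sketch below is not taken from the original treatment: it assumes that both saturated solutions are in equilibrium with the same pure liquid benzene and that the solutions behave ideally, which is a crude approximation at 0.088 mole fraction. The function name and the choice of units are illustrative.

```python
import math

R_CAL_PER_MOL_K = 1.987  # gas constant in cal/(mol K)


def transfer_free_energy(x_water, x_solvent, temp_k=298.15):
    """Approximate free energy (cal/mole) of transferring a solute from
    water to another solvent, estimated from saturation mole fractions.

    Assumption: ideal solutions, each saturated against the same pure
    solute phase, so that the standard free energy difference is
    -RT ln(x_solvent / x_water).
    """
    return -R_CAL_PER_MOL_K * temp_k * math.log(x_solvent / x_water)


# Saturation mole fractions of benzene at 25 C, as quoted in the text.
x_benzene_in_water = 0.00035
x_benzene_in_formic_acid = 0.088

solubility_ratio = x_benzene_in_formic_acid / x_benzene_in_water
delta_f = transfer_free_energy(x_benzene_in_water, x_benzene_in_formic_acid)

print(f"solubility ratio (formic acid / water): {solubility_ratio:.0f}")
print(f"approximate transfer free energy: {delta_f / 1000:.1f} kcal/mole")
```

Under these assumptions benzene is roughly 250 times more soluble in formic acid than in water, which corresponds to a transfer free energy of about -3 kcal/mole per benzene molecule moved out of water.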
It has often been stated that the capacity of strongly protic solvents to disrupt native protein conformations is attributable to strong solvent-solute hydrogen bonding. It is clear, however, that this is an inadequate description of the situation. In the solvent formic acid, for example, any one of the three factors just mentioned might contribute enough negative free energy to the unfolded conformation of the protein molecule to cause, by itself, the disruption of the native conformation. It is meaningless, therefore, to attribute the disruption to any single one of these factors. Again, the decrease in lyophobic interactions in trifluoroacetic acid must be of comparable magnitude to their decrease in ethanol, since both solvents are completely miscible with simple hydrocarbons. If this factor makes an apparently sufficiently large free energy contribution to disrupt native protein conformations in ethanol solutions it must, by itself, be capable of doing the same in trifluoroacetic acid. On the other
hand, since trifluoroacetic acid is such an effective helix-disrupting solvent for synthetic polypeptides (see Section IV,D), it is also quite likely that the formation of strong solvent-solute hydrogen bonds in this solvent by itself also makes a sufficiently large free energy contribution to disorder native protein conformations. From these considerations, it is evident that studies of protein structure in strongly protic solvents can reveal little if any additional information beyond what has been, and is to be, achieved in weakly protic solvents. Aside from the additional complications introduced by extensive acid-base reactions in such solvents, they are such powerful protein denaturants as to be indistinguishably effective in this property.

3. Reversibility of Conformational Changes
It has been demonstrated in preceding sections that almost all nonaqueous solvents for proteins can induce one or more types of conformational changes in protein molecules. For several reasons it is important to inquire whether and to what extent these changes are reversible. In the first place, it has been tacitly assumed in previous discussion that a particular nonaqueous solvent is chemically inert towards protein molecules, except for proton exchanges, and that its effects on protein structure involve changes solely in the noncovalent bonding in the protein molecule. It has already been pointed out, however, in Section II,A, that this assumption may not be fulfilled in particular cases, and a demonstration of reversibility of solvent effects is required to justify it. In the second place, the complete reversibility of conformational changes may be very important in connection with several practical questions, such as the usefulness of nonaqueous solvents in protein extraction and isolation procedures, and the preparation of chemically modified proteins by reactions carried out in nonaqueous solvents. The latter problem is discussed in more detail in Section V,E. Furthermore, the problem of reversibility has its own intrinsic interest in pursuing the question of whether the native conformation of a protein is thermodynamically or kinetically stable: is the native conformation in aqueous solution completely determined by its primary valence structure, or is it a metastable one “frozen in” during the process of biosynthesis (cf. Anfinsen, 1961)? The criteria used to assess reversibility must be adequately sensitive for the purposes at hand, particularly if it is necessary to detect subtle irreversible changes in the native conformation. This is a subject itself worthy of extensive treatment. It is, however, not feasible to discuss it at length in the context of this article.
Suffice it to say that it is desirable to examine the conformational state of the recovered protein by several methods rather than by only one; that among the more sensitive criteria
on the physical side are optical rotatory properties, ultraviolet absorption spectra, and their susceptibility to change from stresses such as temperature, pH, and solvent changes; and on the chemical side, the enzymatic and antigenic properties of the protein. The last property, although rarely employed, is particularly useful because of its generality and high degree of sensitivity to conformational changes in proteins. Consideration should also be given to the procedure employed to reverse conformational changes induced in nonaqueous solvents. It is conceivable, for example, that although the recovery of the native conformation of a protein may be favored thermodynamically, the reversal procedure may introduce kinetic barriers to its attainment. That is, a conformation other than the native one may be made metastable in the process of recovery. The reversal procedure must therefore be gradual enough to permit equilibrium to be attained at each stage. Furthermore, different proteins may differ considerably in the degree to which their conformations can be reversibly altered. It has been pointed out (Foss and Schellman, 1959), for example, that the apparent stability of ribonuclease to various treatments is not due to any inherent stability of its native conformation, but rather to its extraordinary degree of reversibility. It has generally been found in those systems investigated so far that the gross conformational changes induced in proteins by weakly protic solvents are at least largely reversible (Sage and Singer, 1958; Doty, 1959; Weber and Tanford, 1959). Inagami and Sturtevant (in Rees and Singer, 1956) showed that trypsin could be recovered from solution in formamide with retention of about 95% of its enzymatic activity, while from dimethylsulfoxide solution, about 93% was retained in the presence of 0.05 M CaCl2 (Inagami and Sturtevant, 1960). These observations were essentially confirmed and extended by Vratsanos et al. (1958).
Following up these studies, Fleck and Singer (unpublished experiments) found that solutions of trypsin could be obtained in a number of weakly protic solvents by a dialysis procedure (see Section III,C), and the activities recoverable from these solvents into aqueous solution are shown in Table XI. No attempt was made to maximize these recoveries. Sage and Singer (1958, 1962) showed that ribonuclease could not only be recovered from neutral ethylene glycol into aqueous solution with essentially full retention of enzymatic activity, but that this was so even after all six of its tyrosine residues had been converted to the phenoxide ion form in ethylene glycol. This is in contrast to the situation in water solutions of this protein, in which the titration of more than three of the six tyrosines results in an essentially instantaneous irreversible loss of enzymatic activity (Sela and Anfinsen, 1957). This suggests the interesting possibility that the irreversible transition that occurs in aqueous solutions
at strongly alkaline pH is due to some scission or rearrangement of disulfide or other primary valence bonds (Ryle and Sanger, 1955; Zahn and Osterloh, 1955) within the RNase molecule which is retarded in ethylene glycol solution. This serves to emphasize the possibility that has been raised frequently in the past, that irreversible denaturation of proteins in aqueous media may often be the direct result of changes in primary covalent structure in the protein molecule, and not of irreversible changes in noncovalent binding and conformation. In many weakly protic nonaqueous solvents, conformational changes can be induced under conditions that are milder and less likely to produce covalent bond alterations than the conditions required to effect similar conformational changes in aqueous media.

TABLE XI
Recovery of Enzymatic Activity of Trypsin from Various Nonaqueous Solvents into Aqueous Media (a)

Solvent             Per cent activity recovered
Methanol            72
Ethanol             94
Ethylene glycol     82
Propylene glycol    84
Glycolonitrile      72
Hydrazine           0

(a) G. Fleck and S. J. Singer, unpublished experiments. Enzymatic activity was measured by the rate of alkali consumption by the hydrolysis of benzoylarginine ethyl ester at pH 8 in a phosphate-NaCl buffer (Inagami and Sturtevant, 1960).

With proteins dissolved in strongly protic solvents, studies of reversibility have been less extensive, but it has been found that the conformational changes are often apparently not reversible. Thus about half of the proteins recovered from solution in HF were insoluble in water (Katz, 1954a). Rees and Singer (1956) found that bovine serum albumin in hydrazine and ethylenediamine, and γ-globulin in hydrazine, could not be recovered in soluble form into aqueous solution. Furthermore, evidence was obtained that degradation of the γ-globulin molecule occurred in the hydrazine solutions over a period of hours. These observations taken together suggest that, whatever may be the detailed reasons for the observed irreversibility in strongly protic solvents, the possibility of covalent bond alterations must be seriously considered. Much work remains to be done in correlating the extent and the nature of conformational changes produced in proteins dissolved in chemically inert nonaqueous solvents and the precise degree to which these changes are reversible. In this connection the distinction must be made, of course, between the reversibility of conformational changes within covalently bound subunits of protein molecules, and the reversibility of the structural
integration of the subunits into the whole protein molecule. This question is taken up in Section V,A.
F. Nucleic Acids in Nonaqueous Solvents

In recent years, increasing interest has been attached to the properties of nucleic acids in nonaqueous solvents. It is outside the purview of this article to review all the studies that have so far been carried out with such systems; certain observations have been made, however, which bear directly on the interpretation of the behavior of proteins in nonaqueous solvents, and these will be discussed. From the viewpoint of protein structure and behavior, nucleic acids are simpler molecules to deal with, since their subunits are fewer in number, and at least in the case of the deoxyribonucleic acids (DNA), their structures are comparatively regular and uniform.

TABLE XII
Properties of Salmon DNA in Water, Ethanol, and Methanol (a)

                                     DNA sample C                                      DNA sample H-II
Property                             0.2 M NaCl-H2O   Ethanol    Redialyzed (b)        0.2 M NaCl-H2O   Methanol     Redialyzed (c)
                                                                 0.2 M NaCl-H2O                                      0.2 M NaCl-H2O
Molecular weight × 10⁻⁶              5.9              6.0-7.0    5.7                   7.0              8.2          7.0
Radius of gyration, Å                2700             680-900    1780                  2900             1560         2250
Sedimentation constant, s20,w        21               90-100     -                     26               41, 40       -
Intrinsic viscosity, dl/gm           57               1.3-1.5    43                    68               24           65
Extinction coefficient per mole P    6550             6300       9020                  7070             9350 ± 270   -

(a) Herskovits et al. (1961).
(b) Dialyzed into 99.5% ethanol and back into aqueous salt solution.
(c) Dialyzed into 99.5% methanol and back into aqueous salt solution.

Of particular interest is a series of studies (Geiduschek and Gray, 1956; Herskovits et al., 1961; Geiduschek and Herskovits, 1961; Herskovits, 1962) concerning the effect of different solvents on the structural properties of DNA. In ethanol and methanol solutions, the helical conformation of DNA is disrupted, as is indicated by gross changes in ultraviolet absorption, light scattering, and hydrodynamic properties of the solutions compared to those in water (Table XII). Under the proper conditions these changes are substantially reversible. Conductivity measurements
(Coates and Jordan, 1960; Herskovits et al., 1961) suggest that the phosphate groups of the DNA molecule are extensively paired to counterions in methanol solution, such that a²/D (Section IV,B,1) is only roughly one-sixth of its value in water. This applies to the disrupted conformation of the DNA molecule as it exists in methanol solutions, but it may be assumed that a²/D is not greatly different for the hypothetical native helical conformation in methanol. It can be concluded, therefore, that there is a net reduction in electrostatic repulsive interactions, and in electrostatic free energy, for the native conformation of DNA in methanol compared to water. Similar considerations apply in ethanol, whose dielectric constant is still smaller than that of methanol. Therefore, electrostatic factors alone would tend to stabilize the helical conformation in these nonaqueous solvents, and in spite of this, the structure is disrupted. To what factors, then, can this destabilization be attributed? It is widely held that the most important intramolecular interactions stabilizing the Watson-Crick twin-helical conformation of the DNA molecule are the hydrogen bonds formed between purine-pyrimidine base pairs. If this were so, however, it is difficult to explain why the helix should be disrupted in ethanol and methanol solutions and be stable in water. We have discussed in Section IV,E the observations (Doty, 1959; Tanford et al., 1960) that indicate that intramolecular hydrogen bonding in protein molecules is considerably more favored in 2-chloroethanol and ethanol as compared to water. Such solvents should similarly substantially promote internucleotide hydrogen bonding compared to water. [There is much other evidence from a variety of experiments in aqueous DNA solutions which is also difficult to reconcile with the hypothesis that the helical conformation is primarily stabilized by internal hydrogen bonding (Sturtevant et al., 1958).]
On the other hand, as has been pointed out in previous sections, lyophobic interactions towards nonpolar residues are substantially reduced in ethanol, methanol, and many other nonaqueous solvents, below their strength in water. The increase in free energy of the helical conformation in methanol compared to water which results from this reduction in hydrophobic interactions is apparently sufficiently great to overcome the free energy decreases resulting from changes in hydrogen bonding and electrostatic interactions in methanol compared to water. It follows (Herskovits et al., 1961) that hydrophobic interactions must be vital to the maintenance of the native conformation of DNA in aqueous solution. It is important to realize that the Watson-Crick twin-helical structure of DNA not only permits the maximum number of intramolecular hydrogen bonds to form but is also a structure in which the most intimate clustering
of the aromatic residues of DNA is achieved, and which thereby permits the maximum number of solvent-solvent interactions to occur in a given DNA solution. From the conclusion that hydrophobic interactions are of great importance in stabilizing the native structure of DNA in water, it follows that a wide variety of weakly protic nonaqueous solvents should disrupt this structure, since in all of these, lyophobic interactions should be substantially decreased compared to water. This has been found to be true (Helmkamp and Ts’o, 1961; Herskovits et al., 1961; Herskovits, 1962). The addition of each of the following solvents to an aqueous solution of DNA produced a collapse of the DNA helical conformation, with the numbers in parentheses indicating the mole per cent of nonaqueous solvent at the midpoint of the denaturation process: dimethylformamide (23), dimethylsulfoxide (27), formamide (57), methanol (79), and ethylene glycol (80) (Herskovits, 1962). The smaller the mole per cent of nonaqueous solvent required, the more effective is the solvent; hence the order given is that of decreasing effectiveness as DNA denaturants. In these solvent mixtures the concentration of added electrolyte was 5 X M. This decreasing order of effectiveness is similar to that found by Tanford and co-workers in studies of protein denaturation (see Section IV,E); there is a correlation between this order and the decreasing tendency of the pure nonaqueous solvents to dissolve hydrocarbons (increasing ΔF_u; Table IV) and with an increasing tendency to interfere with the formation of intrapeptide (and hence, presumably, also internucleotide) hydrogen bonds. Clearly this is essentially the order to be expected if hydrophobic interactions in aqueous solutions are of primary importance in stabilizing the helical conformation of DNA.
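The ranking just described can be reproduced directly from the quoted midpoints. The following minimal sketch simply sorts the five solvents by the mole per cent required at the midpoint of denaturation, a smaller value being taken to mean a more effective denaturant; the dictionary and variable names are illustrative and are not part of the original study.

```python
# Mole per cent of nonaqueous solvent at the midpoint of DNA denaturation,
# as quoted in the text from Herskovits (1962).
midpoint_mole_percent = {
    "ethylene glycol": 80,
    "formamide": 57,
    "dimethylformamide": 23,
    "methanol": 79,
    "dimethylsulfoxide": 27,
}

# A smaller midpoint means less solvent is needed, i.e. a more effective
# denaturant, so sorting in ascending order gives decreasing effectiveness.
ranked = sorted(midpoint_mole_percent.items(), key=lambda item: item[1])

for rank, (solvent, midpoint) in enumerate(ranked, start=1):
    print(f"{rank}. {solvent}: {midpoint} mole %")
```

Sorting in this way returns the solvents in the same order as they are listed in the text, from dimethylformamide (most effective) to ethylene glycol (least effective).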
Of considerable further interest is the finding of Marmur and Ts’o (1961) that the denaturation of DNA in formamide-water mixtures results in the separation of the two polynucleotide strands. An N14-N15 hybrid Escherichia coli DNA was treated with 95% formamide at 37°C for 15 min at an ionic strength of 0.01, the formamide was then removed by dialysis, and the DNA was centrifuged in a CsCl density gradient. The presence of two bands of density different from that of the original hybrid indicates that strand separation occurred. It is of considerable practical interest that these conditions are much milder than those required to achieve strand separation in aqueous solution. Whether strand separation occurs also in solutions of DNA in other nonaqueous solvents is not yet determined. Ribonucleic acids (RNA) have been studied in formamide and dimethylsulfoxide (Helmkamp and Ts’o, 1961), and their secondary structures in these solvents were found also to be disrupted.
G. Conclusions

To summarize the major results discussed in this section:

(a) The native conformations of globular protein molecules are in a rather delicately balanced equilibrium with the solvent water. Any of a wide variety of weakly protic or strongly protic nonaqueous solvents is capable of radically altering these native conformations. Depending on the solvent, the structure of a protein can be made more highly disordered (random-coil form) or more highly ordered (more helical) than the native aqueous form.

(b) Most of the weakly protic nonaqueous solvents appear to induce a two-stage transformation of the native conformation of at least certain globular proteins. Addition of such a solvent to an aqueous protein solution first causes an unfolding of the protein molecule; with further enrichment of the nonaqueous solvent component, the molecule refolds to a conformation with larger helical content than the native. A few solvents, such as formamide and ethylene glycol, appear to induce the first transformation, but only incompletely, if at all, the second. Much higher concentrations of ethylene glycol than of most nonaqueous solvents studied so far are required to effect even the first of these transformations.

(c) The effects of nonaqueous solvents on the stability of the twin-helical conformation of DNA are strikingly parallel to their effects on the native conformations of proteins.

These and other observations can be quite satisfactorily explained on the hypothesis that hydrophobic interactions play a major role in stabilizing the native conformation of proteins and DNA in aqueous solution. The marked reduction in lyophobic interactions in almost any nonaqueous solvent compared to water must be critically involved in the conformational changes observed. Although adequate data are lacking, it is evident qualitatively at least that many of these solvents are poorer competitors than water for interpeptide and internucleotide hydrogen bonds.
Dioxane is a clear example, since it is only a moderately strong hydrogen bond acceptor, and not a hydrogen bond donor at all. If hydrogen bonds were the only, or the most important, interaction stabilizing these native structures, the addition of dioxane should only make them more stable rather than less. This is not to say that intramolecular hydrogen bonds are unimportant in determining these structures, but rather that they do not appear to be the major source of the free energy required for the purpose. An increasing number of investigators have recently come to believe that the following generalized scheme of interactions accounts for the properties of globular protein and DNA molecules in aqueous solutions. From results such as have been discussed in previous sections, and from a considerable amount of other kinds of evidence (Kauzmann, 1959), it is considered likely that hydrophobic interactions make the largest single contribution to stabilizing the native conformations of these macromolecules in aqueous solution. On the other hand, hydrophobic interactions are too nonspecific by nature to lead of themselves to unique conformations for these molecules. Intramolecular hydrogen bonds, which by themselves are too weak in aqueous solutions to lead to conformational stability, are nevertheless much more specific, and the cooperation of these two interactions results in stable specific structures in aqueous solution.
[Figure 7; axis label: FREE ENERGY]
FIG. 7. Schematic diagram of the contributions of various types of interactions to the free energy of the native conformations of a hypothetical protein or a DNA molecule in aqueous solution. $F_S$, $F_E$, $F_{HB}$, and $F_{HI}$ represent the free energy contributions of conformational entropy, electrostatic interactions, hydrogen bonding, and hydrophobic interactions, respectively. The magnitude of $F_E$ may vary considerably with the pH and ionic strength of the aqueous solution.
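The scheme of Fig. 7 can be condensed into three relations among the four contributions named in the caption. What follows is only a restatement of the assumptions of the scheme (separable contributions, and the signs assigned to the sums), not a quantitative result:

```latex
\begin{align*}
F &= F_S + F_E + F_{HB} + F_{HI} < 0
  && \text{(native conformation stable in water)} \\
F - F_{HB} &= F_S + F_E + F_{HI} < 0
  && \text{(still stable without hydrogen bonding)} \\
F - F_{HI} &= F_S + F_E + F_{HB} > 0
  && \text{(unstable without hydrophobic interactions)}
\end{align*}
```

Taken together, the second and third relations require that $F_{HI}$ be the most negative single term in the sum.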
The situation that may obtain is very schematically illustrated in Fig. 7. The free energy contributions to the native conformation of a macromolecule in aqueous solution due to the loss of conformational entropy, to electrostatic repulsive interactions, to hydrogen bonding, and to hydrophobic interactions, are assumed to be separable, and are denoted respectively by $F_S$, $F_E$, $F_{HB}$, and $F_{HI}$; the sum of these four terms, $F$, is negative. In the absence of hydrogen bonding, the free energy of the native conformation, $F - F_{HB}$, is, according to this scheme, still negative. In the absence of hydrophobic interactions, however, $F - F_{HI}$ is positive and therefore unfavorable. While hydrophobic interactions are thus depicted
as providing the major stabilizing influence favoring the native conformation, the contribution of hydrogen bonding makes a specific conformation substantially more likely than others. In almost all nonaqueous solvents, $F_{HI}$ is apparently so markedly increased that even a decrease in $F_{HB}$ is inadequate to compensate and retain $F$ as negative.

In order to determine to what extent these speculations have validity, it is necessary to be able to evaluate more quantitatively the relative contributions of these interactions to the free energies of protein and nucleic acid molecules in water and nonaqueous solvents. For this purpose, a substantial body of quantitative data is required concerning the properties of suitable model compounds in a variety of solvents, including their solubilities, acid-base dissociation constants, and thermodynamics of hydrogen bond formation. The dearth of pertinent data on hydrogen bonds in solvents of interest is particularly frustrating to even a semiquantitative evaluation of the scheme presented in Fig. 7.

Although mixtures of water and nonaqueous liquids have been most frequently studied as protein solvents up to the present time, it should be realized that from a quantitative point of view, such solutions are enormously complicated systems. It is well known that the effective microscopic properties of such mixed solvents can be vastly different from their macroscopic properties, and can vary with the solute, because of selective attraction of one of the components of the solvent by the solute. Pure nonaqueous solvents are likely to be more useful for systematic and quantitative studies.

The conclusions derived from these conformational studies have many possible important chemical and physiological consequences.
From a chemical point of view, for example, the use of nonaqueous solvents as additives in the isolation and purification of proteins, or in their crystallization, may be accomplished with less damage to the proteins concerned if proper recognition is given to possible conformational changes produced in the proteins by the nonaqueous solvent. One important physiological problem which is pertinent involves the fact that the cellular environment of many proteins contains high concentrations of lipid components. In formed bodies such as mitochondria and chloroplasts, proteins are intimately and functionally associated with lipid substances. This is also true of proteins in a wide variety of cellular membranes. The current view is that these proteins are embedded in a lipid matrix; the only structural specification usually made is that the polar ends of the lipid molecules be oriented towards the protein (Danielli, 1951; Sjostrand and Rhodin, 1953). It must be realized, however, that the gross conformations of these proteins in situ might be determined by this association with a nonaqueous
environment, at least in part, and the functional properties of these proteins might therefore be critically dependent on this environment. One possible consequence of this is that such proteins extracted into aqueous media, and freed from associated lipid, might have entirely different conformations from their native ones, and as a result be functionally altered or inactive. This may be an important factor in interpreting the experiments of Jurtshuk et al. (1961), who found that the mitochondrial enzyme β-hydroxybutyric apodehydrogenase, after extraction into aqueous media, has an absolute requirement for the phospholipid, lecithin, in addition to that for DPN, the specific electron acceptor.
V. PHYSICAL AND CHEMICAL PROPERTIES OF PROTEINS IN NONAQUEOUS SOLVENTS

Although most recent studies of proteins in nonaqueous solvents, and the major part of this article, have been devoted to problems of protein molecular conformations, there are many other aspects of these systems which might profitably be investigated. A few of these aspects are discussed briefly in this section.

A. Quaternary Structure in Proteins

The term quaternary structure was introduced by Bernal (1958) to denote the kinds of organized structures obtained by the noncovalent interaction of macromolecular subunits in aqueous solution where the subunits themselves are internally covalently bound. Such structures may be composed of closely similar or identical subunits (homogeneous type), as in the cases of hemoglobin (Field and O’Brien, 1955), β-lactoglobulin (Townend et al., 1961), insulin (Harfeneist and Craig, 1952), glutamic dehydrogenase (Frieden, 1959), and many other proteins; or they may contain unlike subunits (heterogeneous type) as in the case of the protein and ribonucleic acid subunits of tobacco mosaic virus. The noncovalent bonds between the subunits can vary greatly in strength from one case to another, and may be due to any combination of a wide spectrum of forces including electrostatic, hydrogen bonding, and hydrophobic interactions. It is therefore difficult to generalize about the properties of such structures.

Nonaqueous solvents may provide a useful approach to the study of quaternary structure. For structures of the homogeneous type, an important problem is the determination of the true minimal subunit size, and a suitable nonaqueous solvent or solvent mixture may induce dissociation of subunits to a much greater extent than is possible in aqueous media. A solvent such as anhydrous formic acid, for example, might be quite generally effective in this regard, assuming that a given protein can be dissolved in it, and that no significant changes in covalent bonding within
the subunits are produced in the process. For reasons detailed in Section IV,E,2, in a strongly protic solvent of high dielectric constant such as formic acid, the native conformation of a subunit is highly likely to be converted to a random-coil form, thereby disrupting the native quaternary structure of the protein in the process. The highly charged subunits should then be kept dispersed by electrostatic repulsive interactions in the high dielectric constant medium and a molecular weight determination in the strongly protic solvent should reveal the average subunit size. Conformational changes within protein subunits are also produced, as we have seen earlier, in strongly protic solvents of low or high dielectric constant, and in weakly protic solvents as well. It is therefore likely that quaternary structure will also be disrupted in these solvents. However, nonspecific aggregation of the structurally altered subunits is more generally likely to occur in such solvents than in strongly protic solvents of high dielectric constant; such aggregation would interfere with the determination of the minimal subunit size. Thus, ribonuclease was found to be extensively aggregated in essentially anhydrous ethylene glycol solution (Sage and Singer, 1962). On the other hand, in certain favorable instances, dissociation and essentially complete subunit dispersion may occur in a weakly protic solvent. For example, in dilute solutions of insulin in dimethylformamide and dimethylacetamide (Rees and Singer, 1955, 1956) the protein is largely dissociated into the true minimal subunit containing one A and one B chain. Related to this is the observation by Fredericq (1957) that insulin is essentially completely dissociated in 40% dioxane-water. Nonaqueous solvents may also be useful in investigations of the nature of the noncovalent binding between subunits.
If hydrophobic interactions, for example, were significantly involved in a particular association system in aqueous solution, the addition of weakly protic, nonaqueous solvents should produce dissociation, with the order of effectiveness of the solvents being that described in Sections IV,B,3 and IV,E,1. Since the noncovalent binding between subunits might be expected ordinarily to be weaker than the noncovalent binding maintaining the native conformation within the subunits themselves, the dissociation might be produced at such small volume fractions of the nonaqueous solvent that significant conformational changes might be avoided. Otherwise, such conformational changes would interfere with the interpretation of the results. The appearance of conformational changes could be monitored by optical rotatory dispersion measurements. One system that would be of interest to investigate in this respect is the protein of tobacco mosaic virus, whose subunits have been shown by Lauffer et al. (1958) to undergo an endothermic association in aqueous solution.
If hydrophobic interactions are involved in this association, as has been suggested by Kauzmann (1959) on the basis of this observation, it should be sensitive to the introduction of small amounts of the appropriate weakly protic nonaqueous solvents. The endothermic association of sickle cell anemia hemoglobin (Murayama, 1956) is another case to investigate.

B. Biochemical Reactions of Proteins in Nonaqueous Solvents and Solvent Mixtures

It is very likely that nonaqueous solvents and mixtures can be of use in the elucidation of the mechanisms of biochemical reactions carried out by proteins, such as enzymatic catalysis and antigen-antibody interactions. With the rapidly accumulating knowledge of the effect of nonaqueous solvents on the conformation of proteins to guide us, it should be possible to carry out increasingly informative experiments in such directions.

1. Enzymatic Reactions
A number of studies have been made on the kinetics of enzyme-catalyzed hydrolysis reactions in aqueous media containing nonaqueous solvents in varying proportions. One of the most careful and interesting of these was performed by Inagami and Sturtevant (1960) on the trypsin-catalyzed hydrolysis of benzoyl-L-arginine ethyl ester (BAEE) in dioxane-water mixtures. They found, quite remarkably, that the maximum rate, $r_{max}$, of hydrolysis does not vary greatly with increasing dioxane concentration, and that even in 88 volume % dioxane, $r_{max}$ is 68% of that in water itself at the same pH. On the other hand, the apparent Michaelis-Menten constant, $K_{m(app)}$, which was assumed to be the dissociation constant of the enzyme-substrate complex, increases markedly with increasing dioxane concentration until in 88% dioxane, $K_{m(app)}$ is 1300 times its value in water at pH 8, and 5500 times at pH 8.6. These authors suggested that this apparent increase in the dissociation of the enzyme-substrate complex with increasing dioxane concentration could reasonably well be accounted for by the increase, with decreasing dielectric constant of the solvent, in the electrostatic repulsion of the critical positively charged guanidinium group of the substrate and the net positively charged enzyme, but they recognized that this dissociation could also be due to conformational changes in the protein produced by the high concentration of dioxane. No conformational studies were made, however. The fact that $r_{max}$ did not change greatly despite the increased dissociation of the enzyme-substrate complex they attributed to the compensating factor that water acts as an inhibitor at the enzymatic site, and that on increasing the dioxane concentration and decreasing the water concentration of the solvent, the water is dissociated from the site.
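To see what these two numbers imply for the observed rate, the following sketch assumes the simple Michaelis-Menten form v = r_max[S]/(K_m + [S]) in both solvents. The ratios 0.68 and 1300 are taken from the text (pH 8); the substrate concentrations and the function name are hypothetical, chosen only for illustration.

```python
def relative_rate(s_over_km_water, rmax_ratio=0.68, km_ratio=1300.0):
    """Rate in 88 vol % dioxane divided by the rate in water at the same [S].

    s_over_km_water : substrate concentration expressed in units of
                      K_m(app) in water (hypothetical input, not a value
                      from the text).
    rmax_ratio      : r_max(dioxane) / r_max(water); 0.68 is quoted in the text.
    km_ratio        : K_m(app)(dioxane) / K_m(app)(water); 1300 at pH 8.

    Assumes the simple Michaelis-Menten form v = r_max [S] / (K_m + [S]).
    """
    s = s_over_km_water
    v_water = s / (1.0 + s)                      # in units of r_max(water)
    v_dioxane = rmax_ratio * s / (km_ratio + s)  # same units
    return v_dioxane / v_water


for s in (0.1, 10.0, 1.0e5):
    print(f"[S]/K_m(water) = {s:g}: relative rate = {relative_rate(s):.4f}")
```

Under these assumptions the rate ratio approaches 0.68/1300 at substrate concentrations far below $K_{m(app)}$ and approaches 0.68 only at saturating substrate, so the two quoted figures describe opposite limits of the same rate law.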
The conformation of the trypsin molecule, or at least of its active site, is apparently unusually stable to a decrease in lyophobic interactions with the solvent, since several other globular proteins undergo marked structural changes at much smaller dioxane concentrations than 88%, as was discussed in Section IV,E,1. The structural studies discussed in that section suggest also that formamide, in which pure solvent trypsin is directly soluble (Rees and Singer, 1956), might profitably be substituted for dioxane in the experiments of Inagami and Sturtevant, since it is a less effective protein denaturant and is also inert chemically. Ethylene glycol or glycerol, on the other hand, may participate directly in transfer reactions catalyzed by the enzyme, as in the case of the chymotrypsin-catalyzed transfer of alcohols to acyl groups (Balls and Wood, 1956; MacDonald and Balls, 1956). These compounds would therefore not function as chemically inert solvent additives. This brings us to the possibility that enzymes might exhibit catalytic activity in pure nonaqueous solvents. Many enzymes classed as hydrolases, for example, can catalyze transfer reactions that do not involve water as a stoichiometric participant. Chymotrypsin has already been mentioned in this regard. RNase is another case: in aqueous solutions containing low concentrations of alcohols, it catalyzes the reaction between cyclic mononucleotides and the alcohols to form alkyl phosphate esters (Heppel and Whitfeld, 1955; Heppel et al., 1955). It would be of interest, for example, to determine in an anhydrous solvent in which RNase undergoes little or no detectable conformational change (perhaps this might be glycerol or a glycerol-sugar mixture) whether such transfer reactions to the solvent or other suitable receptor could be detected. Other enzyme-substrate systems might also be amenable to such experiments.
2. Antigen-Antibody Reactions

A considerable variety of chemical groups can function as antigenic determinants, and a number of different kinds of forces have been implicated (Pauling et al., 1943) in the formation of the relatively weak noncovalent bond between a particular antigen and its specific antibody. Along with the recognition of the importance of hydrophobic interactions in determining the conformation of proteins, the possible significance of these interactions in the reactions between one protein and another, or between a protein and a small molecule, has been discussed (Kauzmann, 1959). In connection with this possibility in antigen-antibody systems, the study of the effects of nonaqueous solvents may provide some useful information. A very suggestive observation (Grant, 1959) has been made that 20% dioxane strongly reduces the amount of precipitate formed in several protein antigen-antibody systems. The effect is reversible, since removal of the dioxane by dialysis results in extensive specific precipitation. On the other hand, Tanford et al. (1962) have found that 60% ethylene glycol does not inhibit precipitation in Ouchterlony diffusion experiments with a different antigen-antibody system. Precipitating antigen-antibody systems are too complex to afford unambiguous interpretation of these solvent effects. Aside from conformational changes that might be induced in the antigen and antibody proteins by the solvent, the solubility of antigen-antibody aggregates might be altered; these factors could obscure the effect of solvent on the formation of the antigen-antibody bond per se. Systematic studies of the reactions of simple haptens and their purified specific antibodies, by equilibrium dialysis, or preferably some more rapid method of measurement, should be informative about the importance of hydrophobic interactions in any particular antigen-antibody reaction. Such studies must of course include measurements on the effect of solvents on the conformation of the antibody protein molecules. The reversible dissociation of antigen-antibody bonds by nonaqueous solvents such as dioxane may prove of considerable practical use in procedures for the isolation and purification of specific antibodies (cf. Singer et al., 1960).
C. Protein Solutions at Low Temperatures

The solubility of proteins in nonaqueous solvents makes it possible to study them in homogeneous solution at temperatures below the freezing point of aqueous solutions, since many pure nonaqueous solvents have freezing points well below 0°C (Table I). Freezing temperatures can be lowered even further by the use of solvent mixtures. This opens a new dimension in protein chemistry, the significance of which can only dimly be appreciated at the present time. Several possible problems of interest may be mentioned, however. The equilibrium state of any process involving an enthalpy change must be affected by temperature. The conformational transitions in proteins discussed in Section IV,E are a case in point. The transition from the native to the denatured form in a nonaqueous solvent is usually an endothermic process, and a decrease in temperature will favor the native form. In such cases, it is possible that the disruption of the native conformation in a given protein-solvent system, which is observed at room temperature, may be reversed at sufficiently low temperatures. [On the other hand, particularly in mixed solvents, the transition from an ordered to a disordered state may be an exothermic process (Doty and Yang, 1956; Foss and Schellman, 1959), and the reverse effect of temperature may be observed in these cases.] One may, therefore, be able to obtain solutions of proteins in their native conformations in nonaqueous solvents more readily at temperatures well below 0°C than at 25°C. Indeed this consideration was fundamental to the development of the techniques for fractionation of proteins in ethanol-water mixtures at low temperatures by Cohn et al. (1946). It would be of particular interest to study the rates of enzymatic and antigen-antibody reactions at low temperatures, where they might be appreciably slower and more readily measured than in aqueous solutions at room temperature. These and other conformational considerations discussed in this article are of obvious practical importance in the preservation of biological structures at low temperatures (Smith, 1954). It is also possible that at low temperatures, optical and electronic processes in protein systems may be more readily investigated (Freed, 1958; Freed et al., 1958).
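The temperature argument earlier in this section can be illustrated with the integrated van 't Hoff relation, ln[K(T1)/K(T2)] = -(ΔH/R)(1/T1 - 1/T2), taking K as the ratio of denatured to native protein and assuming that ΔH is independent of temperature. The enthalpy of 100 kJ/mol and the two temperatures used below are hypothetical values chosen only to show the direction of the effect; they are not data from the studies cited.

```python
import math

R = 8.314  # gas constant, J / (mol K)


def equilibrium_ratio(delta_h, t_low, t_high):
    """Return K(t_low) / K(t_high) for the native <-> denatured equilibrium.

    Uses the integrated van 't Hoff equation with a temperature-independent
    enthalpy change.

    delta_h       : enthalpy of denaturation in J/mol (positive = endothermic).
    t_low, t_high : absolute temperatures in kelvin.
    """
    return math.exp(-(delta_h / R) * (1.0 / t_low - 1.0 / t_high))


# Hypothetical illustration: cooling from 25 C to -20 C.
t_room, t_cold = 298.15, 253.15
endothermic = equilibrium_ratio(+100e3, t_cold, t_room)
exothermic = equilibrium_ratio(-100e3, t_cold, t_room)
print(f"endothermic denaturation: K(cold)/K(room) = {endothermic:.2e}")
print(f"exothermic denaturation:  K(cold)/K(room) = {exothermic:.2e}")
```

A ratio below 1 means that cooling shifts the equilibrium toward the native form, as expected for an endothermic transition; for an exothermic transition the ratio exceeds 1 and cooling favors the disordered state, which is the reverse effect noted in the bracketed remark above.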
D. Chemical Modification of Proteins in Nonaqueous Solvents

The chemical modification of proteins in nonaqueous solvents is already of considerable importance, but its potential significance is even greater. Among those reactions which so far have been carried out systematically with proteins in nonaqueous solvents are: esterification of carboxyl groups of proteins in anhydrous alcohols containing either HCl (Fraenkel-Conrat and Olcott, 1945) or thionyl chloride (Bello, 1956); and acetylation of amino and hydroxyl groups with acetic anhydride (Vratsanos et al., 1958; Bello and Vinograd, 1956). It is clear that many modification reactions may be found to proceed readily in nonaqueous solvents which occur in poor yield or not at all in aqueous media, either because of low solubility of the required reagents, or interference by the water. In the context of this article, we wish only to emphasize that the conformation of a protein molecule is dependent on the solvent. The rates of modification reactions at particular groups on a protein molecule will therefore differ in a particular nonaqueous solvent compared to aqueous media not only through a generalized solvent effect, but also indirectly through the effect of solvent on the relative accessibility of the protein groups to the reagent. Furthermore, if groups which are normally buried in the interior of the native conformation of a protein molecule in aqueous solution become available and are modified in a nonaqueous solvent, the conformation may be irreversibly altered. That is, on return of the modified protein to an aqueous medium, the native conformation may not be recovered because of interference from the modifying groups.
ACKNOWLEDGMENTS

The author’s experience with the subject of this article was obtained with support from grant A-2441 from the National Institutes of Health, U. S. Public Health Service. The author is indebted to Drs. Charles Tanford, E. Peter Geiduschek, and T. T. Herskovits for the opportunity to see their manuscripts prior to publication, and to these individuals and Dr. Bruno Zimm for many helpful discussions. He is also very grateful to several colleagues, who, observing signs of distress at various occasions during the writing of this article, quickly supplied generous quantities of a mixture of water and the appropriate nonaqueous solvent.
REFERENCES

Ambrose, E. J., and Elliott, A. (1951). Proc. Roy. Soc. A208, 75.
Ambrose, E. J., Bamford, C. H., Elliott, A., and Hanby, W. E. (1951). Nature 167, 264.
Anfinsen, C. B. (1961). J. Polymer Sci. 49, 31.
Audrieth, L. F., and Kleinberg, J. (1953). “Nonaqueous Solvents.” Wiley, New York.
Balls, A. K., and Wood, H. N. (1956). J. Biol. Chem. 219, 245.
Bamford, C. H., Elliott, A., and Hanby, W. E. (1956). “Synthetic Polypeptides.” Academic Press, New York.
Barrow, G. M. (1956). J. Am. Chem. Soc. 78, 5802.
Bates, R. G. (1954). “Electrometric pH Determinations.” Wiley, New York.
Beaven, G. H. (1961). Advances in Spectroscopy 2, 331.
Beaven, G. H., and Holiday, E. R. (1952). Advances in Protein Chem. 7, 319.
Bello, J. (1956). Biochim. et Biophys. Acta 20, 426.
Bello, J., and Vinograd, J. R. (1956). J. Am. Chem. Soc. 78, 1369.
Bernal, J. D. (1958). Discussions Faraday Soc. No. 26, 7.
Berns, D. S., and Fuoss, R. M. (1960). J. Am. Chem. Soc. 82, 5585.
Berns, D. S., and Fuoss, R. M. (1961). J. Am. Chem. Soc. 83, 1321.
Blout, E. R. (1960). In “Optical Rotatory Dispersion” (C. Djerassi, ed.), Chapter 17. McGraw-Hill, New York.
Blout, E. R. (1961). Tetrahedron 13, 123.
Blout, E. R., and Asadourian, A. (1956). J. Am. Chem. Soc. 78, 955.
Bresler, S. E. (1958). Discussions Faraday Soc. No. 26, 158.
Bruning, W., and Holtzer, A. (1961). J. Am. Chem. Soc. 83, 4865.
Cannon, C. G. (1955). Mikrochim. Acta 2-3, 555.
Cha, C. Y., and Scheraga, H. A. (1961). Biochem. Biophys. Research Communs. 6, 67.
Coates, J. H., and Jordan, D. O. (1960). Biochim. et Biophys. Acta 43, 223.
Cohen, C. (1955). Nature 176, 129.
Cohen, C., and Szent-Gyorgyi, A. (1957). J. Am. Chem. Soc. 79, 248.
Cohn, E. J. (1936). Chem. Revs. 19, 241.
Cohn, E. J., and Edsall, J. T. (1943). “Proteins, Amino Acids, and Peptides.” Reinhold, New York.
Cohn, E. J., Strong, L. E., Hughes, W. L., Jr., Mulford, D. J., Ashworth, J. N., Melin, M., and Taylor, H. L. (1946). J. Am. Chem. Soc. 68, 459.
Cohn, E. J., Hughes, W. L., Jr., and Weare, J. H. (1947). J. Am. Chem. Soc. 69, 1753.
Crammer, J. L., and Neuberger, A. (1943). Biochem. J. 37, 302.
Curme, G. O., and Johnston, F. (1952). “Glycols,” p. 48. Reinhold, New York.
Danielli, J. F. (1951). In “Cytology and Cell Physiology” (G. H. Bourne, ed.), 2nd ed., p. 152. Oxford Univ. Press, London.
Donohue, J. (1953). Proc. Natl. Acad. Sci. U.S. 39, 470.
Doty, P. (1959). In “Biophysical Science” (J. L. Oncley, ed.), p. 112. Wiley, New York.
Doty, P., and Yang, J. T. (1956). J. Am. Chem. Soc. 78, 498.
Doty, P., Bradbury, J. H., and Holtzer, A. M. (1956). J. Am. Chem. Soc. 78, 947.
Doty, P., Imahori, K., and Klemperer, E. (1958). Proc. Natl. Acad. Sci. U.S. 44, 424.
Downie, A. R., Elliott, A., Hanby, W. E., and Malcolm, B. R. (1957). Proc. Roy. Soc. A242, 325.
Downie, A. R., Elliott, A., and Hanby, W. E. (1959). Nature 183, 110.
Eirich, F., Katchalski, E., Reisman, J., and Spitnik, P. (1951). Advances in Protein Chem. 8, 123.
Evers, E. C., and Kay, R. L. (1960). Ann. Rev. Phys. Chem. 11, 21.
Fasman, G. (1962). In “Polyamino Acids, Polypeptides, and Proteins” (M. A. Stahmann, ed.), p. 221. Univ. Wisconsin Press, Madison, Wisconsin.
Field, E. O., and O’Brien, J. R. P. (1955). Biochem. J. 60, 656.
Foss, J. G., and Schellman, J. A. (1959). J. Phys. Chem. 83, 2007.
Fraenkel-Conrat, H., and Olcott, H. S. (1945). J. Biol. Chem. 181, 259.
Frank, H. S., and Evans, M. W. (1945). J. Chem. Phys. 13, 507.
Fredericq, E. (1957). J. Am. Chem. Soc. 79, 599.
Freed, S. (1958). In “Symposium on Information Theory in Biology” (H. P. Yockey, ed.), p. 171. Pergamon, London.
Freed, S., Turnbull, J. H., and Salmre, W. (1958). Nature 181, 1731.
Frieden, C. (1959). J. Biol. Chem. 234, 815.
Fuoss, R. M. (1958). J. Am. Chem. Soc. 80, 5059.
Geiduschek, E. P., and Gray, I. (1956). J. Am. Chem. Soc. 78, 879.
Geiduschek, E. P., and Herskovits, T. T. (1961). Arch. Biochem. Biophys. 96, 114.
Goodman, M., and Schmitt, E. E. (1959). J. Am. Chem. Soc. 81, 5507.
Grant, R. A. (1959). Brit. J. Exptl. Pathol. 40, 551.
Greenberg, D. M., and Larson, C. E. (1935). J. Phys. Chem. 39, 665.
Gundlach, H. G., Moore, S., and Stein, W. H. (1959a). J. Biol. Chem. 234, 1761.
Gundlach, H. G., Stein, W. H., and Moore, S. (1959b). J. Biol. Chem. 234, 1754.
Gutbezahl, B., and Grunwald, E. (1953). J. Am. Chem. Soc. 76, 565.
Habeeb, A. F. S. A., Cassidy, H. G., Stelos, P., and Singer, S. J. (1959). Biochim. et Biophys. Acta 34, 439.
Hammett, L. P. (1940). “Physical Organic Chemistry.” McGraw-Hill, New York.
Harfeneist, E. J., and Craig, L. C. (1952). J. Am. Chem. Soc. 74, 3087.
Harrington, W. F., and von Hippel, P. H. (1961). Advances in Protein Chem. 18, 1.
Helmkamp, G. K., and Ts’o, P. O. P. (1961). J. Am. Chem. Soc. 83, 138.
Heppel, L. A., and Whitfeld, P. R. (1955). Biochem. J. 60, 1.
Heppel, L. A., Whitfeld, P. R., and Markham, R. (1955). Biochem. J. 60, 8.
Herskovits, T. T. (1962). Arch. Biochem. Biophys. 97, 474.
Herskovits, T. T., Singer, S. J., and Geiduschek, E. P. (1961). Arch. Biochem. Biophys. 94, 99.
Hirsch, E., and Fuoss, R. M. (1960). J. Am. Chem. Soc. 82, 1018.
Inagami, T., and Sturtevant, J. M. (1960). Biochim. et Biophys. Acta 38, 64.
Jacobsen, C. F., and Linderstrøm-Lang, K. (1949). Nature 164, 111.
Jurtshuk, P., Jr., Sezuku, I., and Green, D. E. (1961). Biochem. Biophys. Research Communs. 6, 76.
Katz, J. J. (1954a). Arch. Biochem. Biophys. 61, 293.
Katz, J. J. (1954b). Nature 174, 509.
Katz, J. J. (1955). Science 121, 642.
Kauzmann, W. (1959). Advances in Protein Chem. 14, 1.
Kendrew, J. C., Watson, H. C., Strandberg, B. E., Dickerson, R. E., Phillips, D. C., and Shore, V. C. (1961). Nature 190, 666.
Kilpatrick, M., and Luborsky, F. E. (1953). J. Am. Chem. Soc. 76, 577.
King, M. V., Magdoff, B. S., Adelman, M. B., and Harker, D. (1956). Acta Cryst. 9, 460.
Klotz, I. M. (1960). In “Protein Structure and Function.” Brookhaven Symposium in Biology, p. 25. Brookhaven National Laboratory, New York.
Klotz, I. M., and Franzen, J. S. (1960). J. Am. Chem. Soc. 82, 5241.
Klotz, I. M., and Franzen, J. S. (1962). J. Am. Chem. Soc. 84, 3461.
Lauffer, M. A., Ansevin, A. T., Cartwright, T. E., and Brinton, C. C., Jr. (1958). Nature 181, 1338.
Levedahl, B. H., and James, T. W. (1961). Tetrahedron 13, 1-240.
Levine, S. (1954). Arch. Biochem. Biophys. 60, 515.
Linderstrøm-Lang, K. (1955). Chem. Soc. (London), Spec. Publ. No. 2.
Luzzati, V., Cesari, M., Spach, G., Masson, F., and Vincent, J. M. (1961). J. Mol. Biol. 3, 566.
MacDonald, C. E., and Balls, A. K. (1956). J. Biol. Chem. 221, 993.
Marmur, J., and Ts’o, P. O. P. (1961). Biochim. et Biophys. Acta 61, 32.
Mizushima, S., Tsuboi, M., Shimanouchi, T., and Tsuda, Y. (1955). Spectrochim. Acta 7, 100.
Moffitt, W. (1956). J. Chem. Phys. 26, 467.
Moffitt, W., and Yang, J. T. (1956). Proc. Natl. Acad. Sci. U.S. 42, 596.
Moffitt, W., Fitts, D. D., and Kirkwood, J. G. (1957). Proc. Natl. Acad. Sci. U.S. 43, 723.
Murayama, M. (1956). Federation Proc. 16, 318.
Pauling, L., Campbell, D. H., and Pressman, D. (1943). Physiol. Revs. 23, 203.
Pauling, L., Corey, R. B., and Branson, H. R. (1951). Proc. Natl. Acad. Sci. U.S. 37, 205.
Pimentel, G. C., and McClellan, A. L. (1960). “The Hydrogen Bond,” p. 234. Freeman, San Francisco.
Rees, E. D., and Singer, S. J. (1955). Nature 176, 1072.
Rees, E. D., and Singer, S. J. (1956). Arch. Biochem. Biophys. 63, 144.
Robertson, T. B. (1918). “Physical Chemistry of Proteins.” Longmans, Green, London.
Ryle, A. P., and Sanger, F. (1955). Biochem. J. 60, 535.
Sadek, H., and Fuoss, R. M. (1959). J. Am. Chem. Soc. 81, 4507.
Sage, H. J., and Singer, S. J. (1958). Biochim. et Biophys. Acta 29, 663.
Sage, H. J., and Singer, S. J. (1962). Biochemistry 1, 305.
Scheflan, L., and Jacobs, M. B. (1953). “The Handbook of Solvents.” van Nostrand, New York.
Schellman, J. A. (1955a). Compt. rend. trav. lab. Carlsberg Sér. chim. 29, 223.
Schellman, J. A. (1955b). Compt. rend. trav. lab. Carlsberg Sér. chim. 29, 230.
Scheraga, H., and Laskowski, M., Jr. (1957). Advances in Protein Chem. 12, 1.
Sela, M., and Anfinsen, C. B. (1957). Biochim. et Biophys. Acta 24, 229.
68
S. J. SINGER
Shugar, D. (1952). Biochem. J . 62, 142. Simmons, N . S., Cohen, C., Szent-Gyorgyi, A. G., Wetlaufer, D. B., and Blout, E. R. (1961). J . Am. Chem. SOC.83, 4766. Singer, S. J . , Fothergill, J. E., and Shainoff, J. R. (1960). J . Am. Chem. SOC.82,5G5. Sjostrand, F. S., and Rhodin, J. (1953). Exptl. Cell Research 4, 426. Smith, A. U. (1954). In “Biological Applications of Freezing and Drying” (R. J . C. Harris, ed.), pp. 1-62. Academic Press, New York. Stark, G. R., Stein, W. H., and Moore, S. (1961). J . Biol. Chem. 236, 436. Steinberg, I. Z., Harrington, W. F., Berger, A . , Sela, M., and Katchalski, E. (1960). J . Am. Chem. SOC.82, 5263. Strehm, J., Krishna-Prasad, Y. S. R . , and Schellman, J. A. (1961). Tetrahedron 13, 176. Sturtevant, J. M., Rice, S. A., and Geiduschek, E. P. (1958). Discussions Faraday SOC.26, 138. Swallen, L. C . , and Danehy, J. P. (1946). In “Colloid Chemistry” (J. Alexander, ed.), Vol. VI, p. 1140. Reinhold, New York. Tanford, C. (1961). “Physical Chemistry of Macromolecules,” Chapter 8. Wiley, New York. Tanford, C., and De, P. K. (1961). J . Biol. Chem. 236, 1711. Tanford, C., and Roberts, G. L. (1952). J . Am. Chem. SOC.74. 2509. Tanford, C., and Swanson, S. A. (1957). J . A m . Chem. SOC.79, 3297. Tanford, C., and Wagner, M. L. (1954). J . A m . Chem. SOC.76, 3331. Tanford, C., Hauenstein, J. D., and Rands, D. G . (1955a). J . A m . Chetn. SOC.77, 6409. Tanford, C., Swanson, S. A . , and Shore, W. S. (1955b). J . Am. Chem. SOC.77,6414. Tanford, C., De, P. K., and Taggart, V. G. (1960). J . A m . Chem. SOC.82, 6028. Tanford, C., Buckley, C. E., 111, De, P. K., and Lively, E. P. (1962). J . Biol. Chem. 237, 1168. Tiselius, A. (1959). In “Electrophoresis” (M. Bier, ed.), p. xix. Academic Press, New York. Townend, R., Kiddy, C. A., and Timasheff, S. N. (1961). J . AWLChem. SOC.89, 1419. Urnes, P. J., and Doty, P. (1961). Advances i n Protein Chem. 16, 401. Urnes, P. J . , Imshori, K., and Doty, P. (1961). Proc. Natl. Acad. Sci. U . S . 
47,1635. Vratsanos, R. M., Bier,M., and Nord, F. F. (1958). Arch. Biochem. Riophys. 77,216. Weber, R . E., and Tanford, C. (1959). J . A m . Chem. SOC.81, 3255. Weissberger, A , , Proskauer, E. S., Riddick, J . A., and Toops, E. E., J r . (1055). “Organic Solvents,” 2nd ed. Interscience, New York. Wetlaufer, D. B., Edsall, J. T., and Hollingworth, B. R. (1958). J . Biol. Cheni. 233, 1421. Yanari, S., and Bovey, F. A. (1960). J. Biol. Chenz. 236, 2818. Yang, J. T., and Doty, P. (1957). J . A m . Chem. SOC.79, 761. Zahn, H., and Osterloh, F. (1955). Makromol. Chem. 16, 183. Zimm, B. H . , Doty, P., and Iso, K. (1960). Proc. Natl. Acad. Sci. U.S. 46, 1601.
THE INTERPRETATION OF HYDROGEN ION TITRATION CURVES OF PROTEINS
BY CHARLES TANFORD
Department of Biochemistry, Duke University, Durham, North Carolina
I. Introduction
II. Dissociation Constants of Appropriate Small Molecules
III. Experimental Titration Data
    A. Electrometric Titration
    B. Reference Points
    C. Spectrophotometric Titration for Phenolic Groups
    D. Other Methods
    E. Effect of Solvent
IV. Counting of Groups
    A. Counting Procedure
    B. Difference Counting
V. Reversibility; Thermodynamic and Kinetic Analysis
    A. Reversibility and Time-Dependence
    B. Thermodynamic Analysis
    C. Kinetic Analysis
VI. Semiempirical The ...
    A. The Equation of Linderstrøm-Lang
    B. Empirical Procedure
    C. The Electrostatic Interaction Factor
    D. Intrinsic pK's and Their Interpretation
    E. Heats and Entropies of Dissociation
VII. More Exact Thermodynamic Analysis
VIII. Volume Changes Accompanying Titration
IX. Miscellaneous Topics
    A. Binding of Ions Other Than Hydrogen Ions
    B. The Isoionic Point
    C. Charge Fluctuations
X. Results for Individual Proteins
    A. Chymotrypsinogen
    B. Chymotrypsin
    C. Collagen Fibrils
    D. Conalbumin
    E. α-Corticotropin
    F. Cytochrome c
    G. Fetuin
    H. Fibrinogen and Fibrin
    I. Gelatin
    J. Hemoglobin
    K. Insulin
    L. β-Lactoglobulin
    M. Lysozyme
    N. Myoglobin
    O. Myosin
    P. Ovalbumin
    Q. Papain
    R. Paramyosin
    S. Pepsin
    T. Peroxidase
    U. Ribonuclease
    V. Serum Albumin
    W. Thyroglobulin
    X. Trypsinogen
    Y. Trypsin
References
I. INTRODUCTION

In recent years several reviews have been published on the subject of hydrogen ion titration curves of proteins. Among these there are good general introductions to the subject, which include some description of experimental procedures (Tanford, 1955a; Kenchington, 1960); a review by Linderstrøm-Lang and Nielsen (1959), which gives a lucid introduction to the theoretical treatment of protein titration curves; and a review by Steinhardt and Zaiser (1955), which emphasizes anomalous behavior. A review by Jacobsen et al. (1957) is devoted to use of the pH-stat. Apart from treating that subject in some detail, it contains experimental procedures of general utility in the determination of titration data. More complete developments of the subject may be found in textbooks (Edsall and Wyman, 1958; Tanford, 1961a). The definitive treatment of the mathematical theory, for polyelectrolytes in general, and specifically including proteins, is that of Rice and Nagasawa (1961).

The existence of these earlier reviews makes it possible for the present treatment to be limited in scope. It will be sufficient to touch only superficially on experimental techniques and on the theoretical derivation of equations. The major objective will be, as the title of the paper implies, to show what one can learn from titration curves that is of general interest to protein chemistry. From this point of view, titration curves do not represent just another way of physically characterizing a protein molecule. More than most other physicochemical methods which are in common use, titration studies tend to emphasize individual differences among proteins, and this is reflected in the organization of this paper. There is a large section, entitled “Results for Individual Proteins,” which contains the many features of titration curves which are unique to individual proteins, and which cannot be described in general terms applicable to all proteins.
II. DISSOCIATION CONSTANTS OF APPROPRIATE SMALL MOLECULES

In most common proteins, one out of every three or four amino acid residues contains a titratable acidic or basic group. The number of titratable groups per molecule thus ranges from about 20 to about 250 for the many common proteins with molecular weights below 100,000, and it is even larger for proteins with very high molecular weight. The titratable groups which occur in greatest abundance are the carboxyl groups of glutamic acid and aspartic acid side chains; the amino groups of lysine side chains; the guanidyl groups of arginine side chains; the imidazole groups of histidine side chains; and the phenolic groups of tyrosine side chains. Less frequent are the thiol groups of cysteine side chains and the phosphoric acid groups of phosphoserine or phosphothreonine side chains. Heme proteins contain titratable carboxyl groups attached to the heme, and may also have acidic water molecules attached to the heme iron atom. Acidic water molecules may also be attached to other metalloproteins. Glycoproteins, flavoproteins, and nucleoproteins contain additional titratable groups as part of their non-protein conjugates. Finally, the terminal amino and carboxyl groups of nearly all polypeptide chains are in the free titratable form. The structures of the more common titratable groups are shown in Fig. 1. (Other parts of the protein molecule, such as the peptide group, also possess acidic or basic properties, but they are not titrated within the range of pH 1.5 to 12, within which titration studies are usually confined. Protein molecules tend to become degraded outside this range of pH.)

We want to know, before we examine the titration curves of proteins, at what pH these groups might be expected to become converted from their acidic to their basic form. The simplest initial assumption is that proteins will not behave differently in this respect from other organic molecules. With this assumption we would expect all effects on the acidic properties, except the effects of electrostatic charge, to be short-range effects. As a first approximation, the pK of the carboxyl group of glutamic acid (apart from the effect of electrostatic charge) could be equated with the pK of acetic acid (4.76) or propionic acid (4.87). To obtain a better estimate we can correct for the fact that the glutamic acid side chain in proteins is in fact under the influence of the polar NH and CO groups attached to the third carbon atom from the COOH group. The expected pK might then be set equal to the pK of a compound containing polar groups similarly located, such as monoalkyl glutarate (pK = 4.55). Within an uncertainty of about 0.1 or 0.2 the same expected pK is usually obtained in this way, regardless of which of several alternative model compounds is chosen. Within an uncertainty of this order the effect of ionic strength within reasonable limits is also negligible, so that, in seeking data of this kind for appropriate model compounds, we need not confine ourselves to experimental data which permit evaluation of the effect of ionic strength.
[Figure 1: structural formulas of the α-amino group, aspartic acid, lysine, glutamic acid, histidine, tyrosine, cysteine, arginine, phosphoserine, and the α-carboxyl group.]

FIG. 1. Formulas for the most important titratable groups of protein molecules. The model compounds listed in Table I were chosen to resemble these groups as closely as possible.
The relation between structure and acidity of organic compounds has been the subject of much study. Those aspects which are of interest in connection with protein titration curves have been reviewed in definitive manner by Edsall and Wyman (1958) and by Edsall (1943), and the reader is referred to these reviews for a discussion of the theoretical and empirical principles which are involved. For the present purpose it is sufficient to extract the data which will lead to the “expected” pK values of the titratable groups of proteins, and this has been done in Table I.
TABLE I
Dissociation Constants of Model Compounds in Aqueous Solution at or Near 25°Cᵃ

Compound                          pK observed   Correction required        pK corrected

Compounds resembling α-COOH group
  CH₃-CO-CH₂-COOH                 3.58                                     3.6
  CH₃-CO-NH-CH₂-COOH              3.60                                     3.6
  Tetraalanine                    3.32ᵇ         Electrostaticᶜ             3.5
  Cysteine                                      Microscopic constantsᵈ     3.8
  Glutamic acid                                 Microscopic constantsᵈ     4.3
  Tyrosine                                      Microscopic constantsᵈ     4.3

Compounds resembling β-COOH group of aspartic acid side chain
  CH₃-CO-(CH₂)₂-COOH              4.59                                     4.6
  ROOC-(CH₂)₂-COOH                4.52                                     4.5
  HOOC-(CH₂)₂-COOH                4.24          Statistical factorᵉ        4.5

Compounds resembling γ-COOH group of glutamic acid side chain
  ROOC-(CH₂)₃-COOH                4.55ᶠ                                    4.6
  HOOC-(CH₂)₃-COOH                4.36          Statistical factorᵉ        4.7
  Glutamic acid                                 Microscopic constantsᵈ     4.6

Compound resembling porphyrin COOH groups
  C₆H₅-(CH₂)₂-COOH                4.66                                     4.7

Compound resembling imidazole group of histidine side chain
  Poly-histidine                  6.15ᵍ                                    6.15ʰ

Compounds resembling α-NH₃⁺ group
  H₂N-CO-CH₂-NH₃⁺                 7.93                                     7.9
  ROOC-CH₂-NH₃⁺                   7.7                                      7.7
  Leucine ester                   7.63                                     7.6
  Glycylglycine ester             7.75                                     7.8
  Tetraalanine                    7.96ᵇ         Electrostaticᶜ             7.8
  Diethyl glutamate               7.04                                     7.0
  Cysteine                                      Microscopic constantsᵈ     6.8
  Tyrosine                                      Microscopic constantsᵈ     7.2

Compounds resembling ε-NH₃⁺ group of lysine side chain
  ROOC-(CH₂)₅-NH₃⁺                10.4ⁱ                                    10.4
  Lysine                          10.79         Electrostaticᶜ             10.4

Compounds resembling SH group of cysteine side chain
  HO-(CH₂)₂-SH                    9.5                                      9.5
  Cysteine                                      Microscopic constantsᵈ     9.1

Compounds resembling phenolic group of tyrosine side chain
  Polytyrosine                    9.5ʲ                                     9.5
  Tyrosine                        10.05         Electrostatic              9.7
  Tyrosine                                      Microscopic constantsᵈ     9.8

Compound resembling guanidyl group of arginine side chain
  Arginine                        ca. 12.5      Electrostatic              ca. 12.0

Compounds resembling protein-linked phosphate
  Glycerol-2-phosphate (pK₁)      1.34                                     1.3
  Glycerol-2-phosphate (pK₂)      6.65                                     6.6
  Glucose-1-phosphate (pK₂)       6.50                                     6.5
  Phosphoserine peptides (pK₂)    6.0ˡ

ᵃ Most of the data are taken from Edsall (1943) or Edsall and Wyman (1958).
ᵇ Average value for four isomers.
ᶜ A charged group is present on the model compound at considerable distance from the acidic group. An estimate for its effect on pK has been made by examining effects of similar charged groups on other acids.
ᵈ By measurement of the dissociation constants of the amino acids, and of suitable esters and other derivatives. The data have been analyzed by Edsall and co-workers (Edsall and Wyman, 1958; Martin et al., 1958) so as to yield, after suitable assumptions, the twelve microscopic dissociation constants. The ones applicable here are those which refer to dissociation of the group in question from an otherwise uncharged molecule.
ᵉ A molecule with two identical dissociable groups will have a K which is twice the value for either group alone.
ᶠ Interpolated between ROOC-(CH₂)₂-COOH and ROOC-(CH₂)₄-COOH.
ᵍ From titration of a polymer with degree of polymerization 15 (Patchornik et al., 1957).
ʰ The state of knowledge regarding the titration of imidazole groups is unsatisfactory. The pK of the imidazole group of histidine is 6.0, which is in qualitative agreement with the value listed in the table, since the effects of the two charged groups of histidine should roughly cancel. On the other hand, imidazole itself has a pK = 7.0 and 4-methylimidazole has a pK = 7.5. It is not easy to see why a polar group substitution on the γ-carbon atom (relative to the nearest nitrogen atom) should produce so large a difference in pK. It should be noted that a similar problem exists for amino groups. The pK of ROOC-(CH₂)₃-NH₃⁺ is 9.7, whereas that of CH₃-(CH₂)₃-NH₃⁺ is 10.7. (Added in proof: Koltun et al. (1959) obtained pK = 6.42 for carbobenzoxy-L-prolyl-L-histidylglycinamide.)
ⁱ By extrapolation from data for ROOC-(CH₂)ₙ-NH₃⁺ with n = 1 to 4.
ʲ Essentially the observed average pK, for a polymer of average degree of polymerization 30, extrapolated to zero net charge of the polymer (Katchalski and Sela, 1953).
ᵏ Electrostatic correction computed by the Kirkwood-Westheimer theory.
ˡ Fölsch and Österberg (1959) determined the pK values of several peptides containing phosphoserine and obtained results in the range of 5.4 to 6.0. Each peptide carried one positive and one negative charge, and no attempt to correct for their presence was made.
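As a rough illustration of how the corrected values of Table I might be used, the following sketch computes the titration curve that would be expected if every group titrated independently with its Table I pK, that is, with no electrostatic or other interactions. One representative corrected pK has been picked from the table for each type of group; that choice, the function names, and above all the group counts in `HYPOTHETICAL_GROUPS` are assumptions made for illustration and describe no real protein. The helper `statistical_correction` applies the factor of footnote e.

```python
import math

# One representative corrected pK per group type, chosen from Table I
# for this sketch (the table itself lists several model compounds each).
EXPECTED_PK = {
    "alpha-COOH": 3.6,
    "beta-COOH (Asp)": 4.6,
    "gamma-COOH (Glu)": 4.6,
    "imidazole (His)": 6.15,
    "alpha-NH3+": 7.8,
    "phenolic (Tyr)": 9.7,
    "epsilon-NH3+ (Lys)": 10.4,
    "guanidyl (Arg)": 12.0,
}

# HYPOTHETICAL numbers of groups per molecule, invented for illustration.
HYPOTHETICAL_GROUPS = {
    "alpha-COOH": 1,
    "beta-COOH (Asp)": 5,
    "gamma-COOH (Glu)": 6,
    "imidazole (His)": 3,
    "alpha-NH3+": 1,
    "phenolic (Tyr)": 4,
    "epsilon-NH3+ (Lys)": 7,
    "guanidyl (Arg)": 3,
}


def statistical_correction(pk_first):
    """pK of a single group, given the observed first pK of an acid with
    two identical groups (K_first = 2 K, so pK = pK_first + log10 2)."""
    return pk_first + math.log10(2.0)


def fraction_dissociated(ph, pk):
    """Fraction of groups of a given pK that are in the basic form."""
    return 1.0 / (1.0 + 10.0 ** (pk - ph))


def protons_dissociated(ph, groups=HYPOTHETICAL_GROUPS, pks=EXPECTED_PK):
    """Average number of protons dissociated per molecule, counted from
    the fully protonated state, assuming independent groups."""
    return sum(n * fraction_dissociated(ph, pks[name])
               for name, n in groups.items())


if __name__ == "__main__":
    for ph in (2.0, 4.0, 6.0, 8.0, 10.0, 12.0):
        print(f"pH {ph:4.1f}: {protons_dissociated(ph):5.2f} protons dissociated")
```

Applied to the observed values 4.24 and 4.36 of the two dibasic acids, `statistical_correction` gives values that round to the 4.5 and 4.7 entered in the table.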
The dissociation of a hydrogen ion from an acidic group changes the charge by one unit. Either the acidic form is charged (as in -NH₃⁺) and in that case the basic form is uncharged (-NH₂); or the acidic form is uncharged (as in -COOH), leading to a charged basic form (-COO⁻). One of the groups considered in Fig. 1 and in Table I is slightly more complicated, this being the phosphate group which may lose two hydrogen ions to acquire a double negative charge. The model compounds which are considered in Table I contain only those charges which are an inherent part of the dissociating group. No other charges are present (or, if present, have been corrected for). The pK values listed are therefore those which would be expected if each of the titratable groups were attached alone, without other charged groups, to a protein molecule.

An actual protein molecule will contain, as we have pointed out, a large number of acidic groups per molecule. At any pH many of these groups will be in their charged form. Since electrostatic effects are long-range effects, we must expect the course of titration of any one group to be influenced by these other charges which are present. To calculate this effect of electrostatic interactions on protein titration curves has been a major preoccupation of those who have been concerned with the theoretical aspects of protein titration curves. We shall discuss these aspects in Sections VI and VII of this review. For the practical aspects of protein titration which are considered in Sections IV and V, the following qualitative conclusions of the theoretical treatment suffice.

(1) It is clear from the pK values of the various dissociating groups that the charges on the protein molecule will be mostly positive at low pH and mostly negative at high pH. At some intermediate pH, positive and negative charges will be present in equal numbers. Since positive charges repel protons, and negative charges attract them, we expect pK's to be reduced below the values of Table I at relatively low pH, and we expect them to be raised above those values at relatively high pH. The expected magnitude of the effect decreases with increasing ionic strength. For proteins with molecular weight below 100,000, at an ionic strength of 0.1 or above, the normal change in pK due to electrostatic interactions will not exceed a value of 1.5 or so, except as one approaches the extreme ends of the titration curve. Furthermore, electrostatic forces affect all groups alike, so that the pK differences between one type of group and another, at any pH, will be expected to remain the same as the differences given by Table I.
(2) When a monobasic acid is titrated, the titration curve is described completely by the pK. We know for instance that, when pH = pK − 1.0, 91% of the molecules will be in the acid form, and, when pH = pK + 1.0, 91% will be in the basic form. These same relations would apply to a particular titratable group on a polybasic acid or on a protein molecule, if all other titratable groups on the same molecule were to remain unaffected. In fact, as the pH is changed, many groups on the same molecule are titrated together. Thus when a particular group on a protein molecule is in its acidic form on most of the molecules (pH well below pK), other groups like it will also be in their acidic form, and the net molecular electrostatic charge will be more positive than at a higher pH where this particular group is in its basic form on most of the molecules, and other groups like it are also in their basic form. The effect of electrostatic interaction will thus lead this particular group to have a lower pK at low degree of dissociation than at high degree of dissociation. The difference in pH required to go from 9% dissociation to 91% dissociation will not be 2.0, as it would be if the pK were independent of pH, but will be larger: the difference in pH could be as large as 3.0 or even more.

Conclusion. If protein molecules exhibit no interactions that are not also present on smaller molecules, then the pK values of their titratable groups would be expected to be roughly those of Table I. Electrostatic forces may move them up or down by as much as 1.5 pK units, but relative values will be unaffected thereby. The pK changes during the course of titration, so that the titration curve for any one group will be broader than it would be for a monobasic acid.

The assumption that protein molecules do not have unique interactions absent in smaller molecules is of course naive. It is in fact untrue. Special interactions occur and upset the “expectations” with which this section has been concerned. It is the occurrence of such deviations from the expected result which lend interest and importance to the study of protein titration curves.
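The figures quoted under conclusion (2) can be checked with a short calculation. The sketch below evaluates the fraction of a group in its basic form for a fixed pK, and then the pH interval between 9% and 91% dissociation when the pK is allowed to rise during the titration. The linear dependence of pK on the degree of dissociation used here is purely an illustrative assumption, not the theoretical treatment of Sections VI and VII, and the function names are hypothetical.

```python
import math


def fraction_basic(ph, pk):
    """Fraction of groups in the basic form at a given pH, for a fixed pK."""
    return 1.0 / (1.0 + 10.0 ** (pk - ph))


def ph_for_dissociation(alpha, pk_mid, pk_rise=0.0):
    """pH at which a fraction alpha of the groups is dissociated.

    ASSUMPTION (illustrative only): the pK rises linearly with alpha,
    from pk_mid - pk_rise/2 at alpha = 0 to pk_mid + pk_rise/2 at alpha = 1.
    """
    pk = pk_mid + pk_rise * (alpha - 0.5)
    return pk + math.log10(alpha / (1.0 - alpha))


def ph_span(pk_mid, pk_rise=0.0, low=1.0 / 11.0, high=10.0 / 11.0):
    """pH interval needed to go from about 9% to about 91% dissociation."""
    return (ph_for_dissociation(high, pk_mid, pk_rise)
            - ph_for_dissociation(low, pk_mid, pk_rise))


if __name__ == "__main__":
    print(fraction_basic(5.6, 4.6))   # one pH unit above the pK
    print(ph_span(4.6))               # pK independent of pH
    print(ph_span(4.6, pk_rise=1.0))  # pK rising by 1.0 over the titration
```

With a constant pK the interval is 2.0 pH units; under the linear assumption, a rise of about 1.2 pK units over the course of the titration widens it to 3.0.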
III. EXPERIMENTAL TITRATION DATA

A. Electrometric Titration

The foundation for any study of hydrogen ion dissociation in proteins is the electrometric titration curve. To obtain such a curve, one begins with a protein solution of known concentration, at an arbitrary reference pH, adds to it varying amounts of a strong acid or a strong base, and then measures the new pH attained. In a separate experiment, or by means of calculations based on similar experiments, one determines how much acid or base is needed to take a solution which does not contain protein, but otherwise has the same initial pH, ionic strength, volume, etc., to the same final pH, ionic strength, volume, etc. The amount of acid or base required for the protein solution is always larger (under most circumstances very much larger) than the amount required for the corresponding solution without protein. The difference between the two amounts is the amount of acid or base which is bound to the protein in going from the reference pH to the final pH: a plot of this quantity versus the final pH is the desired titration curve. In plotting this curve, OH⁻ ions bound are counted as H⁺ ions dissociated, a procedure which is always permissible in aqueous solutions. A sample plot is shown in Fig. 2.

The procedure described in the preceding paragraph will of course measure the number of hydrogen ions bound to or dissociated from all substances which are present in the solution under study. The accuracy of an experimental electrometric titration curve depends to a considerable degree on the absence of buffers, carbon dioxide, and any other substance, other than the protein of interest, which is capable of acting as an acid or base.
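The subtraction of the blank described above can be sketched as follows. The function names and all numerical values are hypothetical; base added is entered as a negative amount of acid, so that OH⁻ ions bound appear automatically as H⁺ ions dissociated.

```python
def protons_bound(acid_to_protein_mol, acid_to_blank_mol, protein_mol):
    """Moles of H+ bound per mole of protein on going from the reference
    pH to the final pH.

    acid_to_protein_mol: strong acid added to the protein solution (mol);
                         strong base is entered as a negative amount.
    acid_to_blank_mol:   acid needed to bring the protein-free solution
                         to the same final pH (same sign convention).
    protein_mol:         moles of protein present.
    """
    if protein_mol <= 0:
        raise ValueError("protein_mol must be positive")
    return (acid_to_protein_mol - acid_to_blank_mol) / protein_mol


def titration_curve(points, protein_mol):
    """Build [(final_pH, protons_bound), ...] sorted by pH from tuples of
    (final_pH, acid_to_protein_mol, acid_to_blank_mol)."""
    return sorted((ph, protons_bound(a, b, protein_mol)) for ph, a, b in points)


if __name__ == "__main__":
    # HYPOTHETICAL measurements on 2.0e-6 mol of protein.
    measurements = [
        (3.0, 5.0e-5, 1.0e-5),    # acid added
        (9.0, -3.0e-5, -1.0e-7),  # base added, entered as negative acid
    ]
    for ph, h in titration_curve(measurements, protein_mol=2.0e-6):
        print(f"pH {ph:4.1f}: {h:+7.2f} H+ bound per molecule")
```

A negative ordinate in the output means that hydrogen ions have been dissociated relative to the reference pH.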
Titration studies are nearly always carried out so as to maintain the same ionic strength and protein concentration throughout the curve, but this is not essential for all applications. Titration curves are always dependent on ionic strength, and a curve which is not obtained at constant ionic strength can be duplicated only if the ionic strength changes are duplicated. Titration curves are often independent of protein concentration, but will depend on the concentration whenever the possibility of association between protein molecules exists.
FIG. 2. Titration curve of β-lactoglobulin at ionic strength 0.15 and 25°C. The alkaline branch is time-dependent (cf. Fig. 12), and the figure shows data extrapolated to the time of mixing (t = 0) and to infinite time. The figure also shows how the curve is divided into acid, neutral, and alkaline regions. Three ordinate scales with different reference points are given. (Data of Y. Nozaki.)
Titration curves may sometimes depend on the time which has elapsed between addition of acid or base and the measurement of pH. (This is true, for instance, of the alkaline part of the curve shown in Fig. 2.) By the same token, the titration curve obtained by addition of successive increments of acid or base to the same protein solution will sometimes differ from the curve obtained by addition of successively larger increments of acid or base, each to a fresh aliquot of the initial protein solution.

Some titration curves or parts of titration curves are independent of the initial pH, i.e., the number of H⁺ ions which are bound in going from, say, a reference point at pH 5 to pH 4, is the same as the number bound in going from pH 5 to pH 4 after starting at a reference point at pH 7. Likewise, the number of protons dissociated in going from pH 4 to pH 5 may be the same as the number bound in going from pH 5 to pH 4. However, in many instances the situation is not so simple. The titration curve shown in Fig. 2, for example, would be obtained from any initial pH between pH 2 and pH 9.75. A different curve is obtained after exposure to any pH greater than 9.75. (See Section V for further discussion of reversibility.)
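One simple way to test the kind of reversibility just described is to compare a back-titration with the forward curve point by point. The sketch below does this by linear interpolation; the function names, the data, and the tolerance of 0.5 proton per molecule are all hypothetical choices made for illustration, not a criterion taken from the text.

```python
def interpolate(curve, ph):
    """Linearly interpolate a titration curve [(pH, protons_bound), ...],
    sorted by increasing pH with distinct pH values, at the given pH."""
    for (ph0, h0), (ph1, h1) in zip(curve, curve[1:]):
        if ph0 <= ph <= ph1:
            t = (ph - ph0) / (ph1 - ph0)
            return h0 + t * (h1 - h0)
    raise ValueError("pH outside the range of the curve")


def is_reversible(forward, backward, tolerance=0.5):
    """True if every point of the back-titration lies within `tolerance`
    (protons per molecule) of the forward curve at the same pH."""
    return all(abs(interpolate(forward, ph) - h) <= tolerance
               for ph, h in backward)


if __name__ == "__main__":
    # HYPOTHETICAL data: protons bound per molecule relative to pH 6.
    forward = [(4.0, 8.0), (5.0, 3.0), (6.0, 0.0)]
    retraced = [(4.5, 5.4), (5.5, 1.6)]
    shifted = [(4.5, 8.0)]
    print(is_reversible(forward, retraced))
    print(is_reversible(forward, shifted))
```

A time-dependent region such as the alkaline branch of Fig. 2 would need the comparison to be made at matching elapsed times as well, which this sketch does not attempt.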
B. Reference Points

We have described titration curves as records of the number of hydrogen ions attached to a protein molecule at any pH, relative to the number attached at an arbitrary reference pH. It is advantageous however to choose as reference point a position on the titration curve which has physical significance. There are three such positions:

(1) Point of zero net proton charge. If an aqueous protein solution is passed through a mixed-bed ion-exchange resin column (Dintzis, 1952), the solution is freed entirely of all small ions except H⁺ and OH⁻. The emerging solution is called isoionic.¹ The protein molecules in it usually have a very low average charge, often negligibly different from zero. When a neutral salt is added to such a solution to adjust the ionic strength to whatever value is desired, the net molecular charge may alter because the ions of the salt may be bound. However, only minute numbers of protons are bound or dissociated: the net proton charge, which is defined as the average molecular charge due to bound hydrogen ions, or to the presence of groups from which hydrogen ions have been dissociated, usually remains close to zero. We shall discuss this topic in more detail in Section IX, B. What is important here is that the net proton charge of such a solution can always be calculated exactly from the pH of the solution, as Section IX, B will show. Since the titration curve itself tells us how big a change in pH is needed to bind or dissociate any given number of hydrogen ions, it becomes a simple matter to calculate the pH at which the net proton charge is truly zero. Like all aspects of the titration curve, this pH will usually be dependent on the ionic strength.

¹ The distinction between the isoelectric and isoionic states of a protein was first made in a classic paper by Sørensen et al. (1926). Three definitions of the isoionic point were proposed, one of these being the stoichiometrically defined point which we have called the point of zero net proton charge. The other two were operational definitions (summarized by Linderstrøm-Lang and Nielsen, 1959). The term “isoionic point,” as used here, corresponds to one of these two operational definitions, chosen because it always permits calculation of the point of zero net proton charge, which is the only parameter of real interest in the analysis of titration curves. The same choice has been made by Scatchard and Black (1949).

We have pointed out that the net proton charge of an isoionic solution is nearly always close to zero. It can be shown from the equations given in Section IX, B that it is in fact negligibly different from zero if (a) the isoionic point falls between pH 5 and pH 9, and (b) the protein concentration is 1 gm/100 ml or larger. Under these conditions the isoionic point and the point of zero net proton charge are indistinguishable. (These conditions apply to the β-lactoglobulin curve of Fig. 2, for example.) For this reason the terms “isoionic point” and “point of zero net proton charge” are often used interchangeably. It is important to be aware of the difference when proteins such as lysozyme (isoionic point at pH 11) are being studied, and in general whenever protein solutions of low concentration are being used.

It is to be noted that the point of zero net proton charge can be determined only for proteins which can be deionized without precipitation or other change. The point would also have little significance (and probably could not in any event be determined unequivocally) if the pH lies in a region where the titration curve is not behaving reversibly.

(2) Point of maximum proton charge. At relatively high ionic strength (0.1 or above) a definite acid end point of the titration curve can nearly always be established, an example being provided by Fig. 2. The point represents a plateau of the titration curve, a range of pH within which the number of hydrogen ions attached to the protein molecule remains unchanged. The number of attached hydrogen ions is effectively a “maximum” number: more could be bound only at much greater acidity, where the integrity of the protein molecule would probably be destroyed. In terms of the “expected” pK values of Table I, the acid end point is the point where all titratable groups listed there, except the phosphate group, are expected to have been converted to their acidic forms.
(Considering that most proteins have a large positive charge at the acid end point, the expected pK1 for the phosphate group would be zero or less.)
(3) Point of minimum proton charge. The shape of the titration curve at high pH suggests that a similar end point is being approached there, and we have indicated its probable location by the dashed line of Fig. 2. Unfortunately, the experimental curve usually cannot be extended to the vicinity of this end point, because irreversible degradation of the protein molecule sets in when a pH of 12 is approached. Thus the alkaline end point, representing a position of minimum proton charge, cannot be defined as precisely by experimental measurement as the acid end point. In terms of the pK values of Table I, the alkaline end point of the titration curve is the point where all titratable groups listed in the table, except guanidyl groups, are expected to have been converted to their basic forms.
To change the reference point of a titration curve from an arbitrary reference pH to one of the reference points just defined simply involves a
CHARLES TANFORD
change in the zero point of the ordinate scale of the titration curve: the difference between the scale values at any two pH’s must remain the same regardless of the reference point used. For illustration, Fig. 2 shows three ordinate scales based on three different reference points.
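The change of reference point described above amounts to subtracting a constant from every ordinate value. The following is a minimal sketch of that operation, not a procedure taken from the original work; the function name and the numerical data are hypothetical and serve only as illustration.

```python
def rebase_titration_curve(ph_values, h_bound, reference_ph):
    """Return the ordinate values shifted so that they read zero at reference_ph.

    The measured point nearest to reference_ph is taken as the new zero.
    """
    nearest = min(range(len(ph_values)),
                  key=lambda i: abs(ph_values[i] - reference_ph))
    offset = h_bound[nearest]
    return [h - offset for h in h_bound]


# Hypothetical curve, recorded relative to an arbitrary starting pH of 8.
ph = [2.0, 4.0, 6.0, 8.0, 10.0]
h_relative = [20.0, 12.0, 3.0, 0.0, -9.0]

# The same curve re-expressed relative to the acid end point (assumed here at pH 2).
h_from_acid_end = rebase_titration_curve(ph, h_relative, reference_ph=2.0)
```

Only the zero of the scale moves: the difference between the ordinate values at any two pH's is the same before and after the shift.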
C. Spectrophotometric Titration for Phenolic Groups
Electrometric titration is simply a measure of the total number of hydrogen ions bound to or dissociated from a protein molecule, with no discrimination between the various kinds of acidic or basic groups with which these hydrogen ions may be associated. Thus there is a need for alternative methods which focus specifically on hydrogen ions associated with particular groups of the protein molecule. One such method which has been used extensively is based on the change in the ultraviolet absorption spectrum which occurs when a tyrosine phenolic group dissociates to a phenolate ion. As was first suggested by Crammer and Neuberger (1943), this spectral change can be used to follow the progress of the dissociation of hydrogen ions from tyrosine phenolic groups of proteins, by measurement of absorbance at an appropriate wavelength, usually in the range 290 to 300 mμ.
The ultraviolet spectroscopy of proteins is reviewed elsewhere in this volume (Wetlaufer, 1962), so that no detailed discussion of this method is required. It should be noted, however, that indole groups of tryptophan side chains absorb in the same region of the spectrum as phenolic groups, and that there are spectral shifts, both for indole and phenolic groups, which are not related to hydrogen ion dissociation. Furthermore, a general increase in absorbance in the ultraviolet takes place whenever aggregation of protein molecules increases the amount of light which is lost by Rayleigh scattering. These effects, undesirable from the present point of view, can to some extent be separated from the changes ascribable to titration of phenolic groups by measurement of absorbance over a wider wavelength range, and by auxiliary studies of molecular weight and conformation.
In general the spectrophotometric titration of phenolic groups will yield more accurate results for proteins which have a low tryptophan content and which undergo no conformational change in the pH region in which phenolic groups are titrated.
D. Other Methods
Spectroscopic methods for following the titration of other common titratable groups of protein molecules do not exist. The reason is that the peptide group and the aromatic rings of phenylalanine, tryptophan, and tyrosine side chains absorb strongly in the ultraviolet below 250 mμ, making it essentially impossible to observe the relatively small changes in absorbance which imidazole, sulfhydryl, and other groups undergo in the ultraviolet when their state of ionization is altered. The use of infrared spectra is prevented for similar reasons: not only the various parts of the protein molecule but also the solvent (at least if it is H2O) have overlapping absorption bands, which reduce titration-induced changes to a small fraction of the total absorbance. It has been suggested (Ehrlich and Sutherland, 1954; Susi et al., 1959) that infrared absorption can be used to follow the titration of carboxyl groups if D2O is used as solvent, but no critical examination of this possibility has been made.
Spectroscopic methods can be used to follow the dissociation of hydrogen ions from certain special groups present on prosthetic groups of proteins. An example is the Fe(H2O)+ group of proteins containing ferriheme. The dissociation of this group to Fe(OH) is accompanied by a well-defined change in the visible spectrum, which has been used to follow the reaction by Austin and Drabkin (1935), and subsequently by other workers. The reaction is also accompanied by a change in the magnetic susceptibility of the iron atom, and this effect, too, can be used as a measure of the extent of titration (Coryell et al., 1937).
A method of an entirely different kind has been proposed for following the dissociation of hydrogen ions from the imidazole groups of histidine side chains (Koltun et al., 1958). It is based on the catalysis of the hydrolysis of p-nitrophenylacetate by uncharged (basic) imidazole groups. The rate constant for this reaction is independent of pH. Moreover, the presence of the p-nitrophenylacetate appears not to affect the equilibrium between the acidic and basic forms of the imidazole group. The observed rate of the catalyzed reaction is thus a direct measure of the fraction of imidazole groups in the basic form at any pH.
There is so far only a single recorded application of this method to proteins (myoglobin, studied by Breslow and Gurd, 1962). A disadvantage of the method is that uncharged amino groups also catalyze the hydrolysis of p-nitrophenylacetate, so that the method can be used unequivocally at present only in a region of pH where all amino groups are in the charged acidic form.
Mention should be made, finally, of the existence of indirect methods. Changes in some property which is not directly related to the dissociation of an acidic group may be observed to have a pH-dependence which resembles the pH-dependence of hydrogen ion dissociation from a single group. It may then be postulated that the change in this property is an indirect reflection of the dissociation of a single group, and that the dissociation curve and pK of the group can be obtained from it. The commonest example of this procedure involves the use of ultraviolet difference spectra (Wetlaufer, 1962), but optical rotation and other properties can be used as well. An instance of an application of such methods to proteins is provided by Hermans and Scheraga (1961b).
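Both the catalytic method and the indirect methods lead to a fraction of a group in its basic form at each pH, from which a pK is then read off. The sketch below illustrates this under stated assumptions: the rate observed when all imidazole groups are basic (`rate_fully_basic`) is taken as known, and the group is treated as a single, independently dissociating site, so that electrostatic effects of other charges are ignored. The function names are hypothetical.

```python
import math


def fraction_basic(rate_observed, rate_fully_basic):
    """Fraction of groups in the basic form, assuming rate is proportional to it."""
    return rate_observed / rate_fully_basic


def pk_from_fraction(ph, fraction):
    """pK of a single group from its fraction in the basic form at a given pH.

    Uses pH = pK + log10(f / (1 - f)); valid only for 0 < f < 1.
    """
    return ph - math.log10(fraction / (1.0 - fraction))


# Hypothetical reading: half the maximal rate is observed at pH 7.0.
f = fraction_basic(rate_observed=0.5, rate_fully_basic=1.0)
pk = pk_from_fraction(7.0, f)
```

A pK obtained in this way from a real protein is an apparent value; it will generally vary with pH because of the electrostatic interactions discussed in later sections.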
E. Effect of Solvent
The brief survey of experimental methods which we have given has in general assumed that water or an aqueous salt solution is being used as solvent for the protein. When other solvents are used, two problems arise: (1) The dissociation constants of all acids and bases depend on the solvent being used. The characteristic pK’s of Table I would not apply, for instance, in a dioxane/water mixture. (2) The usual techniques for measuring pH are applicable to aqueous solutions only. The electrolytic cell normally used to measure pH, when standardized by appropriate buffer solutions, yields a value of pH which is not exactly the same as, but is very close to, -log aH+ as defined by other methods (Bates, 1954; Tanford, 1955a). There is no assurance that this will be true in other solvents.
There is a simple way to avoid these problems. One can define a pH scale in a completely arbitrary manner relative to the emf of a suitable cell. One can then relate the pH on this scale to an arbitrarily defined “activity” of hydrogen ions, simply by setting pH = -log aH+. The dissociation constants of model compounds can then be determined in terms of this arbitrary scale. This method has been used by Donovan et al. (1959) for protein titrations in concentrated aqueous solutions of guanidine hydrochloride and of urea, and by Sage and Singer (1962) for titrations in ethylene glycol.
IV. COUNTING OF GROUPS
The simplest information gained from titration curves is a count of the number of groups titrated. This count can be useful regardless of whether the titration curve is reversible and regardless of whether the protein remains native throughout the pH range of titration. The most interesting applications of group counting are in fact to situations where the titration curve depends on time, the direction of titration, and similar factors. The differences observed, between one set of conditions and another, often tell directly how the protein differs under the two conditions.
A. Counting Procedure
Most titration curves, such as that of Fig. 2, consist of three well-separated parts: a steep portion between the acid end point and about pH 5.5, another steep portion between about pH 9 and the alkaline end point, and a relatively flat portion (relatively few groups titrated per pH unit) in between. It is thus quite generally possible to divide the titration curve into three S-shaped portions, as has been done in Fig. 2, and to count separately the groups which titrate in the acid, neutral, and alkaline regions. The separation between the three parts is better at higher ionic strength, and one can also approach closer to the end points at higher ionic strength. Thus group counting is usually best conducted at ionic strengths of 0.1 or above.
Table I shows that organic molecules which have carboxyl groups that resemble the carboxyl groups of proteins generally have pK values from 3.5 to 4.7, when electrostatic effects of charges elsewhere on the molecule are absent, or have been corrected for. The electrostatic effects expected to arise from other charges on a protein molecule may shift these pK values and will broaden the range of pH within which these groups are titrated. However, the major part of all carboxyl groups should be in their acidic form at pH 2 and in their basic form at pH 6 or slightly higher. Furthermore, no other titratable groups are expected from Table I to have pK’s within 2 pK units of the carboxyl groups. (It should be recalled that electrostatic effects are expected to influence all kinds of groups about equally.) It is logical, therefore, tentatively to identify the groups titrated in the acid part of the electrometric titration curve as “carboxyl” groups, the quotation marks signifying that these groups may not all in fact be carboxyl groups. Similarly, the neutral region of the curve may be identified with “imidazole” and “α-amino” groups, the α-amino groups being the N-terminal groups of the polypeptide chains. The alkaline region, finally, may be taken to represent primarily “side-chain amino” and “phenolic” groups, with a contribution from sulfhydryl groups where these are important.
The number of polypeptide chains of a protein molecule is usually known. In most cases each polypeptide chain is terminated by a titratable α-amino group at one end and a titratable α-carboxyl group at the other end.
By subtracting the numbers of these groups from the titration regions in which they are tentatively assumed to occur, we obtain a count of “side-chain carboxyl” groups and of “imidazole” groups, the quotation marks again signifying that the identification rests on an assumption of normal titration characteristics which further investigation may prove to be false.
If the phenolic groups have been titrated separately by the spectrophotometric method, then a count of these groups is available. The change in absorbance at 295 mμ on ionization of a 1 M solution of tyrosine (1-cm light path) is 2300, and, at the present level of approximation, the same figure may be taken for the phenolic groups of proteins, with the understanding that processes other than phenolic ionization may affect the absorbance at 295 mμ (see above). Subtracting the number of phenolic groups from the number of groups titrated in the alkaline region as a whole gives a count of “side-chain amino” groups.
If the point of zero net proton charge is known, then a count is available of the number of hydrogen ions which must be added to go from that point to the acid end point (point of maximum net proton charge). Unless the protein contains phosphate groups, or other groups with pK’s well below that of carboxyl groups, only amino, imidazole, and guanidyl groups (from arginine or homoarginine side chains) will bear charges at the acid end point. Moreover, all of these groups will be positively charged. The number of hydrogen ions required to go from the point of zero net proton charge to the acid end point is then simply a measure of the sum of all these groups, which we shall call ΣN+ for short. If the number of “imidazole” groups and of both kinds of “amino” groups is known from the counting procedures already described, then this number may be subtracted from ΣN+ to yield the number of “guanidyl” groups. It should be noted here that guanidyl groups are normally in their acidic (charged) form throughout the pH range covered by the titration curve of a protein. The count of these groups as obtained here is essentially a count of the number of carboxyl and other groups which must be titrated to neutralize the positive charge present on guanidyl groups.
It should also be noted that any metal ions coordinated to a protein may contribute to the maximum charge. A ferric iron atom coordinated to a heme protein, for example, normally bears a charge of +1 at low pH. Phosphate groups, as previously mentioned, would have a charge of -1 at the acid end point. These charges, where present, will all make a contribution to ΣN+.
The division of the titration curve into three parts is subject to some arbitrariness. The uncertainty in the count of groups titrated in the acid and alkaline regions is typically about 5%. The uncertainty in the count of groups titrated in the neutral pH range is numerically the same, but, since there are generally fewer groups in the neutral region, it represents a larger percentage of the number of groups.
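The bookkeeping of the counting procedure can be summarized in a few lines of arithmetic. The sketch below is offered as an illustration only. It assumes one titratable α-amino and one α-carboxyl group per polypeptide chain, and no sulfhydryl groups, phosphate groups, or bound metal ions; the function names and all numerical inputs are hypothetical. The figure of 2300 is the absorbance change per mole per liter at 295 mμ quoted in the text.

```python
DELTA_ABSORBANCE_PER_MOLAR = 2300.0  # at 295 millimicrons, 1-cm path (from the text)


def phenolic_count(delta_absorbance, protein_molarity, path_cm=1.0):
    """Phenolic groups ionized per protein molecule, from the absorbance change."""
    return delta_absorbance / (DELTA_ABSORBANCE_PER_MOLAR * protein_molarity * path_cm)


def count_groups(acid, neutral, alkaline, chains, phenolic, sum_n_plus):
    """Tentative group counts from the three titration regions.

    acid, neutral, alkaline: groups titrated in each region of the curve.
    chains: number of polypeptide chains (one alpha-amino, one alpha-carboxyl each).
    sum_n_plus: hydrogen ions bound between zero net proton charge and the acid end point.
    """
    side_chain_carboxyl = acid - chains
    imidazole = neutral - chains
    side_chain_amino = alkaline - phenolic
    guanidyl = sum_n_plus - (imidazole + chains + side_chain_amino)
    return {
        "side-chain carboxyl": side_chain_carboxyl,
        "imidazole": imidazole,
        "side-chain amino": side_chain_amino,
        "guanidyl": guanidyl,
    }


# Hypothetical single-chain protein at 1.0e-4 M showing an absorbance change of 0.92.
n_phenolic = round(phenolic_count(0.92, 1.0e-4))
counts = count_groups(acid=11, neutral=5, alkaline=16, chains=1,
                      phenolic=n_phenolic, sum_n_plus=20)
```

Each number produced in this way carries the quotation marks of the text: it is a count of groups that titrate in a given region, not a proof of chemical identity.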
One way of refining the division of the titration curve into the three regions is based on the fact that carboxyl groups have a heat of dissociation close to zero, whereas imidazole groups have a heat of dissociation of about 6 kcal/mole. An apparent heat of dissociation may be measured from titration curves at two or more different temperatures

$$\Delta H_{\mathrm{app}} = -2.303\,R\left[\frac{\partial\,\mathrm{pH}}{\partial(1/T)}\right]_{\bar{r}} \qquad (1)$$

the pH being measured at the same state of titration (same value of $\bar{r}$, see Fig. 2) at each temperature. This value of $\Delta H_{\mathrm{app}}$ will generally be quite close to the true heat of dissociation, and may thus be used to define more sharply the transition from titration of “carboxyl” groups to titration of “imidazole” groups. An example is provided by Fig. 3. The same method cannot be used to define the transition from the neutral range to the alkaline range, because α-amino groups (which titrate at the upper end of the neutral region) would be expected to have the same ΔH (about 11 kcal/mole) as the ε-amino groups of the alkaline region. Also, the phenolic groups of the alkaline region are expected to have ΔH ≈ 6 kcal/mole, which is similar to the value for imidazole groups. (Heats of dissociation will be considered again in Section VI, E.)

FIG. 3. Apparent heat of dissociation in serum albumin, plotted against the number of groups titrated. The break between ΔH = 1 kcal/mole and ΔH = 7 kcal/mole occurs very close to $\bar{r}$ = 100, indicating the presence of a total of 100 carboxyl groups per molecule. (Tanford et al., 1955b.)

Other methods of refinement of the count of groups exist. The most important one occurs as an adjunct to the semiempirical analysis of the shape of each part of the titration curve, to be discussed in Sections VI, B and VI, C. It is often impossible to achieve a self-consistent interpretation with the numbers deduced from visual examination of the curve, as described here. A self-consistent interpretation may then be obtained by minor alterations in these numbers.
It may be noted finally that more classes of titratable groups than have been discussed so far may be discernible. Figure 4 is a particularly striking example. It shows that the phenolic groups of chymotrypsinogen may, simply on inspection, be divided into three separate classes.
B. Difference Counting
Counting of titratable groups is particularly interesting when the titration curve for a protein depends on the conditions under which it is determined, as for instance in the example of Fig. 2, where the titration curve above pH 9.7 depends on time. What is the physical meaning of the difference between the zero-time curve of Fig. 2, presumably representing a continuation of the titration curve of the native protein, and the lower curve, which represents the titration after some slow molecular change has gone to completion?
Titration curves provide one of the simplest methods of detecting the occurrence of conformational change, and numerous examples will be cited. To avoid cumbersome terminology we shall call striking conformational changes of this kind “denaturation,” and shall refer to the altered protein as “denatured.” In most instances a combination of physical and chemical measurements is required to specify precisely the nature of a conformation change, and a detailed description is therefore beyond the scope of this paper, which is limited to information derived from titration curves. The term denaturation is to be taken as covering a variety of possible changes in conformation.

FIG. 4. Spectrophotometric titration of the phenolic groups of α-chymotrypsinogen (Wilcox, 1961). The native curve was obtained at 25°C in 0.1 M KCl. The arrows indicate time-dependent data. The upper curve was obtained at 25°C in 6.4 M urea, containing 0.1 M KCl. Under these conditions the protein is denatured. The total number of groups titrated is 4, a change in absorbance of about 2000 at 295 mμ corresponding to titration of a single phenolic group.

Figure 2 is one type of situation in which titration curve differences are observed: in this case it is the difference between a native and a denatured protein. Differences of this kind may also be observed by examining the protein in different solvents, or before and after it has combined with a metal ion, or before and after it has reacted with oxygen, or before and after it has been subjected to the action of an activating enzyme, etc.
Titration curve differences may be of three types. (1) The total number of groups titrated has altered, i.e., new titratable groups have appeared. (2) The total count of groups remains the same, but their division into the classes enumerated above has altered, e.g., some “imidazole” groups may have become “carboxyl” groups. (3) The numbers counted in each of the classes remain the same, but the shape of the curve is altered. With the first two types of difference, the S-shaped curves representing titration of any one of the classes of groups differ in the way shown in Fig. 5. If it is merely the shape of the curve which differs, then the difference will appear as in Fig. 6.

FIG. 5. Difference between titration curves for a given type of group in two different states of a given protein. The difference lies in the number of groups available for titration. The top figure shows the actual titration curves, the lower figure a plot of the difference between them.

Unfortunately, one cannot always tell which type of difference is involved. The alkaline region of the zero-time titration curve of β-lactoglobulin (native protein) shown in Fig. 2 cannot be carried above pH 11, because the rate of denaturation becomes too rapid. The segment which is available shows the titration in the alkaline region of about half as many groups as are titrated altogether in this region after denaturation.
The curve for the native protein is also less steep. It is not possible, however, to tell whether an extension of this curve would continue with its same moderate slope until the total number of groups titrated became the same as for the denatured protein (analogous to Fig. 6) or whether it would level off earlier (analogous to Fig. 5).

FIG. 6. Difference between titration curves for a given type of group in two different states of a given protein. The difference lies in the shape of the curve, the number of groups available being the same. The top figure shows the actual titration curves, the lower figure a plot of the difference between them.

Titration curve differences may be measured directly by means of a pH-stat (Jacobsen et al., 1957). We begin with the protein in its initial native state at a particular pH, and then allow denaturation, metal binding, or whatever other reaction we wish to study, to occur. The pH-stat is designed to add acid or base so as to maintain the pH unchanged, and the total amount required between the beginning and end of the reaction is then a measure of the difference between the corresponding titration curves at that pH. Measurements made at a series of pH values will then produce difference titration curves such as shown in the lower parts of Figs. 5 and 6. By themselves these difference curves are less informative than the complete titration curves. They may however be more precise, and thus are often useful in conjunction with the complete curves. They also allow measurement of the rate of reaction at the same time as the total difference is observed (see Section V, C).
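The two kinds of difference curve can be illustrated with an idealized model. The sketch below assumes that a class of groups consists of identical, independent sites with a single pK, so that the broadening of real protein curves by electrostatic interaction is ignored; the group numbers and pK values are hypothetical.

```python
def groups_titrated(ph, n_groups, pk):
    """Number of groups that have lost their proton at a given pH (idealized class)."""
    return n_groups / (1.0 + 10.0 ** (pk - ph))


def difference_curve(ph_values, initial, final):
    """Difference (final minus initial) between two titration curves.

    initial and final are (n_groups, pk) pairs describing the two states.
    """
    return [groups_titrated(ph, *final) - groups_titrated(ph, *initial)
            for ph in ph_values]


ph_values = [2.0 + 0.5 * i for i in range(25)]  # pH 2 to 14 in steps of 0.5

# Difference in the number of groups available (5 against 10, same pK).
number_difference = difference_curve(ph_values, (5, 10.0), (10, 10.0))

# Difference in shape only (same number of groups, displaced along the pH axis).
shape_difference = difference_curve(ph_values, (10, 10.5), (10, 9.5))
```

In this model a change in the number of groups gives a difference curve which rises to a plateau equal to the number of new groups, whereas a change confined to the shape gives a difference which passes through a maximum and returns to zero at both ends of the region.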
An example of direct difference counting, in a situation where the overall titration curve is not known, is provided by Fig. 7. It shows the difference between the titration curves of active carboxypeptidase A, which contains Zn++, and the inactive zinc-free protein, as determined by Coleman and Vallee (1961). The measurements were made by the pH-stat method, and confined to a relatively narrow pH range. The figure shows a difference of two groups per bound Zn++ ion, with pK values of 7.7 and 9.1. The two groups involved lose their protons when Zn++ is bound. In the absence of a complete titration curve a completely unequivocal interpretation cannot be made, but it is reasonable to suppose that Zn++ is chelated to two basic groups which are prevented from combining with hydrogen ions as long as the metal remains attached. When Zn++ is removed, these groups bind hydrogen ions with pK values of 7.7 and 9.1. The experiments were carried out in 1 M NaCl, so that electrostatic effects should be suppressed, and pK values observed should correspond closely to those of Table I. Independent information on the strength of Zn++ ion binding suggests that the ion is chelated to a site containing a basic nitrogen atom and a sulfide group. The observed pK’s are compatible with such a binding site, the pK of 7.7 corresponding to an α-amino group (or an imidazole group), while the pK of 9.1 is the expected value for a thiol group.

FIG. 7. … (CPD) containing Zn++, and the zinc-free apoenzyme. The data were obtained with a pH-stat, measuring the amount of base required to maintain constant pH as zinc was added to apoenzyme. The data are for 25°C, ionic strength 1.0. The curves are marked with pK values of 7.7 and 9.1.
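The interpretation given for carboxypeptidase can be put in numerical form. The sketch below assumes that the two groups titrate independently in the zinc-free protein with the pK values of 7.7 and 9.1 quoted in the text, and that bound Zn++ keeps both groups entirely free of hydrogen ions; the function name is hypothetical, and the calculation is an illustration rather than the analysis of Coleman and Vallee.

```python
def protons_displaced_by_zinc(ph, pk_values=(7.7, 9.1)):
    """Hydrogen ions released per Zn++ bound at a given pH.

    Each group contributes the fraction of it that was protonated in the
    zinc-free protein, 1 / (1 + 10**(pH - pK)).
    """
    return sum(1.0 / (1.0 + 10.0 ** (ph - pk)) for pk in pk_values)


# Hypothetical series of pH-stat settings.
ph_settings = [6.0, 7.0, 7.7, 8.4, 9.1, 10.0, 11.0]
displaced = [protons_displaced_by_zinc(ph) for ph in ph_settings]
```

On these assumptions the release approaches two hydrogen ions per zinc ion at low pH and falls to zero at high pH, in steps centered on the two pK values.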
Several other interesting examples will be cited in Section X, where the titration curves of individual proteins will be analyzed. It will be seen from difference counting that the binding site for Zn++ in insulin involves two uncharged imidazole groups, and that the site for iron binding in conalbumin involves three ionized phenolic groups. Difference counting will reveal the presence of anomalous carboxyl groups in native β-lactoglobulin and lysozyme. It will show that deviations from expected pK values, where they occur, are generally characteristic of the native conformation of a protein and that they disappear on denaturation. (One example is provided by Fig. 4. The three distinguishable classes of phenolic groups of chymotrypsinogen disappear on denaturation.) Difference counting will be seen to provide evidence concerning the action of thrombin on fibrinogen, and concerning the action of acid and base in the liberation of gelatin from collagen.
Group counting may be used of course to study any chemical modification of proteins. A recent example is provided by the use of difference counting in a study of the effect of photo-oxidation on proteins (Vodrazka et al., 1961).

V. REVERSIBILITY; THERMODYNAMIC AND KINETIC ANALYSIS
A. Reversibility and Time-Dependence
The dissociation of hydrogen ions from the model compounds discussed in Section II is ordinarily a very rapid reversible reaction. Measurements by ordinary methods represent thermodynamic equilibrium. They are independent of time and independent of the direction in which the reaction is carried out. Large sections of protein titration curves are often equally time-independent and reversible, as, for instance, the acid part of the titration curve of β-lactoglobulin shown in Fig. 2. Any such section of the titration curve will again represent thermodynamic equilibrium and it may be subjected to thermodynamic analysis, as outlined in Sections VI and VII.
On the other hand, a titration curve may depend on the time between addition of acid or base and the pH measurement (as in the alkaline branch of Fig. 2). When this happens, the curve will also in general be irreversible.
A titration curve may appear to be time-independent but irreversible, especially if a continuous titration method is employed in which successive increments of acid or base are added to the same solution for each successive pH measurement. A hypothetical example is shown in Fig. 8. When this situation occurs, a careful rerun of the curve, in which each experimental point is obtained with an entirely fresh solution, will usually show time-dependence, as shown by curves 3, 4, and 5 of Fig. 8.
In many time-dependent situations a zero-time reversible curve (curve 6 of Fig. 8) may be obtained, i.e., reversed points obtained by keeping a solution at an extreme pH for various lengths of time, and then extrapolating to zero time of exposure to the extreme pH, may coincide with a forward titration curve, each point of which also represents an extrapolation to zero time.
In the same situation, the protein represented by curve 2 of Fig. 8 will usually be different from the native protein by criteria such as optical rotation or ultraviolet absorption spectrum, i.e., it will represent titration of a denatured state of the protein. In some instances, curve 2 itself may be reversible, i.e., the denatured protein may be a stable product in rapid equilibrium with its environment. On the other hand, time-dependent reversion to the native form may occur, or further slow denaturation reactions may take place.

FIG. 8. Hypothetical titration curves illustrating time-dependence and irreversibility. Curve 1 is an apparently time-independent curve, obtained by continuous titration, waiting several minutes for each successive pH reading. Curve 2 is the reverse titration curve, beginning at the acid end point. Curve 3 is the forward titration curve obtained by flow methods, each pH being measured on a freshly mixed solution within seconds of mixing. Curves 4 and 5 are obtained from freshly mixed solutions with longer time intervals between mixing and measurement. Curve 6 is the titration curve which one might speculatively draw to represent “instantaneous” titration of the native protein. (Labels on the plot: “Start with Native Protein”; “End with Denatured Protein.”)

The zero-time reversible curve can usually be obtained over a limited range of pH only. In the hypothetical example of Fig. 8, the molecular change which leads to curve 2 as the observed titration curve would become too rapid for extrapolation to zero time well before the acid end point of
the curve is reached. It is often possible in such situations to extend the zero-time reversible curve by using a lower temperature, or a different ionic strength, to reduce the rate of denaturation.
An example is provided by the titration of ferrihemoglobin. Figure 9 shows the titration data obtained by Steinhardt and Zaiser (1953) at ionic strength 0.02, at 25°C. It is not possible to see enough of the native titration curve to decide whether it differs from the back titration curve chiefly in the number of titratable groups or chiefly in the shape. Figure 10 shows similar data for the cyanide complex of ferrihemoglobin, at ionic strength 0.3, at 0.5°C (Steinhardt et al., 1962). A difference in the count of groups is now evident on inspection.
Another example is provided by the titration of the phenolic groups of ribonuclease (Tanford et al., 1955a), in 0.15 M KCl. At 25°C the data strongly suggest that only three out of six phenolic groups are titrated in the native protein. At 6°C this conclusion becomes unequivocal (Fig. 11).

FIG. 9. The acid region of the titration curve of ferrihemoglobin at ionic strength 0.02, at 25°C. The lower curves in the inset represent the difference between the 3-sec curve and the 2- to 22-hr curve, the dotted line incorporating a correction (discussed in the original paper). The upper curves in the inset are not of interest for the present discussion. From Steinhardt and Zaiser (1953).

A summary of information on the reversibility of protein titration curves,
obtained from recent detailed studies, is given in Table II. It is seen that some proteins, like chymotrypsinogen, may be titrated reversibly and instantaneously over a wide range of pH. At the other extreme is pepsin, which undergoes autolysis below pH 5 and becomes denatured above pH 6.
The fact that a protein titration curve is reversible over a given range of pH does not necessarily mean that the protein conformation remains unchanged over the same pH range. Several of the proteins listed in Table II undergo conformational change within this pH range. In every example cited, except ovalbumin, the evidence for such change first came from an analysis of titration curves by the methods of Section VI.

FIG. 10. Data similar to those of Fig. 9, but for the cyanide complex of ferrihemoglobin, at lower temperature and higher ionic strength. The curves without experimental points represent similar data for uncomplexed ferrihemoglobin, at the same temperature and ionic strength. The inset shows the two difference curves obtained from these data. From Steinhardt et al. (1962).
B. Thermodynamic Analysis
Thermodynamic analysis can be carried out only for reversible portions of a titration curve. The commonly used methods of approach will be considered in Sections VI and VII. As indicated above it may be possible, over a limited range of pH, to obtain two reversible curves for analysis, one representing the native conformation of the protein, the other representing a denatured state.
FIG. 11. Dissociation of the phenolic groups of ribonuclease at ionic strength 0.15. The dashed lines show regions of time-dependence. Half-filled circles represent measurements after reversal from pH 11.5 (middle curve) and after reversal from pH 12.7 (upper curve). Separate symbols distinguish the data at T = 25°C from those at T = 6°C. From Tanford et al. (1955a).

C. Kinetic Analysis
Whenever a titration curve depends on time, a kinetic analysis becomes possible. As indicated above, the time-dependence usually reflects a change in protein conformation, and a kinetic study is then a measure of the rate of change of conformation. As an example, Fig. 12 shows pH-stat records of the uptake of hydroxyl ions by β-lactoglobulin in the alkaline region (Nozaki and Bunville, 1959). These are a part of the data from which the two alkaline branches of the titration curve of Fig. 2 were constructed. They also provide, however, a measure of the rate of denaturation of the protein. No systematic review of kinetic studies of this kind will be attempted in this paper, since they should logically be considered in conjunction with other methods of following the rate of denaturation.
TABLE II
Reversibility of Protein Titration Curves at 25°C
Chymotrypsinogen Conalbumin α-Corticotropin Hemoglobin Insulin β-Lactoglobulin Lysozyme Myoglobin Ovalbumin Pepsin Ribonuclease Serum albumin
9.4 6.8 8.6 7.0 5.6 5.3 (11.1) 7.5 4.9

+3 or < -3. A given molecule will have a fluctuating charge, varying about the mean charge $\bar{Z}_H$. Titration curves cannot measure these variations in charge. Only $\bar{Z}_H$ is determinable. However, theoretical equations relate charge fluctuations to the titration curve, so that they can be calculated. One obvious result of charge fluctuations is that $(\bar{Z}_H)^2$ is not the same as $\overline{Z_H^2}$, and this is in fact the parameter by which the spread of molecules among different values of $Z_H$ is usually characterized. The difference between $\overline{Z_H^2}$ and $(\bar{Z}_H)^2$ is identical with the difference between $\overline{\nu^2}$ and $(\bar{\nu})^2$ given by Eqs. (22) and (24). Differentiating Eq. (22), we get

[Equation (36): illegible]
As is shown elsewhere (Tanford, 1961a), the left-hand side of Eq. (36) may be evaluated by use of the Linderstrøm-Lang equation, giving
$$\overline{Z_H^2} - (\bar{Z}_H)^2 = \sum_j n_j x_j (1 - x_j) \tag{37}$$
where $n_j$ is the number of groups of class j, and $x_j$ the degree of dissociation of groups of that class, as given by Eq. (6). It should be noted that Eq. (36) is given incorrectly in the reference just cited (Tanford, 1961a).
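As a numerical illustration of Eq. (37), the following sketch evaluates the sum of n_j x_j (1 - x_j) over classes of groups. It assumes that reading of Eq. (37) and, purely for simplicity, takes x_j from the uncorrected mass-action expression rather than from Eq. (6), so that electrostatic interaction is neglected. The function names and the group classes in the example are hypothetical.

```python
def degree_of_dissociation(pH, pK):
    """Fraction of groups dissociated, ignoring electrostatic interaction (assumption: w = 0)."""
    return 1.0 / (1.0 + 10.0 ** (pK - pH))


def charge_variance(groups, pH):
    """Sum of n_j * x_j * (1 - x_j) over classes given as (n_j, pK_j) pairs."""
    total = 0.0
    for n_j, pK_j in groups:
        x_j = degree_of_dissociation(pH, pK_j)
        total += n_j * x_j * (1.0 - x_j)
    return total


# Hypothetical protein: 10 carboxyl-like groups (pK 4.5) and 4 imidazole-like groups (pK 6.5).
example_groups = [(10, 4.5), (4, 6.5)]
for pH in (4.5, 6.5, 9.0):
    print(pH, round(charge_variance(example_groups, pH), 3))
```

Each class contributes most near its own pK, where x_j = 1/2 and its term equals n_j/4; far from every pK the computed spread of charge falls toward zero.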
X. RESULTS FOR INDIVIDUAL PROTEINS
A. Chymotrypsinogen
Titration curves for bovine α-chymotrypsinogen have been determined under a variety of conditions by Wilcox (1961). The results are summarized in Table IX. The spectrophotometric titration of the phenolic groups is shown in Fig. 4. It is seen that all four phenolic groups are titrated together in 6.4 M urea, whereas only two are titrated in the native protein. Even these two have sufficiently different pK's so that the titration of one is essentially complete before that of the second has begun. The count of side-chain amino groups corresponds to the analytical figure for lysine side chains in a denaturing solvent (8 M urea). In the native protein, however, three of the thirteen lysine groups cannot be observed to titrate. The maximum positive proton charge (ΣN+) is, however, the same in the native and denatured states, within the uncertainty of about
±1 in the determination of the quantity in the native state. This means presumably that the native structure stabilizes the charged form of three lysine side chains, preventing their dissociation to an uncharged state. The count of “carboxyl” groups is larger than expected from analysis. No explanation has been offered. The usual explanation for this observation is that the analytical figure for free carboxyl groups is too small because of too high an estimate for amide nitrogen. This explanation is unlikely here because asparagine and glutamine were determined by direct analysis.

TABLE IX
Titration Data for α-Chymotrypsinogen^a
Type of group
α-Carboxyl Side-chain carboxyl Imidazole α-Amino Phenolic Side-chain amino Guanidyl ΣN+
Titration Analysis
Native protein
2 1
4 13 4 20
(native protein)
14
-
3
3
G.7c
2d
4 -13 10.5
9 . 7 , 10.6” -
-14
l j
Denatured proteinb
[>Kin&, 25°C
10d -20
_.
-
~~~
a Molecular weight is 25,000. Titration data are from Wilcox (1961), analytical data from Wilcox et al. (1957).
b Data were obtained in 4 M guanidine hydrochloride and in 6.4 M or 8 M urea.
c The same pKint was assigned to all three groups which titrate in the neutral region.
d More phenolic and amino groups are titrated slowly above pH 12 as the protein becomes denatured. See Fig. 4.
e Each phenolic group has a different pK.
Wilcox (1961) has also obtained a partial titration curve for a guanidinated derivative of chymotrypsinogen (lysine side chains converted to homoarginine). The only major difference was the disappearance of the ten side-chain amino groups. No spectrophotometric titration was carried out, but the electrometric titration curve in the alkaline region suggests that the two phenolic groups which are inaccessible to titration in the native protein are still inaccessible in the guanidinated derivative. The titration curve of native chymotrypsinogen is reversible between pH 2.5 and pH 11. Analysis by the methods of Section VI shows that the neutral pH region can be described by a single intrinsic dissociation constant for all three groups which titrate in the region, together with a w
value of 0.065 (at ionic strength 0.1). As Table III shows, this is a reasonable value for a compact globular protein of the size of chymotrypsinogen. The carboxyl region of the titration curve cannot be described by a single pK and a reasonable value of w. It is likely that the explanation lies in the fact that chymotrypsinogen is rich in basic nitrogen groups (isoionic point is at pH 9.4), so that the carboxyl groups are titrated in an environment rich in positive charges. If these charges are unevenly distributed with respect to the carboxyl groups, then the latter will appear not to have identical pKint values, as was discussed in Section VI, D. The major problems which this titration curve poses are identification of the two unexplained groups in the carboxyl region, and an explanation for the failure to titrate three of the thirteen side-chain amino groups in the native protein.
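The methods of Section VI are not reproduced in this part of the text. The sketch below assumes that they rest on the Linderstrøm-Lang relation in its usual form, pH - log[x/(1 - x)] = pKint - 0.868 w Z, so that a plot of the left-hand side against the charge Z is a straight line of slope -0.868 w and intercept pKint. The values pKint = 6.7 and w = 0.065 are those quoted for the neutral region of chymotrypsinogen; the charges and degrees of dissociation used to build the synthetic points are hypothetical, as are the function names.

```python
import math

LL_FACTOR = 0.868  # approximately 2 / ln(10), as in the usual form of the equation


def apparent_pk(pk_int, w, z):
    """Apparent pK of a class of groups when the protein carries net charge z."""
    return pk_int - LL_FACTOR * w * z


def fit_w_and_pk_int(points):
    """Least-squares line through (Z, pH - log10[x / (1 - x)]).

    points: sequence of (Z, pH, x) with 0 < x < 1. Returns (w, pK_int).
    """
    points = list(points)
    zs = [z for z, _, _ in points]
    ys = [pH - math.log10(x / (1.0 - x)) for _, pH, x in points]
    n = len(points)
    z_mean = sum(zs) / n
    y_mean = sum(ys) / n
    s_zz = sum((z - z_mean) ** 2 for z in zs)
    s_zy = sum((z - z_mean) * (y - y_mean) for z, y in zip(zs, ys))
    slope = s_zy / s_zz
    intercept = y_mean - slope * z_mean
    return -slope / LL_FACTOR, intercept


# Synthetic points generated from the quoted parameters (charges are hypothetical).
synthetic = []
for z, x in [(12.0, 0.2), (11.0, 0.5), (10.0, 0.8)]:
    pH = apparent_pk(6.7, 0.065, z) + math.log10(x / (1.0 - x))
    synthetic.append((z, pH, x))

w_fit, pk_fit = fit_w_and_pk_int(synthetic)
print(round(w_fit, 4), round(pk_fit, 3))
```

A region such as the carboxyl region described above, which cannot be fitted with a single pK and a reasonable w, would show up in this treatment as points that do not fall on one straight line.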
B. Chymotrypsin
Havsteen and Hess (1962) have studied the titration of the phenolic groups of α-chymotrypsin and of its diisopropylphosphoryl (DIP) derivative. The result is similar to that observed with chymotrypsinogen (Fig. 4) in that only two of the four groups are available for titration in the native protein, as well as in the DIP derivative. The data do not have sufficient precision to determine whether the two titratable groups have different pK’s, as they do in chymotrypsinogen, but the wide spread of the titration curve suggests that they do. All four tyrosyl groups are titrated normally in solvents which denature chymotrypsin.

C. Collagen Fibrils
Martin et al. (1961) have shown that the titration curve of suspensions of freshly precipitated collagen fibrils depends markedly on the pH at which aggregation to fibrils takes place. As this pH increases the count of “imidazole” groups decreases, and there is a corresponding increase in the number of groups titrated in the carboxyl region (in which the fibrils go into solution). There is a correlation between the titration characteristics and the occurrence of periodic banding in the fibrils, the incidence of the latter increasing with pH. It is suggested therefore that the banded regions contain uncharged imidazole groups which cannot be titrated as long as the fibrils remain intact.
D. Conalbumin
The titration curve of conalbumin (Wishnia et al., 1961) is complicated by time-dependent reactions in the acid range. These reactions, however, appear to influence only the shape of the titration curve, and the count of
various kinds of groups presented in Table X is taken to apply to the native and modified protein alike. The agreement between the titration count and the result expected from analysis is remarkably good, the only discrepancy being in the figure for the maximum proton charge (ΣN+), where the titration value is seven less than the analytical value for the total of all basic nitrogen groups. No explanation for the discrepancy exists. The small difference between the number of observed “carboxyl” groups and the analytical value is well within the experimental error of the latter. A spectrophotometric titration of phenolic groups was carried out. The

TABLE X
Titration Data for Conalbumin^a
Type of group
α-Carboxyl Side-chain carboxyl Imidazole α-Amino Phenolic Side-chain amino Guanidyl ΣN+
Number of groups
Analysis
8t
Titration 86
131
14
19 52 33 99
11
52
-
92
PKnt
25°C
:1
-
5°C
4.54 7.20
9.41
9.64
-
-
9.85 10.20 -
-
a Molecular weight is 76,600. Titration data are from Wishnia et al. (1961). Analytical data are as given by Wishnia et al. (1961), derived from studies by Lewis et al. (1960).
b Eleven groups are titrated in the native protein; the remaining seven are titrated upon denaturation.
results are qualitatively similar to those for ribonuclease shown in Fig. 11, except that the total number of groups per molecule is larger. Eleven of the eighteen phenolic groups are titrated reversibly in the native protein, another seven appear with accompanying denaturation, and their appearance can be delayed by reducing the temperature. An interesting example of difference counting is provided in the conalbumin study. Conalbumin can bind two iron atoms very tightly and it had been concluded earlier (Warner and Weber, 1953) that each iron atom might be bound to three phenolic groups, which would remain ionized at all pH’s where the iron complex is stable. This earlier conclusion was firmly established by spectrophotometric titration of the iron complex. Only five phenolic groups were titrated between pH 8 and 12, compared to eleven in native iron-free conalbumin. The result shows, incidentally,
that the seven phenolic groups inaccessible to titration in the native protein (presumably buried as un-ionized groups) are still inaccessible in the iron complex. At 25°C the titration curve was reversible and independent of time between pH 4.2 and pH 11.2. By use of a flow method the reversible portion could be extended to somewhat lower pH. At 5°C it was possible to extend it on the alkaline side to pH 12. The treatment of Section VI was applied to the reversible region of the curve. It was possible to account for the major part of this region by using a single value of w, which was somewhat below that calculated by Eq. (4), but the difference is only of the order of 25% and does not suggest that conalbumin is not a globular protein in its native state. Below pH 4 and above pH 11.2 (at 25°C) there is a marked decrease in electrostatic interaction (resembling that shown for serum albumin in Fig. 17), indicative of a reversible transition, presumably to an expanded conformation. It is interesting that the over-all change in conformation which takes place in acid solution is by these measurements found to occur in two parts: a rapid and reversible part when $Z_H \approx 20$ and then a slower irreversible reaction when $Z_H \approx 32$. The intrinsic pK values deduced from the part of the curve which could be fitted with a constant w are shown in Table X. The values are comparable to those found for other proteins (Table V). The heats and entropies for dissociation are given in Table VI and are also found to be unremarkable. The over-all conclusion is that the titratable groups of conalbumin are largely accessible to the solvent, except for the phenolic groups discussed earlier.
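The difference counting described for the iron complex of conalbumin amounts to a short piece of arithmetic, sketched here with the counts quoted in the text (eighteen phenolic groups in all, eleven titrated in native iron-free protein, five in the iron complex, two iron atoms bound). The function and variable names are illustrative only.

```python
def groups_masked_per_metal(titrated_metal_free, titrated_complex, metal_atoms):
    """Groups that stop titrating on complex formation, per bound metal atom."""
    masked = titrated_metal_free - titrated_complex
    return masked / metal_atoms


TOTAL_PHENOLIC = 18          # analytical count
NATIVE_TITRATABLE = 11       # iron-free native protein, pH 8 to 12
IRON_COMPLEX_TITRATABLE = 5  # iron complex, same pH range
IRON_ATOMS = 2

per_iron = groups_masked_per_metal(NATIVE_TITRATABLE, IRON_COMPLEX_TITRATABLE, IRON_ATOMS)
buried = TOTAL_PHENOLIC - NATIVE_TITRATABLE

print(per_iron)  # phenolic groups held ionized by each iron atom
print(buried)    # phenolic groups inaccessible in the native protein
```

The six groups lost on complex formation are separate from the seven buried groups, which is why the latter are described as still inaccessible in the iron complex.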
E. α-Corticotropin
α-Corticotropin (ACTH) has a molecular weight of only 4541 and should contain fewer than twenty titratable groups. The titration curve (of sheep corticotropin) has been studied by Léonis and Li (1959), the results being shown in Table XI. The corticotropin is usually isolated as the trichloroacetate, and the titration was first carried out on this salt. The count of groups was in accord with expectation, except for the presence of one anomalous group in the neutral region. The corticotropin was then deionized by ion exchange, but the anomalous group remained. Moreover, two additional anomalous groups appeared as a result of deionization, one being a “carboxyl,” the other a “side-chain amino” group. Léonis and Li present evidence that the neutral anomalous group is due to an impurity. An unusual feature of the titration curve of corticotropin is the relatively low pH for titration of the guanidine groups. The titration of these groups
is not visually distinct from the titration of the side-chain amino groups, though the distinction becomes apparent on mathematical analysis of the alkaline region of the curve. The titration curve of beef corticotropin (Léonis and Li, 1959) and a partial curve for pork corticotropin (Danckwerts, 1952) are essentially identical to the curve for the sheep product. The titration curve is reversible and has been analyzed by the methods of Section VI. The values of w which are obtained from titration of the phenolic and amino groups at ionic strength 0.1 are near 0.03. A somewhat

TABLE XI
Titration Data for α-Corticotropin^a
Number of groups
Titration
Type of group
pKint
Trichloroacetate salt Deionized sample
α-Carboxyl Side-chain carboxyl Imidazole α-Amino Phenolic Side-chain amino Guanidyl ΣN+
7
8
1 1 l i 2
2
2
3 4 9 }
7"
8"
-
5 or 6*
3
10
a These data are for sheep corticotropin, with molecular weight 4541. Titration data from Léonis and Li (1959); analytical data from Li et al. (1955).
b There are 5 residues of glutamic acid and 2 of aspartic acid. Either one or two of them may be present as amides.
c The titration of guanidyl groups overlaps that of amino groups, so that they cannot be differentiated on inspection.
higher value (0.0'3) is obtained from the titration of carboxyl groups, but it is probable that this is in part due to the fact that some of the carboxyl groups are bunched together on the corticotropin chain: the sequence -Glu.Asp.Asp.Glu- occurs at one point and accounts for more than half the carboxyl groups. As explained in Section VI, C, such close juxtaposition of titratable groups is expected to lead to a high value of w. In any event, the value of 0.03 is far below the value of w expected on the basis of the compact sphere model (Table III), and thus the titration curve clearly indicates that α-corticotropin does not have a globular structure. This conclusion agrees with the results of other measurements which all indicate that this molecule has a flexibly coiled conformation.
As might be expected of a molecule with an essentially random structure, the pKint values which are given in Table XI are essentially normal. It is especially noteworthy that the phenolic groups have a lower pKint than the side-chain amino groups, as is expected from the data of Table I. In most proteins with a typical globular structure this order is reversed (see Table V). The pKint values given in Table XI for the imidazole and α-amino groups must be regarded as quite uncertain, because of the anomalous group which is titrated in the same range of pH.
F. Cytochrome c
Titration curves of oxidized and reduced cytochrome c (from beef heart or horse heart) have been determined by Theorell and Åkeson (1941) and by Paléus (1954). A major problem concerns the titration of the imidazole groups, of which there are three in the protein (Margoliash et al., 1962). According to Theorell and Åkeson, there are two groups titrated with a pK approximately that of imidazole groups. Since cytochrome c has no free α-amino group, a reasonable interpretation is that both of these are in fact imidazole groups, and that the one imidazole group which is not titrated is linked to the heme iron atom of the protein. This result is compatible with the proposal of Margoliash et al. (1959), that the two basic groups coordinated to the heme iron atom are one imidazole group and one lysine amino group. Theorell and Åkeson, on rather weak evidence, suggested that one of the groups titrated in the neutral region is actually not an imidazole group, and Paléus, with more accurate data, comes to the same conclusion, demonstrating quite convincingly that just a single group with pK near 7 (actually pK = 6.8) is being titrated. This would seem to favor the hypothesis that two imidazole groups are coordinated to the heme iron atom. An undecapeptide which can be obtained from peptic hydrolysis has also been subjected to titration (Paléus et al., 1955). This peptide contains the heme group, attached to the polypeptide chain through two half-cystine residues, and possesses one histidine residue, which is thought to be covalently linked to the iron atom. There is not a unique way of interpreting the titration curve. It is possible to arrive at an interpretation which assigns a pK of 3.5 to the imidazole group, but it must be regarded as speculative. Seventeen groups are titrated in the carboxyl region on the basis of a molecular weight of 12,500. This figure agrees with expectation.
There are twelve side-chain carboxyl groups and one terminal α-carboxyl group (Margoliash, 1962). The other four titratable groups represent the two free propionic acid groups of the heme, and the two iron-linked basic groups, which should be freed from their bonds to iron, and titrated, when
the pH becomes sufficiently low. (This process does not lead to separation of the heme from the protein in the case of cytochrome c, because the heme is linked to the protein through its side chains, which form thio-ether bonds to the two half-cystine residues mentioned earlier.) The absorption spectrum of ferricytochrome c changes with pH, and indicates that several hydrogen ions are dissociated with accompanying effect on the electronic structure of the heme iron atom (Theorell and Åkeson, 1941; Boeri et al., 1953). The results are complicated by an influence of chloride ion, and a simple interpretation of all the changes is not possible. However, a change involving two H+ ions, titrated simultaneously at pH 2.1, appears to reflect the dissociation of the basic nitrogen groups from the heme iron atom. In the undecapeptide studied by Paléus et al. (1955), the corresponding change occurs when H+ ions are added or removed at pH 3.4 and 5.8.
G. Fetuin
Fetuin is a glycoprotein which has titratable groups associated with its carbohydrate moiety (sialic acid) in addition to those present on the protein. Spiro (1960) has determined the titration curve of the native protein, as well as that of a preparation from which sialic acid had been removed. The group count differed only in the number of groups assignable to sialic acid. In particular, the value of ΣN+ was the same for both proteins. It is clear from these results that the combination of sialic acid with the protein does not involve any of the basic groups of the protein.
H. Fibrinogen and Fibrin
The difference between the titration curves of fibrinogen and fibrin, in 3.3 M urea solution, has been determined by Mihalyi (1954a). It was found that the reaction fibrinogen → fibrin leads to the production of 3.5 new groups per molecule, with a pK near 8.0. These groups are presumably α-amino groups which arise from the proteolytic nature of the activation process. It is known that two peptide bonds per molecule of fibrinogen are broken in this process, two peptides being split off. At pH 5.5 (the reference point of Mihalyi's data) no net difference in dissociated hydrogen ions is involved, since one end of each split bond will be in the form -COO⁻ and the other end in the form -NH₃⁺. The resulting amino groups will however be titratable, losing their hydrogen ions near pH 8. The fact that 3.5 new groups, rather than two, were observed is perhaps the result of some nonspecific splitting of peptide bonds by thrombin. Mihalyi (1954b) studied in a similar way the difference in the titration curves of fibrinogen and of polymerized (clotted) fibrin. By subtracting from this difference the difference described above between fibrinogen and
fibrin, he was able to compute the difference due to the polymerization of fibrin alone. (The clotting experiments were carried out in 0.3 M KCl, whereas the fibrinogen → fibrin experiment was carried out in the presence of urea. Urea was found to have only a small effect on the titration curve of fibrinogen in the pH range of interest. It was assumed that the effect on unpolymerized fibrin would be the same, and the small correction required for the change in solvent was applied on this basis.) The observed change on clotting consisted of the appearance of three or four new titratable groups with a pK of 7.0 and the disappearance of three or four groups which originally had a pK of 8.2. Several different explanations have been proposed by Mihalyi and others, none of which carries conviction. Scheraga and Laskowski (1957) consider that the altered titration curve arises not from the disappearance of groups with pK of 8.2 and the appearance of new groups with pK of 7.0, but from a small pK shift of a larger number of groups. This may well be correct, as the original interpretation of Mihalyi depends on ignoring some of the experimental points. However, the elaborate mechanism used by Scheraga and Laskowski to account for such a shift in pK is an improbable one. It may be that nothing more complicated is involved than the necessity to maintain charge neutrality of areas of contact between fibrin molecules in the polymer.
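The two readings of the clotting difference curve can be compared numerically. The sketch below is an illustration only: it ignores electrostatic interaction, takes 3.5 as a stand-in for "three or four" groups, and uses hypothetical values (14 groups, pK 7.7, shift of 0.2 unit) for the alternative of a small pK shift in a larger number of groups. None of these parameter choices comes from the studies cited.

```python
def fraction_dissociated(pH, pK):
    """Degree of dissociation of an independent group (no electrostatic term)."""
    return 1.0 / (1.0 + 10.0 ** (pK - pH))


def difference_two_classes(pH, n=3.5, pk_new=7.0, pk_lost=8.2):
    """Extra H+ dissociated if n groups of pK 8.2 are replaced by n groups of pK 7.0."""
    return n * (fraction_dissociated(pH, pk_new) - fraction_dissociated(pH, pk_lost))


def difference_pk_shift(pH, m=14, pk_before=7.7, shift=0.2):
    """Extra H+ dissociated if m groups all move from pk_before to pk_before - shift.

    The default m, pk_before and shift are hypothetical illustration values.
    """
    return m * (fraction_dissociated(pH, pk_before - shift)
                - fraction_dissociated(pH, pk_before))


for tenth in range(55, 101, 5):
    pH = tenth / 10.0
    print(pH,
          round(difference_two_classes(pH), 2),
          round(difference_pk_shift(pH), 2))
```

Both models give a single hump that vanishes at low and at high pH, which is consistent with the remark that a choice between them rests on how individual experimental points are weighted.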
I. Gelatin
Kenchington and Ward (1954) have used titration studies to resolve the molecular difference between gelatin extracted by processes employing acid and alkaline media. Acid-processed gelatin was found to have an isoionic point at pH 9.1 and to possess 85 titratable carboxyl groups per 100,000 grams. The alkali-processed material was isoionic at pH 4.92 and contained 123 titratable carboxyl groups, essentially equivalent to the total amount of glutamic acid and aspartic acid plus the corresponding amides, which gelatin is known to contain. It is clear that processing in alkaline solution hydrolyzes side-chain amide groups to the corresponding carboxyl groups.
J. Hemoglobin
Titration studies of hemoglobin have made important contributions to our knowledge of this protein. The three outstanding features are:
(1) Four groups are titrated (pK near 8.0) in ferrihemoglobin, which are not titrated in hemoglobin itself, nor in the complexes of hemoglobin with O2 or CO (German and Wyman, 1937; Wyman and Ingalls, 1941). The same groups, with similar pK, are observed by spectrophotometric titration (Austin and Drabkin, 1935; George and Hanania, 1953), and by observation of the effect of pH on magnetic properties (Coryell et al., 1937).
These groups represent the dissociation Fe(H2O)+ ⇌ Fe(OH) + H+, from the heme iron atom.
(2) Four groups, which have a pK near 7.9 in hemoglobin, are titrated at much lower pH (pK near 6.7) in the hemoglobin-oxygen complex. These are the four “heme-linked” imidazole groups, which are responsible for the effect of pH and of CO2 on the hemoglobin-oxygen equilibrium (Bohr-Hasselbalch-Krogh effect). Four other groups have an altered pK when oxygen is combined with hemoglobin, the pK being near 5.25 in hemoglobin and near 5.75 in the oxygen complex. These are also believed to be imidazole groups, though the identification is less secure. It is of interest that all eight of these groups have the same pK in ferrihemoglobin as in the hemoglobin-oxygen complex (Wyman and Ingalls, 1941), so that the pK differences observed are presumably associated with a conformational difference in the region of the molecule which contains the heme, rather than with a specific effect of oxygen (Wyman and Allen, 1951). The subject of the heme-linked groups has been previously reviewed (Wyman, 1948), and the reader is referred to that review for a detailed discussion of the subject.
(3) The titration curve in the acid region is time-dependent and irreversible, as was first clearly demonstrated by Steinhardt and Zaiser (1951). This aspect of the titration of hemoglobin has also been reviewed previously (Steinhardt and Zaiser, 1955), but there was some ambiguity about the meaning of this phenomenon at the time of the review, which has been removed by more recent work, as will be briefly described here. The original observation (Fig. 9) was that a difference of up to thirty-eight groups could be obtained in the number of groups required to titrate an initially neutral protein to pH 3.5, depending on whether data were obtained by a rapid-flow method or by slower procedures.
A parallel study of spectral changes indicated that protein examined immediately on attainment of pH 3.5 would be largely in its native state, but that conversion to a denatured state takes place rapidly. It was not possible however to determine whether the extra thirty-eight groups which were titrated represent an effect of conformation on the number of groups titrated in the acid range, or whether they reflect a change only in the shape of the acid part of the titration curve, brought about by unfolding and hence decreased electrostatic interaction. It appeared that at least a part of the effect must be due to the latter phenomenon (Tanford, 1957). The question has been essentially decided by more recent experiments (Beychok and Steinhardt, 1959; Steinhardt et al., 1962). One procedure was to reduce the temperature and increase the ionic strength, the increase in ionic strength being for the purpose of diminishing the importance of electrostatic interactions. A second procedure was to use hemoglobin derivatives which are especially stable in the native state, these being the
CO-hemoglobin complex and the cyanide complex of ferrihemoglobin. Both procedures resolved the ambiguity in the interpretation of the earlier results. Figure 10 shows the titration curves of the stable complexes and the rapid back titration of the corresponding denatured proteins over a wide range of pH. What these curves show is that the number of groups in the carboxyl region is essentially the same in the native and denatured states, but that the number of groups titrated in the neutral (imidazole) region is about twenty-two greater for the denatured protein. These twenty-two groups (all assumed to be imidazole groups) are in their uncharged form in the native state and cannot be titrated with acid. Upon denaturation they become free from the restraint to which they were subject, so that, in going from a neutral reference point towards lower pH, under conditions where denaturation occurs, these groups are titrated as an accompaniment of denaturation. Reverse titration of the denatured protein then shows these groups as titrating in the normal region for imidazole groups. (The maximum difference of thirty-eight hydrogen ions per mole observed in Fig. 9 must represent these twenty-two imidazole groups plus an additional difference of sixteen hydrogen ions per mole which reflects the difference in electrostatic interactions between native and denatured protein.) It should be noted that four of the anomalous imidazole groups are the four groups by which the heme iron atoms are attached to the protein. These cannot be titrated as long as the hemes remain attached. The other groups must be simply “buried” in the interior. Such groups occur in myoglobin (see below), as well as in hemoglobin. Nozaki (1959) has carried out careful titration curves of bovine ferrihemoglobin over the entire accessible pH range. A preliminary group count was made. 
Starting with native isoionic protein, titration to the acid end point (where the protein is of course denatured) required 102 hydrogen ions/mole, so that the maximum positive charge is 102. The expected figure (based on 14 arginine, 48 lysine, 36 histidine, and 4 iron atoms, each contributing one charge) is 106. Starting with native protein and titrating towards the alkaline side gave a count of 28 groups in the neutral pH region. The expected figure is 44 [36 histidine, 4 N-terminal amino groups, and 4 Fe(H2O)+ groups], so that there is a discrepancy of 16. This discrepancy confirms the conclusion of Steinhardt et al., given above, that many of the imidazole groups of the native protein are inaccessible to titration, the difference between the actual group numbers (22 versus 16) being probably within the experimental error of both determinations. Nozaki’s figure for the alkaline range is 60 groups per mole, in agreement with expectation (48 lysine, 12 tyrosine). The total number of groups titrated in the carboxyl region, including those released on denaturation, was 88,
which is approximately the expected figure when the eight carboxyl groups on the side chains of the four heme groups are included. Another titration curve, this time for human CO-hemoglobin, has been determined by Vodrazka and Cejka (1961). Unfortunately, their group counting procedure is not valid: all groups titrating below pH 4.7 were arbitrarily assigned to the “carboxyl” region and those titrating between pH 4.7 and pH 8.5 were assigned to the “imidazole plus α-amino” region. This procedure assigns too large a number of groups to the latter region and led the authors to the conclusion that the anomalous groups which titrate in the acid pH range are basic groups other than imidazole groups. The data in fact do not support this conclusion. Vodrazka and Cejka (1961) titrated the phenolic groups spectrophotometrically, and found no evidence for buried groups. Hermans (1962) has reported from a similar study that four groups per molecule (one per polypeptide chain) are not titratable in the native protein. Hermans’ study employed absorption at 245 mμ instead of the customary wavelength of 295 mμ.
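The group counts quoted in this section for bovine ferrihemoglobin can be tallied in a few lines. The sketch below uses only the numbers given in the text. One point is an assumption: the four contributions listed for the maximum positive charge add up to 102, and the quoted expected figure of 106 is reproduced here only by also counting the four N-terminal amino groups that the text lists for the neutral region.

```python
# Contributions listed in the text for the maximum positive charge.
basic_groups = {"arginine": 14, "lysine": 48, "histidine": 36, "iron": 4}
n_terminal_amino = 4  # listed in the text for the neutral region

listed_total = sum(basic_groups.values())
# Assumption: the quoted 106 also includes the N-terminal amino groups.
total_with_n_terminal = listed_total + n_terminal_amino

# Neutral region: 36 histidine + 4 N-terminal amino + 4 Fe(H2O)+ groups.
neutral_expected = basic_groups["histidine"] + n_terminal_amino + basic_groups["iron"]
neutral_observed = 28
neutral_discrepancy = neutral_expected - neutral_observed

# Alkaline region: 48 lysine + 12 tyrosine.
alkaline_expected = basic_groups["lysine"] + 12

# Fig. 9 difference of 38 H+ split into masked imidazoles and an electrostatic part.
masked_imidazole = 22
electrostatic_part = 38 - masked_imidazole

print(listed_total, total_with_n_terminal)
print(neutral_expected, neutral_discrepancy)
print(alkaline_expected, electrostatic_part)
```

The tally reproduces the neutral-region discrepancy of 16 and the alkaline count of 60, and it shows that the 16 hydrogen ions attributed to electrostatic effects in Fig. 9 follow from 38 less the 22 masked imidazole groups.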
K. Insulin
Titration curves of insulin have been determined by Tanford and Epstein (1954) and by Fredericq (1954, 1956). As Table XII shows, the count of groups obtained for zinc-free insulin is in agreement with analytical data. This is true in spite of the fact that insulin is insoluble between pH and pH 7. The precipitate is evidently highly hydrated, so that titration of the acidic groups occurs as if they were in solution. Tanford and Epstein also determined the titration curve of crystalline zinc insulin, containing one atom of Zn++ per two insulin molecules. The titration curve of this material differs from that of zinc-free insulin in two ways. (1) Two new groups are titrated for each zinc ion, one near pH 8, the other near pH 12. These presumably represent acidic water molecules attached to Zn++,

Zn(H2O)2++ → ZnOH(H2O)+ → Zn(OH)2
(2) The count of imidazole groups is reduced from four (per two insulin molecules) to two, and, at the same time, two new “carboxyl” groups appear. The titration of these new groups (in the direction of hydrogen ion addition) parallels the dissociation of Zn++ from the zinc insulin. Each Zn++ ion is evidently associated with two imidazole groups in their basic uncharged form. The value of the binding constant for zinc confirms this conclusion. Gurd and Wilcox (1956) have pointed out that the binding groups could
be terminal amino groups rather than imidazole groups, since these groups have essentially identical pK's in the free state. Marcker (1960) has presented evidence that the amino groups are in fact the more likely binding sites. The curve for zinc-free insulin was analyzed by the method of Section

TABLE XII
Titration Data for Insulin at 25°C^a
Number of groups
Type of group
a-Carboxyl Side-chain carboxyl Imidazole a-Amino Side-chain amino Phenolic Guanidyl Zn(H,O)++
Titration
pKint
Analysis
Zinc-free insulin
Zinc insulin
8.5* 4 4
12.5
14.56
2
-
4 4
2 4
6.4 7.4
10
10
9.6'
2
2 2
11.9
-
-
a Data of Tanford and Epstein (1954; see also Fredericq 1954, 1956), calculated for an insulin dimer of molecular weight 11,466. The zinc insulin preparation contained one zinc atom per dimer molecule. The pKint values are for the zinc-free protein. * The fractional number arises from the probable presence of two forms of insulin which differ in the number of free carboxyl groups. c Two of the groups titrating as carboxyl groups are the imidazole groups t o which the Znf+ ion is bound. I, This pK was determinable because the titration curve of the carboxyl groups was clearly not compatible with the presence of 12.5 identical groups. Assuming 4 groups with a lower pK, this was the value required. 6 No attempt was made t o distinguish between amino and phenolic groups in the analysis.
VI. The pKint values are shown in Table XI1 and reveal no important anomalies. The apparent value of w rises to very high values in the region where the protein is precipitated. (See Section VI, C . ) The dissociation of the single lysine amino group of iodinated insulin has been studied by Gruen et al. (1959a). (The iodination separated the titration region of phenolic residues from the titration region of the amine group. ) No abnormalities were observed. The over-all difference between the titration curves of iodinated and nor-
144
CHARLES TANFORD
ma1 insulin corresponded to that which is expected as due to the lowered pK characteristic of iodjnated phenolic groups (Gruen et al., 1959b).
L. β-Lactoglobulin

Complete titration curves of β-lactoglobulin have been determined by Cannan et al. (1942) and by Nozaki et al. (1959). Between the acid end point and the onset of alkaline denaturation near pH 9.7, the data of the two studies are indistinguishable. Above pH 9.7, Nozaki et al., by use of the pH-stat, were able to obtain two curves, one representing an extrapolation to zero time (corresponding to the titration of the native protein), the other representing infinite time (corresponding to titration of denatured protein). Both curves are shown in Fig. 2. The curve for the native protein could be obtained over a limited range of pH only, so that it is not possible to decide whether the difference between the two curves represents a difference in the number of groups accessible to titration, or whether it lies in the shape of the curve alone. It was possible to fit the data for the native protein by assuming that all amino and phenolic groups are accessible, and that the value of w is the same as that which is applicable to the reversible part of the titration curve. The side-chain amino and phenolic groups were assumed to have the same pKint, and a reasonable value of 9.95 was obtained from this analysis. This suggests that the difference between the curves lies primarily in the steepness, and not in the count of groups. It was assumed, however, that the four thiol groups of the protein are not titrated in the native state because they are found to be quite unreactive by other methods. Since thiol groups are expected to have a somewhat lower pK than amino or phenolic groups (Table I), it would probably have been difficult in any case to fit the native curve with reasonable pKint values if the thiol groups had been included. The titration curve of β-lactoglobulin denatured at pH 12.5 has also been determined (Tanford et al., 1959). It was found that the curve is reversible.
The denatured protein is insoluble near its isoelectric point and this region was not studied in detail. The group counting results obtained from these studies are reported in the last two columns of Table XIII. (In comparing these results with the analytical data of the first two columns of this table it should be noted that the β-lactoglobulin studied was a mixture of the two genetic isomers, β-lactoglobulins A and B.) The most significant feature of the analysis is that the native protein appears to contain six imidazole groups, compared to the analytical figure of four. At the same time, the number of carboxyl groups titrated is less than the analytical figure by two groups. After denaturation, however, the group count agrees with the amino acid analysis. It is evident that two carboxyl groups of the native protein are titrated with
a pK which is characteristic of imidazole groups. The probable reason has been discussed in Section VI, D. Titration curves of pure β-lactoglobulins A and B have also been determined (Tanford and Nozaki, 1959). The two genetic variants differ in isoionic point, but they possess the same maximum positive charge (Fig. 20) and the same two anomalous carboxyl groups with pKint of 7.5. Thus the difference between them lies in the number of normally exposed carboxyl groups, two more of these being in form A than in form B. This result has since been confirmed by amino acid analysis (Gordon et al., 1961; Piez et al., 1961).

FIG. 20. Approach to the acid end point of the titration curves of β-lactoglobulins A and B, and for the normal equimolar mixture of the two, at 25°C and ionic strength 0.15. The value of Z_H is calculated relative to the point of zero net proton charge, which occurs at a different pH for each of the three samples (Tanford and Nozaki, 1959).

TABLE XIII
Group Counting for β-Lactoglobulin

                       Amino acid analysis^a                 Titration curve^b
                      ----------------------   ------------------------------------------------
                                                Native     Native     Native      Denatured
Type of group          β-Lact A   β-Lact B     β-Lact A   β-Lact B   mixture^c   mixture^c
α-COOH                 2                       }          }          }
Side-chain COOH        52                      } 52       } 50       } 51
Imidazole              4          4
α-NH2                  2          2
Thiol                  2          2
Phenolic               8          8
Side-chain NH2         28         28
Guanidyl               6          6
ZN+                    40         40

a Gordon et al. (1961), Piez et al. (1961). The figures have been adjusted to the nearest even integer for a two-chain molecule of molecular weight 35,500.
b Titration data of Nozaki et al. (1959), Tanford et al. (1959), Tanford and Nozaki (1959).
c The mixture contained essentially equimolar amounts of the two genetic isomers, β-lactoglobulins A and B.
d Figures in parentheses are subject to considerable uncertainty. The number of phenolic groups was determined from the total change in extinction at 295 mμ in going from native protein with undissociated phenolic groups to denatured protein with all phenolic groups dissociated. No correction was made for the change in extinction at this wavelength which results from unfolding of the protein as a result of the emergence of the tryptophan residues from the inside of the native structure. There are four tryptophan residues per molecule, so that this change can be expected to be quite large, certainly large enough to account for an error of two groups in the count of phenolic groups.

Another interesting feature of the titration studies (Nozaki et al., 1959) is the fact that addition of KCl and CaCl2 depresses the pH of isoionic protein solutions. This means that K+ and Ca++ ions are bound by the isoionic protein, a result which is apparently in agreement with published studies of ion binding by direct means (Carr, 1953, 1956). These studies show that K+ and Ca++ ions are bound at pH 7.4, and the quantitative difference between the number found bound at that pH, and the number calculated as bound from the pH change at the isoionic point (Section IX, B), is of the order of magnitude expected on the basis of the difference in protein charge at the two pH's. More recently, however, Saroff (1961) has measured ion binding as a function of pH and has observed that there is essentially no binding of K+ at the isoionic pH. He observes binding at higher pH, the appearance of binding sites occurring in parallel with the conformational change during which the two anomalous carboxyl groups are titrated. The discrepancy between his results and those derived from the titration studies is at present unresolved. The values of w which one obtains from the carboxyl region of the titration curve by application of Eq. (4), assuming Z_H = Z, are 0.072 and 0.039, respectively, at ionic strength 0.01 and 0.15, i.e., they are somewhat below the calculated values of Table III. (As was mentioned earlier, the same values are compatible with the entire titration curve of the native protein.) If one attempts to evaluate the difference between Z and Z_H at all pH's
from the few values of ion binding at different pH's which are available, larger values are observed (w = 0.090 and 0.058 at ionic strengths 0.01 and 0.15). Both sets of values are within the range expected for a compact globular protein. The slopes of logarithmic plots for the alkaline titration curve at t = ∞ (Fig. 2) are however very much less, showing that the alkali-denatured protein is randomly coiled. The intrinsic pK values obtained from these studies at 25°C and ionic strength 0.15, without correcting for K+ ion binding, are 4.7 for the side-chain carboxyl groups, 7.3 for the imidazole groups, and 9.9 for phenolic and lysine amino groups. Correction for K+ ion binding increases the pKint values by 0.15.

TABLE XIV
Carboxyl Groups of Lysozyme^a

                            Carboxyl groups titrated
                        ---------------------------------
                                         in 8 M GHCl or
                                         5 M GHCl +
Lot number              in 0.15 M KCl    1.2 M urea^b      Remarks
003L1                   10.5             13.5
381-187                 9                12
D638040                 14               -
381-187 Methylated      1 (or 2)         1 (or 2)          12.2 methoxyl groups per molecule
381-187 Acetylated      12               12                Acetylation occurs principally at amino groups
381-187 Guanidinated    9                12                Guanidination converts amino groups to homoarginine groups

a Data from Tanford and Wagner (1954); Donovan et al. (1960, 1961).
b GHCl = guanidine hydrochloride.
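The way in which w and pKint values such as those quoted above for β-lactoglobulin enter the analysis can be illustrated numerically. The following is a minimal Python sketch, not the procedure of the original authors. It assumes that Eq. (14), which is not reproduced in this part of the text, has the usual Linderstrøm-Lang form pH - log[α/(1 - α)] = pKint - 0.868 w Z. The function name and the net charge of +20 are hypothetical choices made only for illustration.

```python
def degree_of_dissociation(pH, pK_int, w, Z):
    """Fraction of identical groups dissociated, assuming the form
    pH - log10(alpha / (1 - alpha)) = pK_int - 0.868 * w * Z."""
    # The electrostatic term shifts the apparent pK away from pK_int.
    pK_app = pK_int - 0.868 * w * Z
    return 1.0 / (1.0 + 10.0 ** (pK_app - pH))


# Side-chain carboxyl groups at ionic strength 0.15:
# pK_int = 4.7 and w = 0.039 are the values quoted in the text.
# Z = +20 is an assumed, purely illustrative net charge.
alpha_no_charge = degree_of_dissociation(4.7, 4.7, 0.039, 0)
alpha_positive = degree_of_dissociation(4.7, 4.7, 0.039, 20)

print(round(alpha_no_charge, 3), round(alpha_positive, 3))
```

Under this assumed form a positive net charge lowers the apparent pK, so that at a given pH a larger fraction of the carboxyl groups is dissociated than for the uncharged molecule; a larger w makes the titration curve correspondingly flatter.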
M. Lysozyme

Titration studies of lysozyme have revealed two unique features, both occurring in the carboxyl region of the titration curve. The pertinent data are shown in Table XIV. It is seen (a) that the count of carboxyl groups varies widely from one preparation of lysozyme to another, and (b) that three extra carboxyl groups appear in denaturing solvents such as 8 M guanidine hydrochloride. The three extra carboxyl groups which appear in denaturing solvents were apparently in their carboxylate ion form in the native protein. Since these groups are not detectable at all in the titration of the native protein
down to pH 2, a true plateau being approached at pH 2, they must in effect be inaccessible to the solvent. It is possible for a charged group to be so located only if it is in close contact with a similarly inaccessible group of opposite charge. There is no evidence that any of the titratable cationic groups of lysozyme are so located. However, the twelve guanidyl groups are never titrated, so that three of these could be located away from the protein/solvent interface. A number of chemical derivatives of lysozyme were studied by Donovan et al. (1960), with the results shown in Table XIV. The guanidinated derivative, in which charged lysine groups are simply replaced by similarly charged homoarginine groups, showed behavior similar to that of the native protein. The acetylated derivative, in which lysine side chains are replaced by uncharged acetyllysine groups, titrated like denatured untreated lysozyme. The most interesting result was that obtained with the methylated derivative, in which most of the carboxyl groups are esterified and only a single titratable group in the carboxyl region is observed. The number of methoxyl groups introduced was found by analysis to be equal to the number of carboxyl groups titrated in the denatured protein, essentially confirming that the three extra groups titrated upon denaturation are in fact carboxyl groups. The only aspect of the titration data which raises a question about the existence of buried carboxylate ions in the native protein is the maximum positive charge estimated by Tanford and Wagner (1954). The figure is based on an assumed location of the point of zero net proton charge at pH 11.1. (Lysozyme precipitates on deionization, so that the normal procedure for determining this reference point is not possible.)
This assumed pH, however, is supported by the fact that titration curves at three different ionic strengths, which should intersect at Z = 0, do intersect at pH 11.1, and by the fact that the isoelectric point has been determined electrophoretically to be at pH 11.1. With this assumed pH of zero net proton charge, the maximum positive charge (ZN+, Section IV, A) becomes 19, which is just the analytical figure for the total number of cationic groups. If three carboxyl groups are still in their anionic form at the acid end point of the titration curve, the experimental maximum positive charge should have been 3 less than the analytical figure. It would appear that the problem of the carboxyl groups of lysozyme merits further investigation. The count of other titratable groups of lysozyme agrees with analytical data. Moreover, no variation between different preparations has been reported. The titration curve of the native protein is reversible and has been analyzed by the methods of Section VI. As we have already noted (Fig. 16),
the carboxyl region does not obey Eq. (14) if all carboxyl groups are assumed to have the same pKint. This is not an anomaly unique to lysozyme, but is shared by other proteins (chymotrypsinogen, ribonuclease) which are rich in basic nitrogen groups. The anomaly presumably reflects uneven spatial distribution of these groups, relative to the carboxyl groups. The neutral and alkaline region of the titration curve is compatible with values of w of magnitude similar to those calculated by Eq. (4). The intrinsic pK values are not remarkable (see Table V), except that pKint for the phenolic groups is 1.2 higher than the expected value. The likely reason has already been discussed in Section VI, D. It may be noted that Tanford and Wagner (1954) found the spectral change corresponding to dissociation of the phenolic groups to be quite abnormal, though an approximate pK for the dissociation could be determined. Their difficulties have been elegantly explained by Donovan et al. (1961) as arising from changes in the spectrum of tryptophan side chains, which occur in the same region of pH as the ionization of phenolic groups. When these changes are corrected for, the residual spectral change becomes that which is normally expected for phenolic ionization.
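The bookkeeping behind the unresolved question of the lysozyme carboxyl groups can be laid out explicitly. The short Python sketch below uses only figures quoted in the text and in Table XIV, as read here for lots 003L1 and 381-187; the variable names are illustrative and carry no significance beyond this example.

```python
# Carboxyl groups titrated, from Table XIV:
# (native protein in 0.15 M KCl, protein in denaturing solvent)
lots = {
    "003L1": (10.5, 13.5),
    "381-187": (9, 12),
}

# Extra carboxyl groups appearing on denaturation, lot by lot.
extra_on_denaturation = {
    lot: denatured - native for lot, (native, denatured) in lots.items()
}

# Maximum positive charge: the text gives 19 cationic groups by analysis.
cationic_groups = 19
buried_carboxylates = 3  # the three groups inferred to stay anionic

# Charge expected at the acid end point if the buried groups remain charged.
expected_max_positive_charge = cationic_groups - buried_carboxylates

print(extra_on_denaturation, expected_max_positive_charge)
```

The observed maximum positive charge equals the full analytical figure rather than the reduced value computed in the last step, which is the discrepancy noted above.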
N. Myoglobin

Special interest attaches to the titration of sperm whale myoglobin because the three-dimensional structure of this protein is well on the way to being completely elucidated (Kendrew et al., 1961). The speculative structural features, which have been invoked to explain titration data that do not conform to expectation, will in this protein soon be subject to actual test. A titration curve for sperm whale myoglobin has been reported by Breslow and Gurd (1962). The most striking feature is that it exhibits a time-dependent acid denaturation, which resembles that observed for the similar protein hemoglobin. To elucidate the physical nature of this reaction, emphasis was placed on the back titration to neutral pH of denatured protein. As in the case of hemoglobin (mentioned earlier), there are two major differences between the titration curves of native and denatured myoglobin, as shown by the data of Table XV. The first difference is in the count of imidazole groups. Only six of the twelve groups known to be present are titrated in the normal pH range in the native protein. When the native protein is titrated towards acid pH, these six groups are titrated in the carboxyl region as the protein becomes denatured. In the back titration of the denatured protein, all twelve of the groups are titrated with approximately the expected pK. To confirm the identification of the groups concerned, the kinetics of hydrolysis of p-nitrophenylacetate (Section III, D) was studied. This method gives
direct information as to the number of accessible uncharged imidazole groups, and in the present study confirmed exactly the conclusions reached from the titration curve as a whole. It is concluded therefore that six of the twelve imidazole groups of native myoglobin are buried in the interior in their uncharged form. One of these is of course the imidazole group by which the heme iron atom is attached to the protein. The other five have not been identified. The expectation from the present study is that the complete three-dimensional structure of the protein, when available, will show these groups in positions where they are not in contact with the solvent.

TABLE XV
Titration of Myoglobin in the Neutral pH Region^a

                                   Native protein    After acid denaturation
Number of imidazole groups:
  By analysis                      12                12
  By titration                     6                 12
  By reaction with NPA^b           6                 12
pKint for imidazole groups         6.62              6.48
w at ionic strength 0.16           0.050             0.034^c
w at ionic strength 0.06           0.085             0.044^c

a Titration data of Breslow and Gurd (1962). Analytical data by Edmundson and Hirs (1961). The protein studied was ferrimyoglobin from the sperm whale. Its molecular weight is 17,816.
b NPA = p-nitrophenylacetate. See Section III, D.
c Determined from the carboxyl region rather than the imidazole region, assuming that all 23 carboxyl groups found by analysis are titratable with the same pKint.

The second difference between native and denatured myoglobin lies in the value of the electrostatic interaction factor w. In the native molecule this factor is somewhat, but not much, smaller than calculated by Eq. (4). Using a radius derived from the volume of the molecule as it appears in myoglobin crystals, Breslow and Gurd (1962) calculated w = 0.106 and 0.065, respectively, at ionic strengths 0.06 and 0.16. The experimental values at the same ionic strengths are 0.085 and 0.050. (It was assumed that Z_H = Z, which is likely to be essentially correct for myoglobin.) The values of w for the denatured protein are considerably smaller, indicating that unfolding of the protein occurs. The values given in Table XV are obtained by assuming that all of the carboxyl groups of myoglobin have identical pKint values. If this assumption is in error, the actual values would be smaller than those given in the table. (The intrinsic pK of the carboxyl groups, obtained by a rather long extrapolation to Z = 0, is 4.4.) Apart from the imidazole groups of myoglobin, only one other group has
been studied in any detail, this being the Fe(H2O)+ group, which dissociates to Fe(OH) near pK 9. The pK value for this group has been determined by Theorell and Ehrenberg (1951), George and Hanania (1952), and Breslow and Gurd (1962) from the spectral change which accompanies the dissociation. The values are given in Table V, together with comparable data from other proteins. In a recent study, Hermans (1962) has indicated that only two of the three tyrosyl phenolic groups of myoglobin titrate below pH 13, this result being obtained for both whale and horse myoglobins.
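To make concrete the statement that w for native myoglobin is "somewhat, but not much" smaller than calculated, the experimental and calculated values of Breslow and Gurd quoted earlier in this section may be compared directly. The few lines of Python below are a sketch of that arithmetic only; the dictionary names are illustrative.

```python
# w for native myoglobin, keyed by ionic strength (values quoted in the text).
w_calculated = {0.06: 0.106, 0.16: 0.065}    # from Eq. (4), crystal radius
w_experimental = {0.06: 0.085, 0.16: 0.050}  # from the titration curve

# Ratio of experimental to calculated w at each ionic strength.
ratios = {
    ionic_strength: w_experimental[ionic_strength] / w_calculated[ionic_strength]
    for ionic_strength in w_calculated
}

for ionic_strength, ratio in sorted(ratios.items()):
    print(ionic_strength, round(ratio, 2))
```

At both ionic strengths the experimental value is roughly four-fifths of the calculated one, which is comparable to the factor of about 0.8 reported for ovalbumin in Section P.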
O. Myosin

The titration of myosin has been studied by Mihalyi (1950). The count of groups in the various regions of the curve agrees with analysis to better than 10%, except for the carboxyl region, where titration indicates 165 groups per 100,000 gm, as compared with the analytical figure of 132. It is likely that the analytical assay for amide groups is the source of the error. Mihalyi’s titration curves were obtained at a series of ionic strengths, ranging from quite a low value to I = 1.2 M. The curves were reversible over a wide range of pH, and were considered to represent equilibrium. They were not analyzed by the methods of Section VI, but even a cursory examination shows that the carboxyl region at least would not obey the behavior predicted by the Linderstrøm-Lang equations. The carboxyl regions of the titration curves are quite flat, indicating fairly strong interaction between the groups, but if the data were plotted according to Eq. (14), with Z_H as abscissa, the w values would be essentially independent of ionic strength. On the other hand, the pKint values would depend strongly on the ionic strength. Mihalyi suggests that strong anion binding in the acid region would be responsible for this kind of behavior. A spectrophotometric titration of the phenolic groups of myosin and its subunits has been reported by Stracher (1960). The data resemble those shown for ribonuclease in Fig. 11. About two-thirds of the tyrosine residues are titrated normally, and about one-third appear inaccessible in native myosin. An interesting feature is that 6 M urea has no effect at all on the titration curve. Similar studies were performed on the subunits L-meromyosin and H-meromyosin. In the former 90% of the phenolic groups appear abnormal, but they are titrated normally in 5 M urea. In H-meromyosin all of the groups are normal, even in aqueous solution.
P. Ovalbumin

The very first electrometric titration curve of a protein to be reported in the literature is a study of ovalbumin (Bugarszky and Liebermann, 1898).
The first detailed analysis of a protein titration curve, according to the semiempirical treatment used for most of the titration curves reviewed in this paper, also involves ovalbumin (Cannan et al., 1941). The first discovery of phenolic groups inaccessible to titration was again made with this protein (Crammer and Neuberger, 1943). Within the limits of error of amino acid analyses available at the time, the count of groups obtained by Cannan et al. agreed with expectation, except in so far as the alkaline part of the curve was concerned. The number of groups titrated here is essentially the same as the number of amino groups, rather than the sum of amino and phenolic groups. This result is in accord with the later spectrophotometric titration of phenolic groups: essentially all of these groups are inaccessible to titration in the native protein. Cannan et al. (1941) were able to describe the entire titration curve, at eight different ionic strengths, by Linderstrøm-Lang’s equation, using a single value of w at each ionic strength. The variation of w with ionic strength was essentially that of Eq. (4), experimental values being about 0.8 of those calculated by that equation. [It was found that the isoionic pH depends on the concentration of KCl which was used to vary the ionic strength, the direction of the effect indicating that chloride ion is bound. Correction for the effect was made by an adjustment factor introduced into Eq. (14), this factor being a measure of the number of chloride ions bound by the protein at its isoionic point. The possibility of variation in chloride ion binding with pH was not allowed for, and this is one reason why experimental values of w fall below calculated ones. The intrinsic pK values required to fit the data were not far from expected values: 4.3 for the carboxyl groups, 6.7 for imidazole groups, and 10.0 for lysine amino groups.]
Harrington (1955) has used the pH-stat method to show that there is a large difference between the titration curves of native ovalbumin in 2.6 M guanidine hydrochloride, and the denatured protein which is formed by a measurably slow reaction in this solvent. The result has been interpreted as indicating the release of eight carboxyl and eight phenolic groups which are not accessible to titration in the native protein. Since the two branches of the difference titration curve reach plateaus at extreme acid and alkaline pH (i.e., they resemble the curve shown in the lower part of Fig. 5), the interpretation of the difference as newly liberated groups appears indisputable. That the phenolic groups should be so liberated on denaturation is not surprising, since these groups are known to be buried in the interior in the native protein. The liberation of carboxyl groups is surprising, however, since the expected number is titrated in the native protein. Furthermore, the carboxyl groups which are unavailable in the native protein appear to be in their anionic form. In this form they would reduce the maximum possible positive charge of the native protein by eight groups, whereas
Cannan et al. (1941) find complete agreement between the experimental maximum positive charge and the analytical figure for ZN+.
Q. Papain

Glazer and Smith (1961) have carried out a spectrophotometric titration of the phenolic groups of papain. Of the seventeen phenolic groups known to be present, eleven to twelve ionize normally (pKint = 9.8). The remainder ionize only upon denaturation, which takes place only slowly in the range of pH 12 to 13.
R. Paramyosin

A detailed study of the hydrogen ion titration of clam paramyosin has recently been reported by Riddiford and Scheraga (1962). Essentially all the groups expected to be titratable from analytical data were found to be titrated reversibly both in 0.3 M KCl and in a guanidine-urea mixture in which extensive denaturation had occurred. In the native state the protein is precipitated between pH 3.5 and 6.5, but this apparently did not interfere with titration, indicating that the precipitated protein is gel-like in nature, permitting free passage of water and of ions to the individual molecules. Earlier Johnson and Kahn (1959) had reported that the titration curve of paramyosin has a plateau between pH 3 and pH 5. They concluded that the carboxyl groups were titrated in two distinct steps, one near pH 6 and one below pH 3. This finding would have suggested highly unusual interactions within the molecule. Riddiford and Scheraga (1962) did not find such a plateau in their studies. However, the fact that the protein is in an insoluble state between pH 3.5 and 6.5 suggests that the differences between the two sets of results may depend upon the particular conditions of preparation and handling of the protein, in a manner not yet adequately defined. The electrostatic interaction factor w for the native protein was found by Riddiford and Scheraga to be much smaller than would be expected for a compact spherical particle by Eq. (4). In order of magnitude, the w value agreed with the value predicted for long cylindrical rods by Hill (1955). This is in agreement with the known dimensions of the paramyosin molecule. However, a considerably larger value of w was required to fit the alkaline part of the titration curve than was required for the acid branch. In the denaturing solvent, w values even smaller than those for the native state were observed.
On the alkaline side, electrostatic interaction disappeared almost entirely, presumably because extensive unfolding with penetration of salt ions into the domain of the protein had occurred. The intrinsic pK values in the guanidine-urea mixture were found to be close to the normally expected values. The pK values for the native protein were also remarkably close to expected values. Only the pK for the lysine amino groups deviated significantly from expectation, a low value of 9.65 being obtained. Spectrophotometric titration of the phenolic groups led to the conclusion that all of the fifty-eight phenolic groups present on each paramyosin molecule were titrated in the guanidine-urea mixture, but that only forty-nine were titrated in the native state in 0.3 M KCl. (All of these had an essentially normal pKint of 9.6.) This conclusion however was based entirely on the fact that the total change in absorbance at 295 mμ, between neutral pH and pH 14, is about 15% less in the native protein than in the denatured state. If the change in absorbance per group titrated were to differ in the two solvents, then the conclusion reached would have to be revised.
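The sensitivity of this group count to the assumed absorbance change per group is easily shown. The Python sketch below is illustrative only: the function and its parameter `per_group_ratio` (the absorbance change per phenolic group in the native solvent divided by that in the denaturing solvent) are hypothetical names introduced here, and the value 0.85 is simply the "about 15% less" quoted in the text.

```python
def groups_titrated(absorbance_ratio, per_group_ratio=1.0, total_groups=58):
    """Phenolic groups titrated in the native state.

    absorbance_ratio: total absorbance change at 295 millimicrons in the
        native protein divided by that in the denatured protein.
    per_group_ratio: assumed ratio of the absorbance change per group in
        the two solvents (1.0 means identical, as taken in the text).
    """
    return total_groups * absorbance_ratio / per_group_ratio


# Equal change per group in both solvents: about 49 of 58 groups titrated.
equal_response = groups_titrated(0.85)

# If each group gave only 85% of the change in the native solvent, the same
# data would instead be consistent with all 58 groups being titrated.
reduced_response = groups_titrated(0.85, per_group_ratio=0.85)

print(round(equal_response, 1), round(reduced_response, 1))
```

A difference of 15% in the absorbance change per group would therefore be enough, on this simple proportionality, to remove the evidence for inaccessible phenolic groups altogether.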
S. Pepsin

It is not possible to determine the titration curve of pepsin over a wide range of pH because autolysis, with liberation of free carboxyl and amino groups, occurs at acid pH. Titration studies have been performed (Edelhoch, 1958) in the range of pH 5 to 8, however. This is the pH range within which pepsin undergoes denaturation, and the data clearly show that more hydrogen ions are dissociated in this pH range from the denatured protein than from the native protein. The difference is greater at low ionic strength than at high ionic strength, but a difference of six H+ ions per molecule persists at ionic strengths from 0.4 to 1.0. These six H+ ions are thought to represent a difference in the number of titratable groups, the additional difference at lower ionic strength being a measure of the greater steepness of the titration curve of the denatured protein which is the result of diminished electrostatic interaction after unfolding has taken place. The simplest explanation of the difference in group count would be that native pepsin has six uncharged carboxyl groups inaccessible to titration, and that these are exposed during denaturation.
T. Peroxidase

A titration study of a peroxidase from Japanese radish has been reported by Morita and Kameda (1958). The titration curves of native protein, acid-denatured protein, and alkali-denatured protein are dramatically different. Unfortunately only continuous titration curves were obtained, so that an interpretation of the data is not possible at this time.
U. Ribonuclease

The best-known feature of the titration of ribonuclease is the fact that only three of the six phenolic groups of this protein can be titrated while the protein is in its native state, as was first reported by Shugar (1952).
These three groups have an essentially normal intrinsic pK and their titration curve yields essentially normal values for the interaction parameter w (Tanford et al., 1955a). The three phenolic groups which are not available for titration in the native protein are titrated at 25°C near pH 13, where the protein becomes denatured. At 6°C, where the rate of denaturation is slower, a pH of 14 can be reached with only partial titration of these groups. The titration curve of ribonuclease (Tanford and Hauenstein, 1956b) is reversible between its acid end point and the onset of alkaline denaturation. All titratable groups, which are expected to be present on the basis of amino acid analysis, are found to be titrated in the expected parts of the titration curve, with the exception of the abnormal phenolic groups mentioned above. The amino and imidazole groups appear to have normal pK's, and the neutral and alkaline regions in which they occur are compatible with the same values of w as are required to fit the titration curves of the three normal phenolic groups. A second (minor) anomaly is found in the acid part of the titration curve. Although the expected number of groups is titrated, the observed values of w are anomalously large. It is possible that this is simply a manifestation of the inadequacy of the Linderstrøm-Lang treatment, as was pointed out in Section VI, C. Two alternate explanations have been proposed, neither of which deserves to be taken very seriously. (1) Tanford and Hauenstein showed that the acid part of the titration curve could be compatible with the same values of w as were used to fit the rest of the titration curve, if one were to assume that five of the ten side-chain carboxyl groups have pKint = 4.0, while five others have pKint = 4.7.
(2) Hermans and Scheraga (1961b) showed that a fair fit of the titration data could be obtained with these same values of w, if one assumed that one of the side-chain carboxyl groups has pKint = 2.5, another has pKint = 3.65, and the remaining eight have pKint = 4.6. The existence of groups with pKint 2.5 and 3.65 was inferred from low pH conformational changes (Hermans and Scheraga, 1961a). It is to be expected that these special features of the titration curve of ribonuclease will disappear when the protein is unfolded. In agreement with this expectation, it has been found that all six phenolic groups are titrated together in ethylene glycol (Sage and Singer, 1958, 1962), in 8 M aqueous urea (Blumenfeld and Levy, 1958), and in aqueous solutions containing 5 M guanidine hydrochloride and 1.2 M urea (Cha and Scheraga, 1960). In the guanidine-urea medium the electrometric titration curve has also been determined (Cha and Scheraga, 1960), and it was found possible to fit the entire curve with a single value of w, including all carboxyl groups as a single set with pKint = 4.6.
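The two proposed assignments of pKint to the ten side-chain carboxyl groups of native ribonuclease can be compared by a simple calculation. The Python sketch below sums the dissociation of independent groups and, as a deliberate simplification not made by either group of authors, neglects the electrostatic term altogether (w Z = 0). The function name is illustrative.

```python
def protons_dissociated(pH, groups):
    """Average number of protons dissociated from independent groups.

    groups: list of (number_of_groups, pK_int) pairs.
    The electrostatic interaction term is neglected (an assumption).
    """
    return sum(n / (1.0 + 10.0 ** (pK - pH)) for n, pK in groups)


# (1) Tanford and Hauenstein: five groups at 4.0, five at 4.7.
model_1 = [(5, 4.0), (5, 4.7)]
# (2) Hermans and Scheraga (1961b): one at 2.5, one at 3.65, eight at 4.6.
model_2 = [(1, 2.5), (1, 3.65), (8, 4.6)]

for pH in (2.0, 3.0, 4.0, 5.0, 6.0):
    print(pH,
          round(protons_dissociated(pH, model_1), 2),
          round(protons_dissociated(pH, model_2), 2))
```

On this simplified basis the two assignments give similar curves in the middle of the carboxyl region and differ mainly at the acid end, where the single group of pKint 2.5 in the second assignment begins to dissociate much earlier.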
The only unexplained feature in these experiments concerns the value of w. In the guanidine-urea medium the ionic strength is exceedingly large, so that a very small value of w would be expected even if the protein were not unfolded. For an unfolded protein, w should become essentially zero. The value observed by Cha and Scheraga, both for the titration curve as a whole and for the spectrophotometric titration of the phenolic groups alone, is 0.056, which is almost as large as the value found for the native protein at an ionic strength of 0.15. In 8 M urea, by contrast, the expected result is obtained. Blumenfeld and Levy (1958) obtained w = 0.018 at an ionic strength of 0.1, which is far below the value expected for the native protein at this ionic strength, as predicted for an unfolded protein molecule. Bigelow (1960) has found that all six of the phenolic groups of performic acid-oxidized ribonuclease behave normally, which is to be expected since the oxidized protein is believed to be highly unfolded. More interesting is his finding that a pepsin-inactivated preparation of ribonuclease can be obtained which contains five normal phenolic groups and one buried one. Tanford and Hauenstein (1956a) observed that the chromatographic minor component of one commercial preparation of ribonuclease has an isoionic point of 9.23, whereas the major component (ribonuclease A) has an isoionic point of 9.65. This difference corresponds to a difference of one titratable group, so that the minor component either has one fewer amino or guanidyl group than ribonuclease A, or else it has an extra carboxyl group. Titration curves of the two components indicated that the latter explanation is correct, for the two components gave the same maximum positive charge, but a difference of one in the number of carboxyl groups.
It appears now that these experimental data were incorrect, as Eaker (1961) has found by amino acid analysis that this minor component in fact consists of two sub-components, both of which lack the N-terminal lysine residue of ribonuclease A, so that the glutamic acid residue normally in the second position becomes the terminal residue. In one of the sub-components this glutamic acid residue has itself been converted to a pyroglutamyl residue. These findings are incompatible with the unaltered maximum positive charge found by titration. (The titration curve error presumably resulted from the fact that only a small amount of the second component was available for study, so that all the data depend on a single determination.)
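The ionic-strength argument about w made earlier in this section can be illustrated with the Linderstrøm-Lang expression for an impenetrable sphere carrying a uniformly distributed charge, w = (e²/2DkT)[1/b − κ/(1 + κa)]. The following sketch is an order-of-magnitude illustration only. The Bjerrum length of 7.14 Å for water at 25°C, the sphere radius b = 19 Å, and the ion-exclusion radius a = b + 2.5 Å are assumed values, and the helper names are hypothetical.

```python
import math

BJERRUM_LENGTH_A = 7.14   # e^2/(D k T) in angstroms; assumed value for water at 25 C
AVOGADRO = 6.022e23


def kappa(ionic_strength):
    """Debye-Hueckel parameter in 1/angstrom for an ionic strength in mol/L."""
    ions_per_cubic_angstrom = ionic_strength * AVOGADRO / 1e27  # 1 L = 1e27 cubic angstroms
    return math.sqrt(8.0 * math.pi * BJERRUM_LENGTH_A * ions_per_cubic_angstrom)


def w_sphere(ionic_strength, b=19.0, a=None):
    """Electrostatic interaction parameter w for a sphere of radius b (angstroms).

    a is the radius of exclusion for small ions; by default b + 2.5 angstroms.
    Both radii are illustrative assumptions.
    """
    if a is None:
        a = b + 2.5
    k = kappa(ionic_strength)
    return 0.5 * BJERRUM_LENGTH_A * (1.0 / b - k / (1.0 + k * a))


if __name__ == "__main__":
    for ionic_strength in (0.01, 0.1, 0.15, 1.0, 5.0):
        print(f"I = {ionic_strength:5.2f}  w = {w_sphere(ionic_strength):.3f}")
```

With these assumed radii the computed w falls steadily as the ionic strength rises, which is the trend the argument relies on. The absolute values depend strongly on the radii chosen.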
V. Serum Albumin
The titration curve of serum albumin (Tanford et al., 1955b) is independent of time and essentially completely reversible. However, the protein undergoes at least three changes in conformation during the course of titration. These changes are themselves rapid and reversible, so that separate titration curves for individual conformations cannot be obtained. The interpretation of the titration curve is thus dependent on an interpretation of conformational changes which are observed by other means. The over-all count of titratable groups is shown in Table XVI. The agreement with analytical data is excellent, except that about nine more carboxyl groups are titrated than the analysis predicts. This discrepancy has been ascribed to an erroneously high analytical estimate of amide groups, and no recent data have appeared to support or gainsay this. When the data are analyzed according to the methods of Section VI, one obtains essentially normal values of w between pH 4.3 and 10.5.
TABLE XVI. Titration Data for Serum Albumin. Columns: type of group; number of groups (by analysis and from the titration curve); pKint at 25°C. Rows: α-carboxyl, side-chain carboxyl, imidazole, α-amino, thiol, side-chain amino, phenolic, guanidino.
FIG. 4. Symmetrical extensions of analogous fragments (panels include hemoglobin A, β-chain).
α- and β-chains of hemoglobin (Fig. 4c, d).
A notable example is found in the α-chain of hemoglobin (Fig. 4c), where the tripeptide sequence Ala.Leu.Ser, doubled in positions 79-84, recurs (with certain modification) in positions 123-127 and (reversed) 108-111. Moreover, two of these sequences are terminated at both ends by aspartic acid or histidine residues, the remaining sequence at one end by histidine. It should be recalled that histidine and aspartic acid (or asparagine) are very probably interchangeable amino acids. As has been pointed out by Gibian (1961) and also by one of the authors (Šorm, 1961b), structural analogies may sometimes extend for considerable distances along the peptide chain. In these cases isolated pairs of unrelated (by standard interchange) amino acids may again be encountered at corresponding positions in the two analogous sequences. However, there is a degree of regularity even in these interpolations, since the same pair of unrelated amino acids often recurs at such sites, either with the same member of the pair in one sequence, or with the members of the pair reversed as between the two sequences at different sites. Examples of such sequences are given in Fig. 5 (TMV protein) and Fig. 6 (insulin).
FIG. 5. Structural resemblances between peptide fragments of the TMV protein.
C. Repetition of Peptide Sequences
The repetition of dipeptide and tripeptide sequences in protein primary structures is not, as has already been stated, statistically significant in the case of proteins of high molecular weight. However, such repetitions may become worthy of note if the identical or analogous lower sequences immediately follow one another in the peptide chain (contiguous sequences) or if their recurrence is connected with other, additional regularities (Šorm, 1961b, 1962a). Such special instances have been encountered especially in the structures of both hemoglobin chains, and examples are listed in Fig. 7. The remarkable fragment in positions 78-85 of the α-chain (Fig. 7b) not only shows contiguous repetition of a tripeptide sequence, but the hexapeptide unit so constituted is bracketed between identical amino acid residues; moreover, a tetrapeptide fragment of this sequence has an analogy in positions 123-126 of the same molecule. Again, the sequence
in positions 91-100 of the same chain shows a similar regularity in that two contiguous analogous sequences are symmetrically terminated by two analogous dipeptide sequences (with the standard interchange Arg/Lys). A further interesting sequence is found in positions 62-71 of the α-chain, where two alternate, overlapping pairs of contiguous analogous tetrapeptide sequences can be discerned. Sometimes the regularity of such repetitions is disturbed by interpolations, as in the undecapeptide sequence from corticotropin, where the second identical tetrapeptide sequence is broken into by a run of three basic (and interchangeable) amino acids (Šorm et al., 1957):
.Gly.Lys.Pro.Val.Gly.Lys.(Lys.Arg.Arg).Pro.Val.
FIG. 7. Special types of recurrence of lower peptide sequences in hemoglobin A. Panels: dipeptides (a), tripeptides (b to g), tetrapeptides (h). Entries include: (a) β-chain, positions 18-23, Val.Asp.Val.Asp.Glu.Val; (b) α-chain, positions 78-85, Asp.(Ala.Leu.Ser).(Ala.Leu.Ser).Asp, and positions 123-126, Ala.Ser.Leu.Asp; α-chain, positions 91-100, Leu.Arg.(Val.Asp.Pro).(Val.Asp.Phe).Lys.Leu; (h) α-chain, positions 62-71, (Val.Ala.Asp.Ala).(Leu.Thr.Asp.Ala).(Val.Ala).
A particularly fine example of multiple recurrences of short as well as longer sequences, with standard interchanges and short-range rearrangements, is provided by the heme peptide isolated as a fragment from the heme protein of the photoanaerobe Chromatium (strain D) (Dus et al., 1961)
probably related to the cytochromes c, to which it shows extensive structural resemblances (Šorm, 1962a, c). The great regularity with which the peptide chain of this fragment is constructed is shown in Fig. 8. The probability of such an arrangement of the peptide chain resulting from the operation of chance is very low (4.27 × 10⁻¹⁸) compared with the theoretical number of permutations (4.36 × 10²²); in other words, there are only some 100,000 structures of the same composition which would show the same degree of regularity.
FIG. 8. Internal regularities in the structure of the heme peptide from Chromatium.
Yet another regularity in the arrangement of certain parts of protein molecules becomes apparent if the alternate residues in certain sequences are matched against other, full sequences (of the same or opposite sequence sense) (Šorm, 1962a, b, c). In some proteins, e.g., the α- and β-chains of hemoglobin, such a comparison reveals the frequent occurrence of sequence pairs of this type, with one member of the pair an “abbreviated” (by omission of alternate residues) version of the other, with standard interchanges again in evidence. In some instances members of such a pair are set between identical or analogous di- and tripeptide sequences. The probability of chance occurrence of this type of analogy is rather high in comparison with the probability of recurrence of full identical or analogous sequences; however, they are of interest by virtue of their high incidence in structures such as that of hemoglobin and of cytochrome c. Some examples of this type of structural resemblance, which may be classed as a kind of repetition, are given in Fig. 9.
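The notion of contiguous repetition used in this section can be stated operationally. The following is a minimal sketch and not the authors' procedure: it scans a dot-separated sequence for a unit of k residues that is immediately followed by an identical unit, and applies the scan to the α-chain fragment 78-85 and the β-chain fragment 18-23 of Fig. 7. The function name and the criterion of exact identity (standard interchanges are ignored) are simplifying assumptions.

```python
def contiguous_repeats(sequence, k):
    """Find every k-residue unit that is immediately followed by an identical unit.

    sequence is a dot-separated string of residue abbreviations.
    Returns a list of (start, unit) pairs, where start is the 1-based
    position of the first copy within the fragment supplied.
    """
    residues = [r.strip() for r in sequence.split(".") if r.strip()]
    hits = []
    for i in range(len(residues) - 2 * k + 1):
        unit = residues[i:i + k]
        if unit == residues[i + k:i + 2 * k]:
            hits.append((i + 1, ".".join(unit)))
    return hits


# Fragments as given in Fig. 7 (hemoglobin A).
ALPHA_78_85 = "Asp.Ala.Leu.Ser.Ala.Leu.Ser.Asp"
BETA_18_23 = "Val.Asp.Val.Asp.Glu.Val"

if __name__ == "__main__":
    print(contiguous_repeats(ALPHA_78_85, 3))
    print(contiguous_repeats(BETA_18_23, 2))
```

In the α-chain fragment the repeated unit starts at the second residue supplied, that is at position 79 of the chain, in agreement with the doubling in positions 79-84 mentioned in the text. Admitting standard interchanges would require replacing the equality test with a comparison over classes of interchangeable residues.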
D. Symmetrical Arrangement of Peptide Chains
A special case of linear analogy in peptide chains is the symmetrical development of sequences, in opposite sequence sense, in both directions from a single “central” amino acid residue (Šorm, 1961a, b; Keil and Šorm, 1962). Here again all the familiar structural variations such as standard interchanges, short-range rearrangements, and interpolated residues are met with. Examples have been found in all known protein structures in varying numbers; they appear to be particularly frequent in both chains of hemoglobin and in myoglobin but are rare in cytochrome c. In the two first-named proteins quite complex structures of this type can be made out; they are shown in Fig. 10 together with simpler examples from ribonuclease. In some cases such “symmetrical” structures extend over large regions of the peptide chain, as in the hemoglobin chain (Šorm, 1962a; Keil and Šorm, 1962) (Fig. 11). A symmetrical histidine residue (No. 20, α-chain) lies exactly at the junction of two α-helix units.
FIG. 9. Peptide fragments recurring with the omission of certain residues. Panels show fragments of hemoglobin A (α-chain) and cytochrome c.
FIG. 10. Symmetrical development of sequences along the peptide chain. Panels show fragments of ribonuclease, myoglobin, and bradykinin (Elliott et al., 1961).
E. Regularities in the Spacing of Amino Acid Residues
The fibrous proteins such as collagen and particularly silk fibroin, and the protamines (clupeine, salmine), appear to exhibit a high degree of regularity (periodicity) in the spacing of certain amino acid residues along the peptide chain. This phenomenon was in its time the basis on which Bergmann and Niemann (1937) built their theory of the regular periodicity of amino acid residues along the protein chains. Thus, salmine has afforded peptide fragments with arginine in 1,4 spacing in high yield, and in silk
fibroin the dipeptide sequence Ala.Gly is thought to alternate with the analogous Ser.Gly over large regions of the chain. However, even in the globular proteins some regularities in the spacing of certain amino acid residues can occasionally be made out. A particularly striking example of this kind is provided by the spacing of lysine and half-cystine in ribonuclease (Šorm, 1961a) (Fig. 12). In four instances, lysine residues are spaced 30 residues apart; in three cases alternate residues are involved; in the fourth, two residues are omitted. In two further instances, alternate residues are spaced 29 and 32 positions
12.5, where tyrosine is almost wholly in the form of the divalent anion. But there appears to be no simple way to evaluate ε(− − +), since at pH values between 12.5 and 7.5 a tyrosine solution contains a mixture of at least three microscopic species, none in negligible quantity. A similar problem arises in the spectrophotometric titration of cysteine (Section V,B,2). The study by Martin et al. is of interest not only for the rationalization of the electrometric and spectrophotometric measurements in terms of the microconstants, but also because the spectrophotometric titration of tyrosine relates so closely to similar studies in proteins. In particular, the multiple H+-equilibria of tyrosine result from the close juxtaposition of amino and phenolic groups in the same molecule; under these circumstances the ionizations are mutually interacting. We suggest that some of the anomalies seen in tyrosyl ionization in proteins may arise in a similar fashion, but in terms of magnitude, this mechanism clearly cannot account for such anomalous tyrosyl groups as those seen in ribonuclease or ovalbumin. Katchalski and Sela (1953) studied the spectra and the spectrophotometric titration of polytyrosine (number average D.P. = 30). The behavior of these polymers was consistent with that of tyrosyl residues in a polyelectrolyte molecule. In a later paper, the same authors (Sela and Katchalski, 1956) reported spectrophotometric titrations of copolymeric polyamino acids containing tyrosyl and acidic, neutral, or basic residues in an attempt to demonstrate the often-hypothesized tyrosyl-carboxylate bond. The copolymers studied and their residue ratios were L-Tyr, L-Asp (1:1, 1:3, 1:9); L-Tyr, L-Lys (1:3, 1:9); and L-Tyr, DL-Ala (1:9).
The results showed that tyrosyl groups were freely and reversibly ionizable in all the polymers, and that the shift of apparent pK for tyrosyl groups was up in the aspartic acid copolymers and down in the lysine copolymers, a result consistent with the simplest electrostatic considerations. No evidence was found for tyrosyl-carboxylate bonds. One may infer either that tyrosyl-carboxylate bonds do not exist in these model compounds, or that these experiments were inadequate to demonstrate them. The question of putative tyrosyl-carboxylate bonds is further discussed in Sections VI,C and VI,D. Katchalski and Sela (1953) also studied poly-3,5-diiodotyrosine, finding here an inexplicably high intrinsic pK of 7.7 (after applying a reasonable electrostatic correction). An apparently adequate model compound, 3,5-diiodotyrosine, has a phenolic pK’ = 6.4 (Crammer and Neuberger, 1943; Gemmill, 1955).
2. Cysteinyl Groups and Thiols
Dissolved oxygen rapidly oxidizes simple thiols, through a mercaptide ion intermediate, to disulfides, in the presence of traces of certain metallic
ions. Because of this difficulty it was not until the work of Benesch and Benesch (1955) that reliable spectra of the mercaptide form of cysteine were reported. These are shown in Fig. 11, for varying degrees of thiol ionization. The progressive long-wave shift of the absorption peak with increasing pH is a secondary effect, due to the overlapping ionizations of the thiol and amino groups. By measuring absorptivity as a function of pH, these authors were able to supplement conventional titration data in such a way as to obtain four “microconstants” relating the equilibria between hydrogen ion and the four main ionic species existing in cysteine solutions between pH 7 and pH 12. These species can be represented as HSCysNH₃⁺, ⁻SCysNH₃⁺, HSCysNH₂, and ⁻SCysNH₂.
FIG. 11. Absorption spectra of cysteine (optical density against wavelength, 220-250 mμ; curves labeled pH 9.03, 9.85, 11.04, and 12.0); concentration of cysteine, 1.70 × 10⁻⁴ M (Benesch and Benesch, 1955).
Despite the reasonably good self-consistency of the results, two points in this work may be questioned. First, uncertainty was introduced by not operating at constant ionic strength, and second, the somewhat uncertain assumption was made that the molar absorbance is the same for ⁻SCysNH₃⁺ as for ⁻SCysNH₂. The first question has been answered by Gorin and Clary (1960), who find that the ratio of ⁻SCysNH₃⁺ to HSCysNH₂ does not change with ionic strength, and approximately confirm the Beneschs’ results on the absolute value of this ratio. The second criticism is more difficult to evaluate in terms of probable error. The studies with tyrosine and O-methyl tyrosine (Sections IV,A,2 and VI,C) show a small but real effect of the ammonium ionization on the phenolic chromophore, both in peak height and
peak position, but there is no necessary reason for expecting a close parallel between the n → σ* transition of cysteine and the π → π* transition of tyrosine. As support for the validity of the assumption of an equal absorptivity for ⁻SCysNH₂ and for ⁻SCysNH₃⁺, Benesch and Benesch note that the absorptivity for cysteamine (pK’ = 8.35 for the −SH proton) is essentially unchanged at its peak from pH 10 to pH 12 (pK’ = 10.7 for the −NH₃⁺ proton), although the peak position does shift from 2320 to 2360 Å. We must tentatively accept this working assumption, which is supported by the consistency between the spectrophotometric and electrometric titration results. DeDeken and co-workers (1956) have reported spectrophotometric studies of the ionization of cysteine and related thiols, but have not accounted for multiple ionic species. In addition to using spectral measurements for determining ionization microconstants, Benesch and Benesch (1955) also determined the heat of ionization of the sulfhydryl group of thioglycolic acid. The result from their absorbance measurements, 6.5 kcal/mole, was in good agreement with their value of 7.0 kcal obtained from electrometric measurement of ∂pH/∂T for a thioglycolic acid/thioglycolate buffer. It may seem odd that the technique of spectrophotometric titration, so extensively applied to the tyrosyl residues of proteins, has not been used in the study of protein sulfhydryl groups. The only study of this sort known to the author is the determination of the ionization constant of −SH groups of thiolated gelatin (introduced by reaction with N-acetylhomocysteine thiolactone). By measuring the change in absorptivity as a function of pH at 2380 Å, Benesch and Benesch (1958) found the apparent acidity of the thiol groups of the modified gelatin (pK’ = 9.8) to be nearly the same as that of an appropriate model compound, N-acetylhomocysteine (pK’ = 10.0).
The very low content of aromatic amino acids in gelatin provides a favorable case for studying sulfhydryl ionization in a protein. The usual extent of interference by aromatic groups is suggested by a consideration of their absorptivities. The increment in molar absorptivity on thiol ionization at 2380 Å is about 4.5 × 10³. For tryptophan, at the same wavelength, ε ≈ 2.5 × 10³, and is relatively constant over the pH range of interest. (It must be noted, however, that since dε/dλ is large for tryptophan at 2380 Å, ε will be particularly sensitive to perturbations, as discussed in Section VI,B.) In contrast, the presence of tyrosine in a protein poses the problem of a large variable absorptivity, varying from ~0.6 × 10³ at pH 7 to ~10 × 10³ at pH 12. Thus, there is a large and variable absorptivity of tyrosine and tryptophan in the 2400 Å region, which is otherwise suitable for measuring thiol ionization. For these reasons, the prospects for spectrophotometric titrations of protein sulfhydryl groups seem generally poor.
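The relation between the two macroscopic constants obtained by titration and the four microconstants of cysteine discussed in this section can be sketched for a generic two-site scheme. If R denotes the ratio of ⁻SCysNH₃⁺ to HSCysNH₂ found spectrophotometrically, then K1 = kS + kN, R = kS/kN, and kS·kSN = kN·kNS = K1·K2. The input numbers below (pK1 = 8.3, pK2 = 10.5, R = 2) are placeholders chosen for illustration and are not the values reported by Benesch and Benesch. The function and key names are hypothetical.

```python
import math


def microconstants(pk1, pk2, ratio_s_to_n):
    """Microscopic pK values for a dibasic acid with two interacting sites.

    pk1, pk2      macroscopic pK values from titration
    ratio_s_to_n  concentration ratio of the thiol-ionized to the
                  ammonium-ionized singly deprotonated species (k_S / k_N)

    Returns pK values for the four steps:
      pk_s   HS-NH3+ -> -S-NH3+     pk_n   HS-NH3+ -> HS-NH2
      pk_sn  -S-NH3+ -> -S-NH2      pk_ns  HS-NH2  -> -S-NH2
    """
    K1 = 10.0 ** (-pk1)
    K2 = 10.0 ** (-pk2)
    k_s = K1 * ratio_s_to_n / (1.0 + ratio_s_to_n)
    k_n = K1 / (1.0 + ratio_s_to_n)
    k_sn = K1 * K2 / k_s
    k_ns = K1 * K2 / k_n
    return {
        "pk_s": -math.log10(k_s),
        "pk_n": -math.log10(k_n),
        "pk_sn": -math.log10(k_sn),
        "pk_ns": -math.log10(k_ns),
    }


if __name__ == "__main__":
    # Placeholder inputs, for illustration only.
    for name, value in microconstants(8.3, 10.5, 2.0).items():
        print(f"{name} = {value:.2f}")
```

Because the two pathways to the doubly ionized species must give the same over-all constant, only three of the four microconstants are independent. This is why a single spectrophotometric ratio, added to the two titration constants, suffices to fix the scheme.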
3. Carboxyl and Amino Groups
Although carboxyl and amino groups both undergo absorptivity changes on ionization, their absorption bands are so weak (Ley and Arends, 1932) in the usual practical wavelength range (>2000 Å) as to offer little possibility for spectrophotometric titrations directly. Such titrations can sometimes be carried out if the ionizing group is situated close to a longer wavelength chromophore. It may then be possible to visualize the titration of an amino or carboxyl group by its perturbing influence on a more easily accessible chromophore. Saidel (1955a) demonstrated both the carboxyl and ammonium ionizations of glycylglycine in the pH-dependence of the peptide absorptivity at 2050 Å. The possibility of a spectrophotometric titration of the amino and carboxyl groups of O-methyltyrosine by a perturbation method (using difference spectral measurements at 2840 Å) was briefly indicated by Wetlaufer et al. (1958). For tryptophan, Hermans et al. (1960) not only reported the spectrophotometric titration of the amino and carboxyl groups, but also determined the heat of ionization of these two groups from the temperature-dependence of their apparent pK’s. The spectrophotometric titration of poly-L-glutamic acid was first done by Imahori and Tanaka (1959). The primary interest in this titration is the conformation-dependence (Section IV,E,3) of the peptide-bond absorptivity. Since measurements were made well below 2000 Å, the spectral change due to carboxyl ionization was only one-quarter of the total change. The titration curve obtained by Imahori and Tanaka is biphasic in shape, unlike simple titration curves, but this is perhaps not unexpected with a conformational transition accompanying the titration. A complete rationalization of the titration of polyglutamic acid has not yet been achieved, although good progress has been made with the electrometric studies of Wada (1960).
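The spectrophotometric titrations described in this section rest on a common reduction of the data: an absorbance at fixed wavelength is converted to a degree of ionization α, and each point then yields an apparent pK’ through pK’ = pH − log[α/(1 − α)]. A minimal sketch for a single ionizing group follows. The absorbance readings are invented for illustration and the function names are hypothetical.

```python
import math


def degree_of_ionization(a_obs, a_acid, a_base):
    """Fraction ionized, from absorbances of the sample and of the two limiting forms."""
    return (a_obs - a_acid) / (a_base - a_acid)


def apparent_pk(ph, a_obs, a_acid, a_base):
    """Apparent pK' from one point: pK' = pH - log10(alpha / (1 - alpha))."""
    alpha = degree_of_ionization(a_obs, a_acid, a_base)
    if not 0.0 < alpha < 1.0:
        raise ValueError("absorbance lies outside the range of the limiting forms")
    return ph - math.log10(alpha / (1.0 - alpha))


if __name__ == "__main__":
    A_ACID, A_BASE = 0.100, 0.600                  # invented limiting absorbances
    readings = [(9.0, 0.145), (10.0, 0.350), (11.0, 0.555)]  # invented (pH, absorbance)
    for ph, a_obs in readings:
        print(f"pH {ph:4.1f}  pK' = {apparent_pk(ph, a_obs, A_ACID, A_BASE):.2f}")
```

For a simple titration the apparent pK’ obtained in this way is the same at every point. A systematic drift with pH indicates electrostatic interaction or, as in the polyglutamic acid case, an accompanying conformational transition.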
C. Proteins
1. General
Anomalous tyrosyl ionization in proteins was first demonstrated in the pioneering study of Crammer and Neuberger (1943), who showed by spectrophotometric titration that most of the tyrosyl groups of ovalbumin did not ionize at all below pH 12, and that their eventual ionization at higher alkalinity apparently occurred only with concomitant denaturation of the protein. These results were in contrast with those obtained for insulin, where the phenolic pK’ appeared high, compared with that of tyrosine, but tyrosyl ionization was reversible and no protein denaturation occurred. Following the study of Crammer and Neuberger, numerous investigations on protein tyrosyls by the spectrophotometric titration technique have been published. These are summarized in Table II.
TABLE II
Tabular Summary of Spectrophotometric Titrations

Substance | Finding | Reference
(a) Ovalbumin; (b) Insulin | (a) Most Tyr’s ionize at high pH with irreversible denaturation; (b) Tyr’s titrate reversibly | Crammer and Neuberger (1943)
Lysozyme | Tyr pK high but ionization reversible | Fromageot and Schnek (1950)
Ribonuclease | 3 of 6 Tyr ionize only at high pH and with denaturation | Shugar (1952)
Bovine serum albumin | Tyr pK high, ionization reversible | Tanford and Roberts (1952)
Lysozyme | Tyr pK high, titration curve flat | Tanford and Wagner (1954)
Ribonuclease | Confirm and extend findings of Shugar (1952) | Tanford et al. (1955)
Fibrinogen, silk fibroin, and γ-globulin | pK’s for part of Tyr’s high; also titrations of insoluble films | Schauenstein (1955)
Glycylglycine and acetylglycine | Substantial absorptivity changes at 2050 Å follow N-terminal and C-terminal ionizations | Saidel (1955a)
Tyrosine, AcTyrOEt, and BzArgOEt | Carboxyl pK’s determined from difference spectrum as f(pH) | Schwert and Takenaka (1955)
Cysteine and related thiols | Determined “microconstants” for the multiple −SH dissociation pathways; ΔH for −SH ionization | Benesch and Benesch (1955)
Proteins, model amides, and peptides | Assignment of pK’ values for amide protonation in strong H₂SO₄ | Goldfarb et al. (1955, 1958)
Chymotrypsinogen | 2 of 4 Tyr’s ionize only at high pH and irreversibly | Wilcox (1957)
Fibrinogen, trypsin, γ-globulin, and bovine fraction III | Extent of Tyr ionization at pH 11 increases after urea denaturation | Ungar et al. (1957)
β-Lactoglobulin | Tyr pK high, but titration reversible; expanded protein conformation indicated at pH > 10 | Tanford and Swanson (1957)
Ribonuclease compared with guanidinated RNase | No difference in the Tyr ionization pattern | Klee and Richards (1957)
Ribonuclease | All 6 Tyr’s titrate as one class in 8 M urea | Blumenfeld and Levy (1958)
Ribonuclease | All Tyr’s titrate as one class in glycol; 3 anomalous Tyr’s regenerated on glycol removal | Sage and Singer (1958)
Thiolated gelatin | Thiol pK apparently normal | Benesch and Benesch (1958)
Chymotrypsin and chymotrypsinogen | Confirm and extend Wilcox (1957) | Chervenka (1959)
Poly-L-glutamic acid | Peptide absorption spectrum from 1900-2400 Å is conformation-dependent | Imahori and Tanaka (1959)
Ribonuclease, ovalbumin, β-lactoglobulin, serum albumin | Performic acid oxidation “normalizes” all Tyr’s; heat-denaturation normalizes only a part | Tramer and Shugar (1959)
Ribonuclease | Pepsin-inactivated RNase has 1 anomalous Tyr; oxidized RNase has none anomalous | Bigelow (1960)
Myosin and meromyosins | Some Tyr’s in myosin, and most in light meromyosin, titrate only at high pH, irreversibly | Stracher (1960)
Ribonuclease in urea-guanidinium chloride solution | All Tyr’s titrate as one group; electrostatic factor high | Cha and Scheraga (1960)
Trypsinogen | Three classes of tyrosyl ionization, one class irreversible | Smillie and Kay (1961)
Lysozyme, insulin, and catalase | At high pH, irreversibly titrating Tyr’s found in all three proteins | Inada (1961)
Conalbumin with and without Fe(III) | 11 of 18 Tyr’s titrate normally; remainder at high pH, irreversibly; with Fe(III), 5 of 18 Tyr’s titrate normally | Wishnia et al. (1961)
Ribonuclease | Analysis of low-temperature, low pH transconformations from Tyr Δ-spectra | Hermans and Scheraga (1961a, b)
Papain | Three classes of Tyr ionization; titration irreversible above pH 12 | Glazer and Smith (1961b)
3 of 4 Tyr’s ionize at pH