Fast and Accurate Identification of Cross-Linked Peptides for the Structural Analysis of Large Protein Complexes and Elucidation of Interaction Networks. / Tahir, Salman; Bukowski-Wills, Jimi-Carlo; Rasmussen, Morten; Rappsilber, Juri.

Morten Rasmussen

Publication: Conference contribution without publisher/journal, Poster, Research

Abstract

Fast and Accurate Identification of Cross-Linked Peptides for the structural analysis of large protein complexes and to elucidate interaction networks. Salman Tahir; Jimi-Carlo Bukowski-Wills; Morten Rasmussen; Juri Rappsilber. Wellcome Trust Centre for Cell Biology, Edinburgh, United Kingdom

 

Novel Aspect: Our software efficiently and correctly identifies cross-links within large protein complexes, facilitating the construction of low-resolution 3D models and interaction networks.

Introduction
Chemical cross-linking of peptides coupled with mass spectrometry is emerging as a powerful method to investigate protein structure and protein-protein interactions. When applied to single proteins or small purified protein complexes, this methodology works well. However, certain challenges arise when it is applied to more complex samples. One of the main problems is the combinatorial increase in the search space that occurs when all peptide-peptide combinations are considered in a database search. We have developed an algorithm that finds and validates cross-linked peptides in an efficient and scalable manner by adopting a number of principles, both biological and computational.

 

Methods
We use a high-accuracy library of over 1000 synthetic peptides to understand the fragmentation behaviour of cross-linked peptides. This allows us to pre-process spectra through de-isotoping, charge reduction and the removal of loss-of-water/ammonia peaks. Furthermore, using this information we are able to reduce the complexity of searching to essentially two successive searches of linear peptides, as opposed to analyzing every possible combination of peptides that could potentially cross-link. We achieve further speedup using parallelization and data structures suited to the nature of the data we search.

Preliminary results
The complexity of searching for cross-linked peptides arises from analyzing every possible combination of peptides that could potentially cross-link and whose combined mass is approximately that of one of the unexplained observed masses. As we consider more proteins, the number of potential peptide-peptide combinations very quickly becomes infeasible to compute.
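The growth described above can be illustrated with a small counting sketch. It assumes that an exhaustive search scores every unordered peptide pair (including a peptide linked to a copy of itself) and that two successive linear searches each visit every peptide once; the function names are hypothetical and mass filtering is ignored.

```python
def pair_candidates(n_peptides: int) -> int:
    """Unordered peptide pairs, including a peptide paired with itself."""
    return n_peptides * (n_peptides + 1) // 2


def two_pass_candidates(n_peptides: int) -> int:
    """Candidates visited by two successive linear passes over the peptides."""
    return 2 * n_peptides


# Compare how both counts grow with the size of the peptide database.
for n in (1_000, 10_000, 100_000):
    print(n, pair_candidates(n), two_pass_candidates(n))
```

The pair count grows quadratically with the number of peptides, while the two-pass count grows linearly.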

We use a high-accuracy library of >1000 synthetic peptides to understand the fragmentation behaviour of cross-linked peptides. Of the most intense peaks in the annotated spectra of this library, 92.4% arise from a single fragmentation event. Moreover, we note that 92.5% of the top peaks belong to one of the two peptides comprising a cross-linked pair.
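Two of the pre-processing steps named under Methods, charge reduction and removal of loss-of-water/ammonia peaks, might look roughly like the following minimal sketch. The mass constants are standard monoisotopic values; the tolerance, the function names and the assumption that peaks are already singly charged before neutral-loss removal are illustrative choices, not details from the abstract. De-isotoping is omitted.

```python
PROTON = 1.007276    # Da, mass of a proton
WATER = 18.010565    # Da, monoisotopic mass of H2O
AMMONIA = 17.026549  # Da, monoisotopic mass of NH3
TOLERANCE = 0.01     # Da, assumed matching tolerance


def to_singly_charged(mz: float, charge: int) -> float:
    """Charge reduction: the m/z the same ion would show at charge 1+."""
    return mz * charge - (charge - 1) * PROTON


def remove_neutral_loss_peaks(peaks: list[float]) -> list[float]:
    """Drop peaks lying one water or ammonia mass below another peak.

    Assumes all peaks have already been reduced to charge 1+.
    """
    kept = []
    for p in peaks:
        is_loss = any(
            abs(q - p - loss) <= TOLERANCE
            for q in peaks
            for loss in (WATER, AMMONIA)
        )
        if not is_loss:
            kept.append(p)
    return kept
```

For example, `remove_neutral_loss_peaks([300.0, 300.0 - WATER, 150.0])` keeps only the peaks at 300.0 and 150.0.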

Using this information, we are able to reduce the complexity of the cross-link search to that of a linear search. Because a cross-linked pair contains a primary, more dominant peptide that fragments better than the secondary peptide, we can first search for the primary peptide without constraining the peptide mass. The second, less prominent peptide can then be found in an ordinary database search for a modified peptide using a simplified spectrum. The spectrum is simplified by removing all peaks that are accounted for by the fragmentation of the first peptide.
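The two-step idea above can be sketched on toy data. This is not the authors' implementation: the `Peptide` class, the peak-counting score, the tolerance, the linker mass and all masses in the example are assumptions made for illustration, and fragments that carry the partner peptide are ignored.

```python
from dataclasses import dataclass

TOLERANCE = 0.02       # Da, assumed matching tolerance
LINKER_MASS = 138.068  # Da, assumed mass added by the cross-linker


@dataclass(frozen=True)
class Peptide:
    name: str
    mass: float
    fragments: tuple  # hypothetical fragment masses


def matched_peaks(peaks, peptide):
    """Peaks explained by any fragment of the peptide."""
    return [p for p in peaks
            if any(abs(p - f) <= TOLERANCE for f in peptide.fragments)]


def two_step_search(peaks, precursor_mass, database):
    # Step 1: find the primary peptide with no precursor-mass constraint.
    primary = max(database, key=lambda pep: len(matched_peaks(peaks, pep)))
    # Simplify the spectrum by removing the peaks the primary explains.
    explained = matched_peaks(peaks, primary)
    remaining = [p for p in peaks if p not in explained]
    # Step 2: ordinary search for a peptide whose mass fills the gap left
    # by the primary peptide plus linker (treated as a modification).
    target = precursor_mass - primary.mass - LINKER_MASS
    candidates = [pep for pep in database
                  if abs(pep.mass - target) <= TOLERANCE]
    if not candidates:
        return primary, None
    secondary = max(candidates,
                    key=lambda pep: len(matched_peaks(remaining, pep)))
    return primary, secondary


# Toy database and spectrum (all values invented for the example).
database = [
    Peptide("A", 500.0, (100.0, 200.0, 300.0)),
    Peptide("B", 400.0, (150.0, 250.0)),
    Peptide("C", 400.0, (120.0, 220.0)),
    Peptide("D", 700.0, (110.0,)),
]
peaks = [100.0, 200.0, 300.0, 150.0, 250.0]
primary, secondary = two_step_search(peaks, 500.0 + 400.0 + LINKER_MASS, database)
print(primary.name, secondary.name)
```

Each step scans the database once, so the work per spectrum grows with the number of peptides rather than with the number of peptide pairs.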

This approach is highly sensitive and scales well, as revealed by searching our data of synthetic cross-links against a large sequence database. Currently, against a protein database of >1300 proteins, a spectrum is searched in 0.35 seconds, a vast improvement over the exhaustive search method of combining every potential cross-link for each spectrum (60 hours). In fact, the search time is comparable to, if not better than, that of existing linear search engines. Furthermore, we auto-validate the results obtained.

Original language: English
Publication date: 2009
Status: Published - 2009
Event: 57th American Society for Mass Spectrometry Conference on Mass Spectrometry and Allied Topics - Philadelphia, USA
Duration: 31 May 2009 to 4 Jun 2009
Conference number: 57


Cite this

Rasmussen, M. (2009). Fast and Accurate Identification of Cross-Linked Peptides for the Structural Analysis of Large Protein Complexes and Elucidation of Interaction Networks. / Tahir, Salman; Bukowski-Wills, Jimi-Carlo; Rasmussen, Morten; Rappsilber, Juri. Poster session presented at the 57th American Society for Mass Spectrometry Conference on Mass Spectrometry and Allied Topics, Philadelphia, USA.
@conference{20741440003111dfaefb000ea68e967b,
title = "Fast and Accurate Identification of Cross-Linked Peptides for the Structural Analysis of Large Protein Complexes and Elucidation of Interaction Networks. / Tahir, Salman; Bukowski-Wills, Jimi-Carlo; Rasmussen, Morten; Rappsilber, Juri.",
author = "Morten Rasmussen",
note = "Volume: 20 Pages: S93; 57th ASMS Conference on Mass Spectrometry; Conference date: 31-05-2009 Through 04-06-2009",
year = "2009",
language = "English",

}
