Machine Learning Approaches to Modeling the Physiochemical Properties of Small Peptides

Jensen, Kyle; Styczynski, Mark; Stephanopoulos, Gregory

dc.contributor.author	Jensen, Kyle
dc.contributor.author	Styczynski, Mark
dc.contributor.author	Stephanopoulos, Gregory
dc.date.accessioned	2005-12-16T14:52:55Z
dc.date.available	2005-12-16T14:52:55Z
dc.date.issued	2006-01
dc.identifier.uri	http://hdl.handle.net/1721.1/30388
dc.description.abstract	Peptide and protein sequences are most commonly represented as strings: a series of letters selected from the twenty-character alphabet of abbreviations for the naturally occurring amino acids. Here, we experiment with representations of small peptide sequences that incorporate more physiochemical information. Specifically, we develop three different physiochemical representations for a set of roughly 700 HIV-1 protease substrates. These representations are used as input to an array of six different machine learning models, which predict whether a given peptide is likely to be an acceptable substrate for the protease. Our results show that, in general, higher-dimensional physiochemical representations tend to perform better than representations incorporating fewer dimensions selected on the basis of high information content. We contend that such representations are more biologically relevant than simple string-based representations and are likely to more accurately capture peptide characteristics that are functionally important.	en
dc.description.sponsorship	Singapore-MIT Alliance (SMA)	en
dc.format.extent	331891 bytes
dc.format.mimetype	application/pdf
dc.language.iso	en	en
dc.relation.ispartofseries	Molecular Engineering of Biological and Chemical Systems (MEBCS)	en
dc.subject	Machine learning	en
dc.subject	peptides	en
dc.subject	modeling	en
dc.subject	physio-chemical properties	en
dc.title	Machine Learning Approaches to Modeling the Physiochemical Properties of Small Peptides	en
dc.type	Article	en
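
The abstract contrasts string-based and physiochemical representations of a peptide. The sketch below illustrates that contrast only; it is not the authors' method, and the abstract does not say which properties their three representations use. It assumes a one-hot encoding as the string-based baseline and the Kyte-Doolittle hydropathy index as a single illustrative physiochemical property. The function names and the example peptide are hypothetical.

```python
# Illustrative only: two ways to turn a peptide string into a numeric
# feature vector that a machine learning model could consume.

# Kyte-Doolittle hydropathy index, one value per amino acid.
# (Assumed here as an example property; not taken from the paper.)
KD_HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
ALPHABET = sorted(KD_HYDROPATHY)  # the twenty one-letter codes


def one_hot(peptide):
    """String-based view: 20 indicator values per residue."""
    features = []
    for residue in peptide:
        features.extend(1.0 if residue == aa else 0.0 for aa in ALPHABET)
    return features


def hydropathy(peptide):
    """Physiochemical view: one hydropathy value per residue."""
    return [KD_HYDROPATHY[residue] for residue in peptide]


peptide = "SQNYPIVQ"  # hypothetical eight-residue input
string_features = one_hot(peptide)        # 8 residues x 20 letters
physchem_features = hydropathy(peptide)   # 8 residues x 1 property
print(len(string_features), len(physchem_features))
```

A richer physiochemical representation would stack several properties per residue, which raises the dimensionality in the way the abstract describes.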

Files in this item

Name: MEBCS010.pdf
Size: 324.1 KB
Format: PDF

This item appears in the following Collection(s)

Molecular Engineering of Biological and Chemical Systems (MEBCS)
