Short-length peptides contact map prediction using Convolutional Neural Networks
This article considers an approach for predicting the contact matrix (contact map) of short-length peptides. A contact matrix is a two-dimensional representation of a protein. It can be used for tertiary structure reconstruction or as a starting approximation in energy minimization models. For this work, peptides with a chain length from 15 to 30 were chosen to test the model and simplify the calculations. Convolutional neural networks (CNNs) were used as the prediction tool because the feature space of each peptide is represented as a two-dimensional matrix. The SCRATCH tool was used to generate the secondary structure, solvent accessibility, and profile matrix (PSSM) for each peptide. The CNN was implemented in the Python programming language using the Keras library. To work with the common PDB format, which presents the structure information of proteins, the BioPython module was used. As a result, training, validation, and test samples were generated, and a multilayer multi-output convolutional neural network was constructed, trained, and validated. Experiments were conducted on the test sample to predict the contact matrix and compare it with the native one. To assess the quality of prediction, conjunction matrices for the thresholds of 8 and 12 Å were formed, and the F1-score, recall, and precision metrics were calculated. According to the F1-score, even a small neural network can achieve quite good results. At the final step, the FT-COMAR tool was used to reconstruct the tertiary structure of the proteins from their contact matrices. The results show that for structures reconstructed from the 12 Å threshold contact matrix, the RMSD metric is better
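The evaluation step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that a contact is defined as two residues lying closer than the threshold (8 or 12 Å), that scoring is done over the upper triangle of the symmetric matrix, and it uses toy coordinates invented for illustration. The function names (`distance_matrix`, `contact_map`, `precision_recall_f1`) and the `min_separation` parameter are hypothetical.

```python
import math


def distance_matrix(coords):
    """Pairwise Euclidean distances (in Å) between residue coordinates."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def contact_map(dist, threshold):
    """Binary contact matrix: 1 where the distance is below the threshold."""
    return [[1 if d < threshold else 0 for d in row] for row in dist]


def precision_recall_f1(native, predicted, min_separation=1):
    """Score predicted contacts against native ones.

    Only pairs (i, j) with j - i >= min_separation are counted, so the
    diagonal is skipped and each symmetric pair is counted once.
    """
    tp = fp = fn = 0
    n = len(native)
    for i in range(n):
        for j in range(i + min_separation, n):
            if predicted[i][j] and native[i][j]:
                tp += 1
            elif predicted[i][j]:
                fp += 1
            elif native[i][j]:
                fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy data (assumption): five residues on a line, 3.8 Å apart, and a
# "predicted" structure in which the last residue is misplaced.
native_coords = [(3.8 * i, 0.0, 0.0) for i in range(5)]
predicted_coords = native_coords[:4] + [(11.0, 0.0, 0.0)]

for threshold in (8.0, 12.0):
    native = contact_map(distance_matrix(native_coords), threshold)
    predicted = contact_map(distance_matrix(predicted_coords), threshold)
    p, r, f1 = precision_recall_f1(native, predicted)
    print(f"{threshold:>4} Å  precision={p:.3f}  recall={r:.3f}  F1={f1:.3f}")
```

In practice the native coordinates would come from a PDB file and the predicted matrix from the network output; both are replaced here by hand-made values so the sketch stays self-contained.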