COMPUTER-BASED PROTEIN DESIGN

Computer-based protein design is a powerful tool in modern molecular biology and
chemistry research. Its biggest advantage is the ability to create completely new proteins,
not previously observed in experimental studies, for applications in areas such as medicine, industry and
energy. It also enables the precise design of protein structures and the prediction of their
properties and functions, which is important for understanding biological processes
and developing new treatments.


At the Institute for Artificial Intelligence Research and Development of Serbia, researchers
design short proteins that bind to selected target proteins and thereby block their function.
The design process comprises many supporting steps needed to optimize
performance. For example, before final laboratory testing, it is desirable to rank the generated proteins
by how likely they are to be safe for humans. Currently, the Institute is adapting
the diffusion model RFdiffusion, which is based on RoseTTAFold, a model that predicts protein
structure from the amino acid sequence. Diffusion machine learning models are trained by
learning to remove noise that has been added to real data. The output of the RFdiffusion network is a series of
residue frames (a position and orientation for each amino acid), arranged into the desired structure. The amino acid
sequence obtained for this structure is then used as input to RoseTTAFold, which produces a structure
prediction for comparison with the structure generated by the RFdiffusion network.
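
The idea that diffusion models learn by removing noise from real data can be illustrated with a minimal sketch. It assumes a generic noise-prediction setup on plain coordinates with a linear noise schedule. It is not the RFdiffusion training procedure, which operates on residue frames, and the function names and schedule values are hypothetical choices made for illustration.

```python
import math
import random

def alpha_bar(t, num_steps, beta_start=1e-4, beta_end=0.02):
    """Fraction of signal left after t noising steps (linear schedule, assumed)."""
    prod = 1.0
    for s in range(1, t + 1):
        beta = beta_start + (beta_end - beta_start) * (s - 1) / max(num_steps - 1, 1)
        prod *= 1.0 - beta
    return prod

def add_noise(x0, t, num_steps, rng):
    """Forward process: return the noised sample x_t and the noise that was added."""
    a = alpha_bar(t, num_steps)
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, eps)]
    return xt, eps

def denoise(xt, eps_pred, t, num_steps):
    """Estimate the clean sample from x_t and a predicted noise vector."""
    a = alpha_bar(t, num_steps)
    return [(x - math.sqrt(1.0 - a) * e) / math.sqrt(a) for x, e in zip(xt, eps_pred)]

def noise_loss(eps_pred, eps_true):
    """Mean squared error between predicted and true noise (the training signal)."""
    return sum((p - q) ** 2 for p, q in zip(eps_pred, eps_true)) / len(eps_true)

rng = random.Random(0)
x0 = [1.5, -0.3, 2.0]  # toy "real data", e.g. the coordinates of one atom
xt, eps = add_noise(x0, t=50, num_steps=100, rng=rng)

# A perfect denoiser would predict eps exactly; a trained network approximates it.
recovered = denoise(xt, eps, t=50, num_steps=100)
print(noise_loss(eps, eps))  # 0.0 for a perfect prediction
```

During training, a network would receive `xt` and `t`, output its own estimate of `eps`, and be updated to reduce `noise_loss`. Generation then starts from pure noise and applies the learned denoising step repeatedly.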
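
The comparison between the RoseTTAFold prediction and the structure produced by the diffusion network can be sketched with a simple agreement score. The example below uses the distance-matrix RMSD (dRMSD), chosen here only because it needs no superposition step. The source does not state which metric the Institute uses, and the coordinates are invented toy values.

```python
import math

def pairwise_distances(coords):
    """Distances between all pairs of points (i < j) in a list of (x, y, z) tuples."""
    return [math.dist(coords[i], coords[j])
            for i in range(len(coords))
            for j in range(i + 1, len(coords))]

def drmsd(coords_a, coords_b):
    """Distance-matrix RMSD between two structures with matching residue order."""
    if len(coords_a) != len(coords_b):
        raise ValueError("structures must have the same number of residues")
    if len(coords_a) < 2:
        raise ValueError("need at least two residues")
    da = pairwise_distances(coords_a)
    db = pairwise_distances(coords_b)
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(da, db)) / len(da))

# Toy C-alpha coordinates in angstroms (hypothetical values).
designed = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]

# Same shape, only moved in space: the score should be (numerically) zero.
predicted_shifted = [(x + 10.0, y - 2.0, z + 5.0) for x, y, z in designed]

# Last residue displaced: the score should be clearly positive.
predicted_distorted = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 2.0)]

print(drmsd(designed, predicted_shifted))
print(drmsd(designed, predicted_distorted))
```

A low score suggests that the designed sequence folds into the intended shape. Note that dRMSD cannot distinguish a structure from its mirror image, so a superposition-based RMSD may be preferable in practice.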