Quasi-supervised strategies for compound-protein interaction prediction [Master Thesis]

Çakı, Onur

Date

2021-07

Authors

Çakı, Onur

Publisher

Izmir Institute of Technology

Abstract

In-silico prediction of compound-protein interactions remains important in many pharmacology applications because wet-lab experiments are time-consuming, laborious, and costly. Most machine learning methods proposed for this purpose treat the problem as supervised learning, in which known interactions are labeled positive and all remaining pairs are labeled negative. However, treating all unknown interactions as negative instances may lead to inaccuracies in practice, since some of the unknown interactions are bound to be positive interactions that have not yet been identified. In this study, we propose to address this problem using the Quasi-Supervised Learning algorithm. In this framework, potential interactions are predicted by estimating the overlap between two datasets: a true positive dataset, which consists of compound-protein pairs with known interactions, and an unknown dataset, which consists of all the remaining compound-protein pairs. The potential interactions are then identified as those pairs in the unknown dataset that overlap with the interacting pairs in the true positive dataset in terms of their similarity structure. Experimental results on the GPCR and Nuclear Receptor datasets show that the proposed method can identify actual interactions from among all possible compound-protein combinations.
Because laboratory experiments for determining compound-protein interactions are time-consuming, laborious, and costly, in-silico prediction of compound-protein interactions using computational methods remains important. Many machine learning methods developed for this purpose have approached the problem with supervised learning strategies in which known interactions are labeled positive and all remaining interactions are labeled negative. However, since the unknown interactions also include positive interactions waiting to be uncovered, treating all unknown interactions as negative examples will lead to erroneous results in real applications. This study aims to solve this problem with the Quasi-Supervised Learning algorithm. In this framework, potential interactions are predicted by estimating the overlap of two datasets: a true positive dataset consisting of compound-protein pairs known to interact, and an unknown dataset consisting of all the remaining compound-protein pairs. Compound-protein pairs in the unknown dataset that overlap, in terms of structural similarity, with the interacting pairs in the true positive dataset are identified as potential interactions. Experimental results on the GPCR and Nuclear Receptor datasets show that the proposed method can identify actual interactions among all possible pairs.
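
The overlap idea described in the abstract can be illustrated with a small sketch. This is not the Quasi-Supervised Learning algorithm used in the thesis; it is a simplified nearest-neighbour stand-in, assuming each compound-protein pair has already been mapped to a numeric feature vector. The function name `overlap_scores`, the neighbourhood size `k`, the 0.5 threshold, and the toy 2-D features are all hypothetical choices made for illustration.

```python
import math


def overlap_scores(positives, unknowns, k=3):
    """Score each unknown pair by how strongly it overlaps the positive set.

    The score is the fraction of the pair's k nearest neighbours (among all
    other pairs, positive and unknown) that are known positives. This is a
    simplified stand-in for an overlap estimate, not the thesis's method.
    """
    # Tag every feature vector with its dataset: 1 = known positive, 0 = unknown.
    labelled = [(p, 1) for p in positives] + [(u, 0) for u in unknowns]
    scores = []
    for i, u in enumerate(unknowns):
        self_index = len(positives) + i
        # Distances to every other pair, excluding the query pair itself.
        others = [
            (math.dist(u, x), label)
            for j, (x, label) in enumerate(labelled)
            if j != self_index
        ]
        others.sort(key=lambda t: t[0])
        nearest = others[:k]
        scores.append(sum(label for _, label in nearest) / k)
    return scores


# Toy, made-up 2-D features: one unknown pair sits inside the positive
# cluster, the other four form a separate cluster far away from it.
positives = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1)]
unknowns = [(0.05, 0.05), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (4.9, 5.0)]

scores = overlap_scores(positives, unknowns, k=3)
# Unknown pairs whose neighbourhood is dominated by positives become candidates.
candidates = [u for u, s in zip(unknowns, scores) if s > 0.5]
print(candidates)
```

In this toy setting only the unknown pair lying inside the positive cluster is flagged as a potential interaction; the unknown pairs in the distant cluster have no positive neighbours and are left unflagged.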

Description

Thesis (Master)--Izmir Institute of Technology, Electronics and Communication Engineering, Izmir, 2021
Includes bibliographical references (leaves: 54-59)
Text in English; Abstract: Turkish and English

ORCID

0000-0002-5068-1356

Keywords

Bioinformatics, Cheminformatics, Machine learning, Quasi-supervised learning

URI

http://standard-demo.gcris.com/handle/123456789/5433

Collections

Master Tezleri
