
Splice Site Detection in DNA Sequences Using a Fast Classification Algorithm

Problem

In biological research, many problems involve processing DNA data, and these fall within the field of bioinformatics. DNA-related problems in bioinformatics include classifying groups of sequences, detecting similarity, locating the protein-coding regions within a DNA sequence (splicing), predicting molecular structure, and searching for new drug structures. Research on DNA involves large volumes of data, such as gene sequences, protein sequences, and other biological information, so processing requires relatively large amounts of time and memory.

Pattern recognition in DNA is not an easy problem. Besides the relatively large size of the data, DNA is composed of exons (which are encoded into proteins) and introns (which are not), and no explicit marker character in the sequence indicates the boundary between the two.
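Because the sequence is only a string of letters, a classifier needs a numeric view of each candidate position. The sketch below shows one common way to do that, one-hot encoding a fixed-length window. The encoding, the name `one_hot`, and the example fragment are illustrative assumptions and are not taken from the paper.

```python
NUCLEOTIDES = "ACGT"

def one_hot(window):
    """Encode a DNA window as a flat 0/1 vector, four entries per base.

    Bases outside A, C, G, T (for example N) become four zeros.
    """
    vec = []
    for base in window.upper():
        vec.extend(1 if base == n else 0 for n in NUCLEOTIDES)
    return vec

# Hypothetical 9-base fragment, used only to show the shape of the output.
sequence = "CAGGTAAGT"
features = one_hot(sequence)
print(len(features), "features for", len(sequence), "bases")
```

Each window then becomes one row of the training data, labelled as a splice site or not.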

Goal

The main aim of the method described in the paper is to predict the locations of exons and introns in large data sets with high accuracy and in reasonable time. More generally, the goal of pattern recognition in DNA is to build a system that can handle the storage, processing, and analysis of large DNA data.

Method

Previous research tends to use the SVM method to locate the boundaries between exons and introns. The paper, however, describes a fundamental weakness of SVM: its memory requirement grows with the square of the number of input samples, so the cost of training depends very strongly on the size of the data set. The proposed method is motivated by this high training complexity. The improvement reduces the amount of data used in training, on the grounds that points close to the decision boundary are the important ones, while points far from the hyperplane contribute nothing to SVM training. As a result, the training set is much smaller than the full data set used by a regular SVM.
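To make the quadratic growth concrete, here is a small back-of-the-envelope sketch. It assumes the solver stores the full n × n kernel matrix in 64-bit floats; real SVM solvers often cache only part of it, so treat the numbers as an upper-bound illustration. The function name `kernel_matrix_bytes` is hypothetical.

```python
def kernel_matrix_bytes(n_samples, bytes_per_value=8):
    """Memory for a dense n x n kernel matrix (assumes 8-byte floats)."""
    return n_samples * n_samples * bytes_per_value

for n in (1_000, 10_000, 100_000):
    gigabytes = kernel_matrix_bytes(n) / 1e9
    print(f"{n:>7} samples -> {gigabytes:g} GB")
```

Multiplying the number of samples by 10 multiplies the memory by 100, which is why shrinking the training set pays off so quickly.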

The new method, an improvement on SVM, is divided into three stages. The first stage trains on a small data set to obtain support vectors (SVs). The second stage trains a Bayesian classifier on the SVs and non-SVs obtained in the first stage, then uses it to discard input data considered less representative and to keep the important data as candidate SVs. The third stage trains the SVM on the candidate SVs produced by the second stage.
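The three stages can be sketched roughly as follows. This is not the paper's implementation: the SVM is a plain linear one trained by sub-gradient descent, and the Bayesian step of stage two is replaced by a simpler heuristic that keeps, per class, the points closest to the stage-one boundary. All names, parameter values, and the toy data are assumptions made for illustration.

```python
import random

def decision(w, b, x):
    """Signed score of x under the linear model (w, b)."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(X, y, lr=0.1, lam=0.01, epochs=200):
    """Full-batch sub-gradient descent on the regularized hinge loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * decision(w, b, xi) < 1:  # point violates the margin
                grad_w = [g - yi * xj / n for g, xj in zip(grad_w, xi)]
                grad_b -= yi / n
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def select_candidates(X, y, w, b, keep=0.3):
    """Per class, keep the fraction of points closest to the boundary."""
    chosen = []
    for label in (-1, 1):
        idx = [i for i, yi in enumerate(y) if yi == label]
        idx.sort(key=lambda i: abs(decision(w, b, X[i])))
        chosen.extend(idx[:max(1, int(keep * len(idx)))])
    return chosen

def reduced_svm(X, y, subset_size=20, keep=0.3, seed=0):
    rng = random.Random(seed)
    # Stage 1: train on a small random subset to get a rough boundary.
    sub = rng.sample(range(len(X)), subset_size)
    w, b = train_linear_svm([X[i] for i in sub], [y[i] for i in sub])
    # Stage 2 (swapped-in heuristic, not the Bayesian classifier):
    # rank every point by its distance to the rough boundary.
    cand = select_candidates(X, y, w, b, keep)
    # Stage 3: retrain using only the candidate points.
    w, b = train_linear_svm([X[i] for i in cand], [y[i] for i in cand])
    return w, b, cand

# Toy 2-D data: two well-separated clusters labelled +1 and -1.
rng = random.Random(1)
X, y = [], []
for label, centre in ((1, 2.0), (-1, -2.0)):
    for _ in range(100):
        X.append([centre + rng.uniform(-1, 1), centre + rng.uniform(-1, 1)])
        y.append(label)

w, b, cand = reduced_svm(X, y)
correct = sum((decision(w, b, xi) > 0) == (yi > 0) for xi, yi in zip(X, y))
accuracy = correct / len(X)
print(len(cand), "of", len(X), "points used in final training; accuracy:", accuracy)
```

The final model is trained on only a fraction of the data, which is where the saving in training time would come from in a setup like this.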

Result

The tests in the paper measure the accuracy and the training time on the data sets used.

The table above compares the error rate, true negatives, and false negatives on the two data sets tested, Acceptor and Donor.
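For readers unfamiliar with these measures, the sketch below shows how they are usually computed from true and predicted labels. The label convention (+1 for a splice site, -1 otherwise), the function name, and the example labels are assumptions for illustration and are not figures from the paper.

```python
def confusion_summary(y_true, y_pred):
    """Count TP, TN, FP, FN and the error rate for labels in {+1, -1}."""
    tp = tn = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == -1 and p == -1:
            tn += 1
        elif t == -1 and p == 1:
            fp += 1
        else:  # t == 1 and p == -1
            fn += 1
    error_rate = (fp + fn) / len(y_true)
    return {"tp": tp, "tn": tn, "fp": fp, "fn": fn, "error_rate": error_rate}

# Made-up labels, only to demonstrate the calculation.
y_true = [1, 1, 1, -1, -1, -1, -1, 1]
y_pred = [1, -1, 1, -1, 1, -1, -1, 1]
summary = confusion_summary(y_true, y_pred)
print(summary)
```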

Conclusion

The paper describes an improvement to SVM for classifying large data sets. The algorithm selects which data are relevant enough to include as training data and which are not. This is intended to reduce the complexity of building the model during training. The results show that the time spent on training is reduced significantly.

ADVANTAGE
  • The proposed method is simple, yet it significantly reduces the time needed to train the model
  • The paper also includes guidelines for conducting the experiments, along with clear and detailed results

DISADVANTAGE
  • The title does not indicate that the method is derived from an existing method, namely SVM
  • The first stage gives neither a reason for nor the specific size of the data set used; it only states that a small data set is used.

SUGGESTION
  • The title should mention that the method is derived from another method (SVM), so that readers get a clear picture of the proposed method.
  • Add a comparison showing whether the accuracy of the proposed method is lower than or equal to that of other methods, to further highlight the improvement in the time needed to build the training model.
