Module 4: Markov Models
Objective: Understand the HMM (Hidden Markov Model)
Script:
Exercise 1. Lay out an HMM (Hidden Markov Model) for the problem described below.
Fair Bet Casino Problem
The "Fair Bet Casino" is a game in which a person (Casino employee) tosses a coin and the other players bet if the result will be heads or tails. The casino employee uses a couple times a coin "fair", where there are the same likely to either side be drawn, and sometimes an 'biased' coin, where the probability of head occur is 3/4. For safety reasons, the employee is not frequently changing the coins. This exchange rarely happens with probability 0.1. Given a sequence of results releases of coins tossed, the problem is to find when the employee used the fair coin and when he used the biased coin.
After laying out the HMM, give the probability of 01011101001 sequence to be obtained, since the hidden states (used coins) are FFFBBBBBFFF.
Obs.: In the given sequence the symbol 0 represents the occurrence of "tail" and is the first occurrence of "head". In the hidden sequence, F is the use of a "fair " coin and B of a coin " biased".
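For reference, a minimal R sketch of this computation is given below. It assumes 0 = tails and 1 = heads, that the fair coin emits each symbol with probability 1/2, and that the biased coin emits heads with probability 3/4; it computes P(x | path), the product of the emission probabilities along the given hidden path (multiply further by the initial-state and transition probabilities if the joint probability P(x, path) is wanted).

x      <- strsplit("01011101001", "")[[1]]   # observed tosses (0 = tails, 1 = heads)
states <- strsplit("FFFBBBBBFFF", "")[[1]]   # given hidden states
# Emission probabilities: rows = state (F, B), columns = symbol (0, 1)
emis <- matrix(c(1/2, 1/2,
                 1/4, 3/4),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("F", "B"), c("0", "1")))
# P(x | path): product of the emission probabilities along the path
prod(emis[cbind(states, x)])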
Exercise 2. Case study of odorant receptors (ORs).
Download the file 347OR.fasta, available on the textbook's reference site at the following address: http://www.computationalgenomics.net/case_studies/olfactoryreceptors_demo.html. From this page, download the file available through the "Download Data" link.
This file contains several ORs. For this exercise, even-numbered groups should use OR 21 and odd-numbered groups OR 23. Build a smoothed graph of the hydrophobicity profile of the OR assigned to your group. To do this, use a sliding window of 20 amino acids and plot the average hydrophobicity level of each window. This procedure, as well as the hydrophobicity level of each amino acid, is described in Section 4.5.2 of the textbook.
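A possible starting point for this step is sketched below in R. The Kyte-Doolittle hydrophobicity scale is used here only as a stand-in; the actual values to use are those given in Section 4.5.2 of the textbook. The file name 347OR.fasta, the window size of 20, and the OR numbers come from the exercise.

library(seqinr)
OR <- read.fasta("347OR.fasta", seqtype = "AA")   # seqtype = "AA": ORs are protein sequences
n <- 21                                           # 21 for even groups, 23 for odd groups
minhaOR <- OR[[n]][1:length(OR[[n]])]
# Kyte-Doolittle hydrophobicity values (stand-in; replace with the textbook's scale)
kd <- c(A =  1.8, R = -4.5, N = -3.5, D = -3.5, C =  2.5,
        Q = -3.5, E = -3.5, G = -0.4, H = -3.2, I =  4.5,
        L =  3.8, K = -3.9, M =  1.9, F =  2.8, P = -1.6,
        S = -0.8, T = -0.7, W = -0.9, Y = -1.3, V =  4.2)
h <- kd[toupper(minhaOR)]                         # per-residue hydrophobicity
# Average over a sliding window of 20 amino acids
w <- 20
smoothed <- sapply(1:(length(h) - w + 1),
                   function(i) mean(h[i:(i + w - 1)], na.rm = TRUE))
plot(smoothed, type = "l", xlab = "Window start position",
     ylab = "Mean hydrophobicity", main = "Smoothed hydrophobicity profile")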
Use the first 20 ORs to generate an HMM with two states. These states represent being outside the membrane (OUT) and inside the membrane (IN). Then use this HMM and the OR assigned to your group to obtain the most likely hidden state sequence, using the Viterbi algorithm. Plot the hidden states found by Viterbi, similar to Figure 4.6 of the textbook. If the algorithm finds more than one hidden sequence, plot up to three of them. (A sketch of this step is given after the observations below.)
Observations:
Use the function read.fasta() (from the seqinr package) to read the file 347OR.fasta. To extract the corresponding OR, use:
minhaOR <- OR[[n]][1:length(OR[[n]])]
where n is the number of the OR assigned to your group.
The hmm() function of the hmm.discnp package uses the EM (Expectation-Maximization) algorithm to estimate the parameters needed by the viterbi() function, which is also available in that package.
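Putting these observations together, one hedged sketch of the Exercise 2 workflow with hmm.discnp is shown below. It assumes the amino acid letters themselves are used as the discrete observation symbols (the encoding suggested in Section 4.5.2 of the textbook may differ) and that the two fitted states are interpreted afterwards as IN and OUT by inspecting their emission probabilities; adapt as needed.

library(seqinr)
library(hmm.discnp)
OR <- read.fasta("347OR.fasta", seqtype = "AA")
# Training data: the first 20 ORs, each as a vector of amino acid symbols
train_seqs <- lapply(OR[1:20], function(s) toupper(as.character(s)))
# Fit a two-state HMM with the EM algorithm; the states will stand for IN and OUT
fit <- hmm(train_seqs, K = 2)
# The OR assigned to the group (n = 21 or 23)
n <- 21
minhaOR <- toupper(as.character(OR[[n]][1:length(OR[[n]])]))
# Most likely hidden state sequence via the Viterbi algorithm
path <- viterbi(list(minhaOR), fit)[[1]]
# If ties yield more than one path for this sequence, keep only the first for the plot
if (is.matrix(path)) path <- path[, 1]
plot(as.numeric(path), type = "s", xlab = "Residue position",
     ylab = "Hidden state (1/2)", main = "Viterbi path (compare with Figure 4.6)")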
Good Work!