Module 9: Gene Expression Analysis

Due: Friday, 15 December 2017, 6:00

Script

Exercise 1:

Cluster the yeast gene expression data. The data are available on the textbook website in MATLAB format. The file yeast2.txt is a text file that can be read with Python.

Group the data into 10, 16, and 20 partitions. Report the number of samples in each partition (cluster). For each partition, make a graph similar to Figure 9.10 (p. 157) of the textbook.

Observations: 

● The table containing the yeast data has missing values, represented by NA. Delete the rows that contain them.
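
As a starting point, the steps of Exercise 1 (drop the rows with NA, partition into 10, 16, and 20 clusters, count the samples per cluster) could be sketched as below. This is only a minimal sketch under several assumptions: the assignment does not name a clustering algorithm, so a plain Lloyd's k-means written with NumPy is used here; the file layout (whitespace-delimited, gene identifier in the first column, no header line) is a guess; and the helper names `load_complete_rows`, `nearest`, and `kmeans` are hypothetical. Synthetic data stands in for yeast2.txt so the block runs on its own.

```python
import io
import numpy as np

def load_complete_rows(handle):
    """Read a whitespace-delimited table, skipping every row that contains 'NA'.

    Assumed layout (not confirmed by the assignment): gene identifier in the
    first column, numeric expression values in the rest, no header line.
    """
    names, rows = [], []
    for line in handle:
        fields = line.split()
        if not fields or "NA" in fields:
            continue  # blank line, or a row with at least one missing value
        names.append(fields[0])
        rows.append([float(v) for v in fields[1:]])
    return names, np.array(rows)

def nearest(X, centers):
    """Index of the closest center (squared Euclidean distance) for each row."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means; returns one cluster index per row of X."""
    rng = np.random.default_rng(seed)
    # Initialize the centers with k distinct rows chosen at random.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = nearest(X, centers)
        # Move each center to the mean of its members; keep it if empty.
        updated = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(updated, centers):
            break
        centers = updated
    return nearest(X, centers)

# Synthetic stand-in for yeast2.txt: 60 genes, 7 values each, one row with NA.
rng = np.random.default_rng(1)
lines = [
    f"gene{i} " + " ".join(f"{v:.3f}" for v in rng.normal(size=7))
    for i in range(60)
]
lines[3] = "gene3 0.1 NA 0.2 0.3 NA 0.5 0.6"
names, X = load_complete_rows(io.StringIO("\n".join(lines)))

cluster_sizes = {}
for k in (10, 16, 20):
    labels = kmeans(X, k)
    cluster_sizes[k] = np.bincount(labels, minlength=k)
    print(f"k={k}: sizes={cluster_sizes[k].tolist()}")
```

With the real data, the synthetic table would be replaced by an open file handle, for example `with open("yeast2.txt") as fh: names, X = load_complete_rows(fh)`. If the file turns out to have a header line or a different delimiter, the parsing function needs adjusting accordingly.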


Exercise 2:

Golub et al. (1999) created a database of different types of leukemia from oligonucleotide microarrays. The file leukemia.txt contains this database. Cluster this dataset using 2 and 5 partitions and analyze the results.

Observations:

● The last column of the file leukemia.txt holds the class label of each example in the database. This column must be removed before clustering;

● Compare the class label of each sample with the partition to which that sample was assigned. Was there any difference between the results obtained?
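
The two observations above (strip the label column, then compare labels with partitions) could be sketched as follows. This is an illustration under assumptions, not a prescribed solution: the file layout (whitespace-delimited, one sample per row, no header line) and the label strings `ALL` and `AML` are guesses about leukemia.txt; the function names are hypothetical; the cluster ids are placeholder values standing in for the output of whatever clustering routine is used; and "purity" is just one possible summary of the agreement, not one the assignment asks for.

```python
import io
from collections import Counter

def split_features_and_labels(handle):
    """Separate each row into numeric features and the label in the last column.

    Assumed layout (not confirmed by the assignment): whitespace-delimited,
    one sample per row, no header line.
    """
    features, classes = [], []
    for line in handle:
        fields = line.split()
        if not fields:
            continue
        features.append([float(v) for v in fields[:-1]])
        classes.append(fields[-1])  # kept aside, never passed to the clustering
    return features, classes

def contingency(classes, clusters):
    """Count how many samples of each class fall in each cluster."""
    return Counter(zip(classes, clusters))

def purity(classes, clusters):
    """Fraction of samples that belong to the majority class of their cluster."""
    best = {}
    for (_, cluster), n in contingency(classes, clusters).items():
        best[cluster] = max(best.get(cluster, 0), n)
    return sum(best.values()) / len(classes)

# Toy stand-in for leukemia.txt: 6 samples, 3 features, class label last.
toy = io.StringIO(
    "1.0 2.0 0.5 ALL\n"
    "1.1 2.1 0.4 ALL\n"
    "0.9 1.9 0.6 ALL\n"
    "5.0 0.2 3.1 AML\n"
    "5.2 0.1 3.0 AML\n"
    "1.2 2.2 0.5 AML\n"
)
features, classes = split_features_and_labels(toy)

# Hypothetical cluster ids, as a 2-partition clustering of `features` might give.
clusters = [0, 0, 0, 1, 1, 0]

table = contingency(classes, clusters)
for (cls, cluster), n in sorted(table.items()):
    print(f"class={cls} cluster={cluster} n={n}")
print("purity:", purity(classes, clusters))
```

In this toy case one AML sample lands in the cluster dominated by ALL samples, which is the kind of disagreement the comparison is meant to reveal. The same table can be built for the 5-partition result to see how the classes spread over more clusters.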