{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Classification -- Linearly separable classes\n",
    "\n",
    "(this page corresponds to the notebook <code>practice_classification1.ipynb</code>)\n",
    "\n",
    "Classification of 2D points: <font color=\"blue\">positive</font> (<font color=\"blue\">1</font>) and <font color=\"red\">negative</font> (<font color=\"red\">0</font>) classes\n",
    "\n",
    "Color coding in the plots:\n",
    "\n",
    "<font color=\"blue\">$\\Huge \\cdot$</font> Positive, classified as positive<br>\n",
    "<font color=\"red\">$\\Huge \\cdot$</font> Negative, classified as negative<br> \n",
    "<font color=\"blue\">$\\mathtt{x}$</font> Positive, classified as negative<br>\n",
    "<font color=\"red\">$\\mathtt{x}$</font> Negative, classified as positive<br> "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The decision boundary obtained by applying linear or logistic regression to a classification problem is always linear in the input features (a line, a plane, a hyperplane). \"Winding\" boundaries are not possible.\n",
    "\n",
    "Here we examine the use of linear and logistic regression to classify 2D data whose decision boundary is known to be linear."
   ]
  },
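  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a short worked derivation of why the boundary is a straight line in 2D (a sketch, assuming each point is written as $\\mathbf{x} = (1, x_1, x_2)$ with a leading 1 for the bias term, weights $\\mathbf{w} = (w_0, w_1, w_2)$, and $w_2 \\neq 0$):\n",
    "\n",
    "$$\\mathbf{w}^T\\mathbf{x} = w_0 + w_1 x_1 + w_2 x_2 = 0 \\quad\\Longleftrightarrow\\quad x_2 = \\frac{-w_0 - w_1 x_1}{w_2}$$\n",
    "\n",
    "Points with $\\mathbf{w}^T\\mathbf{x} > 0$ fall on one side of this line and points with $\\mathbf{w}^T\\mathbf{x} \\le 0$ on the other. For instance, $\\mathbf{w} = (-1, 0.7, 2.1)$ gives the line $x_2 = (1 - 0.7\\,x_1)/2.1$."
   ]
  },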
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Create a dataset of linearly separable 2D points"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "%matplotlib inline\n",
    "import numpy as np\n",
    "\n",
    "# draw N random points\n",
    "\n",
    "N = 100\n",
    "x1 = np.random.exponential(size=N)\n",
    "x2 = np.random.standard_normal(N)\n",
    "# one row per point: [1, x1, x2]; the leading 1 multiplies the bias weight w[0]\n",
    "X = np.column_stack((np.ones(N), x1, x2))\n",
    "\n",
    "print(\"First x : \", X[0,:])\n",
    "print(\"Second x: \", X[1,:])\n",
    "\n",
    "fig = plt.figure(figsize=(14,7))\n",
    "plt.subplot(121)\n",
    "plt.plot(X[:,1],X[:,2],'o')\n",
    "\n",
    "# an arbitrary weight vector, which defines the decision boundary\n",
    "w = np.array((-1, 0.7, 2.1))\n",
    "\n",
    "# based on the boundary, label each point as positive or negative\n",
    "# and plot it in blue (positive) or red (negative)\n",
    "y = []\n",
    "plt.subplot(122)\n",
    "for i in range(N):\n",
    "    if  X[i,:].dot(w) > 0:\n",
    "        plt.plot(X[i,1],X[i,2],'bo')  # blue circles\n",
    "        y.append(1)\n",
    "    else:\n",
    "        plt.plot(X[i,1],X[i,2],'ro')  # red circles\n",
    "        y.append(0)\n",
    "        \n",
    "y = np.array(y)\n",
    "\n",
    "# plot the linear boundary\n",
    "x = np.arange(0, max(X[:,1]), 0.01)\n",
    "fx = [(-w[0]-w[1]*p)/w[2] for p in x ]\n",
    "plt.plot(x, fx, lw=2)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Test linear regression"
   ]
  },
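  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The least-squares cost has a closed-form minimizer, which appears to be what the commented-out matrix solution in the next cell computes. The following is only a sketch: it assumes `computeCost` is the usual mean squared error (the contents of `funcoes.py` are not shown here) and that $X^T X$ is invertible.\n",
    "\n",
    "$$J(\\mathbf{w}) = \\frac{1}{2N}\\sum_{i=1}^{N}\\left(\\mathbf{w}^T\\mathbf{x}^{(i)} - y^{(i)}\\right)^2, \\qquad \\nabla J(\\mathbf{w}) = \\frac{1}{N}X^T(X\\mathbf{w} - \\mathbf{y}) = 0 \\;\\Longrightarrow\\; \\mathbf{w}^* = (X^T X)^{-1}X^T\\mathbf{y}$$\n",
    "\n",
    "Under these assumptions, gradient descent with a sufficiently small step size approaches the same $\\mathbf{w}^*$, so the two approaches can be used to check each other."
   ]
  },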
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# We assume the file funcoes.py has already been created\n",
    "from funcoes import gradientDescent, computeCost\n",
    "\n",
    "\n",
    "# guess some initial weights and compute the initial cost\n",
    "w = np.zeros(3)\n",
    "\n",
    "initialCost = computeCost(X, y, w)\n",
    "print('Initial cost: ', initialCost)\n",
    "\n",
    "# labels are 0/1, so a regression output above 0.5 is read as positive\n",
    "threshold = 0.5\n",
    "R = X.dot(w)\n",
    "\n",
    "# plot the initial classification (w = 0 does not define a boundary line yet)\n",
    "fig = plt.figure(figsize=(14,7))\n",
    "plt.subplot(121)\n",
    "plt.title('Initial fit')\n",
    "for i in range(N):\n",
    "    if  y[i]>0 :\n",
    "        if R[i]>threshold:\n",
    "            plt.plot(X[i,1],X[i,2],'bo')  # positives, correct\n",
    "        else:\n",
    "            plt.plot(X[i,1],X[i,2],'bx')  # positives, wrong\n",
    "    else:\n",
    "        if R[i]>threshold:\n",
    "            plt.plot(X[i,1],X[i,2],'rx')  # negatives, wrong\n",
    "        else:\n",
    "            plt.plot(X[i,1],X[i,2],'ro')  # negatives, correct\n",
    "\n",
    "plt.xlabel('x1')\n",
    "plt.ylabel('x2')\n",
    "\n",
    "\n",
    "# Some gradient descent settings\n",
    "iterations = 500\n",
    "alpha = 0.01\n",
    "\n",
    "# run gradient descent\n",
    "w, J_history = gradientDescent(X, y, w, alpha, iterations)\n",
    "\n",
    "finalCost = computeCost(X, y, w)\n",
    "print('Final cost: ', finalCost)\n",
    "print('w = ', w)\n",
    "\n",
    "# closed-form (matrix) solution\n",
    "#XT = np.transpose(X)\n",
    "#TMP = np.linalg.inv(XT.dot(X))\n",
    "#w = TMP.dot(XT.dot(y))\n",
    "\n",
    "R = X.dot(w)\n",
    "\n",
    "# plot the final boundary\n",
    "plt.subplot(122)\n",
    "plt.title('Final fit')\n",
    "\n",
    "for i in range(N):\n",
    "    if  y[i]>0 :\n",
    "        if R[i]>threshold:\n",
    "            plt.plot(X[i,1],X[i,2],'bo')  # positives, correct\n",
    "        else:\n",
    "            plt.plot(X[i,1],X[i,2],'bx')  # positives, wrong\n",
    "    else:\n",
    "        if R[i]>threshold:\n",
    "            plt.plot(X[i,1],X[i,2],'rx')  # negatives, wrong\n",
    "        else:\n",
    "            plt.plot(X[i,1],X[i,2],'ro')  # negatives, correct\n",
    "        \n",
    "# boundary: points where w[0] + w[1]*x1 + w[2]*x2 equals the threshold\n",
    "x = np.arange(0, max(X[:,1]), 0.01)\n",
    "fx = [(threshold-w[0]-w[1]*p)/w[2] for p in x ]\n",
    "plt.plot(x, fx, lw=2)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Test logistic regression"
   ]
  },
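  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A brief derivation of why the sign of $\\mathbf{w}^T\\mathbf{x}$ is enough to pick the class here (a sketch, assuming `sigmoid` in `funcoes.py` is the standard logistic function and that a point is called positive when its predicted probability exceeds 0.5):\n",
    "\n",
    "$$\\sigma(z) = \\frac{1}{1 + e^{-z}}, \\qquad \\sigma(z) > \\tfrac{1}{2} \\;\\Longleftrightarrow\\; e^{-z} < 1 \\;\\Longleftrightarrow\\; z > 0$$\n",
    "\n",
    "With $z = \\mathbf{w}^T\\mathbf{x}$, testing $\\sigma(\\mathbf{w}^T\\mathbf{x}) > 0.5$ is the same as testing $\\mathbf{w}^T\\mathbf{x} > 0$, so the decision boundary is again the line $\\mathbf{w}^T\\mathbf{x} = 0$."
   ]
  },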
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "from funcoes import sigmoid, gradientDescent2, computeCost2\n",
    "\n",
    "# guess some initial weights and compute the initial cost\n",
    "w = np.zeros(3)\n",
    "initialCost = computeCost2(X, y, w)\n",
    "print('Initial cost: ', initialCost)\n",
    "\n",
    "R = X.dot(w)\n",
    "\n",
    "# plot the initial classification (w = 0 does not define a boundary line yet)\n",
    "fig = plt.figure(figsize=(14,7))\n",
    "plt.subplot(121)\n",
    "plt.title('Initial fit')\n",
    "for i in range(N):\n",
    "    if  y[i]>0 :\n",
    "        if R[i]>0:\n",
    "            plt.plot(X[i,1],X[i,2],'bo')  # positives, correct\n",
    "        else:\n",
    "            plt.plot(X[i,1],X[i,2],'bx')  # positives, wrong\n",
    "    else:\n",
    "        if R[i]>0:\n",
    "            plt.plot(X[i,1],X[i,2],'rx')  # negatives, wrong\n",
    "        else:\n",
    "            plt.plot(X[i,1],X[i,2],'ro')  # negatives, correct\n",
    "\n",
    "plt.xlabel('x1')\n",
    "plt.ylabel('x2')\n",
    "\n",
    "# Some gradient descent settings\n",
    "iterations = 3000\n",
    "alpha = 0.01\n",
    "\n",
    "# run gradient descent\n",
    "w, J_history = gradientDescent2(X, y, w, alpha, iterations)\n",
    "\n",
    "finalCost = computeCost2(X, y, w)\n",
    "print('Final cost: ', finalCost)\n",
    "print(\"w = \", w)\n",
    "\n",
    "# R > 0 corresponds to a predicted probability above 0.5\n",
    "R = X.dot(w)\n",
    "\n",
    "# plot the final boundary\n",
    "plt.subplot(122)\n",
    "plt.title(\"Final fit\")\n",
    "for i in range(N):\n",
    "    if  y[i]>0 :\n",
    "        if R[i]>0:\n",
    "            plt.plot(X[i,1],X[i,2],'bo')  # positives, correct\n",
    "        else:\n",
    "            plt.plot(X[i,1],X[i,2],'bx')  # positives, wrong\n",
    "    else:\n",
    "        if R[i]>0:\n",
    "            plt.plot(X[i,1],X[i,2],'rx')  # negatives, wrong\n",
    "        else:\n",
    "            plt.plot(X[i,1],X[i,2],'ro')  # negatives, correct\n",
    "        \n",
    "x = np.arange(0, max(X[:,1]), 0.01)\n",
    "fx = [(-w[0]-w[1]*p)/w[2] for p in x ]\n",
    "plt.plot(x, fx, lw=2)\n",
    "plt.show()\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}