{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Regressão \n",
    "\n",
    "(O conteúdo desta página está no notebook <code>practice_regression.ipynb</code>)\n",
    "\n",
    "<b>Exemplo nesta página</b>: considera-se o ajuste de uma função linear às observações, usando a função custo soma do quadrado das diferenças. Isto é, dado $\\{(\\mathbf{x}_i,y_i) \\,:\\, i =1.\\ldots,N\\}$, busca-se $\\mathbf{w}$ que seja o ponto de mínimo de\n",
    "$$\n",
    "J(\\mathbf{w}) = \\frac{1}{2}\\sum_{i=1}^{N} (\\mathbf{w}^T\\,\\mathbf{x}_i - y_i)^2\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Função de custo\n",
    "\n",
    "- X é um array de $N \\times (n+1)$ : $N$ amostras de dimensão $n+1$, com o valor na primeira dimensão igual a 1, e uma amostra por linha\n",
    "- y é um array de $N \\times 1$\n",
    "- w é um array de $(n+1)\\times 1$. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Calcula o somatório do quadrado das diferenças entre x_i^Tw e y_i \n",
    "def computeCost(X, y, w):\n",
    "    N = y.size\n",
    "    J = 0\n",
    "    for i in range(N):\n",
    "        J += np.square(X[i, :].dot(w) - y[i])\n",
    "    J /= (2 * N)\n",
    "    return (J)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Gradiente descendente\n",
    "\n",
    "- X é um array de $N \\times (n+1)$ - $N$ amostras de dimensão $n+1$, com o valor na primeira dimensão igual a 1, e uma amostra por linha\n",
    "- y é um array de $N \\times 1$\n",
    "- w é um array de $(n+1)\\times 1$\n",
    "- alpha é um real (learning rate)\n",
    "- num_iters é o número de iterações\n",
    "\n",
    "A cada iteração, para todo $j=0,1,\\ldots,n$ calcula-se \n",
    "$$\n",
    "\\delta_j = \\sum_{i=1}^{N} (\\mathbf{w}^T\\mathbf{x}_i - y_i) x_{ij}\n",
    "$$\n",
    "e\n",
    "atualiza-se o peso da seguinte forma:\n",
    "$$\n",
    "w_{ij} = w_{ij} - \\alpha \\frac{\\delta_j}{N}\n",
    "$$ \n",
    "\n",
    "Este é o batch gradient descent pois a cada iteração todas as $N$ amostras são processadas. Ao final de num_iters, devolve w (o vetor de pesos $\\mathbf{w}$) e um histórico dos custos calculados com respeito a X"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def gradientDescent(X, y, w, alpha, num_iters):\n",
    "    N = y.size\n",
    "    J_history = np.zeros(num_iters)\n",
    "    temp = np.zeros(w.size)\n",
    "    numParameters = w.size\n",
    "\n",
    "    for iter in range(num_iters):\n",
    "        for j in range(numParameters):\n",
    "            delta_j = 0\n",
    "            for i in range(N):\n",
    "                delta_j += (X[i, :].dot(w) - y[i]) * X[i, j]\n",
    "            temp[j] = w[j] - alpha * (delta_j / N)\n",
    "        w = temp\n",
    "        J_history[iter] = computeCost(X, y, w)\n",
    "\n",
    "    return (w, J_history)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exemplo de aplicação\n",
    "\n",
    "Vamos usar os mesmos dados usados em <code>practice_basic.ipynb</code>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "%matplotlib inline\n",
    "import matplotlib.pyplot as plt\n",
    "from mpl_toolkits.mplot3d import Axes3D \n",
    "\n",
    "fname = '../data/data1.txt'\n",
    "data = np.loadtxt(fname, delimiter = ',')\n",
    "N = data.shape[0]\n",
    "X = data[:, 0]\n",
    "y = data[:, 1]\n",
    "\n",
    "print('Dimensão do array X:', X.shape)\n",
    "print('Dimensão do array y:', y.shape)\n",
    "\n",
    "# estender x acrescentando um componente contante 1. x --->  (1,x)\n",
    "X = np.vstack(zip(np.ones(N), X))\n",
    "\n",
    "print('Dimensão do array X após extensão:', X.shape)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Os dados de entrada são de dimensão 1. Assim, esse é um exemplo de regressão linear simples. Queremos encontrar $\\mathbf{w} = (w_0,w_1)$ que define uma função $f(x) = w_0+w_1\\,x$.\n",
    "\n",
    "Experimente alterar\n",
    "- o número de iterações num_iters\n",
    "- o learning rate alpha"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# vetor de pesos começa com zero\n",
    "w = np.zeros(2)\n",
    "\n",
    "initialCost = computeCost(X, y, w)\n",
    "print('Initial cost: ', initialCost)\n",
    "\n",
    "\n",
    "# plot the initial fit\n",
    "fig = plt.figure(figsize=(14,7))\n",
    "plt.subplot(121)\n",
    "plt.title('Initial fit')\n",
    "plt.plot(X[:,1], y, 'rx')\n",
    "plt.plot(X[:,1], X.dot(w), '-')\n",
    "plt.xlim(-1, 10)\n",
    "plt.ylim(-1,6)\n",
    "plt.xlabel('x')\n",
    "plt.ylabel('y')\n",
    "\n",
    "# Some gradient descent settings\n",
    "iterations = 1500\n",
    "alpha = 0.01\n",
    "\n",
    "# run gradient descent\n",
    "w, J_history = gradientDescent(X, y, w, alpha, iterations)\n",
    "\n",
    "\n",
    "finalCost = computeCost(X, y, w)\n",
    "print('Final cost: ', finalCost)\n",
    "print('Weight w found by gradient descent: (%f, %f)' % (w[0], w[1]))\n",
    "\n",
    "\n",
    "# Plot the final fit\n",
    "plt.subplot(122)\n",
    "plt.title('Final fit')\n",
    "plt.plot(X[:,1], y, 'rx')\n",
    "plt.plot(X[:,1], X.dot(w), '-')\n",
    "plt.xlim(-1, 10)\n",
    "plt.ylim(-1,6)\n",
    "plt.xlabel('x')\n",
    "plt.ylabel('y')\n",
    "\n",
    "plt.show()\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Algumas predições usando a função estimada"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# Predict values for x = 3.5 and x = 7\n",
    "predict1 = np.dot([1, 3.5], w)\n",
    "print('x = 3.5  predicted output = %f\\n' %(predict1))\n",
    "\n",
    "predict2 = np.dot([1, 7], w)\n",
    "print('x = 7.0  predicted output = %f\\n' % (predict2))\n",
    "\n",
    "# Plot the linear fit and predictions\n",
    "plt.figure(figsize=(7,7))\n",
    "plt.plot(X[:,1], y, 'rx')\n",
    "plt.plot([3.5, 7], [predict1, predict2], 'bo')\n",
    "plt.plot(X[:,1], X.dot(w), '-')\n",
    "plt.xlim(-1, 10)\n",
    "plt.ylim(-1,6)\n",
    "plt.xlabel('x')\n",
    "plt.ylabel('y')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Evolução do custo durante o treinamento"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# plot da evolução do custo no treinamento já realizado\n",
    "#fig = plt.gcf()\n",
    "#fig.set_size_inches(18.5, 10.5)\n",
    "#plt.subplot(121)\n",
    "plt.plot(range(1, J_history.size + 1), J_history, '-')\n",
    "plt.xlabel('Iteration')\n",
    "plt.ylabel('Cost')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# um segundo treinamento, com valor de alpha distinto\n",
    "w2 = np.zeros(2)\n",
    "alpha2 = 0.05\n",
    "w2, J_history2 = gradientDescent(X, y, w2, alpha2, iterations)\n",
    "\n",
    "plt.plot(range(1, J_history.size + 1), J_history, 'b-', label=\"alpha=%d\"%alpha)\n",
    "plt.plot(range(1, J_history2.size + 1), J_history2, 'g-', label=\"alpha=%d\"%alpha2)\n",
    "plt.xlabel('Iteration')\n",
    "plt.ylabel('Cost')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Gráfico da função custo no espaço $\\mathbb{R}^2$.\n",
    "The cost function is computed on a grid of points."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# Grid over which we will calculate J\n",
    "w0_vals = np.linspace(-10, 10, 50);\n",
    "w1_vals = np.linspace(-1, 4, 50);\n",
    "\n",
    "w0_coord, w1_coord = np.meshgrid(w0_vals, w1_vals)\n",
    "\n",
    "# initialize J_vals to a matrix of 0's\n",
    "J_vals = np.zeros((w0_vals.size, w1_vals.size))\n",
    "\n",
    "# Fill out J_vals\n",
    "for i in range(w0_vals.size):\n",
    "    for j in range(w1_vals.size):\n",
    "        t = [w0_vals[i], w1_vals[j]]   \n",
    "        J_vals[i,j] = computeCost(X, y, t)\n",
    "\n",
    "# Surface plot\n",
    "fig = plt.figure(figsize=(10,10))\n",
    "ax = fig.add_subplot(111, projection = '3d')\n",
    "ax.plot_surface(w0_coord, w1_coord, J_vals.T, rstride=1, cstride=1, alpha=0.6, cmap=plt.cm.jet)\n",
    "ax.set_xlabel(r'$w_0$')\n",
    "ax.set_ylabel(r'$w_1$')\n",
    "ax.set_zlabel('Cost')\n",
    "ax.view_init(elev=15, azim=230)\n",
    "plt.show()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}