{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Regressão \n", "\n", "(O conteúdo desta página está no notebook practice_regression.ipynb)\n", "\n", "Exemplo nesta página: considera-se o ajuste de uma função linear às observações, usando a função custo soma do quadrado das diferenças. Isto é, dado $\\{(\\mathbf{x}_i,y_i) \\,:\\, i =1.\\ldots,N\\}$, busca-se $\\mathbf{w}$ que seja o ponto de mínimo de\n", "$$\n", "J(\\mathbf{w}) = \\frac{1}{2}\\sum_{i=1}^{N} (\\mathbf{w}^T\\,\\mathbf{x}_i - y_i)^2\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Função de custo\n", "\n", "- X é um array de $N \\times (n+1)$ : $N$ amostras de dimensão $n+1$, com o valor na primeira dimensão igual a 1, e uma amostra por linha\n", "- y é um array de $N \\times 1$\n", "- w é um array de $(n+1)\\times 1$. " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Calcula o somatório do quadrado das diferenças entre x_i^Tw e y_i \n", "def computeCost(X, y, w):\n", " N = y.size\n", " J = 0\n", " for i in range(N):\n", " J += np.square(X[i, :].dot(w) - y[i])\n", " J /= (2 * N)\n", " return (J)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Gradiente descendente\n", "\n", "- X é um array de $N \\times (n+1)$ - $N$ amostras de dimensão $n+1$, com o valor na primeira dimensão igual a 1, e uma amostra por linha\n", "- y é um array de $N \\times 1$\n", "- w é um array de $(n+1)\\times 1$\n", "- alpha é um real (learning rate)\n", "- num_iters é o número de iterações\n", "\n", "A cada iteração, para todo $j=0,1,\\ldots,n$ calcula-se \n", "$$\n", "\\delta_j = \\sum_{i=1}^{N} (\\mathbf{w}^T\\mathbf{x}_i - y_i) x_{ij}\n", "$$\n", "e\n", "atualiza-se o peso da seguinte forma:\n", "$$\n", "w_{ij} = w_{ij} - \\alpha \\frac{\\delta_j}{N}\n", "$$ \n", "\n", "Este é o batch gradient descent pois a cada iteração todas as $N$ amostras são processadas. Ao final de num_iters, devolve w (o vetor de pesos $\\mathbf{w}$) e um histórico dos custos calculados com respeito a X" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def gradientDescent(X, y, w, alpha, num_iters):\n", " N = y.size\n", " J_history = np.zeros(num_iters)\n", " temp = np.zeros(w.size)\n", " numParameters = w.size\n", "\n", " for iter in range(num_iters):\n", " for j in range(numParameters):\n", " delta_j = 0\n", " for i in range(N):\n", " delta_j += (X[i, :].dot(w) - y[i]) * X[i, j]\n", " temp[j] = w[j] - alpha * (delta_j / N)\n", " w = temp\n", " J_history[iter] = computeCost(X, y, w)\n", "\n", " return (w, J_history)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exemplo de aplicação\n", "\n", "Vamos usar os mesmos dados usados em practice_basic.ipynb" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [], "source": [ "import numpy as np\n", "%matplotlib inline\n", "import matplotlib.pyplot as plt\n", "from mpl_toolkits.mplot3d import Axes3D \n", "\n", "fname = '../data/data1.txt'\n", "data = np.loadtxt(fname, delimiter = ',')\n", "N = data.shape[0]\n", "X = data[:, 0]\n", "y = data[:, 1]\n", "\n", "print('Dimensão do array X:', X.shape)\n", "print('Dimensão do array y:', y.shape)\n", "\n", "# estender x acrescentando um componente contante 1. x ---> (1,x)\n", "X = np.vstack(zip(np.ones(N), X))\n", "\n", "print('Dimensão do array X após extensão:', X.shape)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Os dados de entrada são de dimensão 1. Assim, esse é um exemplo de regressão linear simples. Queremos encontrar $\\mathbf{w} = (w_0,w_1)$ que define uma função $f(x) = w_0+w_1\\,x$.\n", "\n", "Experimente alterar\n", "- o número de iterações num_iters\n", "- o learning rate alpha" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# vetor de pesos começa com zero\n", "w = np.zeros(2)\n", "\n", "initialCost = computeCost(X, y, w)\n", "print('Initial cost: ', initialCost)\n", "\n", "\n", "# plot the initial fit\n", "fig = plt.figure(figsize=(14,7))\n", "plt.subplot(121)\n", "plt.title('Initial fit')\n", "plt.plot(X[:,1], y, 'rx')\n", "plt.plot(X[:,1], X.dot(w), '-')\n", "plt.xlim(-1, 10)\n", "plt.ylim(-1,6)\n", "plt.xlabel('x')\n", "plt.ylabel('y')\n", "\n", "# Some gradient descent settings\n", "iterations = 1500\n", "alpha = 0.01\n", "\n", "# run gradient descent\n", "w, J_history = gradientDescent(X, y, w, alpha, iterations)\n", "\n", "\n", "finalCost = computeCost(X, y, w)\n", "print('Final cost: ', finalCost)\n", "print('Weight w found by gradient descent: (%f, %f)' % (w[0], w[1]))\n", "\n", "\n", "# Plot the final fit\n", "plt.subplot(122)\n", "plt.title('Final fit')\n", "plt.plot(X[:,1], y, 'rx')\n", "plt.plot(X[:,1], X.dot(w), '-')\n", "plt.xlim(-1, 10)\n", "plt.ylim(-1,6)\n", "plt.xlabel('x')\n", "plt.ylabel('y')\n", "\n", "plt.show()\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Algumas predições usando a função estimada" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Predict values for x = 3.5 and x = 7\n", "predict1 = np.dot([1, 3.5], w)\n", "print('x = 3.5 predicted output = %f\\n' %(predict1))\n", "\n", "predict2 = np.dot([1, 7], w)\n", "print('x = 7.0 predicted output = %f\\n' % (predict2))\n", "\n", "# Plot the linear fit and predictions\n", "plt.figure(figsize=(7,7))\n", "plt.plot(X[:,1], y, 'rx')\n", "plt.plot([3.5, 7], [predict1, predict2], 'bo')\n", "plt.plot(X[:,1], X.dot(w), '-')\n", "plt.xlim(-1, 10)\n", "plt.ylim(-1,6)\n", "plt.xlabel('x')\n", "plt.ylabel('y')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Evolução do custo durante o treinamento" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# plot da evolução do custo no treinamento já realizado\n", "#fig = plt.gcf()\n", "#fig.set_size_inches(18.5, 10.5)\n", "#plt.subplot(121)\n", "plt.plot(range(1, J_history.size + 1), J_history, '-')\n", "plt.xlabel('Iteration')\n", "plt.ylabel('Cost')\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# um segundo treinamento, com valor de alpha distinto\n", "w2 = np.zeros(2)\n", "alpha2 = 0.05\n", "w2, J_history2 = gradientDescent(X, y, w2, alpha2, iterations)\n", "\n", "plt.plot(range(1, J_history.size + 1), J_history, 'b-', label=\"alpha=%d\"%alpha)\n", "plt.plot(range(1, J_history2.size + 1), J_history2, 'g-', label=\"alpha=%d\"%alpha2)\n", "plt.xlabel('Iteration')\n", "plt.ylabel('Cost')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Gráfico da função custo no espaço $\\mathbb{R}^2$.\n", "The cost function is computed on a grid of points." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Grid over which we will calculate J\n", "w0_vals = np.linspace(-10, 10, 50);\n", "w1_vals = np.linspace(-1, 4, 50);\n", "\n", "w0_coord, w1_coord = np.meshgrid(w0_vals, w1_vals)\n", "\n", "# initialize J_vals to a matrix of 0's\n", "J_vals = np.zeros((w0_vals.size, w1_vals.size))\n", "\n", "# Fill out J_vals\n", "for i in range(w0_vals.size):\n", " for j in range(w1_vals.size):\n", " t = [w0_vals[i], w1_vals[j]] \n", " J_vals[i,j] = computeCost(X, y, t)\n", "\n", "# Surface plot\n", "fig = plt.figure(figsize=(10,10))\n", "ax = fig.add_subplot(111, projection = '3d')\n", "ax.plot_surface(w0_coord, w1_coord, J_vals.T, rstride=1, cstride=1, alpha=0.6, cmap=plt.cm.jet)\n", "ax.set_xlabel(r'$w_0$')\n", "ax.set_ylabel(r'$w_1$')\n", "ax.set_zlabel('Cost')\n", "ax.view_init(elev=15, azim=230)\n", "plt.show()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.2" } }, "nbformat": 4, "nbformat_minor": 0 }