{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Regression\n",
"\n",
"(The content of this page is in the notebook practice_regression.ipynb)\n",
"\n",
"Example on this page: we fit a linear function to the observations, using the sum of squared differences as the cost function. That is, given $\\{(\\mathbf{x}_i,y_i) \\,:\\, i = 1,\\ldots,N\\}$, we seek the $\\mathbf{w}$ that minimizes\n",
"$$\n",
"J(\\mathbf{w}) = \\frac{1}{2}\\sum_{i=1}^{N} (\\mathbf{w}^T\\,\\mathbf{x}_i - y_i)^2\n",
"$$"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Cost function\n",
"\n",
"- X is an $N \\times (n+1)$ array: $N$ samples of dimension $n+1$, one sample per row, with the first component of each sample equal to 1\n",
"- y is an $N \\times 1$ array\n",
"- w is an $(n+1)\\times 1$ array"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Computes the sum of squared differences between x_i^T w and y_i,\n",
"# scaled by 1/(2N); the scaling does not change the minimizer\n",
"def computeCost(X, y, w):\n",
"    N = y.size\n",
"    J = 0\n",
"    for i in range(N):\n",
"        J += np.square(X[i, :].dot(w) - y[i])\n",
"    J /= (2 * N)\n",
"    return J"
]
},
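{
"cell_type": "markdown",
"metadata": {},
"source": [
"The loop in computeCost can also be written without an explicit Python loop. The sketch below is one possible vectorized form, assuming X, y and w are NumPy arrays with the shapes described above and that y and w are stored as 1-D arrays. The name compute_cost_vectorized and the toy data are illustrative assumptions, not part of the original notebook.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"def compute_cost_vectorized(X, y, w):\n",
"    # Residuals X w - y for all N samples at once\n",
"    residuals = X.dot(w) - y\n",
"    # Sum of squared residuals, scaled by 1 / (2N) as in computeCost\n",
"    return residuals.dot(residuals) / (2 * y.size)\n",
"\n",
"# Toy data (hypothetical): three points lying exactly on y = 1 + 2x\n",
"X_toy = np.column_stack((np.ones(3), np.array([0.0, 1.0, 2.0])))\n",
"y_toy = np.array([1.0, 3.0, 5.0])\n",
"\n",
"cost_exact = compute_cost_vectorized(X_toy, y_toy, np.array([1.0, 2.0]))\n",
"cost_zero = compute_cost_vectorized(X_toy, y_toy, np.zeros(2))\n",
"print(cost_exact, cost_zero)\n",
"```\n",
"\n",
"With the exact weights the cost is zero; with zero weights it is (1 + 9 + 25) / 6."
]
},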
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Gradient descent\n",
"\n",
"- X is an $N \\times (n+1)$ array: $N$ samples of dimension $n+1$, one sample per row, with the first component of each sample equal to 1\n",
"- y is an $N \\times 1$ array\n",
"- w is an $(n+1)\\times 1$ array\n",
"- alpha is a real number (the learning rate)\n",
"- num_iters is the number of iterations\n",
"\n",
"At each iteration, for every $j=0,1,\\ldots,n$ we compute\n",
"$$\n",
"\\delta_j = \\sum_{i=1}^{N} (\\mathbf{w}^T\\mathbf{x}_i - y_i) x_{ij}\n",
"$$\n",
"and, once all $\\delta_j$ have been computed from the same $\\mathbf{w}$, update the weights as follows:\n",
"$$\n",
"w_j = w_j - \\alpha \\frac{\\delta_j}{N}\n",
"$$\n",
"\n",
"This is batch gradient descent, because all $N$ samples are processed at each iteration. After num_iters iterations, the function returns w (the weight vector $\\mathbf{w}$) and a history of the costs computed with respect to X."
]
},
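{
"cell_type": "markdown",
"metadata": {},
"source": [
"A short derivation of why the update has this form, assuming the cost is scaled by $1/(2N)$ as in the code of computeCost (a constant factor that does not move the minimizer):\n",
"$$\n",
"\\frac{\\partial}{\\partial w_j} \\left[ \\frac{1}{2N}\\sum_{i=1}^{N} (\\mathbf{w}^T\\mathbf{x}_i - y_i)^2 \\right] = \\frac{1}{N}\\sum_{i=1}^{N} (\\mathbf{w}^T\\mathbf{x}_i - y_i)\\, x_{ij} = \\frac{\\delta_j}{N}\n",
"$$\n",
"Each step therefore moves $w_j$ against the $j$-th component of the gradient, with step size $\\alpha$."
]
},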
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def gradientDescent(X, y, w, alpha, num_iters):\n",
"    N = y.size\n",
"    J_history = np.zeros(num_iters)\n",
"    numParameters = w.size\n",
"\n",
"    for it in range(num_iters):\n",
"        # fresh array each iteration, so that w stays unchanged\n",
"        # until every delta_j has been computed (simultaneous update)\n",
"        temp = np.zeros(numParameters)\n",
"        for j in range(numParameters):\n",
"            delta_j = 0\n",
"            for i in range(N):\n",
"                delta_j += (X[i, :].dot(w) - y[i]) * X[i, j]\n",
"            temp[j] = w[j] - alpha * (delta_j / N)\n",
"        w = temp\n",
"        J_history[it] = computeCost(X, y, w)\n",
"\n",
"    return (w, J_history)"
]
},
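{
"cell_type": "markdown",
"metadata": {},
"source": [
"The three nested loops of gradientDescent can be collapsed into matrix products. The sketch below is one possible vectorized variant, assuming 1-D arrays for y and w. The name gradient_descent_vectorized, the toy data and the chosen alpha are illustrative assumptions, not part of the original notebook.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"def gradient_descent_vectorized(X, y, w, alpha, num_iters):\n",
"    N = y.size\n",
"    w = np.asarray(w, dtype=float).copy()  # do not modify the caller's array\n",
"    J_history = np.zeros(num_iters)\n",
"    for it in range(num_iters):\n",
"        residuals = X.dot(w) - y      # shape (N,)\n",
"        delta = X.T.dot(residuals)    # delta_j for every j at once\n",
"        w = w - alpha * delta / N     # simultaneous update of all weights\n",
"        r = X.dot(w) - y\n",
"        J_history[it] = r.dot(r) / (2 * N)\n",
"    return w, J_history\n",
"\n",
"# Toy data (hypothetical): noiseless points on y = 1 + 2x\n",
"x = np.linspace(0.0, 1.0, 20)\n",
"X_toy = np.column_stack((np.ones(x.size), x))\n",
"y_toy = 1.0 + 2.0 * x\n",
"\n",
"w_fit, history = gradient_descent_vectorized(X_toy, y_toy, np.zeros(2), 0.5, 5000)\n",
"print(w_fit)\n",
"```\n",
"\n",
"On this noiseless toy problem the weights should approach (1, 2) and the recorded cost should not increase from one iteration to the next."
]
},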
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Application example\n",
"\n",
"We will use the same data used in practice_basic.ipynb"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"scrolled": true
},
"outputs": [],
"source": [
"import numpy as np\n",
"%matplotlib inline\n",
"import matplotlib.pyplot as plt\n",
"from mpl_toolkits.mplot3d import Axes3D\n",
"\n",
"fname = '../data/data1.txt'\n",
"data = np.loadtxt(fname, delimiter = ',')\n",
"N = data.shape[0]\n",
"X = data[:, 0]\n",
"y = data[:, 1]\n",
"\n",
"print('Shape of array X:', X.shape)\n",
"print('Shape of array y:', y.shape)\n",
"\n",
"# extend x by adding a constant component 1: x ---> (1, x)\n",
"X = np.column_stack((np.ones(N), X))\n",
"\n",
"print('Shape of array X after extension:', X.shape)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The input data has dimension 1, so this is an example of simple linear regression. We want to find $\\mathbf{w} = (w_0,w_1)$, which defines a function $f(x) = w_0+w_1\\,x$.\n",
"\n",
"Try changing\n",
"- the number of iterations num_iters\n",
"- the learning rate alpha"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# weight vector starts at zero\n",
"w = np.zeros(2)\n",
"\n",
"initialCost = computeCost(X, y, w)\n",
"print('Initial cost: ', initialCost)\n",
"\n",
"\n",
"# plot the initial fit\n",
"fig = plt.figure(figsize=(14,7))\n",
"plt.subplot(121)\n",
"plt.title('Initial fit')\n",
"plt.plot(X[:,1], y, 'rx')\n",
"plt.plot(X[:,1], X.dot(w), '-')\n",
"plt.xlim(-1, 10)\n",
"plt.ylim(-1,6)\n",
"plt.xlabel('x')\n",
"plt.ylabel('y')\n",
"\n",
"# Some gradient descent settings\n",
"iterations = 1500\n",
"alpha = 0.01\n",
"\n",
"# run gradient descent\n",
"w, J_history = gradientDescent(X, y, w, alpha, iterations)\n",
"\n",
"\n",
"finalCost = computeCost(X, y, w)\n",
"print('Final cost: ', finalCost)\n",
"print('Weight w found by gradient descent: (%f, %f)' % (w[0], w[1]))\n",
"\n",
"\n",
"# Plot the final fit\n",
"plt.subplot(122)\n",
"plt.title('Final fit')\n",
"plt.plot(X[:,1], y, 'rx')\n",
"plt.plot(X[:,1], X.dot(w), '-')\n",
"plt.xlim(-1, 10)\n",
"plt.ylim(-1,6)\n",
"plt.xlabel('x')\n",
"plt.ylabel('y')\n",
"\n",
"plt.show()\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Some predictions using the estimated function"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Predict values for x = 3.5 and x = 7\n",
"predict1 = np.dot([1, 3.5], w)\n",
"print('x = 3.5 predicted output = %f\\n' %(predict1))\n",
"\n",
"predict2 = np.dot([1, 7], w)\n",
"print('x = 7.0 predicted output = %f\\n' % (predict2))\n",
"\n",
"# Plot the linear fit and predictions\n",
"plt.figure(figsize=(7,7))\n",
"plt.plot(X[:,1], y, 'rx')\n",
"plt.plot([3.5, 7], [predict1, predict2], 'bo')\n",
"plt.plot(X[:,1], X.dot(w), '-')\n",
"plt.xlim(-1, 10)\n",
"plt.ylim(-1,6)\n",
"plt.xlabel('x')\n",
"plt.ylabel('y')\n",
"plt.show()"
]
},
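{
"cell_type": "markdown",
"metadata": {},
"source": [
"One way to sanity-check weights found by gradient descent is to compare them with a direct least-squares solution. The sketch below does this on synthetic data (hypothetical, not the data file used in this notebook) with np.linalg.lstsq. The direct solver is a swapped-in technique used only for checking, not the method this notebook is about.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"# Synthetic data (hypothetical): noisy points around y = 0.5 + 3x\n",
"rng = np.random.default_rng(0)\n",
"x = rng.uniform(0.0, 1.0, size=50)\n",
"X_syn = np.column_stack((np.ones(x.size), x))\n",
"y_syn = 0.5 + 3.0 * x + rng.normal(0.0, 0.1, size=x.size)\n",
"\n",
"# Direct least-squares solution of X w = y\n",
"w_ls, *_ = np.linalg.lstsq(X_syn, y_syn, rcond=None)\n",
"\n",
"# Batch gradient descent on the same data\n",
"w_gd = np.zeros(2)\n",
"alpha = 0.5\n",
"for _ in range(5000):\n",
"    w_gd = w_gd - alpha * X_syn.T.dot(X_syn.dot(w_gd) - y_syn) / y_syn.size\n",
"\n",
"print(w_ls, w_gd)\n",
"```\n",
"\n",
"If gradient descent has converged, both weight vectors should agree to many decimal places."
]
},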
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Evolution of the cost during training"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# plot the evolution of the cost over the training run performed above\n",
"#fig = plt.gcf()\n",
"#fig.set_size_inches(18.5, 10.5)\n",
"#plt.subplot(121)\n",
"plt.plot(range(1, J_history.size + 1), J_history, '-')\n",
"plt.xlabel('Iteration')\n",
"plt.ylabel('Cost')\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# a second training run, with a different value of alpha\n",
"w2 = np.zeros(2)\n",
"alpha2 = 0.05\n",
"w2, J_history2 = gradientDescent(X, y, w2, alpha2, iterations)\n",
"\n",
"# %g keeps the decimals (%d would truncate 0.01 and 0.05 to 0)\n",
"plt.plot(range(1, J_history.size + 1), J_history, 'b-', label=\"alpha=%g\" % alpha)\n",
"plt.plot(range(1, J_history2.size + 1), J_history2, 'g-', label=\"alpha=%g\" % alpha2)\n",
"plt.xlabel('Iteration')\n",
"plt.ylabel('Cost')\n",
"plt.legend()\n",
"plt.show()"
]
},
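{
"cell_type": "markdown",
"metadata": {},
"source": [
"The effect of the learning rate can be illustrated on a small synthetic problem. The sketch below assumes noiseless data on a line and two hand-picked values of alpha. The helper run_gd, the data and both alpha values are hypothetical choices for illustration, and the threshold between convergence and divergence depends on the data.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"def run_gd(X, y, alpha, num_iters):\n",
"    # Batch gradient descent from w = 0; returns the cost after each step\n",
"    w = np.zeros(X.shape[1])\n",
"    costs = np.zeros(num_iters)\n",
"    for it in range(num_iters):\n",
"        w = w - alpha * X.T.dot(X.dot(w) - y) / y.size\n",
"        r = X.dot(w) - y\n",
"        costs[it] = r.dot(r) / (2 * y.size)\n",
"    return costs\n",
"\n",
"# Synthetic data (hypothetical): points on y = 1 + 0.5x, x in [0, 10]\n",
"x = np.linspace(0.0, 10.0, 30)\n",
"X_syn = np.column_stack((np.ones(x.size), x))\n",
"y_syn = 1.0 + 0.5 * x\n",
"\n",
"costs_small = run_gd(X_syn, y_syn, 0.01, 50)  # small step: cost shrinks\n",
"costs_large = run_gd(X_syn, y_syn, 0.1, 50)   # step too large: cost grows\n",
"print(costs_small[-1], costs_large[-1])\n",
"```\n",
"\n",
"For this data, the smaller step makes the cost decrease at every iteration, while the larger step overshoots the minimum and the cost grows."
]
},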
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Plot of the cost function over the weight space $\\mathbb{R}^2$.\n",
"The cost function is computed on a grid of points."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Grid over which we will calculate J\n",
"w0_vals = np.linspace(-10, 10, 50)\n",
"w1_vals = np.linspace(-1, 4, 50)\n",
"\n",
"w0_coord, w1_coord = np.meshgrid(w0_vals, w1_vals)\n",
"\n",
"# initialize J_vals to a matrix of 0's\n",
"J_vals = np.zeros((w0_vals.size, w1_vals.size))\n",
"\n",
"# Fill out J_vals\n",
"for i in range(w0_vals.size):\n",
"    for j in range(w1_vals.size):\n",
"        t = [w0_vals[i], w1_vals[j]]\n",
"        J_vals[i,j] = computeCost(X, y, t)\n",
"\n",
"# Surface plot\n",
"fig = plt.figure(figsize=(10,10))\n",
"ax = fig.add_subplot(111, projection = '3d')\n",
"ax.plot_surface(w0_coord, w1_coord, J_vals.T, rstride=1, cstride=1, alpha=0.6, cmap=plt.cm.jet)\n",
"ax.set_xlabel(r'$w_0$')\n",
"ax.set_ylabel(r'$w_1$')\n",
"ax.set_zlabel('Cost')\n",
"ax.view_init(elev=15, azim=230)\n",
"plt.show()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.2"
}
},
"nbformat": 4,
"nbformat_minor": 0
}