## Tuesday, April 2, 2013

### Installing the Redhawk SDR framework

WHAT IS IT?
Redhawk SDR is a new SDR framework, in the vein of GNU Radio Companion and the OSSIE framework, for quickly developing and deploying signal processing algorithms. This U.S. Government-backed software has extensive support for creating radio flows on a single computer or distributing your SDR processing chain across networked hardware, and it has been used in many commercial applications. It will also be incorporating the existing work done under the banner of GNU Radio Companion and OSSIE. See more at www.redhawksdr.org.

FLY ON YOUR WINGS (LIKE AN EAGLE)
The first step (or at least the easiest way to get going) is to install CentOS 6.{1-3} x86_64. Redhawk comes in rpm packages, so I am sure other flavors of Red Hat Linux could work, but I personally used CentOS 6.3 x86_64 for this tutorial. Of course, some of the difficulties I experienced might be alleviated by a distribution with a richer package repository, like Fedora.

To install the Redhawk software, take the code from the GitHub link below, paste it into a bash script, then set executable permissions with chmod +x <scriptname>. I keep the latest copy of this posted on my GitHub for Redhawk utilities.

Once this script completes, you should see a folder called redhawkIDE in your home directory. Enter this folder, then type ./eclipse. This should launch the Redhawk IDE, which allows you to create components and blocks to perform SDR tasks.

ODD FUTURE
I am planning to include support for 32-bit and CentOS 5.x installs in the get_redhawk.sh script at some point, but for now only CentOS 6.x 64-bit is supported.

Redhawk has begun to integrate GNU Radio components into the framework, which will hopefully bring rtl-sdr support. This should cover the E4000 and R820T tuners, so that we can quickly begin to use the Redhawk SDR environment for some real projects. For now, the only way I can see to get rtl-sdr working with Redhawk is through a nasty hack - make a FIFO and write data from rtl-sdr to it... I will be posting on this in the near future.
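In the meantime, here is a rough sketch of that FIFO idea in Python (the pipe name, center frequency, and sample rate below are just example values, and this assumes the rtl_sdr capture utility from the rtl-sdr project is installed):

import numpy as np

#In a shell, create the pipe and start the capture first, e.g.:
#  mkfifo /tmp/rtl_fifo
#  rtl_sdr -f 97500000 -s 2048000 /tmp/rtl_fifo
#rtl_sdr writes interleaved unsigned 8-bit I/Q samples to the pipe
CHUNK = 512*1024
with open("/tmp/rtl_fifo", "rb") as fifo:
    raw = np.frombuffer(fifo.read(CHUNK), dtype=np.uint8)
    #Convert to complex samples centered around 0 for further DSP
    iq = (raw[0::2] - 127.5) + 1j*(raw[1::2] - 127.5)
    print(len(iq))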

## Tuesday, March 12, 2013

### Introduction to Machine Learning, Part 2: Linear Regression

NUMERO DOS
Continuing our exploration of machine learning, we will discuss the use of basis functions for regression analysis. When presented with an unknown dataset, it is very common to attempt to find trends or patterns. The most basic form of this is visual inspection - how is the data trending? Does it repeat in cycles? Can we predict future data given some past events? The mathematical approach to this "trend finding" is called regression, or line fitting. As we will see, it is possible to fit more than a simple straight line, and the general technique of regression is very effective at breaking down many types of data.

DRY DRY DERIVATION
What are some sample functions we might want to perform regression on? Take as an example:
$y = x^2 - 4x + 1$

With $x = 1\ldots100$ and no noise, the output $y$ looks like this:

What happens if we look at the same values, distorted by some noise?

Since we know the original function for the data, it is easy to "see" the underlying $x^2 - 4x +1$ (the blue line) is still there, but how can we do this mathematically, and without knowledge of the original generating function?

Looking at the function $x^2 - 4x +1$, rearranged as $1-4x+x^2$, we see that this is really a specialization of the general $2$nd order polynomial $w_0 + w_1x + w_2x^2$, with weight vector ${\bf w} = [1, -4, 1]$. If we also define a basis vector from the powers of $x$, ${\bf x} = [x^0, x^1, x^2] = [1, x, x^2]$, the result is a function $y({\bf w}, {\bf x}) = w_0 + w_1x_1 + w_2x_2$.

If we think of any $n$th order polynomial $y({\bf w}, {\bf x})$, it can always be represented in the form:

$y({\bf w}, {\bf x}) = w_0 + w_1x_1 + w_2x_2+\ldots + w_nx_n$

This can also be represented in matrix form.

$[w_0\ldots w_{n}]\left[ \begin{array}{cccc} 1 & 0 & \cdots & 0\\ 0 & x_1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & x_n \end{array} \right]$
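As a quick numerical sanity check (a small sketch, not part of the derivation), taking the dot product of the weight vector ${\bf w} = [1, -4, 1]$ with the stacked basis $[1, x, x^2]$ reproduces the original polynomial exactly:

#!/usr/bin/python
import numpy as np

w = np.array([1., -4., 1.])
x = np.arange(1, 101)
#Rows of the basis are the functions 1, x, and x^2 evaluated at every x
basis = np.vstack([x**0, x**1, x**2])
y = w.dot(basis)
#Prints True - the weighted sum of basis functions is x^2 - 4x + 1
print(np.allclose(y, x**2 - 4*x + 1))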

BASE CAMP 1KM
In fact, going one step further, we see that the vector ${\bf x}$ could really be better generalized as a basis vector - that is, our current vector ${\bf x} = [1, x, x^2]$, built from the powers $x^n$, could be generalized with something besides $x^n$. We could use sine waves (Fourier basis), Gaussians, sigmoids, wavelets, or any number of other functions to perform this same regression - in fact, this is the root of the concept of transforms in DSP lingo.
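For example, Gaussian or sigmoid basis functions can be written in the same style as the gen_polynomial and gen_dft helpers used in the code below (a small sketch - the center mu and width s here are arbitrary parameters chosen for illustration, not values from this post):

import numpy as np

def gen_gaussian(x, mu, s=1.):
    #Gaussian bump centered at mu with width s
    return np.exp(-(x - mu)**2/(2.*s**2))

def gen_sigmoid(x, mu, s=1.):
    #Smooth step centered at mu with slope controlled by s
    return 1./(1. + np.exp(-(x - mu)/s))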

When we talk about linear regression, we are really talking about a linear regression in the basis domain - a linear combination of basis functions multiplied by weighting values, which can be used to approximate the original signal. Since we have learned that we can use nearly any function as a basis, let's change notation from ${\bf x}$ to ${\bf \phi}$, where ${\bf \phi}$ is a function of $x$.

$[w_0\ldots w_{n}]\left[ \begin{array}{cccc} \phi_0(x_1) & \phi_0(x_2) & \cdots & \phi_0(x_n) \\ \phi_1(x_1) & \phi_1(x_2) & \cdots & \phi_1(x_n) \\ \vdots & \vdots & \ddots & \vdots \\ \phi_m(x_1) & \phi_m(x_2) & \cdots & \phi_m(x_n) \end{array} \right]$

The full derivation of the maximum likelihood estimate (MLE) can be found here, but the short and sweet version is that the maximum likelihood estimate of ${\bf w}$ is $(\Phi^T\Phi)^{-1}\Phi^T {\bf t}$, where $\Phi$ is the general basis matrix defined above, and ${\bf t}$ is a set of test datapoints. These test datapoints are generated by $\Phi {\bf w_m}$, where ${\bf w_m}$ are the measured values (red dots shown in the above figures).

$(\Phi^T\Phi)^{-1}\Phi^T$ is known as the Moore-Penrose pseudoinverse of $\Phi$, and it can be replaced by the ordinary matrix inverse $\Phi^{-1}$ if the basis matrix is square and invertible (3x3, 100x100, etc.). Let's look at the code.

#!/usr/bin/python
import numpy as np
import matplotlib.pyplot as plot
import math

def gen_dft(m, n, N):
    #Complex exponential (DFT) basis function
    return np.exp(-2j*np.pi*m*n/N)

def gen_polynomial(x, m):
    return x**m

N = 10
N_basis = 3
noise_var = B = 5
basis = np.matrix(np.zeros((N,N)), dtype=np.complex64)
xs = np.matrix(range(N)).T
ys = np.square(xs) - 4*xs + 1

wm = ys + np.sqrt(B)*np.random.randn(N,1)
for m in range(N_basis):
    for n in range(N):
        if m == 0:
            basis[m,n] = 1.0
        else:
            basis[m,n] = gen_polynomial(xs[n], m)
        #To use the gen_dft basis, make sure to set N_basis = N
        #basis[m,n] = gen_dft(m, n, N)

test_data = t = basis*wm
#Calculate using the Moore-Penrose pseudoinverse using the following formula
#maximum_likelihood = wml = np.linalg.inv(basis.T*basis)*basis.T*t
#Direct calculation appears to have numerical instability issues...
#Luckily the pinv method calculates the Moore-Penrose pseudoinverse using SVD,
#which largely avoids the numerical issues
maximum_likelihood = wml = np.linalg.pinv(basis)*t

plot.figure()
plot.title("Regression fit using polynomial basis function, number of basis functions = $\$$" + N_basis + "\$$") plot.plot(ys, 'b') plot.plot(wm, 'ro') plot.plot(np.real(wml), 'g') plot.show() As we can see, the approximation with 3 basis functions (2nd order polynomial fit) does a pretty good job of approximating the original function. Let's see what happens if we increase the number of basis functions. We can see that this fit gets closer to the measured data (red dots), but is actually further from the true signal without noise! This is called overfitting, because our estimate is beginning to fit our measurement, which has noise, rather than the underlying function (the blue line). Let's try another example, using the Fourier basis matrix. The Fourier basis does a decent job of fitting the line, but still has more error than the polynomial fit with the right number of basis functions. Because the original function is polynomial, we should expect a polynomial basis to give superior results. We are overfitting our polynomial regression by including too many basis functions - in general, we want to fit the smallest number of basis functions while still minimizing overall error. There are ways to reduce overfitting by regularizing this operation, using something called the "lasso method". There are also methods for sequential estimation, which avoid large matrix operations for higher order basis matrices. We will instead look at a Bayesian approach to regression, which naturally has regularization properties, and also eliminates most of the large matrix math by making this regression a sequential operation. This will greatly help higher order regression, and falls in line with our earlier approaches from part 1. BAYES ASPIRIN So how do we actually perform a Bayesian regression? There are some complicated derivations once again (pg 17 and onward here), but the idea is that we make iterative estimates, starting with some prior estimate and a likelihood to get a posterior estimate, which we will use as the prior for the next iteration. Remember, Bayesian is all about$posterior\: =\: likelihood\: \times\: prior$. After iterating through all the data points, we expect our estimate to approach the true generating function, without noise. Let's look at the code. #!/usr/bin/python import numpy as np import matplotlib.pyplot as plot N = 10 noise_var = B = .2**2 lower_bound = lb = 0. upper_bound = ub = 5. xs = np.matrix((ub-lb)*np.random.rand(N)+lb).T xaxis = np.matrix(np.linspace(lb,ub,num=N)).T w = np.array([1, -4, 1]) ys = w[0] + w[1]*xaxis + w[2]*np.square(xaxis) t = w[0] + w[1]*xs + +w[2]*np.square(xs) + np.sqrt(B)*np.random.randn(N,1); def gen_polynomial(x, p): return x**p N_basis = 7 alpha = a = 1. beta = b = 1./B; prior_m = np.zeros((N_basis, 1)) m = np.zeros((N_basis, N)) prior_s = np.matrix(np.diag(np.array([a]*N_basis))) s = np.zeros((N_basis, N_basis)) plot.figure() for n in range(N): poly = np.vectorize(gen_polynomial) basis = poly(xs[n], np.matrix(np.arange(N_basis))) s_inv = prior_s.I*np.eye(N_basis)+b*(basis.T*basis) s = s_inv.I*np.eye(N_basis) #Need to use .squeeze() so broadcasting works correctly m[:,n] = (s*(prior_s.I*prior_m+(b*basis.T*t[n]))).squeeze() y = m[0,n] + m[1,n]*xaxis + m[2,n]*np.square(xaxis) plot.plot(xaxis, y, "g") prior_m[:,0] = m[:,n].squeeze() prior_s = s plot.title("Bayesian regression using polynomial basis function, number of basis functions =$\$$" + N_basis + "\$$")
plot.plot(xaxis, ys, "b")
plot.plot(xs, t, "ro")
plot.plot(xaxis, y, "k")
plot.show()

Adding in the progression plots (plot.plot(xaxis, y, "g"))
The green lines show how the Bayesian regression updates on each iteration through the loop - though the lines are unlabeled, the closer a green line is to the final estimate (black line), the later in the loop it was drawn, with the last iteration's estimate plotted as the black line.

The top part of the code sets up the generating function and also generates samples corrupted by noise. We could set xs differently to get linearly spaced samples (using np.linspace(lb,ub,N)) instead of random spacing, but random samples are what Bishop uses in his book and diagrams (Pattern Recognition and Machine Learning, C. Bishop). I also chose to vectorize the basis matrix generation this time - this doesn't actually increase the code speed (in numpy, at least), but it does turn a multi-line for loop into one line of code.

In fact, looking at page 153 in Bishop's book will give all the formulas used in the processing loop, mapped as follows.
s_inv = ${\bf S}_N^{-1}$
s = ${\bf S}_N$
m[:,n] = ${\bf m}_N$
prior_s = ${\bf S}_0$
prior_m = ${\bf m}_0$
Updating the priors with the results of the loop each time, we end up with a reasonable approximation of the generating function, as long as N_basis is appropriate for the generating function. What happens if we don't use the right value for N_basis?
Bad things, it turns out. Instead of an overfitting problem, we now have a model selection problem - if we don't choose N_basis right, we won't converge to the right answer!

AUTO MODELING AND DETAILING
We really want to automate this model selection process, as scientists are too busy traveling the world and winning awards to babysit their award-winning scientific models. Luckily, there is a formula for calculating Bayesian model evidence - the model that maximizes this function should be the best choice! See the notes here (PDF) for more details, starting with slide 41.

${\large {\bf A} = {\bf S}_N^{-1} = \alpha {\bf I} + \beta {\bf \Phi}^T{\bf \Phi}}$
${\large {\bf m}_N = \beta{\bf A}^{-1}{\bf \Phi}^T{\bf t}}$
${\large E({\bf m}_N) = \frac{\beta}{2}||{\bf t}-{\bf \Phi}{\bf m}_N||^2+\frac{\alpha}{2}{\bf m}_N^T{\bf m}_N}$
${\large \ln p({\bf t}| \alpha, \beta) = \frac{M}{2}ln(\alpha)+\frac{N}{2}ln(\beta)-E({\bf m}_N) - \frac{1}{2}ln(| {\bf A} | ) - \frac{N}{2}ln(2\pi)}$

Approximations for $\alpha$ and $\beta$ are:
${\large \alpha = \frac{M}{2E_w({\bf m}_N)} = \frac{M}{{\bf w}^T{\bf w}}}$
${\large \beta = \frac{N}{2E_D({\bf m}_N)} = \frac{N}{\sum\limits^N_{n=1}\{t_n - {\bf w}^T\phi({\bf x}_n)\}^2}}$
By running each model, then computing the last equation (the model evidence function), we can see which model has the highest evidence of being correct. The full code for this calculation is listed below.
#!/usr/bin/python
import numpy as np
import matplotlib.pyplot as plot

N = 50
noise_var = B = .2**2
lower_bound = lb = -5.
upper_bound = ub = 5.
xs = np.matrix(sorted((ub-lb)*np.random.rand(N)+lb)).T
xaxis = np.matrix(np.linspace(lb,ub,num=N)).T
w = np.array([1, -4, 1])
ys = w[0] + w[1]*xaxis + w[2]*np.square(xaxis)
t = w[0] + w[1]*xs + +w[2]*np.square(xs) + np.sqrt(B)*np.random.randn(N,1);

prior_alpha = 0.005 #Low precision initially
prior_beta = 0.005 #Low guess for noise var
max_N = 7 #Upper limit on model order
evidence = E = np.zeros(max_N)
f, axarr = plot.subplots(3)
def gen_polynomial(x, p):
    return x**p

itr_upper_bound = 250
all_y = []

for order in range(1, max_N+1):
    m = np.zeros((order, 1))
    s = np.zeros((order, order))
    poly = np.vectorize(gen_polynomial)
    basis = poly(xs, np.tile(np.arange(order), N).reshape(N, order))
    alpha = a = prior_alpha
    beta = b = prior_beta
    itr = 0
    while itr < itr_upper_bound:
        itr += 1
        first_part = a*np.eye(order)
        second_part = b*(basis.T*basis)
        s_inv = a*np.eye(order)+b*(basis.T*basis)
        m = b*s_inv.I*basis.T*t
        posterior_alpha = pa = np.matrix(order/(m.T*m))[0,0]
        posterior_beta = pb = np.matrix(N/((t.T-m.T*basis.T)*(t.T-m.T*basis.T).T))[0,0]
        a = pa
        b = pb
    A = a*np.eye(order)+b*(basis.T)*basis
    mn = b*(A.I*(basis.T*t))
    penalty = emn = b/2.*(t.T-mn.T*basis.T)*(t.T-mn.T*basis.T).T+a/2.*mn.T*mn
    E[order-1] = order/2.*np.log(a)+N/2.*np.log(b)-emn-1./2.*np.log(np.linalg.det(A))-N/2.*np.log(2*np.pi)
    y = (mn.T*basis.T).T
    all_y.append(y)
    axarr[0].plot(xs, y, "g")

best_model = np.ma.argmax(E)
#print E
x0label = x2label = "Input X"
y0label = y2label = "Output Y"
x1label = "Model Order"
y1label = "Score"
plot.tight_layout()
axarr[0].set_xlabel(x0label)
axarr[0].set_ylabel(y0label)
axarr[1].set_xlabel(x1label)
axarr[1].set_ylabel(y1label)
axarr[2].set_xlabel(x2label)
axarr[2].set_ylabel(y2label)
axarr[0].set_title("Bayesian model estimation using polynomial basis functions")
axarr[0].plot(xaxis, ys, "b")
axarr[0].plot(xs, t, "ro")
axarr[1].set_title("Model Evidence")
axarr[1].plot(E, "b")
axarr[2].set_title("Best model, polynomial order $"+best_model+"$")
axarr[2].plot(xs, t, "ro")
axarr[2].plot(xs, all_y[best_model], "g")
plot.show()

The model selection calculation has allowed us to evaluate different models and compare them mathematically. The model evidence values for this run are:

[-200.40318142 -187.76494065 -186.72241097 -189.13724168 -191.74123924 -194.37069482 -197.01120722]

Let's add more noise and see what happens.
Adjusting the noise_var value from .2**2 to 1.**2, we see that the higher order models have increased bias, just as we saw in our earlier experiments. Our model evidence coefficients are:

[-207.27716138 -195.14402961 -188.44030008 -190.92931379 -193.53242997 -196.16271701 -198.75960682]

This time, we are able to use the model evidence function to correctly compensate for model bias and select the best model for our data. Awesome! Let's push a little farther, and reduce the number of sample points as well.

We can see that with a reduced number of samples (N = 10), there is not enough information to accurately fit or estimate the model order. Our coefficients for this run were:

[-46.65574626 -42.33981333 -43.95725922 -46.19200103 -49.68466447 -51.98193588 -54.56531748]

This is an important point - all of these techniques help determine a model which is best for your data, but if there are not enough data points, or the data points are not distinct enough to make a good assessment, you will not be able to get a grasp of the underlying function, and you will end up with an imprecise answer. In the next installment of this series, we will cover the basics of classification - how do we determine if there are multiple generating models for one set of data? If there are multiple models, how many are there - and which data points belong to which model?

SOURCES
Pattern Recognition and Machine Learning, C. Bishop
Bayesian Data Analysis, A. Gelman, J. Carlin, H. Stern, and D. Rubin
Classnotes from Advanced Topics in Pattern Recognition, UTSA, Dr. Zhang, Spring 2013

## Friday, March 8, 2013

### Amazon EC2 Setup and Connection

CLOUD NINE
First, you will need to sign up for Amazon EC2 (I am currently in the Free tier).
Next, follow these directions to launch a cloud OS so that we can start serving up webpages! I chose Ubuntu 12.10 64-bit, so these instructions apply to that instance type, though they may work for other types.

To ssh to the newly launched instance, find the public DNS from the Amazon EC2 management page, then use these commands:
chmod 600 <keyname>.pem
ssh -i <keyname>.pem -l ubuntu <public DNS name>

Now we are in! Time to do some quick security setup, using the instructions found here. The only liberties I have taken are setting the ufw firewall rules to 22 allow, and not setting IP rules or allowed users for ssh (since we have set it to key only access).

We need to get the python-pip and python-virtualenv packages. Do this by running sudo apt-get install python-{pip,virtualenv}. virtualenv isolates a special instance of Python from everything else, so packages won't interfere with your main Python install.

Running virtualenv venv; source venv/bin/activate should put you into the virtual environment; we can install Flask there by running sudo venv/bin/pip install flask. For an easy setup of Flask with authentication, SQLAlchemy, and Twitter Bootstrap (thanks to esbullington), we can run the following commands

sudo apt-get install python-dev libpq-dev
 
To finish launching our app, we need to create a postgreSQL database for the application to connect to. The basic instructions list the following steps:
sudo apt-get install postgresql
sudo -u postgres psql postgres
sudo -u postgres psql
create database <dbname> with owner <username>;
sudo -u postgres psql

To login to the database, use psql -d <dbname> -U <username> -h localhost.

app.cfg
Change the postgresql settings to

app.py
Change the app.run line so that we bind to port 80 and allow connections from any external IP
app.run(debug=True,host='0.0.0.0',port=80)
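For context, a bare-bones app.py would look something like this (a minimal sketch, not the esbullington starter app itself), with the modified app.run line at the bottom:

#!/usr/bin/python
from flask import Flask

app = Flask(__name__)

@app.route("/")
def index():
    return "Hello from EC2!"

if __name__ == "__main__":
    #Bind to port 80 on all interfaces so external clients can connect
    app.run(debug=True, host='0.0.0.0', port=80)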

To run this script, do

Now the next step is figuring out how to link this server to a real deal domain name!

## Friday, March 1, 2013

### Introduction to Machine Learning, Part 1: Parameter Estimation

RISE OF THE MACHINES
Machine learning is a fascinating field of study which is growing at an extreme rate. As computers get faster and faster, we attempt to harness the power of these machines to tackle difficult, complex, and/or unintuitive problems in biology, mathematics, engineering, and physics. Many of the purely mathematical models used in these fields are extremely simplified compared to their real-world counterparts (see the spherical cow). As scientists attempt more and more accurate simulations of our universe, machine learning techniques have become critical to building accurate models and estimating experimental parameters.

Also, even very basic machine learning can be used to effectively block spam email, which puts it somewhere between water and food on the "necessary for life" scale.

BAYES FORMULA - GREAT FOR GROWING BODIES
Machine learning seems very complicated, and advanced methods in the field are usually very math heavy - but it is all based in common statistics, mostly centered around the ever useful Gaussian distribution. One can see the full derivation of the formulas below here (PDF) or here (PDF); a lot of math is involved, but it is relatively straightforward as long as you remember Bayes' rule:
$posterior \:=\: likelihood \:\times\: prior$

HE DIDN'T MEAN IT
From the links above, we can see that our best estimate for the mean $\mu$ given a known variance $\sigma^2$ and some Gaussian distributed data vector $X$ is:

$\sigma(\mu)^2_n = {\Large \frac{1}{\frac{N}{\sigma^2}+\frac{1}{\sigma_o^2}}}$

$\mu_n = {\Large \frac{1}{\frac{N}{\sigma^2}+\frac{1}{\sigma_o^2}}\left(\frac{\sum\limits_{n=1}^N x_n}{\sigma^2}+\frac{\mu_o}{\sigma_o^2}\right)}$

where $N$ (typically 1) represents the number of $X$ values used to generate the mean estimate $\mu$ (just $x_n$ if $N$ is 1), $\mu_o$ is the previous "best guess" for the mean ($\mu_{n-1}$), $\sigma_o^2$ is the previous confidence in the "best guess" for $\mu$ ($\sigma(\mu)^2_{n-1}$), and $\sigma^2$ is known ahead of time. Let's see what this looks like in python.

#!/usr/bin/python
import numpy as np
import matplotlib.pyplot as plot
total_obs = 1000
primary_mean = 5.
primary_var = known_var = 4.
x = np.sqrt(primary_var)*np.random.randn(total_obs) + primary_mean
f, axarr = plot.subplots(3)
f.suptitle("Unknown mean ($\$$\mu=\$$"+primary_mean+"), known variance ($\$$\sigma^2=\$$"+known_var+")")
y0label = "Timeseries"
y1label = "Estimate for mean"
y2label = "Doubt in estimate"
axarr[0].set_ylabel(y0label)
axarr[1].set_ylabel(y1label)
axarr[2].set_ylabel(y2label)
axarr[0].plot(x)
prior_mean = 0.
prior_var = 1000000000000.
all_mean_guess = []
all_mean_doubt = []
for i in range(total_obs):
    posterior_mean_doubt = 1./(1./known_var+1./prior_var)
    posterior_mean_guess = (prior_mean/prior_var+x[i]/known_var)*posterior_mean_doubt
    all_mean_guess.append(posterior_mean_guess)
    all_mean_doubt.append(posterior_mean_doubt)
    prior_mean=posterior_mean_guess
    prior_var=posterior_mean_doubt

axarr[1].plot(all_mean_guess)
axarr[2].plot(all_mean_doubt)
plot.show()

This code results in this plot:
We can see that there are two basic steps - generating the test data, and iteratively estimating the "unknown parameter", in this case the mean. We begin our estimate for the mean at any value (prior_mean = 0) and set the prior_var variable extremely large, indicating that our confidence in the mean actually being 0 is extremely low.

V FOR VARIANCE
What happens in the opposite (though still "academic") case, where we know the mean but not the variance? Our best estimate for the unknown variance $\sigma^2$, given the mean $\mu$ and some Gaussian distributed data vector $X$, will use some extra variables, but still represents the same estimation process:

$a_n = {\Large a_o + \frac{N}{2}}$
$b_n = {\Large b_o + \frac{1}{2}\sum\limits^N_{n=1}(x_n-\mu)^2}$
$\lambda = {\Large \frac{a_n}{b_n}}$
$\sigma(\lambda)^2 = {\Large \frac{a_n}{b_n^2}}$
where
$\lambda = {\Large \frac{1}{\sigma^2}}$

This derivation is made much simpler by introducing the concept of precision ($\lambda$), which is simply 1 over the variance. We estimate the precision, which can be converted back to variance if we prefer. $\mu$ is known in this case.

#!/usr/bin/python
import numpy as np
import matplotlib.pyplot as plot
total_obs = 1000
primary_mean = known_mean = 5
primary_var = 4
x = np.sqrt(primary_var)*np.random.randn(total_obs) + primary_mean
all_a = []
all_b = []
all_prec_guess = []
all_prec_doubt = []
prior_a=1/2.+1
prior_b=1/2.*np.sum((x[0]-primary_mean)**2)
f,axarr = plot.subplots(3)
f.suptitle("Known mean ($\$$\mu=\$$"+known_mean+"), unknown variance ($\$$\sigma^2=\$$"+primary_var+"; $\$$\lambda\$$="+1./primary_var+")") y0label = "Timeseries" y1label = "Estimate for precision" y2label = "Doubt in estimate" axarr[0].set_ylabel(y0label) axarr[1].set_ylabel(y1label) axarr[2].set_ylabel(y2label) axarr[0].plot(x) for i in range(1,total_obs): posterior_a=prior_a+1/2. posterior_b=prior_b+1/2.*np.sum((x[i]-known_mean)**2) all_a.append(posterior_a) all_b.append(posterior_b) all_prec_guess.append(posterior_a/posterior_b) all_prec_doubt.append(posterior_a/(posterior_b**2)) prior_a=posterior_a prior_b=posterior_b axarr[1].plot(all_prec_guess) axarr[2].plot(all_prec_doubt) plot.show() Here I chose to set the values for prior_a and prior_b to the "first" values of the estimation, we could just as easily have reversed the formulas by setting the precision to some value, and the "doubt" about that precision very large, then solving a system of two equations, two unknowns for a and b. INTO THE UNKNOWN(S) What happens if both values are unknown? All we really know is that the data is Gaussian distributed!$\nu_o={\Large \kappa_o-1}\mu_n={\Large \frac{\kappa_o\mu_o+\overline{X}N}{\kappa_o+N}}\kappa_n={\Large \kappa_o+N}\nu_n={\Large \nu_o+N}\sigma_n^2={\Large \frac{\nu_o\sigma^2+(N-1)s^2+\frac{\kappa_oN}{\kappa_o+N}(\overline{X}-\mu_o)^2}{\nu_n}}s={\Large var(X)}$The technique here is the same as other derivations for unknown mean, known variance and known mean, unknown variance. To estimate the mean and variance of data taken from a single Gaussian distribution, we need to iteratively update our best guesses for both mean and variance. In many cases,$N=1$, so the value for$s$is not necessary and$\overline{X}$becomes$x_n$. Let's look at the code. #!/usr/bin/python import matplotlib.pyplot as plot import numpy as np total_obs = 1000 primary_mean = 5. primary_var = 4. x = np.sqrt(primary_var)*np.random.randn(total_obs) + primary_mean f, axarr = plot.subplots(3) f.suptitle("Unknown mean ($\$$\mu=\$$"+primary_mean+"), unknown variance ($\$$\sigma^2=\$$"+primary_var+")") y0label = "Timeseries" y1label = "Estimate for mean" y2label = "Estimate for variance" axarr[0].set_ylabel(y0label) axarr[1].set_ylabel(y1label) axarr[2].set_ylabel(y2label) axarr[0].plot(x) prior_mean = 0. prior_var = 1. prior_kappa = 1. prior_v = 0. all_mean_guess = [] all_var_guess = [] for i in range(total_obs): posterior_mean = (prior_kappa*prior_mean+x[i])/(prior_kappa + 1) posterior_var = (prior_v*prior_var + prior_kappa/(prior_kappa + 1)*(x[i]-prior_mean)**2)/(prior_v + 1) prior_kappa += 1 prior_v += 1 all_mean_guess.append(posterior_mean) all_var_guess.append(posterior_var) prior_mean = posterior_mean prior_var = posterior_var axarr[1].plot(all_mean_guess) axarr[2].plot(all_var_guess) plot.show() We can see that the iterative estimation has successfully approximated the mean and variance of the underlying distribution! We "learned" these parameters, given only a dataset and the knowledge that it could be approximated by the Gaussian distribution. In the next installment of this series, I will cover linear regression. These two techniques (parameter estimation and linear regression) form the core of many machine learning algorithms - all rooted in basic statistics. SOURCES Pattern Recognition and Machine Learning, C. Bishop Bayesian Data Analysis, A. Gelman, J. Carlin, H. Stern, and D. Rubin Classnotes from Advanced Topics in Pattern Recognition, UTSA, Dr. 
Zhang, Spring 2013 ## Friday, February 15, 2013 ### Interacting With SQL Databases LIST ALL DATABASES WITH A USER psql -l -U <username> LOGIN TO A DATABASE AS USER psql -d <database> -U <username> or psql -d <database> -U <username> -W or psql -d <database> -U <username> -h localhost POSTGRES DATABASE COMMANDS For initial help, simpy type help; To get SQL help, type \h For psql help, type \? \d will show all the relations in a database \l will show all the databases available To see something listed in a table, simply run SELECT * FROM <TABLE> DESCRIBE <TABLE> will show what fields the table has http://www.stuartellis.eu/articles/postgresql-setup/ SPECIAL TRICKS FOR SQLITE To show all the tables in a DB .tables or SELECT * FROM sqlite_master; To show fields in a table, use pragma table_info(<TABLE>); ## Monday, February 11, 2013 ### Assorted Notes and Setup WGET FILES OF CERTAIN TYPE IN ALL SUBDIRECTORIES wget -r -A.gif <website> RESIZE MANY GIF IMAGES USING IMAGEMAGICK for i in *.gif; do convert$i -resize 400x400 $(basename$i .gif).gif; done
From here

ADDING LATEX SUPPORT TO A BLOG (LIKE THIS ONE!)

<script type="text/javascript" src="http://cdn.mathjax.org/mathjax/latest/MathJax.js"> MathJax.Hub.Config({ extensions: ["tex2jax.js","TeX/AMSmath.js","TeX/AMSsymbols.js"], jax: ["input/TeX", "output/HTML-CSS"], tex2jax: { inlineMath: [ ['$\$$','\$$'], ["$$","$$"] ], displayMath: [ ['$\$$\$$','$\$$\$$'], ["$\backslash [$","$\backslash ]$"] ], }, "HTML-CSS": { availableFonts: ["TeX"] } });< /script> I have forgotten where I got this tip, but whoever originally posted this - thank you! CREATE A PDF FROM MANY IMAGES convert *.jpg images.pdf http://dancingpenguinsoflight.com/2010/02/converting-multiple-images-to-one-pdf-on-linux/ ADD USER TO VBOXUSERS GROUP AND ENABLE USB sudo usermod -a -G vboxusers$USER

ESSENTIAL PACKAGES FOR REINSTALL
sudo apt-get install lyx texlive-full build-essential ipython python-pandas xclip openssh-server screen valgrind git trayer xmonad xmobar suckless-tools vim scrot

Running the program xev and then looking at the output can tell you the key mapping for a key to put in xmonad.hs
http://ubuntuforums.org/archive/index.php/t-1755109.html

SETUP GITHUB SSH KEYS
ssh-keygen -t rsa -C <email here>
sudo apt-get install xclip
xclip -sel clip < ~/.ssh/id_rsa.pub
Now just add the key through the github interface

To make the repo push over ssh
git remote rm origin
git remote add origin git@github.com:<username>/<repo>.git

https://help.github.com/articles/generating-ssh-keys

LAUNCH MATLAB IN 32 BIT MODE
matlab -glnx86

SETUP FAKE ETH DEVICE
sudo modprobe dummy
sudo ip link set name eth0 dev dummy0
sudo ifconfig eth0 hw ether 00:24:49:03:1a:72
http://www.question-defense.com/2012/11/26/linux-create-fake-ethernet-interface

INSTALL LATEST FLASHPLAYER
To get the latest flash plugin in Ubuntu 12.10, simply do
sudo apt-get install flashplugin-installer

CRONJOB TO CHANGE BRIGHTNESS
Tired of your backlight resetting every time you boot your laptop or set it to hibernate? Create a new file containing the following content in /usr/local/bin, and name the file check_backlight

#!/bin/bash

WAIT_TIME=3
LOOP_COUNTER=$((60/WAIT_TIME))
for i in $(seq 1 $LOOP_COUNTER); do
    DESIRED_BRIGHTNESS=22
    BRIGHTNESS_COMMAND='/usr/bin/intel_backlight '
    BRIGHTNESS_PERCENTAGE=$($BRIGHTNESS_COMMAND | awk '{print $(NF)}')
    BRIGHTNESS=${BRIGHTNESS_PERCENTAGE%'%'}
    if [ $BRIGHTNESS -ne $DESIRED_BRIGHTNESS ]; then
        BRIGHTNESS_COMMAND+=$DESIRED_BRIGHTNESS
        eval $BRIGHTNESS_COMMAND
    fi
    sleep $WAIT_TIME
done

Now add the following line to /etc/crontab
* * * * * root /bin/bash /usr/local/bin/check_backlight
This entry in crontab will run the script /usr/local/bin/check_backlight every minute. The check_backlight script loops for 60 seconds, checking the backlight every WAIT_TIME seconds.

CHANGE SCREEN BRIGHTNESS FROM COMMANDLINE
There is usually a program /usr/bin/intel_backlight that will allow you to set the backlight setting
/usr/bin/intel_backlight xx
where xx is some value between 0 and 100.
For older systems, use
sudo setpci -s 00:02.0 f4.b=xx
where xx is 00-FF, and 00:02.0 is my VGA display device as shown by lspci.
Alternatively, for newer devices which use the intel_backlight interface, use this command
echo xx | sudo tee /sys/class/backlight/intel_backlight/brightness
where xx is anything from 0 (this will blank out your screen...) to whatever is displayed by cat /sys/class/backlight/intel_backlight/max_brightness.
Though it changes more than just brightness, the command
xgamma -gamma xx
where xx is anywhere from 0->1.5 will also change the screen. You can go higher than 1.5, but I haven't tried anything higher.
http://wilmor24.wordpress.com/2010/05/11/change-screen-brightness-from-terminal-ubuntu-10-04/
https://bbs.archlinux.org/viewtopic.php?id=137164

SET CUSTOM VIDEOMODE TO CORRECTLY RESIZE VIRTUALBOX GUEST OS
First, you will need to install the Guest Additions for the VM you wish to resize. For most Linux VMs, this requires installing the latest kernel and kernel headers. To get the kernel and kernel headers in CentOS, run
sudo yum -y install kernel kernel-devel
Once this is done, reboot the VM. This should use the latest kernel (which was just installed), allowing for successful building of the main Guest Additions module. To build the Guest Additions, go to Devices->Install Guest Additions in the Virtualbox menu. You should now be able to autorun the Guest Additions software (it should show up as a CD) in the VM. There may still be other issues, typically OpenGL fails to build, but generally the main Guest Additions module is all that is necessary for screen resizing.
vboxmanage setextradata "Z7" "CustomVideoMode1" "1280x650x32"
Where Z7 is the name of the VM, and 1280x650x32 is the custom resolution you desire. In Ubuntu, you can take a stab at what your monitor resolution is by running xrandr. You can get a list of VM names by doing
vboxmanage list vms
https://forums.virtualbox.org/viewtopic.php?f=6&p=105958

MAKE MAC BOOTABLE USB IN UBUNTU
Put in a USB device, then do
dmesg | tail
Depending on what /dev/sdX device is listed, do
mkfs.vfat /dev/sdX -I
Note, the above command WILL wipe the USB drive!
Now take the downloaded .ISO, and do
dd of=/dev/sdX if=<path_to_.iso> bs=1M
Reboot the Mac while holding Alt. A boot prompt should come up and allow you to boot from the USB.

FIXING MATLAB UNDER XMONAD
After switching to XMonad, I still couldn't get MATLAB to work properly. The fix is to add the line
import XMonad.Hooks.SetWMName
at the top of xmonad.hs, and then
startupHook = setWMName "LG3D"
under the defaultConfig section of xmonad.hs
http://www.haskell.org/haskellwiki/Xmonad/Frequently_asked_questions

ADDING CLS FILE TO LATEX
Add a line to ~/.bashrc that points to your .cls files. The . in front is critical to source the current directory. For example, my github (http://www.github.com/kastnerkyle/Papers/) has an IEEEtran file that has the IEEE standard template inside.
To use it, simply add this line to ~/.bashrc
export TEXINPUTS=.:<path_to_IEEEtran_file>:$TEXINPUTS

now running texi2pdf should compile IEEE format .tex files

SETTING POWER TO SUSPEND ON XMONAD LAPTOP
You will need the acpi program for this change to work - do a

sudo apt-get install acpi

After this, simply change the line in /etc/acpi/lid.sh that says

. /usr/share/acpi-support/screenblank

to

echo -n mem > /sys/power/state

INSTALLING SKYPE ON UBUNTU 12.10 64 bit
After downloading the 64 bit Ubuntu 10.04 .deb package from skype.com, I tried to install with dpkg -i

sudo dpkg -i skype-ubuntu_4.0.0.8-1_amd64.deb

This returned with:

dpkg: dependency problems prevent configuration of skype:
skype depends on lib32stdc++6 (>= 4.1.1-21); however:
Package lib32stdc++6 is not installed.
skype depends on lib32asound2 (>> 1.0.14); however:
Package lib32asound2 is not installed.
skype depends on ia32-libs; however:
Package ia32-libs is not installed.
skype depends on libc6-i386 (>= 2.7-1); however:
Package libc6-i386 is not installed.
skype depends on lib32gcc1 (>= 1:4.1.1-21+ia32.libs.1.19); however:
Package lib32gcc1 is not installed.

To get skype working properly, the following steps must be taken

sudo dpkg -r ia32-libs

sudo apt-get update
sudo apt-get install -f

After this, running

sudo dpkg -i skype-ubuntu_4.0.0.8-1_amd64.deb

should install skype.

Press Ctrl-o (also written C-o) to access the actions window, where you can add buddies, plugins, and other things.

CHANGING FINCH KEYBINDINGS
Switching to XMonad can create a keybinding conflict with Finch. To correct this, create a file named ~/.gntrc with the following text

[GntWM::binding]
c-n = window-next
c-p = window-prev
c-r = start-resize
c-t = start-move

TRANSFERRING FILES WITH NETCAT
On the destination server, set up a listen port by doing nc -l PORT > FILENAME, where PORT is the listen port to connect to, and FILENAME is the filename you want to have on the destination server. The port could be any number, but you probably want it to be >1024, since ports below 1024 require root privileges on Linux.

Now on the source server, do
nc IP PORT < FILE, where IP is the ip of the destination server, PORT is the same as above, and FILE is name of the file to transfer.

An example, with IP = 192.168.10.5, PORT = 9001, FILE=backup.iso

Destination:
nc -l 9001 > backup.iso

Source:
nc 192.168.10.5 9001 < backup.iso

GET THE FULL KERNEL SOURCE FOR CENTOS 6.X AND COMPILE AGAINST IT
First, follow the instructions from the link below to get the full kernel source. To confirm that this was actually a complete kernel, I also performed a make in ~/rpmbuild/BUILD/kernel-2.6.32-220.23.1.el6/ . This compiled the entire kernel, and proved to me that this was the full kernel source (unlike the packages kernel-devel and kernel-headers).

http://wiki.centos.org/HowTos/I_need_the_Kernel_Source

Once the "test build" was complete, I needed to build my kernel module against that kernel source. This was accomplished by doing ./configure --kernel-source-path=/home/myname/rpmbuild/BUILD/kernel-2.6.32-220.23.1.el6/ , followed by make && make install

USING EVOLUTION WITH AN EXCHANGE SERVER
First, install the evolution, evolution-mapi, and evolution-exchange packages. On CentOS 6.3, this was accomplished by the command sudo yum install evolution evolution-mapi evolution-exchange

When setting up the account, select Exchange MAPI for the Server Type on the Receiving Email tab. It may also be necessary to use the IP address of your Exchange server instead of the name. The domain name should be the prefix you normally see on your Windows login, i.e. for WONDERCO\jdoe the settings could be
Server: 192.168.1.101
Domain name: WONDERCO

IMPORTANT: Before you click View->Calendar, disable the "Prefer Plain Text" plugin by going to Edit->Plugins, then unchecking the "Prefer Plain Text" box. I didn't do this, and got trapped in the calendar, which would freeze evolution every time it was launched. If this happens, reset your settings by following the instructions at the link below.
http://www.samlesher.com/tag/delete-evolution-settings

https://www.linux.com/learn/tutorials/370590:connect-evolution-to-an-exchange-server
http://mylinuxnotebook.blogspot.com/2008/09/evolution-ms-outlook-invitation.html

SETTING UP EMACS24 AND CLOJURE STARTER KIT
Ubuntu
sudo add-apt-repository ppa:cassou/emacs
sudo apt-get update
sudo apt-get install emacs-snapshot
Once the ppa and emacs are installed, start emacs by doing emacs-snapshot. Hit C-x C-f (Ctrl and x, then Ctrl and f) and type in the path ~/.emacs.d/init.el. Add the following text to this file

(require 'package)
;; starter-kit lives in the Marmalade repository, so add it to the package archives
(add-to-list 'package-archives '("marmalade" . "http://marmalade-repo.org/packages/") t)
(package-initialize)

Now do M-x eval-buffer, then M-x package-refresh-contents, then M-x package-install. When prompted, type starter-kit. Now you are off! I also installed starter-kit-lisp
Most of the above information is just parroting the information found at the following links.

Installing EVIL mode for VIM keys
M-x package-install RET evil RET  
M-x package-install RET evil-leader RET 
M-x package-install RET evil-numbers RET  

Now add the following line to ~/.emacs.d/init.el
(evil-mode 1)

Other packages
auto-complete
auto-indent-mode

https://github.com/technomancy/clojure-mode/blob/master/doc/gnu-linux-setup.md
https://github.com/technomancy/emacs-starter-kit
http://www.emacswiki.org/emacs/Evil

IPTABLES
Redirecting traffic to different ports - in this case, redirect standard http and https requests to a server which doesn't run as root (all ports < 1024 require root access)

sudo iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to-port 8080
sudo iptables -t nat -A PREROUTING -p tcp --dport 443 -j REDIRECT --to-port 8443

To show the new rules, do
iptables -t nat --list

## Saturday, February 9, 2013

### Kinect Interaction with Python

SOFTWARE IS THE NEW HARDWARE, WEB IS THE NEW SOFTWARE
The following instructions apply to Ubuntu 12.10 64-bit. Similar commands should work on other distributions, but I am personally running Ubuntu.

Run the following apt-get command to get the necessary packages for building libfreenect
sudo apt-get install gcc g++ ffmpeg libxi-dev libxmu-dev freeglut3 freeglut3-dev cmake git

Once that is done, go to your personal project directory (I keep all my personal projects in ~/proj) and clone the latest libfreenect software
cd ~/proj
git clone http://www.github.com/OpenKinect/libfreenect
mkdir libfreenect/build
cd libfreenect/build
cmake ../CMakeLists.txt
make

WE ARE DEMO
Now that the core library has been built, it is demo time! Make sure your Kinect is connected to your PC and powered on, then run the following commands
cd libfreenect/bin
sudo ./glview

You should see an image that looks something like this:

CONAN THE LIBRARIAN
Next you will need to add the libfreenect libraries to your library path. I prefer to do this by adding symlinks to /usr/local/lib, but you could also add libfreenect/lib directly to LD_LIBRARY_PATH.
sudo ln -s ~/proj/libfreenect/lib/libfreenect.so.0.1.2 /usr/local/lib/libfreenect.so
sudo ln -s ~/proj/libfreenect/lib/libfreenect.so.0.1.2 /usr/local/lib/libfreenect.so.0.1
sudo ln -s ~/proj/libfreenect/lib/libfreenect_sync.so.0.1.2 /usr/local/lib/libfreenect_sync.so
sudo ln -s ~/proj/libfreenect/lib/libfreenect_sync.so.0.1.2 /usr/local/lib/libfreenect_sync.so.0.1
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib

You might wish to put the export line in ~/.bashrc, so that the libfreenect libraries are always on the right path when a terminal is initialized.

PARAPPA THE WRAPPER(S)
To install the python wrappers and run the demo, we will need to also install opencv and the opencv python wrappers. Run the following command:
sudo apt-get install python-dev build-essential libavformat-dev ffmpeg libcv2.3 libcvaux2.3 libhighgui2.3 python-opencv opencv-doc libcv-dev libcvaux-dev libhighgui-dev

Now to install the python wrappers to the libfreenect libraries
cd libfreenect/wrappers
sudo python setup.py install

It is time to get the python demo, and prove that we can run it!
cd ~/proj
git clone http://www.github.com/amiller/libfreenect-goodies
cd libfreenect-goodies

Edit the demo_freenect.py file, changing the line
cv.ShowImage('both',np.array(da[::2,::2,::-1]))
to now say
cv.ShowImage('both',cv.fromarray(np.array(da[::2,::2,::-1])))

PYTHONIC PYTHARSIS
Once you are done editing the file, run the demo by doing
sudo python demo_freenect.py

If you are successful, you should see a screen like this:

If you see an image like the above - congratulations, you're now Kinect hacking with Python! I hope to expand on this exploration soon with code demos, doing different cool things and further exploring the hardware, but this is an excellent start. For any questions, just leave a comment below and I will try to get back to you!
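As a quick sanity check that the wrappers are working, something like this should grab a single depth frame and a single RGB frame using the sync interface (a minimal sketch - run it with sudo, like the demo, if your user cannot access the USB device directly):

#!/usr/bin/python
import freenect
import numpy as np

#Grab one depth frame and one RGB frame from the first Kinect found
depth, _ = freenect.sync_get_depth()
rgb, _ = freenect.sync_get_video()
print(np.asarray(depth).shape)
print(np.asarray(rgb).shape)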