vsiniDeconvolution


The rotational velocity of stars can be measured with several standard diagnostic methods. However, if the star's rotation axis is inclined with respect to the line of sight (LOS), the measured velocity is not the equatorial velocity, but

$$ v = v_e \sin i $$

where $v$ is the measured velocity, $v_e$ is the equatorial rotation velocity of the star (assuming that the star rotates as a solid body) and $i$ is the inclination angle of the rotation axis with respect to the LOS. In this article we first consider how to estimate the equatorial velocity from the measured velocity, and then how to combine many observations to obtain the distribution of equatorial velocities under the assumption that the rotation axes are isotropically distributed.

Single star

If we make the substitution $\mu=\cos i$, then our generative model (the model that describes how the measurement is obtained) is given by:

$$ v = v_e \sqrt{1-\mu^2} + \epsilon $$

where we have assumed that the errors are additive and characterized by the statistical properties of $\epsilon$. For instance, if each measurement has a mean value and an uncertainty, we can take $\epsilon$ to be Gaussian with zero mean and known variance $\sigma_n^2$. For any other noise model, once its distribution is specified, it enters the subsequent analysis in the same way.

The change of variables from $i$ to $\mu$ is helpful because, for an isotropic distribution of rotation axes in three dimensions, every value of $\mu$ is equally probable, so that a suitable prior is $p(\mu)=1$ on $[0,1]$.
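As a quick sanity check of this prior, we can draw isotropically oriented unit vectors and verify that $\mu=\cos i$ is indeed uniformly distributed (and, as a by-product, that the mean projection factor is $\langle \sin i \rangle = \pi/4 \approx 0.79$). This is only a minimal sketch, independent of the analysis code below:

import numpy as np

np.random.seed(0)
n = 100000
xyz = np.random.randn(n, 3)                      # isotropic directions from a 3D Gaussian
xyz /= np.linalg.norm(xyz, axis=1)[:, None]      # normalize to unit vectors
mu = np.abs(xyz[:, 2])                           # |cos i| with respect to the LOS (z axis)
print(np.histogram(mu, bins=5, range=(0.0, 1.0))[0] / float(n))   # ~0.2 per bin -> uniform
print(np.mean(np.sqrt(1.0 - mu**2)))                              # ~0.785 = pi/4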

Our aim is then to construct the posterior distribution $p(v_e|v)$, which gives the probability of a certain value of the equatorial velocity given that we have measured the projected rotation velocity $v$. To find this distribution, we first build the full posterior, obtained by applying Bayes' theorem:

$$ p(v_e,\mu|v) = \frac{1}{Z} p(v|v_e,\mu) p(v_e,\mu) $$

where $Z$ is a normalization constant that is of no interest in our problem because it does not depend on any of the variables, so we drop it from now on. Marginalizing over $\mu$:

$$ p(v_e|v) = \int_0^1 d\mu p(v_e,\mu|v) = p(v_e) \int_0^1 d\mu p(v|v_e,\mu) p(\mu) $$

where the last step assumes that the priors for $v_e$ and $\mu$ factorize. Assuming an isotropic prior and Gaussian statistics for the measurement process, we find

$$ p(v_e|v) = p(v_e) \int_0^1 d\mu \exp \left[ -\frac{\left( v - v_e \sqrt{1-\mu^2} \right)^2}{2\sigma_n^2}\right] $$
Example
In the code below we show an example of a star for which we measure a projected rotation velocity of 20 km/s with an uncertainty of $\pm$3 km/s. We show the posterior distribution for the equatorial velocity; upper limits at a given confidence level can then be derived easily.
In [1]:
import numpy as np
import matplotlib.pyplot as pl
import scipy.integrate
import emcee
import pdb
import scipy.misc as mi   # provides logsumexp (located in scipy.special in newer SciPy versions)
%matplotlib inline
In [184]:
class singleCase(object):
    """Posterior p(v_e|v) for a single star: direct numerical integration
    over mu and MCMC sampling of (v_e, mu) with emcee."""
    def __init__(self, vp, noise, N, upper):
        self.vp = vp          # measured projected velocity v [km/s]
        self.noise = noise    # noise standard deviation sigma_n [km/s]
        self.N = N            # number of points in the v_e grid
        self.upper = upper    # the grid extends up to upper*vp

        self.muLower = 0.0
        self.muUpper = 1.0
		
    def fun1(self, x, v):
        # integrand of the mu marginalization: Gaussian likelihood of v given (v_e, mu=x)
        return np.exp(-(self.vp-v*np.sqrt(1.0-x**2))**2 / (2.0*self.noise**2))
	
    def integrate(self):
        # evaluate p(v_e|v) on a grid of v_e by integrating the likelihood over mu
        self.v = np.linspace(0.0,self.upper*self.vp,self.N)
        self.pv = np.zeros(self.N)
        for i in range(self.N):
            self.pv[i] = scipy.integrate.quad(self.fun1, 0.0, 1.0, args=(self.v[i],))[0]
    
    def transformVariables(self, theta):
        # map the unconstrained sampling variables theta to (v_e, mu): exponential
        # transform for v_e > 0 and logistic transform for mu in (muLower, muUpper);
        # lnGrad is the log-Jacobian of the transformation
        veNew = np.exp(theta[0])
        muNew = self.muLower + (self.muUpper - self.muLower) / (1.0 + np.exp(-theta[1]))
        thetaNew = np.asarray([veNew, muNew])
        lnGrad = np.log(self.muUpper-self.muLower) - theta[1] - 2.0 * np.log(1.0 + np.exp(-theta[1])) + theta[0]
        return thetaNew, lnGrad
    
    def logPosterior(self, theta):
        # Gaussian log-likelihood in the transformed variables, plus the log-Jacobian
        # and a -log(v_e) prior term on the equatorial velocity
        thetaNew, lnGrad = self.transformVariables(theta)
        if (thetaNew[0] > 0.01 and thetaNew[1] > 0.01):            
            return -(self.vp-thetaNew[0]*np.sqrt(1.0-thetaNew[1]**2))**2 / (2.0*self.noise**2) + lnGrad - np.log(thetaNew[0])
        return -np.inf
    
    def sample(self):
        # sample (v_e, mu) with emcee and transform the chain back to physical variables
        ndim, nwalkers = 2, 100                          
        
        self.theta0 = np.asarray([ np.log(2.0*np.log(self.vp)), np.log( (0.0-0.5) / (0.5-1.0) ) ]) 
        p0 = [self.theta0 + 0.01*np.random.randn(ndim) for i in range(nwalkers)]
        
        self.sampler = emcee.EnsembleSampler(nwalkers, ndim, self.logPosterior)
        self.sampler.run_mcmc(p0, 1000)
        self.sampler.flatchain[:,0] = np.exp(self.sampler.flatchain[:,0])
        self.sampler.flatchain[:,1] = self.muLower + (self.muUpper - self.muLower) / (1.0 + np.exp(-self.sampler.flatchain[:,1]))
			
    def plot(self):
        f, ax = pl.subplots(nrows=1, ncols=2, figsize=(12,6))
        ax[0].plot(self.v,self.pv)
        ax[0].set_xlabel(r'$v_e$ [km/s]')
        ax[0].set_ylabel(r'p($v_e|v$)')
        ax[0].set_title('v=20 km/s')
        
        self.cdf = np.cumsum(self.pv)
        v95 = np.interp(0.95,self.cdf / np.max(self.cdf),self.v)
        ax[1].plot(self.v,self.cdf / np.max(self.cdf))
        ax[1].set_xlabel(r'$v_e$ [km/s]')
        ax[1].set_ylabel(r'cdf($v_e|v$)')
        ax[1].set_title('v=20 km/s')
        ax[1].axhline(0.95)
        ax[1].annotate('v(<95%)={0:5.1f} km/s'.format(v95),(v95,0.95),(200,0.8),arrowprops=dict(facecolor='black', shrink=0.05))
        pl.tight_layout()
In [185]:
res = singleCase(20.0,3.0,500,25)
In [179]:
res.integrate()
res.plot()
In [186]:
res.sample()
In [189]:
pl.plot(res.sampler.flatchain[:,0],res.sampler.flatchain[:,1],'.')
pl.xlim([0,300])
Out[189]:
(0, 300)

Combination of many stars

When we measure the rotation velocity of several stars, it is of interest to combine them all to obtain the distribution of equatorial velocities under the assumption of an isotropic distribution of rotation axes. We follow here a hierarchical Bayesian approach to obtain the global distribution of equatorial velocities. To this end, we parameterize the global distribution of equatorial velocities with the set of parameters $\theta_v$. The full posterior for all the observed stars (all included in $D$) is then given by:

$$ p(\mathbf{v}_e,\mathbf{\mu},\theta_v|D) = p(D|\mathbf{v}_e,\mathbf{\mu},\theta_v) p(\mathbf{\mu}) p(\mathbf{v}_e|\theta_v) p(\theta_v) $$

where we have assumed that the prior for $\mu$ factorizes and we have made the hierarchical structure explicit. The graphical model is the following:

In [87]:
import daft
from matplotlib import rc

rc("font", family="serif", size=12)
rc("text", usetex=True)

pgm = daft.PGM([4.0, 3.5], origin=[0, 0])

pgm.add_node(daft.Node("theta", r"${\mbox{\boldmath$\theta$}}$", 1, 3, scale=1.3))
pgm.add_node(daft.Node("ve", r"$v_e$", 1, 2, scale=1.3))
pgm.add_node(daft.Node("mu", r"$\mu$", 2.5, 2, scale=1.3))
pgm.add_edge("theta", "ve")

pgm.add_node(daft.Node("v", r"$v_i$", 1, 1, scale=1.3))
pgm.add_node(daft.Node("vobs", r"$v^\mathrm{obs}_i$", 2.5, 1, observed=True, scale=1.3))

pgm.add_edge("ve", "v")
pgm.add_edge("mu", "v")
pgm.add_edge("v", "vobs")

pgm.add_plate(daft.Plate([0.2, 0.2, 3.2, 2.35], label=r"Star $i$"))
pgm.render()
Out[87]:
<matplotlib.axes.Axes at 0x504c910>

Assuming that the observations of different stars are completely uncorrelated, we can factorize the likelihood and the priors, so that we can rewrite the posterior as:

$$ p(\mathbf{v}_e,\mathbf{\mu},\theta_v|D) = \left[ \prod_{i=1}^N p(v_i|v_{ei},\mu_i) \right] \left[ \prod_{i=1}^N p(\mu_i) \right] \left[ \prod_{i=1}^N p(v_{ei}|\theta_v) \right] p(\theta_v) $$

We are interested in obtaining the posterior over the hyperparameters of the global distribution, so that:

$$ p(\theta_v|D) = p(\theta_v) \prod_{i=1}^N \left[ \int_0^1 d\mu_i \int_0^\infty dv_{ei} p(v_i|v_{ei},\mu_i) p(\mu_i) p(v_{ei}|\theta_v) \right] $$

The aim is then to find a flexible and suitable parameterization of the global distribution $p(v_{ei}|\theta_v)$ and to sample from the marginal posterior distribution to get information about its parameters. A very general distribution defined on the interval $[0,\infty)$ is the generalized gamma distribution:

$$ p(v_{ei}|a,d,p) = \frac{p}{a^d} \frac{v_{ei}^{d-1} e^{-(v_{ei}/a)^p}}{\Gamma(d/p)}. $$
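As a quick illustration (a sketch only, not part of the analysis code), the Maxwellian distribution used in the example below is a particular case of this family, recovered for $d=3$, $p=2$ and $a=\sqrt{2}\sigma$:

import numpy as np
from scipy.special import gamma as gammaFunc

def generalizedGamma(x, a, d, p):
    # p(x|a,d,p) = (p/a^d) x^(d-1) exp(-(x/a)^p) / Gamma(d/p)
    return (p / a**d) * x**(d-1) * np.exp(-(x/a)**p) / gammaFunc(d/p)

sigma = 50.0
x = np.linspace(1.0, 300.0, 5)
maxwellian = np.sqrt(2.0/np.pi) * x**2 * np.exp(-x**2/(2.0*sigma**2)) / sigma**3
print(np.allclose(generalizedGamma(x, np.sqrt(2.0)*sigma, 3.0, 2.0), maxwellian))   # True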

Importance sampling trick

A difficulty of the previous approach is that the integrals over $\mu$ and $v_e$ do not have an analytic representation and have to be carried out numerically for every value of $\theta_v$, so that sampling the full posterior becomes computationally demanding. To overcome this difficulty, we apply an importance sampling trick. To this end, we multiply and divide each per-star integrand by an "interim" prior distribution $\pi(v_{ei})$. This prior should be sufficiently broad to give non-zero mass to the region of parameter space where the full posterior is non-negligible. In this case, we can write:

$$ p(\theta_v|D) = p(\theta_v) \prod_{i=1}^N \left[ \int_0^1 d\mu_i \int_0^\infty dv_{ei}\, p(v_i|v_{ei},\mu_i) p(\mu_i) \pi(v_{ei}) \frac{p(v_{ei}|\theta_v)}{\pi(v_{ei})} \right] $$

Each of the integrals can then be interpreted as the expectation of the quantity $p(v_{ei}|\theta_v)/\pi(v_{ei})$ under the posterior of star $i$ obtained with the interim prior, so that we end up with the final result:

$$ p(\theta_v|D) = p(\theta_v) \prod_{i=1}^N E \left[ \frac{p(v_{ei}|\theta_v)}{\pi(v_{ei})} \right] $$

or, taking logarithms:

$$ \log p(\theta_v|D) = \log p(\theta_v) + \sum_{i=1}^N \log \left( E \left[ \frac{p(v_{ei}|\theta_v)}{\pi(v_{ei})} \right] \right) $$
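In practice, each expectation can be approximated by a Monte Carlo average over the $M$ posterior samples $v_{ei}^{(m)}$ obtained for star $i$ under the interim prior, which is conveniently evaluated in log-space (this is the role of the logsumexp call in the code below):

$$ \log E \left[ \frac{p(v_{ei}|\theta_v)}{\pi(v_{ei})} \right] \approx \log \sum_{m=1}^M \frac{p(v_{ei}^{(m)}|\theta_v)}{\pi(v_{ei}^{(m)})} - \log M $$

The $-\log M$ term is the same for every value of $\theta_v$, so it is irrelevant for the sampling.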

Therefore, once Markov chain Monte Carlo samples obtained under the interim prior are available for each individual star, they can be reused to sample from the marginal posterior distribution of the hyperparameters.

Example

Let us generate a sample of stars whose equatorial velocities follow a Maxwellian distribution and simulate artificial measurements.

In [206]:
nStar = 20
sigma = 50.0          # Maxwellian parameter [km/s]
sigmaNoise = 4.0      # measurement noise [km/s]
# isotropic Gaussian velocity components -> the modulus follows a Maxwellian distribution
vx = sigma*np.random.randn(nStar)
vy = sigma*np.random.randn(nStar)
vz = sigma*np.random.randn(nStar)
v = np.sqrt(vx**2+vy**2+vz**2)
# project with random isotropic inclinations (mu uniform in [0,1]) and add Gaussian noise
vMeasured = v * np.sqrt(1.0-np.random.rand(nStar)**2) + sigmaNoise * np.random.randn(nStar)
In [207]:
f, ax = pl.subplots(nrows=1, ncols=2, figsize=(12,6))
ax = ax.flatten()
ax[0].hist(vMeasured)
ax[1].hist(v)
Out[207]:
(array([ 4.,  3.,  2.,  1.,  1.,  2.,  4.,  1.,  0.,  2.]),
 array([  35.33745414,   47.77818351,   60.21891288,   72.65964225,
         85.10037162,   97.54110099,  109.98183036,  122.42255973,
        134.8632891 ,  147.30401846,  159.74474783]),
 <a list of 10 Patch objects>)

Now let us obtain the posterior samples for each one of the stars.

In [208]:
# store 1000 posterior samples of (v_e, mu) per star, drawn under the interim prior
samples = np.zeros((1000,2,nStar))
for i in range(nStar):
    res = singleCase(vMeasured[i],sigmaNoise,500,25)
    res.sample()
    ind = np.random.permutation(res.sampler.flatchain[:,0].size)[0:1000]
    samples[:,:,i] = res.sampler.flatchain[ind,:]
In [210]:
np.save('samples.npy',samples)
In [8]:
class hyperparImportanceSampling(object):
    """Sample the Maxwellian hyperparameter sigma by reusing the per-star
    posterior samples through the importance-sampling trick."""
    def __init__(self):
        samples = np.load('samples.npy')
        self.ve = samples[:,0,:]   # posterior samples of v_e, shape (1000, nStar)
        self.mu = samples[:,1,:]   # posterior samples of mu, shape (1000, nStar)
		    
    def transformVariables(self, theta):
        # sample in log(sigma) to enforce positivity; lnGrad is the log-Jacobian
        vNew = np.exp(theta[0])
        thetaNew = np.asarray([vNew])
        lnGrad = theta[0]
        return thetaNew, lnGrad
    
    def logPosterior(self, theta):
        # for each star, use logsumexp to evaluate the log of the sum of the Maxwellian
        # density over the stored v_e samples, then add the contributions of all stars
        thetaNew, lnGrad = self.transformVariables(theta)        
        logP = np.sum(mi.logsumexp(2.0*np.log(self.ve) - 3.0*np.log(thetaNew[0]) - 0.5*self.ve**2 / thetaNew[0]**2, axis=0))
        if (thetaNew[0]):            
            return lnGrad + logP
        return -np.inf
    
    def sample(self):
        # run emcee on log(sigma), discard the first half of each chain as burn-in
        # and transform back to sigma
        ndim, nwalkers = 1, 300                          
        
        self.theta0 = np.asarray([ np.log(20.0)]) 
        p0 = [self.theta0 + 0.01*np.random.randn(ndim) for i in range(nwalkers)]
        
        self.sampler = emcee.EnsembleSampler(nwalkers, ndim, self.logPosterior)
        self.sampler.run_mcmc(p0, 100)
        self.samples = np.exp(self.sampler.chain[:, 50:, :].reshape((-1, ndim)))
			
In [9]:
impSampl = hyperparImportanceSampling()
impSampl.sample()
In [10]:
pl.plot(impSampl.samples,'.')
Out[10]:
[<matplotlib.lines.Line2D at 0x2c832d0>]
In [12]:
print "Mean={0} - Std. deviation={1}".format(np.mean(impSampl.samples),np.std(impSampl.samples))
Mean=43.5361638918 - Std. deviation=5.17169527538

We note that the inferred value of the Maxwellian parameter is compatible with the input value of 50 km/s, given the uncertainty and the small number of stars.
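As a final visual check (a sketch using the session variables v and impSampl.samples defined in the cells above; newer Matplotlib versions use density=True instead of normed=True), we can overplot the Maxwellian implied by the posterior mean of $\sigma$ on the histogram of the true equatorial velocities:

sigmaFit = np.mean(impSampl.samples)
vGrid = np.linspace(0.0, 250.0, 200)
pdfFit = np.sqrt(2.0/np.pi) * vGrid**2 * np.exp(-vGrid**2/(2.0*sigmaFit**2)) / sigmaFit**3
pl.hist(v, normed=True)     # true equatorial velocities generated above
pl.plot(vGrid, pdfFit)      # Maxwellian with the inferred sigma
pl.xlabel(r'$v_e$ [km/s]')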