ProFit/ProFound: The Full Monty

Get the latest version of ProFound and ProFit:

library(devtools)
install_github('asgr/ProFound')
install_github('ICRAR/ProFit')

Set global evaluate (basically TRUE for GitHub version and FALSE for CRAN):

evalglobal=FALSE

First load the libraries we need:

library(ProFit)
library(ProFound)
library(RColorBrewer)

So, you want to profile a galaxy, and you have next to no information about it. Well with ProFit no segmentation map, no sigma map, no PSF, no gain and no sky subtraction is (almost) no problem. Thanks to a number of utility functions we can achieve a reasonable fit bootstrapping directly from the data.

As an example, our mystery data will also be in a fairly crowded region, which will either require good (and conservative) image segmentation, or multiple object fitting.

First load the data:

image=readFITS(system.file("extdata", 'VIKING/mystery_VIKING_Z.fits', package="ProFound"))$imDat

And take a look at what we have got:

magimage(image, hicut=1)

It is Z-band data from the VIKING survey, and in the second image it is possible to see that the bright source close to our spiral galaxy of interest in the centre is in fact two very bright stars.

Making Inputs for ProFit

The first thing to do is to create an object segmentation of the full image. Here we input the known magnitude zero-point for this frame (30), and the pixel scale for VISTA VIRCAM (0.34 asec/pix). If these are not provided then outputs would be in terms on ADUs and pixels rather than AB mag and asec.

segim=profoundMakeSegim(image, magzero=30, pixscale=0.34, plot=TRUE)

This is a pretty decent start, and it has identified all the main sources of interest. We can look at the object information in the attached table:

head(segim$segstats)

The output is approximately in descending flux order. This data happened to have a mag-zero point of 30, so we can extract approximate magnitudes already. Our object of interest is the galaxy near the centre of the frame (178,178), so we already have approximate parameter properties for fitting purposes. E.g. the magnitude to start with is going to be close to 18.55 (notice since we provided a magzero argument before the flux is given in magnitudes already). This number will be useful later.

The next step is to expand out the segmantation for our target galaxy. this is a good idea because we want our model fit to descend into sky dominated pixels and not to be tightly defined by bright central pixels:

segim_expand=profoundMakeSegimExpand(image, segim$segim, expand=6, expandsigma=3, skycut=-1, magzero=30, pixscale=0.34, rotstats=TRUE, plot=TRUE)

Notice because we set expand=6 only our object of interest (with segID=6) has been expanded. The other sources are left as they were.

We can use this output to calculate an approximate sky surface brightness limit (taken as being the RMS of the sky):

SBlim=profitFlux2SB(segim_expand$skyRMS, magzero=30, pixscale=0.34)
print(SBlim)

Given this segmentation we can look at some statistics on our apparent sky pixels.

object_frac=length(which(segim_expand$objects==1))/length(image)
qnorm(1-object_frac)

The above tells us that at a skycut level of ~1.4 is the point where we would expect 50% of pixels to be False-Positive (i.e. positive sky fluctuations) rather than True-Positives (associated with real sources). If we cut higher than this then most of our pixels at the surface brightness threshold will be real.

It is also instructive to make the qqplot of sky pixels:

temp=qqnorm(sort((image[segim_expand$objects==0]-segim_expand$sky)/segim_expand$skyRMS), plot.it = FALSE)
magplot(0,0,xlim=c(-5,5),ylim=c(-5,5), grid=TRUE, type='n', xlab='Normal Quantile', ylab='Sky Pixel Quantile')
lines(temp)
abline(0,1,col='red')

As we should hope, the qqplot crosses at very close to [0,0], but we can see evidence of a sytematic excess for positive flux in our sky pixels. This is to be expected- the Universe contains only positive sources (resolved or not, faint or bright) so if we have Normal statistics to the negative side, we should always see an excess on the positive side. Given this nature of the astrophysical sky, we look to have recovered the absolute sky and RMS of the sky to within a few percent (try perturbing the sky and skyRMS values to see how this affects the qqplot).

Now we have a decent segmentation we can estimate the image gain and the sigma map:

gain=profoundGainEst(image,objects=segim_expand$objects, sky=segim_expand$sky, skyRMS=segim_expand$skyRMS)
gain

The number found is ~0.5. Since the gain estimation routine is typically accurate to a factor of a few, it is quite likely that the image is already in photo-electron counts (i.e. gain=1), but we will continue with our lower estimate regardless.

Now we have the segmentation, sky estimates and the gain, we can make a sgima map:

sigma=profoundMakeSigma(image, objects=segim_expand$objects, gain=gain, sky=segim_expand$sky, skyRMS =segim_expand$skyRMS, plot=TRUE)

The sigma map will look much liek our original image, but with sky pixels set to a fixed value of uncertainty, and object pixels having an additional component of shot-noise.

A Simple PSF Fit

Next we want to find the PSF for fitting, so we should extract a a bright but non-saturated star from the list above:

magplot(segim_expand$segstats$R50, segim_expand$segstats$con,log='x', ylim=c(0,1), col=hsv(magmap(segim_expand$segstats$axrat, flip=TRUE)$map), xlab='Major Axis / pix', ylab='Concentration')
magbar('topleft', title='Axrat', titleshift=1)
abline(h=c(0.4,0.65), lty=2)
abline(v=c(0.7,1.2), lty=2)

magplot(segim_expand$segstats$SB_N50, segim_expand$segstats$con, ylim=c(0,1), col=hsv(magmap(segim_expand$segstats$axrat, flip=TRUE)$map), xlab='SB[50] / mag/asec^2', ylab='Concentration', grid=TRUE)
magbar('topleft', title='Axrat', titleshift=1)
abline(v=c(23.5,2), lty=2)

magplot( segim_expand$segstats$asymm, segim_expand$segstats$con, ylim=c(0,1), col=hsv(magmap(segim_expand$segstats$axrat, flip=TRUE)$map), xlab='Asymmetry', ylab='Concentration', grid=TRUE)
magbar('topleft', title='Axrat', titleshift=1)
abline(v=c(0.4), lty=2)

We want to extract very red (axial ratio ~ 1) stars from the above region contained by the horizontal and vertical dashed lines:

starlist=segim_expand$segstats[segim_expand$segstats$axrat>0.9 & segim_expand$segstats$con>0.4 & segim_expand$segstats$con<0.65 & segim_expand$segstats$R50>0.7 & segim_expand$segstats$R50<1.2 & segim_expand$segstats$SB_N50<23.5 & segim_expand$segstats$asymm<0.2,]
print(starlist)

There are 5 sensible looking options it seems. We can extract the brightest of these likely stars to do our rough PSF estimate. When doing this in anger you might want to consider fitting all 5 stars (or however many make it through the selection cuts above). This gives you a way to marginalise over the uncertainty in the PSF (I discuss PSF uncertainty in some more detail at the end of this vignette).

psf_image=magcutout(image, loc=starlist[1,c('xcen','ycen')], box=c(31,31))
psf_sigma=magcutout(sigma, loc=starlist[1,c('xcen','ycen')], box=c(31,31))
psf_segim=magcutout(segim_expand$segim, loc=starlist[1,c('xcen','ycen')], box=c(31,31))

Look at our target star to check we are happy:

magimage(psf_image$image)
magimage(psf_sigma$image)
magimage(psf_segim$image)

Next we make our initial model, where we take our PSF estimates from the segmentation stats:

psf_x=psf_image$loc[1]
psf_y=psf_image$loc[2]
psf_mag=starlist[1,'mag']
psf_fwhm=starlist[1,'R50']*2/0.339
psf_con=1/starlist[1,'con']

psf_modellist=list(
  moffat=list(
    xcen=psf_x,
    ycen=psf_y,
    mag=psf_mag,
    fwhm=psf_fwhm,
    con=psf_con,
    axrat=1,
    box=0
  )
)

We also need to set up the tofit, tolog and intervals structure

psf_tofit=list(
  moffat=list(
    xcen=TRUE,
    ycen=TRUE,
    mag=TRUE,
    fwhm=TRUE,
    con=TRUE,
    axrat=FALSE,
    box=FALSE
  )
)

psf_tolog=list(
  moffat=list(
    xcen=FALSE,
    ycen=FALSE,
    mag=FALSE,
    fwhm=TRUE,
    con=TRUE,
    axrat=FALSE,
    box=FALSE
  )
)

psf_intervals=list(
  moffat=list(
    xcen=list(psf_x+c(-5,5)),
    ycen=list(psf_y+c(-5,5)),
    mag=list(psf_mag+c(-2,2)),
    fwhm=list(c(1,10)),
    con=list(c(1,10)),
    axrat=list(c(0.1,1)),
    box=list(c(-1,1))
  )
)

Since there are no other sources in the frame we can use the full psf_image to attempt our profile fit (so no need to provide psf_segim):

psf_Data=profitSetupData(psf_image$image, sigma=psf_sigma$image, modellist=psf_modellist, tofit=psf_tofit, tolog=psf_tolog, magzero=30, algo.func='optim', intervals=psf_intervals)

We can check our initial guess easily:

profitLikeModel(parm=psf_Data$init, Data=psf_Data, makeplots=TRUE, plotchisq=TRUE)

Not bad, but not perfect. We can do an optim fit to improve things:

psf_fit=optim(psf_Data$init, profitLikeModel, method='BFGS', Data=psf_Data, control=list(fnscale=-1))

Now we plot the output:

profitLikeModel(parm=psf_fit$par, Data=psf_Data, makeplots=TRUE, plotchisq=TRUE)

This looks like a very good fit. We will now contruct our model PSF:

psf_modellist_fit=profitRemakeModellist(parm=psf_fit$par, Data=psf_Data)$modellist
psf_modellist_fit$moffat$xcen=25/2
psf_modellist_fit$moffat$ycen=25/2
psf_model=profitMakeModel(modellist=psf_modellist_fit, dim=c(25,25))$z

And now we can look at our final PSF model:

magimage(psf_model)

Now we have all the key bits we need in order to fit our target galaxy!

An Advanced PSF Fit

A more complicated alternative is to fit all the good stars (5 in this case) with the same profile simultaneously. Basically we have 5 PSFs distributed around our image, but we want to fit them all with the same PSF, modulo different xcen, ycen and mag:

psf_x=starlist$xcen
psf_y=starlist$ycen
psf_mag=starlist$mag
psf_fwhm=starlist[1,'R50']*2/0.339
psf_con=1/starlist[1,'con']

psf_modellist2=list(
  pointsource = list(
    xcen = psf_x,
    ycen = psf_y,
    mag = psf_mag
  ),
  psf=list(
    moffat=list(
      mag=0,
      fwhm=psf_fwhm,
      con=psf_con,
      axrat=1,
      box=0
    )
  )
)

We need to make our own region map since our object to fit will not be centrally located. First we should expand our segments of interest (not always vital- it depends whether you think there is much PSF flux hidden in broad wings or not):

segim_expand_psf=profoundMakeSegimExpand(image, segim$segim, expand=starlist$segID, expandsigma=3, skycut=-1, magzero=30, pixscale=0.34, plot=TRUE)
psf_region=matrix(0, dim(image)[1], dim(image)[2])
psf_region[segim_expand$segim %in% starlist$segID]=1

We now need to prepare a more complicated fitting structure. We want to fit the 5 stars with the same PSF, modulo different xcen, ycen and mag:

psf_tofit2=list(
  pointsource = list(
    xcen = rep(TRUE,5),
    ycen = rep(TRUE,5),
    mag = rep(TRUE,5)
  ),
  psf=list(
    moffat=list(
      mag=FALSE,
      fwhm=TRUE,
      con=TRUE,
      axrat=FALSE,
      box=FALSE
    )
  )
)

psf_tolog2=list(
  pointsource = list(
    xcen = rep(FALSE,5),
    ycen = rep(FALSE,5),
    mag = rep(FALSE,5)
  ),
  psf=list(
    moffat=list(
      mag=FALSE,
      fwhm=TRUE,
      con=TRUE,
      axrat=TRUE,
      box=TRUE
    )
  )
)

psf_intervals2=list(
  pointsource = list(
    xcen=list(starlist$xcen[1]+c(-5,5), starlist$xcen[2]+c(-5,5), starlist$xcen[3]+c(-5,5), starlist$xcen[4]+c(-5,5), starlist$xcen[5]+c(-5,5)),
    ycen=list(starlist$ycen[1]+c(-5,5), starlist$ycen[2]+c(-5,5), starlist$ycen[3]+c(-5,5), starlist$ycen[4]+c(-5,5), starlist$ycen[5]+c(-5,5)),
    mag=list(starlist$mag[1]+c(-2,2), starlist$mag[2]+c(-2,2), starlist$mag[3]+c(-2,2), starlist$mag[4]+c(-2,2), starlist$mag[5]+c(-2,2))
  ),
  psf=list(
    moffat=list(
      mag=list(c(-1, 1)),
      fwhm=list(c(1,10)),
      con=list(c(1,10)),
      axrat=list(c(0.1,1)),
      box=list(c(-1,1))
    )
  )
)

From here everything is much simpler again, since we just feed our prepared objects into our setup function as per usual:

psf_Data2=profitSetupData(image, sigma=sigma, modellist=psf_modellist2, tofit=psf_tofit2, tolog=psf_tolog2, magzero=30, algo.func='optim', intervals=psf_intervals2, region=psf_region)

We check the initial guess. Notice the three stars are visible as the three circled regions. Given the image scale it is a bit tricky to see how good our initial guesses actually were, but probably not too bad.

profitLikeModel(parm=psf_Data2$init, Data=psf_Data2, makeplots=TRUE, plotchisq=TRUE)

So like before, not bad, but not perfect. We can do an optim fit to improve things:

psf_fit2=optim(psf_Data2$init, profitLikeModel, method='BFGS', Data=psf_Data2, control=list(fnscale=-1))

Now we plot the output:

profitLikeModel(parm=psf_fit2$par, Data=psf_Data2, makeplots=TRUE, plotchisq=TRUE)

Maybe useful to zoom in a bit:

psf_modellist_fit2=profitRemakeModellist(parm=psf_fit2$par, Data=psf_Data2)$modellist
psf_model_full2=profitMakeModel(psf_modellist_fit2,dim=dim(image),magzero=30)$z
for(i in 1:5){
  magimage(magcutout(image, loc=starlist[i,c('xcen','ycen')], box=c(51,51))$image, col=rev(colorRampPalette(brewer.pal(9, "RdYlBu"))(100)), magmap=FALSE, zlim=c(-2e2,2e2))
  magimage(magcutout(image-psf_model_full2, loc=starlist[i,c('xcen','ycen')], box=c(51,51))$image, col=rev(colorRampPalette(brewer.pal(9, "RdYlBu"))(100)), magmap=FALSE, zlim=c(-2e2,2e2))
}

This looks like a very good fit. We will now contruct our model PSF:

psf_modellist_fit2$psf$moffat$xcen=25/2
psf_modellist_fit2$psf$moffat$ycen=25/2
psf_model2=profitMakeModel(modellist=psf_modellist_fit2$psf, magzero=30, dim=c(25,25))$z

And now we can look at our final PSF model:

magimage(psf_model2)

Fitting the Target Galaxy with ProFit

Much like we did with the PSF fit, we have to extract the region of interest and setup our fitting structures:

gal_image=magcutout(image, loc=segim_expand$segstats[6,c('xcen','ycen')], box=c(101,101))
gal_sigma=magcutout(sigma, loc=segim_expand$segstats[6,c('xcen','ycen')], box=c(101,101))
gal_segim=magcutout(segim_expand$segim, loc=segim_expand$segstats[6,c('xcen','ycen')], box=c(101,101))

Check the cutout regions:

magimage(gal_image$image)
magimage(gal_sigma$image)
magimage(gal_segim$image)

As before, we have to setup some reasonable guesses for our galaxy fit. We give it a very faint initial bulge since we can tell it is a very disky system and we are only using a simple optimiser so we do not want it to get stuck in an odd local minima with a massive bulge. To do this properly and with psychic inputs you would use a better optimiser (e.g. full MCMC or genetic algorithm).

gal_x=gal_image$loc[1]
gal_y=gal_image$loc[2]
gal_mag=segim_expand$segstats[6,'mag']
gal_re=segim_expand$segstats[6,'R50']/0.339
gal_ang=segim_expand$segstats[6,'ang']
gal_axrat=segim_expand$segstats[6,'axrat']

gal_modellist=list(
  sersic=list(
    xcen=rep(gal_x,2),
    ycen=rep(gal_y,2),
    mag=rep(gal_mag,2)+c(2,0.7),
    re=c(gal_re/4,gal_re),
    nser=c(4,1),
    ang=c(0,gal_ang),
    axrat=c(1,gal_axrat),
    box=rep(0,2)
  )
)

And now we make the other inputs we need for our fitting Data structure:

gal_tofit=list(
  sersic=list(
    xcen= c(TRUE,NA), #We fit for xcen and tie the two togther
    ycen= c(TRUE,NA), #We fit for ycen and tie the two togther
    mag= c(TRUE,TRUE), #Fit for both
    re= c(TRUE,TRUE), #Fit for both
    nser= c(FALSE,FALSE), #Fit for neither
    ang= c(FALSE,TRUE), #Fit for disk
    axrat= c(FALSE,TRUE), #Fit for disk
    box= c(FALSE,FALSE) #Fit for neither
  )
)

gal_tolog=list(
  sersic=list(
    xcen= c(FALSE,FALSE),
    ycen= c(FALSE,FALSE),
    mag= c(FALSE,FALSE),
    re= c(TRUE,TRUE), #re is best fit in log space
    nser= c(TRUE,TRUE), #nser is best fit in log space
    ang= c(FALSE,FALSE),
    axrat= c(TRUE,TRUE), #axrat is best fit in log space
    box= c(FALSE,FALSE)
  )
)

gal_intervals=list(
  sersic=list(
    xcen=list(lim=gal_x+c(-5,5),lim=gal_x+c(-5,5)),
    ycen=list(lim=gal_y+c(-5,5),gal_y+c(-5,5)),
    mag=list(lim=c(10,30),lim=c(10,30)),
    re=list(lim=c(1,5),lim=c(1,100)),
    nser=list(lim=c(0.5,20),lim=c(0.5,20)),
    ang=list(lim=c(-180,360),lim=c(-180,360)),
    axrat=list(lim=c(0.001,1),lim=c(0.001,1)),
    box=list(lim=c(-1,1),lim=c(-1,1))
  )
)

Now we can set up our full fitting structure with averything we have made (notice we use our simpler PSF model here):

gal_Data=profitSetupData(gal_image$image, sigma=gal_sigma$image, modellist=gal_modellist, tofit=gal_tofit, tolog=gal_tolog, magzero=30, algo.func='optim', intervals=gal_intervals, psf=psf_model, segim=gal_segim$image)
profitLikeModel(parm=gal_Data$init, Data=gal_Data, makeplots=TRUE, plotchisq=TRUE)

Not a bad start, but certainly not great. Let's do an optim fit:

gal_fit=optim(gal_Data$init, profitLikeModel, method='BFGS', Data=gal_Data, control=list(fnscale=-1))

We can now recreate our optimised modellist:

gal_fit_modellist=profitRemakeModellist(gal_fit$par, Data=gal_Data)$modellist
print(gal_fit_modellist)

And we can then plot the output:

profitLikeModel(parm=gal_fit$par, Data=gal_Data, makeplots=TRUE, plotchisq=TRUE)
profitEllipsePlot(Data=gal_Data, modellist=gal_fit_modellist, pixscale=0.34, SBlim=SBlim)

This looks pretty good!

Conclusions

So there we have it- using image pixel data and no header information except for the magnitude zero-point (which we can always calibrate later by matching to known sources) we have managed to achieve a very reasonable looking galaxy fit in a complex region with bright nearby contaminating sources. Our full chain of work was:

  1. Load in the image data
  2. Make a segmentation map (segim), which also estimates sky properties (profitMakeSegim and profitMakeSegimExpand)
  3. Calculate the likely elec/ADU gain of the image (profitGainEst)
  4. Make a sigma map (profitMakeSigma)
  5. Find a subset of likely stars, and extract the brightest one
  6. Fit a Moffat profile to this bright star and generate a model PSF
  7. Cut down our data to the region around the galaxy of interest
  8. Estimate initial B/D parameters from the segimentation map statistics
  9. Using all of the data created (including the new PSF) fit an exponential disk + de-Vaucouleurs bulge profile

Nearly all of the above can be automated for user data of interest, and should act as a good outline of how you can use ProFit to fit a range of data where not all of the meta information is to hand.

For various reasons users might get better results from using, e.g., SExtractor as part of their processing pipeline rather than the internal utility functions, but there should be enough to get you going on your virtuous path of galaxy fitting ;-)

In all of this, the biggest gotcha is no doubt achieving a decent estimate of the PSF. Here we fit to a single star using a Moffat function, but with HST data you will almost certainly need to use Tiny-Tim, and many people advocate using shapelet fitting programs like PSF-Ex and their ilk.

For space derived data you probably do want to use tools provided by the relevant facility, but for most ground imaging better results can be achieved through fitting analytic models to stars near your target galaxy (ideally multiple stars, so you can reject bad fits etc.) and constructing an idealised PSF from the analytic model. If you are feeling really adventurous you could even marginalise over your uncertainty in the PSF by re-doing your fit with samples from the PSF model posterior.

The other parts of the process mentioned above are much more straight-forward, and there are lots of routes (including the above) that will give you decent results. In my experience when things go badly wrong (a few percent of the time when trying to fully automate all of this) the main culprit is bad segmentation due to nearby very bright stars. In these situations you probably should not be trying to achieve a good fit given the data quality. Issues with the sigma maps going wrong are easily spotted when looking at the error residuals for a good fit- the main culprit here will usually be a mistake in the gain (e.g. using inverse gain as gain). Remember: image ADUs x gain = image photo-electrons! If really unsure use the provided profitGainEst function to sanity check your value.



Try the ProFit package in your browser

Any scripts or data that you put into this service are public.

ProFit documentation built on Nov. 11, 2019, 5:07 p.m.