Astropy Data Tables

In our notebooks we are keeping the SDSS data in an astropy data table called SDSS. A data table has rows and columns.

Rows are the objects (stars, galaxies), while columns are the properties of the objects (RA, Dec, g-mag, etc).

SDSS['g'] is the column of g-magnitudes for all the objects. A name in quotes inside the brackets always refers to a column!

You can make new columns simply by defining them:

SDSS['abs_gmag'] = SDSS['g'] - (5*np.log10(dist_pc) - 5) # dist_pc is the distance in parsecs; needs numpy imported as np

SDSS['g-r'] = SDSS['g'] - SDSS['r']

You can list the names of all the columns with SDSS.colnames

Each row holds all the data for a single object, which you can access by adding a bracketed number:

SDSS['g'][12] # the g-magnitude for the 13th object in the table (remember python numbering starts at 0)

SDSS[99] # show all the information for the 100th object in the table

We can access subsets of the full data table using selection flags like this:

has_redshift = SDSS['redshift'] != -999

SDSS[has_redshift] # subset of just the things that have a redshift.

SDSS[~has_redshift] # subset of just the things that do NOT have a redshift (~ flips the flag)

You can display a sortable table in your web browser:

SDSS[has_redshift].show_in_browser(jsviewer=True) # but don’t do this for the full table!

Plotting/Analysis tips

When plotting mags and colors, don't autoscale, or you'll get unreadable plots. For CMDs, for example, reasonable limits on the magnitude range would be r = 13-24, and limits on the color range would be g-r = -1 to +2.

When plotting lots of data points, make the marker sizes small. Try something like scatter(x,y,s=1)

An easy way to select subsamples is to set a selection flag like this:

want = (SDSS['g']<20) # for selecting objects with a g mag brighter than 20
or
want = (np.abs(SDSS['redshift']-0.1)<0.05) # for selecting objects in the redshift range 0.05 to 0.15
or
want = (SDSS['type']==3) # for selecting spatially extended sources (i.e., not point sources)

followed by (for example):

# plot the whole sample
scatter(SDSS['r'], SDSS['g']-SDSS['r'], s=1)
# then overplot the subsample
scatter(SDSS['r'][want], SDSS['g'][want]-SDSS['r'][want],s=20,color='red')
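Putting the plotting tips together, here is a sketch that sets the axis limits by hand instead of autoscaling. It uses randomly generated magnitudes in place of the SDSS columns and the matplotlib.pyplot interface rather than a bare scatter call; both are assumptions made so that the example runs on its own.

```python
import numpy as np
import matplotlib.pyplot as plt

# Fake magnitudes standing in for SDSS['r'] and SDSS['g'] (not real data).
rng = np.random.default_rng(0)
r = rng.uniform(13, 24, 500)
g = r + rng.normal(0.6, 0.4, 500)

want = g < 20  # selection flag: g mag brighter than 20

fig, ax = plt.subplots()
ax.scatter(r, g - r, s=1)                              # whole sample, small markers
ax.scatter(r[want], (g - r)[want], s=20, color='red')  # subsample overplotted

# Fixed limits, using the ranges suggested above.
ax.set_xlim(13, 24)
ax.set_ylim(-1, 2)
ax.set_xlabel('r')
ax.set_ylabel('g - r')
```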

You can also "stack" selections like this:

bright = SDSS['g']<18 # has a g-mag brighter than 18
red = (SDSS['g']-SDSS['r']>0.7) # is redder than g-r=0.7
has_redshift = (SDSS['redshift'] != -999) # has a measured redshift
want = np.logical_and(bright,red)
want = np.logical_and(want,has_redshift)

which would give you a "want" selection of bright, red galaxies with measured redshifts.
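For boolean arrays, the & operator does the same job as np.logical_and and lets you combine several flags on one line. A sketch using plain numpy arrays with made-up values in place of the SDSS columns:

```python
import numpy as np

# Made-up values standing in for SDSS['g'], SDSS['r'] and SDSS['redshift'].
g = np.array([17.5, 17.9, 19.0, 17.2])
r = np.array([16.6, 17.5, 18.0, 16.4])
redshift = np.array([0.08, 0.10, 0.12, -999.0])

bright = g < 18                # has a g-mag brighter than 18
red = (g - r) > 0.7            # is redder than g-r=0.7
has_redshift = redshift != -999

# Combine all three flags at once; True only where every flag is True.
want = bright & red & has_redshift

print(want)
```

When writing the comparisons inline, keep each one in parentheses, e.g. (g < 18) & (redshift != -999), because & binds more tightly than < and !=.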