Running hera_sim from the command line

As of hera_sim v0.2.0, quick-and-dirty simulations can be run from the command line: write a configuration file, then use hera_sim’s run command to create simulated data that matches the configuration’s specifications. The basic syntax of the command-line interface is shown below (the command can be run from any directory once hera_sim is installed):

$ hera-sim-simulate.py run --help
usage: hera-sim-simulate.py [-h] [-o OUTFILE] [-v] [-sa] [--clobber] config

Run a hera_sim-managed simulation from the command line.

positional arguments:
  config                Path to configuration file.

options:
  -h, --help            show this help message and exit
  -o OUTFILE, --outfile OUTFILE
                        Where to save simulated data. Overrides outfile
                        specified in config.
  -v, --verbose         Print progress updates.
  -sa, --save_all       Save each simulation component.
  --clobber             Overwrite existing files in case of name conflicts.

An example configuration file can be found in the config_examples directory of the repo’s top-level directory. Here are its contents:

$ cat -n ../config_examples/template_config.yaml
     1	# This document is intended to serve as a template for constructing new
     2	# configuration YAMLs for use with the command-line interface.
     3	
     4	bda:
     5	    max_decorr: 0
     6	    pre_fs_int_time: !dimensionful
     7	        value: 0.1
     8	        units: 's'
     9	    corr_FoV_angle: !dimensionful
    10	        value: 20
    11	        units: 'deg'
    12	    max_time: !dimensionful
    13	        value: 16
    14	        units: 's'
    15	    corr_int_time: !dimensionful
    16	        value: 2
    17	        units: 's'
    18	filing:
    19	    outdir: '.'
    20	    outfile_name: 'quick_and_dirty_sim.uvh5'
    21	    output_format: 'uvh5'
    22	    clobber: True
    23	# freq and time entries currently configured for hera_sim use
    24	freq:
    25	    n_freq: 100
    26	    channel_width: 122070.3125
    27	    start_freq: 46920776.3671875
    28	time:
    29	    n_times: 10
    30	    integration_time: 8.59
    31	    start_time: 2457458.1738949567
    32	telescope:
    33	    # generate from an antenna layout csv
    34	    # array_layout: 'antenna_layout.csv'
    35	    # generate using hera_sim.antpos
    36	    array_layout: !antpos
    37	        array_type: "hex"
    38	        hex_num: 3
    39	        sep: 14.6
    40	        split_core: False
    41	        outriggers: 0
    42	    omega_p: !Beam
    43	        # non-absolute paths are assumed to be specified relative to the
    44	        # hera_sim data path
    45	        datafile: HERA_H2C_BEAM_MODEL.npz
    46	        interp_kwargs:
    47	            interpolator: interp1d
    48	            fill_value: extrapolate
    49	            # if you want to use a polynomial interpolator instead, then
    50	            # interpolator: poly1d
    51	            # kwargs not accepted for this; see numpy.poly1d documentation
    52	defaults:
    53	    # This must be a string specifying an absolute path to a default
    54	    # configuration file or one of the season default keywords
    55	    'h2c'
    56	systematics:
    57	    rfi:
    58	        # see hera_sim.rfi documentation for details on parameter names
    59	        rfi_stations:
    60	            seed: once
    61	            stations: !!null
    62	        rfi_impulse:
    63	            impulse_chance: 0.001
    64	            impulse_strength: 20.0
    65	        rfi_scatter:
    66	            scatter_chance: 0.0001
    67	            scatter_strength: 10.0
    68	            scatter_std: 10.0
    69	        rfi_dtv:
    70	            seed: once
    71	            dtv_band: 
    72	                - 0.174
    73	                - 0.214
    74	            dtv_channel_width: 0.008
    75	            dtv_chance: 0.0001
    76	            dtv_strength: 10.0
    77	            dtv_std: 10.0
    78	    sigchain:
    79	        gains:
    80	            seed: once
    81	            gain_spread: 0.1
    82	            dly_rng: [-20, 20]
    83	            bp_poly: HERA_H1C_BANDPASS.npy
    84	        sigchain_reflections:
    85	            seed: once
    86	            amp: !!null
    87	            dly: !!null
    88	            phs: !!null
    89	    crosstalk:
    90	        # only one of the two crosstalk methods should be specified
    91	        gen_whitenoise_xtalk:
    92	            amplitude: 3.0
    93	        # gen_cross_coupling_xtalk:
    94	            # seed: initial
    95	            # amp: !!null
    96	            # dly: !!null
    97	            # phs: !!null
    98	    noise:
    99	        thermal_noise:
   100	            seed: initial
   101	            Trx: 0
   102	sky:
   103	    Tsky_mdl: !Tsky
   104	        # non-absolute paths are assumed to be relative to the hera_sim
   105	        # data folder
   106	        datafile: HERA_Tsky_Reformatted.npz
   107	        # interp kwargs are passed to scipy.interp.RectBivariateSpline
   108	        interp_kwargs:
   109	            pol: xx # this is popped when making a Tsky object
   110	    eor:
   111	        noiselike_eor:
   112	            eor_amp: 0.00001
   113	            min_delay: !!null
   114	            max_delay: !!null
   115	            seed: redundant # so redundant baselines see same sky
   116	            fringe_filter_type: tophat
   117	    foregrounds:
   118	        # if using hera_sim.foregrounds
   119	        diffuse_foreground:
   120	            seed: redundant # redundant baselines see same sky
   121	            delay_filter_kwargs:
   122	                standoff: 0
   123	                delay_filter_type: tophat
   124	                normalize: !!null
   125	            fringe_filter_kwargs:
   126	                fringe_filter_type: tophat
   127	        pntsrc_foreground:
   128	            seed: once
   129	            nsrcs: 1000
   130	            Smin: 0.3
   131	            Smax: 300
   132	            beta: -1.5
   133	            spectral_index_mean: -1.0
   134	            spectral_index_std: 0.5
   135	            reference_freq: 0.5
   136	        # Note regarding seed_redundantly:
   137	        # This ensures that baselines within a redundant group see the same sky;
   138	        # however, this does not ensure that the sky is actually consistent. So,
   139	        # while the data produced can be absolutely calibrated, it cannot be
   140	        # used to make sensible images (at least, I don't *think* it can be).
   141	
   142	simulation:
   143	    # specify which components to simulate in desired order
   144	    # this should be a complete list of the things to include if hera_sim
   145	    # is the simulator being used. this will necessarily look different
   146	    # if other simulators are used, but that's not implemented yet
   147	    #
   148	    components: [foregrounds,
   149	                 noise,
   150	                 eor,
   151	                 rfi,
   152	                 sigchain, ]
   153	    # list particular model components to exclude from simulation
   154	    exclude: [sigchain_reflections,
   155	              gen_whitenoise_xtalk,]

The remainder of this tutorial will be spent on exploring each of the items in the above configuration file.

BDA

The following block of text shows all of the options that must be specified if you would like to apply baseline-dependent averaging (BDA) to the simulated data. Note that BDA is applied at the very end of the script and requires the BDA package, which can be installed from http://github.com/HERA-Team/baseline_dependent_averaging.

$ sed -n 4,17p ../config_examples/template_config.yaml
bda:
    max_decorr: 0
    pre_fs_int_time: !dimensionful
        value: 0.1
        units: 's'
    corr_FoV_angle: !dimensionful
        value: 20
        units: 'deg'
    max_time: !dimensionful
        value: 16
        units: 's'
    corr_int_time: !dimensionful
        value: 2
        units: 's'

Please refer to the bda.apply_bda documentation for details on what each parameter represents. Note that nearly every entry carries the !dimensionful tag; this YAML tag converts the value and units entries into an astropy.units.quantity.Quantity object with the specified value and units.
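To make the role of a custom YAML tag more concrete, here is a minimal sketch of how a tag such as !dimensionful could be registered with pyyaml. It is not hera_sim’s actual implementation: the names SketchLoader, DimensionfulValue, and construct_dimensionful are hypothetical, and a small named tuple stands in for the astropy Quantity that hera_sim produces, so that the sketch depends only on pyyaml.

```python
from typing import NamedTuple

import yaml


class DimensionfulValue(NamedTuple):
    """Hypothetical stand-in for an astropy Quantity: a number plus a unit string."""

    value: float
    units: str


class SketchLoader(yaml.SafeLoader):
    """Private loader subclass, so the custom tag is not added to yaml.SafeLoader."""


def construct_dimensionful(loader, node):
    # The tagged node is a mapping with "value" and "units" keys.
    fields = loader.construct_mapping(node)
    return DimensionfulValue(float(fields["value"]), str(fields["units"]))


# Associate the tag with the constructor on the private loader only.
SketchLoader.add_constructor("!dimensionful", construct_dimensionful)

snippet = """
bda:
    max_decorr: 0
    corr_int_time: !dimensionful
        value: 2
        units: 's'
"""

config = yaml.load(snippet, Loader=SketchLoader)
corr_int_time = config["bda"]["corr_int_time"]
print(corr_int_time)
```

Untagged entries such as max_decorr are loaded as ordinary Python values; only the tagged mapping is passed through the constructor.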

Filing

The following block of text shows the options that may be specified in the filing section; however, not all of these must be specified. In fact, the only parameter that is required in the config YAML is output_format, and it must be one of miriad, uvfits, or uvh5. These are currently the only supported write methods for UVData objects.

$ sed -n 18,24p ../config_examples/template_config.yaml
filing:
    outdir: '.'
    outfile_name: 'quick_and_dirty_sim.uvh5'
    output_format: 'uvh5'
    clobber: True
# freq and time entries currently configured for hera_sim use
freq:

Recall that run can be called with the option --outfile; this specifies the full path to where the simulated data should be saved and overrides the outdir and outfile_name settings from the config YAML. Additionally, one can pass the --clobber flag in place of specifying clobber in the config YAML. Finally, two further entries that are not shown in the template above may be given: kwargs, a dictionary whose contents are passed to whichever write method is chosen, and save_seeds, which should only be used if the seed_redundantly option is specified for any of the simulation components.
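The precedence rules described above can be sketched in a few lines of Python. This is only an illustration of the stated behaviour, not the code used by run: the helper names resolve_outfile and resolve_clobber are hypothetical, and the fallback values used when a key is missing are assumptions.

```python
import os


def resolve_outfile(filing, cli_outfile=None):
    """Return the output path; a command-line --outfile wins over the config."""
    if cli_outfile is not None:
        return cli_outfile
    # Assumed fallback: write to the current directory if outdir is absent.
    outdir = filing.get("outdir", ".")
    return os.path.join(outdir, filing["outfile_name"])


def resolve_clobber(filing, cli_clobber=False):
    """Overwrite if either the --clobber flag or the config entry asks for it."""
    return cli_clobber or bool(filing.get("clobber", False))


# Values copied from the filing section of the template configuration.
filing = {
    "outdir": ".",
    "outfile_name": "quick_and_dirty_sim.uvh5",
    "output_format": "uvh5",
    "clobber": True,
}

print(resolve_outfile(filing))
print(resolve_outfile(filing, cli_outfile="/tmp/my_sim.uvh5"))
print(resolve_clobber({"output_format": "uvh5"}, cli_clobber=True))
```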

Setup

The following block of text contains three sections: freq, time, and telescope. These sections are used to initialize the Simulator object that is used to perform the simulation. Note that the config YAML shows all of the options that may be specified, but not all options are necessarily required.

$ sed -n 26,53p ../config_examples/template_config.yaml
    channel_width: 122070.3125
    start_freq: 46920776.3671875
time:
    n_times: 10
    integration_time: 8.59
    start_time: 2457458.1738949567
telescope:
    # generate from an antenna layout csv
    # array_layout: 'antenna_layout.csv'
    # generate using hera_sim.antpos
    array_layout: !antpos
        array_type: "hex"
        hex_num: 3
        sep: 14.6
        split_core: False
        outriggers: 0
    omega_p: !Beam
        # non-absolute paths are assumed to be specified relative to the
        # hera_sim data path
        datafile: HERA_H2C_BEAM_MODEL.npz
        interp_kwargs:
            interpolator: interp1d
            fill_value: extrapolate
            # if you want to use a polynomial interpolator instead, then
            # interpolator: poly1d
            # kwargs not accepted for this; see numpy.poly1d documentation
defaults:
    # This must be a string specifying an absolute path to a default

If you are familiar with using configuration files with pyuvsim, then you’ll notice that the sections shown above look very similar to the way config files are constructed for use with pyuvsim. The config files for run were designed as an extension of the pyuvsim config files, with the caveat that some of the naming conventions used in pyuvsim are somewhat different from those used in hera_sim. For information on the parameters listed in the freq and time sections, please refer to the documentation for hera_sim.io.empty_uvdata.

The telescope section is where the antenna array and primary beam are defined. The array_layout entry specifies the array, either by pointing to an antenna layout file or by using the !antpos YAML tag and specifying the type of array (currently only linear and hex are supported) along with the parameters to be passed to the corresponding function in hera_sim.antpos.

The omega_p entry is where the primary beam is specified, and it is currently assumed that the beam is the same for each simulation component (indeed, this simulator is not intended to produce super-realistic simulations, but rather to perform simulations quickly and give somewhat realistic results). This entry defines an interpolation object to be used by the various hera_sim functions that require such an object; please refer to the documentation for hera_sim.interpolators.Beam for more information. Future versions of hera_sim will provide support for specifying the beam in an antenna layout file, similar to how it is done by pyuvsim.
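As a rough illustration of what the freq and time entries describe, the sketch below builds the frequency and time axes implied by the template values. It rests on assumptions that are not stated in this tutorial: that start_freq and channel_width are in Hz, that integration_time is in seconds, that start_time is a Julian date, and that the start values mark the first sample of each axis. The hera_sim.io.empty_uvdata documentation is the authoritative reference.

```python
SECONDS_PER_DAY = 86400.0

# Values copied from the freq and time sections of the template configuration.
freq_cfg = {"n_freq": 100, "channel_width": 122070.3125, "start_freq": 46920776.3671875}
time_cfg = {"n_times": 10, "integration_time": 8.59, "start_time": 2457458.1738949567}

# Assumed convention: evenly spaced channels starting at start_freq (Hz).
freqs_hz = [
    freq_cfg["start_freq"] + i * freq_cfg["channel_width"]
    for i in range(freq_cfg["n_freq"])
]

# Assumed convention: evenly spaced integrations starting at start_time (JD),
# with the integration time converted from seconds to days.
times_jd = [
    time_cfg["start_time"] + i * time_cfg["integration_time"] / SECONDS_PER_DAY
    for i in range(time_cfg["n_times"])
]

print(f"{len(freqs_hz)} channels from {freqs_hz[0] / 1e6:.3f} to {freqs_hz[-1] / 1e6:.3f} MHz")
print(f"{len(times_jd)} integrations spanning {(times_jd[-1] - times_jd[0]) * SECONDS_PER_DAY:.1f} s")
```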

Defaults

This section of the configuration file is optional. It lets the user choose a default configuration that sets various parameters throughout the codebase. Users may define their own default configuration files, or they may use one of the provided season default configurations, located in the config folder. The currently supported season configurations are h1c and h2c. Please see the defaults module/documentation for more information.

$ sed -n 54,57p ../config_examples/template_config.yaml
    # configuration file or one of the season default keywords
    'h2c'
systematics:
    rfi:

Systematics

This is the section where any desired systematic effects can be specified. The block of text shown below details all of the possible options for systematic effects. Note that currently the sigchain_reflections and gen_cross_coupling_xtalk sections cannot easily be worked with; in fact, gen_cross_coupling_xtalk does not work as intended (each baseline has crosstalk show up at the same phase and delay, with the same amplitude, but uses a different autocorrelation visibility). Also note that the rfi section is subject to change, pending a rework of the rfi module.

$ sed -n 58,96p ../config_examples/template_config.yaml
        # see hera_sim.rfi documentation for details on parameter names
        rfi_stations:
            seed: once
            stations: !!null
        rfi_impulse:
            impulse_chance: 0.001
            impulse_strength: 20.0
        rfi_scatter:
            scatter_chance: 0.0001
            scatter_strength: 10.0
            scatter_std: 10.0
        rfi_dtv:
            seed: once
            dtv_band: 
                - 0.174
                - 0.214
            dtv_channel_width: 0.008
            dtv_chance: 0.0001
            dtv_strength: 10.0
            dtv_std: 10.0
    sigchain:
        gains:
            seed: once
            gain_spread: 0.1
            dly_rng: [-20, 20]
            bp_poly: HERA_H1C_BANDPASS.npy
        sigchain_reflections:
            seed: once
            amp: !!null
            dly: !!null
            phs: !!null
    crosstalk:
        # only one of the two crosstalk methods should be specified
        gen_whitenoise_xtalk:
            amplitude: 3.0
        # gen_cross_coupling_xtalk:
            # seed: initial
            # amp: !!null
            # dly: !!null

Note that although these simulation components are listed under systematics, they do not necessarily need to be listed here; the configuration file is formatted this way just for semantic clarity. For information on any particular simulation component listed here, please refer to the corresponding function’s documentation. For those who may not know what it means, !!null is how a None value is specified in YAML when it is loaded with pyyaml.
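A quick way to see the effect of !!null is to load a small fragment with pyyaml directly. The snippet below is only a sketch and assumes that pyyaml is installed; it copies the sigchain_reflections entries from the template.

```python
import yaml

snippet = """
sigchain_reflections:
    seed: once
    amp: !!null
    dly: !!null
    phs: !!null
"""

params = yaml.safe_load(snippet)
# Entries tagged !!null come back as Python's None; plain strings stay strings.
print(params)
```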

Sky

This section specifies both the sky temperature model to be used throughout the simulation as well as any simulation components which are best interpreted as being associated with the sky (rather than as a systematic effect). Just like the systematics section, these do not necessarily need to exist in the sky section (however, the Tsky_mdl entry must be placed in this section, as that’s where the script looks for it).

$ sed -n 97,130p ../config_examples/template_config.yaml
            # phs: !!null
    noise:
        thermal_noise:
            seed: initial
            Trx: 0
sky:
    Tsky_mdl: !Tsky
        # non-absolute paths are assumed to be relative to the hera_sim
        # data folder
        datafile: HERA_Tsky_Reformatted.npz
        # interp kwargs are passed to scipy.interp.RectBivariateSpline
        interp_kwargs:
            pol: xx # this is popped when making a Tsky object
    eor:
        noiselike_eor:
            eor_amp: 0.00001
            min_delay: !!null
            max_delay: !!null
            seed: redundant # so redundant baselines see same sky
            fringe_filter_type: tophat
    foregrounds:
        # if using hera_sim.foregrounds
        diffuse_foreground:
            seed: redundant # redundant baselines see same sky
            delay_filter_kwargs:
                standoff: 0
                delay_filter_type: tophat
                normalize: !!null
            fringe_filter_kwargs:
                fringe_filter_type: tophat
        pntsrc_foreground:
            seed: once
            nsrcs: 1000
            Smin: 0.3

As of now, run only supports simulating effects using the functions in hera_sim; however, we intend to provide support for using different simulators in the future. If you would like more information regarding the Tsky_mdl entry, please refer to the documentation for the hera_sim.interpolators.Tsky class. Finally, note that redundant seeding (written as seed: redundant in the template, and referred to as seed_redundantly in its comments) is specified for the noiselike_eor and diffuse_foreground entries; this setting ensures that baselines within a redundant group all measure the same visibility, which is necessary for the data to be absolutely calibrated. Please refer to the documentation for hera_sim.eor and hera_sim.foregrounds for more information on the parameters and functions listed above.
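The idea behind redundant seeding can be illustrated with a toy example: if the random seed is derived from the baseline vector rather than from the antenna pair, then every baseline in a redundant group draws the same random realization. The sketch below is not hera_sim’s seeding mechanism; the helper names, the rounding tolerance, and the use of a CRC32 hash are all assumptions made for illustration.

```python
import random
import zlib

# Three antennas on an east-west line, 14.6 m apart (toy layout).
antpos = {0: (0.0, 0.0), 1: (14.6, 0.0), 2: (29.2, 0.0)}


def baseline_vector(ant1, ant2, decimals=1):
    """Rounded separation vector; redundant baselines share the same vector."""
    x1, y1 = antpos[ant1]
    x2, y2 = antpos[ant2]
    return (round(x2 - x1, decimals), round(y2 - y1, decimals))


def redundant_seed(base_seed, vector):
    """Deterministic integer seed derived from the baseline vector."""
    return zlib.crc32(repr((base_seed, vector)).encode())


def toy_visibility(ant1, ant2, n_samples=3, base_seed=0):
    """Draw Gaussian samples from a generator seeded per redundant group."""
    rng = random.Random(redundant_seed(base_seed, baseline_vector(ant1, ant2)))
    return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]


# Baselines (0, 1) and (1, 2) are redundant, so they get identical samples;
# baseline (0, 2) is twice as long and gets a different realization.
print(toy_visibility(0, 1) == toy_visibility(1, 2))
print(toy_visibility(0, 1) == toy_visibility(0, 2))
```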

Simulation

This section is used to specify which of the simulation components to include in or exclude from the simulation. There are only two entries in this section: components and exclude. The components entry should be a list specifying which of the groups from the sky and systematics sections should be included in the simulation. The exclude entry should be a list specifying which of the particular models should not be simulated. Here’s an example:

$ sed -n -e 137,138p -e 143,150p ../config_examples/template_config.yaml
        # This ensures that baselines within a redundant group see the same sky;
        # however, this does not ensure that the sky is actually consistent. So,
    # specify which components to simulate in desired order
    # this should be a complete list of the things to include if hera_sim
    # is the simulator being used. this will necessarily look different
    # if other simulators are used, but that's not implemented yet
    #
    components: [foregrounds,
                 noise,
                 eor,

The components and exclude entries in the template configuration would result in a simulation that includes all models contained in the foregrounds, noise, eor, rfi, and sigchain dictionaries, except for the sigchain_reflections and gen_whitenoise_xtalk models. So the simulation would consist of diffuse and point source foregrounds, thermal noise, noiselike EoR, all types of RFI modeled by hera_sim, and bandpass gains, with the effects simulated in that order. It is important to make sure that effects which enter multiplicatively (i.e. models from sigchain) are simulated after effects that enter additively, since the components are executed in the order in which they are listed.
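The selection logic described above can be sketched as follows. This is a simplified illustration, not the code that run executes: the function models_to_simulate and the toy run_toy loop are hypothetical, model parameters are left as empty dictionaries, and the numbers in the toy loop are arbitrary.

```python
# Skeleton of the template configuration, with model parameters omitted.
config = {
    "systematics": {
        "rfi": {"rfi_stations": {}, "rfi_impulse": {}, "rfi_scatter": {}, "rfi_dtv": {}},
        "sigchain": {"gains": {}, "sigchain_reflections": {}},
        "crosstalk": {"gen_whitenoise_xtalk": {}},
        "noise": {"thermal_noise": {}},
    },
    "sky": {
        "eor": {"noiselike_eor": {}},
        "foregrounds": {"diffuse_foreground": {}, "pntsrc_foreground": {}},
    },
    "simulation": {
        "components": ["foregrounds", "noise", "eor", "rfi", "sigchain"],
        "exclude": ["sigchain_reflections", "gen_whitenoise_xtalk"],
    },
}


def models_to_simulate(config):
    """List the models to run, in the order given by the components entry."""
    groups = {}
    for section in ("sky", "systematics"):
        groups.update(config.get(section, {}))
    excluded = set(config["simulation"].get("exclude", []))
    ordered = []
    for component in config["simulation"]["components"]:
        ordered.extend(m for m in groups[component] if m not in excluded)
    return ordered


def run_toy(order):
    """Toy accumulation: foregrounds add 10, sigchain multiplies by 0.5."""
    data = 0.0
    for step in order:
        if step == "foregrounds":
            data += 10.0  # additive effect
        elif step == "sigchain":
            data *= 0.5  # multiplicative effect
    return data


print(models_to_simulate(config))
print(run_toy(["foregrounds", "sigchain"]))  # gains applied to the foregrounds
print(run_toy(["sigchain", "foregrounds"]))  # gains applied to empty data
```

In the toy loop, listing sigchain first leaves the foregrounds untouched by the gains, which is why multiplicative effects should come last in components.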