From setiquest wiki
This section explains how the SonATA system operates.
System Configuration and Resource Allocation
The resources are allocated with a set of configuration scripts and files. The environment variable scripts reside in ~SonATA/scripts and are installed in ~/sonata_install-scripts. These scripts define the beam assignments and parameters for the Channelizers and DXs as well as the physical hosts on which they will run. These scripts copy a matching version of the Expected SonATA Components file from ~/SonATA/sse_pkg/setup to ~/sonata_install/setup/expectedSonATAComponents.cfg. The "expectedSonATAComponents.cfg" file is used by the Site and lists which DXs receive input from which beam and channelizer.
A set of default parameters including observation length, thresholds, pulse resolutions, etc. will be selected at this time.
A scheduler will create an observing schedule based on resource availability, target visibility and observing strategy. Examples of observing strategies include selecting targets nearest their rise time or targets that are close to Earth in light years or targets with low declination or target directions in the galactic plane that are at a sufficiently large angle from the Sun. Typically, a database containing the observation history will be queried for unobserved targets and frequencies. A set of 2 or 3 targets will be selected within the primary field of view. The same set of center frequencies will be assigned to DXs on each beam. Each DX will have the same parameters with the exception of the frequency and target.
When a "start obs" is issued, the scheduler starts an Observing Strategy. The Strategy will read the target catalogs from the database. There are scheduler parameters to specify the high priority (catshigh) and low priority (catslow) catalogs. The high priority catalog will be searched first for the primary beam and/or beam1 targets. The low priority catalogs will only be used if there are no suitable targets available in the high priority catalogs.
If the user has set the scheduler parameter for target selection (target) to "auto", the targets are selected by evaluating the following factors: target declination, target visibility, angular separation from the Sun, target type, distance, position near the meridian, remaining time until the target sets, and for multi-target observation, angular separation between targets and target visibility within the primary beam.
If the user has set the target selection parameter to "semiauto", the user must specify the target id for the primary beam, i.e. the center of the field of view. The Strategy will select only targets that fall within that field of view using the factors listed above.
The frequency range to be observed is specified by the scheduler parameters for beginning frequency (beginfreq) and ending frequency (endfreq). A database query determines which frequencies in that range have already been observed for the selected targets. All the unobserved frequency ranges for the targets will be passed to the DX tuning routine.
If the user has set the target selection parameter to "user", the user must specify the target ids for all the beams as well as the primary beam.
Preparing Antennas and Beamformers
At the start of an observing session a subset of the antennas in the array are selected, based on their system temperature and drive status. The beamformers are initialized with this antenna list, and a series of delay, phase, frequency, and polarization calibrations are performed.
A pointing request for the position (RA,DEC) of the primary field of view (if SonATA is the primary user) along with the coordinates of each of the synthesized beams and the RF tunings are sent to the ATA telescope interface. The center frequency for the RF tuning is selected so that all the DX frequencies fall within the RF bandwidth (104 MHz).
Target and Frequency Selection
There are several different methods for tuning the DXs. If the target selection parameter is set to "auto" or "semiauto", the DX tune parameter (dxtune) is ignored. DX tuning utilizes the permanent RFI mask for these 2 target selection options. Any frequency ranges in the mask that cover an entire channel are skipped over. The lowest frequency in the unobserved range is assigned to Channel 0 and the first DX on beam1 (typically dx1000). The channels and frequencies are assigned sequentially, skipping over the channels that contain masked frequencies and the center channel where the Channelizer DC channel falls. The DC channel has too many artifacts to be usable. When we go to wider channels (1.6 MHz or 3.2 MHz) we will employ the receiver birdie mask to eliminate the artifacts. The channels are assigned until there are no more DXs available on the first beam or all the available channels have already been assigned (see TuneDxsObsRange::tune()).
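The sequential assignment described above can be sketched as follows. This is only an illustration, not the code in TuneDxsObsRange::tune(): the function names, the representation of the mask as a set of channel numbers, and the assumption that channel frequencies increase uniformly from Channel 0 are all hypothetical.

```python
def assign_channels(dx_names, num_channels, dc_channel, masked_channels):
    """Hand out channels to DXs in order, skipping masked channels and the DC channel.

    masked_channels is a hypothetical set of channel numbers that are fully
    covered by the permanent RFI mask.
    """
    assignments = {}
    dxs = iter(dx_names)
    for channel in range(num_channels):
        if channel == dc_channel or channel in masked_channels:
            continue  # unusable channel, move on to the next one
        dx = next(dxs, None)
        if dx is None:
            break  # no more DXs available on this beam
        assignments[dx] = channel
    return assignments


def channel_frequency(channel, lowest_freq_mhz, channel_bw_mhz=0.8192):
    """Frequency of a channel, assuming Channel 0 holds the lowest unobserved frequency."""
    return lowest_freq_mhz + channel * channel_bw_mhz


plan = assign_channels(["dx1000", "dx1001", "dx1002"], 8, dc_channel=2, masked_channels={1})
```

With these inputs, channels 1 (masked) and 2 (DC) are skipped, so the three DXs receive channels 0, 3, and 4.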
If the target selection method is set to "user", the DX tune options are range, forever, or user. For the range dxtune option, the scheduler parameters beginfreq and endfreq determine the frequency range. The total number of channels that can be covered by the available DXs is computed and the Channelizer DC channel is assigned to the middle channel so that half the DX frequencies are below the DC channel and half are above. The permanent RFI mask is not applied. The DC channel is skipped (see TuneDxsRange::tune()).
The "forever" and "user" dxtune options are essentially no-ops. The Strategy does not change the channel or frequency that is currently assigned to the DXs. It is expected that the DXs were assigned channels and frequencies in a previous activity or that the user has specified the channel and frequency with the "dx load skyfreq ffff.ffff dxName" and "dx load channel NN dxName" commands. In the case of "forever", the Strategy repeats the same activity over and over until the user issues a "stop" command (see TuneDxsForever::tune()). In the case of "user", the Strategy executes the activity once (see TuneDxsUser::tune()). In the case of followup activities, the DXs are tuned by retrieving the channel and frequency for each DX from the parent activity in the database.
The Channelizer is configured with the parameters specified in the environment variables file. These parameters include the input bandwidth, the number of channels to create, the number of channels to be output, and their bandwidth, plus the multicast port numbers and addresses, and the beam and polarization assignments. The beamformers output 104 MHz bandwidth. SonATA typically configures the Channelizers to create 128 channels with 0.8192 MHz bandwidth and to output 49 channels with up to 24 of them actually used. For setiQuest data collection, the Channelizer is configured to create 16 channels with 6.5 MHz bandwidth and to output 2 channels with only 1 of them actually used.
Once the DX frequencies and channels are assigned, the center frequency for the channelizer is computed. The frequency and channel of the first DX are used to calculate the frequency of the DC channel. At present the channelizers are tuned and started immediately, even before the telescope and beamformer have been pointed and tuned. The Channelizers are synchronized by the time stamp that the beamformers insert into the packet headers. The start time for the Channelizers is computed by adding a delay (default 3 seconds) to the current time. The start time is sent to all the Channelizers. They reset the packet counter and start processing when the time stamp in a packet is greater than or equal to the start time.
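A rough sketch of the two calculations in this paragraph follows. The text does not give the formula for the DC channel frequency, so the version below assumes equally spaced, contiguous channels whose frequency increases with channel number; the function names are hypothetical.

```python
def dc_channel_frequency(dx_freq_mhz, dx_channel, dc_channel, channel_bw_mhz=0.8192):
    """Frequency of the DC channel, extrapolated from the first DX (assumed linear spacing)."""
    return dx_freq_mhz + (dc_channel - dx_channel) * channel_bw_mhz


def channelizer_start_time(now_seconds, delay_seconds=3.0):
    """Start time sent to all Channelizers: the current time plus a delay (default 3 s)."""
    return now_seconds + delay_seconds


def should_start(packet_time_seconds, start_time_seconds):
    """Begin processing once a packet time stamp reaches the start time."""
    return packet_time_seconds >= start_time_seconds


start = channelizer_start_time(100.0)
```

Because every Channelizer compares packet time stamps against the same start time, they all begin on the same packet regardless of when each one received the message.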
The input data stream is processed with a 7-folding Digital Filter Bank and an FFT to yield the appropriate number of channels with the appropriate bandwidth. The channels are sent out using multicast protocol to go to the DX that has subscribed to a particular channel.
After the targets have been selected and the frequencies and channels assigned, the Strategy creates a new Activity.
Telescope Pointing, RF Receiver tuning, Beamformer pointing
The Activity sends a pointing request to the telescope for the primary beam target and a frequency tuning request for the LO (Local Oscillator) Receivers being used. These requests are actually sent to a Backend Server Process that handles all the communication with the ATA site. When the telescope is on target and the receivers are tuned, the Activity sends a point request to the Beamformers for each of the beam targets and waits for a ready reply.
Activity Unit Creation
After the telescope, receivers, and beamformers are all ready, the Activity creates an Activity Unit for each DX to handle all the communication with that DX for this activity.
When a DX first connects to the SSE, it is sent two masks to be applied during the observations. The Receiver Birdie Mask represents persistent signals (1 subchannel bandwidth) that are generated within the observatory IF chain. For each observation, these masks are adjusted for the current DX center frequency and applied to the input data stream before signal detection is done. The Permanent RFI Mask represents sky frequency bandwidths (usually in 800 kHz increments) with high occupancy RFI such as satellites, ground based radar, airplane communication, etc.
For each observation, the DX will be sent observing parameters, the Recent RFI mask and scientific data request for display/archive. The recent RFI mask consists of noise and transient signals received within the last week that were not seen on the current target. This mask will be applied to the detected signals.
The Activity Unit retrieves the DX center frequency and channel number that were assigned to the DxProxy by the Strategy and stores them in the DX Parameters structure before sending the parameters to the DX via the DxProxy. The DxProxy relays the command to the DX Process. For the rest of the command sequence description, the DxProxy will be omitted. The DX reports back that it is tuned after the observation initialization is complete.
For a followup observation, the followup candidates are sent next.
For both a normal and followup observation, the start time of the observation is computed by the Activity Module after all the DXs have reported back tuned. A delay (default 15 seconds) is added to the current time to allow adequate time for the multicast port and IP address subscriptions to be registered. The start time will be sent to all the DXs simultaneously triggered by a message from the parent Activity. The DXs monitor the time stamp in incoming packets until the time stamp is greater than or equal to the start time so that all the DXs are synchronized. Then they begin processing. The start time refers to the start of baseline accumulation; data collection begins immediately after baselining is complete.
Typically 20 half-frames of data are collected to create an initial baseline that will be used to normalize subsequent data. The baseline is recalculated each half-frame during the baseline accumulation and data collection using a box car decay: the decay factor is specified in the activity parameters, with the default of 90% of the previous baseline and 10% of the current.
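The per-half-frame baseline update can be sketched as a weighted blend of the previous baseline and the current half-frame. This is a minimal illustration using the default weights quoted above; the function name and the list-of-floats representation of a baseline are assumptions.

```python
def update_baseline(previous, current, decay=0.9):
    """Blend the previous baseline with the current half-frame, subchannel by subchannel.

    decay is the fraction of the previous baseline that is kept (default 90%);
    the remainder comes from the current half-frame.
    """
    return [decay * p + (1.0 - decay) * c for p, c in zip(previous, current)]


baseline = update_baseline([1.0, 2.0], [2.0, 2.0])
```

A subchannel whose power doubles in one half-frame moves the baseline by only 10% of the change, so short transients have limited effect on the normalization.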
Data Collection starts immediately after Baseline Accumulation completes. Each DX receives packets for left and right polarizations that contain the time domain data for one channel. The current default channel bandwidth is .8192 MHz. The bandwidth may also be .4096 MHz (as in the Voyager Demo) or .546133 MHz (as in the case of replaying setiQuest data).
A frame is ~1.5 seconds of data. The data collection length is specified in seconds and the number of frames is automatically calculated as the largest power of two that can be observed in that amount of time. The number of frames must be a power of 2 to accommodate the CW power detection algorithm. The Channel is first subchannelized to 1536 subchannels of 0.533 kHz bandwidth for a 0.8192 MHz Channel. Each subchannel is further transformed to create bins ranging from ~360 Hz to ~0.7 Hz. During this transformation the data are overlapped 50% in time. The bin width and the number of spectra created depend on the resolution. For 1 Hz resolution, there are 2 spectra and the bin width is ~0.7 Hz. For 2 Hz resolution, there are 4 spectra and the bin width is ~1.4 Hz. For 4 Hz resolution, there are 8 spectra and the bin width is ~2.8 Hz, and so on.
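The frame count and subchannel width follow from simple arithmetic, sketched below. The frame length of 1.5 seconds is the approximate figure from the text, not an exact value, and the function names are hypothetical.

```python
def frames_for_length(seconds, frame_seconds=1.5):
    """Largest power-of-two number of frames that fits in the requested length."""
    max_frames = int(seconds // frame_seconds)
    if max_frames < 1:
        return 0  # not even one frame fits
    return 1 << (max_frames.bit_length() - 1)


def subchannel_width_hz(channel_bw_mhz=0.8192, subchannels=1536):
    """Bandwidth of one subchannel in Hz."""
    return channel_bw_mhz * 1e6 / subchannels


frames = frames_for_length(100.0)
width = subchannel_width_hz()
```

With the approximate frame length, a 100 second request allows 66 whole frames, which rounds down to 64. The default channel divides into subchannels of about 533 Hz, matching the 0.533 kHz quoted above.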
During Data Collection, the DXs create 4 types of data: Baselines, 2 bit CW power data for a single resolution (1, 2, or 4 Hz), thresholded power data for pulse detection, and Confirmation Data (4 bit complex pairs).
The accumulated baseline data represents the mean for each subchannel. It is used to normalize the data to a mean of 1 and a standard deviation of 1. The baselines will be reported every N half-frames (default: 20) during data collection. The baselines are checked for exceeding limits on mean, range, minimum, and maximum. Warning and error messages are sent when any limits are exceeded. This feature may be turned off in the user interface (dx set baseerror|basewarn off). The baseline data is sent to the SSE to be archived for later analysis and graphical display.
The 2 bit CW power data is used for the DADD algorithm (see below) and is saturated at 3, i.e. any power over 3 is set to 3. After saturation the mean bin power is .553 with a standard deviation of .846.
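The quoted mean and standard deviation can be reproduced under two assumptions that the text does not state: that the normalized bin power is exponentially distributed with unit mean (consistent with a mean of 1 and a standard deviation of 1), and that the 2 bit value is the power truncated to an integer before saturating at 3. The sketch below computes the resulting statistics analytically.

```python
import math


def saturated_power_stats(max_level=3):
    """Mean and standard deviation of min(floor(X), max_level) for exponential X, mean 1."""
    levels = range(1, max_level + 1)
    # For a unit-mean exponential, P(floor(X) >= k) = exp(-k).
    tail = [math.exp(-k) for k in levels]
    mean = sum(tail)
    # E[Y^2] = sum over k of (k^2 - (k-1)^2) * P(Y >= k) = sum of (2k - 1) * exp(-k).
    second_moment = sum((2 * k - 1) * t for k, t in zip(levels, tail))
    return mean, math.sqrt(second_moment - mean ** 2)


mean, std = saturated_power_stats()
```

Under these assumptions the mean comes out near 0.553 and the standard deviation near 0.847, in line with the figures given above.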
The thresholded power data for pulse detection (see below) consists of the bin, spectrum, and power for power values that exceed the single pulse threshold (pulsethresh, default: 12).
The confirmation data is used for primary and secondary Coherent CW Detection (see below) and is archived for later analysis and graphical display. The bandwidth of the confirmation data is equal to the subchannel width. The mean of the Confirmation Data is 1 and the standard deviation is 1. Each DX also sends one subchannel's confirmation data every half-frame. This is the data that is used for Waterfall Displays. The default subchannel is 384. For followup observations, the subchannel will be set to the subchannel containing the first of the followup candidates.
Signal Detection begins after Data Collection completes. The 2 bit CW data is used for the DADD (Doubling Accumulation Drift Detection) algorithm which sums the power along approximate straight line paths for drift rates between +1 and -1 bin/spectrum.
The Doubling Accumulation Drift Detection (DADD) Algorithm is an n log m approach to accumulating the power sums along all possible drift paths between +1 and -1 bin/spectrum for each starting bin, where n is the number of spectral bins and m the number of time spectra. The paths are approximations to straight lines that give relatively good sensitivity as well as being easy to construct. The algorithm combines pairs of length m path sums to form path sums of length 2m. The gain in computational efficiency (relative to brute force summation) arises from the fact that each length m path sum enters into three length 2m path sums. The algorithm as used requires that the number of spectra be a power of 2. The individual paths that exceed the DADD threshold (daddthresh, default: 8.5) are clustered with other paths that are close in frequency. The path with the strongest power is used as the signal description. The DADD algorithm is applied separately to the left and right polarizations.
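The doubling step can be illustrated with a small recursive sketch. It is a simplification of what is described above, not the DX implementation: it covers only drifts from 0 to +1 bin/spectrum (negative drifts could be handled by reversing the bin axis), it treats paths that run past the last bin as receiving no further power, and it works on plain Python lists. The function name and the output layout are hypothetical.

```python
def dadd(spectra):
    """Return sums[d][b]: power summed along the path starting at bin b
    with a total drift of d bins over the block (0 <= d < number of spectra).
    """
    m = len(spectra)
    if m == 0 or m & (m - 1):
        raise ValueError("number of spectra must be a power of 2")
    n = len(spectra[0])
    if m == 1:
        return [list(spectra[0])]  # a single spectrum has only the zero-drift path
    half = m // 2
    first = dadd(spectra[:half])   # path sums for the earlier half
    second = dadd(spectra[half:])  # path sums for the later half
    sums = []
    for d in range(m):
        k = d // 2       # drift used inside each half
        shift = d - k    # bin offset where the later half picks up the path
        row = []
        for b in range(n):
            total = first[k][b]
            if b + shift < n:
                total += second[k][b + shift]
            row.append(total)
        sums.append(row)
    return sums


# Four spectra, eight bins, one signal drifting +1 bin per spectrum from bin 2.
block = [[0.0] * 8 for _ in range(4)]
for t in range(4):
    block[t][2 + t] = 1.0
path_sums = dadd(block)
```

Here the path with total drift 3 starting at bin 2 collects all four units of power, while the zero-drift path from the same bin collects only one. Each half-length sum is reused by more than one full-length sum, which is where the savings over brute-force summation come from.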
The paths in each polarization that exceed threshold are combined using a clustering technique that computes the bin number for each path at the midpoint in time of the observation. Any paths that are separated by less than or equal to the delta bin parameter (cwclustdeltafreq, default: 2 bins) are reported as a single signal using the path with the greatest power.
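A minimal sketch of this midpoint clustering is shown below. The dictionary layout for a path and the chaining rule (a path joins a cluster if it is within the delta of the previous path in sorted order) are assumptions made for illustration; the text only states that paths separated by no more than the delta are reported as one signal.

```python
def cluster_paths(paths, num_spectra, delta_bins=2):
    """Group paths by their bin at the midpoint in time; keep the strongest per group.

    Each path is a dict with 'start_bin', 'drift' (bins per spectrum), and 'power'.
    """
    def mid_bin(path):
        return path["start_bin"] + path["drift"] * num_spectra / 2.0

    clusters = []
    for path in sorted(paths, key=mid_bin):
        if clusters and mid_bin(path) - mid_bin(clusters[-1][-1]) <= delta_bins:
            clusters[-1].append(path)  # close enough to the previous path
        else:
            clusters.append([path])    # start a new cluster
    return [max(cluster, key=lambda p: p["power"]) for cluster in clusters]


paths = [
    {"start_bin": 10, "drift": 0.0, "power": 9.0},
    {"start_bin": 11, "drift": 0.0, "power": 12.0},
    {"start_bin": 50, "drift": 0.5, "power": 10.0},
]
signals = cluster_paths(paths, num_spectra=8)
```

The first two paths are one bin apart at the midpoint and collapse into a single signal described by the stronger path; the third path stands alone.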
The Thresholded Power data consists of the bin, spectrum, and power for power values that exceed the single pulse threshold (pulsethresh, default: 12). The pulse detection algorithm first looks for triplets of evenly spaced pulses that lie along a straight line. The total power of the triplets must exceed the triplet threshold (tripletthresh, default: 48). These triplets are clustered in a similar manner to the CW technique with any other triplets that are close in frequency using the pulse delta bin parameter (pulseclustdeltafreq, default: 25). The signal description is obtained with a linear regression on all the pulses in the resulting pulse train.
All the signals that are reported from the left and right CW detectors as well as the pulse detector are further combined into Super Clusters. This clustering algorithm uses the frequency tolerance parameter clustfreqtol (default: 266, half a subchannel width) to group together signals that occur within the same frequency area. If a signal is detected by both the CW and Pulse detectors, the CW signal description is always used because it will be more accurate. If the signal is detected by both CW polarizations, the signal description of the stronger signal is used and the polarization is set to 'both'. A signal description includes the frequency at the start of the observation, drift rate, power, width, polarization, signal type (CW or Pulse), signal number, class, reason for its classification, subchannel number, and whether the signal was part of a Bad Band, i.e. too many signals concentrated in one frequency range. The width of a cluster is calculated as the highest frequency minus the lowest frequency.
There is a limit on how many paths may be in a cluster. For the CW detector the default limit is 250 paths (badbandcwpathlim). When the limit is exceeded, a Bad Band is created to represent the signal. The strongest path up to that point is used as the signal description and the remaining paths within the frequency area are ignored. For the Pulse Detector, there is a limit on the number of pulses in a subchannel (badbandpulselim, default: 300). There is also a limit on the number of triplets per subchannel (badbandpulsetriplim, default: 5000). The width of a bad band is calculated as the highest frequency minus the lowest frequency.
RFI (Radio Frequency Interference) Mitigation
The DX uses several techniques to filter out terrestrial RFI. There are various frequency masks that contain known RFI, plus simple tests on the drift rate.
Receiver Birdie Mask
The Receiver Birdie Mask reflects the RFI that is internal to the receiver chain. It is determined by sampling the RFI environment with several scans over the frequency range that record signals, but do not do any candidate selection. The signals taken during the frequency scans are adjusted by subtracting the receiver tuning frequency. If the adjusted frequency occurs in more than 30% of the observations, it is included in the mask, which is stored in ~/SonATA/sse-pkg/setup/rcvrBirdieMask.tcl. When a DX starts up and connects to the assigned socket, the DxProxy that is created to handle all the communication with that DX reads the receiver birdie mask file and sends it to the DX. The frequencies in this mask usually correspond to the width of a subchannel. The DX sets bins/subchannels in the frequency ranges in the mask to zero during Data Collection. So these frequencies will not yield any signals during Signal Detection. The range of this mask covers the maximum range that can be observed by all the DXs during a single observation.
Permanent RFI Mask
The permanent RFI Mask reflects the RFI environment at the observing site. It is determined by sampling the RFI environment with several scans over the frequency range that record signals, but do not do any candidate selection. The frequency ranges that have large numbers of signals are inserted into the mask that is stored in ~/SonATA/sse-pkg/setup/permRfiMask.tcl. Also, if a DX assigned to a frequency range fails consistently, that range is added to the mask. When a DX starts up and connects to the assigned socket, the DxProxy that is created to handle all the communication with that DX reads the permanent RFI mask file and sends it to the DX. This mask covers the entire observing range. During Signal Detection, any signal that falls within the frequency ranges specified by the permanent RFI mask is labeled as RFI. As implemented, most of the frequency ranges in the mask match the size of a channel. When the Strategy selects targets and frequencies to observe, the frequency ranges in the permanent RFI mask are skipped over. Hence the DX rarely observes any of the frequency ranges in the mask.
Recent RFI Mask
The Recent RFI Mask reflects the signals that have been seen during the last week. After the DXs are tuned, the Activity sends a message to all the ActivityUnits to create a recent RFI Mask for their own frequency range. The Activity also includes a list of targets to be excluded from the query, so that a signal seen on the current target will not be marked as RFI. Each ActivityUnit executes a database query that retrieves all the signals seen within its frequency range in the last week, excluding all the targets that are contained in the current field of view of the telescope. After all the signals have been clustered together, each clustered signal is compared with the recent RFI Mask. If its frequency falls within a range in the mask, it is labeled as RFI.
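The mask construction and lookup can be sketched as follows. The real mask comes from a database query and its file format is not described here, so the record layout, the fixed half-width placed around each recent signal, and the function names are all hypothetical.

```python
def build_recent_rfi_mask(recent_signals, excluded_targets, half_width_mhz=0.001):
    """Build (low, high) frequency ranges from recent signals, skipping excluded targets.

    recent_signals is a list of dicts with 'freq_mhz' and 'target_id'.
    """
    return [
        (s["freq_mhz"] - half_width_mhz, s["freq_mhz"] + half_width_mhz)
        for s in recent_signals
        if s["target_id"] not in excluded_targets
    ]


def in_mask(freq_mhz, mask_ranges):
    """True if the frequency falls inside any range of the mask."""
    return any(low <= freq_mhz <= high for low, high in mask_ranges)


recent = [
    {"freq_mhz": 1420.100, "target_id": 17},  # seen on another target last week
    {"freq_mhz": 1420.500, "target_id": 42},  # seen on a target in the current field
]
mask = build_recent_rfi_mask(recent, excluded_targets={42})
```

A new signal at 1420.1005 MHz falls inside the mask and would be labeled RFI, while one at 1420.500 MHz would not, because the earlier detection there was on a target in the current field of view.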
The clustered signals are also tested for zero drift that would indicate a signal that is locked to the observatory frequency standard, and therefore of terrestrial origin. If the drift rate is less than the drift rate tolerance (default: .007 Hz/sec), the signal is labeled as RFI. The zero drift tolerance is a parameter to the DX. The default value represents a drift of less than one bin during a 93 second observation at the 1 Hz resolution. For observing at higher frequencies, the 2 Hz or 4 Hz resolution may be more appropriate. For those resolutions, the bin width is 1.4 Hz and 2.8 Hz respectively, while the number of spectra per frame will be 4 and 8. The zero drift rate tolerance should be set to 0.014 Hz/sec for 2 Hz and 0.028 Hz/sec for 4 Hz.
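The relationship between bin width, observation length, and the zero drift tolerance is simple arithmetic, sketched below. The function names are hypothetical, and comparing the absolute value of the drift rate is an assumption, since the text only says "less than".

```python
def zero_drift_tolerance(bin_width_hz, obs_seconds=93.0):
    """Drift rate (Hz/sec) that moves a signal by one bin over the observation."""
    return bin_width_hz / obs_seconds


def is_zero_drift(drift_hz_per_sec, tolerance=0.007):
    """True if the drift is small enough to suggest a signal locked to a local standard."""
    return abs(drift_hz_per_sec) < tolerance


tolerances = {res: zero_drift_tolerance(width) for res, width in
              [("1 Hz", 0.7), ("2 Hz", 1.4), ("4 Hz", 2.8)]}
```

Using the approximate bin widths, the results come out close to the recommended settings of 0.007, 0.014, and 0.028 Hz/sec; the small differences reflect rounding in the quoted values.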
Drift too High
Another DX parameter is the maximum Drift Rate Tolerance. In some cases, clustered pulse signals will yield a calculated drift rate that exceeds the maximum drift rate tolerance (maxdrifttol, default: 1 Hz/sec). If the drift rate exceeds the maximum, the signal is labeled as RFI. When observing at higher frequencies, the maximum allowed drift rate will be higher, since the maximum drift rate should reflect one part per billion of the frequency (i.e. nudot/nu < 10^-9). The frequency of the signal is converted from MHz to GHz and multiplied by the maximum drift rate tolerance to get the actual maximum drift rate. The maximum drift rate may be set higher when trying to detect a fast moving spacecraft like Mars Express.
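The frequency scaling of the maximum drift rate can be written out directly. The function names are hypothetical, and taking the absolute value of the drift is an assumption.

```python
def max_drift_rate(freq_mhz, maxdrifttol=1.0):
    """Maximum allowed drift in Hz/sec: the tolerance scaled by the frequency in GHz."""
    return (freq_mhz / 1000.0) * maxdrifttol


def drift_too_high(drift_hz_per_sec, freq_mhz, maxdrifttol=1.0):
    """True if the drift exceeds the frequency-scaled maximum."""
    return abs(drift_hz_per_sec) > max_drift_rate(freq_mhz, maxdrifttol)


limit = max_drift_rate(1420.0)
```

With the default tolerance of 1 Hz/sec per GHz, a signal at 1420 MHz may drift by up to about 1.42 Hz/sec, which corresponds to one part per billion of its frequency per second.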
Signal Not in Channel
It sometimes happens that a signal crosses the boundary between two channels. The channels do not overlap in frequency so there is insufficient data to process them in the secondary confirmation process. These signals are labeled as UNKNOWN since they cannot be processed.
Any Signal that survives all the RFI mitigation tests is selected as a candidate for further processing. For Pulse Signals, a pfa (probability of false alarm) and an SNR (signal to noise ratio) are both computed. For CW signals, an additional Coherent detection algorithm is applied.
Primary Coherent Detection
For the Coherent detection algorithm, the DX creates a Confirmation Channel that consists of the 4-bit complex confirmation data for 16 subchannels, with the subchannel containing the signal placed at subchannel 8. These data are dedrifted and heterodyned by multiplying by a complex function whose value depends on the drift rate and relative frequency. The data are transformed to microbins with a minimum width of 1/(# of spectra) and a maximum width of 2.1 Hz. For each path represented by a start bin, drift rate, and microbin width (multiples of 2 of the minimum microbin), the power is summed exactly (non-DADD) and the coherence calculated. The signal path and width with the greatest power is reported. The coherence is measured by the pfa (probability of false alarm) and SNR (signal to noise ratio). In previous incarnations of the observing system, there was a threshold for the pfa of a coherent detection (cwthresh). In the past the threshold was set to -20 and the pfa had to be less than that for the signal to become a candidate. Currently, the threshold is set to 0.0, so all CW power detection Candidates are passed on as Candidates regardless of their pfa.
Candidate and Signal Reporting
The DX reports the CW and Pulse Candidates first. The handshaking involves a message announcing the start of sending CW or Pulse Candidates, followed by the Candidates, and ending with a message for done sending. Similarly, the CW and Pulse signals that are not candidates are sent, followed by the CW and Pulse Bad Bands. Finally, the CW Coherent Signal descriptions are sent.
Candidate Verification: Multi-beam Mode
In Multi-beam mode, there are 2 or 3 DXs observing the same frequency range, but with different target pointings or beams. The beam separation is a user parameter. The default value is 5 synthesized beam-widths, so that each beam can serve as an Off-source for the other beams.
Multi-beam Secondary Confirmation
After all the DXs have finished reporting, the Activity coordinates all the Activity Units with a message to send secondary Candidates, i.e. the Candidates seen by the counterpart DXs on the other beam(s). Each Activity Unit queries the database to obtain the signal descriptions for any CW Coherent or Pulse Candidates that the DXs processing the same channel on the other beams have detected. These candidates are sent to the DXs in a similar manner using start sending and done sending messages to frame the candidates. Upon receiving the counterpart candidates, the DXs apply the CW coherent detection algorithm described above to the CW signals. Since there was a 0.0 threshold for the primary coherent threshold, the primary pfa may be greater than the secondary pfa threshold, i.e. the primary signal strength is weaker than the secondary threshold. Therefore an adjustment is made to the threshold test to account for weaker signals. If the pfa adjustment (secondarypfamargin, default: 3) added to the pfa of the primary signal is less than the secondary pfa threshold (secondarycwthresh, default: -20), the secondary pfa threshold is used. On the other hand, if the adjusted secondary threshold is greater, it is used. Signals that pass the threshold test, i.e. pfa is less than threshold, are returned as "seen".
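The threshold adjustment amounts to taking the larger of two values, as in the sketch below. The function names are hypothetical; the parameter defaults are the ones quoted above.

```python
def secondary_threshold(primary_pfa, margin=3.0, secondary_thresh=-20.0):
    """Threshold for the secondary test: the larger of the fixed secondary
    threshold and the primary pfa plus the margin."""
    return max(primary_pfa + margin, secondary_thresh)


def seen_in_secondary(secondary_pfa, primary_pfa, margin=3.0, secondary_thresh=-20.0):
    """A signal is 'seen' if its secondary pfa is below the (possibly adjusted) threshold."""
    return secondary_pfa < secondary_threshold(primary_pfa, margin, secondary_thresh)


strong = secondary_threshold(-40.0)  # strong primary: fixed threshold applies
weak = secondary_threshold(-10.0)    # weak primary: adjusted threshold applies
```

A strong primary signal with a pfa of -40 is tested against the fixed threshold of -20, while a weak primary with a pfa of -10 is tested against -7, so a weak signal is not automatically reported as unseen in the other beam.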
For Pulse Signals, the individual pulses contained in the primary signal are passed to the DX. The pulse confirmation process sums the powers in the secondary confirmation data in the exact bin and spectrum of each pulse in the primary pulse train. The pfa is computed for the resultant pulse train. If the pfa is less than the secondary Pulse pfa threshold (secondarytrainsignifthresh, default:-17) the signal is returned as "seen".
The zero drift rejection and drift too high rejection are not used in the secondary confirmations. The Activity Units for each DX write the results for the secondary confirmation to the database.
After all the DXs have reported their secondary candidate results and the results have been written to the database, each Activity Unit queries the database for the results from its counterpart DXs. The DXs classify the found signals as candidates with reason "PASSED_COHERENT_THRESHOLD" for CW and "PASSED_POWER_THRESH" for pulses. The signals not found are classified as RFI with reason "FAILED_COHERENT_DETECT" for CW and "FAILED_POWER_DETECT" for pulses. The Activity Units change the reason for Candidates to "SECONDARY_FOUND_SIGNAL" (appears as SSawSig in the database). The RFI is reclassified as UNKNOWN with reason "SECONDARY_NO_SIGNAL_FOUND" (appears as SNoSigl in the database). The modified classifications are updated in the database.
The strength of a signal coming from the primary target should be reduced in power in the secondary beam because of the nulls generated by the beamformer. Therefore the Candidates are further tested by the Activity Unit to assure that a signal is not considered RFI if its strength was reduced in a manner consistent with expected null effectiveness (multitargetnulls, default: on) (nulldepth, default: 7). If the ratio of the primary signal SNR (signal to noise ratio) to the secondary signal SNR is greater than the null depth, the signal is still a Candidate. The primary candidate entries in the database are updated as follows. If the original candidates for this DX were not seen in any counterpart DX or, if seen, were reduced in strength consistent with the nulls in those beams, then they are reclassified as Candidates with reason CONFIRMED. Otherwise the signals that were seen are reclassified as RFI with reason SEEN_MULTIPLE_BEAMS (appears as SnMulBm in the database).
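The reclassification rule can be sketched as below. The text does not say whether the null depth is a plain ratio or a value in dB; this sketch treats it as a plain ratio, which is an assumption, and the function name and input layout are hypothetical.

```python
def classify_primary(primary_snr, counterpart_snrs, null_depth=7.0):
    """Reclassify a primary candidate after secondary confirmation.

    counterpart_snrs holds the (positive) SNR from each counterpart DX that saw
    the signal; it is empty if no counterpart saw it.
    """
    for snr in counterpart_snrs:
        if primary_snr / snr <= null_depth:
            # Seen in another beam without the attenuation expected from a null.
            return ("RFI", "SEEN_MULTIPLE_BEAMS")
    return ("CANDIDATE", "CONFIRMED")


unseen = classify_primary(70.0, [])
nulled = classify_primary(70.0, [5.0])
both_beams = classify_primary(70.0, [20.0])
```

A signal seen at one fourteenth of its primary strength in the other beam is consistent with the null and stays a Candidate; one seen at roughly a third of its primary strength is not, and becomes RFI.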
There is an activity parameter that controls the archiving of candidate signals. The parameter is candarch in the act subsystem. It can have the values confirmed, all, or none. If the value is confirmed, only confirmed primary candidates are archived. If the value is all, then all primary and secondary candidates are archived. If the value is none, there is no archiving. The Activity Unit sends the archive request to the DX for each candidate signal and the DX sends a 16-subchannel bandwidth of confirmation data around the candidate to the Archiver to be written to disk.
If a candidate has not been eliminated by any of the RFI mitigation techniques, then it is reobserved at the next opportunity for a new activity to start. First, an ON Observation is scheduled. Each DX is initially sent a list of the confirmed Candidates for the frequency and beam that it is observing. The Candidates must appear within ±100 Hz of the predicted frequency. If the Candidates are not seen again, they are marked as RFI. If the Candidates are reconfirmed, they are sent to the other DXs at the same frequency for secondary confirmation as above. If a Candidate is seen on 2 or more DXs, it is marked as RFI. If a Candidate is seen only on one DX, an OFF Observation is scheduled.
For the OFF observation, each DX is sent the list of confirmed Candidates for the frequency and beam that it is observing. If the candidate is seen off-source, it is marked as RFI. If a candidate is not seen off-source, another ON Observation is scheduled. The OFFs do not exchange candidates for secondary processing.
Alternating ON and OFF source Observations are scheduled as long as the Candidate continues to be confirmed or until the source sets or an observer/scientist intervenes. During the testing phase, there is a maximum of 4 on/off followup pairs, then normal observing resumes.
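The ON/OFF followup sequence can be summarized as a small state machine. This is a sketch of the rules as described, not the scheduler code: the function names, the "STILL_CANDIDATE" label, and the use of a count of DXs that saw the signal as input are hypothetical.

```python
def next_step(observation, dxs_seeing):
    """Decide what follows one followup observation.

    observation is "ON" or "OFF"; dxs_seeing is how many DXs saw the candidate.
    """
    if observation == "ON":
        # Not seen again, or seen on 2 or more DXs, means RFI.
        return "OFF" if dxs_seeing == 1 else "RFI"
    if observation == "OFF":
        # Seen off-source means RFI; otherwise go back on-source.
        return "RFI" if dxs_seeing > 0 else "ON"
    raise ValueError("observation must be 'ON' or 'OFF'")


def run_followups(outcomes, max_pairs=4):
    """Walk through alternating ON, OFF, ON, ... outcomes until RFI or the pair limit."""
    kind, pairs = "ON", 0
    for dxs_seeing in outcomes:
        step = next_step(kind, dxs_seeing)
        if step == "RFI":
            return "RFI"
        if kind == "OFF":
            pairs += 1
            if pairs >= max_pairs:
                break  # testing-phase limit reached, normal observing resumes
        kind = step
    return "STILL_CANDIDATE"


result = run_followups([1, 0, 1, 0])
```

In the example, the candidate is seen on exactly one DX in each ON and never in an OFF, so it survives two full pairs and remains a candidate.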
The signal descriptions will be accessible from the database. The scientific data will be accessible from the archive storage.