Signal path overview ==================== This section gives a logical overview of the signal path. The actual implementation is mathematically equivalent (when ignoring floating-point rounding errors), but it splits, combines, and reorders steps for efficiency; refer to the :doc:`fgpu.design`, :doc:`xbgpu.design` and :doc:`vgpu.design` sections for details. Edges in diagrams are annotated to indicate the data type. The following types are used: :samp:`i{N}` Signed integer or fixed-point with :samp:`{N}` total bits :samp:`f{N}` Floating-point with :samp:`{N}` bits, following IEEE 754-2019. :samp:`c{X}` Complex values formed from real and imaginary components of type :samp:`{X}` e.g., ``cf32`` is single-precision floating point complex. Dotted boxes and arrows represent control parameters that can be adjusted at runtime. Channelisation and delay correction ----------------------------------- Note that the input and output bit depths (shown as ``i10`` and ``ci8`` on the diagram) are configurable. Between unpacking and quantisation, all calculations are performed in single precision. Since the input has a bounded range, overflow is only possible at the quantisation step (which saturates). The figure below shows the signal path for wide-band channelisation. .. tikz:: Signal path for wide-band channelisation. :libs: chains, positioning \tikzset{ base/.style={minimum width=2.5cm, minimum height=1cm, align=center}, op/.style={draw, base}, control/.style={draw, base, rounded corners, dotted}, lbl/.style={font=\scriptsize}, every join/.style={draw,->}, >=latex, } \newcommand{\side}[2]{ \node[op, on chain, join=by {#2, edge label=f32}] (cdelay#1) {Coarse delay}; \node[op, on chain, join=by {#2, edge label=f32}] (pfb#1) {PFB}; \node[op, on chain, join=by {#2, edge label=cf32}] (eq#1) {Fine delay\\ Equalisation}; \node[op, on chain, join=by {#2, edge label=cf32}] (dither#1) {Dither}; \node[op, on chain, join=by {#2, edge label=cf32}] (quant#1) {Quantise}; } \node[op] (receive) {Receive}; \begin{scope}[start chain=chainx going below] \node[op, below left=of receive, on chain] (unpackx) {Unpack}; \side{x}{lbl, swap} \end{scope} \begin{scope}[start chain=chainy going below] \node[op, below right=of receive, on chain] (unpacky) {Unpack}; \side{y}{lbl} \end{scope} \begin{scope}[start chain=sink going below] \node[op, on chain, below right=of quantx] (pack) {Corner turn\\ Pack}; \node[op, on chain, join=by {lbl, edge label=ci8}] (transmit) {Transmit}; \end{scope} \node[control, right=of pfbx] (delays) {Delays}; \node[control, right=of eqx] (eq) {Eq coefficients}; \draw[->] (receive) -| node[lbl, very near start, auto, swap] {pol0} node[lbl, near end, auto, swap] {i10} (unpackx); \draw[->] (receive) -| node[lbl, very near start, auto] {pol1} node[lbl, near end, auto] {i10} (unpacky); \draw[->, dotted] (delays) to[lbl, auto, edge label'=i32] (cdelayx); \draw[->, dotted] (delays) to[lbl, auto, edge label=f32] (eqx); \draw[->, dotted] (delays) to[lbl, auto, edge label=i32] (cdelayy); \draw[->, dotted] (delays) to[lbl, auto, edge label'=f32] (eqy); \draw[->, dotted] (eq) to[lbl, auto, edge label'=cf32] (eqx); \draw[->, dotted] (eq) to[lbl, auto, edge label=cf32] (eqy); \draw[->] (quantx) |- node[lbl, auto, swap, near start] {ci8} (pack); \draw[->] (quanty) |- node[lbl, auto, near start] {ci8} (pack); Delay ^^^^^ Delays may be specified with sub-sample precision. To handle this, the delay is split into two components: a :dfn:`coarse` delay (a whole number of samples) and a :dfn:`fine` delay (between -0.5 and +0.5 samples). The coarse delay is applied as a shift in time, while the fine delay is applied as a phase slope in the frequency domain. As noted in :ref:`math-delay`, the user provides the overall phase adjustment for the centre frequency, and the constant term of the phase slope is computed from that (taking into account the effect of the coarse delay on phase). The fine delay and the fixed phase offset for each spectrum are computed in double precision then reduced to single precision for application. Conversion of the delay to a per-channel phase correction, and of phases to complex phasors are done in single precision. Polyphase filter bank (PFB) ^^^^^^^^^^^^^^^^^^^^^^^^^^^ A finite impulse response (FIR) filter is applied to the signal to condition the frequency-domain response. The filter is the product of a window function (to reduce spectral leakage) and a sinc (to broaden the peak to cover the frequency bin). Specifically, if there are :math:`n` output channels and :math:`t` taps in the polyphase filter bank, then the filter has length :math:`w = 2nt`, with coefficients .. math:: x_i = AW_i\operatorname{sinc}\left(w_c\cdot \frac{i + \tfrac 12 - nt}{2n}\right), where :math:`i` runs from 0 to :math:`w - 1`, and :math:`W` is the window function, for which there are two choices: - Hann: :math:`W_i = \sin^2\left(\frac{\pi i}{w - 1}\right)` - Rect: :math:`W_i = 1`. :math:`A` is a normalisation factor which is chosen such that :math:`\sum_i x_i^2 = 1`. This ensures that given white Gaussian noise as input, the expected output power in a channel is the same as the expected input power in a digitised sample. Note that the input and output are treated as integers rather than as fixed-point values. The tuning parameter :math:`w_c` (specified by the :option:`!--w-cutoff` command-line option) scales the width of the response in the frequency domain. The default value is 1, which makes the width of the response (at -6dB) approximately equal the channel spacing. In some cases spectral leakage is less important than the ability to reconstruct the original signal. Setting :math:`t = 1`, :math:`w_c = 0` and using the rectangular window function gives a degenerate PFB in which each block of :math:`2n` samples is Fourier transformed. .. _signal-path.narrow: Dithering ^^^^^^^^^ To improve linearity, a random value selected uniformly from the interval (-0.5, 0.5) is added to each component (real and imaginary) before quantisation. The random seeds are carefully chosen to ensure that random sequences are not shared across antennas. Narrowband ^^^^^^^^^^ Narrowband outputs are those in which only a portion of the digitised bandwidth is channelised and output. Typically they have narrower channel widths. The overall approach is as follows: 1. The signal is multiplied (:dfn:`mixed`) by a complex tone of the form :math:`e^{2\pi jft}`, to effect a shift in the frequency of the signal. The centre of the desired band is placed at the DC frequency. 2. The signal is convolved with a low-pass filter. This suppresses most of the unwanted parts of the band, to the extent possible with a FIR filter. 3. The signal is subsampled (every Nth sample is retained), reducing the data rate. The low-pass filter above limits aliasing. At this stage, twice as much bandwidth as desired is retained. The steps up to this one are referred to as :dfn:`digital down-conversion` (DDC). 4. The coarse delay and PFB proceed largely as before, but using double the final channel count (since the bandwidth is also doubled, the channel width is as desired). The input is now complex rather than real (due to the mixing), so the PFB is complex-to-complex rather than real-to-complex. 5. Half the channels (the outer half) are discarded. .. note:: To avoid confusion, the "subsampling factor" is the ratio of original to retained samples in the subsampling step, while the "decimation factor" is the factor by which the bandwidth is reduced. Because the mixing turns a real signal into a complex signal, the subsampling factor is twice the decimation factor in step 3 (but equal to the overall decimation factor). The decimation is thus achieved by a combination of time-domain (steps 2 and 3) and frequency domain (step 5) techniques. This has better computational efficiency than a purely frequency-domain approach (which would require the PFB to be run on the full bandwidth), while mitigating many of the filter design problems inherent in a purely time-domain approach (the roll-off of the FIR filter can be hidden in the discarded outer channels). The figure below shows the modified signal path. .. tikz:: Signal path for narrow-band channelisation (with new stages in blue). :libs: chains, positioning \tikzset{ base/.style={minimum width=2.5cm, minimum height=1cm, align=center}, op/.style={draw, base}, extra/.style={draw=blue, color=blue}, control/.style={draw, base, rounded corners, dotted}, lbl/.style={font=\scriptsize}, every join/.style={draw,->}, >=latex, } \newcommand{\side}[2]{ \node[op, extra, on chain, join=by {#2, edge label=cf32}] (ddc) {DDC}; \node[op, on chain, join=by {#2, edge label=cf32}] (cdelay#1) {Coarse delay}; \node[op, on chain, join=by {#2, edge label=cf32}] (pfb#1) {PFB}; \node[op, extra, on chain, join=by {#2, edge label=cf32}] (discard#1) {Discard\\ channels}; \node[op, on chain, join=by {#2, edge label=cf32}] (eq#1) {Fine delay\\ Equalisation}; \node[op, on chain, join=by {#2, edge label=cf32}] (dither#1) {Dither}; \node[op, on chain, join=by {#2, edge label=cf32}] (quant#1) {Quantise}; } \node[op] (receive) {Receive}; \begin{scope}[start chain=chainx going below] \node[op, below left=of receive, on chain] (unpackx) {Unpack}; \side{x}{lbl, swap} \end{scope} \begin{scope}[start chain=chainy going below] \node[op, below right=of receive, on chain] (unpacky) {Unpack}; \side{y}{lbl} \end{scope} \begin{scope}[start chain=sink going below] \node[op, on chain, below right=of quantx] (pack) {Corner turn\\ Pack}; \node[op, on chain, join=by {lbl, edge label=ci8}] (transmit) {Transmit}; \end{scope} \node[control, right=of pfbx] (delays) {Delays}; \node[control, right=of eqx] (eq) {Eq coefficients}; \draw[->] (receive) -| node[lbl, very near start, auto, swap] {pol0} node[lbl, near end, auto, swap] {i10} (unpackx); \draw[->] (receive) -| node[lbl, very near start, auto] {pol1} node[lbl, near end, auto] {i10} (unpacky); \draw[->, dotted] (delays) to[lbl, auto, edge label'=i32] (cdelayx); \draw[->, dotted] (delays) to[lbl, auto, edge label=f32] (eqx); \draw[->, dotted] (delays) to[lbl, auto, edge label=i32] (cdelayy); \draw[->, dotted] (delays) to[lbl, auto, edge label'=f32] (eqy); \draw[->, dotted] (eq) to[lbl, auto, edge label'=cf32] (eqx); \draw[->, dotted] (eq) to[lbl, auto, edge label=cf32] (eqy); \draw[->] (quantx) |- node[lbl, auto, swap, near start] {ci8} (pack); \draw[->] (quanty) |- node[lbl, auto, near start] {ci8} (pack); Discarding half the channels after channelisation allows for a lot of freedom in the design of the DDC FIR filter: the discarded channels can have an arbitrary response. This allows for a gradual transition from passband to stopband. We use :func:`scipy.signal.remez` to produce a filter that is as close as possible to 1 in the passband and 0 in the stopband. A weighting factor (which the user can override) balances the priority of the passband (ripple) and stopband (alias suppression). The filter performance is slightly improved by noting that the discarded channels have multiple aliases, and the filter response in those aliases is also irrelevant. We thus use :func:`scipy.signal.remez` to only optimise the response to those channels that alias into the output. Narrowband without discard ~~~~~~~~~~~~~~~~~~~~~~~~~~ The above combined time-frequency approach to narrowband can be disabled, giving a purely time-domain FIR filter. In this case, step 5 is skipped. The primary use case is for reconstructing a time-domain signal from the channelised output, where completely discarding channels appears to lose necessary information. The filter design in this case is more critical, and needs to trade off factors such as passband ripple, roll-off, and alias rejection. The user provides the desired pass bandwidth. We then use :func:`scipy.signal.remez` with the stop bandwidth calculated such that the roll-off is symmetrically located around the Nyquist frequency. This means that after subsampling, half of the roll-off region will alias, but it will not alias with the passband. As before, users can override a weighting factor that balances the priority of the passband and stopband. It may be possible to improve this further by leaving other aliases of the roll-off region unconstrained, as was done above, but this has not currently been investigated. Correlation ----------- Given a baseline (p, q) and time-varying channelised voltages :math:`e_p` and :math:`e_q`, the correlation product is the sum of :math:`e_p \overline{e_q}` over the accumulation period. This is computed in integer arithmetic and so is lossless except when saturation occurs. The figure below shows the signal path. .. tikz:: Signal path for correlation :libs: chains \tikzset{ base/.style={minimum width=2.5cm, minimum height=1cm, align=center}, op/.style={draw, base}, control/.style={draw, base, rounded corners, dotted}, lbl/.style={font=\scriptsize}, every join/.style={draw,->}, >=latex, } \begin{scope}[start chain=going below] \node[op, on chain] {Receive}; \node[op, on chain, join=by {lbl,edge label=ci8}] {Correlate\\ Accumulate}; \node[op, on chain, join=by {lbl,edge label=ci64}] {Saturate}; \node[op, on chain, join=by {lbl,edge label=ci32}] {Transmit}; \end{scope} Beamforming ----------- The signal path below is repeated for each single-polarisation beam. Delays are computed purely with a phase slope in the frequency domain, similarly to the fine delays in the channeliser. Dithering is done the same way as for channelisation. Since all calculations are performed in single precision floating point and the input has a limited range, overflow can only occur during quantisation (which saturates). .. tikz:: Signal path for beamforming :libs: chains \tikzset{ base/.style={minimum width=2.5cm, minimum height=1cm, align=center}, op/.style={draw, base}, control/.style={draw, base, rounded corners, dotted}, lbl/.style={font=\scriptsize}, every join/.style={draw,->}, >=latex, } \begin{scope}[start chain=going below] \node[op, on chain] {Receive}; \node[op, on chain, join=by {lbl,edge label=ci8}] (mult) {Taper/Scale\\ Delay}; \node[op, on chain, join=by {lbl,edge label=cf32}] {Sum}; \node[op, on chain, join=by {lbl,edge label=cf32}] {Dither}; \node[op, on chain, join=by {lbl,edge label=cf32}] {Quantise}; \node[op, on chain, join=by {lbl,edge label=ci8}] {Transmit}; \node[control, above right=of mult] (taper) {Tapering\\ coefficients}; \node[control, right=of mult] (gain) {Requantisation\\ gain}; \node[control, below right=of mult] (delay) {Delays}; \draw[->, dotted] (taper) to[lbl, near start, edge label=f32] (mult); \draw[->, dotted] (gain) to[lbl, edge label=f32] (mult); \draw[->, dotted] (delay) to[lbl, near start, edge label'=f32] (mult); \end{scope} VLBI Resampling --------------- The VLBI resampler (V-Engine) performs the following functions: 1. Channelised beamformer data is converted back to the time domain using a Discrete Fourier Transform (DFT). To obtain reasonable results, the channeliser must be configured to use a DFT, which can be done by passing ``taps=1,w_cutoff=0.0,window_function=rect`` when configuring the stream. 2. A bandpass filter is used to reduce the bandwidth. The input must have already been mixed to place the desired centre frequency at DC. The narrowband mode of fgpu will do this. 3. If necessary, a Jones matrix is applied to alter the polarisation basis. This does not implement parallactic angle correction, but it can for example convert linear to circular polarisation. 4. The signal is separated into real sideband signals corresponding to the positive and negative frequencies of the complex signal. At this point there are four real signals: lower and upper sideband, for each polarisation. The remaining processing treats these four signals independently. 5. The signal power level is normalised. Samples are chunked into fixed-length intervals (e.g., 1 second), and the power of each interval is scaled so that the voltages have a standard deviation of 1.0. 6. The signals are quantised and encoded into VDIF frames. .. todo:: Add a diagram here