katgpucbf.xbgpu.correlation module
Module wrapping the ASTRON Tensor-Core Correlation Kernels in the MeerKAT katsdpsigproc framework.
Todo
Eventually modify the classes to support 4-bit and 16-bit input samples. The kernel supports these, but they are not exposed to the user. There is no use case for this at the moment, so it is a low priority.
- class katgpucbf.xbgpu.correlation.Correlation(template: CorrelationTemplate, command_queue: AbstractCommandQueue, n_batches: int)
Bases: Operation
Tensor-Core correlation kernel.
Specifies the shape of the input sample and output visibility buffers required by the kernel. The parameters specified in the CorrelationTemplate object are used to determine the shape of the buffers. There is an outer-most dimension called “batches”, over which the operation is parallelised. Not all batches need to be processed every time: set the first_batch and last_batch attributes to control which batches will be computed.
The input sample buffer must have the shape:
[n_batches][n_ants][channels][spectra_per_heap][polarisations]
There is an alignment requirement for spectra_per_heap due to the implementation details of the kernel. For 8-bit input mode, spectra_per_heap must be a multiple of 16.
Each input element is a complex 8-bit integer sample. numpy does not support 8-bit complex numbers, so the dimensionality is extended by 1, with the last dimension sized 2 to represent the complexity.
With 8-bit input samples, the value -128i is not supported by the kernel as there is no 8-bit complex conjugate representation of this number. Passing -128i into the kernel will produce incorrect values at the output.
The output visibility buffer must have the shape [channels][baselines][COMPLEX]. In 8-bit mode, each element in this visibility matrix is a 32-bit integer value.
Calling this object does not directly update the output. Instead, it updates an intermediate buffer (called mid_visibilities). To produce the output, call reduce(). This function can also flag data that was missing during the accumulation, by writing a special value. This is controlled by the present_baselines slot, which has one boolean entry per baseline (antenna pair).
Currently only 8-bit input mode is supported.
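The buffer constraints above can be checked before any data reaches the GPU. The following is a minimal sketch in plain Python, not part of katgpucbf: the names input_shape and has_unsupported_sample are hypothetical, and it assumes two polarisations per antenna and that the unsupported value is a sample whose imaginary part is -128, as the text describes.

```python
# Illustrative helpers only; these names are NOT part of katgpucbf.
ALIGNMENT_8BIT = 16  # spectra_per_heap must be a multiple of this in 8-bit mode
N_POLS = 2           # assumption: each antenna produces two polarisations
COMPLEX = 2          # trailing dimension holding (real, imaginary)


def input_shape(n_batches, n_ants, n_channels, n_spectra_per_heap):
    """Return the input buffer shape, including the trailing complex dimension."""
    if n_spectra_per_heap % ALIGNMENT_8BIT != 0:
        raise ValueError(
            f"spectra_per_heap must be a multiple of {ALIGNMENT_8BIT}, "
            f"got {n_spectra_per_heap}"
        )
    return (n_batches, n_ants, n_channels, n_spectra_per_heap, N_POLS, COMPLEX)


def has_unsupported_sample(samples):
    """Return True if any (real, imag) pair has an imaginary part of -128.

    Negating -128 overflows a signed 8-bit integer, so such a sample has
    no 8-bit complex conjugate.
    """
    return any(imag == -128 for _real, imag in samples)
```

For example, input_shape(2, 4, 8, 32) gives (2, 4, 8, 32, 2, 2), while a spectra_per_heap of 20 raises ValueError.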
- static get_baseline_index(ant1: int, ant2: int) → int
Get index in the visibilities matrix for baseline (ant1, ant2).
The visibilities matrix indexing is as follows:
    ant2 =     0   1   2   3   4
             +-------------------
    ant1 = 0 | 00  01  03  06  10
           1 |     02  04  07  11
           2 |         05  08  12
           3 |             09  13
           4 |                 14
This function requires that \(ant2 \ge ant1\).
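The triangular layout in the table is consistent with the closed form ant2 * (ant2 + 1) / 2 + ant1. The sketch below reproduces the table from that formula. It is inferred from the table alone, and the function name baseline_index is hypothetical, so it should not be read as the library's own implementation.

```python
def baseline_index(ant1: int, ant2: int) -> int:
    """Index into the triangular visibility layout shown in the table.

    Column ant2 starts at the triangular number ant2 * (ant2 + 1) // 2,
    and ant1 selects the row within that column.
    """
    if not 0 <= ant1 <= ant2:
        raise ValueError("requires 0 <= ant1 <= ant2")
    return ant2 * (ant2 + 1) // 2 + ant1


# Rebuild the 5-antenna table column by column.
for ant2 in range(5):
    print(ant2, [baseline_index(ant1, ant2) for ant1 in range(ant2 + 1)])
```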
- class katgpucbf.xbgpu.correlation.CorrelationTemplate(context: AbstractContext, n_ants: int, n_channels_per_substream: int, n_spectra_per_heap: int, input_sample_bits: int)
Bases: object
Template class for the Tensor-Core correlation kernel.
The template creates a Correlation that will run the compiled kernel. The parameters are used to compile the kernel, and by the Correlation to specify the shape of the memory buffers connected to this kernel.
The number of baselines calculated here differs from the canonical count used in radio astronomy, which is:
\[n_{baselines} = \frac{n_{ants} \cdot (n_{ants} + 1)}{2}\]
Because we have a dual-polarised telescope, we calculate four ‘baselines’ for each canonical baseline as calculated above, namely \(h_1 h_2\), \(h_1 v_2\), \(v_1 h_2\), and \(v_1 v_2\). So the list of baselines appears four times as long as you might expect.
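As a worked example of the count described above, the following sketch computes both the canonical number of baselines and the four-times-longer list of polarisation products. The function names are hypothetical and are not part of katgpucbf.

```python
def canonical_baselines(n_ants: int) -> int:
    """Antenna pairs including autocorrelations: n_ants * (n_ants + 1) / 2."""
    return n_ants * (n_ants + 1) // 2


def polarisation_baselines(n_ants: int) -> int:
    """Four polarisation products (hh, hv, vh, vv) per canonical baseline."""
    return 4 * canonical_baselines(n_ants)


for n_ants in (4, 64):
    print(n_ants, canonical_baselines(n_ants), polarisation_baselines(n_ants))
```

With 4 antennas there are 10 canonical baselines and 40 polarisation products; with 64 antennas, 2080 and 8320.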
- Parameters:
n_ants – The number of antennas that will be correlated. Each antenna is expected to produce two polarisations.
n_channels_per_substream – The number of frequency channels to be processed.
n_spectra_per_heap – The number of time samples to be processed per frequency channel.
input_sample_bits – The number of bits per input sample. Only 8-bit samples are supported at the moment.
- instantiate(command_queue: AbstractCommandQueue, n_batches: int) → Correlation
Create a Correlation using this template to build the kernel.
- katgpucbf.xbgpu.correlation.MIN_COMPUTE_CAPABILITY = (7, 2)
Minimum CUDA compute capability needed for the kernel (with 8-bit samples)
- katgpucbf.xbgpu.correlation.MISSING = array([-2147483648, 1], dtype=int32)
Magic value indicating missing data
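A consumer of the output can recognise flagged data by comparing each (real, imaginary) pair against this value. The sketch below uses plain Python tuples instead of a numpy array so that it stands alone; the helper missing_baselines is hypothetical and only illustrates the comparison.

```python
# Same values as katgpucbf.xbgpu.correlation.MISSING, as a plain tuple.
MISSING = (-2147483648, 1)


def missing_baselines(visibilities):
    """Return indices of baselines whose (real, imag) pair equals MISSING.

    `visibilities` is a sequence of (real, imag) integer pairs for one channel.
    """
    return [i for i, vis in enumerate(visibilities) if tuple(vis) == MISSING]


channel = [(10, -3), (-2147483648, 1), (0, 0), (-2147483648, 1)]
print(missing_baselines(channel))
```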
- katgpucbf.xbgpu.correlation.device_filter(device: AbstractDevice) → bool
Determine whether a device is suitable for running the kernel.
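The criteria applied by device_filter are not spelled out here. One plausible ingredient, given MIN_COMPUTE_CAPABILITY, is a tuple comparison of the device's CUDA compute capability against that minimum. The sketch below shows only that comparison; the function meets_min_capability is hypothetical, and the real filter may apply further checks.

```python
# Same value as katgpucbf.xbgpu.correlation.MIN_COMPUTE_CAPABILITY.
MIN_COMPUTE_CAPABILITY = (7, 2)


def meets_min_capability(compute_capability, minimum=MIN_COMPUTE_CAPABILITY):
    """Compare (major, minor) tuples lexicographically: major first, then minor."""
    return tuple(compute_capability) >= tuple(minimum)


for capability in [(7, 0), (7, 2), (7, 5), (8, 0)]:
    print(capability, meets_min_capability(capability))
```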