Appendix C. Decoded reference tables

This appendix collects the decoded values cited throughout the guide into one reference: enum tokens, struct layouts, status codes, the register-init table, command table, register map, and program-container schema. Each section begins with its table and names the research-corpus file that contains the full set when only a representative excerpt is reproduced here.

Several tables are reproduced in full. The rest give the representative and structurally important rows of a larger table that runs to hundreds or thousands of rows, with the full set in the research corpus. Every value here is read out of an M1/H13 binary by static analysis.

C.1 Operation-attribute enum tokens to integer values

These are the integer codes the compiler resolves attribute string tokens to. Table C.1 is the front-end activation token map (MILOpConverter::NeuronTypeFromString, 33 entries, miss resolves to 0).

token | int | token | int
relu | 1 | gelu | 23
leaky_relu | 2 | degamma | 25
clamped_relu | 3 | trunc | 26
relu_n (relu6) | 4 | round_nearest | 27
sigmoid | 5 | floor | 28
sigmoid_high_precision | 6 | ceil | 29
tanh | 7 | erf | 31
silu (alias swish) | 9 | threshold_relu | 32
swish_hard | 10 | gamma | 33
sqr | 11 | inv | 14
sqrt | 12 | log2 | 15
rsqrt | 13 | exp2 | 16
elu | 18 | exp | 17
sin | 20 | sign | 19
cos | 21 | sigmoid_hard | 22

Table C.1. The front-end activation token names and the integer neuron codes they resolve to.

The serialization dtype enum (ANECIRDataType, the dtype / storage_type attribute) is an 11-code space, given as table C.2.

int | type | int | type
0 | int4 | 6 | uint16
1 | uint8 | 7 | int32
2 | int8 | 8 | uint32
3 | fp16 | 9 | int64
4 | fp32 | 10 | uint64
5 | int16 | |

Table C.2. The serialization data-type codes and the element types they name.

The MLIR symbolize* enums are resolved by a fully static, length-dispatched chain of packed-integer string compares, with each miss resolving to 0. Table C.3 collects them.

attribute | token | int
padding_style | EXPLICIT / TF_VALID / TF_SAME / EXPLICIT_OFFSET / ONNX_SAME_LOWER | 0 / 1 / 2 / 3 / 4
nearest_rounding_mode | round_prefer_ceil / round_prefer_floor / ceil / floor / round_to_even / round_to_odd | 0 / 1 / 2 / 3 / 4 / 5
reduce_op | min / max / sum / prod / argMin / argMax | 0 / 1 / 2 / 3 / 4 / 5
data_layout | NCHW / NHWC / OIHW / HWIO / CHW / HWC / HW | 0 / 1 / 2 / 3 / 4 / 5 / 6
data_layout | NCDHW / NDHWC / OIDHW / DHWIO | 7 / 8 / 9 / 10
scatter mode | add / subtract / multiply / divide / min / max / set | 0 / 1 / 2 / 3 / 4 / 5 / 6
pool indices_mode | GlobalFlatten1D..4D / LocalFlatten1D..4D | 0..3 / 4..7
RNN gate activation | none / relu / tanh / sigmoid / hard_sigmoid / scaled_tanh | 0 / 1 / 2 / 3 / 4 / 5
stencil padding_mode | constant / mirror / mirrorWithEdge / clampToEdge / zero / periodic / antiPeriodic | 0 / 1 / 2 / 3 / 4 / 5 / 6
pixel_format | R8Unorm / RG8Unorm / RGBA8Unorm / BGRA8Unorm / R16Float | 0 / 1 / 2 / 3 / 4
pixel_format | RG16Float / RGBA16Float / R32Float / RG32Float / RGBA32Float | 5 / 6 / 7 / 8 / 9
arith::RoundingMode | to_nearest_even / downward / upward / toward_zero / to_nearest_away | 0 / 1 / 2 / 3 / 4
collective reduction | sum / max / min / prod / mean | 1 / 2 / 3 / 4 / 5

Table C.3. The symbolized attribute tokens and the integer values each one maps to.
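The lookup style described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the compiler's code: the helper names (pack_token, build_table, symbolize) are hypothetical, the little-endian packing order is an assumption, and only the reduce_op row of table C.3 is used.

```python
def pack_token(token: str) -> int:
    """Pack a token's bytes into one integer (little-endian order is an assumption)."""
    return int.from_bytes(token.encode("utf-8"), "little")


def build_table(tokens):
    """Group (packed, value) pairs by byte length; the value is the list index."""
    by_length = {}
    for value, token in enumerate(tokens):
        length = len(token.encode("utf-8"))
        by_length.setdefault(length, []).append((pack_token(token), value))
    return by_length


# reduce_op row of table C.3: min / max / sum / prod / argMin / argMax -> 0..5
REDUCE_OP = build_table(["min", "max", "sum", "prod", "argMin", "argMax"])


def symbolize(table, token: str) -> int:
    """Dispatch on length first, then compare packed integers; a miss returns 0."""
    raw = token.encode("utf-8")
    key = int.from_bytes(raw, "little")
    for packed, value in table.get(len(raw), ()):
        if packed == key:
            return value
    return 0
```

Because a miss resolves to 0, the result alone cannot distinguish an unknown token from the token whose value is 0, so a caller that needs to tell them apart has to check membership separately.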

A second class of enum, the internal Zin enums the lowering layer dispatches on, is stored as a packed pointer table in the data segment. Table C.4 lists them.

enum | value to token
ZinIrPoolingType | 1 Avg, 2 Max, 3 ChannelMax, 4 Min, 5 ChannelMin, 6 L1, 7 L2, 8 SpatialAndChannelAvg, 9 SpatialAndChannelMax, 10 SpatialAndChannelMin, 11 SpatialArgMax, 12 ChannelArgMax, 13 SpatialArgMin, 14 ChannelArgMin
ZinIrReductionType | 0 Sum, 1 Min, 2 Max, 3 Avg, 4 SatSum, 5 SatSub, 6 ArgMin, 7 ArgMax, 8 BitwiseAnd, 9 BitwiseOr, 10 BitwiseXor
ZinIrEWType | 1 Add, 2 Mult, 3 Square, 4 Sub, 5 Power, 6 Div, 7 Max, 8 Min, 9 Abs, 10 EqualZero, 11 NotEqualZero, 12 LessThanZero, 13 LessThanEqualZero, 14 GreaterThanEqualZero, 15 GreaterThanZero, 16 Equal, 17 NotEqual, 18 LessThan, 19 LessThanEqual, 20 GreaterThanEqual, 21 GreaterThan
ZinIrScaledEWType | 1 Add, 2 Mult, 3 SumSquare, 4 Max, 5 Min
ZinIrPaddingMode | 1 Zero, 2 Negative, 3 Replication, 5 Symmetric, 6 Reflective, 7 Background, 8 DontCare
ZinIrSamplingMethod | 0 Linear, 1 NearestNeighbor
ZinIrSamplingGridMode | 0 AlignedCorners, 1 UnalignedCorners, 2 OffsetCorners, 3 Default, 4 OffsetDefault, 5 OffsetDefaultWithNominalScale, 6 StrictAlignedCorners
ZinIrCoordinateMode | 0 NonNormalized, 1 NormalizedSymmetric, 2 NormalizedReflect
ZinArgMode | 1 SpatialArgMin, 2 ChannelArgMin, 3 SpatialArgMax, 4 ChannelArgMax
ZinIrSortDirection | 0 Invalid, 1 Ascending, 2 Descending
ZinIrTopKType | 0 Invalid, 1 Min, 2 Max
ZinIrFlattenType | 1 NCHW, 2 NHWC
ZinIrDimension | 0 N, 1 D, 2 C, 3 H, 4 W

Table C.4. The internal Zin enum value-to-token maps the lowering layer dispatches on.

The op-class selector is the ZinUnitType table, 79 entries, the engine-unit each layer routes to, given in full as table C.5.

int | unit | int | unit | int | unit
1 | Conv | 28 | LayerNormalization | 55 | RandomGenerator
2 | Pooling | 29 | LocalResponseNormalization | 56 | Alias
3 | Concat | 30 | CostVolume | 57 | CrossProduct
4 | ElementWise | 31 | PixelShuffle | 58 | Quant
5 | ScaledElementWise | 32 | PixelUnshuffle | 59 | DeQuant
6 | Neuron | 33 | FurthestPointSampling | 60 | Linear
7 | NeuronCustom | 34 | SpaceToBatch | 61 | RingBufferWriter
8 | GOC | 35 | BatchToSpace | 62 | RingBufferReader
9 | DynamicGOC | 36 | SpaceToChannel | 63 | BatchNorm
10 | ConstMatrixMatrixMult | 37 | ChannelToSpace | 64 | Phi
11 | Flatten | 38 | RadiusSearch | 65 | Condition
12 | Unflatten | 39 | Gather | 66 | WaitForEvent
13 | CrossCorrelation | 40 | AffineTransform | 67 | SignalEvent
14 | KernelRasterizer | 41 | Resize | 68 | NEConv
15 | ArgMinMax | 42 | ResizeAs | 69 | NEMatMul
16 | GlobalArgMinMax | 43 | Resample | 70 | NEPool
17 | InputView | 44 | Padding | 71 | NEBypass
18 | MatrixMultiplication | 45 | Tile | 72 | PEPool
19 | Broadcast | 46 | CropResize | 73 | PEElementWise
20 | Reduction | 47 | DynamicSlice | 74 | PEGOC
21 | Transpose | 48 | PlaneReader | 75 | AllSlice
22 | Reshape | 49 | PlaneWriter | 76 | AllGather
23 | Shape | 50 | Sort | 77 | SDPA
24 | Softmax | 51 | TopK | 78 | AllReduce
25 | InstanceNormalization | 52 | NMS | 79 | FunctionCall
26 | L2Normalization | 53 | MatrixDecomposition | |
27 | MinMaxNormalization | 54 | Dropout | |

Table C.5. The complete 79-entry ZinUnitType op-class table and the engine unit each value selects.

The activation non-linear-mode space is a parallel table, NonLinearModeToString, 48 slots indexed directly by the lower hardware mode value, reproduced as table C.6.

idx | mode | idx | mode | idx | mode
0 | none | 16 | rsqrt | 32 | sin
1 | relu | 17 | clamped_relu_rsqrt | 33 | cos
2 | sigmoid | 18 | inv | 34 | gelu
3 | sigmoid_high_precision | 19 | sqr | 35 | gelu_sigmoid_approximation
4 | relu_sigmoid | 20 | log2 | 36 | degamma
5 | sigmoid_hard | 21 | exp2 | 37 | round_nearest
6 | tanh | 22 | exp | 38 | trunc
7 | clamped_relu | 23 | elu | 39 | floor
8 | prelu | 24 | sign | 40 | ceil
9 | relun | 25 | equal_zero | 41 | atan
10 | swish | 26 | not_equal_zero | 42 | atan_part1
11 | swish_hard | 27 | less_than_zero | 43 | atan_part2
12 | dirac | 28 | less_than_equal_zero | 44 | erf
13 | int | 29 | greater_than_equal_zero | 45 | thresholded_relu
14 | frac | 30 | greater_than_zero | 46 | gamma
15 | sqrt | 31 | custom_lut | 47 | abs

Table C.6. The complete 48-slot NonLinearModeToString table indexed by hardware non-linear-mode value.

The micro-op opcode space (ZinIrOpLayerOpCodeType, 126 codes, 0x00..0x7d) is what the task-descriptor builder dispatches on. Table C.7 gives it in full.

op | dec | string | op | dec | string
0x00 | 0 | CONV | 0x3f | 63 | AFFINE_TRANFORM
0x01 | 1 | POOL | 0x40 | 64 | PLANE_READER
0x02 | 2 | SCALE_BIAS | 0x41 | 65 | PLANE_WRITER
0x03 | 3 | TERNARY_DYNAMIC_GOC | 0x42 | 66 | SORT
0x04 | 4 | ACTIVATION | 0x43 | 67 | TOP_K
0x05 | 5 | EW | 0x44 | 68 | RCAS
0x06 | 6 | SCALED_EW | 0x45 | 69 | INDEX
0x07 | 7 | CONCAT | 0x46 | 70 | NMS
0x08 | 8 | SPLIT | 0x47 | 71 | DROPOUT
0x09 | 9 | COPY | 0x48 | 72 | TYPE_CAST
0x0a | 10 | FLATTEN | 0x49 | 73 | STOCHASTIC_ROUND
0x0b | 11 | UNFLATTEN | 0x4a | 74 | RANDOM_GENERATOR
0x0c | 12 | CROSS_CORRELATION | 0x4b | 75 | LINEAR
0x0d | 13 | CROSS_PRODUCT | 0x4c | 76 | RINGBUFFER_WRITER
0x0e | 14 | KERNEL_RASTERIZER | 0x4d | 77 | RINGBUFFER_READER
0x0f | 15 | ARG_MIN_MAX | 0x4e | 78 | CONDITION
0x10 | 16 | GLOBAL_ARG_MIN_MAX | 0x4f | 79 | PHI
0x11 | 17 | MATRIX_MULT | 0x50 | 80 | BASICBLOCK_IN
0x12 | 18 | BROADCAST | 0x51 | 81 | BASICBLOCK_OUT
0x13 | 19 | FLATTEN_COMPOSITE | 0x52 | 82 | BATCHNORM
0x14 | 20 | UNFLATTEN_COMPOSITE | 0x53 | 83 | WAIT_FOR_EVENT
0x15 | 21 | FPS_WITH_RADIUS_COMPOSITE | 0x54 | 84 | SIGNAL_EVENT
0x16 | 22 | PIXEL_SHUFFLE_COMPOSITE | 0x55 | 85 | ALL_SLICE
0x17 | 23 | PIXEL_UNSHUFFLE_COMPOSITE | 0x56 | 86 | ALL_GATHER
0x18 | 24 | CONV_COMPOSITE | 0x57 | 87 | SCALED_DOT_PRODUCT_ATTENTION
0x19 | 25 | MATDECOMP_MATMULT_COMPOSITE | 0x58 | 88 | ALL_REDUCE
0x1a | 26 | CHANNEL_TO_SPACE_LARGE_FACTOR_COMPOSITE | 0x59 | 89 | PEFUSED_ELEMENTWISE
0x1b | 27 | LIVE_IN | 0x5a | 90 | PEFUSED_SECUREFLUSH
0x1c | 28 | LIVEIN_PARAM | 0x5b | 91 | PEFUSED_POOL
0x1d | 29 | CONST_IN | 0x5c | 92 | PEFUSED_GOC
0x1e | 30 | LIVE_STATE | 0x5d | 93 | NEFUSED_CONV
0x1f | 31 | LIVE_OUT | 0x5e | 94 | NEFUSED_KERNEL_RASTERIZER
0x20 | 32 | REDUCTION | 0x5f | 95 | NEFUSED_CROSS_CORRELATION
0x21 | 33 | ALIAS | 0x60 | 96 | NEFUSED_MATMUL
0x22 | 34 | REINTERPRET_INNERMOST_DIMENSION | 0x61 | 97 | NEFUSED_POOL
0x23 | 35 | REINTERPRET_CAST | 0x62 | 98 | NEFUSED_EW
0x24 | 36 | RESHAPE | 0x63 | 99 | NEFUSED_DUAL_SOURCE_EW
0x25 | 37 | VIEW | 0x64 | 100 | NEFUSED_BYPASS
0x26 | 38 | TRANSPOSE | 0x65 | 101 | NEFUSED_RCAS
0x27 | 39 | SPACE_TO_BATCH | 0x66 | 102 | TRANSPOSE_ENGINE_OP
0x28 | 40 | BATCH_TO_SPACE | 0x67 | 103 | TE_RESAMPLE
0x29 | 41 | SPACE_TO_CHANNEL | 0x68 | 104 | TE_AFFINE_TRANSFORM
0x2a | 42 | CHANNEL_TO_SPACE | 0x69 | 105 | TE_PAD
0x2b | 43 | SOFTMAX | 0x6a | 106 | TE_CROP_RESIZE
0x2c | 44 | INSTANCE_NORM | 0x6b | 107 | TE_SLICE
0x2d | 45 | L2_NORM | 0x6c | 108 | TE_GATHER
0x2e | 46 | MINMAX_NORM | 0x6d | 109 | TE_RESIZE
0x2f | 47 | LAYER_NORM | 0x6e | 110 | TM_WAIT_FOR_EVENT
0x30 | 48 | LRN | 0x6f | 111 | TM_SIGNAL_EVENT
0x31 | 49 | COST_VOLUME | 0x70 | 112 | TM_BRANCH
0x32 | 50 | PIXEL_SHUFFLE | 0x71 | 113 | TM_FETCH
0x33 | 51 | PIXEL_UNSHUFFLE | 0x72 | 114 | TM_STORE
0x34 | 52 | MATRIX_DECOMPOSITION | 0x73 | 115 | TM_OPERATE
0x35 | 53 | FPS | 0x74 | 116 | TM_USER_SLOT_LOAD
0x36 | 54 | RS | 0x75 | 117 | DMA_CONVERT
0x37 | 55 | RESAMPLE | 0x76 | 118 | QUANT
0x38 | 56 | GATHER | 0x77 | 119 | DEQUANT
0x39 | 57 | TILE | 0x78 | 120 | SNE_COND
0x3a | 58 | SLICE | 0x79 | 121 | SNE_GOC
0x3b | 59 | PAD | 0x7a | 122 | CCDMA_CONST
0x3c | 60 | RESIZE | 0x7b | 123 | CCDMA_MEMORY
0x3d | 61 | RESIZEAS | 0x7c | 124 | SPILL_FILL_DUMMY
0x3e | 62 | CROP_RESIZE | 0x7d | 125 | INVALID

Table C.7. The complete 126-entry micro-operation opcode table and the layer-kind name each value dispatches on.

The opcode 0x3f is spelled AFFINE_TRANFORM in the binary, a vendor source typo preserved on the wire.

C.2 Operation-attribute schema and IOKit external-method struct layouts

The attribute schema is string-keyed: the token is the wire encoding the compiler matches on, and most integer constants are not recoverable statically. The converter recognizes 171 literal attribute keys, of which roughly 140 are op-facing. Table C.8 gives representative keys with their value types, meanings, and wire encodings.

key | type | meaning | value encoding
activation | enum | neuron mode | token into the 22-field PWL descriptor
strides | int[] | per-axis stride | NDCHW int array; deconv restricted to {1,2}
groups | int | group count | channel-wise requires groups == out.C
padding_mode | enum | fill rule | constant / reflect / replicate / symmetric
weights_layout | enum | weight axis order | NCHW / NHWC / OIHW / HWIO; weight buffer is MACI
compressed | bool/enum | weight compression | format set by the MIL-op-count contract
interleave | int | channel-tiling quantum | one of {1,2,3,4,8}
epsilon | float | norm stability | scalar

Table C.8. Representative operation-attribute keys with their value types, meanings, and wire encodings.

The host-to-kernel IOKit dispatch key is the (selector, struct-size) tuple, not the selector alone. Table C.9 gives the control-client selectors with their method names and decoded input and output struct sizes.

sel | method | in-struct | out/scalar
0 | ANE_DeviceOpen | 104 | 104
2 | ANE_ProgramSendRequest | 2376 + 1 scalar | 40 async
3 | ANE_ProgramCreate | 32 | 0
4 | ANE_ProgramPrepare | 56 | 56
6 | ANE_ProgramDestroy | 16 | 0
7 | ANE_GetStatus | 0 | 32
8 | ANE_ProgramCreateInstance | 32 | 0
10 | ANE_GetVersion | 0 | 1 scalar
21 | ANE_ProgramInputsReady | 3104 | 0
22 | ANE_MemoryMapRequest | 2080 + 1 scalar | 1 scalar

Table C.9. The IOKit control-client selectors with their method names and decoded input and output struct sizes.
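Keying dispatch on the (selector, struct-size) tuple rather than on the selector alone can be sketched as a dictionary lookup. This is a minimal illustration, assuming the size in the key is the input-struct size. The table name and the lookup helper are hypothetical, and only three rows of table C.9 are included.

```python
# (selector, input struct size) -> method name, a subset of table C.9.
# Treating the key's size as the input-struct size is an assumption.
CONTROL_METHODS = {
    (0, 104): "ANE_DeviceOpen",
    (4, 56): "ANE_ProgramPrepare",
    (7, 0): "ANE_GetStatus",
}


def lookup_method(selector: int, in_struct_size: int):
    """Return the method name for an exact (selector, size) match, else None."""
    return CONTROL_METHODS.get((selector, in_struct_size))
```

Under this model a call with the right selector but the wrong struct size finds no method, which is the practical consequence of the tuple key.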

The ANEDeviceOpen shared in/out buffer (104 bytes, selector 0) decodes by byte offset as table C.10.

offset | field
+0x00 | usage type (1 standard, 2 unsupported) plus session token
+0x08 | callback function pointer
+0x10 | receiver context pointer
+0x18 | timeout 0x2710 (10000)
+0x48 | version pair 32, 256
+0x50 | NumANEs 0, 1

Table C.10. The field layout of the ANEDeviceOpen shared input and output buffer by byte offset.

The full 171-key attribute corpus, the decoded enum-value tables, and the per-selector field layouts for the HW direct-path client are in the research corpus.

C.3 Numeric error, status, and return codes

The ANE stack has no flat numeric status enum. The only fixed numeric values are the IOKit return constants and the firmware magic and sentinel words. Table C.11 gives the IOKit return constants with their meanings on the dispatch path.

macro | hex | ANE path meaning
kIOReturnSuccess | 0x00000000 | success
kIOReturnError | 0xe00002bc | general failure
kIOReturnBusy | 0xe00002d5 | gate or command busy
kIOReturnNoMemory | 0xe00002bd | allocation failure
kIOReturnNoResources | 0xe00002be | out of resources, queue or slot exhaustion
kIOReturnNotPrivileged | 0xe00002c1 | privilege check failed
kIOReturnBadArgument | 0xe00002c2 | typed-args validation failure
kIOReturnUnsupported | 0xe00002c7 | disabled or stub path; also what a gated feature returns when its entitlement is absent
kIOReturnNotReady | 0xe00002d8 | device or channel not ready
kIOReturnAborted | 0xe00002eb | request aborted
kIOReturnNotFound | 0xe00002f0 | program or process handle not found
kIOReturnTimeout | 0xe00002d6 | firmware op timed out

Table C.11. The IOKit return constants and their meanings on the engine dispatch path.

At every layer the error surface is name-based or message-based. The client-visible surface above IOKit is an error factory that wraps the lower-layer code into a structured error across four domains, named errorDomainCompiler, errorDomainEspresso, errorDomainGeneric, and errorDomainVirtIO. Its factory methods are the taxonomy: a generic wrapper, a missing-code-signing form, program-load and new-instance-load forms that hold the lower-layer code, surface map and unmap forms, and a virtualization-kernel form. The single most look-up-worthy client-visible value is 0xe00002c7 (kIOReturnUnsupported), returned on a disabled or unsupported path and when a gated feature's entitlement is absent.
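An IOKit return value follows the Mach error layout: a 6-bit system field in the top bits, a 12-bit subsystem field, and a 14-bit code. The sketch below splits a value into those parts, using 0xe00002c7 as the worked case. The function name is illustrative, and the code accepts the value either as an unsigned 32-bit integer or as the signed form a C caller would see.

```python
def decode_ioreturn(value: int) -> dict:
    """Split a 32-bit Mach-style return value into system, subsystem, and code."""
    value &= 0xFFFFFFFF  # normalise a signed 32-bit value to unsigned
    return {
        "system": (value >> 26) & 0x3F,      # 0x38 is the IOKit system
        "subsystem": (value >> 14) & 0xFFF,  # 0 is the common subsystem
        "code": value & 0x3FFF,
    }


unsupported = decode_ioreturn(0xE00002C7)
```

Here unsupported holds system 0x38, subsystem 0, and code 0x2c7, so a log line that prints only the low code can still be matched back to the constant.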

Table C.12 gives the fixed firmware magic words and sentinel constants with their meanings.

constant | hex | meaning
package magic | 0x414E4548 (ANEH) | loader package header
program magic | 0x414E4550 (ANEP) | loader program header
section magic | 0x414E4553 (ANES) | loader section header
AFPP control magic | 0x55AA55AA | AFPP control struct
checksum-valid sentinel | 0xFFFFFFFF | command checksum initialized and valid
invalid id | 0xFFFFFFFF | ECSneCmdId_Invalid, unbound program or process
padding | 0x00000000 | command padding must be zero
power-status byte | 0xFF / 0x00 | fully on / fully off

Table C.12. The fixed firmware magic words and sentinel constants with their meanings.

This section shows the three loader magic words as 32-bit integers; on disk the bytes are little-endian, so a raw byte scan finds HENA, PENA, and SENA (the characters of ANEH, ANEP, ANES reversed). The firmware-to-host notification names, inline status=0x%x print sites, AArch64 and L2C fault-register dump fields, and compiler diagnostic categories are in the research corpus.
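The byte reversal can be checked directly. The sketch below packs each loader magic as a little-endian 32-bit word and scans a buffer for the resulting byte strings. The function and dictionary names are illustrative, and the sample buffer is synthetic.

```python
import struct

# Loader magic words from table C.12, as 32-bit integers.
LOADER_MAGICS = {
    "package": 0x414E4548,  # ANEH
    "program": 0x414E4550,  # ANEP
    "section": 0x414E4553,  # ANES
}


def on_disk(magic: int) -> bytes:
    """Return the four bytes as stored little-endian."""
    return struct.pack("<I", magic)


def scan(blob: bytes):
    """Return sorted (offset, kind) pairs for every magic found in blob."""
    hits = []
    for kind, magic in LOADER_MAGICS.items():
        needle = on_disk(magic)
        start = blob.find(needle)
        while start != -1:
            hits.append((start, kind))
            start = blob.find(needle, start + 1)
    return sorted(hits)


sample = b"\x00\x00HENA\x01PENA"  # synthetic buffer, not a real package
```

Scanning sample yields the package magic at offset 2 and the program magic at offset 7.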

C.4 The tunable register-init table

The per-chip register init is a sequence of 12-byte (offset, mask, value) records. Each record is applied as a masked read-modify-write, reg = (reg & ~mask) | value, where reg is the block MMIO base plus the offset. The M1 (ASC AscChinook) firmware has 1994 records across 10 named MMIO blocks. Each block is reached through a 32-byte descriptor of name, MMIO base, record pointer, and count. Table C.13 gives the blocks and their counts.

block name | MMIO base | records
ASC_CHINOOK | 0x2_6b00_0000 | 24
ASCWRAP | 0x2_6b40_0000 | 2
sneCtrl | 0x2_6b84_0000 | 15
ANE | 0x2_6bc0_0000 | 47
aneDpePpt | 0x2_6b8e_c000 | 304
aneDpePptAccp0 | 0x2_6b8e_d000 | 528
aneDpePptAccp1 | 0x2_6b8e_e000 | 528
aneDpePptAccp2 | 0x2_6b8e_f000 | 528
aneDpeSys | 0x2_6b8f_0000 | 9
aneDpePpt_soc_dpe_lee | 0x2_6b8f_4000 | 9

Table C.13. The named MMIO register-init blocks with their bases and record counts.

Table C.14 gives representative records, one or more per block, with their address, mask, value, and meaning.

regAddr | mask | value | meaning
0x2_6b14_0020 | 0xf80000 | 0x780000 | ASC clock or PLL divider field set to 15
0x2_6b40_080c | 0x6000_0001 | 0x6000_0001 | fabric clock and QoS enable
0x2_6b84_0028 .. 0044 | 0x8fff_c000 | 0x8fff_c000 | 8 identical SNE QoS and credit words, one per set
0x2_6bc0_d014 .. fec4 | 0x1 | 0x1 | 32 per-tile MAC clock and power enables
0x2_6bc1_400c | 0xffff_ff00 | 0x4010_1000 | DMA descriptor base and config word
0x2_6b8e_c000 | 0xffff | 0x267e | peak-power-tracking base budget word
0x2_6b8e_c42c | 0x3fff | 0x0 | DPE trailing-control reg, armed live to 0x3fff
0x2_6b8e_d000 | 0xffff_ffff | 0x0077_3594 | per-counter energy scale coefficient (7812500)
0x2_6b8f_0014 | 0xffff_ffff | 0x0000_23e1 | DPE config and period word
0x2_6b8f_0038 | 0xffff_ffff | 0x0003_2dcc | DPE accumulation window and divisor (208332)
0x2_6b8f_4000 | 0x1e | 0xc | SoC-level LEE control field

Table C.14. Representative register-init records with their address, mask, value, and meaning.

The 32 mask=1 value=1 records at a regular stride are direct evidence of the 32-tile MAC array geometry, each tile individually clock and power gateable. The DPE system block has seven ascending sampling thresholds (25, 50, 70, 85, 95, 105, 115), and the SoC leakage-estimation block has eight (10, 22, 39, 64, 89, 121, 164, 189): the firmware-side breakpoints of the power model. All 1994 decoded records are in the research corpus.
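The record format and the masked read-modify-write can be sketched as follows. This is a minimal illustration, not the firmware's code: it assumes each 12-byte record is three little-endian 32-bit words in (offset, mask, value) order, it models MMIO as a plain dictionary, and all function names are hypothetical.

```python
import struct

RECORD = struct.Struct("<III")  # assumed layout: offset, mask, value as u32


def parse_records(blob: bytes):
    """Split a byte string into (offset, mask, value) tuples, ignoring a short tail."""
    usable = len(blob) - len(blob) % RECORD.size
    return [RECORD.unpack_from(blob, i) for i in range(0, usable, RECORD.size)]


def apply_record(reg: int, mask: int, value: int) -> int:
    """reg = (reg & ~mask) | value, kept to 32 bits."""
    return ((reg & ~mask) | value) & 0xFFFFFFFF


def apply_block(mmio: dict, base: int, records) -> None:
    """Apply every record to a dictionary standing in for the MMIO space."""
    for offset, mask, value in records:
        addr = base + offset
        mmio[addr] = apply_record(mmio.get(addr, 0), mask, value)


def field_value(mask: int, value: int) -> int:
    """Shift the masked value down to the field's least significant bit."""
    shift = (mask & -mask).bit_length() - 1
    return (value & mask) >> shift
```

For the first row of table C.14, field_value(0xf80000, 0x780000) gives 15, which matches the divider value quoted in the table's meaning column.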

C.5 The CSNE_CMD_* numeric command table

The host-to-firmware command set has 93 entries, numbered 0x00 through 0x5c. The numeric command identifier, eCSneCmdId, is the index into the firmware command-name string table, and 0xFFFFFFFF is the no-command sentinel. Table C.15 gives the full set. Its dir column is H->FW for a host request, FW->H for a firmware notification, and FW<->H for the back-channel RPC at 0x5b. The subsystem codes are lifecycle, power, secure, program, execution, cache, ipc, buffer, property, and stats.

id | name | dir | subsystem | purpose
0x00 | STOP | H->FW | lifecycle | stop the controller
0x01 | RESET | H->FW | lifecycle | reset controller state
0x02 | CONFIG_GET | H->FW | property | read config blob
0x03 | PRINT_ENABLE | H->FW | stats | enable firmware print
0x04 | REG_FILE_LOAD | H->FW | lifecycle | load a register file
0x05 | BUILDINFO | H->FW | lifecycle | return firmware build string
0x06 | TIMEPROFILE_START | H->FW | stats | begin time-profiling
0x07 | TIMEPROFILE_STOP | H->FW | stats | stop time-profiling
0x08 | TIMEPROFILE_SHOW | H->FW | stats | dump profile
0x09 | FW_RUN_MODE | H->FW | lifecycle | select firmware run mode
0x0a | POWER_DOWN | H->FW | power | full power-down
0x0b | SET_SNE_PMU_BASE | H->FW | power | set PMU MMIO base
0x0c | SET_SNE_RPC_CHECK_CMD | H->FW | property | RPC sanity-check command
0x0d | RPC_ENABLE | H->FW | property | enable the back-channel RPC channel
0x0e | PLATFORM_INFO | H->FW | lifecycle | platform descriptor
0x0f | BOOT | H->FW | lifecycle | bring firmware to booted state
0x10 | PING | H->FW | lifecycle | liveness probe
0x11 | CONFIG_GET_EXT | H->FW | property | extended config read
0x12 | POWER_DEVICE_ON | H->FW | power | power the device on
0x13 | POWER_DEVICE_OFF | H->FW | power | power device off
0x14 | IPC_ENDPOINT_SET | H->FW | ipc | bind an IPC endpoint
0x15 | IPC_ENDPOINT_UNSET | H->FW | ipc | unbind endpoint
0x16 | CH_INFO_GET | H->FW | buffer | channel info query
0x17 | CH_BUFFER_RECYCLE_MODE_SET | H->FW | buffer | set buffer-recycle mode
0x18 | CH_BUFFER_RECYCLE_START | H->FW | buffer | start recycling
0x19 | CH_BUFFER_RECYCLE_STOP | H->FW | buffer | stop recycling
0x1a | CH_BUFFER_RETURN | H->FW | buffer | return one pooled buffer
0x1b | CH_BUFFER_POOL_CONFIG_GET | H->FW | buffer | read buffer-pool config
0x1c | CH_BUFFER_POOL_CONFIG_SET | H->FW | buffer | configure buffer-pool
0x1d | CH_DATA_FILE_LOAD | H->FW | buffer | stream a data file over channel
0x1e | CH_PROPERTY_WRITE | H->FW | property | write a register or property
0x1f | CH_PROPERTY_READ | H->FW | property | read a register or property
0x20 | TRACE_ENABLE | H->FW | stats | enable tracing
0x21 | RESOURCE_INFO_GET | H->FW | lifecycle | query engine resources
0x22 | STATS_BUFFER_SIZE_GET | H->FW | stats | compute required stats-buffer size
0x23 | SUSPEND | H->FW | lifecycle | suspend engine
0x24 | DSID_SET | H->FW | cache | set data-set identifiers for prefetch
0x25 | MCACHE_SIZE_GET | H->FW | cache | query memory-cache size
0x26 | SECURE_MODE_START | H->FW | secure | enter secure mode
0x27 | SECURE_MODE_STOP | H->FW | secure | leave secure mode
0x28 | SET_SNE_PMU_BASE2 | H->FW | power | version 2 PMU base set
0x29 | IPC_ENDPOINT_SET2 | H->FW | ipc | version 2 endpoint bind
0x2a | IPC_ENDPOINT_UNSET2 | H->FW | ipc | version 2 unbind
0x2b | CH_DATA_FILE_LOAD2 | H->FW | buffer | version 2 data-file load
0x2c | SET_DYNAMIC_POWERGATE | H->FW | power | configure dynamic clock and power gating
0x2d | ANE_DEFAULT_SETTING_SET | H->FW | lifecycle | bulk default-settings push
0x2e | INIT_SHARED_EVENT_INFO | H->FW | ipc | initialize shared-event table
0x2f | EXCLAVE_MODE_START | H->FW | secure | enter exclave mode (stubbed on H13)
0x30 | EXCLAVE_MODE_STOP | H->FW | secure | leave exclave mode
0x31 | QUIESCE_STATE | H->FW | lifecycle | drain in-flight work
0x32 | CPU_LOAD_GET | H->FW | stats | sample CPU load
0x33 | SECURE_MODE_RESUME_TRANSITION | H->FW | secure | resume a paused secure transition
0x34 | CH_ERROR_NOTIFICATION | FW->H | stats | error notification
0x35 | CH_POWER_CONTROL | H->FW | power | channel-level power control
0x36 | CH_SIGNPOST_NOTIFICATION | FW->H | stats | 32-bit signpost notification
0x37 | CH_SIGNPOST_NOTIFICATION_GROUP | FW->H | stats | grouped 32-bit signpost
0x38 | CH_RESET_NOTIFICATION | FW->H | stats | reset notification
0x39 | CH_SIGNPOST64_NOTIFICATION | FW->H | stats | 64-bit signpost
0x3a | CH_SIGNPOST64_NOTIFICATION_GROUP | FW->H | stats | grouped 64-bit signpost
0x3b | CPU_LOAD_NOTIFICATION | FW->H | stats | CPU-load notification
0x3c | TM_SYNC_ERR_NOTIFICATION | FW->H | stats | tile-manager sync error
0x3d | LOAD_PROGRAM | H->FW | program | load a compiled program into a slot
0x3e | UNLOAD_PROGRAM | H->FW | program | unload a program
0x3f | CREATE_PROCESS | H->FW | program | instantiate a process for a program
0x40 | TERMINATE_PROCESS | H->FW | program | tear down a process
0x41 | PROCEDURE_CALL | H->FW | execution | baseline network invocation
0x42 | LOAD_AFPP | H->FW | program | load AFPP prefetch program
0x43 | UNLOAD_AFPP | H->FW | program | unload AFPP
0x44 | PROGRAM_INTERFACE_VERSION_CHECK | H->FW | program | negotiate program-interface version
0x45 | PROCEDURE_CALL_CACHE_REQUEST | H->FW | cache | install a resident cache request
0x46 | PROCEDURE_CALL_TRIGGER_CACHE_REQUEST | H->FW | cache | fire an installed cache request
0x47 | PROCEDURE_CALL_RECYCLE_OUTPUT_BUFFER | H->FW | cache | return a consumed output buffer
0x48 | PROCEDURE_CALL_INVALIDATE_CACHE_REQUEST | H->FW | cache | destroy a cache request
0x49 | PROCEDURE_CALL_WITH_CUSTOM_BARS | H->FW | execution | proc-call with custom barrier array
0x4a | PREMAP_BUFFER | H->FW | cache | pre-map an inference-property buffer
0x4b | PROCEDURE_CALL_CACHE_REQUEST_WITH_CUSTOM_BARS | H->FW | cache | cache request with custom bars
0x4c | PROCEDURE_CALL_CACHE_REQUEST_WITH_SHARED_EVENTS | H->FW | cache | cache request with shared events
0x4d | FORCE_DISABLE_CACHE_REQUESTS | H->FW | cache | global cache-request disable
0x4e | PROCEDURE_CALL_WITH_SIGNAL_EVENTS | H->FW | execution | proc-call with wait and signal events
0x4f | SET_ACTIVE_CACHE_REQUEST_IN_GROUP | H->FW | cache | select active member of a cache-request group
0x50 | PROGRAM_EVENT | FW->H | program | per-program event notification
0x51 | USER_EVENT | FW->H | stats | user event marker
0x52 | DBG_EVENT | FW->H | stats | debug event
0x53 | DATA_CHAINING_EVENT | FW->H | cache | data-chaining stage completion
0x54 | PREFETCH_DSID_EVENT | FW->H | cache | prefetch completion
0x55 | SECURE_MODE_EVENT | FW->H | secure | secure-mode state-change event
0x56 | REQUEST_PROGRAM_ID | H->FW | program | allocate a program-id slot
0x57 | RETURN_PROGRAM_ID | H->FW | program | free a program-id slot
0x58 | REQUEST_PROCESS_ID | H->FW | program | allocate a process-id
0x59 | RETURN_PROCESS_ID | H->FW | program | free a process-id
0x5a | INFERENCE_CALL | H->FW | execution | high-level inference submission
0x5b | BACK_CHANNEL_RPC | FW<->H | property | firmware-initiated back-channel RPC
0x5c | DEBUG_COMMAND_DATA_CHECK | H->FW | stats | validate command-data integrity

Table C.15. The complete 93-entry host-to-firmware command table with name, direction, subsystem, and purpose.

Two strings the prior corpus counted have no numeric identifier: CSNE_CMD_START is a standalone lifecycle log alias, and CSNE_CMD_IPC_ENDPOINT_TYPE_DATA_CHAINING is an endpoint-type enum value rather than a command. The per-call numeric limits the command bodies enforce on the M1 are fixed. The dispatch caps are at most 16 signal events per call, at most 32 custom barriers on the wire (128 in the program container), at most 128 custom execute-order entries, at most 16 trigger input buffers, and fewer than 2 active shared events. Priority levels run 0 through 7, split into a privileged band of 0 and 1 and a normal band of 2 through 7. The single dispatch path takes exactly one output buffer set, one task-descriptor partition, and one engine request per list.

Table C.16 gives the fixed-header layout (sCSneControllerCmdHdr) that prefixes every ring message, with each field's offset, width, and meaning.

field | offset | width | meaning
id | 0x00 | u32 | the eCSneCmdId selector
size | 0x04 | u32 | byte length of the command body
priority | 0x08 | u32 | scheduling band, 0..7 (0..1 realtime, 2..7 normal)
programId | 0x0c | i32 | loaded-program slot, -1 invalid
processId | 0x10 | i32 | per-program process instance, -1 none
procedureId | 0x14 | u32 | index into the program's procedure table

Table C.16. The fixed command-header fields with their offsets, widths, and meanings.
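The six fields of table C.16 occupy 24 bytes, which maps directly onto a struct format string. The sketch below packs and parses such a header. It assumes little-endian byte order and no padding between fields, and the class and function names are illustrative rather than taken from the firmware.

```python
import struct
from typing import NamedTuple

# id, size, priority as u32; programId, processId as i32; procedureId as u32.
# Little-endian order is an assumption.
HEADER = struct.Struct("<IIIiiI")


class CmdHeader(NamedTuple):
    id: int
    size: int
    priority: int
    program_id: int
    process_id: int
    procedure_id: int


def parse_header(buf: bytes) -> CmdHeader:
    """Read the fixed header from the start of a ring message."""
    return CmdHeader(*HEADER.unpack_from(buf, 0))


def priority_band(priority: int) -> str:
    """Classify a priority using the 0..1 and 2..7 split from table C.16."""
    if not 0 <= priority <= 7:
        raise ValueError("priority must be in 0..7")
    return "realtime" if priority <= 1 else "normal"


# Example: a PROCEDURE_CALL (0x41) header with an empty body and no process bound.
raw = HEADER.pack(0x41, 0, 2, 3, -1, 0)
hdr = parse_header(raw)
```

In this example hdr.process_id reads back as -1, the "none" value in the table, and priority_band(hdr.priority) returns "normal".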

The full 93-entry table with file offsets, the decoded request structs, and the per-call numeric limits are in the research corpus.

C.6 The task-descriptor hardware register map

A captured ane_reg record is a (regAddr, regValue) pair whose low address selects one of 7 aperture groups. Table C.17 is the aperture map that converts a raw address to a group, with the image base and window of each.

regAddr range | group | image base | window
< 0x4c | G1 dimensions | 0xf4 | 19 words
0x4100 .. 0x4177 | G3 elementwise / planar / pad | 0x264 | 30 words
0x4500 .. 0x4537 | G4 L2 / texture | 0x2e4 | 14 words
0x4900 .. 0x492b | G5 kernel-fmt / op-mode | 0x324 | 11 words
0x4d00 .. 0x4e13 | G2 tile DMA | 0x148 | 69 words
0x5100 .. 0x5153 | G6 L2-result | 0x358 | 21 words
0x5500 .. 0x5587 | G0 kernel / common | 0x24 | 34 words

Table C.17. The register-address ranges and the aperture group, image base, and window each maps to.
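The aperture map lends itself to a range lookup. The sketch below resolves a raw regAddr to its group and to an offset in the descriptor image. The offset arithmetic (image base plus the byte distance from the start of the range) is an assumption about how the windows are laid out, and the names are illustrative.

```python
# (first address, last address, group, image base, window in 32-bit words)
APERTURES = [
    (0x0000, 0x004B, "G1", 0x0F4, 19),
    (0x4100, 0x4177, "G3", 0x264, 30),
    (0x4500, 0x4537, "G4", 0x2E4, 14),
    (0x4900, 0x492B, "G5", 0x324, 11),
    (0x4D00, 0x4E13, "G2", 0x148, 69),
    (0x5100, 0x5153, "G6", 0x358, 21),
    (0x5500, 0x5587, "G0", 0x024, 34),
]


def locate(reg_addr: int):
    """Return (group, image offset) for a register address, or None if unmapped.

    Image offset = image base + (reg_addr - range start); this is an assumption.
    """
    for first, last, group, image_base, _words in APERTURES:
        if first <= reg_addr <= last:
            return group, image_base + (reg_addr - first)
    return None
```

Each window size in the table equals the byte length of its address range divided by four, which is a quick consistency check on the map.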

Table C.18 gives representative G1 dimension fields, which also pack the format and control bits, with the bit range and width of each.

regAddr | field | bits | width | meaning
0x00 | Win | [14:0] | 15 | input tile width
0x02 | Hin | [14:0] | 15 | input tile height
0x0c | Cin | [16:0] | 17 | input channels, max 131071
0x10 | Cout | [16:0] | 17 | output channels
0x10 | CommonInFmt | [1:0] | 2 | source-1 element format
0x10 | CommonOutFmt | [5:4] | 2 | output element format
0x28 | numGroups | [12:0] | 13 | convolution groups
0x38 | CommonTaskType | [7:4] | 4 | hardware task class (9 valid)

Table C.18. Representative G1 dimension register fields with their bit ranges, widths, and meanings.

To invert a raw value: DMA strides are 26-bit signed at bits [31:6]; L2-result base and strides are 17-bit at bits [20:4]; a full device address is (hi << 32) | lo with lo 64-byte aligned and hi 10 bits, capped at 42 bits. The complete inventory of roughly 190 register fields across the 7 groups, 11 reloc slots, and on-M1 stubbed engines (CCDMA, atomic scatter, LDTID) is in the research corpus.
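The three inversions above reduce to shifts, masks, and one sign extension. The sketch below is a minimal illustration with hypothetical function names. It returns the raw field values and makes no assumption about their units.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` of value as a two's-complement integer."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def dma_stride(raw: int) -> int:
    """26-bit signed field at bits [31:6]."""
    return sign_extend((raw >> 6) & ((1 << 26) - 1), 26)


def l2_field(raw: int) -> int:
    """17-bit unsigned field at bits [20:4]."""
    return (raw >> 4) & ((1 << 17) - 1)


def device_address(hi: int, lo: int) -> int:
    """Combine a 10-bit hi word and a 64-byte-aligned lo word into a 42-bit address."""
    if lo & 0x3F:
        raise ValueError("lo must be 64-byte aligned")
    if hi >> 10:
        raise ValueError("hi must fit in 10 bits")
    return (hi << 32) | (lo & 0xFFFFFFFF)
```

A raw stride word of 0xFFFFFFC0 decodes to -1, and the largest accepted address, device_address(0x3FF, 0xFFFFFFC0), stays below 1 << 42.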

Each operation descriptor in the task-descriptor stream has a 32-bit opcode word. Table C.19 gives the words decoded from a live M1 program for three operations.

operation | opcode word
convolution | 0x5042a063
reduce-mean | 0x5000a021
matrix multiply | 0x5000b021

Table C.19. The version-7 / H13 codegen opcode words for three operations; the top byte 0x50 is shared, and the remaining bits vary with the operation.
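Splitting the three words into half-words and computing which bit positions agree across all of them shows where the operation-specific bits sit. The sketch below uses only the values in table C.19. The helper names are illustrative, and three samples are too few to fix the field boundaries.

```python
OPCODE_WORDS = {
    "convolution": 0x5042A063,
    "reduce-mean": 0x5000A021,
    "matrix multiply": 0x5000B021,
}


def split(word: int):
    """Return (high half-word, low half-word) of a 32-bit opcode word."""
    return word >> 16, word & 0xFFFF


def shared_bits_mask(words) -> int:
    """Return a mask with 1 in every bit position where all words agree."""
    words = list(words)
    differing = 0
    for word in words[1:]:
        differing |= words[0] ^ word
    return ~differing & 0xFFFFFFFF


mask = shared_bits_mask(OPCODE_WORDS.values())
```

For these samples the mask is 0xFFBDEFBD: the top byte is identical in all three words, while the convolution word differs from the other two in the second byte as well as in the low half-word.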

C.7 The .e5 FlatBuffer schema

The program container is a FlatBuffer whose root table holds four fields; listing C.1 gives the schema. Because the binary strips the reflection schema, the schema is reconstructed from the serializer method set and the wire bytes. It round-trips cleanly through the FlatBuffers tool to 23 tables, 4 enums, and 1 union. The data-type and op-type enums are recovered by name; the numeric ordinals are inferred.

namespace E5RT.fb;

enum TensorDataType : int {
  Invalid = 0, Float16 = 1, Float32 = 2, Int8 = 3, UInt8 = 4,
  Int16 = 5, Int32 = 6, Int4 = 7, Bool = 8, E4M3 = 9, E5M2 = 10
}

enum OpType : int {
  Cast = 0, AneInference = 1, EirInference = 2, CpuInference = 3,
  BnnsCpuInference = 4, MlcCpuInference = 5, MpsGraphInference = 6,
  E5MinimalCpu = 7, Quant = 8, Dequant = 9, Barrier = 10, JitCall = 11
}

table TensorDescriptor {
  dim:[ulong];
  stride:[ulong];
  width:ulong;
  height:ulong;
  channels:ulong;
  batch_number:ulong;
  sequence_length:ulong;
  stride_width:ulong;
  stride_height:ulong;
  stride_channels:ulong;
  stride_batch_number:ulong;
  stride_sequence_length:ulong;
  storage_type:TensorDataType;
  component_pack:int;
}

table BuildInfoEntry { key:string; value:string; }
table BuildInfo { entries:[BuildInfoEntry]; }   // 7 key-value pairs in the sample

table AliasSymbol { name:string; symbol_index:uint; addr_offset:uint; }

table IOPort { name:string; byte_size:ulong; aperture_va:ulong; }

table Operand { descriptor:TensorDescriptor; }

table CastAttrs { src_dtype:TensorDataType; dst_dtype:TensorDataType; component_pack:int; }

table AneInferenceAttrs {
  procedure_name:string;
  anehash:string;
  program_symbol:string;
  intermediate_buffer_handle:uint;
  compiler_options:string;
}

table Operation {
  name:string;
  op_type:OpType;
  inputs:[uint];
  outputs:[uint];
  arg_frame:string;        // __arg_frame section reference
  attrs_section:string;    // __op_attrs section reference
}

table Block { name:string; operations:[Operation]; }

table Function { name:string; anehash_path:string; blocks:[Block]; }

table Section { name:string; kind:int; }

table E5Program {
  symbol_names:[string];   // field[0] name vector
  build_info:BuildInfo;    // field[1] 7-entry sub-table
  sections:[Section];      // field[2] 6-entry section vector
  format_version:int;      // field[3] inline scalar == 4
}

root_type E5Program;

Listing C.1. The reconstructed program-container FlatBuffer schema: root table, type enums, tensor descriptor, section tables, and operation structure.

The whole fused graph collapses to a single AneInference operation, with the surrounding Cast operations holding the input and output dtype conversion. The validation against the round-9 H13C.e5 sample reads the four root fields, the seven build-info pairs (built-for-profiling, input-file-path, the component versions, and on-device-compilation), and the operation chain Cast, AneInference, Cast straight out of the bytes with no contradiction. The enum ordinals, the e4m3 and e5m2 dtypes, segment-chaining fields, and field sets of the ten op-attribute tables other than CastAttrs and AneInferenceAttrs are inferred rather than byte-confirmed in this single-segment sample.