spirv_tools_sys/opt.rs
#[repr(C)]
pub struct Optimizer {
    _unused: [u8; 0],
}

#[repr(C)]
pub struct OptimizerOptions {
    _unused: [u8; 0],
}

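// Illustrative sketch, not part of the bindings: `Optimizer` and
// `OptimizerOptions` are opaque types that Rust only ever sees behind raw
// pointers, each paired with a create and a destroy function on the C side.
// One common way to keep such pairs balanced is a small owning wrapper whose
// `Drop` calls the destroy function. Every name below (`OpaqueHandle`,
// `handle_create`, `handle_destroy`, `OwnedHandle`, `DESTROYED`) is a
// hypothetical stand-in so the example runs without linking the C library.

```rust
use std::ptr::NonNull;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Stand-in for an opaque FFI type; Rust never reads through it.
#[repr(C)]
pub struct OpaqueHandle {
    _unused: [u8; 0],
}

/// Counts destroy calls so the sketch can be checked without a C library.
pub static DESTROYED: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for a C "create" function.
pub unsafe fn handle_create() -> *mut OpaqueHandle {
    // Allocate one byte and hand it out as an opaque pointer.
    Box::into_raw(Box::new(0u8)) as *mut OpaqueHandle
}

/// Hypothetical stand-in for the matching C "destroy" function.
pub unsafe fn handle_destroy(handle: *mut OpaqueHandle) {
    // Free the byte allocated in `handle_create` and record the call.
    unsafe { drop(Box::from_raw(handle as *mut u8)) };
    DESTROYED.fetch_add(1, Ordering::SeqCst);
}

/// Owning wrapper: creates the handle once and destroys it exactly once.
pub struct OwnedHandle {
    raw: NonNull<OpaqueHandle>,
}

impl OwnedHandle {
    /// Returns `None` if the create function hands back a null pointer.
    pub fn new() -> Option<Self> {
        let raw = unsafe { handle_create() };
        NonNull::new(raw).map(|raw| Self { raw })
    }

    /// Raw pointer to pass to other FFI calls; ownership stays here.
    pub fn as_ptr(&self) -> *mut OpaqueHandle {
        self.raw.as_ptr()
    }
}

impl Drop for OwnedHandle {
    fn drop(&mut self) {
        unsafe { handle_destroy(self.raw.as_ptr()) }
    }
}
```

// With the real bindings, a wrapper of the same shape could pair
// `optimizer_create` with `optimizer_destroy`, which are declared further down
// in this file.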
#[derive(Copy, Clone, Debug)]
#[repr(C)]
#[allow(clippy::upper_case_acronyms)]
pub enum Passes {
    /// Create aggressive dead code elimination pass.
    /// This pass eliminates unused code from the module. In addition,
    /// it detects and eliminates code which may have spurious uses but which does
    /// not contribute to the output of the function. The most common cause of
    /// such code sequences is summations in loops whose result is no longer used
    /// due to dead code elimination. This optimization has additional compile
    /// time cost over standard dead code elimination.
    ///
    /// This pass only processes entry point functions. It also only processes
    /// shaders with relaxed logical addressing (see opt/instruction.h). It
    /// currently will not process functions with function calls. Unreachable
    /// functions are deleted.
    ///
    /// This pass will be made more effective by first running passes that remove
    /// dead control flow and inline function calls.
    ///
    /// This pass can be especially useful after running Local Access Chain
    /// Conversion, which tends to cause cycles of dead code to be left after
    /// Store/Load elimination passes are completed. These cycles cannot be
    /// eliminated with standard dead code elimination.
    AggressiveDCE,
    /// Replaces the extensions VK_AMD_shader_ballot, VK_AMD_gcn_shader, and
    /// VK_AMD_shader_trinary_minmax with equivalent code using core instructions and
    /// capabilities.
    AmdExtToKhr,
    /// Creates a block merge pass.
    /// This pass searches for blocks with a single Branch to a block with no
    /// other predecessors and merges the blocks into a single block. Continue
    /// blocks and Merge blocks are not candidates for the second block.
    ///
    /// The pass is most useful after Dead Branch Elimination, which can leave
    /// such sequences of blocks. Merging them makes subsequent passes more
    /// effective, such as single block local store-load elimination.
    ///
    /// While this pass reduces the number of occurrences of this sequence, at
    /// this time it does not guarantee all such sequences are eliminated.
    ///
    /// Presence of phi instructions can inhibit this optimization. Handling
    /// these is left for future improvements.
    BlockMerge,
    /// Creates a conditional constant propagation (CCP) pass.
    /// This pass implements the SSA-CCP algorithm in
    ///
    /// Constant propagation with conditional branches,
    /// Wegman and Zadeck, ACM TOPLAS 13(2):181-210.
    ///
    /// Constant values in expressions and conditional jumps are folded and
    /// simplified. This may reduce code size by removing never-executed jump targets
    /// and computations with constant operands.
    ConditionalConstantPropagation,
    /// Creates a CFG cleanup pass.
    /// This pass removes cruft from the control flow graph of functions that are
    /// reachable from entry points and exported functions. It currently includes the
    /// following functionality:
    ///
    /// - Removal of unreachable basic blocks.
    CFGCleanup,
    /// Create a pass to do code sinking. Code sinking is a transformation
    /// where an instruction is moved into a more deeply nested construct.
    CodeSinking,
    /// Create a pass to combine chained access chains.
    /// This pass looks for access chains fed by other access chains and combines
    /// them into a single instruction where possible.
    CombineAccessChains,
    /// Creates a compact ids pass.
    /// The pass remaps result ids to a compact and gapless range starting from %1.
    CompactIds,
    /// Create pass to convert relaxed precision instructions to half precision.
    /// This pass converts as many relaxed float32 arithmetic operations to half as
    /// possible. It converts any float32 operands to half if needed. It converts
    /// any resulting half precision values back to float32 as needed. No variables
    /// are changed. No image operations are changed.
    ///
    /// Best if run after function scope store/load and composite operation
    /// eliminations are run. Also best if followed by instruction simplification,
    /// redundancy elimination and DCE.
    ConvertRelaxedToHalf,
    /// Create copy propagate arrays pass.
    /// This pass looks to copy propagate memory references for arrays. It looks
    /// for specific code patterns to recognize array copies.
    CopyPropagateArrays,
    /// Create dead branch elimination pass.
    /// For each entry point function, this pass will look for SelectionMerge
    /// BranchConditionals with constant condition and convert to a Branch to
    /// the indicated label. It will delete resulting dead blocks.
    ///
    /// For all phi functions in the merge block, replace all uses with the id
    /// corresponding to the living predecessor.
    ///
    /// Note that some branches and blocks may be left to avoid creating invalid
    /// control flow. Improving this is left to future work.
    ///
    /// This pass is most effective when preceded by passes which eliminate
    /// local loads and stores, effectively propagating constant values where
    /// possible.
    DeadBranchElim,
    /// Creates a dead insert elimination pass.
    /// This pass processes each entry point function in the module, searching for
    /// unreferenced inserts into composite types. These are most often unused
    /// stores to vector components. They are unused because they are never
    /// referenced, or because there is another insert to the same component between
    /// the insert and the reference. After removing the inserts, dead code
    /// elimination is attempted on the inserted values.
    ///
    /// This pass performs best after access chains are converted to inserts and
    /// extracts and local loads and stores are eliminated. While executing this
    /// pass can be advantageous on its own, it is also advantageous to execute
    /// this pass after CreateInsertExtractPass() as it will remove any unused
    /// inserts created by that pass.
    DeadInsertElim,
    /// Create dead variable elimination pass.
    /// This pass will delete module scope variables, along with their decorations,
    /// that are not referenced.
    DeadVariableElimination,
    /// Create descriptor scalar replacement pass.
    /// This pass replaces every array variable |desc| that has DescriptorSet and
    /// Binding decorations with a new variable for each element of the array.
    /// Suppose |desc| was bound at binding |b|. Then the variable corresponding to
    /// |desc[i]| will have binding |b+i|. The descriptor set will be the same. It
    /// is assumed that no other variable already has a binding that will be used by
    /// one of the new variables. If that assumption does not hold, the pass will
    /// generate invalid SPIR-V. All accesses to |desc| must be OpAccessChain
    /// instructions with a literal index for the first index.
    DescriptorScalarReplacement,
    /// Creates an eliminate-dead-constant pass.
    /// An eliminate-dead-constant pass removes dead constants, including normal
    /// constants defined by OpConstant, OpConstantComposite, OpConstantTrue, or
    /// OpConstantFalse and spec constants defined by OpSpecConstant,
    /// OpSpecConstantComposite, OpSpecConstantTrue, OpSpecConstantFalse or
    /// OpSpecConstantOp.
    EliminateDeadConstant,
    /// Creates an eliminate-dead-functions pass.
    /// An eliminate-dead-functions pass will remove all functions that are not in
    /// the call trees rooted at entry points and exported functions. These
    /// functions are not needed because they will never be called.
    EliminateDeadFunctions,
    /// Creates an eliminate-dead-members pass.
    /// An eliminate-dead-members pass will remove all unused members of structures.
    /// This will not affect the data layout of the remaining members.
    EliminateDeadMembers,
    /// Create a pass to fix incorrect storage classes. In order to make code
    /// generation simpler, DXC may generate code where the storage classes do not
    /// match up correctly. This pass will fix the errors that it can.
    FixStorageClass,
    /// Creates a flatten-decoration pass.
    /// A flatten-decoration pass replaces grouped decorations with equivalent
    /// ungrouped decorations. That is, it replaces each OpDecorationGroup
    /// instruction and associated OpGroupDecorate and OpGroupMemberDecorate
    /// instructions with equivalent OpDecorate and OpMemberDecorate instructions.
    /// The pass does not attempt to preserve debug information for instructions
    /// it removes.
    FlattenDecoration,
    /// Creates a fold-spec-constant-op-and-composite pass.
    /// A fold-spec-constant-op-and-composite pass folds spec constants defined by
    /// OpSpecConstantOp or OpSpecConstantComposite instructions to normal constants
    /// defined by OpConstantTrue, OpConstantFalse, OpConstant, OpConstantNull, or
    /// OpConstantComposite instructions. Note that spec constants defined with
    /// OpSpecConstant, OpSpecConstantTrue, or OpSpecConstantFalse instructions are
    /// not handled, as these instructions indicate their values are not yet
    /// determined and can be changed in the future. A spec constant is foldable if
    /// all of its value(s) can be determined from the module. E.g., an integer spec
    /// constant defined with an OpSpecConstantOp instruction can be folded if its
    /// value won't change later. This pass will replace the original
    /// OpSpecConstantOp instruction with an OpConstant instruction. When folding
    /// composite spec constants, new instructions may be inserted to define the
    /// components of the composite constant first, then the original spec constants
    /// will be replaced by OpConstantComposite instructions.
    ///
    /// There are some operations not supported yet:
    ///   OpSConvert, OpFConvert, OpQuantizeToF16 and
    ///   all the operations under Kernel capability.
    /// TODO(qining): Add support for the operations listed above.
    FoldSpecConstantOpAndComposite,
    /// Creates a freeze-spec-constant-value pass.
    /// A freeze-spec-constant pass specializes the value of spec constants to
    /// their default values. This pass only processes the spec constants that have
    /// SpecId decorations (defined by OpSpecConstant, OpSpecConstantTrue, or
    /// OpSpecConstantFalse instructions) and replaces them with their normal
    /// counterparts (OpConstant, OpConstantTrue, or OpConstantFalse). The
    /// corresponding SpecId annotation instructions will also be removed. This
    /// pass does not fold the newly added normal constants and does not process
    /// other spec constants defined by OpSpecConstantComposite or
    /// OpSpecConstantOp.
    FreezeSpecConstantValue,
    /// Creates a graphics robust access pass.
    ///
    /// This pass injects code to clamp indexed accesses to buffers and internal
    /// arrays, providing guarantees satisfying Vulkan's robustBufferAccess rules.
    ///
    /// TODO(dneto): Clamps coordinates and sample index for pointer calculations
    /// into storage images (OpImageTexelPointer). For a cube array image, it
    /// assumes the maximum layer count times 6 is at most 0xffffffff.
    ///
    /// NOTE: This pass will fail with a message if:
    /// - The module is not a Shader module.
    /// - The module declares VariablePointers, VariablePointersStorageBuffer, or
    ///   RuntimeDescriptorArrayEXT capabilities.
    /// - The module uses an addressing model other than Logical.
    /// - Access chain indices are wider than 64 bits.
    /// - Access chain index for a struct is not an OpConstant integer or is out
    ///   of range. (The module is already invalid if that is the case.)
    /// - TODO(dneto): The OpImageTexelPointer coordinate component is not 32-bits
    ///   wide.
    ///
    /// NOTE: Access chain indices are always treated as signed integers. So
    /// if an array has a fixed size of more than 2^31 elements, then elements
    /// from 2^31 and above are never accessible with a 32-bit index,
    /// signed or unsigned. For this case, this pass will clamp the index
    /// between 0 and 2^31-1, inclusive.
    /// Similarly, if an array has more than 2^15 elements and is accessed with
    /// a 16-bit index, then elements from 2^15 and above are not accessible.
    /// In this case, the pass will clamp the index between 0 and 2^15-1
    /// inclusive.
    GraphicsRobustAccess,
    /// Creates a pass that converts if-then-else like assignments into OpSelect.
    IfConversion,
    /// Creates an exhaustive inline pass.
    /// An exhaustive inline pass attempts to exhaustively inline all function
    /// calls in all functions in an entry point call tree. The intent is to enable,
    /// albeit through brute force, analysis and optimization across function
    /// calls by subsequent optimization passes. As the inlining is exhaustive,
    /// there is no attempt to optimize for size or runtime performance. Functions
    /// that are not in the call tree of an entry point are not changed.
    InlineExhaustive,
    /// Creates an opaque inline pass.
    /// An opaque inline pass inlines all function calls in all functions in all
    /// entry point call trees where the called function contains an opaque type
    /// in either its parameter types or return type. An opaque type is currently
    /// defined as Image, Sampler or SampledImage. The intent is to enable, albeit
    /// through brute force, analysis and optimization across these function calls
    /// by subsequent passes in order to remove the storing of opaque types which is
    /// not legal in Vulkan. Functions that are not in the call tree of an entry
    /// point are not changed.
    InlineOpaque,
    /// Creates an insert/extract elimination pass.
    /// This pass processes each entry point function in the module, searching for
    /// extracts on a sequence of inserts. It further searches the sequence for an
    /// insert with indices identical to the extract. If such an insert can be
    /// found before hitting a conflicting insert, the extract's result id is
    /// replaced with the id of the value from the insert.
    ///
    /// Besides removing extracts this pass enables subsequent dead code elimination
    /// passes to delete the inserts. This pass performs best after access chains are
    /// converted to inserts and extracts and local loads and stores are eliminated.
    InsertExtractElim,
    /// Replaces the internal version of GLSLstd450 InterpolateAt* extended
    /// instructions with the externally valid version. The internal version allows
    /// an OpLoad of the interpolant for the first argument. This pass removes the
    /// OpLoad and replaces it with its pointer. glslang and possibly other
    /// frontends will create the internal version for HLSL. This pass will be part
    /// of HLSL legalization and should be called after interpolants have been
    /// propagated into their final positions.
    InterpolateFixup,
    /// Creates a local access chain conversion pass.
    /// A local access chain conversion pass identifies all function scope
    /// variables which are accessed only with loads, stores and access chains
    /// with constant indices. It then converts all loads and stores of such
    /// variables into equivalent sequences of loads, stores, extracts and inserts.
    ///
    /// This pass only processes entry point functions. It currently only converts
    /// non-nested, non-ptr access chains. It does not process modules with
    /// non-32-bit integer types present. Optional memory access options on loads
    /// and stores are ignored as we are only processing function scope variables.
    ///
    /// This pass unifies access to these variables to a single mode and simplifies
    /// subsequent analysis and elimination of these variables along with their
    /// loads and stores allowing values to propagate to their points of use where
    /// possible.
    LocalAccessChainConvert,
    /// Creates an SSA local variable load/store elimination pass.
    /// For every entry point function, eliminate all loads and stores of function
    /// scope variables only referenced with non-access-chain loads and stores.
    /// Eliminate the variables as well.
    ///
    /// The presence of access chain references and function calls can inhibit
    /// the above optimization.
    ///
    /// Only shader modules with relaxed logical addressing (see opt/instruction.h)
    /// are currently processed. Currently modules with any extensions enabled are
    /// not processed. This is left for future work.
    ///
    /// This pass is most effective if preceded by Inlining and
    /// LocalAccessChainConvert. LocalSingleStoreElim and LocalSingleBlockElim
    /// will reduce the work that this pass has to do.
    LocalMultiStoreElim,
    /// Create value numbering pass.
    /// This pass will look for instructions in the same basic block that compute the
    /// same value, and remove the redundant ones.
    LocalRedundancyElimination,
    /// Creates a single-block local variable load/store elimination pass.
    /// For every entry point function, do single block memory optimization of
    /// function variables referenced only with non-access-chain loads and stores.
    /// For each targeted variable load, if there is a previous store to that
    /// variable in the block, replace the load's result id with the value id of
    /// the store. If there is a previous load within the block, replace the current
    /// load's result id with the previous load's result id. In either case, delete
    /// the current load. Finally, check if any remaining stores are useless, and
    /// delete the store and variable if possible.
    ///
    /// The presence of access chain references and function calls can inhibit
    /// the above optimization.
    ///
    /// Only modules with relaxed logical addressing (see opt/instruction.h) are
    /// currently processed.
    ///
    /// This pass is most effective if preceded by Inlining and
    /// LocalAccessChainConvert. This pass will reduce the work needed to be done
    /// by LocalSingleStoreElim and LocalMultiStoreElim.
    ///
    /// Only functions in the call tree of an entry point are processed.
    LocalSingleBlockLoadStoreElim,
    /// Creates a local single store elimination pass.
    /// For each entry point function, this pass eliminates loads and stores for
    /// function scope variables that are stored to only once, where possible. Only
    /// whole variable loads and stores are eliminated; access-chain references are
    /// not optimized. Replace all loads of such variables with the value that is
    /// stored and eliminate any resulting dead code.
    ///
    /// Currently, the presence of access chains and function calls can inhibit this
    /// pass; however, the Inlining and LocalAccessChainConvert passes can make it
    /// more effective. In addition, many non-load/store memory operations are
    /// not supported and will prohibit optimization of a function. Support of
    /// these operations is future work.
    ///
    /// Only shader modules with relaxed logical addressing (see opt/instruction.h)
    /// are currently processed.
    ///
    /// This pass will reduce the work needed to be done by LocalSingleBlockElim
    /// and LocalMultiStoreElim and can improve the effectiveness of other passes
    /// such as DeadBranchElimination which depend on values for their analysis.
    LocalSingleStoreElim,
    /// Create LICM pass.
    /// This pass will look for invariant instructions inside loops and hoist them to
    /// the loop's preheader.
    LoopInvariantCodeMotion,
    /// Creates a loop peeling pass.
    /// This pass will look for conditions inside a loop that are true or false only
    /// for the first or last N iterations. For loops with such a condition, those N
    /// iterations of the loop will be executed outside of the main loop.
    /// To limit code size explosion, the loop peeling can only happen if the code
    /// size growth for each loop is under |code_growth_threshold|.
    LoopPeeling,
    /// Creates a loop unswitch pass.
    /// This pass will look for loop independent branch conditions and move the
    /// condition out of the loop and version the loop based on the taken branch.
    /// Works best after LICM and local multi store elimination pass.
    LoopUnswitch,
    /// Create merge return pass.
    /// Changes functions that have multiple return statements so they have a single
    /// return statement.
    ///
    /// For structured control flow it is assumed that the only unreachable blocks in
    /// the function are trivial merge and continue blocks.
    ///
    /// A trivial merge block contains the label and an OpUnreachable instruction,
    /// nothing else. A trivial continue block contains a label and an OpBranch to
    /// the header, nothing else.
    ///
    /// These conditions are guaranteed to be met after running dead-branch
    /// elimination.
    MergeReturn,
    /// Creates a null pass.
    /// A null pass does nothing to the SPIR-V module to be optimized.
    Null,
    /// Create a private to local pass.
    /// This pass looks for variables declared in the private storage class that are
    /// used in only one function. Those variables are moved to the function storage
    /// class in the function where they are used.
    PrivateToLocal,
    /// Create line propagation pass.
    /// This pass propagates line information based on the rules for OpLine and
    /// OpNoline and clones an appropriate line instruction into every instruction
    /// which does not already have debug line instructions.
    ///
    /// This pass is intended to maximize preservation of source line information
    /// through passes which delete, move and clone instructions. Ideally it should
    /// be run before any such pass. It is a bookend pass with EliminateDeadLines
    /// which can be used to remove redundant line instructions at the end of a
    /// run of such passes and reduce final output file size.
    PropagateLineInfo,
    /// Create a pass to reduce the size of loads.
    /// This pass looks for loads of structures where only a few of their members are
    /// used. It replaces the loads feeding an OpExtract with an OpAccessChain and
    /// a load of the specific elements.
    ReduceLoadSize,
    /// Create global value numbering pass.
    /// This pass will look for instructions where the same value is computed on all
    /// paths leading to the instruction. Those instructions are deleted.
    RedundancyElimination,
    /// Create dead line elimination pass.
    /// This pass eliminates redundant line instructions based on the rules for
    /// OpLine and OpNoline. Its main purpose is to reduce the size of the file
    /// needed to store the SPIR-V without losing line information.
    ///
    /// This is a bookend pass with PropagateLines which attaches line instructions
    /// to every instruction to preserve line information during passes which
    /// delete, move and clone instructions. DeadLineElim should be run after
    /// PropagateLines and all such subsequent passes. Normally it would be one
    /// of the last passes to be run.
    RedundantLineInfoElim,
    /// Create relax float ops pass.
    /// This pass decorates all float32 result instructions with RelaxedPrecision
    /// if not already so decorated.
    RelaxFloatOps,
    /// Creates a remove duplicate pass.
    /// This pass removes various duplicates:
    /// * duplicate capabilities;
    /// * duplicate extended instruction imports;
    /// * duplicate types;
    /// * duplicate decorations.
    RemoveDuplicates,
    /// Creates a remove-unused-interface-variables pass.
    /// Removes variables referenced on the |OpEntryPoint| instruction that are not
    /// referenced in the entry point function or any function in its call tree.
    /// Note that this could cause the shader interface to no longer match other
    /// shader stages.
    RemoveUnusedInterfaceVariables,
    /// Creates a pass that will replace instructions that are not valid for the
    /// current shader stage by constants. Has no effect on non-shader modules.
    ReplaceInvalidOpcode,
    /// Creates a pass that simplifies instructions using the instruction folder.
    Simplification,
    /// Create the SSA rewrite pass.
    /// This pass converts load/store operations on function local variables into
    /// operations on SSA IDs. This allows SSA optimizers to act on these variables.
    /// Only variables that are local to the function and of supported types are
    /// processed (see IsSSATargetVar for details).
    SSARewrite,
    /// Creates a strength-reduction pass.
    /// A strength-reduction pass will look for opportunities to replace an
    /// instruction with an equivalent and less expensive one. For example,
    /// multiplying by a power of 2 can be replaced by a bit shift.
    StrengthReduction,
    /// Creates a strip-debug-info pass.
    /// A strip-debug-info pass removes all debug instructions (as documented in
    /// Section 3.32.2 of the SPIR-V spec) of the SPIR-V module to be optimized.
    StripDebugInfo,
    /// Creates a strip-nonsemantic-info pass.
    /// A strip-nonsemantic-info pass removes all reflections and explicitly
    /// non-semantic instructions.
    StripNonSemanticInfo,
    /// Creates a unify-constant pass.
    /// A unify-constant pass de-duplicates the constants. Constants with the exact
    /// same value and identical form will be unified and only one constant will
    /// be kept for each unique pair of type and value.
    /// There are several cases not handled by this pass:
    /// 1) Constants defined by OpConstantNull instructions (null constants) and
    ///    constants defined by OpConstantFalse, OpConstant or OpConstantComposite
    ///    with value 0 (zero-valued normal constants) are not considered equivalent.
    ///    So null constants won't be used to replace zero-valued normal constants,
    ///    or vice versa.
    /// 2) Whenever there are decorations on the constant's result id, the
    ///    constant won't be handled, which means it won't be used to replace any
    ///    other constants, nor can other constants replace it.
    /// 3) NaNs in floating point format with different bit patterns are not unified.
    UnifyConstant,
    /// Create a pass to upgrade to the VulkanKHR memory model.
    /// This pass upgrades the Logical GLSL450 memory model to Logical VulkanKHR.
    /// Additionally, it modifies memory, image, atomic and barrier operations to
    /// conform to that model's requirements.
    UpgradeMemoryModel,
    /// Create a vector dce pass.
    /// This pass looks for components of vectors that are unused, and removes them
    /// from the vector. Note this would still leave around lots of dead code that
    /// a pass of ADCE will be able to remove.
    VectorDCE,
    /// Creates a workaround driver bugs pass. This pass attempts to work around
    /// a known driver bug (issue #1209) by identifying the bad code sequences and
    /// rewriting them.
    ///
    /// Current workaround: Avoid OpUnreachable instructions in loops.
    Workaround1209,
    /// Create a pass to replace each OpKill instruction with a function call to a
    /// function that has a single OpKill. Also replace each OpTerminateInvocation
    /// instruction with a function call to a function that has a single
    /// OpTerminateInvocation. This allows more code to be inlined.
    WrapOpKill,
}

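// Illustrative sketch, not part of the bindings: several doc comments on
// `Passes` say which passes work best before or after others. The snippet
// below writes down one ordering that is consistent with those hints and a
// helper to check it. The `Pass` enum, `suggested_order`, and `runs_before`
// are hypothetical names for this example; the ordering is an assumption
// derived from the comments, not the recipe used by any of the registration
// functions declared in this file.

```rust
/// Local stand-in mirroring a few `Passes` variant names.
#[allow(clippy::upper_case_acronyms)]
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub enum Pass {
    InlineExhaustive,
    LocalAccessChainConvert,
    LocalSingleBlockLoadStoreElim,
    LocalSingleStoreElim,
    LocalMultiStoreElim,
    DeadBranchElim,
    BlockMerge,
    AggressiveDCE,
}

/// One ordering that respects the hints in the doc comments:
/// inlining and access chain conversion first, then the local load/store
/// eliminations, then dead branch elimination, block merging, and ADCE.
pub fn suggested_order() -> Vec<Pass> {
    vec![
        Pass::InlineExhaustive,
        Pass::LocalAccessChainConvert,
        Pass::LocalSingleBlockLoadStoreElim,
        Pass::LocalSingleStoreElim,
        Pass::LocalMultiStoreElim,
        Pass::DeadBranchElim,
        Pass::BlockMerge,
        Pass::AggressiveDCE,
    ]
}

/// Returns true if both passes appear in `order` and `first` comes earlier.
pub fn runs_before(order: &[Pass], first: Pass, second: Pass) -> bool {
    let pos = |p: Pass| order.iter().position(|&q| q == p);
    match (pos(first), pos(second)) {
        (Some(a), Some(b)) => a < b,
        _ => false,
    }
}
```

// With the real bindings, each entry of such a list would presumably be
// registered in sequence through `optimizer_register_pass` using the matching
// `Passes` variant.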
extern "C" {
    pub fn optimizer_create(env: crate::shared::TargetEnv) -> *mut Optimizer;
    pub fn optimizer_destroy(opt: *mut Optimizer);

    pub fn optimizer_run(
        opt: *const Optimizer,
        input_ptr: *const u32,
        input_size: usize,
        binary: *mut *mut crate::shared::Binary,
        message_callback: crate::diagnostics::MessageCallback,
        message_ctx: *mut std::ffi::c_void,
        options: *const OptimizerOptions,
    ) -> crate::shared::SpirvResult;

    /// Creates an optimizer options object with default options. Returns a valid
    /// options object. The object remains valid until it is passed into
    /// |spvOptimizerOptionsDestroy|.
    #[link_name = "spvOptimizerOptionsCreate"]
    pub fn optimizer_options_create() -> *mut OptimizerOptions;

    /// Destroys the given optimizer options object.
    #[link_name = "spvOptimizerOptionsDestroy"]
    pub fn optimizer_options_destroy(options: *mut OptimizerOptions);

    /// Records whether or not the optimizer should run the validator before
    /// optimizing. If |run| is true, the validator will be run.
    #[link_name = "spvOptimizerOptionsSetRunValidator"]
    pub fn optimizer_options_run_validator(options: *mut OptimizerOptions, run: bool);

    /// Records the validator options that should be passed to the validator if it is
    /// run.
    #[link_name = "spvOptimizerOptionsSetValidatorOptions"]
    pub fn optimizer_options_set_validator_options(
        options: *mut OptimizerOptions,
        validator_opts: *mut crate::val::ValidatorOptions,
    );

    /// Records the maximum possible value for the id bound.
    #[link_name = "spvOptimizerOptionsSetMaxIdBound"]
    pub fn optimizer_options_set_max_id_bound(options: *mut OptimizerOptions, max: u32);

    /// Records whether all bindings within the module should be preserved.
    #[link_name = "spvOptimizerOptionsSetPreserveBindings"]
    pub fn optimizer_options_preserve_bindings(options: *mut OptimizerOptions, preserve: bool);

    /// Records whether all specialization constants within the module
    /// should be preserved.
    #[link_name = "spvOptimizerOptionsSetPreserveSpecConstants"]
    pub fn optimizer_options_preserve_spec_constants(
        options: *mut OptimizerOptions,
        preserve: bool,
    );

    pub fn optimizer_register_pass(opt: *mut Optimizer, which: Passes);

    /// Registers passes that attempt to improve performance of generated code.
    /// This sequence of passes is subject to constant review and will change
    /// from time to time.
    pub fn optimizer_register_performance_passes(opt: *mut Optimizer);

    /// Registers passes that attempt to improve the size of generated code.
    /// This sequence of passes is subject to constant review and will change
    /// from time to time.
    pub fn optimizer_register_size_passes(opt: *mut Optimizer);

    /// Registers passes that have been prescribed for converting from Vulkan to
    /// WebGPU. This sequence of passes is subject to constant review and will
    /// change from time to time.
    pub fn optimizer_register_vulkan_to_webgpu_passes(opt: *mut Optimizer);

    /// Registers passes that have been prescribed for converting from WebGPU to
    /// Vulkan. This sequence of passes is subject to constant review and will
    /// change from time to time.
    pub fn optimizer_register_webgpu_to_vulkan_passes(opt: *mut Optimizer);

    /// Registers passes that attempt to legalize the generated code.
    ///
    /// Note: this recipe is specially designed for legalizing SPIR-V. It should be
    /// used by compilers after translating HLSL source code literally. It should
    /// *not* be used by general workloads for performance or size improvement.
    ///
    /// This sequence of passes is subject to constant review and will change
    /// from time to time.
    pub fn optimizer_register_hlsl_legalization_passes(opt: *mut Optimizer);

    // Some passes take arguments, so we create those separately on a
    // case-by-case basis

582 // #[repr(C)]
583 // pub struct SpecConstantDefault {
584 // pub id: u32,
585 // pub value_ptr: *const c_char,
586 // pub value_len: usize,
587 // }
588
589 // Creates a set-spec-constant-default-value pass from a mapping from spec-ids
590 // to the default values in the form of string.
591 // A set-spec-constant-default-value pass sets the default values for the
592 // spec constants that have SpecId decorations (i.e., those defined by
593 // OpSpecConstant{|True|False} instructions).
594 // SetSpecConstantDefaultValuePass(
595 // const std::unordered_map<uint32_t, std::string>& id_value_map);
596
    // Creates a pass to instrument OpDebugPrintf instructions.
    // This pass replaces all OpDebugPrintf instructions with instructions that
    // write a record containing the string id and all specified values into a
    // special printf output buffer (if space allows). This pass is designed to
    // support the printf validation in the Vulkan validation layers.
    //
    // The instrumentation will write buffers in debug descriptor set |desc_set|.
    // It will write |shader_id| in each output record to identify the shader
    // module which generated the record.
    // InstDebugPrintfPass(uint32_t desc_set,
    //                     uint32_t shader_id);

    // Creates a pass to instrument bindless descriptor checking.
    // This pass instruments all bindless references to check that descriptor
    // array indices are in bounds and, if the descriptor indexing extension is
    // enabled, that the descriptor has been initialized. If the reference is
    // invalid, a record is written to the debug output buffer (if space allows)
    // and a null value is returned. This pass is designed to support bindless
    // validation in the Vulkan validation layers.
    //
    // TODO(greg-lunarg): Add support for buffer references. Currently only does
    // checking for image references.
    //
    // Dead code elimination should be run after this pass as the original,
    // potentially invalid code is not removed and could cause undefined behavior,
    // including crashes. It may also be beneficial to run Simplification
    // (i.e., Constant Propagation), DeadBranchElim and BlockMerge after this pass
    // to optimize instrumentation code involving the testing of compile-time
    // constants. It is also generally recommended that this pass (and all
    // instrumentation passes) be run after any legalization and optimization
    // passes. This will give better analysis for the instrumentation and avoid
    // potentially de-optimizing the instrumentation code, for example by inlining
    // the debug record output function throughout the module.
    //
    // The instrumentation will read and write buffers in debug
    // descriptor set |desc_set|. It will write |shader_id| in each output record
    // to identify the shader module which generated the record.
    // |input_length_enable| controls instrumentation of runtime descriptor array
    // references, and |input_init_enable| controls instrumentation of descriptor
    // initialization checking, both of which require input buffer support.
    // InstBindlessCheckPass(
    //     uint32_t desc_set, uint32_t shader_id, bool input_length_enable = false,
    //     bool input_init_enable = false, bool input_buff_oob_enable = false);

    // Creates a pass to instrument physical buffer address checking.
    // This pass instruments all physical buffer address references to check that
    // all referenced bytes fall in a valid buffer. If the reference is
    // invalid, a record is written to the debug output buffer (if space allows)
    // and a null value is returned. This pass is designed to support buffer
    // address validation in the Vulkan validation layers.
    //
    // Dead code elimination should be run after this pass as the original,
    // potentially invalid code is not removed and could cause undefined behavior,
    // including crashes. Instruction simplification would likely also be
    // beneficial. It is also generally recommended that this pass (and all
    // instrumentation passes) be run after any legalization and optimization
    // passes. This will give better analysis for the instrumentation and avoid
    // potentially de-optimizing the instrumentation code, for example by inlining
    // the debug record output function throughout the module.
    //
    // The instrumentation will read and write buffers in debug
    // descriptor set |desc_set|. It will write |shader_id| in each output record
    // to identify the shader module which generated the record.
    // InstBuffAddrCheckPass(uint32_t desc_set,
    //                       uint32_t shader_id);

    // Creates a loop unroller pass.
    // This pass unrolls loops which have the "Unroll" loop control mask set.
    // A loop must meet specific criteria in order to be unrolled safely; these
    // criteria are checked before unrolling by the LoopUtils::CanPerformUnroll
    // method. Any loop that does not meet the criteria won't be unrolled. See
    // CanPerformUnroll in LoopUtils.h for more information.
    // LoopUnrollPass(bool fully_unroll, int factor = 0);

    // Creates a scalar replacement pass.
    // This pass replaces composite function scope variables with variables for each
    // element if those elements are accessed individually. The parameter is a
    // limit on the number of members in the composite variable that the pass will
    // consider replacing.
    // ScalarReplacementPass(uint32_t size_limit = 100);

    // Creates a loop fission pass.
    // This pass will split all top-level loops whose register pressure exceeds the
    // given |threshold|.
    // LoopFissionPass(size_t threshold);

    // Creates a loop fusion pass.
    // This pass will look for adjacent loops that are compatible and legal to be
    // fused. It fuses all such loops as long as the register usage for the fused
    // loop stays under the threshold defined by |max_registers_per_loop|.
    // LoopFusionPass(size_t max_registers_per_loop);
}
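
// The commented-out `SpecConstantDefault` struct above suggests how the
// `std::unordered_map<uint32_t, std::string>` argument of
// SetSpecConstantDefaultValuePass could be flattened for a C ABI. The sketch
// below is one possible way to build such a flat array on the Rust side. It is
// an assumption, not part of these bindings: the struct is re-declared here for
// illustration, the helper name `spec_constant_defaults` is hypothetical, and no
// C entry point that consumes the array exists in this file.

```rust
use std::collections::HashMap;
use std::os::raw::c_char;

/// Hypothetical C-compatible view of one (spec id, default value) pair,
/// mirroring the commented-out `SpecConstantDefault` struct.
#[repr(C)]
pub struct SpecConstantDefault {
    pub id: u32,
    /// Points at the UTF-8 bytes of the value; NOT nul-terminated.
    pub value_ptr: *const c_char,
    pub value_len: usize,
}

/// Builds borrowed views over the entries of `map`, sorted by spec id.
///
/// The returned pointers borrow from `map`: they are valid only while `map`
/// is alive and unmodified, so the caller must keep it around for as long
/// as the views are in use.
pub fn spec_constant_defaults(map: &HashMap<u32, String>) -> Vec<SpecConstantDefault> {
    let mut out: Vec<SpecConstantDefault> = map
        .iter()
        .map(|(id, value)| SpecConstantDefault {
            id: *id,
            value_ptr: value.as_ptr() as *const c_char,
            value_len: value.len(),
        })
        .collect();
    // HashMap iteration order is unspecified; sort for a deterministic layout.
    out.sort_by_key(|d| d.id);
    out
}
```

// Passing an explicit length instead of relying on a nul terminator avoids
// copying each value into a `CString`; the trade-off is that the C side must
// never read past `value_len` bytes.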