pgrx_pg_sys/submodules/ffi.rs
//LICENSE Portions Copyright 2019-2021 ZomboDB, LLC.
//LICENSE
//LICENSE Portions Copyright 2021-2023 Technology Concepts & Design, Inc.
//LICENSE
//LICENSE Portions Copyright 2023-2023 PgCentral Foundation, Inc. <contact@pgcentral.org>
//LICENSE
//LICENSE All rights reserved.
//LICENSE
//LICENSE Use of this source code is governed by the MIT license that can be found in the LICENSE file.
#![deny(unsafe_op_in_unsafe_fn)]
#[cfg(not(all(
    any(target_os = "linux", target_os = "macos"),
    any(target_arch = "x86_64", target_arch = "aarch64")
)))]
mod cee_scape {
    #[cfg(not(feature = "cshim"))]
    compile_error!("target platform cannot work without feature cshim");

    use libc::{c_int, c_void};
    use std::marker::PhantomData;

    #[repr(C)]
    pub struct SigJmpBufFields {
        _internal: [u8; 0],
        _neither_send_nor_sync: PhantomData<*const u8>,
    }

    pub fn call_with_sigsetjmp<F>(savemask: bool, mut callback: F) -> c_int
    where
        F: for<'a> FnOnce(&'a SigJmpBufFields) -> c_int,
    {
        extern "C" {
            fn call_closure_with_sigsetjmp(
                savemask: c_int,
                closure_env_ptr: *mut c_void,
                closure_code: extern "C" fn(
                    jbuf: *const SigJmpBufFields,
                    env_ptr: *mut c_void,
                ) -> c_int,
            ) -> c_int;
        }

        extern "C" fn call_from_c_to_rust<F>(
            jbuf: *const SigJmpBufFields,
            closure_env_ptr: *mut c_void,
        ) -> c_int
        where
            F: for<'a> FnOnce(&'a SigJmpBufFields) -> c_int,
        {
            let closure_env_ptr: *mut F = closure_env_ptr as *mut F;
            unsafe { (closure_env_ptr.read())(&*jbuf) }
        }

        let savemask: libc::c_int = if savemask { 1 } else { 0 };

        unsafe {
            let closure_env_ptr = core::ptr::addr_of_mut!(callback);
            core::mem::forget(callback);
            call_closure_with_sigsetjmp(
                savemask,
                closure_env_ptr as *mut libc::c_void,
                call_from_c_to_rust::<F>,
            )
        }
    }
}
use cee_scape::{call_with_sigsetjmp, SigJmpBufFields};
/**
Given a closure that is assumed to be a wrapped Postgres `extern "C"` function, [pg_guard_ffi_boundary]
works with the Postgres and C runtimes to create a "barrier" that allows Rust to catch Postgres errors
(`elog(ERROR)`) while running the supplied closure. This is done for the sake of allowing Rust to run
destructors before Postgres destroys the memory contexts that Rust-in-Postgres code may be enmeshed in.

Wrapping the FFI into Postgres helps with:
- ensuring memory safety
- improving error logging
- minimizing resource leaks

But only the first of these is considered paramount.

At all times PGRX reserves the right to choose an implementation that achieves memory safety.
Currently, this function is used to protect **every** bindgen-generated Postgres `extern "C"` function.

Generally, the only time *you'll* need to use this function is when calling a Postgres-provided
function pointer.

# Safety

It is undefined behavior if the function passed to `pg_guard_ffi_boundary` has objects with
destructors on the stack when Postgres raises an `ERROR`. For example, the following is
both a resource leak and undefined behavior (the closure's frame needs to be a
[trivially-deallocated stack frame]):
```rust,ignore
// This is UB!
pgrx::pg_sys::ffi::pg_guard_ffi_boundary(|| {
let data = vec![1, 2, 3, 4, 5];
// call FFI function that raises an ERROR
});
```
Instead, you should write it like
```rust,ignore
let data = vec![1, 2, 3, 4, 5];
pgrx::pg_sys::ffi::pg_guard_ffi_boundary(|| {
// call FFI function that raises an ERROR
});
```
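
To see why hoisting `data` out of the closure helps, here is a minimal, self-contained sketch.
It assumes a hypothetical `fake_guard` that stands in for `pg_guard_ffi_boundary` and simply
panics, much as the real guard panics after trapping an `ERROR`; `Resource`, `DROPPED`, and
`destructor_runs_on_unwind` are likewise made up for illustration. Because the value is owned by
the frame outside the guarded closure, ordinary unwinding reaches its destructor, so
`destructor_runs_on_unwind()` returns `true`:

```rust
use std::panic::{self, AssertUnwindSafe};
use std::sync::atomic::{AtomicBool, Ordering};

// Set by `Resource::drop` so the sketch can observe that the destructor ran.
static DROPPED: AtomicBool = AtomicBool::new(false);

// Hypothetical resource owner; not a PGRX type.
struct Resource;

impl Drop for Resource {
    fn drop(&mut self) {
        DROPPED.store(true, Ordering::SeqCst);
    }
}

// Hypothetical stand-in for the guard: when the "FFI call" reports failure,
// the failure surfaces as a Rust panic in the caller's frame.
fn fake_guard(f: impl FnOnce() -> bool) {
    if !f() {
        panic!("simulated Postgres ERROR");
    }
}

/// Returns whether the destructor ran while the panic unwound.
fn destructor_runs_on_unwind() -> bool {
    DROPPED.store(false, Ordering::SeqCst);
    let result = panic::catch_unwind(AssertUnwindSafe(|| {
        // Owned by the frame outside the guarded closure.
        let _data = Resource;
        // The guarded closure itself owns nothing with a destructor.
        fake_guard(|| false);
    }));
    result.is_err() && DROPPED.load(Ordering::SeqCst)
}
```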

Further, it is undefined behavior if the function passed into `pg_guard_ffi_boundary` panics. It
is recommended that you keep the body of the `pg_guard_ffi_boundary` closure very small -- ideally
*only* containing a call to some C function, rather than containing any logic or variables of its
own.

In addition, Postgres is a single-threaded runtime. As such, [`pg_guard_ffi_boundary`] should
**only** be called from the main thread. In fact, [`pg_guard_ffi_boundary`] will detect a call
from any other thread and immediately panic.
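
One way such a check could work is sketched below. This is not PGRX's actual implementation:
`check_active_thread_sketch`, `ACTIVE_THREAD`, and `other_thread_is_rejected` are hypothetical
names, and the sketch assumes that the first thread to call the check is the one allowed to
proceed.

```rust
use std::sync::OnceLock;
use std::thread::{self, ThreadId};

// Hypothetical record of the one thread allowed to cross the FFI boundary.
static ACTIVE_THREAD: OnceLock<ThreadId> = OnceLock::new();

/// Records the first calling thread, then panics if any other thread calls.
fn check_active_thread_sketch() {
    let current = thread::current().id();
    let owner = *ACTIVE_THREAD.get_or_init(|| current);
    if owner != current {
        panic!("FFI guard called from a thread other than the one that first used it");
    }
}

/// Claims the current thread, then shows that a second thread is rejected.
fn other_thread_is_rejected() -> bool {
    check_active_thread_sketch();
    // The panic in the spawned thread surfaces as an `Err` from `join`.
    thread::spawn(check_active_thread_sketch).join().is_err()
}
```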

More generally, Rust cannot guarantee destructors are always run, PGRX is written in Rust code, and
the implementation of `pg_guard_ffi_boundary` relies on help from Postgres, the OS, and the C runtime;
thus, relying on the FFI boundary catching an error and propagating it back into Rust to guarantee
Rust's language-level memory safety when calling Postgres is unsound (i.e. there are no promises).
Postgres can and does opt to erase exception and error context stacks in some situations.
The C runtime is beholden to the operating system, which may do as it likes with a thread.
PGRX has many magical powers, some of considerable size, but they are not infinite cosmic power.

Thus, if Postgres gives you a pointer into the database's memory, and you corrupt that memory
in a way technically permitted by Rust, intending to fix it before Postgres or Rust notices,
then you may not call Postgres and expect Postgres to not notice the code crimes in progress.
Postgres and Rust will see you. Whether they choose to ignore such misbehavior is up to them, not PGRX.
If you are manipulating transient "pure Rust" data, however, it is unlikely this is of consequence.

# Implementation Note

The main implementation uses `sigsetjmp`, [`pg_sys::error_context_stack`], and [`pg_sys::PG_exception_stack`],
which, when Postgres enters its exception handling in `elog.c`, will prompt a `siglongjmp` back to the `sigsetjmp` point.
This caught error is then converted into a Rust `panic!()` and propagated up the stack, ultimately
being converted into a transaction-aborting Postgres `ERROR` by PGRX.
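
The panic half of that conversion can be illustrated with plain `std`. The sketch below is a
stand-in under stated assumptions, not PGRX's code: `SketchReport`, `raise`, and `catch_report`
are hypothetical, whereas the real payload is `CaughtError::PostgresError`. It shows a typed
payload being raised with `std::panic::panic_any` and recovered further up the stack by
downcasting, with unrelated panics left to keep unwinding:

```rust
use std::panic::{self, AssertUnwindSafe};

// Hypothetical, simplified stand-in for an error report carried by a panic.
#[derive(Debug, PartialEq)]
struct SketchReport {
    sqlerrcode: i32,
    message: String,
}

/// Raises a typed panic payload, as a guard might after trapping an ERROR.
fn raise(report: SketchReport) -> ! {
    panic::panic_any(report)
}

/// Runs `f`, recovering a `SketchReport` payload if one unwinds out of it.
fn catch_report<T>(f: impl FnOnce() -> T) -> Result<T, SketchReport> {
    match panic::catch_unwind(AssertUnwindSafe(f)) {
        Ok(value) => Ok(value),
        Err(payload) => match payload.downcast::<SketchReport>() {
            Ok(report) => Err(*report),
            // Not our payload: keep unwinding with the original panic.
            Err(other) => panic::resume_unwind(other),
        },
    }
}
```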

[trivially-deallocated stack frame]: https://github.com/rust-lang/rfcs/blob/master/text/2945-c-unwind-abi.md#plain-old-frames
**/
use crate as pg_sys;
use crate::panic::{CaughtError, ErrorReport, ErrorReportLocation, ErrorReportWithLevel};
use core::ffi::CStr;
use std::mem::MaybeUninit;
#[inline(always)]
#[track_caller]
pub unsafe fn pg_guard_ffi_boundary<T, F: FnOnce() -> T>(f: F) -> T {
    // SAFETY: Caller promises not to call us from anything but the main thread.
    unsafe { pg_guard_ffi_boundary_impl(f) }
}
#[inline(always)]
#[track_caller]
unsafe fn pg_guard_ffi_boundary_impl<T, F: FnOnce() -> T>(f: F) -> T {
    //! This is the version that uses sigsetjmp and all that, for "normal" Rust/PGRX interfaces.

    // The next code is definitely thread-unsafe (it manipulates statics in an
    // unsynchronized manner), so we may as well check here.
    super::thread_check::check_active_thread();

    // SAFETY: This should really, really not be done in a multithreaded context as it
    // accesses multiple `static mut`. The ultimate caller asserts this is the main thread.
    unsafe {
        let caller_memxct = pg_sys::CurrentMemoryContext;
        let prev_exception_stack = pg_sys::PG_exception_stack;
        let prev_error_context_stack = pg_sys::error_context_stack;
        let mut result: std::mem::MaybeUninit<T> = MaybeUninit::uninit();
        let jump_value = call_with_sigsetjmp(false, |jump_buffer| {
            // Make Postgres' error-handling system aware of our new
            // setjmp/longjmp restore point.
            pg_sys::PG_exception_stack = std::mem::transmute(jump_buffer as *const SigJmpBufFields);

            // execute the closure, which will be a wrapped internal Postgres function
            result.write(f());
            0
        });

        if jump_value == 0 {
            // Flag is 0, so we've taken the successful return path. We're not
            // here as the result of a longjmp.
            // Restore Postgres' understanding of where its next longjmp should go
            pg_sys::PG_exception_stack = prev_exception_stack;
            pg_sys::error_context_stack = prev_error_context_stack;
            result.assume_init()
        } else {
            // We've landed here b/c of a longjmp originating in Postgres

            // the overhead to get the current [ErrorData] from Postgres and convert
            // it into our [ErrorReportWithLevel] seems worth the user benefit
            //
            // Note that this only happens in the case of us trapping an error

            // At this point, we're running within `pg_sys::ErrorContext`, but should be in the
            // memory context the caller was in before we call [CopyErrorData()] and start using it
            pg_sys::CurrentMemoryContext = caller_memxct;

            // SAFETY: `pg_sys::CopyErrorData()` will always give us a valid pointer, so just assume so
            let errdata_ptr = pg_sys::CopyErrorData();
            let errdata = errdata_ptr.as_ref().unwrap_unchecked();

            // copy out the fields we need to support pgrx' error handling
            let level = errdata.elevel.into();
            let sqlerrcode = errdata.sqlerrcode.into();
            let message = errdata
                .message
                .is_null()
                .then(|| String::from("<null error message>"))
                .unwrap_or_else(|| CStr::from_ptr(errdata.message).to_string_lossy().to_string());
            let detail = errdata.detail.is_null().then_some(None).unwrap_or_else(|| {
                Some(CStr::from_ptr(errdata.detail).to_string_lossy().to_string())
            });
            let hint = errdata.hint.is_null().then_some(None).unwrap_or_else(|| {
                Some(CStr::from_ptr(errdata.hint).to_string_lossy().to_string())
            });
            let funcname = errdata.funcname.is_null().then_some(None).unwrap_or_else(|| {
                Some(CStr::from_ptr(errdata.funcname).to_string_lossy().to_string())
            });
            let file =
                errdata.filename.is_null().then(|| String::from("<null filename>")).unwrap_or_else(
                    || CStr::from_ptr(errdata.filename).to_string_lossy().to_string(),
                );
            let line = errdata.lineno as _;

            // clean up after ourselves by freeing the result of [CopyErrorData] and restoring
            // Postgres' understanding of where its next longjmp should go
            pg_sys::FreeErrorData(errdata_ptr);
            pg_sys::PG_exception_stack = prev_exception_stack;
            pg_sys::error_context_stack = prev_error_context_stack;

            // finally, turn this Postgres error into a Rust panic so that we can ensure proper
            // Rust stack unwinding and also defer handling until later
            std::panic::panic_any(CaughtError::PostgresError(ErrorReportWithLevel {
                level,
                inner: ErrorReport {
                    sqlerrcode,
                    message,
                    detail,
                    hint,
                    location: ErrorReportLocation { file, funcname, line, col: 0, backtrace: None },
                },
            }))
        }
    }
}