odbc_api/guide.rs
1#![allow(clippy::needless_doctest_main)]
2/*!
3# Introduction to `odbc-api` (documentation only)
4
5## About ODBC
6
7ODBC is an open standard which allows you to connect to various data sources. Mostly these data
8sources are databases, but ODBC drivers are also available for various file types like Excel or
9CSV.
10
11Your application does not link against a driver, but will link against an ODBC driver manager
12which must be installed on the system you intend to run the application. On modern Windows
13Platforms ODBC is always installed, on OS-X or Linux distributions a driver manager like
14[unixODBC](http://www.unixodbc.org/) must be installed.
15
16To connect to a data source a driver for the specific data source in question must be installed.
17On windows you can type 'ODBC Data Sources' into the search box to start a little GUI which
18shows you the various drivers and preconfigured data sources on your system.
19
20This however is not a guide on how to configure and setup ODBC. This is a guide on how to use
21the Rust bindings for applications which want to utilize ODBC data sources.
22
23## Quickstart
24
25```no_run
26//! A program executing a query and printing the result as csv to standard out. Requires
27//! `anyhow` and `csv` crate.
28
29use anyhow::Error;
30use odbc_api::{buffers::TextRowSet, Cursor, Environment, ConnectionOptions, ResultSetMetadata};
31use std::{
32 ffi::CStr,
33 io::{stdout, Write},
34 path::PathBuf,
35};
36
37/// Maximum number of rows fetched with one row set. Fetching batches of rows is usually much
38/// faster than fetching individual rows.
39const BATCH_SIZE: usize = 5000;
40
41fn main() -> Result<(), Error> {
42 // Write csv to standard out
43 let out = stdout();
44 let mut writer = csv::Writer::from_writer(out);
45
46 // If you do not do anything fancy it is recommended to have only one Environment in the
47 // entire process.
48 let environment = Environment::new()?;
49
50 // Connect using a DSN. Alternatively we could have used a connection string
51 let mut connection = environment.connect(
52 "DataSourceName",
53 "Username",
54 "Password",
55 ConnectionOptions::default(),
56 )?;
57
58 // Execute a one of query without any parameters.
59 match connection.execute("SELECT * FROM TableName", (), None)? {
60 Some(mut cursor) => {
61 // Write the column names to stdout
62 let mut headline : Vec<String> = cursor.column_names()?.collect::<Result<_,_>>()?;
63 writer.write_record(headline)?;
64
65 // Use schema in cursor to initialize a text buffer large enough to hold the largest
66 // possible strings for each column up to an upper limit of 4KiB.
67 let mut buffers = TextRowSet::for_cursor(BATCH_SIZE, &mut cursor, Some(4096))?;
68 // Bind the buffer to the cursor. It is now being filled with every call to fetch.
69 let mut row_set_cursor = cursor.bind_buffer(&mut buffers)?;
70
71 // Iterate over batches
72 while let Some(batch) = row_set_cursor.fetch()? {
73 // Within a batch, iterate over every row
74 for row_index in 0..batch.num_rows() {
75 // Within a row iterate over every column
76 let record = (0..batch.num_cols()).map(|col_index| {
77 batch
78 .at(col_index, row_index)
79 .unwrap_or(&[])
80 });
81 // Writes row as csv
82 writer.write_record(record)?;
83 }
84 }
85 }
86 None => {
87 eprintln!(
88 "Query came back empty. No output has been created."
89 );
90 }
91 }
92
93 Ok(())
94}
95```
96
97## 32 Bit and 64 Bit considerations.
98
99To consider whether you want to work with 32 Bit or 64 Bit data sources is especially important
100for windows users, as driver managers (and possibly drivers) may both exist at the same time
101in the same system.
102
103In any case, depending on the platform part of your target triple either 32 Bit or 64 Bit
104drivers are going to work, but not both. On a private windows machine (even on a modern 64 Bit
105Windows) it is not unusual to find lots of 32 Bit drivers installed on the system, but none for
10664 Bits. So for windows users it is worth thinking about not using the default toolchain which
107is likely 64 Bits and to switch to a 32 Bit one. On other platforms you are usually fine
108sticking with 64 Bits, as there are not usually any drivers preinstalled anyway, 64 Bit or
109otherwise.
110
111No code changes are required, so you can also just build both if you want to.
112
113## Connecting to a data source
114
115### Setting up the ODBC Environment
116
117To connect with a data source we need a connection. To create a connection we need an ODBC
118environment.
119
120```no_run
121use odbc_api::Environment;
122
123let env = Environment::new()?;
124
125# Ok::<(), odbc_api::Error>(())
126```
127
128This is it. Our ODBC Environment is ready for action. We can use it to list data sources and
129drivers, but most importantly we can use it to create connections.
130
131These bindings currently support two ways of creating a connections:
132
133### Connect using a connection string
134
135Connection strings do not require that the data source is preconfigured by the driver manager
136this makes them very flexible.
137
138```no_run
139use odbc_api::{ConnectionOptions, Environment};
140
141let env = Environment::new()?;
142
143let connection_string = "
144 Driver={ODBC Driver 18 for SQL Server};\
145 Server=localhost;\
146 UID=SA;\
147 PWD=My@Test@Password1;\
148";
149
150let mut conn = env.connect_with_connection_string(connection_string, ConnectionOptions::default())?;
151# Ok::<(), odbc_api::Error>(())
152```
153
154There is a syntax to these connection strings, but few people go through the trouble to learn
155it. Most common strategy is to google one that works for with your data source. The connection
156borrows the environment, so you will get a compiler error, if your environment goes out of scope
157before the connection does.
158
159> You can list the available drivers using [`crate::Environment::drivers`].
160
161### Connect using a Data Source Name (DSN)
162
163Should a data source be known by the driver manager we can access it using its name and
164credentials. This is more convenient for the user or application developer, but requires a
165configuration of the ODBC driver manager. Think of it as shifting work from users to
166administrators.
167
168```no_run
169use odbc_api::{Environment, ConnectionOptions};
170
171let env = Environment::new()?;
172
173let mut conn = env.connect(
174 "YourDatabase", "SA", "My@Test@Password1",
175 ConnectionOptions::default()
176)?;
177# Ok::<(), odbc_api::Error>(())
178```
179
180How to configure such data sources is not the scope of this guide, and depends on the driver
181manager in question.
182
183> You can list the available data sources using [`crate::Environment::data_sources`].
184
185### Connection pooling
186
187ODBC specifies an interface to enable the driver manager to enable connection pooling for your
188application. It is off by default, but if you use ODBC to connect to your data source instead of
189implementing it in your application, or importing a library you may simply enable it in ODBC
190instead.
191Connection Pooling is governed by two attributes. The most important one is the connection
192pooling scheme which is `Off` by default. It must be set even before you create your ODBC
193environment. It is global mutable state on the process level. Setting it in Rust is therefore
194unsafe.
195
196The other one is changed via [`crate::Environment::set_connection_pooling_matching`]. It governs
197how a connection is chosen from the pool. It defaults to strict which means the `Connection`
198you get from the pool will have exactly the attributes specified in the connection string.
199
200Here is an example of how to create an ODBC environment with connection pooling.
201
202```
203use odbc_api::{Environment, environment, sys::{AttrConnectionPooling, AttrCpMatch}};
204
205fn main() {
206 // Enable connection pooling. Let driver decide whether the attributes of two connection
207 // are similar enough to change the attributes of a pooled one, to fit the requested
208 // connection, or if it is cheaper to create a new Connection from scratch.
209 // See <https://docs.microsoft.com/en-us/sql/odbc/reference/develop-app/driver-aware-connection-pooling>
210 //
211 // Safety: This call changes global mutable space in the underlying ODBC driver manager. Easiest
212 // to prove it is not causing a raise by calling it once right at the startup of your
213 // application.
214 unsafe {
215 Environment::set_connection_pooling(AttrConnectionPooling::DriverAware).unwrap();
216 }
217
218 // As long as `environment` is called for the first time **after** connection pooling is
219 // activated the static environment returned by it will have it activated.
220 let env = environment();
221 // ... use env to do db stuff
222}
223```
224
225## Executing SQL statements
226
227### Executing a single SQL statement
228
229With our ODBC connection all set up and ready to go, we can execute an SQL query:
230
231```no_run
232use odbc_api::{Environment, ConnectionOptions};
233
234let env = Environment::new()?;
235
236let mut conn = env.connect(
237 "YourDatabase", "SA", "My@Test@Password1",
238 ConnectionOptions::default()
239)?;
240let query = "SELECT year, name FROM Birthdays;";
241let parameters = (); // This query does not use any parameters.
242let timeout_sec = None;
243if let Some(cursor) = conn.execute(query, parameters, timeout_sec)? {
244 // Use cursor to process query results.
245}
246# Ok::<(), odbc_api::Error>(())
247```
248
249The first parameter of `execute` is the SQL statement text. The second parameter is used to pass
250arguments of the SQL Statements itself (more on that later). Ours has none, so we use `()` to
251not bind any arguments to the statement. You can learn all about passing parameters from the
252[`parameter module level documentation`](`crate::parameter`). It may feature an example for
253your use case.
254
255Note that the result of the operation is an `Option`. This reflects that not every statement
256returns a [`Cursor`](crate::Cursor). `INSERT` statements usually do not, but even `SELECT`
257queries which would return zero rows can depending on the driver return either an empty cursor
258or no cursor at all. Should a cursor exists, it must be consumed or closed. The `drop` handler
259of Cursor will close it for us. If the `Option` is `None` there is nothing to close, so is all
260taken care of, nice.
261
262### Executing a many SQL statements in sequence
263
264Execution of a statement, its bound parameters, its buffers and result set, are all managed by
265ODBC using a statement handle. If you call [`crate::Connection::execute`] a new one is allocated
266each time and the resulting cursor takes ownership of it (giving you an easier time with the
267borrow checker). In a use case there you want to execute multiple SQL Statements over the same
268connection in sequence, you may want to reuse an already allocated statement handle. This is
269what the [`crate::Preallocated`] struct is for. Please note that if you want to execute the same
270query with different parameters you can have potentially even better performance by utilizing
271prepared Statements ([`crate::Prepared`]).
272
273```
274use odbc_api::{Connection, Error};
275use std::io::{self, stdin, Read};
276
277fn interactive(conn: &Connection) -> io::Result<()>{
278 let mut statement = conn.preallocate().unwrap();
279 let mut query = String::new();
280 stdin().read_line(&mut query)?;
281 while !query.is_empty() {
282 match statement.execute(&query, ()) {
283 Err(e) => println!("{}", e),
284 Ok(None) => println!("No results set generated."),
285 Ok(Some(cursor)) => {
286 // ...print cursor contents...
287 }
288 }
289 stdin().read_line(&mut query)?;
290 }
291 Ok(())
292}
293```
294
295### Executing prepared queries
296
297Should your use case require you to execute the same query several times with different parameters,
298prepared queries are the way to go. These give the database a chance to cache the access plan
299associated with your SQL statement. It is not unlike compiling your program once and executing it
300several times.
301
302```
303use odbc_api::{Connection, Error, IntoParameter};
304use std::io::{self, stdin, Read};
305
306fn interactive(conn: &Connection) -> io::Result<()>{
307 let mut prepared = conn.prepare("SELECT * FROM Movies WHERE title=?;").unwrap();
308 let mut title = String::new();
309 stdin().read_line(&mut title)?;
310 while !title.is_empty() {
311 match prepared.execute(&title.as_str().into_parameter()) {
312 Err(e) => println!("{}", e),
313 // Most drivers would return a result set even if no Movie with the title is found,
314 // the result set would just be empty. Well, most drivers.
315 Ok(None) => println!("No result set generated."),
316 Ok(Some(cursor)) => {
317 // ...print cursor contents...
318 }
319 }
320 stdin().read_line(&mut title)?;
321 }
322 Ok(())
323}
324```
325
326## Fetching results
327
328ODBC offers two ways of retrieving values from a cursor over a result set. Row by row fetching and
329bulk fetching using application provided buffers.
330
331### Fetching results by filling application provided buffers
332
333The most efficient way to query results is not query an ODBC data source row by row, but to
334ask for a whole bulk of rows at once. The ODBC driver and driver manager will then fill these
335row sets into buffers which have been previously bound. This is also the most efficient way to
336query a single row many times for many queries, if the application can reuse the bound buffer.
337This crate allows you to provide your own buffers by implementing the [`crate::RowSetBuffer`]
338trait. That however requires `unsafe` code.
339
340This crate also provides three implementations of the [`crate::RowSetBuffer`] trait, ready to be
341used in safe code:
342
343* [`crate::buffers::ColumnarBuffer`]: Binds to the result set column wise. This is usually helpful
344 in dataengineering or data sciense tasks. This buffer type can be used in situations there the
345 schema of the queried data is known at compile time, as well as for generic applications which do
346 work with wide range of different data. Checkt the struct documentation for examples.
347* [`crate::buffers::TextRowSet`]: Queries all data as text bound in columns. Since the columns are
348 homogeneous, you can also use this, to iterate row wise over the buffer. Excellent if you want
349 to print the contents of a table, or are for any reason only interessted in the text
350 representation of the values.
351* [`crate::buffers::RowVec`]: A good choice if you know the schema at compile time and your
352 application logic is build in a row by row fashion, rather than column by column.
353
354You can read more about them in the documentation of the [`crate::buffers`] module.
355
356## Inserting values into a table
357
358### Inserting a single row into a table
359
360Inserting a single row can be done by executing a statement and binding the fields as parameters
361in a tuple.
362
363```no_run
364use odbc_api::{Connection, Error, IntoParameter};
365
366fn insert_birth_year(conn: &Connection, name: &str, year: i16) -> Result<(), Error>{
367 conn.execute(
368 "INSERT INTO Birthdays (name, year) VALUES (?, ?)",
369 (&name.into_parameter(), &year),
370 None,
371 )?;
372 Ok(())
373}
374```
375
376### Columnar bulk inserts
377
378Inserting values row by row can introduce a lot of overhead. ODBC allows you to perform either
379row or column wise bulk inserts. Especially in pipelines for data science you may already have
380buffers in a columnar layout at hand. [`crate::ColumnarBulkInserter`] can be used for bulk inserts.
381
382```no_run
383use odbc_api::{Connection, Error, IntoParameter, buffers::BufferDesc};
384
385fn insert_birth_years(conn: &Connection, names: &[&str], years: &[i16]) -> Result<(), Error> {
386
387 // All columns must have equal length.
388 assert_eq!(names.len(), years.len());
389
390 let prepared = conn.prepare("INSERT INTO Birthdays (name, year) VALUES (?, ?)")?;
391
392 // Create a columnar buffer which fits the input parameters.
393 let buffer_description = [
394 BufferDesc::Text { max_str_len: 255 },
395 BufferDesc::I16 { nullable: false },
396 ];
397 // The capacity must be able to hold at least the largest batch. We do everything in one go, so
398 // we set it to the length of the input parameters.
399 let capacity = names.len();
400 // Allocate memory for the array column parameters and bind it to the statement.
401 let mut prebound = prepared.into_column_inserter(capacity, buffer_description)?;
402 // Length of this batch
403 prebound.set_num_rows(capacity);
404
405
406 // Fill the buffer with values column by column
407 let mut col = prebound
408 .column_mut(0)
409 .as_text_view()
410 .expect("We know the name column to hold text.");
411
412 for (index, name) in names.iter().enumerate() {
413 col.set_cell(index, Some(name.as_bytes()));
414 }
415
416 let col = prebound
417 .column_mut(1)
418 .as_slice::<i16>()
419 .expect("We know the year column to hold i16.");
420 col.copy_from_slice(years);
421
422 prebound.execute()?;
423 Ok(())
424}
425```
426*/