Friday, December 19, 2008

Enumerating Foreign Key Constraints

After enumerating columns and indexes, it's now time to work on the first of the three types of constraints supported by SQL Compact: foreign key constraints. Constraint definitions are also retrieved by calling ITableCreation::GetTableDefinition. These are returned in an array of DBCONSTRAINTDESC structures that contains all the supported constraints. Each record contains one constraint whose type is stored in the ConstraintType member. The SQL Compact header file defines the following values:
  • DBCONSTRAINTTYPE_UNIQUE - A unique constraint.
  • DBCONSTRAINTTYPE_FOREIGNKEY - A foreign key constraint.
  • DBCONSTRAINTTYPE_PRIMARYKEY - A primary key constraint.
  • DBCONSTRAINTTYPE_CHECK - A check constraint (not supported by the current SQL Compact versions)
  • DBCONSTRAINTTYPE_SSCE_DEFAULT - Apparently this is a default value constraint. I have found no documentation about this constant and will investigate it later.
For now, we concentrate on the foreign key constraints and how to show them in the SchemaTree sample application. In this version of the sample, foreign keys will be displayed in a folder of their own, but in future versions this folder will be renamed "Constraints" and will contain all table constraints, not just foreign keys.

A foreign key definition is stored in the new CForeignKey class. You can see these objects being built from the array of DBCONSTRAINTDESC structures in the new CTableDefinition::FillForeignKeyArray function, where each array item is tested for the correct type and then fed to the CForeignKey class constructor, where all the work takes place. Each foreign key contains a list of matched columns from the referenced table and the base table (each pair is stored in an instance of the CForeignKeyPair class).
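The filtering step is simple enough to sketch here. This is a minimal sketch, assuming cConstraints and rgConstraints came back from ITableCreation::GetTableDefinition and that m_foreignKeys is a hypothetical pointer array:

for(ULONG i = 0; i < cConstraints; ++i)
{
    const DBCONSTRAINTDESC& desc = rgConstraints[i];
    if(desc.ConstraintType != DBCONSTRAINTTYPE_FOREIGNKEY)
        continue;   // skip primary keys, unique constraints, etc.

    // The constructor reads the referenced table ID and the matched
    // column pairs from the descriptor.
    m_foreignKeys.Add(new CForeignKey(desc));
}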

Displaying these in the schema tree is quite straightforward and follows the same rules I have used for columns and indexes. Please note that there are new accessors for foreign keys on the CTableSchema class.

In my next post I will also enumerate primary keys and unique constraints, put them all under the same folder, and start moving towards an on-device SQL Compact editor. I already got a name suggestion from Alberto Silva: SQL Explorer. More names, anyone?

Sample code: SchemaTree3.zip (1.2 MB)

Saturday, December 13, 2008

Native Pointers

No, I'm not talking about memory addresses, just interesting bits of information for the Native Mobile developer.

MSDN forums have recently moved to a new platform and got new URLs. Here are my favorites:
Visual Studio Smart Device Development - Native C++ Project
SQL Server Compact

On another note, Christopher Fairbairn just wrote another great post in his blog where he discusses some very cool techniques for Native Mobile developers, such as playing sound, using COM and displaying PNG images. A must read!

Thursday, December 11, 2008

Code changes

After a second look at the CAtlArray implementation, I'm planning to change all the containers from value to pointer semantics (instead of, say, CAtlArray&lt;CColumn&gt;, I will use CAtlArray&lt;CColumn*&gt;). The reason is quite simple: when growing the buffer through Add, the existing items are moved to the new buffer with a simple Checked::memmove_s call. The contained class copy constructor is not called, which may lead to memory leaks (depending on the contained class implementation). If the array merely contains pointers, everything works as expected and all memory relocations will be much faster (only the pointers are copied, not the objects themselves). The downside to this approach is that I will have to manually implement the code to release all the objects pointed to by the array.
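As an illustration of that extra cleanup work, here is a minimal sketch (CColumn is just an example element type and ClearColumns is a hypothetical helper):

CAtlArray<CColumn*> m_columns;

void ClearColumns()
{
    // Delete the objects the array points to...
    for(size_t i = 0; i < m_columns.GetCount(); ++i)
        delete m_columns[i];

    // ...and only then drop the (now dangling) pointers.
    m_columns.RemoveAll();
}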

Finally, don't miss this great post on the Windows Mobile Team Blog: Uninstalling Applications Programmatically in Windows Mobile.

Wednesday, December 10, 2008

The perfect time-waster

Reinventing the wheel. That's what I have been doing for the last few days implementing object arrays in C++... Why? Apparently some template classes are not easily exportable from a DLL, so you cannot use useful stuff like CString or CAtlArray as exported class members. After having implemented a very simple string class, and while working on object array allocation, I decided it was enough: from now on the OLE DB Client library will not live in a DLL but either in a LIB or embedded in the client project. Enough is enough - I want to move ahead and implement the cool stuff, not arrays and strings, just so that the damned thing can be neatly exported as a DLL.

The code I'm presenting today has some additions to the OLE DB Client library, namely some new schema-related classes:
  • CSchema - Contains an array of CTableSchema objects and a reference to a CSession.
  • CTableSchema - Contains a table's schema information. This class is prepared to load this information on demand in order to avoid a performance penalty when enumerating the database schema (all tables are loaded up front and each table's schema is loaded on demand).
  • CTableDefinition - Helper class that loads the table definition in a single OLE DB call. This populates the columns and constraints collections. Indexes are loaded separately through a specialized schema rowset (see the LoadIndexes method).
  • CColumn - Contains column schema information.
  • CIndex - Contains index schema information and a list of index columns.
  • CIndexColumn - An individual index column.
Please note that some of these classes will change in content and placement (I mean the .cpp file).

The sample project is the same - it enumerates in a tree the schema of the sample database placed on the device root. As you can see from the code, the tree lazily loads the table schema information (when the user expands either the "Columns" or the "Indexes" folder).

After looking a bit at this sample and at how the OLE DB Client library has evolved, I started wondering about writing a "Query Analyzer" type of application. This would mimic most of the "old" SQL Compact Query Analyzer application and would add a few more features. Writing this application would require development in other areas such as the user interface, but I think it will be a very interesting challenge.

What features would you like to see in the open source QA? What name would you give such app?

Sample: SchemaTree2.zip (1.21 MB)

Tuesday, December 02, 2008

Enumerating Columns with IDBSchemaRowset

While it's freezing out there (by Lisbon standards, of course) I have been busy working on the schema enumeration code for the OLE DB Client library. My first approach is to enumerate database tables and columns using the IDBSchemaRowset interface. Access to this interface is encapsulated in the CRowsetInterface class, from which the CTablesRowset and CColumnsRowset classes derive to enumerate tables and columns respectively. The IDBSchemaRowset interface works by accepting a set of "restrictions" and generates a schema rowset containing the requested data. Schema rowsets are formally defined in Appendix B of the OLE DB Programmer's Guide, where you can see the returned columns and the accepted "restrictions". SQL Compact implements a subset of these, as described in the BOL.

Using the CTablesRowset is quite easy (and you have seen it at work in a previous sample where it was used to test for the presence of a given table):

CTablesRowset tables(m_session);
CRowset rowset;
HRESULT hr;
hr = tables.Open(NULL, NULL, NULL, NULL, rowset);

The first four NULL parameters mean that you don't want to filter on any of the supported restrictions, so you will get a list of all the tables in the database. SQL Compact does not support table catalogs nor table schemas so the first two parameters are always NULL for this provider. The third restriction is the table name and the fourth is the table type (see the possible values for this in the TABLES schema rowset reference).

When scrolling through the tables rowset, you retrieve the current table name like this:

CString strTable;
rowset.GetValue(3, strTable);

Now that we have the table name, we can enumerate its columns by using the CColumnsRowset and imposing a restriction on the table name (see the COLUMNS schema rowset reference):

CColumnsRowset columns(m_session);
CRowset rsColumns;

hr = columns.Open(NULL, NULL, strTable, NULL, rsColumns);

This retrieves all columns from the given table, and you can get the column name from ordinal 4. As you can see from the COLUMNS schema rowset, there is a lot of additional schema information about a column, like its OLE DB type, "nullability", size and more.
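Scrolling through the columns rowset follows the same pattern used for tables. A minimal sketch, assuming (as in the earlier samples) that CRowset::MoveFirst and MoveNext return S_OK while rows remain:

CString strColumn;
for(hr = rsColumns.MoveFirst(); hr == S_OK; hr = rsColumns.MoveNext())
{
    rsColumns.GetValue(4, strColumn);   // COLUMN_NAME is at ordinal 4
    // ... add strColumn to the tree ...
}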

Although this is quite a convenient way to retrieve a table schema, we can use another OLE DB interface that returns column and constraint information in a single call. In the next post, I will look at ITableCreation.

Sample code:
SchemaTree.zip
OleDbClientLib2.zip

Monday, November 24, 2008

Connection Manager Article

The latest issue of MSDN Magazine is out and it features a great article from Marcus Perryman: Going Places: How Connection Manager Connects. The reason why I'm also referring to this article here is that its code samples are in C, a rarity these days.

Sunday, November 23, 2008

SQL Compact Schemas

Now that I have looked into the most basic principles of using a SQL Compact database from native code (base table cursors and command execution), I will look at the available tools to enumerate and manipulate the database schema. There are essentially two ways to go about this:
  1. SQL commands
  2. OLE DB interfaces
Using SQL commands
This may be the easiest approach and is also language-independent - you can use these techniques from both native and managed code. To determine what's in your database you can execute SELECT commands against the INFORMATION_SCHEMA system views. These work like regular SELECT commands and return specific rowsets:
  • COLUMNS
  • INDEXES
  • KEY_COLUMN_USAGE
  • PROVIDER_TYPES
  • TABLES
  • TABLE_CONSTRAINTS
  • REFERENTIAL_CONSTRAINTS (not documented for 2.0)
To retrieve information from one of these views, you just execute a simple query like:

SELECT * FROM INFORMATION_SCHEMA.TABLES

This will return all tables in your database. The advantage of using a SELECT command is that you can add a WHERE clause and filter the result. If you need to make changes to your database schema you can use all the supported DDL commands. Since version 3.0 came out, Microsoft has even provided us with a stored procedure to rename tables: sp_rename. Unfortunately you cannot rename columns, which is sometimes needed.
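For instance, to list the columns of a single table you can filter the COLUMNS view, and renaming a table is a single call (sp_rename parameter order as documented in the SQL Compact BOL; the table names are just examples):

SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'Employees'

sp_rename 'Employees', 'Staff'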

Using OLE DB interfaces
All of this (and a bit more) can also be done through some of the exposed OLE DB interfaces, namely the following Session interfaces:
  • IDBSchemaRowset - Generates rowsets that are similar to the ones returned by the INFORMATION_SCHEMA views.
  • IIndexDefinition - Creates and drops indexes.
  • ITableCreation - Retrieves the table definition (full schema).
  • ITableDefinition - Creates and drops tables, and adds and drops columns.
  • ITableDefinitionWithConstraints - Same as above, but also manages table constraints.
  • IAlterIndex - Only renames indexes in SQL Compact.
  • IAlterTable - Allows changing table properties like the table name, column defaults and identity properties. You can also use this interface to rename columns (not available in managed code).
Note that when using these interfaces, you are accessing the storage engine directly, circumventing the Query Processor.
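Getting hold of any of these is a single QueryInterface call on the session. A minimal sketch (spSession stands for whatever session object pointer your code holds):

CComPtr<ITableDefinitionWithConstraints> spTableDef;
HRESULT hr = spSession->QueryInterface(IID_ITableDefinitionWithConstraints,
                                       (void**)&spTableDef);
// on success, spTableDef can create a table and its constraints in one call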

In my next post I'm planning to publish a sample that illustrates how to use these interfaces.

Thursday, November 20, 2008

Logo Certification Article

Here's a very nice article for native mobile developers that addresses the coding required by some of the stricter Windows Mobile Logo Certification requirements. A worthwhile read.

Tuesday, November 18, 2008

Implementing SQL Compact Command Parameter BLOBs

After much rambling, testing and a few misconceptions, I have finally closed the chapter on SQL Compact command parameters. The sample code (see below) illustrates how to use a BLOB command parameter using the "variable length" binding buffer. Whenever the total size of the columns exceeds the allocated binding buffer size, the binding process is repeated with an increased binding buffer size. Although this is not an optimal solution (being able to use storage objects for this purpose would be much better), it frees you from having to specify the maximum size for BLOB parameters - the code calculates this for you.

Before you can execute a command with parameters, you must prepare it by calling the Prepare method (failing to do so will cause a run-time error). Here the code determines the number of existing parameters and allocates memory for a "user buffer". This user buffer is an array of CDbValue objects, a VARIANT-like class whose sole purpose is to store parameter values. When the user sets a parameter, it is first written to one of these objects and upon execution it is copied to the data buffer. Meanwhile, the code determines if the new set of parameters in the CDbValue array requires rebinding, rebinds if needed, and finally copies the values to the data buffer. This way you just have to set the parameters and execute the command (see the sample code).

As a side-effect of all these travails, I ended up creating a new class - CDbValueRef - from which CDbValue derives that I use as the "engine" that sets and gets data from various types, performing the necessary data type conversions. As you can now see, the column data access on the CRowset class is performed through an array of such objects that directly reference the bound data buffer, and they know how to handle storage object BLOBs. Now you can access your CRowset data using a type-safe interface.

Sample: InsertBlob.zip (152 KB)

Monday, November 17, 2008

I was right...

... for the wrong reasons. In fact, you cannot use storage objects to set BLOB command parameters. I found that the usual mechanism does not work because:
  1. When retrieving command parameter information through GetParameterInfo, you don't get the DBPARAMFLAGS_ISLONG bit set for BLOBs;
  2. Binding a BLOB parameter through a storage object causes the Execute method to fail reporting a DBSTATUS_E_CANTCONVERTVALUE on the BLOB column.
So I had to revert to the old code that stores the parameter values in a buffer and forces a rebind when the new set of parameters is larger than the old parameter buffer (used by the accessor). Now that I have got this straight, it's time to finish the new demo that inserts BLOBs using a parameterized INSERT command. Please stay tuned for the next post.

Thursday, November 13, 2008

I was wrong

Tomorrow I will deliver (in English) my second presentation at an international Microsoft conference. My first time ever was in 2005 in Nice, at the now defunct MEDC (Mobile and Embedded Developers Conference). I have been splitting my time between attending interesting presentations, preparing my own presentation and completing the OLE DB Client library code. It was while working on some final details of the presentation that I had an epiphany: I was wrong about BLOB command parameters!

The fact that the OLE DB specification limits us to using just one parameter accessor handle to set all the parameters does not mean that we are unable to use storage objects. Why? The limitation of the SQL Compact engine that I discussed before - it cannot create more than one storage object at a time - actually applies only when data flows from the provider to the consumer, like when you are reading data from a base table cursor or from a query cursor. It does not apply when data flows from the consumer to the provider. In fact, we experienced no such limitation when inserting or updating data through a base table cursor. The consumer application can create as many storage objects as it pleases, and this is exactly the situation with command parameters: the consumer application can only write to them because SQL Compact does not support output parameters.

Right now this is still a theory that seems to make a lot of sense to me, and I will now focus on proving it right. Meanwhile, I developed a new set of classes to manage column values in such a way that they would do double duty when used on a rowset and on command parameters. For the latter, these classes would work as intermediate data buffers, used as temporary storage for application-provided parameter data. When needed, the CCommand code would bind these to a real data buffer, recreating the accessor handle as required. In fact, I have just tested the code and it works... But this can't be right. If I can bind the parameters before setting them (just like in the CRowset class) then I will have the best solution.

Back to the drawing board. (If you want to see the code as it is, please drop me a line.)

Friday, October 31, 2008

SQL Compact Command Parameter BLOBs

The problem du jour is to make the CCommand class accept BLOB parameters, something that I have been thinking about for the last week or so. Let me recap what's at stake.

When executing a SQL command with parameters, the OLE DB standard allows only a single accessor handle to be provided by the consumer. This accessor handle is created to bind the command parameter data to a data buffer, just like in a normal rowset. Having just one accessor handle to bind the whole set of parameters means that we cannot bind more than one BLOB with SQL Compact, due to its OLE DB provider limitations. In order to have a general solution, we must bind the BLOB parameters as regular columns, but then we must know how much memory to allocate for each BLOB parameter, and for each instance of the command execution.

After some thought I concluded that there are two acceptable approaches:
  1. Ask the user to provide a maximum acceptable size for each BLOB. This is what the ADO .NET provider does, essentially delegating this responsibility to the user. Do you know in advance what the maximum size of each BLOB that you are going to feed as a parameter is? If so, this is acceptable; if not... The great advantage of this approach is that you can make an a priori parameter binding (and make it only once).
  2. Be smart about it and recreate the binding when needed. This requires keeping track of the current BLOB binding sizes and enlarging the binding array when necessary. This approach frees the user from having to specify a maximum size for the BLOB columns but requires that the binding be rebuilt when a larger BLOB is used.
Because the second option seems to be the hardest to implement, that is exactly the one I chose. See you in the next post with the code!

Thursday, October 23, 2008

SQL Compact Command Parameters - II

After a bit of work, here's the new version of the OLE DB Client library with basic support for command parameters. The sample code implements a very simple query:

SELECT * FROM Customers WHERE Country = @c

and executes it by setting the parameter value to 'Germany' (using the same sample database).

The first change you will see in the sample code is that I downgraded the solution to VS 2005 format due to some requests to do so. VS 2008 users can easily upgrade the solution, so this is a minor issue. For convenience, I'm also packaging everything in a single ZIP file. Also, all the samples from now on will deploy to the same device directory (typically in \Program Files\OleDbClient).

Executing a SQL Compact command with parameters is quite simple. In a nutshell:

command.SetText(pszCommand);
hr = command.Prepare();
if(FAILED(hr)) ...
command.SetParam(1, _T("Germany"));
hr = command.Execute(rowset);

Remember from my last post that parameterized commands must be prepared before executing. If you look under the hood, you will see that parameters are bound (using code very similar to CRowset's column binding code) in the Prepare method. You can only set command parameter values after the Prepare method executes successfully.

This code still has some limitations, and the most notable one is the inability to handle BLOB parameters. When executing SQL commands, the OLE DB provider only accepts one accessor handle to bind all parameters. As we have seen in previous posts, this prevents us from binding more than one BLOB parameter. The only solution I devised for this when using the ATL OLE DB Consumer Templates was to force the BLOB parameters to have a maximum length and bind them as regular columns. Of course, this implies a change to the parameter binding code and some user intervention...

Sample code: OleDbClientLib.zip

Sunday, October 19, 2008

SQL Compact Command Parameters

Command parameters can be thought of as value replacements or variables for SQL commands. You use a command parameter when you need to execute a SQL command multiple times changing one or more values on each execution. Instead of rebuilding the command text every time the command is executed, your code merely changes a variable value. This means that your code changes from this:

SELECT * FROM Customers WHERE Country = 'Germany'

to this:

SELECT * FROM Customers WHERE Country = @p

Note that the command parameter (@p) replaces a literal value that will change every time you execute the command. Command parameters have some interesting advantages over string concatenation:
  • Command parameters are strongly typed;
  • Command parameters make the command run faster;
  • Command parameters are safe because they avoid SQL injection attacks.
Suppose that you need to pass a date value as a parameter to a SQL command. How exactly do you format a date value in order for it to be correctly interpreted by the query processor? When using command parameters you just don't care about this - you feed the parameter the binary value and the provider takes care of the rest, no string formatting contortions needed.
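As an illustration, setting a date parameter could look like the sketch below. Note that the SetParam overload for DBTIMESTAMP is hypothetical - the library samples so far only demonstrate string parameters:

DBTIMESTAMP ts = { 0 };
ts.year = 2008;
ts.month = 12;
ts.day = 19;
command.SetParam(1, ts);   // the provider gets the binary value, no formatting involved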

The OLE DB provider requires that parameterized commands be prepared (compiled) before executing. Contrary to the managed ADO .NET provider, the OLE DB provider does check the command for errors, enumerates all the given parameters and even infers their data types (your code can then read this information and use it). Preparing a command with the ADO .NET provider for CF merely marks the command as requiring preparation before executing. Either way, a prepared command runs faster than an unprepared one.

Finally, with command parameters there is no way to let your user change the syntax of the underlying SQL command. If you use string concatenation to build command strings (with user input), you risk exposing your code to SQL injection attacks that may break your application code or even its security.

Since version 3.0, SQL Compact supports named parameters, meaning that you can give your command parameters meaningful names. All command parameter identifiers must be prefixed with a '@' character. If you are still using SQL CE 2.0 then you are out of luck, as all command parameters are identified by a single '?' character and are distinguishable only by their ordinal position.

After this brief introduction, I will look at the details of implementing a SQL command parameter management code and some of the finer details of binding, such as what to do with BLOBs.

Thursday, October 16, 2008

Very fast row counting on SQL Compact

What's the fastest way to retrieve a table's row count in SQL Compact 3.0 and 3.5? Use the sp_show_statistics stored procedure! Pass your table name as the first parameter (the second is optional) and you will get back a few statistics about the table and its indexes, so more than one row might be returned. The row count is on the ROWS column (ordinal 3), an unsigned 32-bit integer.
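You execute it like any other command text; something like this (a sketch - 'Orders' is just an example table name):

sp_show_statistics 'Orders'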

Note that this is not supported in versions 2.0 and earlier.

Meet me at Tech Ed EMEA 2008

Meet me at Tech Ed EMEA 2008, where I will deliver a presentation about "Microsoft SQL Server 2005 Compact Edition Ultimate Performance Tuning" (Windows Mobile track). See you there!

Wednesday, October 15, 2008

New column binding and commands

As promised in a previous post, I changed the way columns are bound in the CRowset class so that the class retains all the important column information retrieved from the IColumnsInfo interface, but also the binding offsets. You can see all of this at work in the new version of CRowset in the sample files.

You will also find the first implementation of the CCommand class, an abstraction over the OLE DB Command object. This first implementation is very simple and allows only for execution of a command text (no parameters yet) returning a CRowset and/or an affected row count.

Using a CCommand object is very simple (much like using CTable):

CCommand command(m_session);

command.SetText(_T("SELECT * FROM Employees"));
hr = command.Execute(m_rowset);

Start by declaring a CCommand object, passing the open CSession object as a parameter. Next, set the command text and then call Execute, passing a CRowset object as a parameter. You can now use the rowset to navigate the rows and retrieve data.

Note that the returned rowset may not have a bookmark (not in this case, anyway), so you must be careful how you index the resulting columns. Remember that CRowset::GetColumnCount returns the total number of columns including the bookmark, if present. Columns are referred to by their ordinal position and the first data column is always at ordinal 1, so if you are iterating through all columns using GetColumnCount, make sure you always start at 1, and add 1 to the count if the rowset does not have bookmarks (see CRowset::HasBookmark).
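In code, the iteration looks like this minimal sketch:

int nLast = m_rowset.GetColumnCount() - 1;  // last data ordinal when a bookmark is present
if(!m_rowset.HasBookmark())
    ++nLast;                                // no bookmark: data columns run from 1 to GetColumnCount()

for(int i = 1; i <= nLast; ++i)
{
    // ... read column i through the CRowset accessors ...
}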

Finally, if you don't set any specific command properties (as in this case) you get a forward-only cursor, so there is no going back...

Sample files: EditRow5.zip, oledbcli4.zip

P.S.: Don't try to use the EditRow sample to edit rows coming from the command rowset! I just adapted the sample to show how easy it is to display command data. The resulting rowset is read-only.

Friday, October 10, 2008

OLE DB Commands and SQL Compact

OLE DB Commands are created through the IDBCreateCommand Session object interface. Just find your Session object and call QueryInterface using the IID_IDBCreateCommand GUID. This interface exposes just one method - CreateCommand - that you use to create OLE DB Command objects. As with most COM and OLE DB objects, several interfaces may be exposed for a given object, subject to the provider implementation. For SQL Compact the following interfaces are exposed:
  • ICommand - Execute the command through this interface. As I noted in a previous post, the Cancel method is not implemented in SQL Compact.
  • ICommandPrepare - Use this interface to Prepare or Unprepare the command. Command preparation is useful when executing the same command more than once and is required when using command parameters. Contrary to what happens with the ADO .NET provider, the Prepare method actively validates the SQL Command against the existing database, returning errors for incorrect syntax.
  • ICommandProperties - Gets or sets the properties associated with the Command.
  • ICommandText - Gets or sets the command text. This interface inherits from ICommand.
  • ICommandWithParameters - Manages command parameters (we will have to implement a data binding scheme similar to what exists in CRowset). SQL Compact allows only for input parameters, not output.
As you can see, this is enough to run SQL commands against SQL Compact's Query Processor. For now, we leave this here as a reference while I go back to work and start implementing these features in the OLE DB Client library.

BLOB update code

The sample code I'm posting today illustrates the techniques I described in my last post. Please note that this is really a sample - I will change it in the next incarnation in order to be more effective when binding data. Right now you can see a bit of a mess when binding columns, because both the DBCOLUMNINFO and DBBINDING arrays are being used to find the column information addresses in the data buffer. The next version will have a single array with all the required info to correctly and accurately describe each column. Enough of this - let's look at the code.

First of all, take a look at CRowset::SetValue. Here you see how text strings (only!) are written and how the code decides if the column is a BLOB or not. If we are writing to a BLOB column, the code creates a CBlobStream object and writes the string contents to it. Upon success, a pointer to the CBlobStream object is stored in the rowset data buffer. When this data is written to the provider (by calling IRowsetChange::SetData on the BLOB column's accessor handle), the provider calls the CBlobStream::Read method to read all the string contents into the BLOB storage. When it is done with it, the provider calls the Release method and the CBlobStream deletes itself - neat.

Finally, take a look at the CRowset::Insert, CRowset::Delete and CRowset::SetData methods. These implement the changes I described before. Also, note that these methods will work even if the IRowsetUpdate interface is not exposed.

Next up - After rewriting most of the column binding code, I will start talking about commands.

Sample files: EditRow4.zip, oledbcli3.zip

Thursday, October 09, 2008

Updating rows with BLOBs

As I have previously discussed, we need to make some substantial changes to our code if we want to update rows with BLOB columns. So how exactly do we update these things? The first thing we need is to implement an object that behaves like an ISequentialStream COM object. Our code will create one such object per BLOB column, store the data in this object and then store a pointer to this object in the rowset buffer. When updating data, the OLE DB provider will read it (just like our code reads the storage objects coming from the provider) and then releases it. To make our life easier, our implementation of this object should delete itself when the reference count reaches zero. This way we will not have to worry about deleting it later.
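Here is a minimal sketch of such a self-deleting stream (the CBlobStream class in the upcoming sample may differ in detail); it serves a memory buffer through Read and deletes itself when released:

class CBlobStream : public ISequentialStream
{
    LONG  m_cRef;
    BYTE* m_pData;
    ULONG m_cbData;
    ULONG m_cbPos;

public:
    CBlobStream(const BYTE* pData, ULONG cbData)
        : m_cRef(1), m_cbData(cbData), m_cbPos(0)
    {
        m_pData = new BYTE[cbData];
        memcpy(m_pData, pData, cbData);
    }
    ~CBlobStream() { delete [] m_pData; }

    STDMETHODIMP QueryInterface(REFIID riid, void** ppv)
    {
        if(riid == IID_IUnknown || riid == IID_ISequentialStream)
        {
            *ppv = this;
            AddRef();
            return S_OK;
        }
        *ppv = NULL;
        return E_NOINTERFACE;
    }
    STDMETHODIMP_(ULONG) AddRef() { return InterlockedIncrement(&m_cRef); }
    STDMETHODIMP_(ULONG) Release()
    {
        LONG cRef = InterlockedDecrement(&m_cRef);
        if(cRef == 0)
            delete this;    // self-deleting: no cleanup needed by the consumer
        return cRef;
    }

    // The provider pulls the BLOB data through this method.
    STDMETHODIMP Read(void* pv, ULONG cb, ULONG* pcbRead)
    {
        ULONG cbLeft = m_cbData - m_cbPos;
        ULONG cbCopy = cb < cbLeft ? cb : cbLeft;
        memcpy(pv, m_pData + m_cbPos, cbCopy);
        m_cbPos += cbCopy;
        if(pcbRead)
            *pcbRead = cbCopy;
        return cbCopy == cb ? S_OK : S_FALSE;
    }
    STDMETHODIMP Write(const void*, ULONG, ULONG*) { return E_NOTIMPL; }
};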

After setting all the row data (BLOB and otherwise), we can insert or update it. If you want to delete a row it is pointless to write data to it. If you remember, I said that when using BLOBs with SQL Compact we must use the deferred update mode by requesting the IRowsetUpdate interface. Here's how the code changes for each operation (insert, update or delete):

Insert
  1. Call IRowsetChange::InsertRow - this sets the non-BLOB data;
  2. Call IRowsetChange::SetData for each BLOB accessor;
  3. Call IRowsetUpdate::Update to commit the changes.
Update
  1. Call IRowsetChange::SetData for each accessor.
  2. Call IRowsetUpdate::Update to commit the row changes.

Delete
  1. Call IRowsetChange::DeleteRow.
  2. Call IRowsetUpdate::Update.
Note that if any of these processes fails before calling Update, you should call Undo to revert any changes made to the row. I'll show you the code in the next post.
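In rough code form, the insert sequence looks like this minimal sketch (it assumes pRowsetChange and pRowsetUpdate were obtained via QueryInterface, and that hAccessor binds the non-BLOB columns while hBlobAccessor binds a single BLOB column):

HROW hRow = NULL;
HRESULT hr = pRowsetChange->InsertRow(DB_NULL_HCHAPTER, hAccessor, pData, &hRow);
if(SUCCEEDED(hr))
    hr = pRowsetChange->SetData(hRow, hBlobAccessor, pData);   // repeat per BLOB accessor
if(SUCCEEDED(hr))
    hr = pRowsetUpdate->Update(DB_NULL_HCHAPTER, 1, &hRow, NULL, NULL, NULL);
else
    pRowsetUpdate->Undo(DB_NULL_HCHAPTER, 1, &hRow, NULL, NULL, NULL);
pRowset->ReleaseRows(1, &hRow, NULL, NULL, NULL);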

Wednesday, October 08, 2008

ICommand::Cancel is a dummy

No, I'm not going to talk about the ICommand interface right now, but I just found out that the ICommand::Cancel implementation on the SQL Compact OLE DB provider is a dummy. Good to know!

Tuesday, October 07, 2008

Reading BLOBs

My first approach to the BLOB-handling code took a bit longer than expected due to my aging brain. Please understand that the code I'm publishing today is incomplete and will change in the future. I hope that it helps you in understanding what it takes to read BLOB data through the SQL Compact OLE DB provider.

There are two important changes to the code since last post:
  1. Columns are indexed by their ordinal.
  2. Columns are bound using a two-pass approach - On the first pass the non-BLOB columns are bound using the same accessor handle. The second pass binds the BLOB columns allocating an accessor handle for each column.
The bulk of the code that reads the BLOB into memory is on the new CRowset::GetValue method. Note how the column accessor handle is used to retrieve the ISequentialStream pointer and how it is read into a CString. You have to consume all the data immediately because you cannot keep that stream object alive very long, especially if you want to read or write to another BLOB.

Most of the complexity you see on the code (the difference between binding and column indexes) stems from the fact that columns are being moved around in order to make binding simpler. When life was simpler and there were no BLOBs, there was a direct relation between column and binding entries and there was no need to map between the two. Now that we are moving binding entries around, we must make sure that a mapping exists between both arrays (see the GetBindingIndex method).

Next post will handle writing data to the BLOB.

Sample files: EditRow3.zip, oledbcli2.zip

Wednesday, October 01, 2008

CStdDialogImpl and the OK button

Are you using CStdDialogImpl as the base class for your WTL dialogs? Are you having difficulty hiding the "ok" button? Here's a simple solution:

Derive your dialog class (say CMyDialog) from

CStdDialogImpl<CMyDialog, SHIDIF_SIZEDLGFULLSCREEN | SHIDIF_SIPDOWN>

On the OnInitDialog handler, call:

SHDoneButton(m_hWnd, SHDB_HIDE);
ModifyStyle(0, WS_NONAVDONEBUTTON, SWP_NOSIZE);

Make sure you change all the calls to CStdDialogImpl<CMyDialog> methods to use the new base class (you can use a typedef for that).

Tuesday, September 30, 2008

Let's talk about BLOBs

So far we have seen how to edit table data using the IRowsetChange interface. The provided sample managed to edit a simple table where all columns had limited size. Table edits are performed in a single step, where all the data is read or written at once, and all the data items are sequentially stored on the same data buffer. This scheme works very well but does not support BLOBs.

A BLOB is a different beast because it can store a very large amount of data. In SQL Compact there are only two BLOB types: image (a variable length byte array) and ntext (a variable length character array). These can be really large: the maximum theoretical size of a BLOB field is 4 GB in SQL Compact (the same size limit as for the whole database). Unlike with smaller data types, you may not be able to store the whole BLOB data in the device memory, so we need a different approach to handling these types.

The OLE DB standard allows two different ways to bind BLOB data to your accessor: by reference or by value. SQL Compact does not allow by-reference binding, so we are limited to binding these by value. So what are the differences between the two types of data binding?
  • When binding by reference, the OLE DB provider allocates memory for each bound BLOB field and fills it with data. A pointer to this data is stored in the accessor buffer along with its status and size. Column data is accessed through an extra pointer indirection and our code must make sure that the BLOB data buffer is released when the data is not needed anymore.
  • When binding by value, we can either specify (force) a fixed column size or bind through a storage interface. The first approach allows your BLOB data to be manipulated just like any other column but forces you to put a limit on how much data you will handle per BLOB column, and this defeats the whole purpose of using them. Alternatively (and most commonly) your code will request that the provider create a storage object to handle the BLOB data (when reading; when writing, your code must create it). When data is read from the table, the provider stores a pointer to either an ISequentialStream or an ILockBytes interface, and your code can then read the data through these. When writing, your code must create one of these objects per BLOB column.
The last approach - binding by value through storage objects - is the recommended one, but it becomes harder to implement when your code is handling more than one BLOB per row. In fact, SQL Compact will only manage (read from or write to) one storage object at a time, and if you ask it to instantiate more than one you will get an error. So how do you read from a table (or query) that returns more than one BLOB column? And how do you write the data? Well, you do so one column at a time, and this means that your code cannot read or write the whole row in one single step.

The first change we must make to the rowset code is to add more accessor handles. The first accessor handle will bind the non-BLOB columns and these can be read and written to just like we are doing right now. For each BLOB column the rowset must allocate an additional accessor handle and make a note of it.

When data is read, the non-BLOB columns are all read, but none of the BLOB columns are. When your client code requests data from a BLOB column, the rowset must know that this is a special case, retrieve the column's accessor handle and call IRowset::GetData on that handle. In the data buffer, the OLE DB provider will write a pointer to one of the above-mentioned storage objects (chosen by the rowset code when the data binding is created) and your client code must read through it and release it when no longer needed.

When you need to write data back to the table, your client code will have to create one storage object per BLOB column, store the respective pointers in the data buffer and engage in a multi-step update (new row insertion or existing row update). This means that your code will store the non-BLOB columns first and then store the BLOB columns one at a time, but to be able to achieve this the rowset must be in delayed update mode. You specify this mode by setting the DBPROP_IRowsetUpdate property to VARIANT_TRUE, and the rowset then exposes the IRowsetUpdate interface. When this interface is exposed, you can call IRowsetChange::SetData on the allocated accessor handles without having to worry about leaving the row in an inconsistent state (due to table constraints, for instance) while you are updating data. When your code is done setting data through all the accessors, just call IRowsetUpdate::Update to commit all the changes at once.

This will be easier to understand through a sample, which I will publish on my next post.

Monday, September 29, 2008

Using the OLE DB Client code

I was recently asked for permission to use the OLE DB Client library code in a third party application. One of the blog readers asked me about a specific license to go with the code and I pointed him to the CPOL, the permissive license used by most Code Project articles. I'm actually planning to convert the material from these posts into a series of articles to be published in Code Project under this license. This is a very simple license: you can use the code for whatever purposes you like (even in a commercial application) and the only thing you are required to do is to acknowledge who wrote the original code (see the CPOL details).

Now that you know this, should you use this code? If you want to use it to learn about OLE DB and the SQL Compact native provider, please go right ahead and use it (and change it, and tweak it and experiment with it). As to using it in a commercial application, I would say "not right now" for two reasons:
  1. The code has not been extensively tested. Although I plan to publish a bug-free (ah ah ah!) library, I cannot take responsibility for any use you make of it. You will be on your own.
  2. This code will change. Don't start writing your own code based on this library expecting that the library interfaces will not change. They will, and your code will fail to compile.
Now that this is clarified, let's have a look at BLOBs and what we need to support them.

Wednesday, September 24, 2008

Moving to a DLL

Can I put the OLE DB Client library in a DLL? I tried but got some compiler warnings (C4251) concerning some of the ATL class templates (CComPtr and CString). The reason is that other classes might inherit from these and template definitions are not always exportable.

To solve this problem, I replaced all references of CComPtr in the protected interface with raw COM pointers, which required me to implement some missing destructors. I also removed the protected CString in the CTable class and now the code cleanly compiles into a DLL.

You can download the rewritten sample code (EditRow2.zip and oledbcli.zip) and test them. Make sure that you place the OleDbClient project in a directory with the same name and at the same level of EditRow.

Monday, September 22, 2008

Updating data

There are two ways to update data using OLE DB: using SQL DML commands (which I will look at in later posts), or through the IRowsetChange interface exposed by a base table IRowset. This interface supports the basic methods to insert new rows (InsertRow), update existing rows (SetData) and delete rows (DeleteRows), and is only exposed when the consumer sets the DBPROP_IRowsetChange property to VARIANT_TRUE.

To illustrate this type of data update, I put up a sample that allows you to edit the "Suppliers" table from the Northwind sample database. The list view display code has been changed in order to support a load-on-demand cache of rows, stored in their native (CRowset's) format. The cache items are preloaded when the list view notifies the parent through the LVN_ODCACHEHINT notification message. Row bookmarks are stored along with the row data, so we can now very quickly position the rowset cursor on any row in order to edit its data.

Note that when modifying any data, we have to make sure that both the bookmark and the row ID columns are not touched because both are managed by the engine (the row ID is an IDENTITY column that gets automatically filled in upon insert). Apart from this, setting data is just a matter of copying the new row value to the rowset and correctly setting the column status and length (optional for fixed-length types).

This code does not support BLOBs yet. I will look at these in my next post.

Sample project: EditRow.zip (1.29 MB)

Wednesday, September 17, 2008

Bookmarks

So what is a bookmark and why should you care? To put it simply, a bookmark is a unique row identifier on an open rowset that allows for very fast random seeks. In order to get a rowset with bookmarks, you must set the DBPROP_BOOKMARKS property to VARIANT_TRUE. The resulting rowset has the bookmark value at column ordinal zero, and its size and type vary according to the database engine. On all versions of SQL Compact the bookmark type is a 32-bit integer (DBTYPE_I4 on SQL CE 2.0 and DBTYPE_UI4 on 3.0 and 3.5).

There are two important bookmark restrictions that you must be aware of:
  1. The bookmark value is valid only for the particular rowset that generated it. You should not store this value for later use after the rowset closes because the bookmark values will be rendered invalid. Even if you open the same rowset again, there is no guarantee that the bookmark values will be the same. (Ok, now that I wrote this disclaimer I can tell you that apparently on base table cursors, SQL Compact does seem to reuse the same bookmark values. Nevertheless don't reuse them, please.)
  2. There is no way to know a bookmark's value before loading the row data to memory. This means that if you are using bookmarks to navigate in your rowset, your code must visit the row before knowing its bookmark value, just like in a book.
So why should you care about bookmarks? I have found a very interesting use for them: editing table data without using unique indexes or primary keys. The bookmark gives you an exact and unique identifier to any row and the rowset pointer moves very quickly to that particular row.

Before you can seek to a known bookmark, your rowset must be created with the DBPROP_IRowsetBookmark property set to VARIANT_TRUE. Doing so exposes the optional IRowsetBookmark interface on the IRowset, and you can use it to very quickly position your rowset pointer to any valid bookmark by calling the PositionOnBookmark method.
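A minimal sketch of the seek itself, assuming pRowset was opened with both properties set and that cbBookmark and pBookmark were saved while visiting the row:

CComPtr<IRowsetBookmark> spBookmark;
HRESULT hr = pRowset->QueryInterface(IID_IRowsetBookmark, (void**)&spBookmark);
if(SUCCEEDED(hr))
{
    // Position the cursor so the next fetch returns the bookmarked row.
    hr = spBookmark->PositionOnBookmark(DB_NULL_HCHAPTER, cbBookmark, pBookmark);
}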

Before looking at BLOBs, the next post will show how to edit data using a base table cursor and bookmarks.

Monday, September 15, 2008

Sample Databases

Isn't this nice? I have been publishing the "OpenTable" sample code but forgot to publish the sample databases. My apologies! You can find the three databases (test20.sdf, test30.sdf and test35.sdf) on this zip file: testxx.zip

Please note that the sample has hard-coded the location of the SDF file in the device root. You can change this, of course.

Database Schema Rowsets

Let's make a small detour here, now that we have mastered the IRowset, and take a brief look at database schema rowsets. The OLE DB specification requires that providers expose a set of rowsets that contain information about the database schema. You can find information about the existing tables, columns, constraints and more (depending on your provider).

Schema rowsets are created by IDBSchemaRowset, an optional Session interface that is supported by SQL Compact, through its GetRowset method. When calling this method, you must specify the schema rowset GUID and an array of VARIANTs containing restrictions that will filter the output of the rowset (each rowset has its own set of restrictions). Each restriction value corresponds to a column in the schema rowset and is either empty (VT_EMPTY), meaning no filtering will occur, or has a value of a type compatible with the column's (see here) by which the rowset will be filtered.

To illustrate this, I adapted the last sample code so that it displays the TABLES schema rowset. The very simple implementation is on the CSchemaRowset (base class) and the CTablesRowset classes.

Sample code: OpenTable4.zip

Saturday, September 13, 2008

Moving around

A little change of plans is in order: instead of discussing bookmarks, I will take a deeper look at the IRowset::GetNextRows method. The reason behind this little change of plans is that I got fed up waiting for the sample's list view to populate. I told you before that this was the least effective way of displaying large amounts of data in a list view: by filling it all up we have to wait for all the internal memory allocations to run and, worse, we essentially duplicate the rowset data (much like reading a table into a .NET DataSet). In fact, list views work much faster when they don't own the data but rely on the parent window to provide it for them. To do this, you must specify the LVS_OWNERDATA style bit when creating the list view control, thereby making it a virtual list view. Virtual list views query each item and sub-item data through the LVN_GETDISPINFO notification message (more info on this here). If you want to cache the data in advance, the virtual list view notifies the parent through an LVN_ODCACHEHINT notification, where the indexes of the rows about to be displayed are reported. This gives the parent a chance to cache the rows ahead. So how does this affect our sample code?

The first major difference is that we have to keep the CRowset object open throughout the list view's lifetime so that it can move back and forth to get the requested row. This means that we must find a way to map between the zero-based list view item indexes and the table rows, and the most obvious one is to use the number of skipped rows from the rowset start. When the rowset is open, we must make sure that the internal row handle is initialized by calling MoveFirst. This will read the first row into memory and set the next reading position to "before the second row". What does this mean?

If you look at the IRowset::GetNextRows documentation, you see that the second (lRowsOffset) and third (cRows) parameters are signed numbers. We already know that the cRows parameter can only have the value 1 in SQL Compact, because this engine only manages one row at a time. Interestingly, it can also be -1, meaning that the cursor reads backwards (DBPROP_CANFETCHBACKWARDS is VARIANT_TRUE). This value is also the amount by which the "next read position" is incremented after each read. So if you open a brand-new rowset, call RestartPosition to position it "before the first row" and then consecutively call GetNextRows(NULL, 0, 1, ...) until the return value is different from S_OK, you will have visited all the rows of the rowset.

Moving back on the rowset would entail feeding a negative lRowsOffset value to the call but there are two gotchas:
  1. What value would you use to read the previous row? If you think -1 you are wrong (if you keep cRows = 1) and the reason is simple: you would go back one row from the current reading position (after the current row) and put it before the current row, so you would effectively re-read the current row! To move back one row (with cRows = 1) you must set lRowsOffset to -2.
  2. Can you really scroll back? If you look at all the available SQL Compact BOLs you will see that DBPROP_CANSCROLLBACKWARDS is read-only and VARIANT_FALSE, meaning that lRowsOffset cannot be negative. (I'm still trying to figure out if this is true for 3.0 and 3.5, because elsewhere these BOLs say that you can have a negative offset.)
Just to be on the safe side, if you need to move back just restart the rowset position and move forwards from there (sigh).
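Putting it all together, positioning on the list view item with zero-based index iItem looks like this minimal sketch:

HROW hRow = NULL;
HROW* phRow = &hRow;
DBCOUNTITEM cObtained = 0;

pRowset->RestartPosition(DB_NULL_HCHAPTER);    // back to "before the first row"
HRESULT hr = pRowset->GetNextRows(DB_NULL_HCHAPTER, iItem, 1, &cObtained, &phRow);
if(hr == S_OK && cObtained == 1)
{
    // ... call GetData with the accessor handle here ...
    pRowset->ReleaseRows(1, &hRow, NULL, NULL, NULL);
}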

Now go and have a look at the updated sample code. You will see some differences in the code, especially how fast the list view is displayed.


Download sample code: OpenTable3.zip

Friday, September 12, 2008

Date, time and money

The sample from my last post did not show all the table column data because it only understood integers and strings. If you look at the "Orders" table schema you will see datetime and currency columns. Let's look at these types in a bit more detail.

datetime
The datetime type is identified by the DBTYPE_DBTIMESTAMP data type identifier and is implemented by the DBTIMESTAMP structure. As you can see from its definition, this is a very simple structure and the revised sample application merely uses it to display the date part as a string. If you want to manipulate the date and time values, I advise you to use the MFC/ATL shared COleDateTime class (it's very easy to convert between the two).
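For example, assuming ts is a DBTIMESTAMP read from the rowset buffer, the conversion is a one-liner:

COleDateTime date(ts.year, ts.month, ts.day, ts.hour, ts.minute, ts.second);
CString strDate = date.Format(_T("%x"));   // short date in the current locale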

currency
The currency type is identified by the DBTYPE_CY data type identifier and is implemented as the well-known CY or CURRENCY structure. It's essentially an __int64 that represents decimal values scaled by 10,000 so that it has four decimal digits. To manage this data type, and especially to render it as a string, I created the new CCurrency class (not under the OLEDBCLI namespace because this represents a more general Windows data type). The string formatting method (named Format) takes two parameters:
  1. The numeric format of the value: None, Number or Currency. The first format returns the number as it is internally formatted, with trailing zeroes and a decimal point (locale-insensitive). The second format returns the value as a number in the given locale, and the third does the same but as a currency.
  2. The locale identifier (see above).
You can also use this class to perform some basic arithmetic with currency values (although I have not tested it extensively). 
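A minimal sketch of the underlying scaling, which is what CCurrency wraps:

CY price;
price.int64 = 1995000;                          // 199.5000, scaled by 10,000
double dPrice = (double)price.int64 / 10000.0;  // back to 199.50 for display math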

I will be addressing other OLE DB types as we meet them, but for the next post I will show you a very interesting and useful feature of base table (and scrollable) cursors: bookmarks.

Thursday, September 11, 2008

Showing rows

After showing how to use a rowset to count table rows, let's see how we can use it to actually display the table rows. To illustrate this I wrote a sample (OpenTable) that uses the last sample's databases and displays the "Orders" table rows in a list view control. As you will see, the sample uses a very basic (and slow) approach to filling in the list view, but this issue will be addressed in a later post when I discuss bookmarks and their uses. Now, let's see what's new in the code.

First of all, I changed the way a CTable is opened in order to produce a CRowset. Now you can see a new method called CTable::Open that returns a reference to an open CRowset object as its last parameter. This method calls IOpenRowset::OpenRowset and upon success calls the changed CRowset::Open method. As you can see, this method just takes an IRowset pointer as parameter and does the bulk of the work of binding the table columns to a local data buffer.

The method starts by requesting an IColumnsInfo interface from the IRowset via QueryInterface and uses it to retrieve the table column schema information via a call to GetColumnInfo. This information is stored in the provider-owned m_pColumnsInfo (DBCOLUMNINFO*) while the column names are stored in a single OLECHAR buffer (m_pColumnNames), also provider-owned memory.

Now the rowset knows exactly what columns to bind to. Note that if you want to implement a custom column binding (like binding to a subset of columns, or specifying data type conversions on binding), you must replace this step and provide your own set of column specifications. For the time being we will stick to the reported schema information.

To bind the rowset data to your application's buffer, you must build an array of DBBINDING structures and fill it with appropriate data. Then you feed this to a newly-acquired IAccessor interface in order to create an accessor handle. Finally you can use the accessor handle with IRowset::GetData to fill your application's buffer with data from the current row (the row handle must not be NULL and must be valid).

Each entry in the DBBINDING array describes how each column will be bound. This assumes that the row data will be stored somewhere in your application's data space and that each column value, length and status will be stored at fixed offsets. It also defines how the data types will be mapped between the provider and the consumer (in this case there is no conversion) and also how BLOBs are handled (more on this in a later post). As you can see from the sample code, the binding assumes that you are placing all the row data in a large buffer where each column binding starts with the column length (a DWORD), followed by the column status (also a DWORD) and finally the raw column data (variable size). Note how each binding field is DWORD-aligned to avoid alignment faults. This is very similar to the approach of the CDynamicAccessor class template from the ATL OLE DB Consumer Templates. An alternative approach would be to build your own structure to store the row data and feed the DBBINDING array with your structure's offsets.
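A minimal sketch of filling one such entry, following the [length][status][value] layout described above (ulOffset tracks the running position in the row buffer and cbColumn is the column size reported by the schema):

DBBINDING binding = { 0 };
binding.iOrdinal   = 1;                            // column ordinal being bound
binding.dwPart     = DBPART_VALUE | DBPART_LENGTH | DBPART_STATUS;
binding.dwMemOwner = DBMEMOWNER_CLIENTOWNED;
binding.eParamIO   = DBPARAMIO_NOTPARAM;
binding.obLength   = ulOffset;                     // length comes first...
binding.obStatus   = ulOffset + sizeof(DWORD);     // ...then status...
binding.obValue    = ulOffset + 2 * sizeof(DWORD); // ...then the raw value
binding.wType      = DBTYPE_WSTR;                  // taken from DBCOLUMNINFO in the real code
binding.cbMaxLen   = cbColumn;

ulOffset += 2 * sizeof(DWORD) + cbColumn;
ulOffset = (ulOffset + 3) & ~3;                    // keep the next field DWORD-aligned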

Finally, the code allocates the row data buffer and a DBBINDSTATUS array to store any column binding errors or statues. The code then gets the IAccessor interface and calls CreateAccessor. If all goes well, you can start retrieving data from the table by using the slightly modified MoveFirst and MoveNext methods. These now call IRowset::GetData with the current row handle and the created accessor handle to fill in the data buffer (now we lost the performance edge that allowed us to very quickly count rows, but I will address that later).

To retrieve individual column data from the data buffer, the CRowset class now has three new methods: GetStatus (retrieves the column status), GetLength (retrieves the column length) and GetValue (retrieves a raw BYTE pointer to the data). The sample code only recognizes two data types: DBTYPE_I4 (int) and DBTYPE_WSTR (WCHAR*), so you will see some empty columns in the list view. I will start covering some of the more exotic data types in the next post, so stay tuned.

Sunday, September 07, 2008

Counting rows

After a bit of theory on how to open a base table cursor and on the SQL Compact OLE DB limitations, let's take a first look at how to use the IRowset interface to count table rows. To illustrate this, I pulled together a sample (RowCount) that counts the rows of all tables in a sample database (the good old Northwind database). To run the sample, copy the SDF file (depending on your installed version of SQL Compact: test20.sdf, test30.sdf or test35.sdf) to your device's root and execute the code (it's more interesting to step through the debugger).

In this sample, I introduce two new classes: CTable and CRowset. The CTable class encapsulates the IOpenRowset interface and will later be used to store schema information about a table. The CRowset class encapsulates the IRowset interface and in its present incarnation only allows scrolling forward through the row collection; no data is read. To count rows, the sample application calls the MoveFirst and MoveNext methods on an open rowset.

These methods still have a very simple implementation but already illustrate the fact that only one row handle is returned by the provider. Also, note that the row handle must be released before the cursor moves to the next row. By combining a row handle and an accessor, our code will be able to fetch data. Right now this is not needed: we just want to count rows, so loading them into memory is a wasteful operation.

The next sample will be a bit more complex as it will open a base table cursor, create all the necessary bindings (no BLOBs yet) and then display the table data on a list view.

Thursday, September 04, 2008

SQL Compact's OLE DB limitations

There are a few bits of information that we must keep in mind when developing OLE DB applications for SQL Compact. I will briefly describe here what I believe to be the most important limitations and their implications.

Rowset limitations
SQL Compact's OLE DB provider returns only one row handle through IRowset::GetNextRows. This is not a dramatic limitation, but do keep it in mind.

Binding limitations
You cannot bind columns by reference, only by value. What does this mean? It means that you can only create bindings that have enough allocation for the bound data sizes. So how do we handle BLOBs? On desktop providers (like SQL Server's and Access's) you can bind a BLOB column through a pointer to provider-owned memory. The provider allocates enough storage for the BLOB data and passes back a pointer that the consumer application later releases. This option is not available with SQL Compact, so a different approach is provided: structured storage. Instead of returning a pointer to the BLOB data, SQL Compact's provider returns a pointer to either an ISequentialStream or an ILockBytes interface (the client application decides which). The major difference between the two is that the first provides sequential access while the second provides random access to the BLOB data. This is a bit of a pain, and it gets worse.
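
As a hedged sketch of what this looks like in practice (the column ordinal and buffer names are assumptions), a BLOB column is bound as a storage object, so the value slot receives an interface pointer instead of the raw bytes:

// Bind the BLOB column as an ISequentialStream storage object.
DBOBJECT dbObject;
dbObject.dwFlags = STGM_READ;
dbObject.iid = IID_ISequentialStream;

DBBINDING binding;
memset(&binding, 0, sizeof(DBBINDING));
binding.iOrdinal = blobOrdinal;        // assumed ordinal of the BLOB column
binding.dwPart   = DBPART_VALUE;
binding.wType    = DBTYPE_IUNKNOWN;
binding.cbMaxLen = sizeof(IUnknown*);
binding.obValue  = 0;
binding.pObject  = &dbObject;

// After IRowset::GetData, read the stream in chunks and release it promptly:
ISequentialStream* pStream = *(ISequentialStream**)pRowBuffer;
BYTE chunk[256];
ULONG cbRead = 0;
do {
    pStream->Read(chunk, sizeof(chunk), &cbRead);
    // ... consume cbRead bytes ...
} while (cbRead == sizeof(chunk));
pStream->Release();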

Accessor limitations
Only one storage object can be open at a time (see DBPROP_MULTIPLESTORAGEOBJECTS). So what happens if you have more than one BLOB column to bind to at the same time? You can actually bind more than one BLOB column, but you cannot get or set data on more than one BLOB column at a time: you have to get each storage object, use it and dispose of it. This means that you cannot read the whole row at once...

The trick is to create more than one accessor handle per row, binding all the non-BLOB columns with the first handle and creating an extra accessor handle per BLOB column. Reading and writing row data now becomes a multi-step process, and for that reason you must make sure that you use delayed update mode (use DBPROP_IRowsetUpdate).

Confusing? Things will become clearer with some code...

Wednesday, September 03, 2008

Opening a base table cursor

A base table cursor is a means to read and write table data without using the SQL query processor. Through this object, one can read and write data in an ISAM-like fashion. All you need to know is the table name and optionally an index name. To open a base table cursor and read data from it, we must do quite a bit of work. We start by getting an IOpenRowset pointer from our session object through QueryInterface.

IOpenRowset
This interface has only one method: OpenRowset. Set the second parameter to the requested table name (fill a DBID structure with eKind = DBKIND_NAME and pwszName pointing to the UNICODE table name) and the third to the optional index name (set it to NULL if you don't want to use any index order). Set the fourth parameter to the IID_IRowset constant, and use the remaining two parameters to specify the requested cursor properties. After successfully calling this method, you get an IRowset interface pointer in the last parameter.
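
A minimal sketch of such a call, assuming spOpenRowset was already obtained from the session and using a hypothetical table name:

DBID tableID;
tableID.eKind = DBKIND_NAME;
tableID.uName.pwszName = L"Customers";  // hypothetical table name

CComPtr<IRowset> spRowset;
HRESULT hr = spOpenRowset->OpenRowset(
    NULL,            // no aggregation
    &tableID,        // table to open
    NULL,            // no index: table order
    IID_IRowset,     // interface we want back
    0, NULL,         // no rowset properties requested here
    (IUnknown**)&spRowset);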

IRowset
The IRowset interface is essentially used to navigate through the data rows (RestartPosition and GetNextRows) and to read data from them (GetData). This is done independently of the data structure of the table (or query result). To map between the data row structure and the consumer's memory we need another interface.

IAccessor
To retrieve data from the row, one or more accessors must be created through the IAccessor interface (retrieved via QueryInterface on the IRowset). Each accessor maps a set of columns in the data row to memory addresses in the consumer memory through CreateAccessor. Each column mapping is represented by an entry in the DBBINDING array that the consumer passes as the third parameter. This array may either be filled with application-provided bindings (if you just want to bind a subset of the columns) or with data from the provider itself. To get this type of information, you typically request an IColumnsInfo interface pointer and call the GetColumnInfo method to retrieve all the column information for the table. The process of mapping between the two is not linear and requires some thought and attention to the provider's peculiarities. But we will see all of this in the next post.
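
For reference, asking the provider to describe the columns could look like this (a sketch; error handling omitted):

CComPtr<IColumnsInfo> spColumnsInfo;
spRowset->QueryInterface(IID_IColumnsInfo, (void**)&spColumnsInfo);

DBORDINAL cColumns = 0;
DBCOLUMNINFO* pColumnInfo = NULL;
OLECHAR* pStringsBuffer = NULL;
spColumnsInfo->GetColumnInfo(&cColumns, &pColumnInfo, &pStringsBuffer);

// ... build the DBBINDING array from pColumnInfo ...

// The provider allocates both buffers; the consumer frees them.
CoTaskMemFree(pColumnInfo);
CoTaskMemFree(pStringsBuffer);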

Sunday, August 31, 2008

CDataSource and CSession

With this post, I'm presenting the first incarnation of the CDataSource and CSession classes of the OLE DB client library. These still bear some similarity to their ATL OLE DB Consumer Templates counterparts (I did steal some variable names, sorry), but you will see some differences. For instance, there is a CDataSource::Create method that can optionally open a CSession object. This means that besides creating the database, you can immediately open a session with it. As always, I'm using the CreateSDF sample to implement this code.
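
As an illustration only (the real signatures are in the CreateSDF sample; the parameters shown here are assumptions), the intended usage is along these lines:

CDataSource dataSource;
CSession session;

// Hypothetical call: create the database and immediately open a session on it.
HRESULT hr = dataSource.Create(L"\\test35.sdf", &session);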

From now on things will start to get a bit more interesting and useful: besides creating or opening a database using OLE DB, we want to access and edit both data and structure. We can do this either through SQL commands or through the exposed interfaces. My next challenge will be to tackle base table cursors and SQL commands. These require the introduction of two other objects that are central to OLE DB programming: the rowset and the accessor. While a rowset represents a set of rows that you can navigate through, an accessor defines how the individual columns are accessed and mapped to your application data. Rowsets are not only used to handle cursor data (both from a table and from a SQL query) but are also used internally by the OLE DB provider to report other data to the consumer (like schema information). Accessors are required to map data between your application's memory and the provider's data, for three specific purposes:
  • Table or query data;
  • SQL command parameters;
  • Index column data.

Here we will meet some very interesting challenges, like handling the BLOB peculiarities of SQL Compact, but this will be an interesting ride. I promise.

Friday, August 29, 2008

Connecting to a SQL Compact database via OLE DB

The generic principles of connecting to a database via an OLE DB provider are described in this MSDN page: First you create a Data Source object and then you use it to create a Session object. While the Data Source represents the target database, the Session object represents an individual connection to that database. You use the Session object to (among other things) create and execute SQL commands, open base table cursors for fast data insertion, manage transactions and make changes to the database schema.
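
In raw OLE DB terms, and as a sketch only (the database path is an assumption), the sequence looks like this:

// 1. Create the Data Source object.
CComPtr<IDBInitialize> spInitialize;
CoCreateInstance(CLSID_SQLSERVERCE_3_5, NULL, CLSCTX_INPROC_SERVER,
                 IID_IDBInitialize, (void**)&spInitialize);

// 2. Point it at the database file and initialize it.
DBPROP dbProp;
VariantInit(&dbProp.vValue);
dbProp.dwPropertyID   = DBPROP_INIT_DATASOURCE;
dbProp.dwOptions      = DBPROPOPTIONS_REQUIRED;
dbProp.colid          = DB_NULLID;
dbProp.vValue.vt      = VT_BSTR;
dbProp.vValue.bstrVal = SysAllocString(L"\\test35.sdf");  // assumed path

DBPROPSET dbPropSet;
dbPropSet.guidPropertySet = DBPROPSET_DBINIT;
dbPropSet.cProperties     = 1;
dbPropSet.rgProperties    = &dbProp;

CComPtr<IDBProperties> spProperties;
spInitialize->QueryInterface(IID_IDBProperties, (void**)&spProperties);
spProperties->SetProperties(1, &dbPropSet);
spInitialize->Initialize();

// 3. Create the Session object from the initialized Data Source.
CComPtr<IDBCreateSession> spCreateSession;
spInitialize->QueryInterface(IID_IDBCreateSession, (void**)&spCreateSession);

CComPtr<IUnknown> spSession;
spCreateSession->CreateSession(NULL, IID_IUnknown, &spSession);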

Both the Data Source and the Session objects are COM objects and thus can be exposed through a number of different interfaces (see the "CoType" definitions for both objects). Some of the interfaces are marked as mandatory while others are optional. SQL Compact implements a subset of these interfaces:

DataSource

Interestingly, the mandatory IPersist interface is not implemented in SQL Compact.

Session

Now we can start writing high-level code to connect to a SQL Compact database. In my next post I will publish a minimalistic approach to both these objects.

Wednesday, August 27, 2008

Detecting the installed SQL Compact OLE DB providers

Here is a minor (but important) improvement to the CreateSDF project: installed SQL Compact engine detection. Why should the little application allow the user to select every version of the SQL Compact engine when the device has only one installed? That would open the door to a bad user experience and some unexplained errors. So what can we do to detect the installed SQL Compact engines?

A word of caution is required here: this method detects only the installed OLE DB providers, not the SQL Compact engines themselves. From version 3.0 onwards, Microsoft split the engine interfaces into two different stacks (managed and native) that are separately installable. This means that you can, for instance, install the managed SQL Compact 3.5 SP1 stack without installing the native one, and only the native OLE DB stack would be detected.

The process of detecting the installed OLE DB providers is quite simple: just try to instantiate the IDBInitialize interface using the engine's CLSID:

bool IsInstalled(const CLSID& clsid)
{
    CComPtr<IDBInitialize> spInitialize;

    // If the provider's COM class is registered on the device,
    // instantiation succeeds; otherwise the provider is not installed.
    HRESULT hr = CoCreateInstance(clsid, NULL, CLSCTX_INPROC_SERVER,
                                  IID_IDBInitialize, (void**)&spInitialize);

    return SUCCEEDED(hr);
}


You use this function like so:

if(IsInstalled(CLSID_SQLSERVERCE_3_5)) ...

In the revised sample, this test is used to filter out the nonexistent OLE DB providers from the engine combo box.

Monday, August 25, 2008

CDbException

After my last post's explanation on how to retrieve OLE DB error information, here's a first implementation approach (sample code here). To retrieve the error information you just need to create an instance of the CDbException class, using the HRESULT value that you used to detect the error. Right now this class provides a minimal interface for retrieving error information and will be updated in future releases. Nevertheless, you can already get a textual description to associate with the HRESULT value when reporting the error back to the user.
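
Usage is straightforward; as a hypothetical sketch (the class's exact interface is in the sample code):

HRESULT hr = spInitialize->Initialize();
if (FAILED(hr))
{
    // Capture the pending OLE DB error information for this HRESULT.
    CDbException e(hr);
    // ... retrieve the textual description and report it together with hr ...
}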

These errors tend to be very descriptive about what caused them, but there are some exceptions. For instance, try to create a SQL CE 2.0 or 3.0 database without a password while specifying that you want to encrypt it. Database creation will fail with a very terse "Errors occurred." message...

When such an error occurs, you should also look at the "basic error information" (the ERRORINFO structure) for the native error code. You must look this value up in the provider's published error messages, where you should find more detailed information about the error.

Thursday, August 21, 2008

Handling OLE DB Errors

Just by looking at the MSDN documentation you will have a bit of a hard time figuring out how to report error information from OLE DB. The first thing you have to do is test the HRESULT value that most methods return by using either the SUCCEEDED or FAILED macros. Failure codes are negative, while success codes are zero or positive. Please note that some methods may return S_FALSE, which is not an error (SUCCEEDED(S_FALSE) is true). Now that you know that an error was generated, you can retrieve more information about it.

The first thing you need to do is call the GetErrorInfo API function immediately: if you call any other OLE DB method first, the error information will be reset and lost. The GetErrorInfo function returns an IErrorInfo interface pointer that does not contain any valuable information by itself. Instead, you must use it to call QueryInterface and request an IErrorRecords interface pointer. This interface implements an error collection (it typically contains one error only) where each error object is exposed as an IErrorInfo interface. Confused? It gets a bit better...

Getting the underlying error information requires some work because it is split between the IErrorRecords collection and each contained IErrorInfo. For instance, to retrieve the error parameters (the six parameter values you might have seen in a SqlCeError object), you must call the IErrorRecords::GetErrorParameters method. It takes as parameters the requested zero-based error index and a pointer to a DISPPARAMS structure. You can also get some basic error information by calling IErrorRecords::GetBasicErrorInfo with an ERRORINFO structure pointer as the second argument. For each IErrorInfo in the error collection, you can get information such as the description, the source (which reports the database engine and version) and the GUID of the interface that generated the error.
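
Putting this together, a minimal sketch of walking the error records looks like this (cleanup is handled by the smart pointers; error handling trimmed for brevity):

CComPtr<IErrorInfo> spErrorInfo;
if (SUCCEEDED(GetErrorInfo(0, &spErrorInfo)) && spErrorInfo != NULL)
{
    CComPtr<IErrorRecords> spErrorRecords;
    if (SUCCEEDED(spErrorInfo->QueryInterface(IID_IErrorRecords,
                                              (void**)&spErrorRecords)))
    {
        ULONG cRecords = 0;
        spErrorRecords->GetRecordCount(&cRecords);

        for (ULONG i = 0; i < cRecords; ++i)
        {
            // Basic error information (the native error code is in dwMinor).
            ERRORINFO info;
            spErrorRecords->GetBasicErrorInfo(i, &info);

            // Per-record IErrorInfo with description and source.
            CComPtr<IErrorInfo> spRecordInfo;
            spErrorRecords->GetErrorInfo(i, GetUserDefaultLCID(), &spRecordInfo);

            CComBSTR bstrDescription;
            spRecordInfo->GetDescription(&bstrDescription);
            // ... report info and bstrDescription to the user ...
        }
    }
}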

Oh, and when you are done, make sure everything is cleaned up. Yes, we need another abstraction to manage error reporting. Hopefully this will be the theme of the next post, and from then on we can move on to opening a database connection, executing SQL statements and more.