Friday, August 31, 2007

Database Diagrams for Data Port Console

The second release candidate of Data Port Console 1.0 is almost out, and while I was removing some bugs I failed to resist the urge to bring back some code that had been sleeping for almost a year: database diagrams. Yes, coding should also be fun, so I decided to add this to 1.0, and in this post I'm offering my readers the chance to try the new feature. You can create multiple diagrams per database, and all diagram information is now persisted as an XML file instead of in the registry. You will also notice that there is a new folder for tables and that you can group them by their first letter. Please let me know what you think about these new features before RC2 is uploaded to the site. Grab your copy of this interim build here.

Tuesday, August 21, 2007

ZIP and RAPI - The article

My latest musings on this subject have been published on CodeProject. Read the article here.

Sunday, August 19, 2007

ZIP and RAPI - Yes, it's faster

I really was not pleased with the performance results I got from the RAPI Unzip extension I published in my last post. My reasoning was that if you send a compressed data stream to a device and decompress it there, you should get better performance than by decompressing the data on the PC and sending it to the device. The results I got were not very satisfying: the RAPI extension DLL was not clearly faster than the PC decompression, especially for ZIP files with low compression ratios.

So I decided to take a harder look at the code and found the culprit: WriteFile latency. The function was being called way too often, and sometimes with small sets of data. If you look at the code, you will see that WriteFile is called when either the decompression input or output buffers are emptied. This is wrong. The call must only be made when the output buffer is empty and not when the input buffer is. When the input buffer is emptied it must be refilled from the RAPI stream and the output buffer should not be flushed, otherwise we risk spending a whole WriteFile call on flushing a small amount of data.
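The difference between the two flush policies can be sketched in portable C++. This is not the extension's code: `CountingSink` stands in for WriteFile, the "decompressor" is a plain byte copy, and `pump` and its parameters are hypothetical names. The sketch also assumes that the output buffer being "emptied" means it has no free space left, which is how zlib-style inflate loops usually report it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for WriteFile: counts calls and keeps the data.
struct CountingSink {
    std::size_t calls = 0;
    std::string data;
    void write(const char* p, std::size_t n) { ++calls; data.append(p, n); }
};

// Moves `source` to `sink` through an output buffer of `outCap` bytes,
// reading the input in slices of `inChunk` bytes (the stream reads).
// flushOnInputRefill = true  : old policy, flush whenever the input runs out.
// flushOnInputRefill = false : new policy, flush only when the output is full.
// Returns the number of write calls made.
std::size_t pump(const std::string& source, std::size_t inChunk,
                 std::size_t outCap, CountingSink& sink,
                 bool flushOnInputRefill) {
    assert(inChunk > 0 && outCap > 0);
    std::vector<char> out(outCap);
    std::size_t outUsed = 0;
    std::size_t pos = 0;

    while (pos < source.size()) {
        // Refill the input buffer (stands in for a read from the RAPI stream).
        std::size_t inAvail = std::min(inChunk, source.size() - pos);
        const char* in = source.data() + pos;
        pos += inAvail;

        // "Decompress" (here: copy) until this input slice is consumed.
        while (inAvail > 0) {
            std::size_t n = std::min(inAvail, outCap - outUsed);
            std::memcpy(out.data() + outUsed, in, n);
            in += n;
            inAvail -= n;
            outUsed += n;
            if (outUsed == outCap) {          // output full: always flush
                sink.write(out.data(), outUsed);
                outUsed = 0;
            }
        }

        // Input exhausted. The old policy also flushed here, which is
        // what produced the many small writes.
        if (flushOnInputRefill && outUsed > 0) {
            sink.write(out.data(), outUsed);
            outUsed = 0;
        }
    }

    if (outUsed > 0) sink.write(out.data(), outUsed);  // final partial buffer
    return sink.calls;
}
```

With 1000 bytes of input read in 100-byte slices and a 256-byte output buffer, the old policy makes 10 writes and the new one makes 4, with identical output.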

The final result is now ready and is consistently faster than the PC decompression approach. Try it!

Friday, August 17, 2007

ZIP and RAPI - Is it worth it?

After a few more days working on Lucian Wischik's Zip Utils code I finally managed to build a working RAPI ZIP decompression server. The architecture of this RAPI extension is quite simple: it exports a single method (ZipServer) that implements a streamed RAPI call. Streaming (non-blocking) RAPI calls are very nice because they allow you to keep an open connection with the desktop which is especially useful when expanding very large compressed files: you can slice the compressed stream and have it decompress on the device. For more implementation details please see the source VS 2005 project in the above link.

To make this work I changed the CeUnzip project in order to support calling a RAPI extension and added an extra mandatory command line argument: -i for expanding on the PC from a ZIP file on the device and -e for the exact opposite operation. It's here that I make the remote call to the RAPI extension DLL. Please note that the two projects use different versions of the unzip.cpp file. I will correct this before publishing an article that describes in greater detail the inner workings of the code.

How about results? Is this really faster than using an on-the-fly decompression algorithm on the PC and using CeWriteFile to write the decompressed data on the device?

My first results were very disappointing. When compared to an installer application I wrote for a customer that decompresses on the PC and writes to the device, I consistently got worse timings. The only remedy that made the device DLL run as fast as the PC was to increase the IRAPIStream read buffer size to 128 or 256 KB. As a matter of fact, RAPI read latency is a big issue here. Before giving up I looked at the particular ZIP file I was using to test: it was quite large and had a relatively small compression ratio. Maybe this explains something?

My next test was to use a ZIP file with a very high compression ratio and run the same tests (expanding it to the device using the two different approaches). This ZIP file was built from a very large SDF file which almost disappeared when compressed: 193 KB to only 3 KB. Now the results were a bit different...

Depending on the device, the DLL expansion was from 2 to 4.5 times faster than the PC's. This means that for compression ratios close to 1 it makes (almost) no difference which approach you use, but as the ZIP compression ratio gets larger, it's far better to use the server DLL.
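One way to reason about these numbers is a rough cost model. It is only a sketch, not something measured here: `estimatedSpeedup`, `linkRate` and `deviceRate` are hypothetical names, and the model ignores per-call latency and the cost of writing files on the device. Per uncompressed byte, the PC approach pays one byte of link transfer, while the device approach pays 1/ratio bytes of transfer plus the device's own decompression time.

```cpp
#include <cassert>

// Estimated speed-up of device-side over PC-side decompression.
//   ratio      : uncompressed size / compressed size (>= 1)
//   linkRate   : bytes per second over the link (hypothetical figure)
//   deviceRate : uncompressed bytes per second the device can produce
//                (hypothetical figure)
double estimatedSpeedup(double ratio, double linkRate, double deviceRate) {
    assert(ratio >= 1.0 && linkRate > 0.0 && deviceRate > 0.0);
    // Cost per uncompressed byte when the PC decompresses and sends raw data.
    double pcCost = 1.0 / linkRate;
    // Cost per uncompressed byte when compressed data is sent and the
    // device decompresses it.
    double deviceCost = (1.0 / ratio) / linkRate + 1.0 / deviceRate;
    return pcCost / deviceCost;
}
```

In this model the gain never exceeds deviceRate / linkRate no matter how high the ratio gets. That would be consistent with a speed-up that varies by device, and with little or no gain when the ratio is close to 1.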

Sunday, August 12, 2007

ZIP and RAPI

For quite some time I have used Lucian Wischik's Zip Utils, a set of zipping and unzipping functions that work very well both under Win32 and Windows CE. This code has allowed me to do a lot of fun stuff both on the desktop (specialized installers) and on devices (an HTML browser for ZIP files).

Recently I had to use this code to write a specialized device installer (no cabs) for a consumer application that runs on all sorts of devices. This installer was designed to extract files from a ZIP on the desktop and copy the extracted files to the device over an ActiveSync connection. I used Zip Utils to extract the data and RAPI to copy the expanded files to the device. This works but can be slow on some devices. Wouldn't it be great if I could expand the files on the device while having the source ZIP file on the PC? I would surely make the whole thing faster because the RAPI transport times are a big bottleneck in this whole process.

Due to time constraints, the installer is now shipping like this: data is expanded on the PC and copied to the device. Now I'm starting to write an application that will update the device code and data. Interestingly, most of the static data is stored in ZIP files on the device, and the updater must extract specific files from the ZIPs in order to know some details. The problem is that some of these ZIP files are over 2 MB, and copying them to the PC just to extract one tiny file is out of the question due to slow transfer times.

No options here: I had to roll up my sleeves and start reading Lucian's code in order to see what could be done. This was no walk in the park because he uses a very compact code layout, and understanding the code was not easy at first. Then I noted that all the ZIP file access methods were neatly encapsulated in specific functions (like lufopen, lufread and so on), so it would be quite easy to replace the desktop file access functions with RAPI's.

My approach was to expand the LUFILE structure and include a set of function pointers that would be initialized with the regular API when the ZIP file is on the PC and with RAPI's when the ZIP file is on the device. After 20 minutes the code was working and I was able to write a very simple command line unzipper (get the code here).
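The function-pointer idea can be sketched in portable C++. This is not the real LUFILE layout: `ZipSource` and every function below are hypothetical names. The "remote" backend is an in-memory buffer standing in for a file on the device, where the real code would call RAPI functions such as CeReadFile and CeSetFilePointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

struct ZipSource;  // hypothetical; not the actual LUFILE structure

typedef std::size_t (*ReadFn)(ZipSource*, void* buf, std::size_t n);
typedef bool (*SeekFn)(ZipSource*, long offset);   // absolute offset
typedef void (*CloseFn)(ZipSource*);

struct ZipSource {
    std::FILE* file;        // used by the local backend only
    const char* mem;        // used by the stand-in "remote" backend only
    std::size_t memSize;
    std::size_t memPos;
    ReadFn read;            // chosen once, when the source is opened
    SeekFn seek;
    CloseFn close;
};

// Local backend: regular file API.
static std::size_t localRead(ZipSource* s, void* buf, std::size_t n) {
    return std::fread(buf, 1, n, s->file);
}
static bool localSeek(ZipSource* s, long offset) {
    return std::fseek(s->file, offset, SEEK_SET) == 0;
}
static void localClose(ZipSource* s) {
    if (s->file) std::fclose(s->file);
    s->file = nullptr;
}

// Stand-in for the remote backend (would wrap RAPI calls on a real setup).
static std::size_t memRead(ZipSource* s, void* buf, std::size_t n) {
    std::size_t left = s->memSize - s->memPos;
    if (n > left) n = left;
    std::memcpy(buf, s->mem + s->memPos, n);
    s->memPos += n;
    return n;
}
static bool memSeek(ZipSource* s, long offset) {
    if (offset < 0 || static_cast<std::size_t>(offset) > s->memSize) return false;
    s->memPos = static_cast<std::size_t>(offset);
    return true;
}
static void memClose(ZipSource*) {}

ZipSource openLocal(std::FILE* f) {
    ZipSource s = {};
    s.file = f;
    s.read = localRead; s.seek = localSeek; s.close = localClose;
    return s;
}

ZipSource openMemory(const char* data, std::size_t size) {
    ZipSource s = {};
    s.mem = data; s.memSize = size;
    s.read = memRead; s.seek = memSeek; s.close = memClose;
    return s;
}

// Caller code never needs to know which backend is behind the pointers.
// Checks for the ZIP local file header signature "PK\x03\x04".
bool hasZipSignature(ZipSource& s) {
    unsigned char sig[4];
    if (!s.seek(&s, 0)) return false;
    if (s.read(&s, sig, 4) != 4) return false;
    return sig[0] == 'P' && sig[1] == 'K' && sig[2] == 3 && sig[3] == 4;
}
```

The unzip logic only goes through `read`, `seek` and `close`, so moving the ZIP file from the PC to the device means changing how the source is opened and nothing else.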

Now I want to solve the first problem I had: efficiently unzipping a file to a CE device. This will require a RAPI DLL on the device that will have all the decompression code. When this is ready I will publish the results in an article. Stay tuned.

Friday, August 03, 2007

Bitten by the desktop GC

If you downloaded the Console Beta 4 I published yesterday, you might have noticed how badly it bombed when exporting an SDF database to Access. The error message you see usually means that some native code DLL is writing to NULL pointers or doing some other equally responsible stuff. In this case what you see is a NULL pointer dereference (yes, that bad).

While running perfectly in Debug mode, the error almost always showed up in Release mode. When exporting very short databases or single tables it might not show up. But when I tried to export the Northwind sample the Console would bomb exporting one of the largest tables (Order Details or Orders). The keyword here is random because as it turned out, the error was being caused by the garbage collector (well, my bad code was causing it - the GC just made it apparent).

The Access database code was implemented in C++ with an approach similar to that of DesktopSqlCe's low-level classes: there are separate classes for the table structure and the table cursor. In order to open a base table cursor and insert data in the table (no SQL is used for data manipulation), the code creates a table object and "opens" it, returning a "rowset" object. My mistake was to have the table class include an ATL CTable-derived object, not a pointer to it. The CTable-derived class contains all the accessor and rowset logic that is used by the rowset object. Making it static is a bad idea because the desktop .NET GC does not know about this and kills the table object because it thinks it is no longer needed. After all, the only active object is the rowset... Now you see the problem: by killing the table, the rowset is implicitly killed as well because it shares an object that is statically owned by the table. Closing the rowset means that a few internal pointers will be deleted and set to NULL. Bang! Am I smart or what? (The answer is "No").
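The ownership problem can be illustrated with a small analogy in standard C++. This is not the Console's code and involves no .NET GC: `Accessor`, `Table`, `Rowset` and `openOrders` are hypothetical names, and the end of a scope plays the role of the collector reclaiming the table. Here the table holds the shared object through a pointer and the rowset co-owns it, so the rowset survives the table.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the CTable-derived accessor and rowset logic.
struct Accessor {
    std::vector<std::string> rows;
    void insert(const std::string& row) { rows.push_back(row); }
};

// The rowset co-owns the accessor, so it stays valid on its own.
class Rowset {
public:
    explicit Rowset(std::shared_ptr<Accessor> a) : accessor_(std::move(a)) {}
    void insert(const std::string& row) { accessor_->insert(row); }
    std::size_t count() const { return accessor_->rows.size(); }
private:
    std::shared_ptr<Accessor> accessor_;
};

// The table holds a pointer to the accessor instead of embedding it,
// so destroying the table does not destroy what the rowset depends on.
class Table {
public:
    Table() : accessor_(std::make_shared<Accessor>()) {}
    Rowset open() const { return Rowset(accessor_); }
private:
    std::shared_ptr<Accessor> accessor_;
};

// The table goes away at the end of this function, much like the table
// object that was reclaimed while its rowset was still in use.
Rowset openOrders() {
    Table table;
    return table.open();
}
```

Had `Table` embedded the `Accessor` by value and handed the rowset a raw pointer to it, the rowset returned by `openOrders` would be left pointing at a destroyed object.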

I uploaded a revised Beta 4 that solves this issue, so if you installed the one I uploaded yesterday please replace it with the new one.

Thursday, August 02, 2007

Data Port Console 1.0 Beta 4 available for download

I have just uploaded the new Data Port Console Beta 4. As previously promised, import and export to Microsoft Access (2002-2003 and 2007) is now supported, and this looks like it will be the last Beta. Now I will focus on cleaning up a few bugs and will try to have the final version 1.0 by mid-August.

What's next? Primeworks.Data - the merger between DesktopSqlCe and Data Port Sync with a few "benefits" such as the ability to make P2P connections via Bluetooth or WiFi between two devices. I will also add support for the JET (ACE) engines on the desktop server and will publish the services via a multi-threaded Windows Sockets server (I don't anticipate the need for IOCP anytime soon because there will be few concurrent connections).

The new Beta 4 can be downloaded from here.