Microsoftģ Jet 3.5 is the database engine that is used in Microsoft Office 97, which includes Microsoft Access 97, and in Microsoft Visual Basicģ 5.0. Any application that has access to the version of Visual Basic for Applications that shipped with Office 97 will have access to Microsoft Jet data. There is no format change in Microsoft Jet 3.5; thus, it is possible to access a database that was created with Microsoft Jet 3.0 without converting it. (Note: Microsoft Access 97 does require a conversion process, primarily due to a new method of storing Visual Basic for Applications). Even though the database format has not changed, an application cannot access Microsoft Jet 3.5 using DAO 3.0. Thus, if you are using Microsoft Visual Basic 4.0, you will not be able to reference the DAO 3.5 type library to gain access to Microsoft Jet 3.5. Microsoft Jet 3.5 also has new file names for the three primary DLLs that it uses (MSJET35.DLL, MSJINT35.DLL, and MSJTER35.DLL).
The primary goal of this paper is to disseminate information regarding features of Microsoft Jet 3.5 that you can use to improve and optimize performance. Since many of the features are primarily exposed through DAO, this paper will also illustrate some of the new properties and methods included in DAO 3.5. This paper will go over each feature and, where applicable, include sample code to illustrate the features, along with performance numbers generated from the Microsoft Jet performance lab. In addition, tips and techniques discovered during our performance analysis will be shown along with other pertinent information discovered from customers in the field. Below is a high level list of the major enhancements that were put into Microsoft Jet/DAO 3.5 to improve and fine tune performance.
To improve multiuser performance from over that of Microsoft Jet 3.0, the Microsoft Jet team took a hard look at where concurrency bottlenecks were occurring. This analysis lead to a modification of locking algorithms in two areas: write locks on index pages and read locks on long value (LV) data pages.
When a Microsoft Jet 3.0 database used enforced referential integrity, Microsoft Jet would always place a read lock on any index page that was being read. This was necessary to prevent other users from changing data in an upper-level index Btree that might be necessary to enforce referential integrity. However, this did not come without cost, as write locks on a series of index pages would prevent other users from even seeking on that index.
This was discovered too late in the Microsoft Jet 3.0 time frame for the Microsoft Jet team to make an architectural change to reduce the concurrency hit. This was resolved in Microsoft Jet 3.5 by replacing the write locks with read locks on index pages that were not being modified. This allows Microsoft Jet to continue to enforce referential integrity while still allowing other users to read the index pages.
The other locking enhancement in Microsoft Jet 3.5 concerns placing read locks on LV pages. LV pages typically contain columns with the Memo and OLE data types used in Microsoft Access. In Microsoft Jet 3.0 a read lock would be placed on every LV page that was being read. The primary reason for using read locks was to ensure that the user would see unchanged data when reading large LV pages. However, while we examined more customer databases, we discovered that many users were using a Memo field with only a minimal amount of data being stored. Since LV data that did not span more than one page did not meet the original criteria for placing read locks, the Microsoft Jet team devised a way to eliminate read locks in those scenarios. The outcome of this was to remove read locks on LV pages that did not span more than one page. This substantially reduced concurrency conflicts and substantially increased performance.
Below is a chart showing the multiuser performance throughput improvements in Microsoft Jet 3.5:
Since Microsoft Jet 3.0, a large effort has been made to measure performance with Microsoft Jet. The Microsoft Jet 3.5 team expanded on this effort by increasing test suites to over 1,500 performance benchmarks. The chart above and graphs that will follow are results from some of those tests.
The lab was also upgraded during the 3.5 cycle to reflect the operating system and machine hardware that a high-end customer might use. Of course, testing was still done with memory restrictions as low as 5 MB of RAM to represent users with low-end hardware. The majority of the multi-user tests were conducted on 36 machines. 27 of them were identically configured Pentium 60mhz machines with 32 MB of RAM while the remaining nine were Pentium 120 machines with 64 MB of RAM. All machines had a 540 MB IDE hard disk drive and many had a second 1.2 or 2.5 GB EIDE hard disk drive with a PCI network interface card (NIC). The tests were run using only DAO/SQL commands on Windows NTģ Server/Workstation 3.51 SP5, Windows NT Server/Workstation 4.0, Windowsģ 95 SP1 or OSR2. When run in a network environment the network operating system was NetWare 4.11 on a Dell XPE PowerEdge P120 with four 1 GB RAID drives running off a dedicated EISA RAID SCSI host adapter, two four-port PCI full-duplexed Ethernet NICs and 64 MB of RAM.
Microsoft Jet 3.5 was improved to use Rushmore technology on more operands in predicates. In particular, the FALSE and <> operands now provide substantial performance improvements.
Even with all the work in Microsoft Jet 3.0 to eliminate transactions in order to obtain better performance, SQL DML statements were still placed in an implicit transaction. This negated some of the performance work and typically resulted in DAO looping code to substantially outperform SQL DML statements. Microsoft Jet 3.5 has removed that restriction and SQL DML statements now are no longer placed in an implicit transaction. This results in a substantial performance benefit when running SQL DML statements that affect many rows of data.
While this change provides a substantial performance increase, it also introduces a change to the behavior of SQL DML statements. Previously, an SQL DML statement would roll back if any part of the SQL DML statement could not be completed. This was because Microsoft Jet placed it in an implicit transaction. It is now possible to have some of the rows committed by a SQL DML statement while others are not. An example of this would be when Microsoft Jetís cache is exceeded. The data in the cache would be flushed to disk and the next set of rows would be modified and placed in the cache. Therefore, if the connection were terminated, there would be a possibility that some of the rows would be saved to disk and others would not. This is the exact behavior that Microsoft Jet 3.0 users would see when using DAO looping constructs to update data without an explicit transaction. If this behavior is not desired, then the user will need to add explicit transactions around the SQL DML statement to define a set of work and sacrifice the performance gains.
Microsoft Jet 3.5 has been enhanced to not re-seek an index page if the index is a primary or unique index. This reduces the amount of I/O that occurs since Microsoft Jet does not need as many reads.
Microsoft Jet 3.0 would always refresh its cache on a page when a read lock was promoted to a write lock. In Microsoft Jet 3.5 it is no longer be necessary to refresh the cache when this operation occurs, thus resulting in reduced I/O.
While working with the Microsoft Access team, we discovered that their implementation of Microsoft Jet through the user interface required frequent checking for security on objects. Microsoft Jet 3.5 now keeps the MSysObjects and MSysACEs tables open for the duration of a Microsoft Jet instance to improve performance through the Microsoft Access user interface, and other areas where security is implemented and objects are being referenced through DAO. This reduces the overhead of constantly opening and closing those tables when security is implemented and reduces I/O.
When adding data to a table, Microsoft Jet 3.0 would always flush its cache every two seconds after each extent (up to 16K) was added. By examining the performance characteristics when adding one million rows of data, Microsoft Jet 3.5 was modified to only flush its cache two seconds after all new extent data was added to the database. This results in less I/O.
Microsoft Jet 3.0 would allocate up to 8 pages (16K) at a time when adding data to a table. This was done to allow data to be contiguous within a table, thus optimizing sequential reads and Microsoft Jetís read-ahead cache. It was determined during performance testing that increasing this maximum allocation resulted in better performance. Microsoft Jet 3.5 has been modified to add up to 32 pages (64K) at a time when adding data. Like Microsoft Jet 3.0, Microsoft Jet 3.5 will only add large extents to a table when large numbers of rows exist in the table.
One of the big performance improvements in Microsoft Jet 3.0 was the reduced time it took to remove a table or delete all the rows in a table. While Microsoft Jet 3.0 offered substantial performance improvements, there was still room for improvement and Microsoft Jet 3.5 took advantage of that by optimizing how it removed pages in the database by reading and processing less directory pages.
During the course of performance tests, it was determined that users that utilized temporary queries (for example, DB.Execute SQL) or constantly modified stored queries in code were experiencing slow performance. Microsoft Jet 3.5 addressed this issue by optimizing the query plan generation process.
With Microsoft Jet 3.0 it was difficult for users to manipulate registry values to control Microsoft Jet because the key names, types, and values were not created in the registry. Microsoft Jet 3.5 remedies that by explicitly loading the key names and default values when Microsoft Jet 3.5 is registered. Explicit registry names and values are also loaded for the Microsoft Jet 2.x installable ISAM. Below is a listing of the default registry entries for Microsoft Jet 3.5. New registry entries will be discussed later in this paper.
When using the default value of zero for the MaxBufferSize registry setting in Microsoft Jet 3.0, Microsoft Jet would calculate how much RAM to dynamically allocate for its cache. This formula was ((Total in MB Ė 12) / 4 + 512K). However, it was discovered that machines with large amounts of RAM (that is, over 64 MB) were becoming more common and causing Microsoft Jet to start using a large amount of RAM for the cache. This could potentially cause a lot of swapping to occur due to the use of virtual memory, especially when Microsoft Jet was being used with Microsoft Internet Information Server (IIS). Microsoft Jet 3.5 addressed this issue by putting a cap of 13,824K on the result of the formula. The user can override the formula and the cap by putting a value other than zero for the MaxBufferSize setting.
From talking with customers and examining the results of different registry settings while running performance tests, we realized that there was only so much self-tuning that Microsoft Jet could do to optimize all scenarios. The only way to fine-tune Microsoft Jet was by exposing programmatic run-time controls via DAO.
Probably the most exciting enhancement to Microsoft Jet 3.5 is the ability to control the majority of Microsoft Jetís registry value settings during run time. This provides the developer with ultimate control over performance and functionality for Microsoft Jet 3.5. These new features are exposed through DAO 3.5 with the new SetOption method of the DBEngine object. Below is a list of the Microsoft Jet registry values and their corresponding DAO constants that can be manipulated during run time.
|Registry Name||DAO Constant|
It is important to note that using the SetOption method only affects the run-time values of the registry and does not physically change the values in the registry. Thus, once Microsoft Jet is restarted, it will read the values in the registry. This means that in order to control Microsoft Jetís registry setting the developer must use the SetOption method in code that executes every time an application starts. Below is a code sample that illustrates how a developer might use the SetOption method to optimize code to take advantage of Microsoft Jetís buffer setting:
Sub Main() On Error GoTo ErrorHandler Dim db As Database, rs As Recordset, ws As Workspace Dim strCompanyName As String, _ strContactName As String, _ lngReads As Long, lngWrites As Long Set db = OpenDatabase("c:\northwind.mdb", False, False) DBEngine.SetOption dbMaxBufferSize, 128 Set rs = db.OpenRecordset _ ("SELECT * FROM Customers", dbOpenDynaset) Set ws = Workspaces(0) lngReads = DBEngine.ISAMStats(0, True) lngWrites = DBEngine.ISAMStats(1, True) While Not rs.EOF rs.Edit strCompanyName = rs!CompanyName strContactName = rs!ContactName rs!CompanyName = strCompanyName rs!ContactName = strContactName rs.Update rs.MoveNext Wend ' The null transaction ensures no more asynchronous ' activity that could yield inaccurate statistics. ws.BeginTrans ws.CommitTrans lngReads = DBEngine.ISAMStats(0) lngWrites = DBEngine.ISAMStats(1) MsgBox "Total reads " & CStr(lngReads) & _ " Total writes " & CStr(lngWrites) Exit Sub ErrorHandler: MsgBox "An error has occurred " & Err & " " & Error Resume Next End Sub
Note: If the dbMaxBufferSize value above is modified from 128 to 2048, the number of writes will decrease from 20 to 11. This is an extreme example to illustrate the point while using the Northwind database. Using these numbers for a real world scenario is not recommended.
While programmatically flushing Microsoft Jetís write cache isnít necessarily a performance enhancement, it allows developers to guarantee that their data has
been written to disk after they issue the DAO CommitTrans method. While Microsoft Jet does tell the operating system to write its data to disk after a
CommitTrans, the operating system has a lazy-write cache that does not necessarily write data to disk, but does notify Microsoft Jet that the data has been
written to its cache. While Microsoft Jet could tell the operating system to bypass its lazy-write cache, doing so would cause significant performance degradation.
Therefore, to give developers the best default performance, but allow for increased functionality, Microsoft Jet exposed an interface to bypass Windows 95ís and
Windows NT Server/Workstationís lazy-write cache.
(Note: This feature works only for Windows 95 and Windows NT Server/Workstation. If the .mdb file is used with any other operating system, the feature will not work.) You can programmatically flush the cache by using an additional property on the CommitTrans method called dbForceOSFlush (for example, ws.CommitTrans dbForceOSFlush).
This property should be used only when it is critical to know that all the data in a transaction has been written to disk before proceeding with the next command. Below is a chart demonstrating the performance impact when utilizing this property.
A feature of Microsoft Jet 3.0 was the ability to have performance parity between a database being opened as shared and a database being opened exclusively. This was largely accomplished by improved buffer refreshing. However, to implement that feature, it was necessary to modify the behavior of PageTimeout to check the database header page (DBH). The default for checking the DBH was determined by the PageTimeOut setting in the registry and by default would retry it every five seconds. While this proved optimal for performance, the drawback was that a user could go for almost ten seconds before seeing changes made by other users. The workaround for this behavior was to set the PageTimeOut registry setting to a lower value. However, this resulted in a performance hit as Microsoft Jet was forced to read the DBH more frequently, resulting in increased I/O.
Microsoft Jet 3.5 addressed this problem by exposing an interface to force the cache to be refreshed regardless of the PageTimeOut setting. DAO exposed this interface by adding a new argument, dbRefreshCache, to the Idle method of the DBEngine object. The DBEngine.Idle dbRefreshCache statement forces Microsoft Jet to immediately read the DBH to see if any changes have occurred. If they have, Microsoft Jetís cache will be refreshed and the user that issued the command will see any changes made by other users. This change allows for the PageTimeOut setting to be left at its default, thus providing optimal performance, and giving the developer control over when to check for other usersí changes. However, just as may happen when PageTimeOut is set to a low value, using DBEngine.Idle dbRefreshCache inappropriately may cause performance problems. Below is a chart illustrating the increased I/O that is caused by repeatedly calling dbRefreshCache.
Depending on the network operating system, Microsoft Jet can cause short bursts of network traffic when attempting to retry for a lock. In order to prevent these short bursts of network traffic that could cause performance problems for the network administrator, Microsoft Jet 3.5 introduced a new registry setting called LockDelay. This registry setting works in conjunction with the LockRetry registry setting and places a default delay of 100 milliseconds between every lock retry. Besides eliminating the short burst of lock retries that could occur on certain network operating systems, this feature provides a more consistent feel to users when they encounter locking conflicts.
All versions of Microsoft Jet place locks on the .ldb file while modifying data. Many locks can accumulate when data is being manipulated inside a transaction. The accumulation of locks caused particular problems with NetWare servers because they could handle only up to 10,000 locks per connection. This limit ensured good performance on a NetWare server, but caused Microsoft Jet not to complete large transactions. When users would encounter this scenario, they would experience long delays before they would even get an error message stating that the transaction would need to be rolled back. The only workaround was to break transactions that affected many rows into mini transactions and replace SQL DML statements (prior to version 3.5) with the equivalent DAO looping scenarios that would use mini transactions that affected fewer rows.
Microsoft Jet 3.5 addresses this problem by introducing a new registry setting called MaxLocksPerFile. With Microsoft Jet 3.5, an explicit transaction will automatically partially commit when more than the default setting of 9500 locks have accumulated. While this setting is primarily for NetWare users, it is also useful to Windows NT Server/Workstation 3.5x users because performance will start to degrade when Windows NT Server/Workstation has to manage that many locks. Windows NT Server/Workstation 4.0 addresses many of those performance issues, but still runs optimally with the MaxLocksPerFile registry setting at its default.
Many applications would experience database bloat when manipulating LV data types. This behavior was caused by performance enhancements with LV data in Microsoft Jet 3.0. A particular area of database bloat would occur when developers would manipulate form, report or Visual Basic for Application modules in Microsoft Access. There were only two workarounds to recover the empty space from modified LV data types. The first workaround required a user to be the last user to close the database. Once the database was reopened, the empty LV space would be recycled, but the size of the .mdb file would not shrink. The other solution was to compact the database, which would reclaim the LV data pages and shrink the databaseís physical size.
Microsoft Jet 3.5 resolved this issue by modifying the way that it recycles pages. Microsoft Jet 3.5 can now recycle LV pages once a new allocation of LV pages has been created. The size of the database will still grow initially, but it will remain stable with only small increases of size over time. However, the Microsoft Jet performance team discovered that this feature caused a performance hit when manipulating LV data. To circumvent the performance issue, a new registry entry called RecycleLVs was added. By default, RecycleLVs is turned off to ensure performance parity with Microsoft Jet 3.0. However, Microsoft Access will dynamically turn this feature on immediately before manipulating any of its objects that utilize LV data and will turn it off when those objects are no longer being manipulated. This provides the user with optimal performance while minimizing database bloat.
DAO users who are manipulating LV data should leave the RecycleLVs registry setting turned off and turn it on only when necessary by utilizing the SetOption method of the DBEngine object discussed above.
The biggest performance change for Microsoft Jet 3.5 was a modification to how Microsoft Jet uses its internal cache for manipulating data outside transactions. One of the biggest performance improvements in Microsoft Jet 3.0 was the introduction of asynchronous writes. However, due to potential concurrency issues, this feature could not be fully exploited.
In Microsoft Jet 3.0, asynchronous writes were controlled by two registry entries: MaxBufferSize and SharedAsyncDelay/ExclusiveAsyncDelay. The SharedAsyncDelay setting is used for databases opened in shared mode; likewise, ExclusiveAsyncDelay is used for databases opened in exclusive mode. The combination of the two settings determined how long modified data pages could be held in Microsoft Jetís cache before being flushed to disk. The most restrictive of the two registry entries was SharedAsyncDelay. The default SharedAsyncDelay value was 50 milliseconds. This value prevents concurrency issues that could arise with Microsoft Visual Basic or Microsoft Access users modifying data through forms, because any modified data that is held in Microsoft Jetís cache also has a corresponding write lock. Therefore it was necessary to keep the SharedAsyncDelay setting low as to prevent pages from sitting in Microsoft Jetís cache and holding write locks. This scenario would cause a significant concurrency issue if a higher SharedAsyncDelay value were used as the pages in Microsoft Jetís cache would continue to hold write locks. However, leaving the value low also directly affected performance on operations that manipulated many rows of data at once (such as looping constructs in DAO or SQL DML statements) due to the fact that Microsoft Jet was not utilizing its cache as it flushed itís cache every 50 milliseconds.
This situation made it very difficult for a developer to take advantage of the asynchronous writing techniques that were present in Microsoft Jet 3.0 without affecting concurrency. The only way to take advantage of these techniques was to have one machine with a high SharedAsyncDelay setting designated as the machine to only run operations that affected many rows.
Microsoft Jet 3.5 introduces a new method of determining when to flush the cache for asynchronous writes that eliminates this problem. This was accomplished by adding a new registry setting called FlushTransactionTimeout that would ignore the AsyncDelay registry settings. While Microsoft Jet 3.0 would use the SharedAsyncDelay setting to determine a maximum amount of time to elapse before flushing the cache to disk, Microsoft Jet 3.5 uses the FlushTransactionTimeout setting to determine an amount of time of inactivity before flushing the cache to disk.
This eliminates the problem with users modifying data in forms because 500 milliseconds, the default setting for FlushTransactionTimeout, would expire before the user could modify and save the next row. Therefore, the maximum amount of time that a lock would be held on the data and index pages associated with the row being edited would be 500 milliseconds. This eliminated any concurrency issues when using forms in Microsoft Visual Basic or Microsoft Access. On the performance side, users would see significant performance increases, as Microsoft Jet would now be able to use the maximum amount of cache before flushing to disk. This is due to the fact that any operation that modifies many rows (for example, a DAO looping construct or an SQL DML statement) would be adding rows to Microsoft Jetís cache before the 500 milliseconds setting would expire. This behavior results in Microsoft Jet fully utilizing its cache and substantially less I/O because Microsoft Jet does fewer reads and writes. Below is an example of the performance differences with the new behavior.
As the chart illustrates, almost a 50% reduction in I/O was accomplished with this new setting, while not increasing concurrency with forms usage. While the Microsoft Jet performance team did not encounter any reasons to not use this feature, setting the FlushTransactionTimeout value to zero disables the feature. Disabling this feature causes Microsoft Jet to use the AysncDelay settings in the same manner as Microsoft Jet 3.0.
This section is intended to illustrate performance optimization tips and tricks that the Microsoft Jet performance team discovered during our internal tests and while examining performance issues provided by customers.
An area that can cause performance degradation is the constant recompiling of stored queries. We have seen this occur when DAO objects are not explicitly closed. The reason for this is that stored queries reference a particular memory address in order to use the Visual Basic for Applications expression service. If a DAO object is opened followed by the execution of a stored query, re-executing that query may cause Microsoft Jet to recompile because the original memory address space is occupied. In certain circumstances this can also cause the database to increase in size. Many of these issues were resolved in Microsoft Jet 3.5, but some of the issues can be resolved only by explicitly closing DAO objects when they are no longer being used. The developer should not rely on Visual Basic for Applications to implicitly close objects when a Sub or Function procedure is exited.
If explicitly closing the DAO object is not an option due to nested explicit transactions or some other scenario, another possible workaround is to use the dbForwardOnly syntax when opening a Recordset. Using the dbOpenShapshot, dbForwardOnly syntax prevents Microsoft Jet from recompiling the query. We have found this to be most useful when using the data grid that comes with Microsoft Visual Basic 4.0, because the developer has no control to explicitly close the DAO objects being used.
From a performance perspective, there are many reasons to frequently compact a database. One reason is that compacting will create a new database that stores all table rows in a contiguous order. If a primary key or unique index is defined, the rows will be sorted in order of the primary key or unique index. This allows Microsoft Jet to take full advantage of its read-ahead cache and also reduces disk I/O when doing sequential scans of a table. Compacting also causes all the statistics in the database to be recalculated. Statistics can become out of date during the course of database operations, thus resulting in inaccurate query plans. Probably the most important performance reason for compacting the database is that the CompactDatabase command or CompactDatabase method switches a flag in all stored queries that causes them to recompile the next time they are executed. This is important because it ensures that the query plan retrieves the latest statistics and creates the best execution path to retrieve the data.
Compacting is also important from a stability standpoint because it removes all deleted pages, recopies all pages (thus ensuring integrity in the pages), and recreates all index pages.
A somewhat related issue to this concerns repairing a database. A bug was found in Microsoft Jet 3.0 where issuing the RepairDatabase command (or the RepairDatabase method) before compacting the database could result in a database that could no longer be opened. This problem (due to a very rare bug that could allow duplicate indexes on the system tables) has been resolved in Microsoft Jet 3.5 and a special release of Microsoft Jet 3.0, which is now available on Troubleshoot Jet Database and Recovery Steps. Note that the problem will never occur if the database is compacted before it is repaired. Previously it was recommended to repair the database before compacting it. This was primarily for Microsoft Jet 2.x databases, because the RepairDatabase command and RepairDatabase method had additional functionality to recover truncated rows of data. This is no longer true for Microsoft Jet 3.x file formats and it is recommended that users only repair a database if a Microsoft Jet error message indicates that this is necessary.
Probably the biggest performance issue that we have seen from customer databases results from the use of expressions in queries. Having an expression in a query prevents Microsoft Jetís query optimizer from recognizing the field in the expression, thus not using the field for index or Rushmore optimization. This alone can cause a substantial performance hit. In addition, expressions are evaluated for every row, not just the rows that are returned. This can cause substantial overhead. One customerís query went from two minutes to two seconds by removing the embedded expressions. The best place to put expressions is at the form or report level because then expressions are only evaluated for the rows returned.
While having indexes can typically decrease data-retrieval times, they always carry a cost in maintenance and concurrency issues. Below is a test showing the throughput differences when randomly updating one row of data in one table from six workstations.
By simply adding an index to the column that was being updated, overall throughput diminished over five times! The question then becomes: When should a column be indexed? There is no concrete answer for this, as it depends on the type of application. The first rule of thumb is that highly duplicated data types should not be indexed (for example, Boolean data types, and columns that represent gender, state abbreviations, or country codes). The second rule of thumb is to not add indexes to columns simply to force Rushmore to use more than one index. An example of this would be indexing a column called City and a column called ZipCode in a customer table when the application is always going to be using both columns for retrieval purposes. In this instance, ZipCode is going to be the most unique index and would return a faster result set if City was not indexed. This is because Rushmore need not use the index on City, thus reducing overall I/O. Of course, if both values were not always being entered and they were used alternatively and equally, then having an index on both columns would probably be advantageous. Rushmore is best utilized when combined indexes generate a unique result set.
It is also important to remember that indexes create concurrency issues, as one index page represents many data pages. Therefore, modifying an index page can cause users with data on an entirely different data page to be locked out when trying to update the indexed column. This is illustrated in the chart above.
To see this behavior, open the Northwind database in Microsoft Access 97 and turn pessimistic locking on. Update a value in one indexed field in the Customer table but donít move to the next record. On another workstation, open the Customer table and try to edit another value in the same indexed field that the other workstation is editing. Next try updating a value in an non-indexed field in the Customer table. What will become evident is that substantially more records of data are locked when you try to update a value in an indexed field than when you try to update a value in a non-indexed field.
While we are not stating that developers should not index, we are saying that developers and database administrators should be aware of the pros and cons of indexing.
In the majority of cases (Microsoft Jet 3.5 only), it is better to use a single SQL DML statement than a looping DAO construct. One of the reasons that this particular example shows a performance improvement is due to the fact that string values are being manipulated. VBA is UNICODE enabled and DAO inherently uses VBA, but Microsoft Jet currently only stores string data in ANSI for US code pages. This causes VBA to do an ANSI-UNICODE-ANSI translation whenever string data passes through DAO. This translation, especially for long string types, can be expensive in terms of performance. Since the query optimizer in Microsoft Jet does not process string data through VBA (unless expressions are being used), doing bulk type operations from SQL can be significantly faster when manipulating string data. Below are two examples of code, with a chart following showing the time differences.
Set rs = gdbOrder.OpenRecordset("Customer") While Not rs.EOF rs.Edit rs![AddressStatus] = rs![AddressStatus] + 1 rs![CTitle] = "President" rs![CCompany] = "Olivetti" rs![EntryDate] = "10:20AM" rs![Country] = 12 rs.Update rs.MoveNext Wend gdbOrder.Execute "UPDATE Customer SET AddressStatus = (AddressStatus+1), Ctitle = ĎPresidentí, Ccompany = ĎOlivettií, EntryDate = #10:20AM#, Country=12;", dbFailOnError
While this is one of the more extreme examples that we have seen, and it is only inherent in Microsoft Jet 3.5 due to removal of implicit transactions for SQL DML statements, it demonstrates that the developer should examine the code for potential performance enhancements by re-coding DAO looping constructs with SQL DML statements.
Another performance issue raised by customers using Microsoft Jet 3.0 was due to a change in how Microsoft Jet dealt with the .ldb file. The .ldb file is used to track which users have the database open and to track the locking of pages in the .mdb file. In Microsoft Jet 3.0, the .ldb file was deleted when the last user closed the database. This was done to prevent littering of .ldb files when replication was being used. However, in situations where only one user was accessing a linked table, particularly in a looping construct, a significant performance hit was experienced. This was because linked tables do not keep persistent connections to the database where they reside. This causes a lot of additional I/O to delete, recreate, and establish locks on the .ldb file.
The workaround for this is to have the application keep a persistent connection to any table where other linked tables reside in another database file. This prevents Microsoft Jet from constantly deleting, creating, and obtaining lock information on the .ldb file. One customerís scenario was improved from three minutes to thirty seconds by employing this technique.
Due to the changes in Microsoft Jet 3.5 regarding the FlushTransactionTimeout registry setting, it is possible to see significant throughput decreases due to concurrency issues when OLTP is being implemented from many workstations. The reason for this is that Microsoft Jet 3.5 will cache modified pages until no activity is encountered or the cache is full. Thus the fast throughput of OLTP statements will cause Microsoft Jet to not flush its cache and release the locks on index and data pages. This may cause many pages to be locked during this time period, which will cause concurrency issues with many users trying to update one row at a time on the same table. The workaround is to use explicit transactions for these scenarios to force Microsoft Jet to flush its cache when the transaction is committed. Below is an extreme example of throughput degradation when not using an explicit transaction for OLTP type work to quickly update one random row at a time from six workstations.
While this illustrates why you should use explicit transactions for this type of operation, it also illustrates, again, the cost of updating an indexed column in regards to how many rows of indexed columns are locked.
Parameterized queries can only be implemented by using a stored query. Since stored queries have a precompiled query plan, parameterized queries that contain parameters on indexed columns may not execute efficiently. Since the query engine does not know the values to be passed in a parameter in advance, it can only guess as to the most efficient query plan. Based on customer performance scenarios that we have examined, we have discovered that in some instances substantial performance gains can be achieved by replacing a stored parameterized query with a temporary query. This means creating the SQL string in code and passing it to the DAO OpenRecordset or Execute methods of the Database object.
The information below has been available, but not thoroughly documented, since Microsoft Jet 3.0. Microsoft will not support this information and the sole intention of documenting it here is to have a valid measurement to experiment with the ideas in this white paper. Much of the information is excerpted from the Microsoft Jet Database Engine Programmerís Guide (available from Microsoft Press). These features were implemented primarily for use by the Microsoft Jet performance team as a way to better measure and improve Microsoft Jet performance, thus the availability or similar functionality of these features in future releases of Microsoft Jet is not guaranteed.
The DAO object model exposes a function that allows the developer to get information about the raw disk reads, writes, locks, and caching. The graphs in this white paper used these functions. The following syntax is used for the ISAMStats function:
ISAMStats(StatNum as Long[, Reset as Boolean]) as Long
This method returns the value of a given engine statistic as defined by StatNum, which is defined in the following table. If the optional Reset argument is supplied, then the statistic defined by StatNum is reset and no value is returned. A Reset argument value of False is equivalent to not supplying the argument. The statistics returned apply to the whole engine, regardless of how many databases or sessions are active, including temporary databases.
|0||Number of disk reads|
|1||Number of disk writes|
|2||Number of reads from cache|
|3||Number of reads from read-ahead cache|
|4||Number of locks placed|
|5||Number of release lock calls|
The following code illustrates a sample use of the ISAMStats function:
Sub Main() Dim dbs As Database, ws As Workspace Dim strSQL As String Dim lngDiskRead As Long, lngDiskWrite As Long, _ lngCacheRead As Long, _ lngCacheReadAheadCache As Long, _ lngLocksPlaced As Long, _ lngLocksReleased As Long ' Explicitly set the counters to zero. lngDiskRead = DBEngine.ISAMStats(0, True) lngDiskWrite = DBEngine.ISAMStats(1, True) lngCacheRead = DBEngine.ISAMStats(2, True) lngCacheReadAheadCache = DBEngine.ISAMStats(3, True) lngLocksPlaced = DBEngine.ISAMStats(4, True) lngLocksReleased = DBEngine.ISAMStats(5, True) Set dbs = OpenDatabase("northwind.mdb", False, False) Set ws = Workspaces(0) strSQL = _ "UPDATE Customers SET ContactName = ContactName" dbs.Execute strSQL, dbFailOnError ' The null transaction ensures no more asynchronous ' activity that could yield inaccurate statistics. ws.BeginTrans ws.CommitTrans ' The following ISAMStats calls will retrieve the latest ' values. The values will accumulate until they ' are reset lngDiskRead = DBEngine.ISAMStats(0) lngDiskWrite = DBEngine.ISAMStats(1) lngCacheRead = DBEngine.ISAMStats(2) lngCacheReadAheadCache = DBEngine.ISAMStats(3) lngLocksPlaced = DBEngine.ISAMStats(4) lngLocksReleased = DBEngine.ISAMStats(5) Debug.Print "Disk reads " & lngDiskRead Debug.Print "Disk writes " & lngDiskWrite Debug.Print "Cache reads " & lngCacheRead Debug.Print "Cache reads from RA cache " & lngCacheReadAheadCache Debug.Print "Locks placed " & lngLocksPlaced Debug.Print "Locks released " & lngLocksReleased End Sub
Using these statistics in conjunction with the SetOption method can allow the developer to instantly see results that may produce timing differences on a LAN that may not be apparent from running on a local machine. This is why many of the graphs above show I/O instead of timings as it more accurately represents what is occurring.
The Microsoft Jet query engine implements a cost-based query optimizer. When a query is compiled, the query engine creates a query plan. This plan is used internally to find the quickest way to execute a query. Using the ShowPlan key in the registry will cause Microsoft Jet to create a text file containing the query execution plans.
The ShowPlan function was available in Microsoft Jet 3.0 by adding the following key to the registry:
Under this key, add a string data type called JETSHOWPLAN (make sure to use all capital letters). To turn on ShowPlan, set the value to ON. To turn it off, set the value to OFF.
When ShowPlan is turned on, Microsoft Jet appends text to a file called SHOWPLAN.OUT every time a query is compiled. You must modify or compact the database in order to have a stored query show its query plan. It is also important to note that SHOWPLAN.OUT appends new data for every new query plan. Leaving ShowPlan on could result in an extremely large SHOWPLAN.OUT file.
Microsoft Jet 3.5 includes enhancements and bug fixes to ShowPlan. For example, you can now determine the inputs to the query. This is very useful in determining the uniqueness of an index; thus determining how useful that index is in retrieving the overall result set and what affect it may have on concurrency. Since Microsoft Jet 3.5 utilizes a different registry key structure, the physical location of ShowPlan has changed. To activate ShowPlan for Microsoft Jet 3.5 you must use this key location:
Below is a sample output generated by running the Invoices query. This query comes with the Northwind database in Microsoft Access 97.
--------------------------------------------- DATE: 01/19/97 VER: 3.50.3428 NOTE: Currently does not handle subqueries, vt [virtual table] parameters, and subqueries NOTE: You may see ERROR messages in these cases --- Invoices --- - Inputs to Query - Table 'Orders' Table 'Order Details' Using index 'OrdersOrder Details' Having Indexes: OrdersOrder Details 2155 entries, 8 pages, 830 values which has 1 column, fixed OrderID 2155 entries, 8 pages, 830 values which has 1 column, fixed Table 'Customers' Using index 'PrimaryKey' Having Indexes: PrimaryKey 91 entries, 1 page, 91 values which has 1 column, fixed, unique, primary-key, no-nulls PostalCode 91 entries, 1 page, 87 values which has 1 column, fixed CompanyName 91 entries, 3 pages, 91 values which has 1 column, fixed City 91 entries, 1 page, 69 values which has 1 column, fixed Table 'Employees' Table 'Products' Table 'Shippers' - End inputs to Query - 01) Sort table 'Orders' 02) Inner Join table 'Shippers' to result of '01)' using temporary index join expression "Shippers.ShipperID=Orders.ShipVia" 03) Sort table 'Employees' 04) Inner Join result of '02)' to result of '03)' using temporary index join expression "Orders.EmployeeID=Employees.EmployeeID" 05) Inner Join result of '04)' to table 'Customers' using index 'Customers!PrimaryKey' join expression "Orders.CustomerID=Customers.CustomerID" 06) Inner Join result of '05)' to table 'Order Details' using index 'Order Details!OrdersOrder Details' join expression "Orders.OrderID=[Order Details].OrderID" 07) Sort table 'Products' 08) Inner Join result of '06)' to result of '07)' using temporary index join expression "[Order Details].ProductID=Products.ProductID" --- temp query --- - Inputs to Query - Table 'Customers' Using index 'CompanyName' Having Indexes: CompanyName 91 entries, 3 pages, 91 values which has 1 column, fixed City 91 entries, 1 page, 69 values which has 1 column, fixed - End inputs to Query - 01) Scan table 'Customers' Using index 'CompanyName' --- temp query --- - Inputs to Query - Table 'Products' Using index 'ProductName' Having Indexes: ProductName 77 entries, 1 page, 77 values which has 1 column, fixed PrimaryKey 77 entries, 1 page, 77 values which has 1 column, fixed, unique, clustered and/or counter, primary-key, no-nulls CategoryID 77 entries, 1 page, 8 values which has 1 column, fixed CategoriesProducts 77 entries, 1 page, 8 values which has 1 column, fixed - End inputs to Query - 01) Scan table 'Products' Using index 'ProductName'
The Microsoft Jet team takes performance very seriously and has had a dedicated team for performance since Microsoft Jet 3.0. However, even with our 1,500 plus performance tests, we cannot encounter all the scenarios that our customers put Microsoft Jet through. If you believe that you have encountered a performance issue with Microsoft Jet, please contact Microsoft support with a precise reproducible scenario. They will forward this information to the Microsoft Jet performance team and we will examine it and try to come up with workarounds or attempt to correct the issue in the next release of Microsoft Jet.
Another method of letting the Microsoft Jet team know about features or performance requests specific to the Microsoft Jet database engine is to e-mail us at JetWish@Microsoft.com.
If this paper interests you, I would highly recommend purchasing the Microsoft Jet Database Engine Programmerís Guide by Microsoft Press. While it does not address Microsoft Jet 3.5 (a revised release is currently being planned) it does have pertinent information on understanding how Microsoft Jet works and the best ways to utilize Microsoft Jet.
© 1997 Microsoft Corporation. All rights reserved.
The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication. Because Microsoft must respond to changing market conditions, this paper should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the date of publication.
This document is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS DOCUMENT.
Microsoft, Rushmore, Visual Basic, Windows, and Windows NT are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.