16212 NE 113th Ct.
Redmond, WA 98052


Beta V Corporation

ADO Performance
“Best Practices”

How to get the best performance out of your ADO application and development team.



 

ADO Performance
“Best Practices”

How to get the best performance out of your ADO application and development team.

Performance—where to start looking for it

“Performance” is an interesting word. It brings to mind several diverse and subtle meanings. When someone starts talking about how well something performs, it can mean how much work is getting done over a period of time. For example, a well-performing engine runs quietly and can produce lots of torque. This same yardstick can be applied to your development team as well. That is, a well-performing coder also works quietly and produces lots of good code. To me performance has grown to mean (at least) these two things—how well my code does its job and how well my team and I do our jobs. Hopefully, this article will help you do both—write code faster and write faster code—and do so quietly, with fewer noisy errors.

When writing my latest book (ADO Examples and Best Practices, APress ISBN: 1‑893115-16-X), I spent quite some time working out which coding techniques, which architectures, and which development practices made the most sense performance-wise. In some cases, a technique would have little bearing on the overall performance of the application or component as its savings were not multiplied. For example, in a client-server application, one might be able to save a second or two by using a DSN-less connection. This could have little overall affect on the suitability or performance of the application, but if that same technique were applied to a middle-tier component where connections are made and broken several hundred (or thousand) times an minute/hour/day, the effect of this technique can significantly affect how the system performs. For each technique discussed, be sure to consider how many times this code is executed over the course of time—the multiplication factor.

When beginning your search for ways to improve performance, you should consider where your application (component or ASP code) is spending most of its time waiting and processing. If you discover you’re spending a lot of time waiting for the Open or Execute method to complete, you need to take another hard look at your query strategy. After years of exhaustive study here and abroad, I’ve come to the startling yet irrefutable conclusion that all data access interfaces wait at the same speed. Whether you’re using VBSQL, DAO, RDO, ADO, OLE DB, ODBC API or (heaven forbid) ODBCDirect, all of these interfaces wait for results at the same speed. For example, if you have a query that takes the database engine 20 seconds to execute, no matter which interface you use to execute the query, none of them will return with the result any faster. While some can open connections (slightly) faster and some can process the resultsets faster, none can get the engine to compile the query and execute it any faster.

This means that all the ADO techniques in the world might not help your performance if you’re submitting “challenged” queries, your indexes aren’t setup correctly/optimally, you haven’t discovered stored procedures, the server is overloaded, or you’re returning too many rows. To solve these problems you need to get on speaking terms a good DBA that knows how to use Query Analyzer. Yes, I know it’s tough to talk to DBAs but they often hold the keys to better performance. Just send over some flowers—that should get the conversation started. Just remember that you’ll be wasting your time until these basic query problems are solved.

There’s not nearly enough room in this article to discuss the intimate details of these techniques, but there’s enough here to get you started working in the right direction. Yes, that’s another shameless plug for my book(s). Let’s break down the techniques discussed in this article:

·         Establishing the Connection. How can you open connections faster? Should you leave the connection open to save time?

·         Building and submitting the query. What can you do to ask “smarter” questions? What kind of queries execute more quickly? How can you avoid making too many trips to the server?

·         Processing the results. Once the data has arrived, how can you improve performance referencing individual fields? How can you save round trips to the server to clarify or sort the result set?

·         Client/server vs Middle-Tier vs ASP strategies. How do these basic data access paradigms vary in regard to performance strategies?

Establishing the Connection

In a client/server application you can hide the time it takes to establish an initial connection in several ways. First, try connecting asynchronously—thus permitting your application to execute most of its initialization process and complete the form_load event more quickly. If you close the connection and reconnect in less time than it takes for the connection pool to release your connection, your connection is virtually instantaneous. But in many cases (especially when there aren’t that many users) simply leaving the connection open makes the most sense. Yes, you can leave the Connection object open in a middle-tier or ASP component if it makes repeated queries to the database.

Another approach that can help performance is to avoid the use of ODBC and DSNs. ODBC is now in “QFE” mode at Microsoft. That means no further work is being done on ODBC or its drivers unless there is something so egregious that Bill gets called at home to deal with it (the other Bill). ODBC DSNs are also an issue when it comes to managing performance—not to mention deployment issues. DSNs require a registry lookup and can take longer to establish than OLE DB connections—especially if you hard-code the ConnectionString. In real terms, your savings here will be minor. If you eliminate the connection altogether you would probably save a 2-5 seconds. However, if your application has to constantly connect and re-connect, these savings can (slowly) add up.

When establishing your connection you also choose the data provider. Microsoft suggests you use OLE DB instead of the default ODBC provider. While your results might vary, my tests show there are fewer surprises with the latest OLE DB “native” providers than with their older (but more experienced) ODBC counterparts. In any case, be sure to test your application thoroughly before committing to a new provider—your performance, supported features and behavior may vary.

In the middle-tier and in Active Server Pages, you can’t (in a practical sense) hold a connection open and still create a scalable component—at least not between invocations. Typically, a component or ASP is loaded and discarded quite often as instances are referenced and released. Since the ADO-based code must establish, use and release a connection each time the code is executed, strategies to minimize connection setup can help in clearly measurable terms. For these scenarios, the connection/session pool plays a big role in how quickly you get connected. If you code your Command object’s ConnectionString property correctly (the same way each time), the chances of a connection being open and available are quite good. If this is the case, the time to connect (reconnect) will be nil.

However, if you don’t release your Connection object (or Visual Basic doesn’t) or if you change the ConnectionString from instance to instance (an easy mistake), OLE DB has to establish a new connection each and every time. If this happens, you’ll soon run out of connections as the pool overflows into the back yard. To make sure the connection object is freed, make sure to set the Connection object to Nothing once its closed. Another tip: don’t use a ConnectionString in the Recordset Open method. Open the Connection object independently. This way it’s easy to reference when you close it and set it to Nothing. 

Building and Submitting the query

I spent several chapters discussing the whys and why-nots of query construction in my books—this is not a simple subject. However, there are some basic guidelines that can help make things run more smoothly. Generally, you should tend toward use of queries that don’t waste the server’s time. There are lots of ways to do this:

·         Don’t force the query engine to recompile and construct a query plan for your query each time its executed. An easy way to avoid this is to create and use parameter-based stored procedures. When creating stored procedures, you can further help ADO performance by eliminating unneeded “rows affected” values from the returned result sets—simply add SET NOCOUNT ON to the stored procedure.

·         Make as few round-trips to the server as possible. That is, if you have several related operations to perform, combine them into a single stored procedure or at least a single multi-part query that can be executed as a script on the server. Avoid use of methods (such as Refresh) and properties that force ADO to make unnecessary round-trips to the server.

·         In Client/Server applications, construct Command objects once—not each time they are used. Reset the Command parameter values and re-execute the Commands as needed.

·         Don’t use Command objects at all when executing simple stored procedures. All stored procedures are exposed as methods of the Connection object (as are Command objects). This technique can yield significant performance benefits as well as far simpler code. While this technique does not help for stored procedures that echo Return Status values or OUTPUT parameters, it can be very helpful for action queries and other rowset-returning queries. The stored procedure’s input arguments can be passed as method arguments and if a Recordset is returned it can be referenced as the last argument in the method call. For example, the following ADO statement executes a stored procedure “Fred” with two input parameters and a returned Recordset:

MyConnection.Fred  “InputArg1”, 2, myRecordset

 

·         Don’t return a rowset unless you must. ADO must construct a Recordset object whenever the query being executed returns rows. Keep in mind that it’s entirely possible to execute queries that return results, but no rows. For example, returning an integer value can be achieved through the Return Status. In addition, you can return up to 1000 OUTPUT parameters (from SQL Server) instead of a rowset.

·         Whenever possible, execute action queries instead of using updatable Recordsets. In addition, make sure you use the Execute method and its adExecuteNoRecords option to ensure that ADO knows that it does not have to construct a Recordset.

·         Don’t ask the server to sort unless necessary. In many cases, it’s faster to sort reasonably sized Recordset objects after they arrive at the client. This also gives the client application more flexibility when it comes to alternative sort sequences.

·         Keep the index structure in mind when you write your queries. Fetch performance can be improved by creating appropriate indexes and by properly tuning your query syntax to leverage these indexes. Use the Query Analyzer to help decide when and if additional indexes are needed.

·         Don’t return too many rows. All to often we find performance is severely limited by bulky result sets. Just return the rows you need—now. Fetch more rows later when and if the client really needs them.

·         Don’t return too many columns. Avoid use of SELECT * that basically says: “Ya know, I don’t really care what comes back—just give me all of the columns. My code can deal with them all.” If your code is ready to deal with “n” columns then go for it. Otherwise, just select the specific columns you need. This way there will be no surprises when someone (even you) adds more columns to the tables.

·         Avoid use of cursors. Cursors have their place—despite being fundamentally evil. If you must indulge in cursors, don’t inhale. That is, don’t choose a cursor that requires more resources than necessary. Don’t ask for scrolling, updatability or cached data if you don’t need it.

In any case, learn to use your diagnostic tools to determine how much time is being spent on the server and in your application—and where it’s being spent. The SQL Server Profiler is an invaluable tool in this respect. It can make it abundantly clear what the server is being asked to execute based on your code. It can highlight sloppy query construction or the wrong choice of command properties in a flash.  The Query Analyzer as previously mentioned can also help tune queries. This tool can diagram how a query is going to be executed and actually make suggestions as to how it can be improved. It will also implement those suggestions (by adding or removing indexes for example) at the click of a button.

Processing the Results

Once the rows have arrived, your code can spend a considerable amount of time processing the result set(s). While each of the architectures (Client/Server, middle-tier, ASP) has their own techniques for optimizing code in this phase, there are several tips here that can markedly improve performance.

One of the most common mistakes I see in code reviews is the use of late binding when referring to Recordset Field.Value items. Since the Value property is referenced so often and there are usually many, many Field objects to reference, the multiplication factor plays a big role here—any improvement here will yield great improvement in performance.

The problem here is two-fold. First, developers want to be able to visually identify which column of the SELECT statement is being referred to—the human readability factor. To this end many developers tend to use the quoted literal technique. For example, to refer to a Field “Cows” in the Fields collection of the Recordset RS, one would code:

RS(“Cows”)
or

RS.Fields(“Cows”).Value

 

Another variation on this is use of the “bang” (!) operator as in:

RS!Cows

 

Both of these techniques take considerably longer for ADO (COM) to resolve than other techniques, as I’ll discuss in a moment. This is because COM is forced to resolve the reference to the Value property at runtime instead of at compile time. Each and every reference to this object requires a similar serial lookup behind the scenes.

However, in defense of this technique, there is no question as to which column is being referred to. If you simply coded:

RS(0)

 

The code would run considerably faster, but only someone who knew what columns were being returned and in which order they were returned would know what this expression referred to. If the developer did not have control over the source of the query, as is often the case, this approach can be problematic.

On the other hand there are alternatives that solve both problems—quick runtime referencing and easy human readability. One approach requires some discipline. In this case the developer creates and publishes an enumerated list of the columns returned by the query. If the query returned additional columns or the columns changed in order, the enumeration would have to be corrected and re-deployed. This can be somewhat tough to administer. For example, the following SELECT would be coupled (somehow) with the following enumeration:

SELECT CatName, CatType, CatSize from Cats Where …

 

Enum enuCatsQuery

     CatName

     CatType

     CatSize

End Enum

 

In code when it came time to reference the Recordset Fields collection one would code:

StrMyName = Rs(enuCatsQuery.CatName)

 

This way the code is “human” readable and still binds at design time so it’s considerably faster.

But there is another way. After a long discussion held on the list services we’ve come up with the following approach that combines two approaches. I call it “pre-binding”. This approach won’t help in cases where you only reference the Field object once. However, in client-server applications, this works quite nicely. In this case, one creates individually named Field objects and Sets these objects to the Recordset object’s Fields collection items. In code, it looks like this. First, setup a named Field object for each Field to be managed.

Dim fldName as ADODB.Field

Dim fldType as ADODB.Field

Dim fldSize as ADODB.Field

 

Next, later in the code, set the named Field objects to refer to the desired columns in the SELECT query. Yes, you can use the quoted string referencing here or even the bang operator. Since you’re only using it once, it won’t make that much difference.

Set fldName = RS!CatName

Set fldType = RS!CatType

Set fldSize = RS!CatSize

 

Next, when it comes time to reference the Field objects (post query) simply use the pre-bound variables.

 

strName = fldName

strType = fldType

strSize = fldSize

 

This approach can yield even better performance than the ordinal reference (RS(0)) shown above.

Client/server vs Middle-tier vs ASP Strategies

When you start implementing code in other than Client/Server situations, you have another whole set of considerations to include in your search for performance. Some of these have nothing to do with ADO—they have more to do with COM. A recent whitepaper documented how it was faster to execute an ADO operation (Connect, query, process) in a Windows 2000 ASP page than it was to call out to a COM component to execute the same code. This is not at all surprising to me—is it to you? Consider that when you call an external COM component from Visual Basic (a piece of code outside your process space), the amount of work going on behind the scenes is amazing—amazingly complex and slow. However, this does not mean that you should avoid calling COM components to run your ADO code. It does mean you can’t simply encapsulate individual ADO operations into lots of little COM components and call them one-at-a-time when they are needed. Instead, try building an entire logic tree in the COM component so in a single invocation the COM component can perform most (if not all) of the ASP functionality—via a single round trip—or at least as few trips back and forth between your ASP code and COM as possible. In any case, I think you’ll find that ADO (or any) code run in binary is faster (far faster) than interpreted code.

In these situations where you can’t leave a Command object or pre-bound Field object around to use and re-use as needed, you should consider techniques that avoid unnecessary object creation. In this case, calling stored procedures as Connection methods makes even more sense. Referencing Field objects using early binding also can help performance. And remember to clean up after yourself. Close the Recordset and Connection objects and set them to Nothing when you are done.

Conclusion

Whenever I give advice, I suggest to my students or readers to take this advice with a grain of salt. The work we do, the code we write and the systems we create are very complex and are influenced by many constantly changing factors. With this in mind, if you see a technique here that looks interesting, try it out. If it works, implement it and test it again. If it still works, great. If it doesn’t, let me and the others in the development community know what you tried and what did or did not work.