First, I just wanted to say that "data driving the AI" (via XML) is a fantastic idea so long as, like anything, it's done well.
Second, I'd like to add my $.02 as a sort of "food for thought" contribution, based on my (now long-past) experience with this sort of thing.
As a brief background, during the 1990s I developed/co-developed a very successful fraud analysis, detection and reporting system, used in the back-end systems of financial institutions' credit-card processing gateways. While it may not at first appear so, accomplishing that job has many similarities to an AI playing a game (except the opponents, in that case, were all humans seeking to "one-up" the system for real-world personal profit, sometimes using strategies that were quite novel and clever).
A key component of the system's success was how well it could evaluate any particular situation, from individual transactions to schemes that may have involved thousands of them. What made it work smart was its rules, but the analysis was not black-and-white, as is so often typical of rules-based systems. That is, the evaluations and rules were often "fuzzy."
Consider what we mean when we say that something is big or fast or strong. If we lived in a world where the majority of people weighed 500 lbs., our mental evaluations of what makes a person "big" or "fat" would change accordingly. Suddenly, 1,000 lb. people are the fat ones, while the 500 lb. folk are in the range of normal. We can make this sort of evaluation because our mental definitions are fuzzy--they are not written in stone, but are relative to the collections of data before us.
In the same way, a smart AI should be able to look at its state in a game and, rather than say, "The game turn is #54, I should increase my focus on economy by +3.0," it should instead say, "Currently, my economy is <weak>, my focus on economy should be <high>." In this case, the words surrounded by <> are fuzzy--defined as a relative, non-zero and non-one evaluation of the state of the game, so far as the AI knows it.
World Without Absolutes
Notice that I mention the evaluation as non-zero and non-one? That is because there are no absolutes in fuzzy logic--everything is "grey," or between 0.0 and 1.0. Players of the game, including AIs, weight the definitions of things like "weak" and "strong" inside this grey area, as a relative measure that can (and often does) include estimations due to lack of information.
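As a minimal sketch of that "grey" evaluation (the function name, the linear ramp shape, and the 0.01/0.99 clamp bounds are my own illustrative choices, not anything prescribed), a fuzzy "weak" rating for an economy score might look like:

```cpp
#include <algorithm>
#include <cassert>

// Linear ramp: scores at or below `lo` read as maximally "weak", scores
// at or above `hi` as minimally so. The result is clamped away from the
// absolute endpoints 0.0 and 1.0, since fuzzy logic admits no absolutes.
double weakMembership(double score, double lo, double hi) {
    double t = (hi - score) / (hi - lo);  // 1.0 near lo, 0.0 near hi
    return std::clamp(t, 0.01, 0.99);     // stay strictly inside (0, 1)
}
```

The `lo`/`hi` endpoints would themselves come from the data the AI currently observes, which is what makes the evaluation relative rather than absolute.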
So far as an XML data-driving method goes, modders would not say "Priority Economy=3.0," but instead would rely on a list of pre-defined adjectives such as: minor, low, medium, high and intense. The actual value used, then, is also relative to the situation. A "high" priority may equal 3.0, or it may equal 10.0, depending on what "high" means to the AI at the time of the evaluation.
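In the XML, that might look something like the fragment below (the element and attribute names here are purely hypothetical--whatever schema the developers settle on would apply):

```xml
<!-- Hypothetical mod data: priorities as adjectives, not absolute numbers -->
<AIPriorities>
  <Priority topic="Economy"  level="High"/>
  <Priority topic="Military" level="Medium"/>
  <Priority topic="Research" level="Low"/>
</AIPriorities>
```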
Just as with our natural language, different sets of adjectives can be utilized to represent different scales of evaluation, and for differing topics of consideration. For example, developers may decide to express army strengths with one set of adjectives (helpless, fragile, feeble, typical, strong, powerful, mighty) and use another for overland speed (slow, sluggish, capable, fast, fleet). These can, of course, be applied to an individual--such as a hero--or to a broader collection--such as the rate of growth of an economy--and work equally well for either situation.
Each set of adjectives has its own, much more static, rules that define what each word means in relative terms--"slow" may represent the bottom 20% of whatever is being evaluated, while "helpless" may represent the bottom 12%. Those rules, the ones describing the meaning of the adjective sets, can live behind the scenes (i.e., in the game engine via C++, if evaluation speed is the primary concern) or be data-driven and definable in the XML (if flexibility in the system, and therefore greater modder access to the AI, is desired).
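A sketch of how such percentile rules might work (the function names and the equal 20% bands are illustrative assumptions; the "bottom 20%"/"bottom 12%" splits above would simply be different band boundaries):

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Relative rank: the fraction of observed values strictly below `value`.
double relativeRank(const std::vector<double>& observed, double value) {
    int below = 0;
    for (double v : observed)
        if (v < value) ++below;
    return observed.empty() ? 0.0
                            : static_cast<double>(below) / observed.size();
}

// Map a relative rank onto the five-word speed set, in equal 20% bands.
std::string speedAdjective(double rank) {
    static const std::vector<std::string> words =
        {"slow", "sluggish", "capable", "fast", "fleet"};
    std::size_t band = std::min<std::size_t>(
        static_cast<std::size_t>(rank * words.size()), words.size() - 1);
    return words[band];
}
```

Because the rank is computed against whatever is currently observed, the same unit can be "fleet" in one game and merely "capable" in another.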
Another advantage exists with respect to the adjectival sets--how finely a situation is evaluated can be controlled by the scope of the set being used. A set with only three adjectives (low, medium, high) allows for broad evaluations that can be quickly accomplished and are suitable for situations that don't require a high degree of analysis. That same set can be expanded to, say, five adjectives (minor, low, medium, high, intense) to allow a more precise evaluation when necessary.
With these layered scopes of adjective sets you could, for example, even have an AI estimate how important a particular decision is (such as, "Is this an important battle?") and use differently scoped sets of adjectives for its related decision making as a result. Choices of low importance--say, a battle involving less than 3% of the AI's total army--may use a smaller, and therefore quicker, adjectival set for decision making, but a large and critical battle--perhaps more than 50% of the AI's army, or over a city that constitutes 30% of its economy--may cause it to use a finely scoped adjectival set so it can make a more "thought-out" decision.
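The decision-scoping idea above can be sketched as follows (the 50% threshold is taken from the example in the text; the function name is my own):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Pick an adjective set whose resolution matches the decision's stakes:
// a battle risking over half the army earns the finer five-word set,
// anything smaller uses the quicker three-word set.
std::vector<std::string> prioritySet(double armyFractionAtStake) {
    if (armyFractionAtStake > 0.50)
        return {"minor", "low", "medium", "high", "intense"};
    return {"low", "medium", "high"};
}
```

The coarse set costs fewer evaluations per decision, which is exactly why it's reserved for the low-stakes cases.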
This approach also permits a much more believable personality to be constructed for different AIs. Accomplish this by defining personality traits using adjectives, so that an AI begins play with something like: Personality Aggressiveness=Low. These personality rules affect how future evaluations are made, adjusting the other adjective rule values upward or downward relative to that specific AI. This means that the AIs, like people, will have different "mental" definitions and will evaluate situations in more divergent ways.
The result is that, in a game where a single "low" aggressiveness AI finds itself surrounded by high- and intense-aggressiveness players, it will actually adjust its aggressiveness upward so as to still remain "low" relative to the table--whereas a static definition, such as "1.5," might cause that AI to continue playing as though it were merely "minor" aggressive instead. The AI is therefore natively adaptive, particularly where human players enter the equation; such an AI would have a meaningful measure of how the human is playing the game, and of what it really means to be "low" or "intensely" aggressive, as a result.
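As a rough sketch of that adaptivity (the scaling scheme here is one possible interpretation, not a prescribed formula): treat the "low" trait as a fraction of the aggressiveness the AI observes around it, so its absolute aggressiveness rises with the table while staying relatively low.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// `trait` is the personality setting in (0, 1): e.g. 0.2 means "low" --
// stay near 20% of the average aggressiveness observed in opponents.
double effectiveAggression(const std::vector<double>& opponents, double trait) {
    if (opponents.empty())
        return trait;  // nothing observed yet: fall back to the raw trait
    double avg = std::accumulate(opponents.begin(), opponents.end(), 0.0)
               / opponents.size();
    return trait * avg;  // "low" defined relative to this particular game
}
```

The same trait value yields a higher absolute aggressiveness in a hotter game, which is the adaptive behavior described above.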
A final benefit is present for modders through abstraction of the game mechanics. A value of "3.0" requires a modder to understand whether "3.0" is a big or small value for whatever it is being used to modify. "3.0" may be a significantly high value for adjusting economic focus, but extremely small for a military one. That is, using absolute, numerical values forces a modder to understand the mechanics at work more accurately in order to determine what absolute values are appropriate.
Not to mention, "3.0" may be large for one game, but relatively small for another. Adjectives abstract this, and make it so that a numerical understanding of the mechanics is no longer necessary for the average modder.
Instead, they simply have to know about the adjective set being applied and the rest--the scope and scale--is largely intuitive. It permits a very practical way for a modder to approach even very disparate aspects of the AI and the game without having to be deeply knowledgeable about all of the affected mechanics.